Tinybird Customer Story

Factorial builds real-time data products with Tinybird

Factorial turned to Tinybird to build 12 new user-facing analytics features in just a few months.
Start building
No credit card required

“Tinybird allows you to have the same kind of technology that the Facebooks, Googles, and LinkedIns of the world have, but without any of the infrastructure or maintenance. It's truly first class.”

Marc GonzĂĄlez

Marc GonzĂĄlezDirector of Data at Factorial

18.853TBprocessed per month

7,675requests per day

Founded in 2016, Factorial took aim at the stagnant HR software market with their own unique twist. Factorial is present in over 65 countries, having raised $220 million in venture funding, and employs over 950 people. The company’s goal is to bring modern HR solutions to more than 8,000 businesses worldwide, automating mundane HR tasks so that People leaders can focus on people, not paperwork.

1. Enabling new scenarios with real-time data

Using Tinybird, Factorial has improved its data freshness and reduced query latency, leading to significantly faster user feature launches. Factorial’s decision enabled them to accelerate their time to market and build great new customer experiences like Job Catalog, Audit Log, and Attendance without sacrificing reliability.

2. Real-time analytics and more use cases

Like many companies, Factorial began with a traditional batch pipeline using MySQL, Parquet, AWS Glue, Amazon S3, and Amazon Athena that was easy to set up in its early days, guided by the team’s previous experience building scalable data architectures.

This was a simple and effective setup, and at the time, their data team consisted of a single engineer. However, as the product evolved, developers increasingly needed to use this data to build user-facing features, and this architecture did not satisfy two non-negotiable requirements: data freshness and low query latency.

Although the lake house architecture proved easy to implement and served internal reporting use cases perfectly, the schedule-driven pipelines to load the data made it difficult for developers to work with. When interacting with user-facing analytical features, users demand up-to-date data. In many cases, the data available in the lake house was stale, in some cases days, or at best, hours, old. This dramatically reduced the value of the data to an end user and thus made it unattractive for developers to use the data to power new product features.

The developers made their requirements clear: they needed fresh data.

And they needed access to their data with low latency.

That’s what brought them to Tinybird.

“Our existing data engineering team is relatively small given the size of our company. We don’t have the time or manpower to worry about setting up and maintaining a complex streaming data mesh architecture. Tinybird handles our real-time infrastructure at scale and allows us to build new real-time applications using our existing skills.”

Marc GonzĂĄlez

Marc GonzĂĄlezDirector of Data at Factorial

3. Unifying batch and streaming data using Tinybird

To solve for data freshness, the team decided to switch to capturing changes in real-time from MySQL rather than a batch process like running a snapshot on a schedule. They used the MySQL CDC Source (Debezium) Connector for Confluent Cloud to implement Change Data Capture (CDC) over their production MySQL. The MySQL CDC Source (Debezium) Connector captures changes from a database and writes the changes to Apache KafkaÂź.

Additionally, Factorial’s data team saw that Kafka could serve as a reliable buffer for data; if there were any failures downstream, data could buffer in Kafka and be retried. They assessed alternative tools, such as Amazon Kinesis and Google Pub/Sub. Still, they discovered that the offset semantics in Kafka proved more flexible, allowing them to more easily resume data consumers from a previous message in the stream.

Ultimately, the Factorial team chose to use Confluent Cloud, a fully managed Kafka implementation built and operated by the original creators of Kafka.

By sending the MySQL CDC stream to Kafka, Factorial eliminated the batch process that introduced significant latency to their data. To complement their streaming pipeline, they also needed a system that would allow their developers to combine their fresh Kafka streams with historical data to power user-facing applications and which could handle analytical queries in the order of milliseconds.

To make real-time analytical queries over data streams available to developers, Factorial chose Tinybird. Tinybird took away all of the operational overhead of managing real-time analytics data infrastructure at scale in production while fulfilling the strict requirements for latency and freshness. Tinybird allows Factorial to ingest from Confluent Cloud in real-time, using Tinybird’s native Confluent connector.

Data streams from Confluent arrive in Tinybird and are enriched with historical data that lives in the platform, avoiding the limitations of stateful stream processors. This means that nearly all data processing can be performed at the time of ingestion, with the result being materialized and ready for developers to push into production features. With only two data engineers, Factorial reduced the average query time for production queries from minutes to sub-50 milliseconds. Tinybird APIs are integrated directly into Factorial’s user-facing product, greatly simplifying their application architecture and saving on further infrastructure and tooling costs.

4. Speed wins: Tinybird fuels time-to-market, a key differentiator for Factorial

Factorial completed a POC and launched their first production feature in one month, and over the next six months launched more than 12 user-facing product features that are powered by this real-time pipeline, with many more to come.

“When we switched to Tinybird, we ran a PoC and shipped our first feature to production in a month. Since then, we’ve shipped 12 new user-facing features in just a few months. There’s no way we could have done this without Tinybird.”

Marc GonzĂĄlez

Marc GonzĂĄlezDirector of Data at Factorial

Do you like this content?

Build fast data products, faster.

Try Tinybird and bring your data sources together and enable engineers to build with data in minutes. No credit card required, free to get started.
Need more? Contact sales for Enterprise support.