Dec 05, 2024

Building an Insights page for your SaaS: from idea to production

I've helped so many people add an Insights page or dashboard to their app. Here are the steps to take your user-facing analytics from idea to prod.
Gonzalo Gómez
Sales Engineer

Intro: as a Data Engineer, I've helped many users through a variation of the journey described below, so I'm compiling here the recommended steps to go from idea to production when embedding an Insights (or "Analytics") page into your SaaS application. I'll be using Tinybird, of course, but the thought process applies to any tool or set of tools.

You're a developer at an AI startup. Your team has built a successful chat application where users interact with various LLMs. Everything works great, but there's a growing problem: your users keep asking questions like these:

  • "How many tokens am I using?"
  • "Which models are we using the most?"
  • "Can I see usage patterns across my team?"

Your first instinct was probably to analyze the data in PostgreSQL: it's already there and you know it well. But after a few attempts, the limitations become clear: you'd need complex ETL pipelines to get the data into a proper analytics database, which means choosing between fresh data and fast queries. Not ideal when your users want to see what's happening right now and don't want to wait for charts to load.

So let's walk through how to build these analytics features using Tinybird.

Understanding the data

Your application generates a log entry for each chat interaction.
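A first, unoptimized Tinybird Data Source for those logs could look like the sketch below. The file name and field names are illustrative (adapt them to whatever your application actually logs); types are deliberately naive at this stage:

```
# llm_events.datasource

SCHEMA >
    `timestamp` DateTime `json:$.timestamp`,
    `organization` String `json:$.organization`,
    `user_id` String `json:$.user_id`,
    `model` String `json:$.model`,
    `prompt_tokens` Int64 `json:$.prompt_tokens`,
    `completion_tokens` Int64 `json:$.completion_tokens`,
    `total_tokens` Int64 `json:$.total_tokens`

ENGINE "MergeTree"
ENGINE_SORTING_KEY "timestamp"
```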

You also have user data that tracks organizations and subscription tiers:
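A matching users Data Source might look like this (again, a sketch with assumed field names):

```
# app_users.datasource

SCHEMA >
    `user_id` String `json:$.user_id`,
    `organization` String `json:$.organization`,
    `subscription_tier` String `json:$.subscription_tier`,
    `created_at` DateTime `json:$.created_at`

ENGINE "MergeTree"
ENGINE_SORTING_KEY "organization, user_id"
```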

What users actually need

After talking with several users (and your product team), two clear requirements emerged:

  1. Token Usage Timeline: A chart showing token consumption over time
    • "I need to see if we're approaching our limits" - Enterprise Admin
    • "I want to track which team members are using it most" - Team Lead
  2. Model Usage Distribution: Understanding which models are preferred
    • "We want to optimize our costs by using the right models" - Startup CTO
    • "I need to justify upgrading to GPT-4" - Staff Developer

To deliver these insights, you'll build a web dashboard. This dashboard will feature both charts, complete with a legend and interactive filters. Users can easily select time frames like the last week or the past 30 days, or even define a custom time range. They'll also have the ability to filter by user. For your internal team, the dashboard includes an Organization selector, so Customer Support members can view data just as a customer's organization admin would.

Building the analytics feature

Start simple

First, let's create a basic version that just works with a subset of data. No need to think about optimizations just yet; you only need a prototype to test the end-to-end flow.

Get data in

You'll need to create:

  • A data source for log events
  • A data source for users

You can use sample data or generate a test set that feels real. For the data mocking, LLMs can help (as they did with these scripts):

Generate 1000 users across 300 organizations:
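A sketch of such a mocking script, writing NDJSON (the format Tinybird ingests most easily); the tier names and weights are made up for illustration:

```python
import json
import random

def generate_users(n_users=1000, n_orgs=300, seed=42):
    """Generate mock users spread across organizations, each with a subscription tier."""
    rng = random.Random(seed)
    orgs = [f"org_{i:03d}" for i in range(n_orgs)]
    tiers = ["free", "pro", "enterprise"]
    users = []
    for i in range(n_users):
        users.append({
            "user_id": f"user_{i:04d}",
            "organization": rng.choice(orgs),
            # Most users on free, a few on enterprise: a plausible distribution
            "subscription_tier": rng.choices(tiers, weights=[6, 3, 1])[0],
        })
    return users

if __name__ == "__main__":
    with open("users.ndjson", "w") as f:
        for u in generate_users():
            f.write(json.dumps(u) + "\n")
```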

Simulate realistic usage patterns. 100,000 events is OK for a development stage:
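One way to sketch this: skew timestamps toward business hours and weight model choice so the distribution isn't uniform. Model names and weights below are assumptions, not data from any real workload:

```python
import json
import random
from datetime import datetime, timedelta

# Hypothetical model popularity weights
MODELS = {"gpt-4o": 0.35, "gpt-4o-mini": 0.40, "claude-3-5-sonnet": 0.20, "llama-3-70b": 0.05}

def generate_events(users, n_events=100_000, days=30, seed=7):
    """Generate mock chat-log events for the given users over the last `days` days."""
    rng = random.Random(seed)
    now = datetime(2024, 12, 1)
    models, weights = zip(*MODELS.items())
    events = []
    for _ in range(n_events):
        user = rng.choice(users)
        ts = now - timedelta(days=rng.uniform(0, days))
        # Cluster activity around mid-afternoon for a more realistic shape
        ts = ts.replace(hour=int(min(23, max(0, rng.gauss(14, 4)))))
        prompt = rng.randint(50, 2000)
        completion = rng.randint(20, 1500)
        events.append({
            "timestamp": ts.strftime("%Y-%m-%d %H:%M:%S"),
            "organization": user["organization"],
            "user_id": user["user_id"],
            "model": rng.choices(models, weights=weights)[0],
            "prompt_tokens": prompt,
            "completion_tokens": completion,
            "total_tokens": prompt + completion,
        })
    return events

if __name__ == "__main__":
    users = [{"user_id": f"user_{i:04d}", "organization": f"org_{i % 300:03d}"} for i in range(1000)]
    with open("events.ndjson", "w") as f:
        for e in generate_events(users):
            f.write(json.dumps(e) + "\n")
```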

Create the API Endpoints

Now comes the interesting part: building the APIs that power your charts. You need two Tinybird Pipes:

  • calls_time_series.pipe
  • total_calls_per_model.pipe

Remember to include parameters for time range, user, and organization.
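As a sketch of what calls_time_series.pipe could look like with Tinybird's query templating (the Data Source name llm_events and the default dates are assumptions; check the templating docs for exact syntax):

```
NODE timeseries
SQL >
    %
    SELECT
        toStartOfHour(timestamp) AS t,
        sum(total_tokens) AS total_tokens
    FROM llm_events
    WHERE organization = {{ String(org_id) }}
      AND timestamp BETWEEN {{ DateTime(start_date, '2024-11-01 00:00:00') }}
          AND {{ DateTime(end_date, '2024-12-01 00:00:00') }}
      {% if defined(user_id) %} AND user_id = {{ String(user_id) }} {% end %}
    GROUP BY t
    ORDER BY t
```

total_calls_per_model.pipe follows the same pattern, grouping by model instead of by hour.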

Prototype the dashboard

You'll need to:

  1. Create a new Insights page in your app
  2. Add both visualization components
  3. Connect them to your newly created API endpoints

In terms of API security, a JWT with a fixed org_id is better than static tokens for your users, so build a way to generate them and refresh them when your users log in. Your support team can keep a static token.
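A minimal sketch of signing such a JWT in your backend, using only the standard library. The payload shape (workspace_id, scopes, fixed_params) follows Tinybird's JWT documentation, but verify it against the current docs; the pipe name and TTL here are assumptions:

```python
import base64
import hashlib
import hmac
import json
import time

def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def make_insights_jwt(admin_token: str, workspace_id: str, org_id: str, ttl_s: int = 3600) -> str:
    """Sign a short-lived HS256 JWT that pins org_id, so a user can only query their own org's data.
    Signed with the Workspace admin token, per Tinybird's JWT docs."""
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {
        "workspace_id": workspace_id,
        "name": f"insights_{org_id}",
        "exp": int(time.time()) + ttl_s,
        "scopes": [{
            "type": "PIPES:READ",
            "resource": "calls_time_series",  # hypothetical pipe name
            "fixed_params": {"org_id": org_id},
        }],
    }
    signing_input = f"{_b64url(json.dumps(header).encode())}.{_b64url(json.dumps(payload).encode())}"
    sig = hmac.new(admin_token.encode(), signing_input.encode(), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(sig)}"
```

Issue one of these on login and refresh it before `exp`; the frontend then calls the endpoints directly with it.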

PS: If you're not a frontend expert, check out Tinybird Charts for a reference implementation.

Test, correct, and ensure quality

You now have a working app. Test it:

  • Can an admin see their whole organization?
  • Do the numbers add up correctly?

If so, add tests based on the mock data to be sure future iterations won't break the behavior.
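One simple pattern: compute the expected aggregates directly from the mock events in plain Python and assert that the endpoint's rows match. A sketch, assuming the event dicts from the mocking step and rows shaped like total_calls_per_model's output:

```python
from collections import defaultdict

def tokens_per_model(events):
    """Reference aggregation: what total_calls_per_model should return."""
    totals = defaultdict(int)
    for e in events:
        totals[e["model"]] += e["total_tokens"]
    return dict(totals)

def check_totals_match(api_rows, events):
    """api_rows: list of {"model": ..., "total_tokens": ...} fetched from the endpoint."""
    expected = tokens_per_model(events)
    got = {r["model"]: r["total_tokens"] for r in api_rows}
    assert got == expected, f"mismatch: {got} != {expected}"
```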

Optimize for scale

Once you've validated that your charts and data are working correctly, it's time to optimize. There are two main approaches to consider:

Schema and SQL optimization

Focus on the fundamentals:

  • Use the smallest possible data types for your columns
  • Apply basic query optimization techniques

Most importantly: choose the right sorting keys for your Data Sources.
Following these basic principles helps you get the most from your resources without complex changes.
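Applied to the log Data Source, that could look like this sketch: narrower integer types, LowCardinality for low-variety strings, and a sorting key that starts with the column every query filters on (organization), then timestamp:

```
# llm_events.datasource (optimized)

SCHEMA >
    `timestamp` DateTime `json:$.timestamp`,
    `organization` LowCardinality(String) `json:$.organization`,
    `user_id` String `json:$.user_id`,
    `model` LowCardinality(String) `json:$.model`,
    `prompt_tokens` UInt32 `json:$.prompt_tokens`,
    `completion_tokens` UInt32 `json:$.completion_tokens`,
    `total_tokens` UInt32 `json:$.total_tokens`

ENGINE "MergeTree"
ENGINE_SORTING_KEY "organization, timestamp"
```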

Intermediate tables

Sometimes these tips are not enough, and you will need to create intermediate tables to prepare your data to be consumed with lower latency and fewer resources. You have two options for creating these intermediate Data Sources:

  1. Materialized Views:
    - Process data at ingestion time
    - Keep aggregations always up to date
    - Require a mindset shift: you store intermediate aggregation states and merge them at query time
  2. Copy Pipes:
    - More flexible data transformation
    - Scheduled updates
    - Good for complex transformations where MVs aren't the right approach

Pro tip: If you can implement something using Materialized Views, prefer them over Copy Pipes. They're typically faster and more efficient since they process data at ingestion time, instead of requiring a scheduled ETL.
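As a sketch of what such a Materialized View Pipe could look like, pre-aggregating tokens per day (names are assumptions; the target Data Source would use an AggregatingMergeTree engine with matching AggregateFunction columns):

```
NODE daily_tokens
SQL >
    SELECT
        toDate(timestamp) AS day,
        organization,
        user_id,
        model,
        sumState(total_tokens) AS total_tokens
    FROM llm_events
    GROUP BY day, organization, user_id, model

TYPE materialized
DATASOURCE daily_tokens_by_user
```

This is where the intermediate-state mindset shows up: endpoints reading from daily_tokens_by_user must use sumMerge(total_tokens) rather than sum().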

Go to production

You've done the hard work - your analytics project is structured, optimized, and thoroughly tested. Now it's time to deploy to production.

Deploy to the Production Workspace, update tokens, deploy the Insights page, and see what happens.

Note: you tested with static data, but in a real-life scenario where you'll be streaming requests from the application, Tinybird's Events API is the easiest way to set up ingestion. If you already have Kafka, Kinesis, Pub/Sub, or something like that, check out the Tinybird docs for connector references.
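Sending an event batch to the Events API is a single HTTP POST of newline-delimited JSON. A standard-library sketch (the host is region-dependent and llm_events is the assumed Data Source name; check the Events API docs for your URL):

```python
import json
import urllib.request

EVENTS_URL = "https://api.tinybird.co/v0/events?name=llm_events"  # region-dependent host

def to_ndjson(events) -> bytes:
    """The Events API expects newline-delimited JSON."""
    return "\n".join(json.dumps(e) for e in events).encode()

def send_events(events, token: str):
    req = urllib.request.Request(
        EVENTS_URL,
        data=to_ndjson(events),
        headers={"Authorization": f"Bearer {token}"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

In production you'd call send_events from your request logger, ideally batching events instead of posting one at a time.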

Monitor the project

With your analytics in production, monitoring becomes crucial. You'll want to keep an eye on several key areas:

  • Ingestion: are events arriving as expected? Is there any unusual latency? Are you seeing any data quality issues?
  • Query: how are your endpoints performing under load? Are there any slow queries that need optimization? Are any users abusing the system? (you can add rate limits to your JWTs)
  • Async jobs: are the scheduled imports being executed as expected?
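For endpoint performance, Tinybird's service Data Sources are a good starting point. A sketch of a query over tinybird.pipe_stats_rt to surface slow or failing endpoints in the last hour (verify the field names against the service Data Sources docs):

```sql
SELECT
    pipe_name,
    count() AS requests,
    avg(duration) AS avg_duration_s,
    countIf(error = 1) AS errors
FROM tinybird.pipe_stats_rt
WHERE start_datetime > now() - INTERVAL 1 HOUR
GROUP BY pipe_name
ORDER BY requests DESC
```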

Lessons learned

  1. Start simple: Make a quick prototype to test end to end. You can always optimize after.
  2. Optimize before prod: Problems show up at scale, so be ready.
  3. Monitor from day one: Quality, volumes, errors... you should be able to detect and fix issues before end users notice.

What's next?

This is just the beginning. Users will want more features:

  • Cost analysis
  • Usage predictions
  • Custom alerts
  • Export capabilities
  • ...

But you now have a solid foundation to build on.

Conclusion

Building analytics features doesn't have to be overwhelming. Start with real user needs, build incrementally, and optimize based on actual usage.
