Billing and limits concepts

Tinybird plans are sized and billed according to available resources and usage, with limits that you can exceed temporarily.

Read on to understand the key concepts behind your bill and plan limits.

Active vCPU minutes / hours

Developer plans bill vCPU usage in active minutes. An active minute is a minute in which any operation used a vCPU. Multiple operations executed within the same minute still count as a single active minute. With fractional vCPUs, an active minute is proportional to the fraction, for example 30 seconds of 0.5 vCPU.

Plan sizes come with a number of active hours, which is the number of active minutes you can use divided by 60. If you consume all your active minutes, the overage is billed at a fixed rate per minute. Usage bursts allow you to temporarily exceed the vCPU usage limit. See vCPU burst mode.
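As a rough illustration of the counting rule above, the following sketch counts distinct wall-clock minutes in which at least one operation ran, then compares the total with a plan's included active hours. The function names and sample timestamps are hypothetical, the proportional treatment of fractional vCPUs is not modeled, and the result is expressed in minutes only because the per-minute overage rate is not stated here.

```python
from datetime import datetime, timedelta


def count_active_minutes(operations):
    """Count distinct minutes touched by at least one (start, end) operation."""
    active = set()
    for start, end in operations:
        # Truncate to the start of the minute, then walk minute by minute.
        minute = start.replace(second=0, microsecond=0)
        while minute < end:
            active.add(minute)
            minute += timedelta(minutes=1)
    return len(active)


def overage_minutes(used_minutes, included_active_hours):
    """Active minutes used beyond the plan's included active hours."""
    included_minutes = included_active_hours * 60
    return max(0, used_minutes - included_minutes)


# Hypothetical operations: two share the 10:00 minute, one spans 10:05 to 10:07.
ops = [
    (datetime(2024, 1, 1, 10, 0, 5), datetime(2024, 1, 1, 10, 0, 20)),
    (datetime(2024, 1, 1, 10, 0, 30), datetime(2024, 1, 1, 10, 0, 50)),
    (datetime(2024, 1, 1, 10, 5, 30), datetime(2024, 1, 1, 10, 7, 10)),
]
print(count_active_minutes(ops))  # 4
print(overage_minutes(75, 1))     # 15
```

The two operations in the 10:00 minute count once, which mirrors the rule that multiple operations within the same minute are a single active minute.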

Queries per second

Queries per second (QPS) is the number of queries per second that your plan includes. Calls to API endpoints and queries sent to the Query API count towards your QPS allowance. Queries made from the UI are excluded from the limit.

Plan sizes come with a number of included QPS, which is the maximum number of queries per second that your plan allows without incurring additional costs. If you are on a paid plan and exceed the QPS allowance, the platform supports traffic peaks up to a ceiling of 4x the QPS included in your plan, but requests above the allowance are billed at a fixed rate per request. Requests beyond the ceiling are rate limited. See QPS Overages and QPS Ceiling.

You'll receive emails alerting you about QPS overages, and also when the accumulated overage costs (from extra QPS or active vCPU minutes) go beyond 20% of your plan's fixed fee. If your consumption grows and upgrading to the next plan would be cheaper, we also email you with that recommendation.
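The 20% threshold can be expressed as a one-line check. This is only a sketch: the function name and the sample amounts are hypothetical, and the actual alerting logic may differ.

```python
def overage_alert_due(overage_cost, plan_fixed_fee, threshold=0.20):
    """True when accumulated overage costs exceed the given share of the fixed fee."""
    return overage_cost > threshold * plan_fixed_fee


# Hypothetical amounts, in the same currency as the plan fee.
print(overage_alert_due(25.0, 100.0))  # True
print(overage_alert_due(10.0, 100.0))  # False
```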

If you're on the Free plan, you're probably still exploring the platform and how to get value for your use case, so we grant you some margin for peaks. See QPS burst mode for Free plan.

vCPU burst mode

This mode allows you to temporarily exceed your vCPU limit. When that happens, you aren't billed for the excess. Instead, you receive an email alerting you of the situation and suggesting that you increase your plan size.

For real-time operations, your operations can take up to 2x the vCPU time per minute allowed in your plan. Batch operations, like populates or copies, are allowed to run in full until they reach a platform limit, like maximum available memory.

For example, consider a populate that needs 180 seconds of CPU time in a minute. If you are on a Developer plan S-1, where operations can use up to 120 seconds per minute, the populate still finishes and the limit isn't triggered, because populates are batch operations.
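The burst rule can be sketched as a small check. The function names are hypothetical, and the assumption that S-1 corresponds to 1 vCPU is inferred from the 120 seconds per minute figure (2 x 60 seconds); it is not stated explicitly above.

```python
def burst_limit_seconds(vcpus, burst_multiplier=2):
    """CPU seconds per minute available to real-time operations in burst mode."""
    return vcpus * 60 * burst_multiplier


def limit_triggered(cpu_seconds_in_minute, vcpus, is_batch):
    """Batch operations are exempt here; real-time ones are capped at the burst limit."""
    if is_batch:
        # Assumption: platform limits such as memory are not modeled.
        return False
    return cpu_seconds_in_minute > burst_limit_seconds(vcpus)


# S-1 modeled as 1 vCPU (assumption): the burst limit is 120 seconds per minute.
print(burst_limit_seconds(1))                   # 120
print(limit_triggered(180, 1, is_batch=True))   # False: the populate finishes
print(limit_triggered(180, 1, is_batch=False))  # True
```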

QPS Overages and QPS Ceiling

Once you are over your plan's QPS allowance, you can keep making requests up to 4 times your plan's QPS (the plan's QPS ceiling). Requests above the allowance are subject to additional costs at a fixed rate per request. This applies to API endpoint and Query API requests.

Example:

  • You are on an S-1/4 Developer plan, which includes 10 QPS.
  • Your project is growing and you get a sudden traffic peak of 20 QPS for a short period, say 120 seconds.
  • You are under your plan's ceiling (in this case 40 QPS), so you're not rate limited.
  • You have made 10 requests per second (20 minus 10) over your plan's allowance for 120 seconds. That equals 1,200 requests over the QPS included in your plan, which are billed at a small fixed rate per request in your next bill.
  • You can stay on your current plan.
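The arithmetic in this example can be reproduced with a short sketch. The function name is hypothetical, and traffic is modeled as a whole number of requests in each one-second window, which is a simplification of however the platform actually meters requests.

```python
def classify_second(requests, included_qps, ceiling_multiplier=4):
    """Split one second of traffic into (included, billable, rate_limited) requests."""
    ceiling = included_qps * ceiling_multiplier
    served = min(requests, ceiling)
    billable = max(0, served - included_qps)
    rate_limited = max(0, requests - ceiling)
    return served - billable, billable, rate_limited


# 120 seconds at 20 QPS on a plan that includes 10 QPS (ceiling: 40 QPS).
billable_total = sum(classify_second(20, 10)[1] for _ in range(120))
print(billable_total)           # 1200
print(classify_second(50, 10))  # (10, 30, 10): 10 requests are rate limited
```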

You keep growing over the following weeks:

  • You start to have a fairly regular usage of ~40 QPS.
  • Requests over 40 QPS (the plan's ceiling) are rate limited.
  • After some days, you receive an email about QPS overages and current overage costs, with a plan recommendation.
  • You can upgrade to a higher Developer plan that includes 40 QPS (or whatever you need). From that instant, your traffic falls within the new plan's ceiling.

QPS burst mode for Free plan

While on the Free plan, you can burst up to 2x your QPS allowance for API endpoint requests or SQL queries.

To better understand how burst mode works, consider the following:

  • Your Free plan allows 10 queries per second as the normal rate (leak rate).
  • You have a burst capacity of 20 QPS, meaning you can temporarily handle up to 20 queries per second for short bursts.
  • Each second, your bucket can "leak" 10 queries and can temporarily hold more if needed.

Here's how burst mode works in practice:

  • If you send 15 queries in one second, 10 are processed at the normal rate and 5 use burst capacity.
  • If you send 25 queries in one second, 10 are processed at the normal rate, 10 use burst capacity, and 5 are rejected (over the 20 QPS burst limit).
  • The burst capacity "leaks" back to normal levels at a rate of 10 QPS, meaning after a burst, your capacity gradually returns to the standard rate.

The leaking mechanism ensures you can handle occasional traffic spikes while maintaining overall performance and preventing system overload.

For example, if you use your full burst capacity of 20 QPS, it takes about 1 second of processing at the normal 10 QPS rate before you can burst again. This helps balance flexibility for traffic spikes with consistent system performance.
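The behavior described above resembles a leaky bucket. The following is a minimal simulation under stated assumptions: time advances in whole seconds, the bucket drains before each second's requests are admitted, and the class and field names are hypothetical. It illustrates the concept and is not the platform's actual rate limiter.

```python
from dataclasses import dataclass


@dataclass
class LeakyBucket:
    leak_rate: int = 10  # normal rate, in queries per second
    capacity: int = 20   # burst capacity
    level: int = 0       # queries currently held in the bucket

    def tick(self, incoming):
        """Advance one second and admit as many incoming queries as fit."""
        # Drain up to leak_rate queries accumulated in earlier seconds.
        self.level = max(0, self.level - self.leak_rate)
        accepted = min(incoming, self.capacity - self.level)
        self.level += accepted
        normal = min(accepted, self.leak_rate)
        return {
            "normal": normal,
            "burst": accepted - normal,
            "rejected": incoming - accepted,
        }


print(LeakyBucket().tick(15))  # {'normal': 10, 'burst': 5, 'rejected': 0}
print(LeakyBucket().tick(25))  # {'normal': 10, 'burst': 10, 'rejected': 5}

# Two consecutive seconds at 25 queries: the second one finds the bucket half full.
bucket = LeakyBucket()
bucket.tick(25)
print(bucket.tick(25))  # {'normal': 10, 'burst': 0, 'rejected': 15}
```

In this model, each elapsed second frees up to 10 slots in the bucket, so sustained traffic above the normal rate quickly loses access to burst capacity.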

Max threads

Max threads refers to the maximum number of concurrent threads that can be used to execute queries in API Endpoints and Query API calls. Each thread represents a separate processing unit that can handle part of a query independently.

Having more threads available means your queries can be processed with higher parallelism, potentially improving overall query performance when dealing with multiple concurrent requests. The number of max threads available depends on your plan:

  • Free plan: Limited to 1 thread
  • Developer and Enterprise shared plans: Limited by the threads included in your plan
  • Enterprise dedicated plans: Limited by the underlying infrastructure

While more threads can improve concurrent query processing, the final performance also depends on factors like your vCPU limit and the complexity of your queries.

Next steps