# Tinybird Documentation Generated on: 2024-11-19T13:43:42.913Z URL: https://www.tinybird.co/docs/api-reference/analyze-api Last update: 2024-07-31T16:54:46.000Z Content: --- title: "Analyze API Reference · Tinybird Docs" theme-color: "#171612" description: "The Analyze API allows you to analyze a given NDJSON, CSV, or Parquet file to generate a Tinybird Data Source schema." --- POST /v0/analyze/? [¶](https://www.tinybird.co/docs/about:blank#post--v0-analyze-?) The Analyze API takes a sample of a supported file ( `csv`, `ndjson`, `parquet` ) and guesses the file format, schema, columns, types, nullables, and JSONPaths (in the case of NDJSON and Parquet files). This is a helper endpoint for creating Data Sources without having to write the schema manually. Keep in mind that Tinybird's guessing algorithm is not deterministic: it takes a random portion of the file passed to the endpoint, so it can guess different types or nullables depending on the sample analyzed. We recommend double-checking the guessed schema and making manual adjustments where needed. Analyze a local file [¶](https://www.tinybird.co/docs/about:blank#id1) curl \ -H "Authorization: Bearer " \ -X POST "https://api.tinybird.co/v0/analyze" \ -F "file=@path_to_local_file" Analyze a remote file [¶](https://www.tinybird.co/docs/about:blank#id2) curl \ -H "Authorization: Bearer " \ -G -X POST "https://api.tinybird.co/v0/analyze" \ --data-urlencode "url=https://example.com/file" Analyze response [¶](https://www.tinybird.co/docs/about:blank#id3) { "analysis": { "columns": [ { "path": "$.a_nested_array.nested_array[:]", "recommended_type": "Array(Int16)", "present_pct": 3, "name": "a_nested_array_nested_array" }, { "path": "$.an_array[:]", "recommended_type": "Array(Int16)", "present_pct": 3, "name": "an_array" }, { "path": "$.field", "recommended_type": "String", "present_pct": 1, "name": "field" }, { "path": "$.nested.nested_field", "recommended_type": "String", "present_pct": 1, "name": "nested_nested_field" } ], "schema": "a_nested_array_nested_array Array(Int16) `json:$.a_nested_array.nested_array[:]`, an_array Array(Int16) `json:$.an_array[:]`, field String `json:$.field`, nested_nested_field String `json:$.nested.nested_field`" }, "preview": { "meta": [ { "name": "a_nested_array_nested_array", "type": "Array(Int16)" }, { "name": "an_array", "type": "Array(Int16)" }, { "name": "field", "type": "String" }, { "name": "nested_nested_field", "type": "String" } ], "data": [ { "a_nested_array_nested_array": [ 1, 2, 3 ], "an_array": [ 1, 2, 3 ], "field": "test", "nested_nested_field": "bla" } ], "rows": 1, "statistics": { "elapsed": 0.000310539, "rows_read": 2, "bytes_read": 142 } } } The `columns` attribute contains the guessed columns, and for each one: - `path` : The JSONPath syntax, in the case of NDJSON/Parquet files - `recommended_type` : The guessed database type - `present_pct` : If the value is lower than 1, there were nulls in the sample used for guessing - `name` : The recommended column name The `schema` attribute is ready to be used in the [Data Sources API](https://www.tinybird.co/docs/docs/api-reference/datasource-api). The `preview` contains up to 10 rows of the content of the file. --- URL: https://www.tinybird.co/docs/api-reference/datasource-api Last update: 2024-10-15T15:38:30.000Z Content: --- title: "Data Sources API Reference · Tinybird Docs" theme-color: "#171612" description: "The Data Source API enables you to create, manage and import data into your Data Sources." --- POST /v0/datasources/?
[¶](https://www.tinybird.co/docs/about:blank#post--v0-datasources-?) This endpoint supports 3 modes to enable 3 distinct operations, depending on the parameters provided: > - Create a new Data Source with a schema - Append data to an existing Data Source - Replace data in an existing Data Source The mode is controlled by setting the `mode` parameter, for example, `-d "mode=create"` . Each mode has different [rate limits](https://www.tinybird.co/docs/docs/api-reference/overview#limits). When importing remote files by URL, if the server hosting the remote file supports HTTP Range headers, the import process will be parallelized. | KEY | TYPE | DESCRIPTION | | --- | --- | --- | | mode | String | Default: `create` . Other modes: `append` and `replace` . The `create` mode creates a new Data Source and attempts to import the data of the CSV if a URL is provided or the body contains any data. The `append` mode inserts the new rows provided into an existing Data Source (it will also create it if it does not exist yet). The `replace` mode will remove the previous Data Source and its data and replace it with the new one; Pipes or queries pointing to this Data Source will immediately start returning data from the new one and without disruption once the replace operation is complete. The `create` mode will automatically name the Data Source if no `name` parameter is provided; for the `append` and `replace` modes to work, the `name` parameter must be provided and the schema must be compatible. | | name | String | Optional. Name of the Data Source to create, append or replace data. This parameter is mandatory when using the `append` or `replace` modes. | | url | String | Optional. The URL of the CSV with the data to be imported | | dialect_delimiter | String | Optional. The one-character string separating the fields. We try to guess the delimiter based on the CSV contents using some statistics, but sometimes we fail to identify the correct one. If you know your CSV’s field delimiter, you can use this parameter to explicitly define it. | | dialect_new_line | String | Optional. The one- or two-character string separating the records. We try to guess the delimiter based on the CSV contents using some statistics, but sometimes we fail to identify the correct one. If you know your CSV’s record delimiter, you can use this parameter to explicitly define it. | | dialect_escapechar | String | Optional. The escapechar removes any special meaning from the following character. This is useful if the CSV does not use double quotes to encapsulate a column but uses double quotes in the content of a column and it is escaped with, e.g. a backslash. | | schema | String | Optional. Data Source schema in the format ‘column_name Type, column_name_2 Type2…’. When creating a Data Source with format `ndjson` the `schema` must include the `jsonpath` for each column, see the `JSONPaths` section for more details. | | engine | String | Optional. Engine for the underlying data. Requires the `schema` parameter. | | engine_* | String | Optional. Engine parameters and options, check the[ Engines](https://www.tinybird.co/docs/concepts/data-sources.html#supported-engines) section for more details | | progress | String | Default: `false` . When using `true` and sending the data in the request body, Tinybird will return block status while loading using Line-delimited JSON. | | token | String | Auth token with create or append permissions. 
Required only if no Bearer Authorization header is found | | type_guessing | String | Default: `true` The `type_guessing` parameter is not taken into account when replacing or appending data to an existing Data Source. When using `false` all columns are created as `String` otherwise it tries to guess the column types based on the CSV contents. Sometimes you are not familiar with the data and the first step is to get familiar with it: by disabling the type guessing, we enable you to quickly import everything as strings that you can explore with SQL and cast to the right type or shape in whatever way you see fit via a Pipe. | | debug | String | Optional. Enables returning debug information from logs. It can include `blocks` , `block_log` and/or `hook_log` | | replace_condition | String | Optional. When used in combination with the `replace` mode it allows you to replace a portion of your Data Source that matches the `replace_condition` SQL statement with the contents of the `url` or query passed as a parameter. See this[ guide](https://www.tinybird.co/guide/replacing-and-deleting-data#replace-data-selectively) to learn more. | | replace_truncate_when_empty | Boolean | Optional. When used in combination with the `replace` mode it allows truncating the Data Source when empty data is provided. Not supported when `replace_condition` is specified | | format | String | Default: `csv` . Indicates the format of the data to be ingested in the Data Source. By default is `csv` and you should specify `format=ndjson` for NDJSON format, and `format=parquet` for Parquet files. | **Examples** Creating a CSV Data Source from a schema [¶](https://www.tinybird.co/docs/about:blank#id2) curl \ -H "Authorization: Bearer " \ -X POST "https://api.tinybird.co/v0/datasources" \ -d "name=stocks" \ -d "schema=symbol String, date Date, close Float32" Creating a CSV Data Source from a local CSV file with schema inference [¶](https://www.tinybird.co/docs/about:blank#id3) curl \ -H "Authorization: Bearer " \ -X POST "https://api.tinybird.co/v0/datasources?name=stocks" \ -F csv = @local_file.csv Creating a CSV Data Source from a remote CSV file with schema inference [¶](https://www.tinybird.co/docs/about:blank#id4) curl \ -H "Authorization: Bearer " \ -X POST "https://api.tinybird.co/v0/datasources" \ -d "name=stocks" \ -d url = 'https://.../data.csv' Creating an empty Data Source with a ReplacingMergeTree engine and custom engine settings [¶](https://www.tinybird.co/docs/about:blank#id5) curl \ -H "Authorization: Bearer " \ -X POST "https://api.tinybird.co/v0/datasources" \ -d "schema=pk UInt64, insert_date Date, close Float32" \ -d "engine=ReplacingMergeTree" \ -d "engine_sorting_key=pk" \ -d "engine_ver=insert_date" \ -d "name=test123" \ -d "engine_settings=index_granularity=2048, ttl_only_drop_parts=false" Appending data to a Data Source from a local CSV file [¶](https://www.tinybird.co/docs/about:blank#id6) curl \ -H "Authorization: Bearer " \ -X POST "https://api.tinybird.co/v0/datasources?name=data_source_name&mode=append" \ -F csv = @local_file.csv Appending data to a Data Source from a remote CSV file [¶](https://www.tinybird.co/docs/about:blank#id7) curl \ -H "Authorization: Bearer " \ -X POST "https://api.tinybird.co/v0/datasources" \ -d mode = 'append' \ -d name = 'data_source_name' \ -d url = 'https://.../data.csv' Replacing data with a local file [¶](https://www.tinybird.co/docs/about:blank#id8) curl \ -H "Authorization: Bearer " \ -X POST 
"https://api.tinybird.co/v0/datasources?name=data_source_name&mode=replace" \ -F csv = @local_file.csv Replacing data with a remote file from a URL [¶](https://www.tinybird.co/docs/about:blank#id9) curl \ -H "Authorization: Bearer " \ -X POST "https://api.tinybird.co/v0/datasources" \ -d mode = 'replace' \ -d name = 'data_source_name' \ --data-urlencode "url=http://example.com/file.csv" GET /v0/datasources/? [¶](https://www.tinybird.co/docs/about:blank#get--v0-datasources-?) getting a list of your Data Sources [¶](https://www.tinybird.co/docs/about:blank#id10) curl \ -H "Authorization: Bearer " \ -X GET "https://api.tinybird.co/v0/datasources" Get a list of the Data Sources in your account. The token you use to query the available Data Sources will determine what Data Sources get returned: only those accessible with the token you are using will be returned in the response. Successful response [¶](https://www.tinybird.co/docs/about:blank#id11) { "datasources": [{ "id": "t_a049eb516ef743d5ba3bbe5e5749433a", "name": "your_datasource_name", "cluster": "tinybird", "tags": {}, "created_at": "2019-11-13 13:53:05.340975", "updated_at": "2022-02-11 13:11:19.464343", "replicated": true, "version": 0, "project": null, "headers": {}, "shared_with": [ "89496c21-2bfe-4775-a6e8-97f1909c8fff" ], "engine": { "engine": "MergeTree", "engine_sorting_key": "example_column_1", "engine_partition_key": "", "engine_primary_key": "example_column_1" }, "description": "", "used_by": [], "type": "csv", "columns": [{ "name": "example_column_1", "type": "Date", "codec": null, "default_value": null, "jsonpath": null, "nullable": false, "normalized_name": "example_column_1" }, { "name": "example_column_2", "type": "String", "codec": null, "default_value": null, "jsonpath": null, "nullable": false, "normalized_name": "example_column_2" } ], "statistics": { "bytes": 77822, "row_count": 226188 }, "new_columns_detected": {}, "quarantine_rows": 0 }] }| Key | Type | Description | | --- | --- | --- | | attrs | String | comma separated list of the Data Source attributes to return in the response. Example: `attrs=name,id,engine` . Leave empty to return a full response | Note that the `statistics` ’s `bytes` and `row_count` attributes might be `null` depending on how the Data Source was created. POST /v0/datasources/(.+)/alter [¶](https://www.tinybird.co/docs/about:blank#post--v0-datasources-(.+)-alter) Modify the Data Source schema. This endpoint supports the operation to alter the following fields of a Data Source: | Key | Type | Description | | --- | --- | --- | | schema | String | Optional. Set the whole schema that adds new columns to the existing ones of a Data Source. | | description | String | Optional. Sets the description of the Data Source. | | kafka_store_raw_value | Boolean | Optional. Default: false. When set to true, the ‘value’ column of a Kafka Data Source will save the JSON as a raw string. | | kafka_store_headers | Boolean | Optional. Default: false. When set to true, the ‘headers’ of a Kafka Data Source will be saved as a binary map. | | ttl | String | Optional. Set to any value accepted in ClickHouse for a TTL or to ‘false’ to remove the TTL. | | dry | Boolean | Optional. Default: false. Set to true to show what would be modified in the Data Source, without running any modification at all. | The schema parameter can be used to add new columns at the end of the existing ones in a Data Source. 
Be aware that we currently don't validate whether the change will affect the existing MVs (Materialized Views) attached to the Data Source being modified, so this change may break existing MVs. For example, avoid changing a Data Source that has an MV created with something like `SELECT * FROM Data Source ...` . If you want MVs that are forward compatible with column additions, create them specifying the columns explicitly instead of using the `*` operator. Also, take into account that, for now, the only engines supporting adding new columns are those inside the MergeTree family. To add a column to a Data Source, call this endpoint with the Data Source name and the new schema definition. For example, given a Data Source created like this: Creating a Data Source from a schema [¶](https://www.tinybird.co/docs/about:blank#id14) curl \ -H "Authorization: Bearer " \ -X POST "https://api.tinybird.co/v0/datasources" \ -d "name=stocks" \ -d "schema=symbol String, date Date, close Float32" If you want to add a new column 'concept String', you need to call this endpoint with the new schema: Adding a new column to an existing Data Source [¶](https://www.tinybird.co/docs/about:blank#id15) curl \ -H "Authorization: Bearer " \ -X POST "https://api.tinybird.co/v0/datasources/stocks/alter" \ -d "schema=symbol String, date Date, close Float32, concept String" If everything went OK, you will get the operations performed in the response: ADD COLUMN operation resulting from the schema change. [¶](https://www.tinybird.co/docs/about:blank#id16) { "operations": [ "ADD COLUMN `concept` String" ] } You can also view the inferred operations without executing them by adding `dry=true` to the parameters. - To modify the description of a Data Source: Modifying the description of a Data Source [¶](https://www.tinybird.co/docs/about:blank#id17) curl \ -H "Authorization: Bearer " \ -X POST "https://api.tinybird.co/v0/datasources/stocks/alter" \ -d "name=stocks" \ -d "description=My new description" - To save the JSON as a raw string in the "value" column of a Kafka Data Source: Saving the raw string in the value column of a Kafka Data Source [¶](https://www.tinybird.co/docs/about:blank#id18) curl \ -H "Authorization: Bearer " \ -X POST "https://api.tinybird.co/v0/datasources/stocks/alter" \ -d "name=stocks" \ -d "kafka_store_raw_value=true" -d "kafka_store_headers=true" - To modify the TTL of a Data Source: Modifying the TTL of a Data Source [¶](https://www.tinybird.co/docs/about:blank#id19) curl \ -H "Authorization: Bearer " \ -X POST "https://api.tinybird.co/v0/datasources/stocks/alter" \ -d "name=stocks" \ -d "ttl=12 hours" - To remove the TTL of a Data Source: Removing the TTL of a Data Source [¶](https://www.tinybird.co/docs/about:blank#id20) curl \ -H "Authorization: Bearer " \ -X POST "https://api.tinybird.co/v0/datasources" \ -d "name=stocks" \ -d "ttl=false" - To add default values to the columns of a Data Source: Modifying default values [¶](https://www.tinybird.co/docs/about:blank#id21) curl \ -H "Authorization: Bearer " \ -X POST "https://api.tinybird.co/v0/datasources" \ -d "name=stocks" \ -d "schema=symbol String DEFAULT '-', date Date DEFAULT now(), close Float32 DEFAULT 1.1" - To add default values to the columns of an NDJSON Data Source, add the default definition after the jsonpath definition: Modifying default values in an NDJSON Data Source [¶](https://www.tinybird.co/docs/about:blank#id22) curl \ -H "Authorization: Bearer " \ -X POST "https://api.tinybird.co/v0/datasources" \ -d "name=stocks" \ -d "schema=symbol String `json:$.symbol` DEFAULT '-', date Date `json:$.date` DEFAULT now(), close `json:$.close` Float32 DEFAULT 1.1" - To make a column nullable, change the type of the column by adding the Nullable prefix to the old type: Converting column "close" to Nullable [¶](https://www.tinybird.co/docs/about:blank#id23) curl \ -H "Authorization: Bearer " \ -X POST "https://api.tinybird.co/v0/datasources" \ -d "name=stocks" \ -d "schema=symbol String `json:$.symbol`, date Date `json:$.date`, close `json:$.close` Nullable(Float32)" - To drop a column, just remove the column from the schema definition. It is not possible to remove columns that are part of the primary key or the partition key: Remove column "close" from the Data Source [¶](https://www.tinybird.co/docs/about:blank#id24) curl \ -H "Authorization: Bearer " \ -X POST "https://api.tinybird.co/v0/datasources" \ -d "name=stocks" \ -d "schema=symbol String `json:$.symbol`, date Date `json:$.date`" You can also alter the JSONPaths of existing Data Sources. In that case you have to specify the [JSONPath](https://www.tinybird.co/docs/docs/guides/ingesting-data/ingest-ndjson-data) in the schema in the same way as when you created the Data Source. POST /v0/datasources/(.+)/truncate [¶](https://www.tinybird.co/docs/about:blank#post--v0-datasources-(.+)-truncate) Truncates a Data Source in your account. If the Data Source has dependent Materialized Views, those **won't** be truncated in cascade. If you also want to delete data from dependent Materialized Views, you'll have to make a subsequent call to this method for each of them. The Auth token in use must have the `DATASOURCES:CREATE` scope. Truncating a Data Source [¶](https://www.tinybird.co/docs/about:blank#id25) curl \ -H "Authorization: Bearer " \ -X POST "https://api.tinybird.co/v0/datasources/:name/truncate" This works as well for the `quarantine` table of a Data Source. Remember that the quarantine table for a Data Source has the same name but with the "_quarantine" suffix. Truncating the quarantine table from a Data Source [¶](https://www.tinybird.co/docs/about:blank#id26) curl \ -H "Authorization: Bearer " \ -X POST "https://api.tinybird.co/v0/datasources/:name_quarantine/truncate" POST /v0/datasources/(.+)/delete [¶](https://www.tinybird.co/docs/about:blank#post--v0-datasources-(.+)-delete) Deletes rows from a Data Source in your account given a SQL condition. The Auth token in use must have the `DATASOURCES:CREATE` scope. Deleting rows from a Data Source given a SQL condition [¶](https://www.tinybird.co/docs/about:blank#id27) curl \ -H "Authorization: Bearer " \ --data "delete_condition=(country='ES')" \ "https://api.tinybird.co/v0/datasources/:name/delete" When deleting rows from a Data Source, the response will not be the final result of the deletion but a Job. You can check the job status and progress using the [Jobs API](https://www.tinybird.co/docs/docs/api-reference/jobs-api).
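Since the deletion runs asynchronously, a common pattern is to poll the returned job until it reaches a final status. A minimal shell sketch, assuming `jq` is installed and using `$TB_TOKEN` and `$JOB_ID` (the `job_id` from the delete response) as placeholders:

```bash
# Poll the Jobs API every few seconds until the delete job finishes
while true; do
  STATUS=$(curl -s -H "Authorization: Bearer $TB_TOKEN" \
    "https://api.tinybird.co/v0/jobs/$JOB_ID" | jq -r '.status')
  echo "delete job status: $STATUS"
  [ "$STATUS" = "done" ] || [ "$STATUS" = "error" ] && break
  sleep 5
done
```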
In the response, `id`, `job_id` , and `delete_id` should have the same value: Delete API Response [¶](https://www.tinybird.co/docs/about:blank#id28) { "id": "64e5f541-xxxx-xxxx-xxxx-00524051861b", "job_id": "64e5f541-xxxx-xxxx-xxxx-00524051861b", "job_url": "https://api.tinybird.co/v0/jobs/64e5f541-xxxx-xxxx-xxxx-00524051861b", "job": { "kind": "delete_data", "id": "64e5f541-xxxx-xxxx-xxxx-00524051861b", "job_id": "64e5f541-xxxx-xxxx-xxxx-00524051861b", "status": "waiting", "created_at": "2023-04-11 13:52:32.423207", "updated_at": "2023-04-11 13:52:32.423213", "started_at": null, "is_cancellable": true, "datasource": { "id": "t_c45d5ae6781b41278fcee365f5bxxxxx", "name": "shopping_data" }, "delete_condition": "event = 'search'" }, "status": "waiting", "delete_id": "64e5f541-xxxx-xxxx-xxxx-00524051861b" } To check on the progress of the delete job, use the `job_id` from the Delete API response to query the [Jobs API](https://www.tinybird.co/docs/docs/api-reference/jobs-api). For example, to check on the status of the above delete job: checking the status of the delete job [¶](https://www.tinybird.co/docs/about:blank#id29) curl \ -H "Authorization: Bearer " \ https://api.tinybird.co/v0/jobs/64e5f541-xxxx-xxxx-xxxx-00524051861b Would respond with: Job API Response [¶](https://www.tinybird.co/docs/about:blank#id30) { "kind": "delete_data", "id": "64e5f541-xxxx-xxxx-xxxx-00524051861b", "job_id": "64e5f541-xxxx-xxxx-xxxx-00524051861b", "status": "done", "created_at": "2023-04-11 13:52:32.423207", "updated_at": "2023-04-11 13:52:37.330020", "started_at": "2023-04-11 13:52:32.842861", "is_cancellable": false, "datasource": { "id": "t_c45d5ae6781b41278fcee365f5bc2d35", "name": "shopping_data" }, "delete_condition": " event = 'search'", "rows_affected": 100 } ### Data Source engines supported Tinybird uses ClickHouse as the underlying storage technology. ClickHouse features different strategies to store data, these different strategies define not only where and how the data is stored but what kind of data access, queries, and availability your data has. In ClickHouse terms, a Tinybird Data Source uses a [Table Engine](https://clickhouse.tech/docs/en/engines/table_engines/) that determines those factors. Currently, Tinybird supports deleting data for data sources with the following Engines: - MergeTree - ReplacingMergeTree - SummingMergeTree - AggregatingMergeTree - CollapsingMergeTree - VersionedCollapsingMergeTree ### Dependent views deletion If the Data Source has dependent Materialized Views, those won’t be cascade deleted. In case you want to delete data from other dependent Materialized Views, you’ll have to do a subsequent call to this method for the affected view with a proper `delete_condition` . This applies as well to the associated `quarantine` Data Source. | KEY | TYPE | DESCRIPTION | | --- | --- | --- | | delete_condition | String | Mandatory. A string representing the WHERE SQL clause you’d add to a regular DELETE FROM WHERE statement. Most of the times you might want to write a simple `delete_condition` such as `column_name=value` but any valid SQL statement including conditional operators is valid | | dry_run | String | Default: `false` . It allows you to test the deletion. When using `true` it will execute all deletion validations and return number of matched `rows_to_be_deleted` . 
| GET /v0/datasources/(.+) [¶](https://www.tinybird.co/docs/about:blank#get--v0-datasources-(.+)) Getting information about a particular Data Source [¶](https://www.tinybird.co/docs/about:blank#id32) curl \ -H "Authorization: Bearer " \ -X GET "https://api.tinybird.co/v0/datasources/datasource_name" Get Data Source information and stats. The token provided must have read access to the Data Source. Successful response [¶](https://www.tinybird.co/docs/about:blank#id33) { "id": "t_bd1c62b5e67142bd9bf9a7f113a2b6ea", "name": "datasource_name", "statistics": { "bytes": 430833, "row_count": 3980 }, "used_by": [{ "id": "t_efdc62b5e67142bd9bf9a7f113a34353", "name": "pipe_using_datasource_name" }] "updated_at": "2018-09-07 23:50:32.322461", "created_at": "2018-11-28 23:50:32.322461", "type": "csv" }| Key | Type | Description | | --- | --- | --- | | attrs | String | comma separated list of the Data Source attributes to return in the response. Example: `attrs=name,id,engine` . Leave empty to return a full response | `id` and `name` are two ways to refer to the Data Source in SQL queries and API endpoints. The only difference is that the `id` never changes; it will work even if you change the `name` (which is the name used to display the Data Source in the UI). In general you can use `id` or `name` indistinctively: Using the above response as an example: `select count(1) from events_table` is equivalent to `select count(1) from t_bd1c62b5e67142bd9bf9a7f113a2b6ea` The id `t_bd1c62b5e67142bd9bf9a7f113a2b6ea` is not a descriptive name so you can add a description like `t_my_events_datasource.bd1c62b5e67142bd9bf9a7f113a2b6ea` The `statistics` property contains information about the table. Those numbers are an estimation: `bytes` is the estimated data size on disk and `row_count` the estimated number of rows. These statistics are updated whenever data is appended to the Data Source. The `used_by` property contains the list of pipes that are using this data source. Only Pipe `id` and `name` are sent. The `type` property indicates the `format` used when the Data Source was created. Available formats are `csv`, `ndjson` , and `parquet` . The Data Source `type` indicates what file format you can use to ingest data. DELETE /v0/datasources/(.+) [¶](https://www.tinybird.co/docs/about:blank#delete--v0-datasources-(.+)) Dropping a Data Source [¶](https://www.tinybird.co/docs/about:blank#id35) curl \ -H "Authorization: Bearer " \ -X DELETE "https://api.tinybird.co/v0/datasources/:name" Drops a Data Source from your account. | Key | Type | Description | | --- | --- | --- | | force | String | Default: `false` . The `force` parameter is taken into account when trying to delete Materialized Views. By default, when using `false` the deletion will not be carried out; you can enable it by setting it to `true` . If the given Data Source is being used as the trigger of a Materialized Node, it will not be deleted in any case. | | dry_run | String | Default: `false` . It allows you to test the deletion. When using `true` it will execute all deletion validations and return the possible affected materializations and other dependencies of a given Data Source. | | token | String | Auth token. Only required if no Bearer Authorization header is sent. It must have `DROP:datasource_name` scope for the given Data Source. 
| PUT /v0/datasources/(.+) [¶](https://www.tinybird.co/docs/about:blank#put--v0-datasources-(.+)) Update Data Source attributes Updating the name of a Data Source [¶](https://www.tinybird.co/docs/about:blank#id37) curl \ -H "Authorization: Bearer " \ -X PUT "https://api.tinybird.co/v0/datasources/:name?name=new_name" Promoting a Data Source to a Snowflake one [¶](https://www.tinybird.co/docs/about:blank#id38) curl \ -H "Authorization: Bearer " \ -X PUT "https://api.tinybird.co/v0/datasources/:name" \ -d "connector=1d8232bf-2254-4d68-beff-4dd9aa505ab0" \ -d "service=snowflake" \ -d "cron=*/30 * * * *" \ -d "query=select a, b, c from test" \ -d "mode=replace" \ -d "external_data_source=database.schema.table" \ -d "ingest_now=True" \| Key | Type | Description | | --- | --- | --- | | name | String | new name for the Data Source | | token | String | Auth token. Only required if no Bearer Authorization header is sent. It should have `DATASOURCES:CREATE` scope for the given Data Source | | connector | String | Connector ID to link it to | | service | String | Type of service to promote it to. Only ‘snowflake’ or ‘bigquery’ allowed | | cron | String | Cron-like pattern to execute the connector’s job | | query | String | Optional: custom query to collect from the external data source | | mode | String | Only replace is allowed for connectors | | external_data_source | String | External data source to use for Snowflake | | ingest_now | Boolean | To ingest the data immediately instead of waiting for the first execution determined by cron | --- URL: https://www.tinybird.co/docs/api-reference/environment-variables-api Last update: 2024-10-17T14:29:53.000Z Content: --- title: "Environment Variables API · Tinybird Docs" theme-color: "#171612" description: "The Environment Variables API allows you to create, update, delete and list environment variables that can be used in Pipes in a Workspace." --- # Environment Variables API¶ The Environment Variables API allows you to create, update, delete, and list environment variables that can be used in Pipes in a Workspace. Environment variables allow you to store sensitive information, such as access secrets and hostnames, in your Workspace. Using the Environment Variables API requires a Workspace admin token. Environment variables are encrypted at rest. ## Environment variables types¶ The Environment Variables API support different types of environment variables: | Environment variable type | Comments | | --- | --- | | `secret` | Used to store passwords and other secrets, automatically prevents Endpoint from exposing its value. It's the default type. | More types of environment variables types will be added soon. ## API Limits¶ Check the [limits page](https://www.tinybird.co/docs/docs/support/limits) for limits on ingestion, queries, API Endpoints, and more. The Environment Variables API has limits of: - 5 requests per second. - 100 environment variables per Workspace. - 8KB max size of the `value` attribute. ## Templating¶ Once you've created environment variables in a Workspace, you can use the `tb_secret` template function to replace the original value: % SELECT * FROM postgresql('host:post', 'database', 'table', 'user', {{tb_secret('pg_password')}}) Environment variables values are rendered as `String` data type. If you need to use a different type, use any of the ClickHouse® functions to cast a String value to a given type. 
For example: % SELECT * FROM table WHERE int_value = toUInt8({{tb_secret('int_secret')}}) ## Staging and production use¶ If you have staging and production Workspaces, create the same environment variables with the same name in both Workspaces, changing only their corresponding value. Tinybird doesn't allow you to create an API Endpoint when exposing environment variables with `type=secret` in a SELECT clause. So, while it's possible to have a Node that uses the logic `SELECT {{tb_secret('username')}}` , you can't publish that Node as a Copy Pipe or API Endpoint. ## Branch use¶ Environment variables can be used in Branches, but they must be created in the main Workspace initially. Environment variables have the same value in the main Workspace as in the Branches. You cannot create a environment variable in a Branch to be deployed in the main Workspace. ## POST /v0/variables/?¶ Creates a new environment variable. ### Restrictions¶ Environment variables names are unique for a Workspace. ### Example¶ curl \ -X POST "https://$TB_HOST/v0/variables" \ -H "Authorization: Bearer " \ -d "type=secret" \ -d "name=test_password" \ -d "value=test" ### Request parameters¶ | Key | Type | Description | | --- | --- | --- | | type | String (optional) | The type of the variable. Defaults to `secret` | | name | String | The name of the variable | | value | String | The variable value | ### Successful response example¶ { "name": "test_token", "created_at": "2024-06-21T10:27:57", "updated_at": "2024-06-21T10:27:57", "edited_by": "token: 'admin token'" } ### Response codes¶ | Code | Description | | --- | --- | | 200 | OK | | 400 | Invalid or missing parameters | | 403 | Limit reached or invalid token | | 404 | Workspace not found | ## DELETE /v0/variables/(.\+)¶ Deletes a environment variable. ### Example¶ curl \ -X DELETE "https://$TB_HOST/v0/variables/test_password" \ -H "Authorization: Bearer " ### Successful response example¶ { "ok": true } ### Response codes¶ | Code | Description | | --- | --- | | 200 | OK | | 400 | Invalid or missing parameters | | 403 | Limit reached or token invalid | | 404 | Workspace or variable not found | ## PUT /v0/variables/(.\+)¶ Updates a environment variable. ### Example¶ curl \ -X PUT "https://$TB_HOST/v0/variables/test_password" \ -H "Authorization: Bearer " \ -d "value=new_value" ### Successful response example¶ { "name": "test_password", "type": "secret", "created_at": "2024-06-21T10:27:57", "updated_at": "2024-06-21T10:29:57", "edited_by": "token: 'admin token'" } ### Response codes¶ | Code | Description | | --- | --- | | 200 | OK | | 400 | Invalid or missing parameters | | 403 | Limit reached or token invalid | | 404 | Workspace or variable not found | ## GET /v0/variables/?¶ Retrieves all Workspace environment variables. The value is not returned. ### Example¶ curl \ -X GET "https://$TB_HOST/v0/variables" \ -H "Authorization: Bearer " ### Successful response example¶ { "variables": [ { "name": "test_token", "type": "secret", "created_at": "2024-06-21T10:27:57", "updated_at": "2024-06-21T10:27:57", "edited_by": "token: 'admin token'" }, { "name": "test_token2", "type": "secret", "created_at": "2024-06-21T10:27:57", "updated_at": "2024-06-21T10:29:57", "edited_by": "token: 'admin token'" } ] } ### Response codes¶ | Code | Description | | --- | --- | | 200 | OK | | 400 | Invalid or missing parameters | | 403 | Limit reached or token invalid | | 404 | Workspace not found | ## GET /v0/variables/(.\+)¶ Fetches information about a particular environment variable. 
The value is not returned. ### Example¶ curl \ -X GET "https://$TB_HOST/v0/variables/test_password" \ -H "Authorization: Bearer " ### Successful response example¶ { "name": "test_password", "type": "secret", "created_at": "2024-06-21T10:27:57", "updated_at": "2024-06-21T10:27:57", "edited_by": "token: 'admin token'" } ### Response codes¶ | Code | Description | | --- | --- | | 200 | OK | | 400 | Invalid or missing parameters | | 403 | Limit reached or token invalid | | 404 | Workspace or variable not found | Tinybird is not affiliated with, associated with, or sponsored by ClickHouse, Inc. ClickHouse® is a registered trademark of ClickHouse, Inc. --- URL: https://www.tinybird.co/docs/api-reference/events-api Last update: 2024-10-28T11:06:14.000Z Content: --- title: "Events API Reference · Tinybird Docs" theme-color: "#171612" description: "With the Tinybird Events API you can ingest thousands of JSON events per second." --- # Events API¶ The Events API allows you to ingest JSON events with a simple HTTP POST request. [Read more about the Events API](https://www.tinybird.co/docs/docs/ingest/events-api). All endpoints require authentication using a Token with `DATASOURCE:APPEND` or `DATASOURCE:CREATE` scope. ## POST /v0/events¶ Use this endpoint to send NDJSON (new-line delimited JSON) events to a [Data Source](https://www.tinybird.co/docs/docs/concepts/data-sources). **Request parameters** | Key | Type | Description | | --- | --- | --- | | name | String | Name or ID of the target Data Source to append data to | | wait | Boolean | 'false' by default. Set to 'true' to wait until the write is acknowledged by the database. Enabling this flag makes it possible to retry on database errors, but it introduces additional latency. It's recommended to enable it in use cases where avoiding data loss is critical, and to disable it otherwise. | **Return HTTP status codes** | Status Code | Description | | --- | --- | | 200 | The data has been inserted into the database. The write has been acknowledged. The request 'wait' parameter was enabled. | | 202 | The data has been processed, and it will be sent to the database eventually. The write has not been acknowledged yet. The request 'wait' parameter was disabled. | | 400 | The request is invalid. The body will contain more information. A common cause is missing the 'name' parameter. No data has been inserted, but the request shouldn't be retried. | | 403 | The token is not valid. The request shouldn't be retried. | | 404 | The token's Workspace doesn't belong to this cluster. The Workspace is probably removed or in another cluster. The request shouldn't be retried; ensure the token's region and the Tinybird domain match. | | 422 | The ingestion has been partially completed due to an error in a Materialized View. Retrying may result in data duplication, but not retrying may result in data loss. The general advice is to not retry, review attached Materialized Views, and contact us if the issue persists. | | 429 | The request/second limit has been reached. The default limit is 1000 requests/second; contact us for increased capacity. The request may be retried after a while; we recommend using exponential backoff with a limited number of retries. | | 500 | An unexpected error has occurred. The body will contain more information. Retrying is the general advice; contact us if the issue persists. | | 503 | The service is temporarily unavailable. The body may contain more information.
A common cause is to have reached a throughput limit, or to have attached a Materialized View with an issue. No data has been inserted, and it's safe to retry. Contact with us if the issue persists. | | 0x07 GOAWAY | HTTP2 only. Too many alive connections. Recreate the connection and retry. | **Compression** You can compress JSON events with Gzip and send the compressed payload to the Events API. You must include the header `Content-Encoding: gzip` with the request. **Examples** ##### Send individual JSON messages curl \ -H "Authorization: Bearer " \ -d '{"date": "2020-04-05 00:05:38", "city": "Chicago"}' \ 'https://api.tinybird.co/v0/events?name=events_test' ##### Send many NDJSON events. Notice the '$' before the JSON events. It's needed in order for Bash to replace the '\\n'. curl doesn't do it automatically. curl \ -H "Authorization: Bearer " \ -d $'{"date": "2020-04-05 00:05:38", "city": "Chicago"}\n{"date": "2020-04-05 00:07:22", "city": "Madrid"}\n' \ 'https://api.tinybird.co/v0/events?name=events_test' ##### Send a Gzip compressed payload, where 'body.gz' is a batch of NDJSON events curl \ -H "Authorization: Bearer " \ -H "Content-Encoding: gzip" \ --data-binary @body.gz \ 'https://api.tinybird.co/v0/events?name=events_example' --- URL: https://www.tinybird.co/docs/api-reference/jobs-api Last update: 2024-07-31T16:54:46.000Z Content: --- title: "Jobs API Reference · Tinybird Docs" theme-color: "#171612" description: "With the Jobs API, you can list the jobs for the last 48 hours or the last 100 jobs and also get the details for a specific job." --- GET /v0/jobs/? [¶](https://www.tinybird.co/docs/about:blank#get--v0-jobs-?) We can get a list of the last 100 jobs in the last 48 hours, with the possibility of filtering them by kind, status, pipe_id, pipe_name, created_after, and created_before. | Key | Type | Description | | --- | --- | --- | | kind | String | This will return only the jobs with that particular kind. Example: `kind=populateview` or `kind=copy` or `kind=import` | | status | String | This will return only the jobs with the status provided. Example: `status=done` or `status=waiting` or `status=working` or `status=error` | | pipe_id | String | This will return only the jobs associated with the provided pipe id. Example: `pipe_id=t_31a0ff508c9843b59c32f7f81a156968` | | pipe_name | String | This will return only the jobs associated with the provided pipe name. Example: `pipe_name=test_pipe` | | created_after | String | This will return jobs that were created after the provided date in the ISO 8601 standard date format. Example: `created_after=2023-06-15T18:13:25.855Z` | | created_before | String | This will return jobs that were created before the provided date in the ISO 8601 standard date format. Example: `created_before=2023-06-19T18:13:25.855Z` | Getting the latest jobs [¶](https://www.tinybird.co/docs/about:blank#id2) curl \ -H "Authorization: Bearer " \ "https://api.tinybird.co/v0/jobs" \ You will get a list of jobs with the `kind`, `status`, `id` , and the `url` to access the specific information about that job. 
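You can also narrow the list with the filter parameters described above. For example, a sketch that lists only failed import jobs (`$TB_TOKEN` is a placeholder for your Auth token):

```bash
# List only import jobs that ended in error
curl \
  -H "Authorization: Bearer $TB_TOKEN" \
  "https://api.tinybird.co/v0/jobs?kind=import&status=error"
```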
Jobs list [¶](https://www.tinybird.co/docs/about:blank#id3) { "jobs": [ { "id": "c8ae13ef-e739-40b6-8bd5-b1e07c8671c2", "kind": "import", "status": "done", "created_at": "2020-12-04 15:08:33.214377", "updated_at": "2020-12-04 15:08:33.396286", "job_url": "https://api.tinybird.co/v0/jobs/c8ae13ef-e739-40b6-8bd5-b1e07c8671c2", "datasource": { "id": "t_31a0ff508c9843b59c32f7f81a156968", "name": "my_datasource_1" } }, { "id": "1f6a5a3d-cfcb-4244-ba0b-0bfa1d1752fb", "kind": "import", "status": "error", "created_at": "2020-12-04 15:08:09.051310", "updated_at": "2020-12-04 15:08:09.263055", "job_url": "https://api.tinybird.co/v0/jobs/1f6a5a3d-cfcb-4244-ba0b-0bfa1d1752fb", "datasource": { "id": "t_49806938714f4b72a225599cdee6d3ab", "name": "my_datasource_2" } } ] } Job details in `job_url` will be available for 48h after its creation. POST /v0/jobs/(.+)/cancel [¶](https://www.tinybird.co/docs/about:blank#post--v0-jobs-(.+)-cancel) With this endpoint you can try to cancel an existing Job. All jobs can be cancelled if they are in the "waiting" status, but you can't cancel a Job in "done" or "error" status. In the case of "populate" jobs, you can also cancel them in the "working" state. After successfully starting the cancellation process, you will see one of two different statuses in the job: - "cancelling": The Job can't be immediately cancelled as it's doing some work, but the cancellation will eventually happen. - "cancelled": The Job has been completely cancelled. A Job cancellation doesn't guarantee a complete rollback of the changes being made by it; sometimes you will need to delete newly inserted rows or newly created Data Sources. The fastest way to know if a job is cancellable is to read the "is_cancellable" key inside the job JSON description. Depending on the Job and its status, when you try to cancel it you may get different responses: - HTTP Code 200: The Job has successfully started the cancellation process. Remember that if the Job now has a "cancelling" status, it may need some time to completely cancel itself. This request will return the status of the job. - HTTP Code 404: Job not found. - HTTP Code 403: The token provided doesn't have access to this Job. - HTTP Code 400: Job is not in a cancellable status or you are trying to cancel a job which is already in the "cancelling" state. Try to cancel a Job [¶](https://www.tinybird.co/docs/about:blank#id4) curl \ -H "Authorization: Bearer " \ -X POST "https://api.tinybird.co/v0/jobs/:job_id/cancel" Populate Job in cancelling state right after the cancellation request.
[¶](https://www.tinybird.co/docs/about:blank#id5) { "kind": "populateview", "id": "32c3438d-582e-4a6f-9b57-7d7a3bfbeb8c", "job_id": "32c3438d-582e-4a6f-9b57-7d7a3bfbeb8c", "status": "cancelling", "created_at": "2021-03-17 18:56:23.939380", "updated_at": "2021-03-17 18:56:44.343245", "is_cancellable": false, "datasource": { "id": "t_02043945875b4070ae975f3812444b76", "name": "your_datasource_name", "cluster": null, "tags": {}, "created_at": "2020-07-15 10:55:12.427269", "updated_at": "2020-07-15 10:55:12.427270", "statistics": null, "replicated": false, "version": 0, "project": null, "used_by": [] }, "query_id": "01HSZ9WJES5QEZZM4TGDD3YFZ2", "pipe_id": "t_7fa8009023a245b696b4f2f7195b23c3", "pipe_name": "top_product_per_day", "queries": [ { "query_id": "01HSZ9WJES5QEZZM4TGDD3YFZ2", "status": "done" }, { "query_id": "01HSZ9WY6QS6XAMBHZMSNB1G75", "status": "done" }, { "query_id": "01HSZ9X8YVEQ0PXA6T2HZQFFPX", "status": "working" }, { "query_id": "01HSZQ5YX0517X81JBF9G9HB2P", "status": "waiting" }, { "query_id": "01HSZQ6PZJA3P81RC6Q6EF6HMK", "status": "waiting" }, { "query_id": "01HSZQ76D7YYFB16TFT32KXMCY", "status": "waiting" } ], "progress_percentage": 50.0 } GET /v0/jobs/(.+) [¶](https://www.tinybird.co/docs/about:blank#get--v0-jobs-(.+)) Get the details of a specific Job. You can get the details of a Job by using its ID. Get the details of a Job [¶](https://www.tinybird.co/docs/about:blank#id6) curl \ -H "Authorization: Bearer " \ "https://api.tinybird.co/v0/jobs/:job_id" You will get a JSON response with the details of the Job, including the `kind`, `status`, `id`, `created_at`, `updated_at` , and the `datasource` associated with the Job. This is available for 48h after the Job creation. After that, you can consult the Job details in the Service Data Source jobs_log. Job details [¶](https://www.tinybird.co/docs/about:blank#id7) { "kind": "import", "id": "d5b869ed-3a74-45f9-af54-57350aae4cef", "job_id": "d5b869ed-3a74-45f9-af54-57350aae4cef", "status": "done", "created_at": "2024-07-22 11:47:58.207606", "updated_at": "2024-07-22 11:48:52.971327", "started_at": "2024-07-22 11:47:58.351734", "is_cancellable": false, "mode": "append", "datasource": { "id": "t_caf95c54174e48f488ea65d181eb5b75", "name": "events", "cluster": "default", "tags": { }, "created_at": "2024-07-22 11:47:51.807384", "updated_at": "2024-07-22 11:48:52.726243", "replicated": true, "version": 0, "project": null, "headers": { "cached_delimiter": "," }, "shared_with": [ ], "engine": { "engine": "MergeTree", "partition_key": "toYear(date)", "sorting_key": "date, user_id, event, extra_data" }, "description": "", "used_by": [ ], "last_commit": { "content_sha": "", "status": "changed", "path": "" }, "errors_discarded_at": null, "type": "csv" }, "import_id": "d5b869ed-3a74-45f9-af54-57350aae4cef", "url": "https://storage.googleapis.com/tinybird-assets/datasets/guides/events_50M_1.csv", "statistics": { "bytes": 1592720209, "row_count": 50000000 }, "quarantine_rows": 0, "invalid_lines": 0 } --- URL: https://www.tinybird.co/docs/api-reference/overview Last update: 2024-10-28T11:06:14.000Z Content: --- title: "API Overview · Tinybird Docs" theme-color: "#171612" description: "Tinybird's API Endpoints, such as the Data Sources API to import Data, the Pipes API to transform data and publish the results through API Endpoints, and the Query API to run arbitrary queries." --- # API Overview¶ You can control Tinybird services using the API as an alternative to the UI and CLI. 
The following APIs are available: | API name | Description | | --- | --- | | [ Analyze API](https://www.tinybird.co/docs/docs/api-reference/analyze-api) | Analyze a given NDJSON, CSV, or Parquet file to generate a Tinybird Data Source schema. | | [ Data Sources API](https://www.tinybird.co/docs/docs/api-reference/datasource-api) | List, create, update, or delete your Tinybird Data Sources, and insert or delete data from Data Sources. | | [ Events API](https://www.tinybird.co/docs/docs/api-reference/events-api) | Ingest NDJSON events with a simple HTTP POST request. | | [ Jobs API](https://www.tinybird.co/docs/docs/api-reference/jobs-api) | Get details on Tinybird jobs, and list the jobs for the last 48 hours or the last 100 jobs. | | [ Pipes API](https://www.tinybird.co/docs/docs/api-reference/pipe-api/overview) | Interact with your Pipes, including Pipes themselves, API Endpoints, Materialized Views, and managing Copy jobs. | | [ Query API](https://www.tinybird.co/docs/docs/api-reference/query-api) | Query your Pipes and Data Sources inside Tinybird as if you were running SQL statements against a regular database. | | [ Environment Variables API](https://www.tinybird.co/docs/docs/api-reference/environment-variables-api) | Create, update, delete, and list variables that can be used in Pipes in a Workspace. | | [ Sink Pipes API](https://www.tinybird.co/docs/docs/api-reference/sink-pipes-api) | Create, delete, schedule, and trigger Sink Pipes. | | [ Tokens API](https://www.tinybird.co/docs/docs/api-reference/token-api) | List, create, update, or delete your Tinybird Static Tokens. | Make all requests to Tinybird's API Endpoints over TLS (HTTPS). All response bodies, including errors, are encoded as JSON. You can get information on several Workspace operations by [monitoring your jobs](https://www.tinybird.co/docs/docs/monitoring/jobs) , using either the APIs or the built-in Tinybird Service Data Sources. ## Regions and endpoints¶ A Workspace belongs to one region. The API for each region has a specific API base URL that you use to make API requests. The following table lists the current regions and their corresponding API base URLs: **Current Tinybird regions** | Region | Provider | Provider region | API base URL | | --- | --- | --- | --- | | Europe | GCP | europe-west3 | [ https://api.tinybird.co](https://api.tinybird.co/) | | US East | GCP | us-east4 | [ https://api.us-east.tinybird.co](https://api.us-east.tinybird.co/) | | Europe | AWS | eu-central-1 | [ https://api.eu-central-1.aws.tinybird.co](https://api.eu-central-1.aws.tinybird.co/) | | US East | AWS | us-east-1 | [ https://api.us-east.aws.tinybird.co](https://api.us-east.aws.tinybird.co/) | | US West | AWS | us-west-2 | [ https://api.us-west-2.aws.tinybird.co](https://api.us-west-2.aws.tinybird.co/) | Tinybird documentation uses `https://api.tinybird.co` as the default example API base URL in code snippets. If you are not using the Europe GCP region, replace the sample URL with the API base URL for your region. ## Authentication¶ Tinybird makes use of Tokens for every API call. This ensures that each user or application can only access data that they are authorized to access. See [Tokens](https://www.tinybird.co/docs/docs/concepts/auth-tokens). You must make all API requests over HTTPS. Don't make calls over plain HTTP or send API requests without authentication. 
Replace the Tinybird API hostname/region with the [right API URL region](https://www.tinybird.co/docs/docs/api-reference/overview#regions-and-endpoints) that matches your Workspace. Your Token lives in the Workspace under "Tokens". There are two ways to authenticate your requests in the Tinybird API: using an authorization header, or using a URL parameter. ### 1. Authorization header¶ You can send a Bearer authorization header to authenticate API calls. With curl, use `-H "Authorization: Bearer "`. If you have a valid Token with read access to the particular Data Source, you can get a successful response by sending the following request: ##### Authorization header Authenticated request curl \ -X GET \ -H "Authorization: Bearer " \ "https://api.tinybird.co/v0/sql?q=SELECT+*+FROM+" ### 2. URL parameter¶ You can also specify the Token using a parameter in the URL, using `token=` . For example: ##### URL parameter authenticated request curl -X GET \ "https://api.tinybird.co/v0/sql?q=SELECT+*+FROM+&token=" ## Compression¶ To compress API responses, add `Accept-Encoding: gzip` to your requests. For example: ##### Request with compressed response curl \ -X GET \ -H "Authorization: Bearer " \ -H "Accept-Encoding: gzip" \ "https://api.tinybird.co/v0/sql?q=SELECT+*+FROM+" ## Errors¶ Tinybird 's API returns standard HTTP success or error status codes. In case of errors, responses include additional information in JSON format. The following table lists the error status codes. **Response codes** | Code | Description | | --- | --- | | 400 | Bad request. This could be due to a missing parameter in a request, for instance | | 403 | Forbidden. Provided auth token doesn't have the right scope or the Data Source is not available | | 404 | Not found | | 405 | HTTP Method not allowed | | 408 | Request timeout (e.g. query execution time was exceeded) | | 409 | You need to resubmit the request due to a conflict with the current state of the target source (e.g.: you need to delete a Materialized View) | | 411 | No valid Content-Length header containing the length of the message-body | | 413 | The message body is too large | | 429 | Too many requests. When over the rate limits of your account | | 500 | Unexpected error | ## Limits¶ Check the [limits page](https://www.tinybird.co/docs/docs/support/limits) for limits on ingestion, queries, API Endpoints, and more. ## Versioning¶ All Tinybird APIs are versioned with a version string specified in the base URL. We encourage you to always use the latest API available. When versioning our web services, Tinybird adheres to [semantic versioning](https://semver.org/) rules. ## Reserved words¶ The following keywords are reserved words. You can't use them to name Data Sources, Pipes, Nodes or Workspaces. Case is ignored. 
- `Array` - `Boolean` - `Date` - `Date32` - `DateTime` - `DateTime32` - `DateTime64` - `Decimal` - `Decimal128` - `Decimal256` - `Decimal32` - `Decimal64` - `Enum` - `Enum16` - `Enum8` - `FixedString` - `Float32` - `Float64` - `IPv4` - `IPv6` - `Int128` - `Int16` - `Int256` - `Int32` - `Int64` - `Int8` - `MultiPolygon` - `Point` - `Polygon` - `Ring` - `String` - `TABLE` - `UInt128` - `UInt16` - `UInt256` - `UInt32` - `UInt64` - `UInt8` - `UUID` - `_temporary_and_external_tables` - `add` - `after` - `all` - `and` - `anti` - `any` - `array` - `as` - `asc` - `asof` - `between` - `by` - `case` - `collate` - `column` - `columns` - `cross` - `cube` - `custom_error` - `day_diff` - `default` - `defined` - `desc` - `distinct` - `else` - `end` - `enumerate_with_last` - `error` - `exists` - `from` - `full` - `functions` - `generateRandom` - `global` - `group` - `having` - `if` - `ilike` - `in` - `inner` - `insert` - `interval` - `into` - `join` - `left` - `like` - `limit` - `limits` - `max` - `min` - `not` - `null` - `numbers_mt` - `on` - `one` - `or` - `order` - `outer` - `prewhere` - `public` - `right` - `sample` - `select` - `semi` - `split_to_array` - `sql_and` - `sql_unescape` - `system` - `table` - `then` - `tinybird` - `to` - `union` - `using` - `where` - `with` - `zeros_mt` Pipe, Data Source and Node names are globally unique. You can't use an alias for a column that matches a globally unique name. --- URL: https://www.tinybird.co/docs/api-reference/pipe-api/api-endpoints Last update: 2024-07-31T16:54:46.000Z Content: --- title: "Pipes API > API Endpoints Reference · Tinybird Docs" theme-color: "#171612" description: "The Pipes API enables you to manage your Pipes. Use the API Endpoints service to publish or unpublish your Pipes as API Endpoints." --- POST /v0/pipes/(.+)/nodes/(.+)/endpoint [¶](https://www.tinybird.co/docs/about:blank#post--v0-pipes-(.+)-nodes-(.+)-endpoint) Publishes an API endpoint Publishing an endpoint [¶](https://www.tinybird.co/docs/about:blank#id1) curl -X POST \ -H "Authorization: Bearer " \ "https://api.tinybird.co/v0/pipes/:pipe/nodes/:node/endpoint" Successful response [¶](https://www.tinybird.co/docs/about:blank#id2) { "id": "t_60d8f84ce5d349b28160013ce99758c7", "name": "my_pipe", "description": "this is my pipe description", "nodes": [{ "id": "t_bd1e095da943494d9410a812b24cea81", "name": "get_all", "sql": "SELECT * FROM my_datasource", "description": "This is a description for the **first** node", "materialized": null, "cluster": null, "dependencies": [ "my_datasource" ], "tags": {}, "created_at": "2019-09-03 19:56:03.704840", "updated_at": "2019-09-04 07:05:53.191437", "version": 0, "project": null, "result": null, "ignore_sql_errors": false }], "endpoint": "t_bd1e095da943494d9410a812b24cea81", "created_at": "2019-09-03 19:56:03.193446", "updated_at": "2019-09-10 07:18:39.797083", "parent": null } The response will contain a `token` if there’s a **unique READ token** for this pipe. You could use this token to share your endpoint. | Code | Description | | --- | --- | | 200 | No error | | 400 | Wrong node id | | 403 | Forbidden. 
Provided token doesn’t have permissions to publish a pipe, it needs `ADMIN` or `PIPE:CREATE` | | 404 | Pipe not found | DELETE /v0/pipes/(.+)/nodes/(.+)/endpoint [¶](https://www.tinybird.co/docs/about:blank#delete--v0-pipes-(.+)-nodes-(.+)-endpoint) Unpublishes an API endpoint Unpublishing an endpoint [¶](https://www.tinybird.co/docs/about:blank#id4) curl -X DELETE \ -H "Authorization: Bearer " \ "https://api.tinybird.co/v0/pipes/:pipe/nodes/:node/endpoint"| Code | Description | | --- | --- | | 200 | No error | | 400 | Wrong node id | | 403 | Forbidden. Provided token doesn’t have permissions to publish a pipe, it needs `ADMIN` or `PIPE:CREATE` | | 404 | Pipe not found | --- URL: https://www.tinybird.co/docs/api-reference/pipe-api/copy-pipes-api Last update: 2024-07-31T16:54:46.000Z Content: --- title: "Pipes API > Copy Pipes Reference · Tinybird Docs" theme-color: "#171612" description: "The Pipes API enables you to manage your Pipes. Use the Copy Pipes service to create, delete, schedule, and trigger Copy jobs." --- POST /v0/pipes/(.+)/nodes/(.+)/copy [¶](https://www.tinybird.co/docs/about:blank#post--v0-pipes-(.+)-nodes-(.+)-copy) Calling this endpoint sets the pipe as a Copy one with the given settings. Scheduling is optional. To run the actual copy after you set the pipe as a Copy one, you must call the POST `/v0/pipes/:pipe/copy` endpoint. If you need to change the target Data Source or the scheduling configuration, you can call PUT endpoint. Restrictions: - You can set only one schedule per Copy pipe. - You can’t set a Copy pipe if the pipe is already materializing. You must unlink the Materialization first. - You can’t set a Copy pipe if the pipe is already an endpoint. You must unpublish the endpoint first. Setting the pipe as a Copy pipe [¶](https://www.tinybird.co/docs/about:blank#id1) curl -X POST \ -H "Authorization: Bearer " \ "https://api.tinybird.co/v0/pipes/:pipe/nodes/:node/copy" \ -d "target_datasource=my_destination_datasource" \ -d "schedule_cron=*/15 * * * *"| Key | Type | Description | | --- | --- | --- | | token | String | Auth token. Ensure it has the `PIPE:CREATE` and `DATASOURCE:APPEND` scopes on it | | target_datasource | String | Name or the id of the target Data Source. | | schedule_cron | String | Optional. A crontab expression. | Successful response [¶](https://www.tinybird.co/docs/about:blank#id3) { "id": "t_3aa11a5cabd1482c905bc8dfc551a84d", "name": "my_copy_pipe", "description": "This is a pipe to copy", "type": "copy", "endpoint": null, "created_at": "2023-03-01 10:14:04.497505", "updated_at": "2023-03-01 10:34:19.113518", "parent": null, "copy_node": "t_33ec8ac3c3324a53822fded61a83dbbd", "copy_target_datasource": "t_0be6161a5b7b4f6180b10325643e0b7b", "copy_target_workspace": "5a70f2f5-9635-47bf-96a9-7b50362d4e2f", "nodes": [{ "id": "t_33ec8ac3c3324a53822fded61a83dbbd", "name": "emps", "sql": "SELECT * FROM employees WHERE starting_date > '2016-01-01 00:00:00'", "description": null, "materialized": null, "cluster": null, "mode": "append", "tags": { "copy_target_datasource": "t_0be6161a5b7b4f6180b10325643e0b7b", "copy_target_workspace": "5a70f2f5-9635-47bf-96a9-7b50362d4e2f" }, "created_at": "2023-03-01 10:14:04.497547", "updated_at": "2023-03-01 10:14:04.497547", "version": 0, "project": null, "result": null, "ignore_sql_errors": false, "dependencies": [ "employees" ], "params": [] }] } DELETE /v0/pipes/(.+)/nodes/(.+)/copy [¶](https://www.tinybird.co/docs/about:blank#delete--v0-pipes-(.+)-nodes-(.+)-copy) Removes the Copy type of the pipe. 
Removing the Copy type deletes neither the node nor the pipe. The pipe remains, but its schedule and copy settings no longer apply. Unsetting the pipe as a Copy pipe [¶](https://www.tinybird.co/docs/about:blank#id4) curl -X DELETE \ -H "Authorization: Bearer " \ "https://api.tinybird.co/v0/pipes/:pipe/nodes/:node/copy"| Code | Description | | --- | --- | | 204 | No error | | 400 | Wrong node id | | 403 | Forbidden. Provided token doesn’t have permissions to publish a pipe, it needs `ADMIN` or `PIPE:CREATE` | | 404 | Pipe not found | PUT /v0/pipes/(.+)/nodes/(.+)/copy [¶](https://www.tinybird.co/docs/about:blank#put--v0-pipes-(.+)-nodes-(.+)-copy) Calling this endpoint updates a Copy pipe with the given settings: you can change its target Data Source, as well as add or modify its schedule. Updating a Copy Pipe [¶](https://www.tinybird.co/docs/about:blank#id6) curl -X PUT \ -H "Authorization: Bearer " \ "https://api.tinybird.co/v0/pipes/:pipe/nodes/:node/copy" \ -d "target_datasource=other_destination_datasource" \ -d "schedule_cron=*/15 * * * *"| Key | Type | Description | | --- | --- | --- | | token | String | Auth token. Ensure it has the `PIPE:CREATE` scope on it | | target_datasource | String | Optional. Name or the id of the target Data Source. | | schedule_cron | String | Optional. A crontab expression. If `schedule_cron='None'` , the schedule, if previously defined, is removed from the Copy pipe | Successful response [¶](https://www.tinybird.co/docs/about:blank#id8) { "id": "t_3aa11a5cabd1482c905bc8dfc551a84d", "name": "my_copy_pipe", "description": "This is a pipe to copy", "type": "copy", "endpoint": null, "created_at": "2023-03-01 10:14:04.497505", "updated_at": "2023-03-01 10:34:19.113518", "parent": null, "copy_node": "t_33ec8ac3c3324a53822fded61a83dbbd", "copy_target_datasource": "t_2f046a4b2cc44137834a35420a533465", "copy_target_workspace": "5a70f2f5-9635-47bf-96a9-7b50362d4e2f", "nodes": [{ "id": "t_33ec8ac3c3324a53822fded61a83dbbd", "name": "emps", "sql": "SELECT * FROM employees WHERE starting_date > '2016-01-01 00:00:00'", "description": null, "materialized": null, "cluster": null, "mode": "append", "tags": { "copy_target_datasource": "t_2f046a4b2cc44137834a35420a533465", "copy_target_workspace": "5a70f2f5-9635-47bf-96a9-7b50362d4e2f" }, "created_at": "2023-03-01 10:14:04.497547", "updated_at": "2023-03-07 09:08:34.206123", "version": 0, "project": null, "result": null, "ignore_sql_errors": false, "dependencies": [ "employees" ], "params": [] }] } POST /v0/pipes/(.+)/copy [¶](https://www.tinybird.co/docs/about:blank#post--v0-pipes-(.+)-copy) Runs a copy job using the settings previously set in the pipe. You can use this URL for an on-demand copy; the scheduler also uses it for the programmed calls. The URL accepts parameters, just like a regular endpoint. This operation is asynchronous and copies the output of the endpoint to an existing Data Source. Runs a copy job on a Copy pipe [¶](https://www.tinybird.co/docs/about:blank#id9) curl -H "Authorization: Bearer " \ -X POST "https://api.tinybird.co/v0/pipes/:pipe/copy?param1=test&param2=test2"| Key | Type | Description | | --- | --- | --- | | token | String | Auth token. Ensure it has the `PIPE:READ` scope on it | | parameters | String | Optional. The value of the parameters to run the Copy with. They are regular URL query parameters. | | _mode | String | Optional. One of `append` or `replace` . Default is `append` .
| | Code | Description | | --- | --- | | 200 | No error | | 400 | Pipe is not a Copy pipe or there is a problem with the SQL query | | 400 | The columns in the SQL query don’t match the columns in the target Data Source | | 403 | Forbidden. The provided token doesn’t have permissions to append a node to the pipe ( `ADMIN` or `PIPE:READ` and `DATASOURCE:APPEND` ) | | 403 | Job limits exceeded. Tried to copy more than 100M rows, or there are too many active (working and waiting) Copy jobs. | | 404 | Pipe not found, Node not found or Target Data Source not found | The response will not be the final result of the copy but a Job. You can check the job status and progress using the [Jobs API](https://www.tinybird.co/docs/docs/api-reference/jobs-api). Successful response [¶](https://www.tinybird.co/docs/about:blank#id12) { "id": "t_33ec8ac3c3324a53822fded61a83dbbd", "name": "emps", "sql": "SELECT * FROM employees WHERE starting_date > '2016-01-01 00:00:00'", "description": null, "materialized": null, "cluster": null, "tags": { "copy_target_datasource": "t_0be6161a5b7b4f6180b10325643e0b7b", "copy_target_workspace": "5a70f2f5-9635-47bf-96a9-7b50362d4e2f" }, "created_at": "2023-03-01 10:14:04.497547", "updated_at": "2023-03-01 10:14:04.497547", "version": 0, "project": null, "result": null, "ignore_sql_errors": false, "dependencies": [ "employees" ], "params": [], "job": { "kind": "copy", "id": "f0b2f107-0af8-4c28-a83b-53053cb45f0f", "job_id": "f0b2f107-0af8-4c28-a83b-53053cb45f0f", "status": "waiting", "created_at": "2023-03-01 10:41:07.398102", "updated_at": "2023-03-01 10:41:07.398128", "started_at": null, "is_cancellable": true, "datasource": { "id": "t_0be6161a5b7b4f6180b10325643e0b7b" }, "query_id": "19a8d613-b424-4afd-95f1-39cfbd87e827", "query_sql": "SELECT * FROM d_b0ca70.t_25f928e33bcb40bd8e8999e69cb02f94 AS employees WHERE starting_date > '2016-01-01 00:00:00'", "pipe_id": "t_3aa11a5cabd1482c905bc8dfc551a84d", "pipe_name": "copy_emp", "job_url": "https://api.tinybird.co/v0/jobs/f0b2f107-0af8-4c28-a83b-53053cb45f0f" } } POST /v0/pipes/(.+)/copy/pause [¶](https://www.tinybird.co/docs/about:blank#post--v0-pipes-(.+)-copy-pause) Pauses the scheduling. This affects any future scheduled Copy job. Any copy operation currently copying data will be completed. Pauses a scheduled copy [¶](https://www.tinybird.co/docs/about:blank#id13) curl -X POST \ -H "Authorization: Bearer " \ "https://api.tinybird.co/v0/pipes/:pipe/copy/pause"| Code | Description | | --- | --- | | 200 | Scheduled copy paused correctly | | 400 | Pipe is not copy | | 404 | Pipe not found, Scheduled copy for pipe not found | POST /v0/pipes/(.+)/copy/resume [¶](https://www.tinybird.co/docs/about:blank#post--v0-pipes-(.+)-copy-resume) Resumes a previously paused scheduled copy. Resumes a Scheduled copy [¶](https://www.tinybird.co/docs/about:blank#id15) curl -X POST \ -H "Authorization: Bearer " \ "https://api.tinybird.co/v0/pipes/:pipe/copy/resume"| Code | Description | | --- | --- | | 200 | Scheduled copy resumed correctly | | 400 | Pipe is not copy | | 404 | Pipe not found, Scheduled copy for pipe not found | POST /v0/pipes/(.+)/copy/cancel [¶](https://www.tinybird.co/docs/about:blank#post--v0-pipes-(.+)-copy-cancel) Cancels jobs that are working or waiting that are tied to the pipe and pauses the scheduling of copy jobs for this pipe. To allow scheduled copy jobs to run for the pipe, the copy pipe must be resumed and the already cancelled jobs will not be resumed. 
Cancels scheduled copy jobs tied to the pipe [¶](https://www.tinybird.co/docs/about:blank#id17) curl -X POST \ -H "Authorization: Bearer " \ "https://api.tinybird.co/v0/pipes/:pipe/copy/cancel"| Code | Description | | --- | --- | | 200 | Scheduled copy pipe cancelled correctly | | 400 | Pipe is not copy | | 400 | Job is not in cancellable status | | 400 | Job is already being cancelled | | 404 | Pipe not found, Scheduled copy for pipe not found | Successful response [¶](https://www.tinybird.co/docs/about:blank#id19) { "id": "t_fb56a87a520441189a5a6d61f8d968f4", "name": "scheduled_copy_pipe", "description": "none", "endpoint": "none", "created_at": "2023-06-09 10:54:21.847433", "updated_at": "2023-06-09 10:54:21.897854", "parent": "none", "type": "copy", "copy_node": "t_bb96e50cb1b94ffe9e598f870d88ad1b", "copy_target_datasource": "t_3f7e6534733f425fb1add6229ca8be4b", "copy_target_workspace": "8119d519-80b2-454a-a114-b092aea3b9b0", "schedule": { "timezone": "Etc/UTC", "cron": "0 * * * *", "status": "paused" }, "nodes": [ { "id": "t_bb96e50cb1b94ffe9e598f870d88ad1b", "name": "scheduled_copy_pipe_0", "sql": "SELECT * FROM landing_ds", "description": "none", "materialized": "none", "cluster": "none", "tags": { "copy_target_datasource": "t_3f7e6534733f425fb1add6229ca8be4b", "copy_target_workspace": "8119d519-80b2-454a-a114-b092aea3b9b0" }, "created_at": "2023-06-09 10:54:21.847451", "updated_at": "2023-06-09 10:54:21.847451", "version": 0, "project": "none", "result": "none", "ignore_sql_errors": "false", "node_type": "copy", "dependencies": [ "landing_ds" ], "params": [] } ], "cancelled_jobs": [ { "id": "ced3534f-8b5e-4fe0-8dcc-4369aa256b11", "kind": "copy", "status": "cancelled", "created_at": "2023-06-09 07:54:21.921446", "updated_at": "2023-06-09 10:54:22.043272", "job_url": "https://api.tinybird.co/v0/jobsjobs/ced3534f-8b5e-4fe0-8dcc-4369aa256b11", "is_cancellable": "false", "pipe_id": "t_fb56a87a520441189a5a6d61f8d968f4", "pipe_name": "pipe_test_scheduled_copy_pipe_cancel_multiple_jobs", "datasource": { "id": "t_3f7e6534733f425fb1add6229ca8be4b", "name": "target_ds_test_scheduled_copy_pipe_cancel_multiple_jobs" } }, { "id": "b507ded9-9862-43ae-8863-b6de17c3f914", "kind": "copy", "status": "cancelling", "created_at": "2023-06-09 07:54:21.903036", "updated_at": "2023-06-09 10:54:22.044837", "job_url": "https://api.tinybird.co/v0/jobsb507ded9-9862-43ae-8863-b6de17c3f914", "is_cancellable": "false", "pipe_id": "t_fb56a87a520441189a5a6d61f8d968f4", "pipe_name": "pipe_test_scheduled_copy_pipe_cancel_multiple_jobs", "datasource": { "id": "t_3f7e6534733f425fb1add6229ca8be4b", "name": "target_ds_test_scheduled_copy_pipe_cancel_multiple_jobs" } } ] } --- URL: https://www.tinybird.co/docs/api-reference/pipe-api/materialized-views Last update: 2024-07-31T16:54:46.000Z Content: --- title: "Pipes API > Materialized Views and Populates Reference · Tinybird Docs" theme-color: "#171612" description: "The Pipes API enables you to manage your Pipes. Use the Materialized Views and Populates service to create, delete, or populate Materialized Views." --- POST /v0/pipes/(.+)/nodes/(.+)/population [¶](https://www.tinybird.co/docs/about:blank#post--v0-pipes-(.+)-nodes-(.+)-population) Populates a Materialized View Populating a Materialized View [¶](https://www.tinybird.co/docs/about:blank#id1) curl -H "Authorization: Bearer " \ -X POST "https://api.tinybird.co/v0/pipes/:pipe/nodes/:node/population" \ -d "populate_condition=toYYYYMM(date) = 202203" The response will not be the final result of the import but a Job. 
You can check the job status and progress using the [Jobs API](https://www.tinybird.co/docs/docs/api-reference/jobs-api). Alternatively you can use a query like this to check the operations related to the populate Job: Check populate jobs in the datasources_ops_log including dependent Materialized Views triggered [¶](https://www.tinybird.co/docs/about:blank#id2) SELECT * FROM tinybird. datasources_ops_log WHERE timestamp > now () - INTERVAL 1 DAY AND operation_id IN ( SELECT operation_id FROM tinybird. datasources_ops_log WHERE timestamp > now () - INTERVAL 1 DAY and datasource_id = '{the_datasource_id}' and job_id = '{the_job_id}' ) ORDER BY timestamp ASC When a populate job fails for the first time, the Materialized View is automatically unlinked. In that case you can get failed population jobs and their errors to fix them with a query like this: Check failed populate jobs [¶](https://www.tinybird.co/docs/about:blank#id3) SELECT * FROM tinybird. datasources_ops_log WHERE datasource_id = '{the_datasource_id}' AND pipe_name = '{the_pipe_name}' AND event_type LIKE 'populateview%' AND result = 'error' ORDER BY timestamp ASC Alternatively you can use the `unlink_on_populate_error='true'` flag to always unlink the Materialized View if the populate job does not work as expected. | Key | Type | Description | | --- | --- | --- | | token | String | Auth token. Ensure it has the `PIPE:CREATE` scope on it | | populate_subset | Float | Optional. Populate with a subset percent of the data (limited to a maximum of 2M rows), this is useful to quickly test a materialized node with some data. The subset must be greater than 0 and lower than 0.1. A subset of 0.1 means a 10 percent of the data in the source Data Source will be used to populate the Materialized View. It has precedence over `populate_condition` | | populate_condition | String | Optional. Populate with a SQL condition to be applied to the trigger Data Source of the Materialized View. For instance, `populate_condition='date == toYYYYMM(now())'` it’ll populate taking all the rows from the trigger Data Source which `date` is the current month. `populate_condition` is not taken into account if the `populate_subset` param is present. Including in the `populate_condition` any column present in the Data Source `engine_sorting_key` will make the populate job process less data. | | truncate | String | Optional. Default is `false` . Populates over existing data, useful to populate past data while new data is being ingested. Use `true` to truncate the Data Source before populating. | | unlink_on_populate_error | String | Optional. Default is `false` . If the populate job fails the Materialized View is unlinked and new data won’t be ingested in the Materialized View. | | Code | Description | | --- | --- | | 200 | No error | | 400 | Node is not materialized | | 403 | Forbidden. Provided token doesn’t have permissions to append a node to the pipe, it needs `ADMIN` or `PIPE:CREATE` | | 404 | Pipe not found, Node not found | POST /v0/pipes/(.+)/nodes/(.+)/materialization [¶](https://www.tinybird.co/docs/about:blank#post--v0-pipes-(.+)-nodes-(.+)-materialization) Creates a Materialized View Creating a Materialized View [¶](https://www.tinybird.co/docs/about:blank#id6) curl \ -H "Authorization: Bearer " \ -X POST "https://api.tinybird.co/v0/pipes/:pipe/nodes/:node/materialization?datasource=my_data_source_name&populate=true"| Key | Type | Description | | --- | --- | --- | | token | String | Auth token. 
Ensure it has the `PIPE:CREATE` scope on it | | datasource | String | Required. Specifies the name of the destination Data Source where the Materialized View schema is defined. If the Data Source does not exist, it's created automatically with the default settings. | | override_datasource | Boolean | Optional. Default `false` . When the target Data Source of the Materialized View already exists in the Workspace, it's overridden by the `datasource` specified in the request. | | populate | Boolean | Optional. Default `false` . When `true` , a job is triggered to populate the destination Data Source. | | populate_subset | Float | Optional. Populate with a subset percent of the data (limited to a maximum of 2M rows); this is useful to quickly test a materialized node with some data. The subset must be greater than 0 and lower than 0.1. A subset of 0.1 means 10 percent of the data in the source Data Source will be used to populate the Materialized View. Use it together with `populate=true` ; it takes precedence over `populate_condition` | | populate_condition | String | Optional. Populate with a SQL condition to be applied to the trigger Data Source of the Materialized View. For instance, `populate_condition='date == toYYYYMM(now())'` populates using all the rows from the trigger Data Source whose `date` is in the current month. Use it together with `populate=true` . `populate_condition` is not taken into account if the `populate_subset` param is present. Including in the `populate_condition` any column present in the Data Source `engine_sorting_key` will make the populate job process less data. | | unlink_on_populate_error | String | Optional. Default is `false` . If the populate job fails, the Materialized View is unlinked and new data won't be ingested into the Materialized View. | | engine | String | Optional. Engine for the destination Materialized View. If the Data Source already exists, the settings are not overridden. | | engine_* | String | Optional. Engine parameters and options. Requires the `engine` parameter. If the Data Source already exists, the settings are not overridden.[ Check Engine Parameters and Options for more details](https://www.tinybird.co/docs/docs/api-reference/datasource-api) | The SQL query for the materialized node must be sent in the body, encoded in UTF-8 | Code | Description | | --- | --- | | 200 | No error | | 400 | Node already being materialized | | 403 | Forbidden. Provided token doesn’t have permissions to append a node to the pipe, it needs `ADMIN` or `PIPE:CREATE` | | 404 | Pipe not found, Node not found | | 409 | The Materialized View already exists or `override_datasource` cannot be performed | DELETE /v0/pipes/(.+)/nodes/(.+)/materialization [¶](https://www.tinybird.co/docs/about:blank#delete--v0-pipes-(.+)-nodes-(.+)-materialization) Removes a Materialized View. Removing a Materialized View deletes neither the Data Source nor the Node. The Data Source will still be present, but it will stop receiving data from the Node. Removing a Materialized View [¶](https://www.tinybird.co/docs/about:blank#id9) curl -H "Authorization: Bearer " \ -X DELETE "https://api.tinybird.co/v0/pipes/:pipe/nodes/:node/materialization"| Code | Description | | --- | --- | | 204 | No error, Materialized View removed | | 403 | Forbidden.
Provided token doesn’t have permissions to append a node to the pipe, it needs `ADMIN` or `PIPE:CREATE` | | 404 | Pipe not found, Node not found | --- URL: https://www.tinybird.co/docs/api-reference/pipe-api/overview Last update: 2024-10-28T11:06:14.000Z Content: --- title: "Pipes API Reference · Tinybird Docs" theme-color: "#171612" description: "The Pipe API enables you to manage your Pipes. With Pipes you can transform data via SQL queries and publish the results of those queries as API Endpoints." --- GET /v0/pipes/? [¶](https://www.tinybird.co/docs/about:blank#get--v0-pipes-?) Get a list of pipes in your account. getting a list of your pipes [¶](https://www.tinybird.co/docs/about:blank#id1) curl -X GET \ -H "Authorization: Bearer " \ "https://api.tinybird.co/v0/pipes" Pipes in the response will be the ones that are accessible using a particular token with read permissions for them. Successful response [¶](https://www.tinybird.co/docs/about:blank#id2) { "pipes": [{ "id": "t_55c39255e6b548dd98cb6da4b7d62c1c", "name": "my_pipe", "description": "This is a description", "endpoint": "t_h65c788b42ce4095a4789c0d6b0156c3", "created_at": "2022-11-10 12:39:38.106380", "updated_at": "2022-11-29 13:33:40.850186", "parent": null, "nodes": [{ "id": "t_h65c788b42ce4095a4789c0d6b0156c3", "name": "my_node", "sql": "SELECT col_a, col_b FROM my_data_source", "description": null, "materialized": null, "cluster": null, "tags": {}, "created_at": "2022-11-10 12:39:47.852303", "updated_at": "2022-11-10 12:46:54.066133", "version": 0, "project": null, "result": null, "ignore_sql_errors": false "node_type": "default" }], "url": "https://api.tinybird.co/v0/pipes/my_pipe.json" }] }| Key | Type | Description | | --- | --- | --- | | dependencies | boolean | The response will include the nodes dependent data sources and pipes, default is `false` | | attrs | String | comma separated list of the pipe attributes to return in the response. Example: `attrs=name,description` | | node_attrs | String | comma separated list of the node attributes to return in the response. Example `node_attrs=id,name` | Pipes id’s are immutable so you can always refer to them in your 3rd party applications to make them compatible with Pipes once they are renamed. For lighter JSON responses consider using the `attrs` and `node_attrs` params to return exactly the attributes you need to consume. POST /v0/pipes/? [¶](https://www.tinybird.co/docs/about:blank#post--v0-pipes-?) Creates a new Pipe. There are 3 ways to create a Pipe Creating a Pipe providing full JSON [¶](https://www.tinybird.co/docs/about:blank#id4) curl -X POST \ -H "Authorization: Bearer " \ -H "Content-Type: application/json" \ "https://api.tinybird.co/v0/pipes" \ -d '{ "name":"pipe_name", "description": "my first pipe", "nodes": [ {"sql": "select * from my_datasource limit 10", "name": "node_00", "description": "sampled data" }, {"sql": "select count() from node_00", "name": "node_01" } ] }' If you prefer to create the minimum Pipe, and then append your transformation nodes you can set your name and first transformation node’s SQL in your POST request Creating a pipe with a name and a SQL query [¶](https://www.tinybird.co/docs/about:blank#id5) curl -X POST \ -H "Authorization: Bearer " \ "https://api.tinybird.co/v0/pipes?name=pipename&sql=select%20*%20from%20events" Pipes can be also created as copies of other Pipes. 
Just use the `from` argument: Creating a pipe from another pipe [¶](https://www.tinybird.co/docs/about:blank#id6) curl -X POST \ -H "Authorization: Bearer " \ "https://api.tinybird.co/v0/pipes?name=pipename&from=src_pipe" Bear in mind, if you use this method to overwrite an existing Pipe, the endpoint will only be maintained if the node name is the same. POST /v0/pipes/(.+)/nodes [¶](https://www.tinybird.co/docs/about:blank#post--v0-pipes-(.+)-nodes) Appends a new node to a Pipe. adding a new node to a pipe [¶](https://www.tinybird.co/docs/about:blank#id7) curl \ -H "Authorization: Bearer " \ -d 'select * from node_0' "https://api.tinybird.co/v0/pipes/:name/nodes?name=node_name&description=explanation" Appends a new node that creates a Materialized View adding a Materialized View using a materialized node [¶](https://www.tinybird.co/docs/about:blank#id8) curl \ -H "Authorization: Bearer " \ -d 'select id, sum(amount) as amount, date from my_datasource' "https://api.tinybird.co/v0/pipes/:name/nodes?name=node_name&description=explanation&type=materialized&datasource=new_datasource&engine=AggregatingMergeTree"| Key | Type | Description | | --- | --- | --- | | name | String | The referenceable name for the node. | | description | String | Use it to store a more detailed explanation of the node. | | token | String | Auth token. Ensure it has the `PIPE:CREATE` scope on it | | type | String | Optional. Available options are { `standard` (default), `materialized` , `endpoint` }. Use `materialized` to create a Materialized View from your new node. | | datasource | String | Required with `type=materialized` . Specifies the name of the destination Data Source where the Materialized View schema is defined. | | override_datasource | Boolean | Optional. Default `false` When the target Data Source of the Materialized View exists in the Workspace it’ll be overriden by the `datasource` specified in the request. | | populate | Boolean | Optional. Default `false` . When `true` , a job is triggered to populate the destination Data Source. | | populate_subset | Float | Optional. Populate with a subset percent of the data (limited to a maximum of 2M rows), this is useful to quickly test a materialized node with some data. The subset must be greater than 0 and lower than 0.1. A subset of 0.1 means a 10 percent of the data in the source Data Source will be used to populate the Materialized View. Use it together with `populate=true` , it has precedence over `populate_condition` | | populate_condition | String | Optional. Populate with a SQL condition to be applied to the trigger Data Source of the Materialized View. For instance, `populate_condition='date == toYYYYMM(now())'` it’ll populate taking all the rows from the trigger Data Source which `date` is the current month. Use it together with `populate=true` . `populate_condition` is not taken into account if the `populate_subset` param is present. Including in the `populate_condition` any column present in the Data Source `engine_sorting_key` will make the populate job process less data. | | unlink_on_populate_error | String | Optional. Default is `false` . If the populate job fails the Materialized View is unlinked and new data won’t be ingested in the Materialized View. | | engine | String | Optional. Engine for destination Materialized View. Requires the `type` parameter as `materialized` . | | engine_* | String | Optional. Engine parameters and options. 
Requires the `type` parameter as `materialized` and the `engine` parameter.[ Check Engine Parameters and Options for more details](https://www.tinybird.co/docs/docs/api-reference/datasource-api) | The SQL query for the transformation node must be sent in the body, encoded in UTF-8 | Code | Description | | --- | --- | | 200 | No error | | 400 | Empty or wrong SQL or API param value | | 403 | Forbidden. Provided token doesn’t have permissions to append a node to the pipe, it needs `ADMIN` or `PIPE:CREATE` | | 404 | Pipe not found | | 409 | There’s another resource with the same name, names must be unique | The Materialized View already exists | `override_datasource` cannot be performed | DELETE /v0/pipes/(.+)/nodes/(.+) [¶](https://www.tinybird.co/docs/about:blank#delete--v0-pipes-(.+)-nodes-(.+)) Drops a particular transformation node in the Pipe. It does not remove related nodes, so this could leave the Pipe in an inconsistent state. For security reasons, enabled nodes can’t be removed. removing a node from a pipe [¶](https://www.tinybird.co/docs/about:blank#id11) curl -X DELETE "https://api.tinybird.co/v0/pipes/:name/nodes/:node_id"| Code | Description | | --- | --- | | 204 | No error, removed node | | 400 | The node is published. Published nodes can’t be removed | | 403 | Forbidden. Provided token doesn’t have permissions to change the last node of the pipe, it needs ADMIN or IMPORT | | 404 | Pipe not found | PUT /v0/pipes/(.+)/nodes/(.+) [¶](https://www.tinybird.co/docs/about:blank#put--v0-pipes-(.+)-nodes-(.+)) Changes a particular transformation node in the Pipe. Editing a Pipe’s transformation node [¶](https://www.tinybird.co/docs/about:blank#id13) curl -X PUT \ -H "Authorization: Bearer " \ -d 'select * from node_0' "https://api.tinybird.co/v0/pipes/:name/nodes/:node_id?name=new_name&description=updated_explanation"| Key | Type | Description | | --- | --- | --- | | name | String | New name for the node | | description | String | New description for the node | | token | String | Auth token. Ensure it has the `PIPE:CREATE` scope on it | Note that the desired SQL query should be sent in the body, encoded in UTF-8. | Code | Description | | --- | --- | | 200 | No error | | 400 | Empty or wrong SQL | | 403 | Forbidden. Provided token doesn’t have permissions to change the last node of the pipe, it needs `ADMIN` or `PIPE:CREATE` | | 404 | Pipe not found | | 409 | There’s another resource with the same name, names must be unique | GET /v0/pipes/(.+)\.(json|csv|ndjson|parquet) [¶](https://www.tinybird.co/docs/about:blank#get--v0-pipes-(.+)%5C.(json%7Ccsv%7Cndjson%7Cparquet)) Returns the published node data in a particular format. Getting data for a pipe [¶](https://www.tinybird.co/docs/about:blank#pipe-get-data) curl -X GET \ -H "Authorization: Bearer " \ "https://api.tinybird.co/v0/pipes/:name.format"| Key | Type | Description | | --- | --- | --- | | q | String | Optional. Query to execute, see the API Query endpoint | | output_format_json_quote_64bit_integers | int | (Optional) Controls quoting of 64-bit or bigger integers (like UInt64 or Int128) when they are output in a JSON format. Such integers are enclosed in quotes by default. This behavior is compatible with most JavaScript implementations. Possible values: 0 — Integers are output without quotes. 1 — Integers are enclosed in quotes.
Default value is 0 | | output_format_json_quote_denormals | int | (Optional) Controls representation of inf and nan on the UI instead of null e.g when dividing by 0 - inf and when there is no representation of a number in Javascript - nan. Possible values: 0 - disabled, 1 - enabled. Default value is 0 | | output_format_parquet_string_as_string | int | (Optional) Use Parquet String type instead of Binary for String columns. Possible values: 0 - disabled, 1 - enabled. Default value is 0 | The `q` parameter is a SQL query (see [Query API](https://www.tinybird.co/docs/docs/api-reference/query-api) ). When using this endpoint to query your Pipes, you can use the `_` shortcut, which refers to your Pipe name | format | Description | | --- | --- | | csv | CSV with header | | json | JSON including data, statistics and schema information | | ndjson | One JSON object per each row | | parquet | A Parquet file. Some libraries might not properly process `UInt*` data types, if that’s your case cast those columns to signed integers with `toInt*` functions. `String` columns are exported as `Binary` , take that into account when reading the resulting Parquet file, most libraries convert from Binary to String (e.g. Spark has this configuration param: `spark.sql.parquet.binaryAsString` ) | POST /v0/pipes/(.+)\.(json|csv|ndjson|parquet) [¶](https://www.tinybird.co/docs/about:blank#post--v0-pipes-(.+)%5C.(json%7Ccsv%7Cndjson%7Cparquet)) Returns the published node data in a particular format, passing the parameters in the request body. Use this endpoint when the query is too long to be passed as a query string parameter. When using the post endpoint, there are no traces of the parameters in the pipe_stats_rt Data Source. See the get endpoint for more information. GET /v0/pipes/(.+\.pipe) [¶](https://www.tinybird.co/docs/about:blank#get--v0-pipes-(.+%5C.pipe)) Get pipe information. Provided Auth Token must have read access to the Pipe. Getting information about a particular pipe [¶](https://www.tinybird.co/docs/about:blank#id16) curl -X GET \ -H "Authorization: Bearer " \ "https://api.tinybird.co/v0/pipes/:name" `pipe_id` and `pipe_name` are two ways to refer to the pipe in SQL queries and API endpoints the only difference is `pipe_id` never changes so it’ll work even if you change the `pipe_name` (which is the name used to display the pipe). In general you can use `pipe_id` or `pipe_name` indistinctly: Successful response [¶](https://www.tinybird.co/docs/about:blank#id17) { "id": "t_bd1c62b5e67142bd9bf9a7f113a2b6ea", "name": "events_pipe", "pipeline": { "nodes": [{ "name": "events_ds_0" "sql": "select * from events_ds_log__raw", "materialized": false }, { "name": "events_ds", "sql": "select * from events_ds_0 where valid = 1", "materialized": false }] } } You can make your Pipe’s id more descriptive by prepending information such as `t_my_events_table.bd1c62b5e67142bd9bf9a7f113a2b6ea` DELETE /v0/pipes/(.+\.pipe) [¶](https://www.tinybird.co/docs/about:blank#delete--v0-pipes-(.+%5C.pipe)) Drops a Pipe from your account. Auth token in use must have the `DROP:NAME` scope. Dropping a pipe [¶](https://www.tinybird.co/docs/about:blank#id18) curl -X DELETE \ -H "Authorization: Bearer " \ "https://api.tinybird.co/v0/pipes/:name" PUT /v0/pipes/(.+\.pipe) [¶](https://www.tinybird.co/docs/about:blank#put--v0-pipes-(.+%5C.pipe)) Changes Pipe’s metadata. When there is another Pipe with the same name an error is raised. 
editing a pipe [¶](https://www.tinybird.co/docs/about:blank#id19) curl -X PUT \ -H "Authorization: Bearer " \ "https://api.tinybird.co/v0/pipes/:name?name=new_name"| Key | Type | Description | | --- | --- | --- | | name | String | new name for the pipe | | description | String | new Markdown description for the pipe | | token | String | Auth token. Ensure it has the `PIPE:CREATE` scope on it | GET /v0/pipes/(.+) [¶](https://www.tinybird.co/docs/about:blank#get--v0-pipes-(.+)) Get pipe information. Provided Auth Token must have read access to the Pipe. Getting information about a particular pipe [¶](https://www.tinybird.co/docs/about:blank#id21) curl -X GET \ -H "Authorization: Bearer " \ "https://api.tinybird.co/v0/pipes/:name" `pipe_id` and `pipe_name` are two ways to refer to the pipe in SQL queries and API endpoints the only difference is `pipe_id` never changes so it’ll work even if you change the `pipe_name` (which is the name used to display the pipe). In general you can use `pipe_id` or `pipe_name` indistinctly: Successful response [¶](https://www.tinybird.co/docs/about:blank#id22) { "id": "t_bd1c62b5e67142bd9bf9a7f113a2b6ea", "name": "events_pipe", "pipeline": { "nodes": [{ "name": "events_ds_0" "sql": "select * from events_ds_log__raw", "materialized": false }, { "name": "events_ds", "sql": "select * from events_ds_0 where valid = 1", "materialized": false }] } } You can make your Pipe’s id more descriptive by prepending information such as `t_my_events_table.bd1c62b5e67142bd9bf9a7f113a2b6ea` DELETE /v0/pipes/(.+) [¶](https://www.tinybird.co/docs/about:blank#delete--v0-pipes-(.+)) Drops a Pipe from your account. Auth token in use must have the `DROP:NAME` scope. Dropping a pipe [¶](https://www.tinybird.co/docs/about:blank#id23) curl -X DELETE \ -H "Authorization: Bearer " \ "https://api.tinybird.co/v0/pipes/:name" PUT /v0/pipes/(.+) [¶](https://www.tinybird.co/docs/about:blank#put--v0-pipes-(.+)) Changes Pipe’s metadata. When there is another Pipe with the same name an error is raised. editing a pipe [¶](https://www.tinybird.co/docs/about:blank#id24) curl -X PUT \ -H "Authorization: Bearer " \ "https://api.tinybird.co/v0/pipes/:name?name=new_name"| Key | Type | Description | | --- | --- | --- | | name | String | new name for the pipe | | description | String | new Markdown description for the pipe | | token | String | Auth token. Ensure it has the `PIPE:CREATE` scope on it | --- URL: https://www.tinybird.co/docs/api-reference/query-api Last update: 2024-10-28T11:06:14.000Z Content: --- title: "Query API Reference · Tinybird Docs" theme-color: "#171612" description: "The Query API allows you to query your Pipes inside Tinybird as if you were running SQL statements against a regular database." --- GET /v0/sql [¶](https://www.tinybird.co/docs/about:blank#get--v0-sql) Executes a SQL query using the engine. Running sql queries against your data [¶](https://www.tinybird.co/docs/about:blank#id1) curl --data "SELECT * FROM " https://api.tinybird.co/v0/sql As a response, it gives you the query metadata, the resulting data and some performance statistics. 
Successful response [¶](https://www.tinybird.co/docs/about:blank#id2) { "meta": [ { "name": "VendorID", "type": "Int32" }, { "name": "tpep_pickup_datetime", "type": "DateTime" } ], "data": [ { "VendorID": 2, "tpep_pickup_datetime": "2001-01-05 11:45:23", "tpep_dropoff_datetime": "2001-01-05 11:52:05", "passenger_count": 5, "trip_distance": 1.53, "RatecodeID": 1, "store_and_fwd_flag": "N", "PULocationID": 71, "DOLocationID": 89, "payment_type": 2, "fare_amount": 7.5, "extra": 0.5, "mta_tax": 0.5, "tip_amount": 0, "tolls_amount": 0, "improvement_surcharge": 0.3, "total_amount": 8.8 }, { "VendorID": 2, "tpep_pickup_datetime": "2002-12-31 23:01:55", "tpep_dropoff_datetime": "2003-01-01 14:59:11" } ], "rows": 3, "rows_before_limit_at_least": 4, "statistics": { "elapsed": 0.00091042, "rows_read": 4, "bytes_read": 296 } } Data can be fetched in different formats. Just append `FORMAT ` to your SQL query: Requesting different formats with SQL [¶](https://www.tinybird.co/docs/about:blank#id3) SELECT count () from < pipe > FORMAT JSON| Key | Type | Description | | --- | --- | --- | | q | String | The SQL query | | pipeline | String | (Optional) The name of the pipe. It allows writing a query like ‘SELECT * FROM _’ where ‘_’ is a placeholder for the ‘pipeline’ parameter | | output_format_json_quote_64bit_integers | int | (Optional) Controls quoting of 64-bit or bigger integers (like UInt64 or Int128) when they are output in a JSON format. Such integers are enclosed in quotes by default. This behavior is compatible with most JavaScript implementations. Possible values: 0 — Integers are output without quotes. 1 — Integers are enclosed in quotes. Default value is 0 | | output_format_json_quote_denormals | int | (Optional) Controls representation of inf and nan on the UI instead of null e.g when dividing by 0 - inf and when there is no representation of a number in Javascript - nan. Possible values: 0 - disabled, 1 - enabled. Default value is 0 | | output_format_parquet_string_as_string | int | (Optional) Use Parquet String type instead of Binary for String columns. Possible values: 0 - disabled, 1 - enabled. Default value is 0 | | format | Description | | --- | --- | | CSV | CSV without header | | CSVWithNames | CSV with header | | JSON | JSON including data, statistics and schema information | | TSV | TSV without header | | TSVWithNames | TSV with header | | PrettyCompact | Formatted table | | JSONEachRow | Newline-delimited JSON values (NDJSON) | | Parquet | Apache Parquet | As you can see in the example above, timestamps do not include a time zone in their serialization. Let’s see how that relates to timestamps ingested from your original data: - If the original timestamp had no time zone associated, you’ll read back the same date and time verbatim. If you ingested the timestamp `2022-11-14 11:08:46` , for example, Tinybird sends `"2022-11-14 11:08:46"` back. This is so regardless of the time zone of the column in ClickHouse. - If the original timestamp had a time zone associated, you’ll read back the corresponding date and time in the time zone of the destination column in ClickHouse, which is UTC by default. If you ingested `2022-11-14 12:08:46.574295 +0100` , for instance, Tinybird sends `"2022-11-14 11:08:46"` back for a `DateTime` , and `"2022-11-14 06:08:46"` for a `DateTime('America/New_York')` . 
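To tie the GET variant together, here's a minimal sketch of a request against the Query API; the token value and the `my_pipe` Pipe name are placeholders, not part of the reference above. The SQL is URL-encoded in the `q` parameter and `FORMAT JSON` selects the output format from the table above.

```bash
# Minimal sketch: run a query through GET /v0/sql and ask for JSON output.
# <TOKEN> and my_pipe are placeholders; replace them with a real token and Pipe name.
curl -G "https://api.tinybird.co/v0/sql" \
  -H "Authorization: Bearer <TOKEN>" \
  --data-urlencode "q=SELECT count() FROM my_pipe FORMAT JSON"
```

Swapping `FORMAT JSON` for `FORMAT CSVWithNames` in the same request returns a CSV with a header row instead, as described in the format table.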
POST /v0/sql [¶](https://www.tinybird.co/docs/about:blank#post--v0-sql) Executes a SQL query using the engine, while providing a templated or non templated query string and the custom parameters that will be translated into the query. The custom parameters provided should not have the same name as the request parameters for this endpoint (outlined below), as they are reserved in order to get accurate results for your query. Running sql queries against your data [¶](https://www.tinybird.co/docs/about:blank#id6) For example: 1. Providing the value to the query via the POST body: curl -X POST \ -H "Authorization: Bearer " \ -H "Content-Type: application/json" \ "https://api.tinybird.co/v0/sql" -d \ '{ "q":"% SELECT * FROM where column_name = {{String(column_name)}}", "column_name": "column_name_value" }' 2. Providing a new value to the query from the one defined within the pipe in the POST body: curl -X POST \ -H "Authorization: Bearer " \ -H "Content-Type: application/json" \ "https://api.tinybird.co/v0/sql" -d \ '{ "q":"% SELECT * FROM where column_name = {{String(column_name, "column_name_value")}}", "column_name": "new_column_name_value" }' 3. Providing a non template query in the POST body: curl -X POST \ -H "Authorization: Bearer " \ -H "Content-Type: application/json" \ "https://api.tinybird.co/v0/sql" -d \ '{ "q":"SELECT * FROM " }' 4. Providing a non template query as a string in the POST body with a content type of "text/plain" : curl -X POST \ -H "Authorization: Bearer " \ -H "Content-Type: text/plain" \ "https://api.tinybird.co/v0/sql" -d "SELECT * FROM "| Key | Type | Description | | --- | --- | --- | | pipeline | String | (Optional) The name of the pipe. It allows writing a query like ‘SELECT * FROM _’ where ‘_’ is a placeholder for the ‘pipeline’ parameter | | output_format_json_quote_64bit_integers | int | (Optional) Controls quoting of 64-bit or bigger integers (like UInt64 or Int128) when they are output in a JSON format. Such integers are enclosed in quotes by default. This behavior is compatible with most JavaScript implementations. Possible values: 0 — Integers are output without quotes. 1 — Integers are enclosed in quotes. Default value is 0 | | output_format_json_quote_denormals | int | (Optional) Controls representation of inf and nan on the UI instead of null e.g when dividing by 0 - inf and when there is no representation of a number in Javascript - nan. Possible values: 0 - disabled, 1 - enabled. Default value is 0 | | output_format_parquet_string_as_string | int | (Optional) Use Parquet String type instead of Binary for String columns. Possible values: 0 - disabled, 1 - enabled. Default value is 0 | As a response, it gives you the query metadata, the resulting data and some performance statistics. 
Successful response [¶](https://www.tinybird.co/docs/about:blank#id8) { "meta": [ { "name": "VendorID", "type": "Int32" }, { "name": "tpep_pickup_datetime", "type": "DateTime" } ], "data": [ { "VendorID": 2, "tpep_pickup_datetime": "2001-01-05 11:45:23", "tpep_dropoff_datetime": "2001-01-05 11:52:05", "passenger_count": 5, "trip_distance": 1.53, "RatecodeID": 1, "store_and_fwd_flag": "N", "PULocationID": 71, "DOLocationID": 89, "payment_type": 2, "fare_amount": 7.5, "extra": 0.5, "mta_tax": 0.5, "tip_amount": 0, "tolls_amount": 0, "improvement_surcharge": 0.3, "total_amount": 8.8 }, { "VendorID": 2, "tpep_pickup_datetime": "2002-12-31 23:01:55", "tpep_dropoff_datetime": "2003-01-01 14:59:11" } ], "rows": 3, "rows_before_limit_at_least": 4, "statistics": { "elapsed": 0.00091042, "rows_read": 4, "bytes_read": 296 } } Data can be fetched in different formats. Just append `FORMAT ` to your SQL query: Requesting different formats with SQL [¶](https://www.tinybird.co/docs/about:blank#id9) SELECT count () from < pipe > FORMAT JSON| format | Description | | --- | --- | | CSV | CSV without header | | CSVWithNames | CSV with header | | JSON | JSON including data, statistics and schema information | | TSV | TSV without header | | TSVWithNames | TSV with header | | PrettyCompact | Formatted table | | JSONEachRow | Newline-delimited JSON values (NDJSON) | | Parquet | Apache Parquet | As you can see in the example above, timestamps do not include a time zone in their serialization. Let’s see how that relates to timestamps ingested from your original data: - If the original timestamp had no time zone associated, you’ll read back the same date and time verbatim. If you ingested the timestamp `2022-11-14 11:08:46` , for example, Tinybird sends `"2022-11-14 11:08:46"` back. This is so regardless of the time zone of the column in ClickHouse. - If the original timestamp had a time zone associated, you’ll read back the corresponding date and time in the time zone of the destination column in ClickHouse, which is UTC by default. If you ingested `2022-11-14 12:08:46.574295 +0100` , for instance, Tinybird sends `"2022-11-14 11:08:46"` back for a `DateTime` , and `"2022-11-14 06:08:46"` for a `DateTime('America/New_York')` . --- URL: https://www.tinybird.co/docs/api-reference/sink-pipes-api Last update: 2024-10-28T11:06:14.000Z Content: --- title: "Sink Pipes API Reference · Tinybird Docs" theme-color: "#171612" description: "The Sink Pipes API allows you to create, delete, schedule, and trigger Sink Pipes." --- # Sink Pipes API¶ The Sink Pipes API allows you to create, delete, schedule, and trigger Sink Pipes. ## POST /v0/pipes/\{pipe\_id\}/nodes/\{node\_id\}/sink¶ Set the Pipe as a Sink Pipe, optionally scheduled. Required token permission: PIPES:CREATE. ### Restrictions¶ - You can set only one schedule per Sink Pipe. - You can’t set a Sink Pipe if the Pipe is already materializing. You must unlink the Materialization first. - You can’t set a Sink Pipe if the Pipe is already an API Endpoint. You must unpublish the API Endpoint first. - You can’t set a Sink Pipe if the Pipe is already copying. You must unset the copy first. 
### Example¶ ##### Setting the Pipe as a Sink Pipe curl \ -X POST "https://api.tinybird.co/v0/pipes/:pipe/nodes/:node/sink" \ -H "Authorization: Bearer " \ -d "connection=my_connection_name" \ -d "path=s3://bucket-name/prefix" \ -d "file_template=exported_file_template" \ -d "format=csv" \ -d "compression=gz" \ -d "schedule_cron=0 */1 * * *" \ -d "write_strategy=new" ### Request parameters¶ | Key | Type | Description | | --- | --- | --- | | connection | String | Name of the connection to holding the credentials to run the sink | | path | String | Object store prefix into which the sink will write data | | file_template | String | File template string. See[ file template](https://www.tinybird.co/docs/docs/publish/s3-sink#file-template) for more details | | format | String | Optional. Format of the exported files. Default: CSV | | compression | String | Optional. Compression of the output files. Default: None | | schedule_cron | String | Optional. The sink’s execution schedule, in crontab format. | | write_strategy | String | Optional. Default: `new` . The sink's write strategy for filenames already existing in the bucket. Values: `new` , `truncate` ; `new` adds a new file with a suffix, while `truncate` replaces the existent file. | ### Successful response example¶ { "id": "t_529f46626c324674b3a84cd820ac2649", "name": "p_test", "description": null, "endpoint": null, "created_at": "2024-01-18 12:57:36.503834", "updated_at": "2024-01-18 13:01:21.435012", "parent": null, "type": "sink", "last_commit": { "content_sha": "", "path": "", "status": "changed" }, "sink_node": "t_6e8afdb8c691459b80e16541433f951b", "schedule": { "timezone": "Etc/UTC", "cron": "0 */1 * * *", "status": "running" }, "nodes": [ { "id": "t_6e8afdb8c691459b80e16541433f951b", "name": "p_test_0", "sql": "SELECT * FROM test", "description": null, "materialized": null, "cluster": null, "tags": {}, "created_at": "2024-01-18 12:57:36.503843", "updated_at": "2024-01-18 12:57:36.503843", "version": 0, "project": null, "result": null, "ignore_sql_errors": false, "node_type": "sink", "dependencies": [ "test" ], "params": [] } ] } ### Response codes¶ | Code | Description | | --- | --- | | 200 | OK | | 404 | Pipe, Node, or data connector not found, bucket doesn’t exist | | 403 | Limit reached, Query includes forbidden keywords, Pipe is already a Sink Pipe, cannot assume role | | 401 | Invalid credentials (from connection) | | 400 | Invalid or missing parameters, bad ARN role, invalid region name | ## DELETE /v0/pipes/\{pipe\_id\}/nodes/\{node\_id\}/sink¶ Removes the Sink from the Pipe. This does not delete the Pipe nor the Node, only the sink configuration and any associated settings. 
### Example¶ curl \ -X DELETE "https://api.tinybird.co/v0/pipes/$1/nodes/$2/sink" \ -H "Authorization: Bearer " Successful response example { "id": "t_529f46626c324674b3a84cd820ac2649", "name": "p_test", "description": null, "endpoint": null, "created_at": "2024-01-18 12:57:36.503834", "updated_at": "2024-01-19 09:27:12.069650", "parent": null, "type": "default", "last_commit": { "content_sha": "", "path": "", "status": "changed" }, "nodes": [ { "id": "t_6e8afdb8c691459b80e16541433f951b", "name": "p_test_0", "sql": "SELECT * FROM test", "description": null, "materialized": null, "cluster": null, "tags": {}, "created_at": "2024-01-18 12:57:36.503843", "updated_at": "2024-01-19 09:27:12.069649", "version": 0, "project": null, "result": null, "ignore_sql_errors": false, "node_type": "standard", "dependencies": [ "test" ], "params": [] } ], "url": "https://api.split.tinybird.co/v0/pipes/p_test.json" } ### Response codes¶ | Code | Description | | --- | --- | | 200 | OK | | 404 | Pipe, Node, or data connector not found | | 403 | Limit reached, Query includes forbidden keywords, Pipe is already a Sink Pipe | | 400 | Invalid or missing parameters, Pipe is not a Sink Pipe | ## POST /v0/pipes/\{pipe\_id\}/sink¶ Triggers the Sink Pipe, creating a sink job. Allows overriding some of the sink settings for this particular execution. ### Example¶ ##### Trigger a Sink Pipe with some overrides curl \ -X POST "https://api.tinybird.co/v0/pipes/p_test/sink" \ -H "Authorization: Bearer " \ -d "file_template=export_file" \ -d "format=csv" \ -d "compression=gz" \ -d "write_strategy=truncate" \ -d {key}={val} ### Request parameters¶ | Key | Type | Description | | --- | --- | --- | | connection | String | Name of the connection to holding the credentials to run the sink | | path | String | Object store prefix into which the sink will write data | | file_template | String | File template string. See[ file template](https://www.tinybird.co/docs/docs/publish/s3-sink#file-template) for more details | | format | String | Optional. Format of the exported files. Default: CSV | | compression | String | Optional. Compression of the output files. Default: None | | write_strategy | String | Optional. The sink's write strategy for filenames already existing in the bucket. Values: `new` , `truncate` ; `new` adds a new file with a suffix, while `truncate` replaces the existent file. | | {key} | String | Optional. Additional variables to be injected into the file template. 
See[ file template](https://www.tinybird.co/docs/docs/publish/s3-sink#file-template) for more details | ### Successful response example¶ { "id": "t_6e8afdb8c691459b80e16541433f951b", "name": "p_test_0", "sql": "SELECT * FROM test", "description": null, "materialized": null, "cluster": null, "tags": {}, "created_at": "2024-01-18 12:57:36.503843", "updated_at": "2024-01-19 09:27:12.069649", "version": 0, "project": null, "result": null, "ignore_sql_errors": false, "node_type": "sink", "dependencies": [ "test" ], "params": [], "job": { "id": "685e7395-3b08-492b-9fe8-2944859d6a06", "kind": "sink", "status": "waiting", "created_at": "2024-01-19 15:58:46.688525", "updated_at": "2024-01-19 15:58:46.688532", "is_cancellable": true, "job_url": "https://api.split.tinybird.co/v0/jobs/685e7395-3b08-492b-9fe8-2944859d6a06", "pipe": { "id": "t_529f46626c324674b3a84cd820ac2649", "name": "p_test" } } } ### Response codes¶ | Code | Description | | --- | --- | | 200 | OK | | 404 | Pipe, Node, or data connector not found | | 403 | Limit reached, Query includes forbidden keywords, Pipe is already a Sink Pipe | | 400 | Invalid or missing parameters, Pipe is not a Sink Pipe | ## GET /v0/integrations/s3/policies/trust-policy¶ Retrieves the trust policy to be attached to the IAM role that will be used for the connection. External IDs are different for each Workspace, but shared between Branches of the same Workspace to avoid having to change the trust policy for each Branch. ### Example¶ curl \ -X GET "https://$TB_HOST/v0/integrations/s3/policies/trust-policy" \ -H "Authorization: Bearer " Successful response example { "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": "sts:AssumeRole", "Principal": { "AWS": "arn:aws:iam::123456789:root" }, "Condition": { "StringEquals": { "sts:ExternalId": "c6ee2795-aae3-4a55-a7a1-92d92fab0e41" } } } ] } ### Response codes¶ | Code | Description | | --- | --- | | 200 | OK | | 404 | S3 integration not supported in your region | ## GET /v0/integrations/s3/policies/write-access-policy¶ Retrieves the trust policy to be attached to the IAM Role that will be used for the connection. External IDs are different for each Workspace, but shared between branches of the same Workspace to avoid having to change the trust policy for each branch. ### Example¶ curl \ -X GET "https://$TB_HOST/v0/integrations/s3/policies/write-access-policy?bucket=test-bucket" \ -H "Authorization: Bearer " ### Request parameters¶ | Key | Type | Description | | --- | --- | --- | | bucket | Optional[String] | Bucket to use for rendering the template. If not provided the ‘’ placeholder is used | ### Successful response example¶ { "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "s3:GetBucketLocation", "s3:ListBucket" ], "Resource": "arn:aws:s3:::" }, { "Effect": "Allow", "Action": [ "s3:GetObject", "s3:PutObject", "s3:PutObjectAcl" ], "Resource": "arn:aws:s3:::/*" } ] } ### Response codes¶ | Code | Description | | --- | --- | | 200 | OK | ## GET /v0/integrations/s3/settings¶ Retrieves the settings to be attached to the IAM role that will be used for the connection. External IDs are different for each Workspace, but shared between Branches of the same Workspace to avoid having to generate specific IAM roles for each of the Branches. 
### Example¶ curl \ -X GET "https://$TB_HOST/v0/integrations/s3/settings" \ -H "Authorization: Bearer " Successful response example { "principal": "arn:aws:iam:::root", "external_id": "" } ### Response codes¶ | Code | Description | | --- | --- | | 200 | OK | | 404 | S3 integration not supported in your region | ## GET /v0/datasources-bigquery-credentials¶ Retrieves the Workspace’s GCP service account to be authorized to write to the destination bucket. ### Example¶ curl \ -X POST "${TINYBIRD_HOST}/v0/connectors" \ -H "Authorization: Bearer " \ -d "service=gcs_service_account" \ -d "name=" ### Request parameters¶ None ### Successful response example¶ { "account": "cdk-E-d83f6d01-b5c1-40-43439d@development-353413.iam.gserviceaccount.com" } ### Response codes¶ | Code | Description | | --- | --- | | 200 | OK | | 503 | Feature not enabled in your region | --- URL: https://www.tinybird.co/docs/api-reference/token-api Last update: 2024-10-15T15:38:30.000Z Content: --- title: "Token API Reference · Tinybird Docs" theme-color: "#171612" description: "In order to read, append or import data into you Tinybird account, you'll need a Token with the right permissions." --- GET /v0/tokens/? [¶](https://www.tinybird.co/docs/about:blank#get--v0-tokens-?) Retrieves all workspace Static Tokens. Get all tokens [¶](https://www.tinybird.co/docs/about:blank#id2) curl -X GET \ -H "Authorization: Bearer " \ "https://api.tinybird.co/v0/tokens" A list of your Static Tokens and their scopes will be sent in the response. Successful response [¶](https://www.tinybird.co/docs/about:blank#id3) { "tokens": [ { "name": "admin token", "description": "", "scopes": [ { "type": "ADMIN" } ], "token": "p.token" }, { "name": "import token", "description": "", "scopes": [ { "type": "DATASOURCES:CREATE" } ], "token": "p.token0" }, { "name": "token name 1", "description": "", "scopes": [ { "type": "DATASOURCES:READ", "resource": "table_name_1" }, { "type": "DATASOURCES:APPEND", "resource": "table_name_1" } ], "token": "p.token1" }, { "name": "token name 2", "description": "", "scopes": [ { "type": "PIPES:READ", "resource": "pipe_name_2" } ], "token": "p.token2" } ] } POST /v0/tokens/? [¶](https://www.tinybird.co/docs/about:blank#post--v0-tokens-?) Creates a new Token: Static or JWT Creating a new Static Token [¶](https://www.tinybird.co/docs/about:blank#id4) curl -X POST \ -H "Authorization: Bearer " \ "https://api.tinybird.co/v0/tokens/" \ -d "name=test&scope=DATASOURCES:APPEND:table_name&scope=DATASOURCES:READ:table_name"| Key | Type | Description | | --- | --- | --- | | name | String | Name of the token | | description | String | Optional. Markdown text with a description of the token. | | scope | String | Scope(s) to set. Format is[ SCOPE:TYPE[:arg][:filter]](about:blank#id2) . This is only used for the Static Tokens | Successful response [¶](https://www.tinybird.co/docs/about:blank#id6) { "name": "token_name", "description": "", "scopes": [ { "type": "DATASOURCES:APPEND", "resource": "table_name" } { "type": "DATASOURCES:READ", "resource": "table_name", "filter": "department = 1" }, ], "token": "p.token" } When creating a token with `filter` whenever you use the token to read the table, it will be filtered. 
For example, if table is `events_table` and `filter` is `date > '2018-01-01' and type == 'foo'` a query like `select count(1) from events_table` will become `select count(1) from events_table where date > '2018-01-01' and type == 'foo'` Creating a new token with filter [¶](https://www.tinybird.co/docs/about:blank#id7) curl -X POST \ -H "Authorization: Bearer " \ "https://api.tinybird.co/v0/tokens/" \ -d "name=test&scope=DATASOURCES:READ:table_name:column==1" If we provide an `expiration_time` in the URL, the token will be created as a JWT Token. Creating a new JWT Token [¶](https://www.tinybird.co/docs/about:blank#id8) curl -X POST \ -H "Authorization: Bearer " \ "https://api.tinybird.co/v0/tokens?name=jwt_token&expiration_time=1710000000" \ -d '{"scopes": [{"type": "PIPES:READ", "resource": "requests_per_day", "fixed_params": {"user_id": 3}}]}' In multi-tenant applications, you can use this endpoint to create a JWT token for a specific tenant where each user has their own token with a fixed set of scopes and parameters POST /v0/tokens/(.+)/refresh [¶](https://www.tinybird.co/docs/about:blank#post--v0-tokens-(.+)-refresh) Refresh the Static Token without modifying name, scopes or any other attribute. Specially useful when a Token is leaked, or when you need to rotate a Token. Refreshing a Static Token [¶](https://www.tinybird.co/docs/about:blank#id9) curl -X POST \ -H "Authorization: Bearer " \ "https://api.tinybird.co/v0/tokens/:token_name/refresh" When successfully refreshing a token, new information will be sent in the response Successful response [¶](https://www.tinybird.co/docs/about:blank#id10) { "name": "token name", "description": "", "scopes": [ { "type": "DATASOURCES:READ", "resource": "table_name" } ], "token": "NEW_TOKEN" }| Key | Type | Description | | --- | --- | --- | | auth_token | String | Token. Ensure it has the `TOKENS` scope on it | | Code | Description | | --- | --- | | 200 | No error | | 403 | Forbidden. Provided token doesn’t have permissions to drop the token. A token is not allowed to remove itself, it needs `ADMIN` or `TOKENS` scope | GET /v0/tokens/(.+) [¶](https://www.tinybird.co/docs/about:blank#get--v0-tokens-(.+)) Fetches information about a particular Static Token. Getting token info [¶](https://www.tinybird.co/docs/about:blank#id13) curl -X GET \ -H "Authorization: Bearer " \ "https://api.tinybird.co/v0/tokens/:token" Returns a json with name and scopes. Successful response [¶](https://www.tinybird.co/docs/about:blank#id14) { "name": "token name", "description": "", "scopes": [ { "type": "DATASOURCES:READ", "resource": "table_name" } ], "token": "p.TOKEN" } DELETE /v0/tokens/(.+) [¶](https://www.tinybird.co/docs/about:blank#delete--v0-tokens-(.+)) Deletes a Static Token . Deleting a token [¶](https://www.tinybird.co/docs/about:blank#id15) curl -X DELETE \ -H "Authorization: Bearer " \ "https://api.tinybird.co/v0/tokens/:token" PUT /v0/tokens/(.+) [¶](https://www.tinybird.co/docs/about:blank#put--v0-tokens-(.+)) Modifies a Static Token. More than one scope can be sent per request, all of them will be added as Token scopes. Every time a Token scope is modified, it overrides the existing one(s). editing a token [¶](https://www.tinybird.co/docs/about:blank#id16) curl -X PUT \ -H "Authorization: Bearer " \ "https://api.tinybird.co/v0/tokens/?" \ -d "name=test_new_name&description=this is a test token&scope=PIPES:READ:test_pipe&scope=DATASOURCES:CREATE"| Key | Type | Description | | --- | --- | --- | | token | String | Token. 
Ensure it has the `TOKENS` scope on it | | name | String | Optional. Name of the token. | | description | String | Optional. Markdown text with a description of the token. | | scope | String | Optional. Scope(s) to set. Format is[ SCOPE:TYPE[:arg][:filter]](about:blank#id2) . New scope(s) will override existing ones. | Successful response [¶](https://www.tinybird.co/docs/about:blank#id18) { "name": "test", "description": "this is a test token", "scopes": [ { "type": "PIPES:READ", "resource": "test_pipe" }, { "type": "DATASOURCES:CREATE" } ] } --- URL: https://www.tinybird.co/docs/architecture Last update: 2024-10-24T08:59:50.000Z Content: --- title: "Architecture · Tinybird Docs" theme-color: "#171612" description: "Frequently asked questions about Tinybird architecture" --- # Architecture¶ Tinybird is built using open source software. Tinybird loves open source and have dedicated teams that contribute to all the projects it uses. ## What's ClickHouse®?¶ An open source, high performance, columnar OLAP database. It's a lightning-fast database that solves use cases requiring rapid ingestion, low latency queries, and high concurrency. Tinybird gives you the speed of ClickHouse with 10x the DevEx. ## Where does Tinybird fit in your tech stack?¶ Tinybird is the analytical backend for your applications: it consumes data from any source and exposes it through API Endpoints. Tinybird can sit parallel to the Data Warehouse or in front of it. The Data Warehouse allows to explore use cases like BI and data science, while Tinybird unlocks action use cases like operational applications, embedded analytics, and user-facing analytics. Read the guide ["Team integration and data governance"](https://www.tinybird.co/docs/docs/guides/integrations/team-integration-governance) to learn more about implementing Tinybird in your existing team. ## Is data stored in Tinybird?¶ The data you ingest into Tinybird is stored in ClickHouse. Depending on the use case, you can add a TTL to control how long data is stored. ## Cloud environment vs on-premises¶ Tinybird is a managed SaaS solution. Tinybird doesn't provide a "Bring Your Own Cloud" (BYOC) or on-premises deployments. If you are interested in a BYOC deployment, join the [Tinybird BYOC Waitlist](https://faster.tinybird.co/byoc-waitlist). ## Next steps¶ - Explore[ Tinybird's Customer Stories](https://www.tinybird.co/customer-stories) and see what people have built on Tinybird. - Start building using the[ quick start](https://www.tinybird.co/docs/docs/quick-start) . - Read the guide[ "Team integration and data governance"](https://www.tinybird.co/docs/docs/guides/integrations/team-integration-governance) to learn more about implementing Tinybird in your existing team. Tinybird is not affiliated with, associated with, or sponsored by ClickHouse, Inc. ClickHouse® is a registered trademark of ClickHouse, Inc. --- URL: https://www.tinybird.co/docs/changelog Content: --- title: "Changelog · Tinybird" theme-color: "#171612" description: "Tinybird helps data teams build real-time Data Products at scale through SQL-based API endpoints." --- # Fix and create queries with AI-powered suggestions¶ Your query doesn't work? Would you like to create a query from pseudoSQL? Now you can do both things! When your query throws an error, select **Suggest a fix** to get a suggestion. Look at the diff, accept the change, and then run the fixed query. ![Video]() Controls: true This feature is currently in private preview. 
If you're interested in testing it, contact us at [support@tinybird.co](mailto:support@tinybird.co). ## Pause/Resume button for S3 Data Sources¶ You can now pause or resume scheduled S3 data ingest using the **Pause sync** button. ![Pause button](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fassets%2Fchangelog%2Fsyncbutton.png&w=3840&q=75) ## Improvements and bug fixes¶ - You can now use the `KAFKA_SASL_MECHANISM` in .datasource files to define the type of SASL security for Kafka connections. See [Kafka, Confluent, Redpanda](https://www.tinybird.co/docs/cli/datafiles/datasource-files#kafka-confluent-redpanda). - When a user opens a Workspace they lack access to, they now get a 404. Previously, they were redirected to their default Workspace. - Fixed an issue where stacked charts showed gaps and other inconsistencies when the X axis was a time value. - Fixed the hover action for copying a token when using Safari 17.6. --- URL: https://www.tinybird.co/docs/cli/advanced-templates Last update: 2024-11-13T15:42:17.000Z Content: --- title: "Advanced templates · Tinybird Docs" theme-color: "#171612" description: "This document shows an advanced usage of our datafile system when using the CLI. It's also related to how to work with query parameters." --- # Advanced templates¶ This section covers advanced usage of our datafile system when using the [Tinybird CLI](https://www.tinybird.co/docs/docs/cli/quick-start). Before reading this page, you should be familiar with [query parameters](https://www.tinybird.co/docs/docs/query/query-parameters). ## Reusing templates¶ When developing multiple use cases, it's very common to want to reuse certain parts or steps of an analysis, such as data filters or similar table operations. We're going to use the following repository for this purpose: ##### Clone demo git clone https://github.com/tinybirdco/ecommerce_data_project_advanced.git cd ecommerce_data_project_advanced ##### File structure ecommerce_data_project/ datasources/ events.datasource mv_top_per_day.datasource products.datasource fixtures/ events.csv products.csv endpoints/ sales.pipe top_products_between_dates.pipe top_products_last_week.pipe includes/ only_buy_events.incl top_products.incl pipes/ top_product_per_day.pipe First, let's take a look at the `sales.pipe` API Endpoint and the `top_product_per_day.pipe` Pipe that materializes to a `mv_top_per_day` Data Source. They both make use of the same Node: `only_buy_events`: ##### includes/only\_buy\_events.incl NODE only_buy_events SQL > SELECT toDate(timestamp) date, product, joinGet('products_join_by_id', 'color', product) as color, JSONExtractFloat(json, 'price') as price FROM events where action = 'buy' ##### endpoints/sales.pipe INCLUDE "../includes/only_buy_events.incl" NODE endpoint DESCRIPTION > return sales for a product with color filter SQL > % select date, sum(price) total_sales from only_buy_events where color in {{Array(colors, 'black')}} group by date ##### pipes/top\_product\_per\_day.pipe INCLUDE "../includes/only_buy_events.incl" NODE top_per_day SQL > SELECT date, topKState(10)(product) top_10, sumState(price) total_sales from only_buy_events group by date TYPE materialized DATASOURCE mv_top_per_day ENGINE AggregatingMergeTree ENGINE_SORTING_KEY date When using INCLUDE files to reuse logic in .datasource files, the extension of the file must be `.datasource.incl` . This is used by CLI commands such as `tb fmt` to identify the type of file and apply the correct formatting.
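The same mechanism can be used to share column definitions between Data Sources. The following is a minimal sketch, assuming the include behaves as plain text inclusion like the `.incl` Pipe examples above; the file and column names are hypothetical and not part of the example project:

##### includes/events\_schema.datasource.incl

SCHEMA >
    `timestamp` DateTime,
    `product` String,
    `user_id` String

##### datasources/events\_landing.datasource

INCLUDE "../includes/events_schema.datasource.incl"

ENGINE "MergeTree"
ENGINE_SORTING_KEY "timestamp"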
## Include variables¶ ### Using variables¶ It is possible to include variables in a Node template. The main reason to do that is to have a very similar Node or Nodes that can be reused with slight differences. For instance, in our example, we want to have two API Endpoints to display the 10 top products, each filtered by different date intervals: ##### includes/top\_products.incl NODE endpoint DESCRIPTION > returns top 10 products for the last week SQL > select date, topKMerge(10)(top_10) as top_10 from top_product_per_day {% if '$DATE_FILTER' = 'last_week' %} where date > today() - interval 7 day {% else %} where date between {{Date(start)}} and {{Date(end)}} {% end %} group by date ##### endpoints/top\_products\_last\_week.pipe INCLUDE "../includes/top_products.incl" "DATE_FILTER=last_week" ##### endpoints/top\_products\_between\_dates.pipe INCLUDE "../includes/top_products.incl" "DATE_FILTER=between_dates" As you can see, the variable `DATE_FILTER` is sent to the `top_products` include, where the variable content is retrieved using the `$` prefix with the `DATE_FILTER` reference. It is also possible to assign an array of values to an include variable. To do this, the variable needs to be parsed properly using function templates, as explained in the following section. ### Variables vs parameters¶ Note the difference between variables and parameters. Parameters are indeed variables whose value can be changed by the user through the API Endpoint request parameters. Variables only live in the template and can be set when declaring the `INCLUDE` or with the `set` template syntax: ##### Using 'set' to declare a variable {% set my_var = 'default' %} By default, variables will be interpreted as parameters. In order to prevent variables or private parameters from appearing in the auto-generated API Endpoint documentation, they need to start with `_` . Example: ##### Define private variables % SELECT date FROM my_table WHERE a > 10 {% if defined(_private_param) %} and b = {{Int32(_private_param)}} {% end %} This is also needed when using variables in template functions. ## Template functions¶ This is the list of the available functions that can be used in a template: #### defined¶ `defined(param)` : check if a variable is defined ##### defined function % SELECT date FROM my_table {% if defined(param) %} WHERE ... {% end %} #### column¶ `column(name)` : get the column by its name from a variable ##### column function % {% set var_1 = 'name' %} SELECT {{column(var_1)}} FROM my_table #### columns¶ `columns(names)` : get columns by their name from a variable ##### columns function % {% set var_1 = 'name,age,address' %} SELECT {{columns(var_1)}} FROM my_table #### date\_diff\_in\_seconds¶ `date_diff_in_seconds(date_1, date_2, [date_format], [backup_date_format], [none_if_error])` : gets the abs value of the difference in seconds between two datetimes. `date_format` is optional and defaults to `'%Y-%m-%d %H:%M:%S'` , so you can pass DateTimes as `YYYY-MM-DD hh:mm:ss` when calling the function: date_diff_in_seconds('2022-12-19 18:42:22', '2022-12-19 19:42:34') Other formats are supported and need to be explicitly passed, like date_diff_in_seconds('2022-12-19T18:42:23.521Z', '2022-12-19T18:42:23.531Z', date_format='%Y-%m-%dT%H:%M:%S.%fz') For questions regarding the format, check [python strftime-and-strptime-format-codes](https://docs.python.org/3/library/datetime.html#strftime-and-strptime-format-codes). 
For a deep dive into timestamps, time zones and Tinybird, see the docs on [working with time](https://www.tinybird.co/docs/docs/guides/querying-data/working-with-time). ##### date\_diff\_in\_seconds function % SELECT date, events {% if date_diff_in_seconds(date_end, date_start, date_format="%Y-%m-%dT%H:%M:%Sz") < 3600 %} FROM my_table_raw {% else %} FROM my_table_hourly_agg {% end %} WHERE date BETWEEN parseDateTimeBestEffort({{String(date_start,'2023-01-11T12:24:04Z')}}) AND parseDateTimeBestEffort({{String(date_end,'2023-01-11T12:24:05Z')}}) `backup_date_format` is optional and it allows to specify a secondary format as a backup when the provided date does not match the primary format. This is useful when your default input format is a datetime ( `2022-12-19 18:42:22` ) but you receive a date ( `2022-12-19` ). date_diff_in_seconds('2022-12-19 18:42:22', '2022-12-19', backup_date_format='%Y-%m-%d') `none_if_error` is optional and defaults to `False` . If set to `True` , the function will return `None` if the provided date does not match any of the provided formats. This is useful to provide an alternate logic in case any of the dates are specified in a different format. ##### date\_diff\_in\_seconds function using none\_if\_error % SELECT * FROM employees {% if date_diff_in_seconds(date_start, date_end, none_if_error=True) is None %} WHERE starting_date BETWEEN now() - interval 4 year AND now() {% else %} WHERE starting_date BETWEEN parseDateTimeBestEffort({{String(date_start, '2023-12-01')}}) AND parseDateTimeBestEffort({{String(date_end, '2023-12-02')}}) {% end %} #### date\_diff\_in\_minutes¶ Same behavior as `date_diff_in_seconds` with returning the difference in minutes. #### date\_diff\_in\_hours¶ Same behavior as `date_diff_in_seconds` with returning the difference in hours. #### date\_diff\_in\_days¶ `date_diff_in_days(date_1, date_2, [date_format])` : gets the absolute value of the difference in seconds between two dates or datetimes. ##### date\_diff\_in\_days function % SELECT date FROM my_table {% if date_diff_in_days(date_end, date_start) < 7 %} WHERE ... {% end %} `date_format` is optional and defaults to `'%Y-%m-%d` so you can pass DateTimes as `YYYY-MM-DD` when calling the function. As with `date_diff_in_seconds`, `date_diff_in_minutes` , and `date_diff_in_hours` , other [date_formats](https://docs.python.org/3/library/datetime.html#strftime-and-strptime-format-codes) are supported. #### split\_to\_array¶ `split_to_array(arr, default, separator=',')` : splits comma separated values into an array ##### split\_to\_array function % SELECT arrayJoin(arrayMap(x -> toInt32(x), {{split_to_array(code, '')}})) as codes FROM my_table ##### split\_to\_array with a custom separator function % SELECT {{split_to_array(String(param, 'hi, how are you|fine thanks'), separator='|')}} #### enumerate\_with\_last¶ `enumerate_with_last(arr, default)` : creates an iterable array, returning a boolean value that allows to check if the current element is the last element in the array. It can be used along with the `split_to_array` function. #### symbol¶ `symbol(x, quote)` : get the value of a variable ##### enumerate\_with\_last function % SELECT {% for _last, _x in enumerate_with_last(split_to_array(attr, 'amount')) %} sum({{symbol(_x)}}) as {{symbol(_x)}} {% if not _last %}, {% end %} {% end %} FROM my_table #### sql\_and¶ `sql_and(__= [, ...] )` : creates a list of "WHERE" clauses, along with "AND" separated filters, that checks if a field () is or isn't () in a list/tuple (). 
- The parameter is any column in the table. - is one of: `in` , `not_in` , `gt` (>), `lt` (<), `gte` (>=), `lte` (<=) - is any of the transform type functions ( `Array(param, 'Int8')` , `String(param)` , etc.). If one parameter is not specified, then the filter is ignored. ##### sql\_and function % SELECT * FROM my_table WHERE 1 {% if defined(param) or defined(param2_not_in) %} AND {{sql_and( param__in=Array(param, 'Int32', defined=False), param2__not_in=Array(param2_not_in, 'String', defined=False))}} {% end %} If this is queried with `param=1,2` and `param2_not_in=ab,bc,cd` , then it translates to: ##### sql\_and function - generated sql SELECT * FROM my_table WHERE 1 AND param IN [1,2] AND param2 NOT IN ['ab','bc','cd'] If this is queried just with `param=1,2` , but `param2_not_in` is not specified, then it translates to: ##### sql\_and function - generated sql param missing SELECT * FROM my_table WHERE 1 AND param IN [1,2] ### Transform types functions¶ - `Boolean(x)` - `DateTime64(x)` - `DateTime(x)` - `Date(x)` - `Float32(x)` - `Float64(x)` - `Int8(x)` - `Int16(x)` - `Int32(x)` - `Int64(x)` - `Int128(x)` - `Int256(x)` - `UInt8(x)` - `UInt16(x)` - `UInt32(x)` - `UInt64(x)` - `UInt128(x)` - `UInt256(x)` - `String(x)` - `Array(x)` --- URL: https://www.tinybird.co/docs/cli/command-ref Last update: 2024-11-12T11:45:41.000Z Content: --- title: "Tinybird CLI command reference · Tinybird Docs" theme-color: "#171612" description: "The Tinybird CLI allows you to use all the Tinybird functionality directly from the command line. Get to know the command reference." --- # CLI command reference¶ The following list shows all available commands in the Tinybird command-line interface, their options, and their arguments. For examples on how to use them, see the [Quick start guide](https://www.tinybird.co/docs/docs/cli/quick-start), [Data projects](https://www.tinybird.co/docs/docs/cli/data-projects) , and [Common use cases](https://www.tinybird.co/docs/docs/cli/common-use-cases). ## tb auth¶ Configure your Tinybird authentication. **auth commands** | Command | Description | | --- | --- | | info OPTIONS | Get information about the authentication that is currently being used | | ls OPTIONS | List available regions to authenticate | | use OPTIONS REGION_NAME_OR_HOST_OR_ID | Switch to a different region. You can pass the region name, the region host url, or the region index after listing available regions with `tb auth ls` | The previous commands accept the following options: - `--token INTEGER` : Use auth Token, defaults to TB_TOKEN envvar, then to the .tinyb file - `--host TEXT` : Set custom host if it's different than https://api.tinybird.co. Check[ this page](https://www.tinybird.co/docs/api-reference/overview#regions-and-endpoints) for the available list of regions - `--region TEXT` : Set region. Run 'tb auth ls' to show available regions - `--connector [bigquery|snowflake]` : Set credentials for one of the supported connectors - `--interactive,-i` : Show available regions and select where to authenticate to ## tb branch¶ Manage your Workspace branches. **Branch commands** | Command | Description | Options | | | --- | --- | --- | | | create BRANCH_NAME | Create a new Branch in the current 'main' Workspace | `--last-partition` : Attach the last modified partition from 'main' to the new Branch `-i, --ignore-datasource DATA_SOURCE_NAME` : Ignore specified Data Source partitions `--wait / --no-wait` : Wait for Branch jobs to finish, showing a progress bar. 
Disabled by default | | | current | Show the Branch you're currently authenticated to | | | | data | Perform a data branch operation to bring data into the current Branch | `--last-partition` : Attach the last modified partition from 'main' to the new Branch `-i, --ignore-datasource DATA_SOURCE_NAME` : Ignore specified Data Source partitions `--wait / --no-wait` : Wait for Branch jobs to finish, showing a progress bar. Disabled by default | | | datasource copy DATA_SOURCE_NAME | Copy data source from Main | `--sql SQL` : Freeform SQL query to select what is copied from Main into the Environment Data Source `--sql-from-main` : SQL query selecting all from the same Data Source in Main `--wait / --no-wait` : Wait for copy job to finish. Disabled by default | | | ls | List all the Branches available | `--sort / --no-sort` : Sort the list of Branches by name. Disabled by default | | | regression-tests | Regression test commands | `-f, --filename PATH` : The yaml file with the regression-tests definition `--skip-regression-tests / --no-skip-regression-tests` : Flag to skip execution of regression tests. This is handy for CI Branches where regression might be flaky `--main` : Run regression tests in the main Branch. For this flag to work all the resources in the Branch Pipe Endpoints need to exist in the main Branch. `--wait / --no-wait` : Wait for regression job to finish, showing a progress bar. Disabled by default | | | regression-tests coverage PIPE_NAME | Run regression tests using coverage requests for Branch vs Main Workspace. It creates a regression-tests job. The argument supports regular expressions. Using '.*' if no Pipe name is provided | `--assert-result / --no-assert-result` : Whether to perform an assertion on the results returned by the Endpoint. Enabled by default. Use `--no-assert-result` if you expect the endpoint output is different from current version `--assert-result-no-error / --no-assert-result-no-error` : Whether to verify that the Endpoint does not return errors. Enabled by default. Use `--no-assert-result-no-error` if you expect errors from the endpoint `--assert-result-rows-count / --no-assert-result-rows-count` : Whether to verify that the correct number of elements are returned in the results. Enabled by default. Use `--no-assert-result-rows-count` if you expect the numbers of elements in the endpoint output is different from current version `--assert-result-ignore-order / --no-assert-result-ignore-order` : Whether to ignore the order of the elements in the results. Disabled by default. Use `--assert-result-ignore-order` if you expect the endpoint output is returning same elements but in different order `--assert-time-increase-percentage INTEGER` : Allowed percentage increase in Endpoint response time. Default value is 25%. Use -1 to disable assert `--assert-bytes-read-increase-percentage INTEGER` : Allowed percentage increase in the amount of bytes read by the endpoint. Default value is 25%. Use -1 to disable assert `--assert-max-time FLOAT` : Max time allowed for the endpoint response time. If the response time is lower than this value then the `--assert-time-increase-percentage` is not taken into account `--ff, --failfast` : When set, the checker will exit as soon one test fails `--wait` : Waits for regression job to finish, showing a progress bar. Disabled by default `--skip-regression-tests / --no-skip-regression-tests` : Flag to skip execution of regression tests. 
This is handy for CI environments where regression might be flaky `--main` : Run regression tests in the main Branch. For this flag to work all the resources in the Branch Pipe Endpoints need to exist in the main Branch | | | regression-tests last PIPE_NAME | Run regression tests using coverage requests for Branch vs Main Workspace. It creates a regression-tests job. The argument supports regular expressions. Using '.*' if no Pipe name is provided | `--assert-result / --no-assert-result` : Whether to perform an assertion on the results returned by the Endpoint. Enabled by default. Use `--no-assert-result` if you expect the endpoint output is different from current version `--assert-result-no-error / --no-assert-result-no-error` : Whether to verify that the Endpoint does not return errors. Enabled by default. Use `--no-assert-result-no-error` if you expect errors from the endpoint `--assert-result-rows-count / --no-assert-result-rows-count` : Whether to verify that the correct number of elements are returned in the results. Enabled by default. Use `--no-assert-result-rows-count` if you expect the numbers of elements in the endpoint output is different from current version `--assert-result-ignore-order / --no-assert-result-ignore-order` : Whether to ignore the order of the elements in the results. Disabled by default. Use `--assert-result-ignore-order` if you expect the endpoint output is returning same elements but in different order `--assert-time-increase-percentage INTEGER` : Allowed percentage increase in Endpoint response time. Default value is 25%. Use -1 to disable assert `--assert-bytes-read-increase-percentage INTEGER` : Allowed percentage increase in the amount of bytes read by the endpoint. Default value is 25%. Use -1 to disable assert `--assert-max-time FLOAT` : Max time allowed for the endpoint response time. If the response time is lower than this value then the `--assert-time-increase-percentage` is not taken into account `--ff, --failfast` : When set, the checker will exit as soon one test fails `--wait` : Waits for regression job to finish, showing a progress bar. Disabled by default `--skip-regression-tests / --no-skip-regression-tests` : Flag to skip execution of regression tests. This is handy for CI environments where regression might be flaky | | | regression-tests manual PIPE_NAME | Run regression tests using coverage requests for Branch vs Main Workspace. It creates a regression-tests job. The argument supports regular expressions. Using '.*' if no Pipe name is provided | `--assert-result / --no-assert-result` : Whether to perform an assertion on the results returned by the Endpoint. Enabled by default. Use `--no-assert-result` if you expect the endpoint output is different from current version `--assert-result-no-error / --no-assert-result-no-error` : Whether to verify that the Endpoint does not return errors. Enabled by default. Use `--no-assert-result-no-error` if you expect errors from the endpoint `--assert-result-rows-count / --no-assert-result-rows-count` : Whether to verify that the correct number of elements are returned in the results. Enabled by default. Use `--no-assert-result-rows-count` if you expect the numbers of elements in the endpoint output is different from current version `--assert-result-ignore-order / --no-assert-result-ignore-order` : Whether to ignore the order of the elements in the results. Disabled by default. 
Use `--assert-result-ignore-order` if you expect the endpoint output is returning same elements but in different order `--assert-time-increase-percentage INTEGER` : Allowed percentage increase in Endpoint response time. Default value is 25%. Use -1 to disable assert `--assert-bytes-read-increase-percentage INTEGER` : Allowed percentage increase in the amount of bytes read by the endpoint. Default value is 25%. Use -1 to disable assert `--assert-max-time FLOAT` : Max time allowed for the endpoint response time. If the response time is lower than this value then the `--assert-time-increase-percentage` is not taken into account `--ff, --failfast` : When set, the checker will exit as soon one test fails `--wait` : Waits for regression job to finish, showing a progress bar. Disabled by default `--skip-regression-tests / --no-skip-regression-tests` : Flag to skip execution of regression tests. This is handy for CI Branches where regression might be flaky | | | rm [BRANCH_NAME_OR_ID] | Removes a Branch from the Workspace (not Main). It can't be recovered | `--yes` : Do not ask for confirmation | | | use [BRANCH_NAME_OR_ID] | Switch to another Branch | | | ## tb check¶ Check file syntax. It only allows one option, `--debug` , which prints the internal representation. ## tb datasource¶ Data Sources commands. | Command | Description | Options | | --- | --- | --- | | analyze OPTIONS URL_OR_FILE | Analyze a URL or a file before creating a new data source | | | append OPTIONS DATASOURCE_NAME URL | Appends data to an existing Data Source from URL, local file or a connector | | | connect OPTIONS CONNECTION DATASOURCE_NAME | Create a new Data Source from an existing connection | `--kafka-topic TEXT` : For Kafka connections: topic `--kafka-group TEXT` : For Kafka connections: group ID `--kafka-auto-offset-reset [latest|earliest]` : Kafka auto.offset.reset config. Valid values are: ["latest", "earliest"] `--kafka-sasl-mechanism [PLAIN|SCRAM-SHA-256|SCRAM-SHA-512]` : Kafka SASL mechanism. Valid values are: ["PLAIN", "SCRAM-SHA-256", "SCRAM-SHA-512"]. Default: "PLAIN". | | copy OPTIONS DATASOURCE_NAME | Copy data source from Main | `--sql TEXT` : Freeform SQL query to select what is copied from Main into the Branch Data Source `--sql-from-main` : SQL query selecting * from the same Data Source in Main `--wait` : Wait for copy job to finish, disabled by default | | delete OPTIONS DATASOURCE_NAME | Delete rows from a Data Source | `--yes` : Do not ask for confirmation `--wait` : Wait for delete job to finish, disabled by default `--dry-run` : Run the command without deleting anything `--sql-condition` : Delete rows with SQL condition | | generate OPTIONS FILENAMES | Generate a Data Source file based on a sample CSV file from local disk or URL | `--force` : Override existing files | | ls OPTIONS | List Data Sources | `--match TEXT` : Retrieve any resources matching the pattern. 
eg `--match _test` `--format [json]` : Force a type of the output `--dry-run` : Run the command without deleting anything | | replace OPTIONS DATASOURCE_NAME URL | Replaces the data in a Data Source from a URL, local file or a connector | `--sql` : The SQL to extract from `--connector` : Connector name `--sql-condition` : Delete rows with SQL condition | | rm OPTIONS DATASOURCE_NAME | Delete a Data Source | `--yes` : Do not ask for confirmation | | share OPTIONS DATASOURCE_NAME WORKSPACE_NAME_OR_ID | Share a Data Source | `--user_token TEXT` : When passed, we won't prompt asking for it `--yes` : Do not ask for confirmation | | sync OPTIONS DATASOURCE_NAME | Sync from connector defined in .datasource file | | | truncate OPTIONS DATASOURCE_NAME | Truncate a Data Source | `--yes` : Do not ask for confirmation `--cascade` : Truncate dependent Data Source attached in cascade to the given Data Source | | unshare OPTIONS DATASOURCE_NAME WORKSPACE_NAME_OR_ID | Unshare a Data Source | `--user_token TEXT` : When passed, we won't prompt asking for it `--yes` : Do not ask for confirmation | | scheduling resume DATASOURCE_NAME | Resume the scheduling of a Data Source | | | scheduling pause DATASOURCE_NAME | Pause the scheduling of a Data Source | | | scheduling status DATASOURCE_NAME | Get the scheduling status of a Data Source (paused or running) | | ## tb dependencies¶ Print all Data Sources dependencies. Its options: - `--no-deps` : Print only Data Sources with no Pipes using them - `--match TEXT` : Retrieve any resource matching the pattern - `--pipe TEXT` : Retrieve any resource used by Pipe - `--datasource TEXT` : Retrieve resources depending on this Data Source - `--check-for-partial-replace` : Retrieve dependant Data Sources that will have their data replaced if a partial replace is executed in the Data Source selected - `--recursive` : Calculate recursive dependencies ## tb deploy¶ Deploy in Tinybird pushing resources changed from previous release using Git. These are the options available for the `deploy` command: - `--dry-run` : Run the command with static checks, without creating resources on the Tinybird account or any side effect. Doesn't check for runtime errors. - `-f, --force` : Override Pipes when they already exist. - `--override-datasource` : When pushing a Pipe with a materialized Node if the target Data Source exists it will try to override it. - `--populate` : Populate materialized Nodes when pushing them. - `--subset FLOAT` : Populate with a subset percent of the data (limited to a maximum of 2M rows), this is useful to quickly test a materialized Node with some data. The subset must be greater than 0 and lower than 0.1. A subset of 0.1 means a 10% of the data in the source Data Source will be used to populate the Materialized View. Use it together with `--populate` , it has precedence over `--sql-condition` . - `--sql-condition TEXT` : Populate with a SQL condition to be applied to the trigger Data Source of the Materialized View. For instance, `--sql-condition='date == toYYYYMM(now())'` it'll populate taking all the rows from the trigger Data Source which `date` is the current month. Use it together with `--populate` . `--sql-condition` is not taken into account if the `--subset` param is present. Including in the `sql_condition` any column present in the Data Source `engine_sorting_key` will make the populate job process less data. - `--unlink-on-populate-error` : If the populate job fails the Materialized View is unlinked and new data won't be ingested there. 
First time a populate job fails, the Materialized View is always unlinked. - `--wait` : To be used along with `--populate` command. Waits for populate jobs to finish, showing a progress bar. Disabled by default. - `--yes` : Do not ask for confirmation. - `--workspace_map TEXT..., --workspace TEXT...` : Adds a Workspace path to the list of external Workspaces, usage: `--workspace name path/to/folder` . - `--timeout FLOAT` : Timeout you want to use for the job populate. - `--user_token TOKEN` : The user Token is required for sharing a Data Source that contains the SHARED_WITH entry. ## tb diff¶ Diffs local datafiles to the corresponding remote files in the Workspace. It works as a regular `diff` command, useful to know if the remote resources have been changed. Some caveats: - Resources in the Workspace might mismatch due to having slightly different SQL syntax, for instance: A parenthesis mismatch, `INTERVAL` expressions or changes in the schema definitions. - If you didn't specify an `ENGINE_PARTITION_KEY` and `ENGINE_SORTING_KEY` , resources in the Workspace might have default ones. The recommendation in these cases is use `tb pull` to keep your local files in sync. Remote files are downloaded and stored locally in a `.diff_tmp` directory, if working with git you can add it to `.gitignore`. The options for this command: - `--fmt / --no-fmt` : Format files before doing the diff, default is True so both files match the format - `--no-color` : Don't colorize diff - `--no-verbose` : List the resources changed not the content of the diff ## tb fmt¶ Formats a .datasource, .pipe or .incl file. Implementation is based in the ClickHouse® dialect of [shandy-sqlfmt](https://pypi.org/project/shandy-sqlfmt/) adapted to Tinybird datafiles. These are the options available for the `fmt` command: - `--line-length INTEGER` : A number indicating the maximum characters per line in the Node SQL, lines will be split based on the SQL syntax and the number of characters passed as a parameter - `--dry-run` : Don't ask to override the local file - `--yes` : Do not ask for confirmation to overwrite the local file - `--diff` : Formats local file, prints the diff and exits 1 if different, 0 if equal This command removes comments starting with # from the file, so use DESCRIPTION or a comment block instead: ##### Example comment block % {% comment this is a comment and fmt will keep it %} SELECT {% comment this is another comment and fmt will keep it %} count() c FROM stock_prices_1m You can add `tb fmt` to your git `pre-commit` hook to have your files properly formatted. If the SQL formatting results are not the ones expected to you, you can disable it just for the blocks needed. Read [how to disable fmt](https://docs.sqlfmt.com/getting-started/disabling-sqlfmt). ## tb init¶ Initializes folder layout. It comes with these options: - `--generate-datasources` : Generate Data Sources based on CSV, NDJSON and Parquet files in this folder - `--folder DIRECTORY` : Folder where datafiles will be placed - `-f, --force` : Overrides existing files - `-ir, --ignore-remote` : Ignores remote files not present in the local data project on `tb init --git` - `--git` : Init Workspace with Git commits. - `--override-commit TEXT` : Use this option to manually override the reference commit of your Workspace. This is useful if a commit is not recognized in your Git log, such as after a force push ( `git push -f` ). 
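As a quick illustration of how these commands fit together, the following sketch bootstraps a project folder, checks datafile formatting (for example in a pre-commit hook or CI job), and lists which resources differ from the Workspace. The folder and file names are hypothetical:

##### Example workflow

# Create the folder layout and generate Data Sources from local CSV, NDJSON, or Parquet files
tb init --folder tinybird --generate-datasources

# Fail if a datafile is not properly formatted (exits 1 when the formatted output differs)
tb fmt tinybird/datasources/events.datasource --diff

# List the resources that changed compared to the Workspace, without printing the full diff
tb diff --no-verbose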
## tb materialize¶ Analyzes the `node_name` SQL query to generate the .datasource and .pipe files needed to push a new materialize view. This command guides you to generate the Materialized View with name TARGET_DATASOURCE, the only requirement is having a valid Pipe datafile locally. Use `tb pull` to download resources from your Workspace when needed. It allows to use these options: - `--push-deps` : Push dependencies, disabled by default - `--workspace TEXT...` : Add a Workspace path to the list of external Workspaces, usage: `--workspace name path/to/folder` - `--no-versions` : When set, resource dependency versions are not used, it pushes the dependencies as-is - `--verbose` : Prints more log - `--unlink-on-populate-error` : If the populate job fails the Materialized View is unlinked and new data won't be ingested in the Materialized View. First time a populate job fails, the Materialized View is always unlinked. ## tb pipe¶ Use the following commands to manage Pipes. | Command | Description | Options | | --- | --- | --- | | append OPTIONS PIPE_NAME_OR_UID SQL | Append a Node to a Pipe | | | copy pause OPTIONS PIPE_NAME_OR_UID | Pause a running Copy Pipe | | | copy resume OPTIONS PIPE_NAME_OR_UID | Resume a paused Copy Pipe | | | copy run OPTIONS PIPE_NAME_OR_UID | Run an on-demand copy job | `--wait` : Wait for the copy job to finish `--yes` : Do not ask for confirmation `--param TEXT` : Key and value of the params you want the Copy Pipe to be called with. For example: `tb pipe copy run --param foo=bar` | | data OPTIONS PIPE_NAME_OR_UID PARAMETERS | Print data returned by a Pipe. You can pass query parameters to the command, for example `--param_name value` . | `--query TEXT` : Run SQL over Pipe results `--format [json|csv]` : Return format (CSV, JSON) `-- value` : Query parameter. You can define multiple parameters and their value. For example, `--paramOne value --paramTwo value2` . | | generate OPTIONS NAME QUERY | Generates a Pipe file based on a sql query. Example: `tb pipe generate my_pipe 'select * from existing_datasource'` | `--force` : Override existing files | | ls OPTIONS | List Pipes | `--match TEXT` : Retrieve any resourcing matching the pattern. For example `--match _test` `--format [json|csv]` : Force a type of the output | | populate OPTIONS PIPE_NAME | Populate the result of a Materialized Node into the target Materialized View | `--node TEXT` : Name of the materialized Node. Required `--sql-condition TEXT` : Populate with a SQL condition to be applied to the trigger Data Source of the Materialized View. For instance, `--sql-condition='date == toYYYYMM(now())'` it'll populate taking all the rows from the trigger Data Source which `date` is the current month. Use it together with `--populate` . `--sql-condition` is not taken into account if the `--subset` param is present. Including in the `sql_condition` any column present in the Data Source `engine_sorting_key` will make the populate job process less data. `--truncate` : Truncates the materialized Data Source before populating it. `--unlink-on-populate-error` : If the populate job fails the Materialized View is unlinked and new data won't be ingested in the Materialized View. First time a populate job fails, the Materialized View is always unlinked. `--wait` : Waits for populate jobs to finish, showing a progress bar. Disabled by default. 
| | publish OPTIONS PIPE_NAME_OR_ID NODE_UID | Change the published Node of a Pipe | | | regression-test OPTIONS FILENAMES | Run regression tests using last requests | `--debug` : Prints internal representation, can be combined with any command to get more information. `--only-response-times` : Checks only response times `--workspace_map TEXT..., --workspace TEXT...` : Add a Workspace path to the list of external Workspaces, usage: `--workspace name path/to/folder` `--no-versions` : When set, resource dependency versions are not used, it pushes the dependencies as-is `-l, --limit INTEGER RANGE` : Number of requests to validate [0<=x<=100] `--sample-by-params INTEGER RANGE` : When set, we will aggregate the pipe_stats_rt requests by `extractURLParameterNames(assumeNotNull(url))` and for each combination we will take a sample of N requests [1<=x<=100] `-m, --match TEXT` : Filter the checker requests by specific parameter. You can pass multiple parameters -m foo -m bar `-ff, --failfast` : When set, the checker will exit as soon one test fails `--ignore-order` : When set, the checker will ignore the order of list properties `--validate-processed-bytes` : When set, the checker will validate that the new version doesn't process more than 25% than the current version `--relative-change FLOAT` : When set, the checker will validate the new version has less than this distance with the current version | | rm OPTIONS PIPE_NAME_OR_ID | Delete a Pipe. PIPE_NAME_OR_ID can be either a Pipe name or id in the Workspace or a local path to a .pipe file | `--yes` : Do not ask for confirmation | | set_endpoint OPTIONS PIPE_NAME_OR_ID NODE_UID | Same as 'publish', change the published Node of a Pipe | | | sink run OPTIONS PIPE_NAME_OR_UID | Run an on-demand sink job | `--wait` : Wait for the sink job to finish `--yes` : Do not ask for confirmation `--dry-run` : Run the command without executing the sink job `--param TEXT` : Key and value of the params you want the Sink Pipe to be called with. For example: `tb pipe sink run --param foo=bar` | | stats OPTIONS PIPES | Print Pipe stats for the last 7 days | `--format [json]` : Force a type of the output. To parse the output, keep in mind to use `tb --no-version-warning pipe stats` option | | token_read OPTIONS PIPE_NAME | Retrieve a Token to read a Pipe | | | unlink OPTIONS PIPE_NAME NODE_UID | Unlink the output of a Pipe, whatever its type: Materialized Views, Copy Pipes, or Sinks. | | | unpublish OPTIONS PIPE_NAME NODE_UID | Unpublish the endpoint of a Pipe | | ## tb pull¶ Retrieve the latest version for project files from your Workspace. With these options: - `--folder DIRECTORY` : Folder where files will be placed - `--auto / --no-auto` : Saves datafiles automatically into their default directories (/datasources or /pipes). Default is True - `--match TEXT` : Retrieve any resourcing matching the pattern. eg `--match _test` - `-f, --force` : Override existing files - `--fmt` : Format files, following the same format as `tb fmt` ## tb push¶ Push files to your Workspace. You can use this command with these options: - `--dry-run` : Run the command with static checks, without creating resources on the Tinybird account or any side effect. Doesn't check for runtime errors. 
- `--check / --no-check` : Enable/Disable output checking, enabled by default - `--push-deps` : Push dependencies, disabled by default - `--only-changes` : Push only the resources that have changed compared to the destination Workspace - `--debug` : Prints internal representation, can be combined with any command to get more information - `-f, --force` : Override Pipes when they already exist - `--override-datasource` : When pushing a Pipe with a materialized Node if the target Data Source exists it will try to override it. - `--populate` : Populate materialized Nodes when pushing them - `--subset FLOAT` : Populate with a subset percent of the data (limited to a maximum of 2M rows), this is useful to quickly test a materialized Node with some data. The subset must be greater than 0 and lower than 0.1. A subset of 0.1 means a 10 percent of the data in the source Data Source will be used to populate the Materialized View. Use it together with `--populate` , it has precedence over `--sql-condition` - `--sql-condition TEXT` : Populate with a SQL condition to be applied to the trigger Data Source of the Materialized View. For instance, `--sql-condition='date == toYYYYMM(now())'` it'll populate taking all the rows from the trigger Data Source which `date` is the current month. Use it together with `--populate` . `--sql-condition` is not taken into account if the `--subset` param is present. Including in the `sql_condition` any column present in the Data Source `engine_sorting_key` will make the populate job process less data - `--unlink-on-populate-error` : If the populate job fails the Materialized View is unlinked and new data won't be ingested in the Materialized View. First time a populate job fails, the Materialized View is always unlinked - `--fixtures` : Append fixtures to Data Sources - `--wait` : To be used along with `--populate` command. Waits for populate jobs to finish, showing a progress bar. Disabled by default - `--yes` : Do not ask for confirmation - `--only-response-times` : Checks only response times, when --force push a Pipe - `--workspace TEXT..., --workspace_map TEXT...` : Add a Workspace path to the list of external Workspaces, usage: `--workspace name path/to/folder` - `--no-versions` : When set, resource dependency versions are not used, it pushes the dependencies as-is - `--timeout FLOAT` : Timeout you want to use for the populate job - `-l, --limit INTEGER RANGE` : Number of requests to validate [0<=x<=100] - `--sample-by-params INTEGER RANGE` : When set, we will aggregate the `pipe_stats_rt` requests by `extractURLParameterNames(assumeNotNull(url))` and for each combination we will take a sample of N requests [1<=x<=100] - `-ff, --failfast` : When set, the checker will exit as soon one test fails - `--ignore-order` : When set, the checker will ignore the order of list properties - `--validate-processed-bytes` : When set, the checker will validate that the new version doesn't process more than 25% than the current version - `--user_token TEXT` : The User Token is required for sharing a Data Source that contains the SHARED_WITH entry ## tb sql¶ Run SQL query over Data Sources and Pipes. - `--rows_limit INTEGER` : Max number of rows retrieved - `--pipeline TEXT` : The name of the Pipe to run the SQL Query - `--pipe TEXT` : The path to the .pipe file to run the SQL Query of a specific NODE - `--node TEXT` : The NODE name - `--format [json|csv|human]` : Output format - `--stats / --no-stats` : Show query stats ## tb token¶ Manage your Workspace Tokens. 
| Command | Description | Options | | --- | --- | --- | | copy OPTIONS TOKEN_ID | Copy a Token | | | ls OPTIONS | List Tokens | `--match TEXT` : Retrieve any Token matching the pattern. eg `--match _test` | | refresh OPTIONS TOKEN_ID | Refresh a Token | `--yes` : Do not ask for confirmation | | rm OPTIONS TOKEN_ID | Remove a Token | `--yes` : Do not ask for confirmation | | scopes OPTIONS TOKEN_ID | List Token scopes | | | create static OPTIONS TOKEN_NAME | Create a static Token that doesn't expire. | `--scope` : Scope for the Token (e.g., `DATASOURCES:READ` ). Required. `--resource` : Resource you want to associate the scope with. `--filter` : SQL condition used to filter the values when calling with this token (eg. `--filter=value > 0` ) | | create jwt OPTIONS TOKEN_NAME | Create a JWT Token with a fixed expiration time. | `--ttl` : Time to live (e.g., '1h', '30min', '1d'). Required. `--scope` : Scope for the token (only `PIPES:READ` is allowed for JWT tokens). Required. `--resource` : Resource associated with the scope. Required. `--fixed-params` : Fixed parameters in key=value format, multiple values separated by commas | ## tb workspace¶ Manage your Workspaces. | Command | Description | Options | | --- | --- | --- | | clear OPTIONS | Drop all the resources inside a project. This command is dangerous because it removes everything, use with care. | `--yes` : Do not ask for confirmation `--dry-run` : Run the command without removing anything | | create OPTIONS WORKSPACE_NAME | Create a new Workspace for your Tinybird user | `--starter_kit TEXT` : Use a Tinybird starter kit as a template `--user_token TEXT` : When passed, we won't prompt asking for it `--fork` : When enabled, we will share all Data Sources from the current Workspace to the newly created one | | current OPTIONS | Show the Workspace you're currently authenticated to | | | delete OPTIONS WORKSPACE_NAME_OR_ID | Delete a Workspace where you are an admin | `--user_token TEXT` : When passed, we won't prompt asking for it `--yes` : Do not ask for confirmation | | ls OPTIONS | List all the Workspaces you have access to in the account you're currently authenticated to | | | members add OPTIONS MEMBERS_EMAILS | Adds members to the current Workspace | `--user_token TEXT` : When passed, we won't prompt asking for it | | members ls OPTIONS | List members in the current Workspace | | | members rm OPTIONS | Removes members from the current Workspace | `--user_token TEXT` : When passed, we won't prompt asking for it | | members set-role OPTIONS [guest|viewer|admin] MEMBERS_EMAILS | Sets the role for existing Workspace members | `--user_token TEXT` : When passed, we won't prompt asking for it | | use OPTIONS WORKSPACE_NAME_OR_ID | Switch to another Workspace. Use 'tb workspace ls' to list the Workspaces you have access to | | ## tb tag¶ Manage your Workspace tags. | Command | Description | Options | | --- | --- | --- | | create TAG_NAME | Creates a tag in the current Workspace. | | | ls | List all the tags of the current Workspace. | | | ls TAG_NAME | List all the resources tagged with the given tag. | | | rm TAG_NAME | Removes a tag from the current Workspace. The tag is removed from all resources that were tagged with it. | `--yes` : Do not ask for confirmation | Tinybird is not affiliated with, associated with, or sponsored by ClickHouse, Inc. ClickHouse® is a registered trademark of ClickHouse, Inc.
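For example, using the options listed above, you could create a static Token scoped to a single Data Source and a JWT Token restricted to one endpoint. The Token, Data Source, and Pipe names below are illustrative:

##### Create Tokens from the CLI

# Static Token that can only read the events Data Source
tb token create static events_read_token --scope DATASOURCES:READ --resource events

# JWT Token for a single endpoint, valid for one hour, with a fixed parameter
tb token create jwt dashboard_token --ttl 1h --scope PIPES:READ --resource top_products --fixed-params user_id=3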
--- URL: https://www.tinybird.co/docs/cli/common-use-cases Last update: 2024-11-12T11:45:41.000Z Content: --- title: "CLI common use cases · Tinybird Docs" theme-color: "#171612" description: "This document shows some common use cases where the Command Line Interface (CLI) can help you on your day to day workflow." --- # Common use cases¶ The following uses cases illustrate how Tinybird CLI solve common situations using available commands. ## Download Pipes and data sources from your account¶ There are two ways you can start working with the CLI. You can either [start a new data project](https://www.tinybird.co/docs/docs/cli/quick-start) from scratch, or if you already have some data and API Endpoints in your Tinybird account, pull it to your local disk to continue working from there. For this second option, use the `--match` flag to filter Pipes or data sources containing the string passed as parameter. For instance, to pull all the files named `project`: ##### Pull all the project files tb pull --match project [D] writing project.datasource(demo) [D] writing project_geoindex.datasource(demo) [D] writing project_geoindex_pipe.pipe(demo) [D] writing project_agg.pipe(demo) [D] writing project_agg_API_endpoint_request_log_pipe_3379.pipe(demo) [D] writing project_exploration.pipe(demo) [D] writing project_moving_avg.pipe(demo) The pull command does not preserve the directory structure, so all your datafiles are downloaded to your current directory. Once the files are pulled, you can `diff` or `push` the changes to your source control repository and continue working from the command line. When you pull data sources or Pipes, your data is not downloaded, just the data source schemas and Pipes definition, so they can be replicated easily. ## Push the entire data project¶ ##### Push the whole project tb push --push-deps ## Push a Pipe with all its dependencies¶ ##### Push dependencies tb push pipes/mypipe.pipe --push-deps ## Adding a new column to a Data Source¶ Data Source schemas are mostly immutable, but you have the possibility to append new columns at the end of an existing Data Source with an Engine from the MergeTree Family or Null Engine. If you want to change columns, add columns in other positions, or modify the engine, you must first create a new version of the Data Source with the modified schema. Then ingest the data and finally point the Pipes to this new API Endpoint. To force a Pipe replacement use the `--force` flag when pushing it. ### Append new columns to an existing Data Source¶ As an example, imagine you have the following Data Source defined, and it has been already pushed to Tinybird: ##### Appending a new column to a Data Source SCHEMA > `test` Int16, `local_date` Date, `test3` Int64 If you want to append a new column, you must change the `*.datasource` file to add the new column `new_column` . You can append as many columns as you need at the same time: ##### Appending a new column to a Data Source SCHEMA > `test` Int16, `local_date` Date, `test3` Int64, `new_column` Int64 Remember that when **appending or deleting columns to an existing Data Source** , the engine of that Data Source must be of the **MergeTree** family. After appending the new column, execute `tb push my_datasource.datasource --force` and confirm the addition of the column(s). The `--force` parameter is required for this kind of operation. Existing imports will continue working once the new columns are added, even if those imports don't carry values for the added columns. 
In those cases, the new columns contain empty values like `0` for numeric values or `''` for Strings, or if defined, the default values in the schema. ### Create a new version of the Data Source to make additional add/change column operations¶ To create a new version of a Data Source, create a separate datafile with a different name. You can choose a helpful naming convention such as adding a `_version` suffix (e.g. `my_ds_1.datasource` ). ## Debug mode¶ When you work with Pipes that use several versions of different data sources, you might need to double check which version of which Data Source the Pipe is pointing at before you push it to your Tinybird account. To do so, use the `--dry-run --debug` flags like this: ##### Debug mode tb push my_pipe.pipe --dry-run --debug After you've validated the content of the Pipe, push your Pipe as normal. ## Automatic regression tests for your API Endpoints¶ Any time you `--force` push a Pipe which has a public API Endpoint that has received requests, some automatic regression tests are executed. If the previous version of the API Endpoint returns the same data as the version you are pushing, the CLI checks for the top ten requests. This can help you validate whether you are introducing a regression in your API. Other times, you are consciously `--force` pushing a new version which returns different data. In that case you can avoid the regression tests with the `--no-check` flag: ##### Avoid regression tests tb push my_may_view_pipe.pipe --force --no-check When pushing a Pipe with a public API Endpoint, the API Endpoint will be maintained based on the Node name. If the existing API Endpoint Node is renamed, the last Node of the Pipe will be recreated as an API Endpoint. The latter option is not an atomic operation: The API Endpoint will be down for a few moments while the new API Endpoint is created. Tinybird is not affiliated with, associated with, or sponsored by ClickHouse, Inc. ClickHouse® is a registered trademark of ClickHouse, Inc. --- URL: https://www.tinybird.co/docs/cli/data-projects Last update: 2024-11-13T15:42:17.000Z Content: --- title: "Organize files in data projects · Tinybird Docs" theme-color: "#171612" description: "Learn how to best organize your Tinybird CLI files in versioned projects." --- # Organize files in data projects¶ A data project is a set of files that describes how your data must be stored, processed, and exposed through APIs. In the same way you maintain source code files in a repository, use a CI, make deployments, run tests, and so on, Tinybird provides tools to work following a similar pattern but with data pipelines. The source code of your project are the [datafiles](https://www.tinybird.co/docs/docs/cli/datafiles/overview) in Tinybird. With a data project you can: - Define how the data should flow, from schemas to API Endpoints. - Manage your datafiles using version control. - Branch your datafiles. - Run tests. - Deploy data projects. ## Ecommerce site example¶ Consider an ecommerce site where you have events from users and a list of products with their attributes. Your goal is to expose several API endpoints to return sales per day and top product per day. 
The data project file structure would look like the following: ecommerce_data_project/ datasources/ events.datasource products.datasource fixtures/ events.csv products.csv pipes/ top_product_per_day.pipe endpoints/ sales.pipe top_products.pipe To follow this tutorial, download and open the example using the following commands: ##### Clone demo git clone https://github.com/tinybirdco/ecommerce_data_project.git cd ecommerce_data_project ### Upload the project¶ You can push the whole project to your Tinybird account to check everything is fine. The `tb push` command uploads the project files to Tinybird, first checking the project dependencies and the SQL syntax. In this case, use the `--push-deps` flag to push everything: ##### Push dependencies tb push --push-deps After the upload completes, the endpoints defined in our project, `sales` and `top_products` , are available and you can start pushing data to the different Data Sources. ### Define Data Sources¶ Data Sources define how your data is ingested and stored. You can add data to Data Sources using the [Data Sources API](https://www.tinybird.co/docs/docs/api-reference/datasource-api). Each Data Source is defined by a schema and other properties. See [Datasource files](https://www.tinybird.co/docs/docs/cli/datafiles/datasource-files). The following snippet shows the content of the `events.datasource` file from the ecommerce example: DESCRIPTION > # Events from users This contains all the events produced by Kafka, there are 4 fixed columns, plus a `json` column which contains the rest of the data for that event. See [documentation](https://www.tinybird.co/docs/url_for_docs) for the different events. SCHEMA > timestamp DateTime, product String, user_id String, action String, json String ENGINE MergeTree ENGINE_SORTING_KEY timestamp The file describes the schema and how the data is sorted. In this case, the access pattern is most of the time by the `timestamp` column. If no `SORTING_KEY` is set, Tinybird picks one by default, date or datetime columns in most cases. To push the Data Source, run: ##### Push the events Data Source tb push datasources/events.datasource You can't override Data Sources. If you try to push a Data Source that already exists in your account you get an error. To override a Data Source, remove it or upload a new one with a different name. ### Define data Pipes¶ The content of the `pipes/top_product_per_day.pipe` file creates a data Pipe that transforms the data as it's inserted: NODE only_buy_events DESCRIPTION > filters all the buy events SQL > SELECT toDate(timestamp) date, product, JSONExtractFloat(json, 'price') AS price FROM events WHERE action = 'buy' NODE top_per_day SQL > SELECT date, topKState(10)(product) top_10, sumState(price) total_sales FROM only_buy_events GROUP BY date TYPE materialized DATASOURCE top_per_day_mv ENGINE AggregatingMergeTree ENGINE_SORTING_KEY date Each Pipe can have one or more Nodes. The previous Pipe defines two Nodes, `only_buy_events` and `top_per_day`. - The first Node filters `buy` events and extracts some data from the `json` column. - The second Node runs the aggregation. In general, use `NODE` to start a new Node and then use `SQL >` to define the SQL for that Node. You can use other Nodes inside the SQL. In this case, the second Node uses the first one, `only_buy_events`. To push the Pipe, run: ##### Populate tb push pipes/top_product_per_day.pipe --populate If you want to populate the Materialized View with the existing data in the `events` table, use the `--populate` flag.
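For example, to push the Pipe, backfill the Materialized View with the existing data, and wait for the populate job to finish in a single command (the `--wait` flag only makes the CLI block until the job completes):

##### Populate and wait

tb push pipes/top_product_per_day.pipe --populate --wait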
When using the `--populate` flag you get a job URL, so you can check the status of the job at the URL provided. See [Populate and copy data](https://www.tinybird.co/docs/docs/production/populate-data) for more information on how populate jobs work. ### Define API Endpoints¶ API Endpoints are the way you expose the data to be consumed. The following snippet shows the content of the `endpoints/top_products.pipe` file: NODE endpoint DESCRIPTION > returns top 10 products for the last week SQL > SELECT date, topKMerge(10)(top_10) AS top_10 FROM top_per_day WHERE date > today() - interval 7 day GROUP BY date The syntax is the same as in the data transformation Pipes, and you can access the results through the `{% user("apiHost") %}/v0/top_products.json?token=TOKEN` endpoint. When you push an endpoint, a Token with `PIPE:READ` permissions is automatically created. You can see it from the [Tokens UI](https://app.tinybird.co/tokens) or directly from the CLI with the command `tb pipe token_read `. Alternatively, you can use the `TOKEN token_name READ` instruction to automatically create a Token named `token_name` with `READ` permissions over the endpoint, or to add `READ` permissions over the endpoint to the existing `token_name`. For example: TOKEN public_read_token READ NODE endpoint DESCRIPTION > returns top 10 products for the last week SQL > SELECT date, topKMerge(10)(top_10) AS top_10 FROM top_per_day WHERE date > today() - interval 7 day GROUP BY date To push the endpoint, run: ##### Push the top products Pipe tb push endpoints/top_products.pipe The Token `public_read_token` was created automatically and it's provided in the test URL. You can add parameters to any endpoint. For example, parametrize the dates to be able to filter the data between two dates: NODE endpoint DESCRIPTION > returns top 10 products for the last week SQL > % SELECT date, topKMerge(10)(top_10) AS top_10 FROM top_per_day WHERE date between {{Date(start)}} AND {{Date(end)}} GROUP BY date Now, the endpoint can receive `start` and `end` parameters: `{% user("apiHost") %}/v0/top_products.json?start=2018-09-07&end=2018-09-17&token=TOKEN`. You can print the results from the CLI using the `pipe data` command. For instance, for the previous example: ##### Print the results of the top products endpoint tb pipe data top_products --start '2018-09-07' --end '2018-09-17' --format CSV For parameter templating to work, you need to start your NODE SQL definition with the `%` character. ### Override an endpoint or a data Pipe¶ When working on a project, you might need to push several versions of the same file. You can override a Pipe that has already been pushed using the `--force` flag. For example: ##### Override the Pipe tb push endpoints/top_products_params.pipe --force If the endpoint has been called before, it runs regression tests with the most frequent requests. If the new version doesn't return the same data, then it's not pushed. The output shows how to run all the requests that were tested. You can force the push without running the checks using the `--no-check` flag if needed. For example: ##### Force override tb push endpoints/top_products_params.pipe --force --no-check ### Downloading datafiles from Tinybird¶ You can download datafiles using the `pull` command. For example: ##### Pull a specific file tb pull --match endpoint_im_working_on The previous command downloads the `endpoint_im_working_on.pipe` file to the current folder.
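Once the project is pushed, you can load the example fixtures to try the endpoints end to end. The following is a minimal sketch that assumes the fixture files from the project layout above and uses the `tb datasource append` command shown elsewhere in these docs:
##### Append the fixture data
tb datasource append events fixtures/events.csv
tb datasource append products fixtures/products.csv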
--- URL: https://www.tinybird.co/docs/cli/datafiles/datasource-files Last update: 2024-11-15T11:01:39.000Z Content: --- title: "Datasource files · Tinybird Docs" theme-color: "#171612" description: "Datasource files describe your Data Sources. Define the schema, engine, and other settings." --- # Datasource files (.datasource)¶ Datasource files describe your Data Sources. You can use .datasource files to define the schema, engine, and other settings of your Data Sources. See [Data Sources](https://www.tinybird.co/docs/docs/concepts/data-sources). ## Available instructions¶ The following instructions are available for .datasource files. | Declaration | Required | Description | | --- | --- | --- | | `SCHEMA ` | Yes | Defines a block for a Data Source schema. The block must be indented. | | `DESCRIPTION ` | No | Description of the Data Source. | | `TOKEN APPEND` | No | Grants append access to a Data Source to the Token with name `` . If the token doesn't exist, it's automatically created. | | `TAGS ` | No | Comma-separated list of tags. Tags are used to[ organize your data project](https://www.tinybird.co/docs/docs/production/organizing-resources) . | | `ENGINE ` | No | Sets the ClickHouse® Engine for Data Source. Default value is `MergeTree` . | | `ENGINE_SORTING_KEY ` | No | Sets the `ORDER BY` expression for the Data Source. If unset, it defaults to DateTime, numeric, or String columns, in that order. | | `ENGINE_PARTITION_KEY ` | No | Sets the `PARTITION` expression for the Data Source. | | `ENGINE_TTL ` | No | Sets the `TTL` expression for the Data Source. | | `ENGINE_VER ` | No | Column with the version of the object state. Required when using `ENGINE ReplacingMergeTree` . | | `ENGINE_SIGN ` | No | Column to compute the state. Required when using `ENGINE CollapsingMergeTree` or `ENGINE VersionedCollapsingMergeTree` . | | `ENGINE_VERSION ` | No | Column with the version of the object state. Required when `ENGINE VersionedCollapsingMergeTree` . | | `ENGINE_SETTINGS ` | No | Comma-separated list of key-value pairs that describe ClickHouse® engine settings for the Data Source. | | `SHARED_WITH ` | No | Shares the Data Source with one or more Workspaces. Use in combination with `--user_token` with admin rights in the origin Workspace. | The following example shows a typical .datasource file: ##### tinybird/datasources/example.datasource # A comment TOKEN tracker APPEND DESCRIPTION > Analytics events **landing data source** TAGS stock, recommendations SCHEMA > `timestamp` DateTime `json:$.timestamp`, `session_id` String `json:$.session_id`, `action` LowCardinality(String) `json:$.action`, `version` LowCardinality(String) `json:$.version`, `payload` String `json:$.payload` ENGINE "MergeTree" ENGINE_PARTITION_KEY "toYYYYMM(timestamp)" ENGINE_SORTING_KEY "timestamp" ENGINE_TTL "timestamp + toIntervalDay(60)" ENGINE_SETTINGS "index_granularity=8192" SHARED_WITH > analytics_production analytics_staging ### SCHEMA¶ A `SCHEMA` declaration is a newline, comma-separated list of columns definitions. For example: ##### Example SCHEMA declaration SCHEMA > `timestamp` DateTime `json:$.timestamp`, `session_id` String `json:$.session_id`, `action` LowCardinality(String) `json:$.action`, `version` LowCardinality(String) `json:$.version`, `payload` String `json:$.payload` Each column in a `SCHEMA` declaration is in the format ` ` , where: - `` is the name of the column in the Data Source. - `` is one of the supported[ Data Types](https://www.tinybird.co/docs/docs/concepts/data-sources#supported-data-types) . 
- `` is optional and only required for NDJSON Data Sources. See[ JSONpaths](https://www.tinybird.co/docs/docs/guides/ingesting-data/ingest-ndjson-data#jsonpaths) . - `` sets a default value to the column when it's null. A common use case is to set a default date to a column, like `updated_at DateTime DEFAULT now()` . To change or update JSONPaths or other default values in the schema, push a new version of the schema using `tb push --force` or use the [alter endpoint on the Data Sources API](https://www.tinybird.co/docs/docs/api-reference/datasource-api#post--v0-datasources-(.+)-alter). ### JSONPath expressions¶ `SCHEMA` definitions support JSONPath expressions. For example: ##### Schema syntax with jsonpath DESCRIPTION Generated from /Users/username/tmp/sample.ndjson SCHEMA > `d` DateTime `json:$.d`, `total` Int32 `json:$.total`, `from_novoa` Int16 `json:$.from_novoa` See [JSONPaths](https://www.tinybird.co/docs/docs/guides/ingesting-data/ingest-ndjson-data#jsonpaths) for more information. ### ENGINE settings¶ `ENGINE` declares the ClickHouse® engine used for the Data Source. The default value is `MergeTree`. The supported values for `ENGINE` are the following: - `MergeTree` - `ReplacingMergeTree` - `SummingMergeTree` - `AggregatingMergeTree` - `CollapsingMergeTree` - `VersionedCollapsingMergeTree` - `Null` Read [Supported engine and settings](https://www.tinybird.co/docs/docs/concepts/data-sources#supported-engines-settings) for more information. ## Connectors¶ Connectors settings are part of the .datasource content. You can use include files to reuse connection settings and credentials. ### Kafka, Confluent, RedPanda¶ The Kafka, Confluent, and RedPanda connectors use the following settings: | Instruction | Required | Description | | --- | --- | --- | | `KAFKA_CONNECTION_NAME` | Yes | The name of the configured Kafka connection in Tinybird. | | `KAFKA_BOOTSTRAP_SERVERS` | Yes | Comma-separated list of one or more Kafka brokers, including Port numbers. | | `KAFKA_KEY` | Yes | Key used to authenticate with Kafka. Sometimes called Key, Client Key, or Username, depending on the Kafka distribution. | | `KAFKA_SECRET` | Yes | Secret used to authenticate with Kafka. Sometimes called Secret, Secret Key, or Password, depending on the Kafka distribution. | | `KAFKA_TOPIC` | Yes | Name of the Kafka topic to consume from. | | `KAFKA_GROUP_ID` | Yes | Consumer Group ID to use when consuming from Kafka. | | `KAFKA_AUTO_OFFSET_RESET` | No | Offset to use when no previous offset can be found, for example when creating a new consumer. Supported values are `latest` , `earliest` . Default: `latest` . | | `KAFKA_STORE_HEADERS` | No | Store Kafka headers as field `__headers` for later processing. Default value is `'False'` . | | `KAFKA_STORE_BINARY_HEADERS` | No | Stores all Kafka headers as binary data in field `__headers` as a binary map of type `Map(String, String)` . To access the header `'key'` run: `__headers['key']` . Default value is `'True'` . This field only applies if `KAFKA_STORE_HEADERS` is set to `True` . | | `KAFKA_STORE_RAW_VALUE` | No | Stores the raw message in its entirety as an additional column. Supported values are `'True'` , `'False'` . Default: `'False'` . | | `KAFKA_SCHEMA_REGISTRY_URL` | No | URL of the Kafka schema registry. | | `KAFKA_TARGET_PARTITIONS` | No | Target partitions to place the messages. | | `KAFKA_KEY_AVRO_DESERIALIZATION` | No | Key for decoding Avro messages. | | `KAFKA_SSL_CA_PEM` | No | CA certificate in PEM format for SSL connections. 
| | `KAFKA_SASL_MECHANISM` | No | SASL mechanism to use for authentication. Supported values are `'PLAIN'` , `'SCRAM-SHA-256'` , `'SCRAM-SHA-512'` . Default values is `'PLAIN'` . | The following example defines a Data Source with a new Kafka, Confluent, or RedPanda connection in a .datasource file: ##### Data Source with a new Kafka/Confluent/RedPanda connection SCHEMA > `value` String, `topic` LowCardinality(String), `partition` Int16, `offset` Int64, `timestamp` DateTime, `key` String ENGINE "MergeTree" ENGINE_PARTITION_KEY "toYYYYMM(timestamp)" ENGINE_SORTING_KEY "timestamp" KAFKA_CONNECTION_NAME my_connection_name KAFKA_BOOTSTRAP_SERVERS my_server:9092 KAFKA_KEY my_username KAFKA_SECRET my_password KAFKA_TOPIC my_topic KAFKA_GROUP_ID my_group_id The following example defines a Data Source that uses an existing Kafka, Confluent, or RedPanda connection: ##### Data Source with an existing Kafka/Confluent/RedPanda connection SCHEMA > `value` String, `topic` LowCardinality(String), `partition` Int16, `offset` Int64, `timestamp` DateTime, `key` String ENGINE "MergeTree" ENGINE_PARTITION_KEY "toYYYYMM(timestamp)" ENGINE_SORTING_KEY "timestamp" KAFKA_CONNECTION_NAME my_connection_name KAFKA_TOPIC my_topic KAFKA_GROUP_ID my_group_id Refer to the [Kafka Connector](https://www.tinybird.co/docs/docs/ingest/kafka), [Confluent Connector](https://www.tinybird.co/docs/docs/ingest/confluent) , or [RedPanda Connector](https://www.tinybird.co/docs/docs/ingest/redpanda) documentation for more details. ### BigQuery and Snowflake¶ The BigQuery and Snowflake connectors use the following settings: | Instruction | Required | Description | | --- | --- | --- | | `IMPORT_SERVICE` | Yes | Name of the import service to use. Use `bigquery` or `snowflake` . | | `IMPORT_SCHEDULE` | Yes | Cron expression, in UTC time, with the frequency to run imports. Must be higher than 5 minutes. For example, `*/5 * * * *` . Use `@auto` to sync once per minute when using `s3` , or `@on-demand` to only run manually. | | `IMPORT_CONNECTION_NAME` | Yes | Name given to the connection inside Tinybird. For example, `'my_connection'` . | | `IMPORT_STRATEGY` | Yes | Strategy to use when inserting data. Use `REPLACE` for BigQuery and Snowflake. | | `IMPORT_EXTERNAL_DATASOURCE` | No | Fully qualified name of the source table in BigQuery and Snowflake. For example, `project.dataset.table` . | | `IMPORT_QUERY` | No | The `SELECT` query to retrieve your data from BigQuery or Snowflake when you don't need all the columns or want to make a transformation before ingest. The `FROM` clause must reference a table using the full scope. For example, `project.dataset.table` . | See [BigQuery Connector](https://www.tinybird.co/docs/docs/ingest/bigquery) or [Snowflake Connector](https://www.tinybird.co/docs/docs/ingest/snowflake) for more details. 
#### BigQuery example¶ The following example shows a BigQuery Data Source described in a .datasource file: ##### Data Source with a BigQuery connection DESCRIPTION > bigquery demo data source SCHEMA > `timestamp` DateTime `json:$.timestamp`, `id` Integer `json:$.id`, `orderid` LowCardinality(String) `json:$.orderid`, `status` LowCardinality(String) `json:$.status`, `amount` Integer `json:$.amount` ENGINE "MergeTree" ENGINE_PARTITION_KEY "toYYYYMM(timestamp)" ENGINE_SORTING_KEY "timestamp" ENGINE_TTL "timestamp + toIntervalDay(60)" IMPORT_SERVICE bigquery IMPORT_SCHEDULE */5 * * * * IMPORT_EXTERNAL_DATASOURCE mydb.raw.events IMPORT_STRATEGY REPLACE IMPORT_QUERY > select timestamp, id, orderid, status, amount from mydb.raw.events #### Snowflake example¶ The following example shows a Snowflake Data Source described in a .datasource file: ##### tinybird/datasources/snowflake.datasource - Data Source with a Snowflake connection DESCRIPTION > Snowflake demo data source SCHEMA > `timestamp` DateTime `json:$.timestamp`, `id` Integer `json:$.id`, `orderid` LowCardinality(String) `json:$.orderid`, `status` LowCardinality(String) `json:$.status`, `amount` Integer `json:$.amount` ENGINE "MergeTree" ENGINE_PARTITION_KEY "toYYYYMM(timestamp)" ENGINE_SORTING_KEY "timestamp" ENGINE_TTL "timestamp + toIntervalDay(60)" IMPORT_SERVICE snowflake IMPORT_CONNECTION_NAME my_snowflake_connection IMPORT_EXTERNAL_DATASOURCE mydb.raw.events IMPORT_SCHEDULE */5 * * * * IMPORT_STRATEGY REPLACE IMPORT_QUERY > select timestamp, id, orderid, status, amount from mydb.raw.events ### S3¶ The S3 connector uses the following settings: | Instruction | Required | Description | | --- | --- | --- | | `IMPORT_SERVICE` | Yes | Name of the import service to use. Use `s3` for S3 connections. | | `IMPORT_CONNECTION_NAME` | Yes | Name given to the connection inside Tinybird. For example, `'my_connection'` . | | `IMPORT_STRATEGY` | Yes | Strategy to use when inserting data. Use `APPEND` for S3 connections. | | `IMPORT_BUCKET_URI` | Yes | Full bucket path, including the `s3://` protocol, bucket name, object path, and an optional pattern to match against object keys. For example, `s3://my-bucket/my-path` discovers all files in the bucket `my-bucket` under the prefix `/my-path` . You can use patterns in the path to filter objects, for example, ending the path with `*.csv` matches all objects that end with the `.csv` suffix. | | `IMPORT_FROM_DATETIME` | No | Sets the date and time from which to start ingesting files on an S3 bucket. The format is `YYYY-MM-DDTHH:MM:SSZ` . | See [S3 Connector](https://www.tinybird.co/docs/docs/ingest/s3) for more details. #### S3 example¶ The following example shows an S3 Data Source described in a .datasource file: ##### tinybird/datasources/s3.datasource - Data Source with an S3 connection DESCRIPTION > Analytics events landing data source SCHEMA > `timestamp` DateTime `json:$.timestamp`, `session_id` String `json:$.session_id`, `action` LowCardinality(String) `json:$.action`, `version` LowCardinality(String) `json:$.version`, `payload` String `json:$.payload` ENGINE "MergeTree" ENGINE_PARTITION_KEY "toYYYYMM(timestamp)" ENGINE_SORTING_KEY "timestamp" ENGINE_TTL "timestamp + toIntervalDay(60)" IMPORT_SERVICE s3 IMPORT_CONNECTION_NAME connection_name IMPORT_BUCKET_URI s3://my-bucket/*.csv IMPORT_SCHEDULE @auto IMPORT_STRATEGY APPEND Tinybird is not affiliated with, associated with, or sponsored by ClickHouse, Inc. ClickHouse® is a registered trademark of ClickHouse, Inc. 
--- URL: https://www.tinybird.co/docs/cli/datafiles/include-files Last update: 2024-11-16T15:39:34.000Z Content: --- title: "Include files · Tinybird Docs" theme-color: "#171612" description: "Include files help you organize settings so that you can reuse them across .datasource and .pipe files." --- # Include files (.incl)¶ Include files (.incl) help separate connector settings and reuse them across multiple .datasource files or .pipe templates. Include files are referenced using `INCLUDE` instruction. ## Connector settings¶ Use .incl files to separate connector settings from .datasource files. For example, the following .incl file contains Kafka Connector settings: ##### tinybird/datasources/connections/kafka\_connection.incl KAFKA_CONNECTION_NAME my_connection_name KAFKA_BOOTSTRAP_SERVERS my_server:9092 KAFKA_KEY my_username KAFKA_SECRET my_password While the .datasource file only contains a reference to the .incl file using `INCLUDE`: ##### tinybird/datasources/kafka\_ds.datasource SCHEMA > `value` String, `topic` LowCardinality(String), `partition` Int16, `offset` Int64, `timestamp` DateTime, `key` String ENGINE "MergeTree" ENGINE_PARTITION_KEY "toYYYYMM(timestamp)" ENGINE_SORTING_KEY "timestamp" INCLUDE "connections/kafka_connection.incl" KAFKA_TOPIC my_topic KAFKA_GROUP_ID my_group_id ### Pipe nodes¶ You can use .incl datafiles to [reuse node templates](https://www.tinybird.co/docs/docs/cli/advanced-templates#reusing-templates). For example, the following .incl file contains a node template: ##### tinybird/includes/only\_buy\_events.incl NODE only_buy_events SQL > SELECT toDate(timestamp) date, product, color, JSONExtractFloat(json, 'price') as price FROM events where action = 'buy' The .pipe file starts with the `INCLUDE` reference to the template: ##### tinybird/endpoints/sales.pipe INCLUDE "../includes/only_buy_events.incl" NODE endpoint DESCRIPTION > return sales for a product with color filter SQL > % select date, sum(price) total_sales from only_buy_events where color in {{Array(colors, 'black')}} group by date A different .pipe file can reuse the sample template: ##### tinybird/pipes/top\_per\_day.pipe INCLUDE "../includes/only_buy_events.incl" NODE top_per_day SQL > SELECT date, topKState(10)(product) top_10, sumState(price) total_sales from only_buy_events group by date TYPE MATERIALIZED DATASOURCE mv_top_per_day ### Include with variables¶ You can templatize .incl files. For instance you can reuse the same .incl template with different variable values: ##### tinybird/includes/top\_products.incl NODE endpoint DESCRIPTION > returns top 10 products for the last week SQL > % select date, topKMerge(10)(top_10) as top_10 from top_product_per_day {% if '$DATE_FILTER' == 'last_week' %} where date > today() - interval 7 day {% else %} where date between {{Date(start)}} and {{Date(end)}} {% end %} group by date The `$DATE_FILTER` parameter is a variable in the .incl file. The following examples show how to create two separate endpoints by injecting a value for the `DATE_FILTER` variable. 
The following .pipe file references the template using a `last_week` value for `DATE_FILTER`: ##### tinybird/endpoints/top\_products\_last\_week.pipe INCLUDE "../includes/top_products.incl" "DATE_FILTER=last_week" Whereas the following .pipe file references the template using a `between_dates` value for `DATE_FILTER`: ##### tinybird/endpoints/top\_products\_between\_dates.pipe INCLUDE "../includes/top_products.incl" "DATE_FILTER=between_dates" ### Include with environment variables¶ Because you can expand `INCLUDE` files using the Tinybird CLI, you can use environment variables. For example, if you have configured the `KAFKA_BOOTSTRAP_SERVERS`, `KAFKA_KEY` , and `KAFKA_SECRET` environment variables, you can create an .incl file as follows: ##### tinybird/datasources/connections/kafka\_connection.incl KAFKA_CONNECTION_NAME my_connection_name KAFKA_BOOTSTRAP_SERVERS ${KAFKA_BOOTSTRAP_SERVERS} KAFKA_KEY ${KAFKA_KEY} KAFKA_SECRET ${KAFKA_SECRET} You can then use the values in your .datasource datafiles: ##### tinybird/datasources/kafka\_ds.datasource SCHEMA > `value` String, `topic` LowCardinality(String), `partition` Int16, `offset` Int64, `timestamp` DateTime, `key` String ENGINE "MergeTree" ENGINE_PARTITION_KEY "toYYYYMM(timestamp)" ENGINE_SORTING_KEY "timestamp" INCLUDE "connections/kafka_connection.incl" KAFKA_TOPIC my_topic KAFKA_GROUP_ID my_group_id Alternatively, you can create separate .incl files per environment variable: ##### tinybird/datasources/connections/kafka\_connection\_prod.incl KAFKA_CONNECTION_NAME my_connection_name KAFKA_BOOTSTRAP_SERVERS production_servers KAFKA_KEY the_kafka_key KAFKA_SECRET ${KAFKA_SECRET} ##### tinybird/datasources/connections/kafka\_connection\_stg.incl KAFKA_CONNECTION_NAME my_connection_name KAFKA_BOOTSTRAP_SERVERS staging_servers KAFKA_KEY the_kafka_key KAFKA_SECRET ${KAFKA_SECRET} And then include both depending on the environment: ##### tinybird/datasources/kafka\_ds.datasource SCHEMA > `value` String, `topic` LowCardinality(String), `partition` Int16, `offset` Int64, `timestamp` DateTime, `key` String ENGINE "MergeTree" ENGINE_PARTITION_KEY "toYYYYMM(timestamp)" ENGINE_SORTING_KEY "timestamp" INCLUDE "connections/kafka_connection_${TB_ENV}.incl" KAFKA_TOPIC my_topic KAFKA_GROUP_ID my_group_id Where `$TB_ENV` is one of `stg` or `prod`. See [deploy to staging and production environments](https://www.tinybird.co/docs/docs/production/staging-and-production-workspaces) to learn how to leverage environment variables. --- URL: https://www.tinybird.co/docs/cli/datafiles/overview Last update: 2024-11-13T15:42:17.000Z Content: --- title: "Datafiles · Tinybird Docs" theme-color: "#171612" description: "Datafiles describe your Tinybird resources: Data Sources, Pipes, and so on. They're the source code of your project." --- # Datafiles¶ Datafiles describe your Tinybird resources, like Data Sources, Pipes, and so on. They're the source code of your project. You can use datafiles to manage your projects as source code and take advantage of version control. Tinybird CLI helps you produce and push datafiles to the Tinybird platform. ## Types of datafiles¶ Tinybird uses the following types of datafiles: - Datasource files (.datasource) represent Data Sources. See[ Datasource files](https://www.tinybird.co/docs/docs/cli/datafiles/datasource-files) . - Pipe files (.pipe) represent Pipes of various types. See[ Pipe files](https://www.tinybird.co/docs/docs/cli/datafiles/pipe-files) . 
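As a sketch of how this can look in practice, assuming the CLI expands the environment variables available in your shell as described above, you could select the staging settings before pushing:
##### Push using the staging connection settings
export TB_ENV=stg
export KAFKA_SECRET=your_staging_secret
tb push datasources/kafka_ds.datasource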
- Include files (.incl) are reusable fragments you can include in .datasource or .pipe files. See[ Include files](https://www.tinybird.co/docs/docs/cli/datafiles/include-files) . ## Syntactic conventions¶ Datafiles follow the same syntactic conventions. ### Casing¶ Instructions always appear at the beginning of a line in upper case. For example: ##### Basic syntax COMMAND value ANOTHER_INSTR "Value with multiple words" ### Multiple lines¶ Instructions can span multiples lines. For example: ##### Multiline syntax SCHEMA > `d` DateTime, `total` Int32, `from_novoa` Int16 ## File structure¶ The following example shows a typical `tinybird` project directory that includes subdirectories for supported types: ##### Example file structure tinybird ├── datasources/ │ └── connections/ │ └── my_connector_name.incl │ └── my_datasource.datasource ├── endpoints/ ├── includes/ ├── pipes/ ## Next steps¶ - Understand[ CI/CD processes on Tinybird](https://www.tinybird.co/docs/docs/production/continuous-integration) . - Read about[ implementing test strategies](https://www.tinybird.co/docs/docs/production/implementing-test-strategies) . --- URL: https://www.tinybird.co/docs/cli/datafiles/pipe-files Last update: 2024-11-13T15:42:17.000Z Content: --- title: "Pipe files · Tinybird Docs" theme-color: "#171612" description: "Pipe files describe your Tinybird Pipes. Define the type, Data Source, and other settings." --- # Pipe files (.pipe)¶ Pipe files describe your Pipes. You can use .pipe files to define the type, starting node, Data Source, and other settings of your Pipes. See [Data Sources](https://www.tinybird.co/docs/docs/concepts/pipes). ## Available instructions¶ The following instructions are available for .pipe files. | Instruction | Required | Description | | --- | --- | --- | | `%` | No | Use as the first character of a node to indicate the node uses the[ templating system](https://www.tinybird.co/docs/docs/cli/advanced-templates#template-functions) . | | `DESCRIPTION ` | No | Sets the description for a node or the complete file. | | `TAGS ` | No | Comma-separated list of tags. Tags are used to[ organize your data project](https://www.tinybird.co/docs/docs/production/organizing-resources) . | | `NODE ` | Yes | Starts the definition of a new node. All the instructions until a new `NODE` instruction or the end of the file are related to this node. | | `SQL ` | Yes | Defines a block for the SQL of a node. The block must be indented. | | `INCLUDE ` | No | Includes are pieces of a Pipe that you can reuse in multiple .pipe files. | | `TYPE ` | No | Sets the type of the node. Valid values are `ENDPOINT` , `MATERIALIZED` , `COPY` , or `SINK` . | | `DATASOURCE ` | Yes | Required when `TYPE` is `MATERIALIZED` . Sets the destination Data Source for materialized nodes. | | `TARGET_DATASOURCE ` | Yes | Required when `TYPE` is `COPY` . Sets the destination Data Source for copy nodes. | | `TOKEN READ` | No | Grants read access to a Pipe or Endpoint to the Token with name `` . If the Token doesn't exist it's created automatically. | | `COPY_SCHEDULE` | No | Cron expression with the frequency to run copy jobs. Must be higher than 5 minutes. For example, `*/5 * * * *` . If undefined, it defaults to `@on-demand` . | | `COPY_MODE` | No | Strategy to ingest data for copy jobs. One of `append` or `replace` . If empty, the default strategy is `append` . | ## Materialized Pipe¶ In a .pipe file you can define how to materialize each row ingested in the earliest Data Source in the Pipe query to a materialized Data Source. 
Materialization happens at ingest. The following example shows how to describe a Materialized Pipe. See [Materialized Views](https://www.tinybird.co/docs/docs/publish/materialized-views/overview). ##### tinybird/pipes/sales\_by\_hour\_mv.pipe DESCRIPTION Materialized Pipe to aggregate sales per hour in the sales_by_hour Data Source NODE daily_sales SQL > SELECT toStartOfDay(starting_date) day, country, sum(sales) as total_sales FROM teams GROUP BY day, country TYPE MATERIALIZED DATASOURCE sales_by_hour ## Copy Pipe¶ In a .pipe file you can define how to export the result of a Pipe to a Data Source, optionally with a schedule. The following example shows how to describe a Copy Pipe. See [Copy Pipes](https://www.tinybird.co/docs/docs/publish/copy-pipes). ##### tinybird/pipes/sales\_by\_hour\_cp.pipe DESCRIPTION Copy Pipe to export sales by hour every hour to the sales_hour_copy Data Source NODE daily_sales SQL > % SELECT toStartOfDay(starting_date) day, country, sum(sales) as total_sales FROM teams WHERE day BETWEEN toStartOfDay(now()) - interval 1 day AND toStartOfDay(now()) and country = {{ String(country, 'US')}} GROUP BY day, country TYPE COPY TARGET_DATASOURCE sales_hour_copy COPY_SCHEDULE 0 * * * * ## API Endpoint Pipe¶ In a .pipe file you can define how to export the result of a Pipe as an HTTP endpoint. The following example shows how to describe an API Endpoint Pipe. See [API Endpoints](https://www.tinybird.co/docs/docs/publish/api-endpoints/overview). ##### tinybird/pipes/sales\_by\_hour\_endpoint.pipe TOKEN dashboard READ DESCRIPTION endpoint to get sales by hour filtering by date and country TAGS sales NODE daily_sales SQL > % SELECT day, country, sum(total_sales) as total_sales FROM sales_by_hour WHERE day BETWEEN toStartOfDay(now()) - interval 1 day AND toStartOfDay(now()) and country = {{ String(country, 'US')}} GROUP BY day, country NODE result SQL > % SELECT * FROM daily_sales LIMIT {{Int32(page_size, 100)}} OFFSET {{Int32(page, 0) * Int32(page_size, 100)}} ## Sink Pipe¶ The following parameters are available when defining Sink Pipes: | Instruction | Required | Description | | --- | --- | --- | | `EXPORT_SERVICE` | Yes | One of `gcs_hmac` , `s3` , `s3_iamrole` , or `kafka` . | | `EXPORT_CONNECTION_NAME` | Yes | The name of the export connection. | | `EXPORT_SCHEDULE` | No | Cron expression, in UTC time. Must be higher than 5 minutes. For example, `*/5 * * * *` . | ### Blob storage Sink¶ When setting `EXPORT_SERVICE` as one of `gcs_hmac` , `s3` , or `s3_iamrole` , you can use the following instructions: | Instruction | Required | Description | | --- | --- | --- | | `EXPORT_BUCKET_URI` | Yes | The desired bucket path for the exported file. Path must not include the filename and extension. | | `EXPORT_FILE_TEMPLATE` | Yes | Template string that specifies the naming convention for exported files. The template can include dynamic attributes between curly braces based on columns' data that will be replaced with real values when exporting. For example: `export_{category}{date,'%Y'}{2}` . | | `EXPORT_FORMAT` | Yes | Format in which the data is exported. Supported output formats are listed[ in the ClickHouse® documentation](https://clickhouse.com/docs/en/interfaces/formats#formats) . The default value is `csv` . | | `EXPORT_COMPRESSION` | No | Compression file type. Accepted values are `none` , `gz` for gzip, `br` for brotli, `xz` for LZMA, `zst` for zstd. Default value is `none` . | | `EXPORT_STRATEGY` | Yes | One of the available strategies. The default is `@new`.
| ### Kafka Sink¶ Kafka Sinks are currently in private beta. If you have any feedback or suggestions, contact Tinybird at [support@tinybird.co](https://www.tinybird.co/docs/mailto:support@tinybird.co) or in the [Community Slack](https://www.tinybird.co/docs/docs/community). When setting `EXPORT_SERVICE` as `kafka` , you can use the following instructions: | Instruction | Required | Description | | --- | --- | --- | | `EXPORT_KAFKA_TOPIC` | Yes | The desired topic for the export data. | Tinybird is not affiliated with, associated with, or sponsored by ClickHouse, Inc. ClickHouse® is a registered trademark of ClickHouse, Inc. --- URL: https://www.tinybird.co/docs/cli/install Last update: 2024-11-15T08:19:14.000Z Content: --- title: "Install Tinybird CLI · Tinybird Docs" theme-color: "#171612" description: "Install the Tinybird CLI on Linux or macOS, or use the prebuilt Docker image." --- # Install the Tinybird CLI¶ You can install Tinybird on your local machine or use a prebuilt Docker image. Read on to learn how to install and configure Tinybird CLI for use. ## Installation¶ Install Tinybird CLI locally to use it on your machine. ### Prerequisites¶ Tinybird CLI supports Linux and macOS 10.14 and higher. Supported Python versions are 3.8, 3.9, 3.10, 3.11, 3.12. ### Install tinybird-cli¶ Create a virtual environment before installing the `tinybird-cli` package: ##### Creating a virtual environment for Python 3 python3 -m venv .venv source .venv/bin/activate Then, install `tinybird-cli`: ##### Install tinybird-cli pip install tinybird-cli ## Docker image¶ The official `tinybird-cli-docker` image provides a Tinybird CLI executable ready to use in your projects and pipelines. To run Tinybird CLI using Docker from the terminal, run the following commands: ##### Build local image setting your project path # Assuming a projects/data path docker run -v ~/projects/data:/mnt/data -it tinybirdco/tinybird-cli-docker cd mnt/data ## Authentication¶ Before you start using Tinybird CLI, check that you can authenticate by running `tb auth`: ##### Authenticate tb auth -i A list of available regions appears. Select your Tinybird region, then provide your admin Token. See [Tokens](https://www.tinybird.co/docs/docs/concepts/auth-tokens). You can also pass the Token directly with the `--token` flag. For example: ##### Authenticate tb auth --token See the API Reference docs for the [list of supported regions](https://www.tinybird.co/docs/docs/api-reference/overview#regions-and-endpoints) . You can also get the list using `tb auth ls`. The `tb auth` command saves your credentials in a .tinyb file in your current directory. Add it to your .gitignore file to avoid leaking credentials. ## Integrated help¶ After you've installed the Tinybird CLI you can access the integrated help by using the `--help` flag: ##### Integrated help tb --help You can do the same for every available command. For example: ##### Integrated command help tb datasource --help ## Telemetry¶ Starting from version 1.0.0b272, the Tinybird CLI collects telemetry on the use of the CLI commands and information about exceptions and crashes and sends it to Tinybird. Telemetry helps Tinybird improve the command-line experience. On each `tb` execution, the CLI collects information about your system, Python environment, the CLI version installed and the command you ran. All data is completely anonymous. To opt out of the telemetry feature, set the `TB_CLI_TELEMETRY_OPTOUT` environment variable to `1` or `true`. 
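For example, to opt out in your current shell session or in a CI job, export the variable before running any `tb` command:
##### Opt out of CLI telemetry
export TB_CLI_TELEMETRY_OPTOUT=1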
## Configure your shell prompt¶ You can extract the current Tinybird Workspace and region from your .tinyb file and show it in your zsh or bash shell prompt. To extract the information programmatically, paste the following function to your shell config file: ##### Parse the .tinyb file to use the output in the PROMPT prompt_tb() { if [ -e ".tinyb" ]; then TB_CHAR=$'\U1F423' branch_name=`grep '"name":' .tinyb | cut -d : -f 2 | cut -d '"' -f 2` region=`grep '"host":' .tinyb | cut -d / -f 3 | cut -d . -f 2 | cut -d : -f 1` if [ "$region" = "tinybird" ]; then region=`grep '"host":' .tinyb | cut -d / -f 3 | cut -d . -f 1` fi TB_BRANCH="${TB_CHAR}tb:${region}=>${branch_name}" else TB_BRANCH='' fi echo $TB_BRANCH } When the function is available, you need to make the output visible on the prompt of your shell. The following example shows how to do this for zsh: ##### Include Tinybird information in the zsh prompt echo 'export PROMPT="' $PS1 ' $(prompt_tb)"' >> ~/.zshrc Restart your shell and go to the root of your project to see the Tinybird region and Workspace in your prompt. --- URL: https://www.tinybird.co/docs/cli/overview Last update: 2024-11-12T11:45:41.000Z Content: --- title: "Tinybird CLI · Tinybird Docs" theme-color: "#171612" description: "Use the Tinybird CLI to access all the Tinybird features from the command line." --- # Tinybird command-line interface (CLI)¶ Use the Tinybird CLI to access all the Tinybird features directly from the command line. You can test and run commands from the terminal or integrate the CLI in your pipelines. - Read the Quick start guide. See[ Quick start](https://www.tinybird.co/docs/docs/cli/quick-start) . - Install Tinybird CLI on your machine. See[ Install](https://www.tinybird.co/docs/docs/cli/install) . - Learn about Tinybird datafiles and their format. See[ Datafiles](https://www.tinybird.co/docs/docs/cli/datafiles/overview) . - Organize your CLI projects. See[ Data projects](https://www.tinybird.co/docs/docs/cli/data-projects) . --- URL: https://www.tinybird.co/docs/cli/quick-start Last update: 2024-11-12T11:45:41.000Z Content: --- title: "Quick start · Tinybird Docs" theme-color: "#171612" description: "Get started with Tinybird CLI as quickly as possible. Ingest, query, and publish data in minutes." --- # Quick start for Tinybird command-line interface¶ With Tinybird, you can ingest data from anywhere, query and transform it using SQL, and publish your data as high-concurrency, low-latency REST API endpoints. After you've [familiarized yourself with Tinybird](https://www.tinybird.co/docs/docs/quick-start) , you're ready to start automating and scripting the management of your Workspace using the Tinybird command-line interface (CLI). The Tinybird CLI is essential for all [CI/CD workflows](https://www.tinybird.co/docs/docs/production/continuous-integration). Read on to learn how to download and configure the Tinybird CLI, create a Workspace, ingest data, create a query, publish an API, and confirm your setup works properly. ## Step 1: Create your Tinybird account¶ [Create a Tinybird account](https://www.tinybird.co/signup) . It's free and no credit card is required. See [Tinybird pricing plans](https://www.tinybird.co/docs/docs/support/billing) for more information. [Sign up for Tinybird](https://www.tinybird.co/signup) ## Step 2: Download and install the Tinybird CLI¶ [Follow the instructions](https://www.tinybird.co/docs/docs/cli/install) to download and install the Tinybird command-line interface (CLI). 
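If you're installing locally, this boils down to the same commands covered in the install section earlier in this document, recapped here:
##### Install the CLI in a virtual environment
python3 -m venv .venv
source .venv/bin/activate
pip install tinybird-cli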
Complete the setup and authenticate with your Tinybird account in the cloud and region you prefer. ## Step 3: Create your Workspace¶ A [Workspace](https://www.tinybird.co/docs/docs/concepts/workspaces) is an area that contains a set of Tinybird resources, including Data Sources, Pipes, Nodes, API Endpoints, and Tokens. Create a Workspace named `customer_rewards` . Use a unique name. tb workspace create customer_rewards ## Step 4: Download and ingest sample data¶ Download the following sample data from a fictitious online coffee shop: [Download data file](https://d37ci6vzurychx.cloudfront.net/trip-data/yellow_tripdata_2024-05.parquet) The following Tinybird CLI commands infer the schema from the datafile, generate and push a .datasource file and ingest the data. tb datasource generate orders.ndjson # Infer the schema tb push orders.datasource # Upload the datasource file tb datasource append orders orders.ndjson # Ingest the data ## Step 5: Query data using a Pipe and Publish it as an API¶ In Tinybird, you can create [Pipes](https://www.tinybird.co/docs/docs/concepts/pipes) to query your data using SQL. The following commands create a Pipe with an SQL instruction that returns the number of records Tinybird has ingested from the data file: tb pipe generate rewards 'select count() from orders' tb push rewards.pipe When you push a Pipe, Tinybird publishes it automatically as a high-concurrency, low-latency API Endpoint. ## Step 6: Call the API Endpoint¶ You can test your API Endpoint using a curl command. First, create and obtain the read Token for the API Endpoint. tb token create static rewards_read_token --scope PIPES:READ --resource rewards tb token copy rewards_read_token Copy the read Token and insert it into a curl command. curl --compressed -H 'Authorization: Bearer your_read_token_here' https://api.us-east.aws.tinybird.co/v0/pipes/rewards.json You have now created your first API Endpoint in Tinybird using the CLI. ## Next steps¶ - Learn about datafiles and their format. See[ Datafiles](https://www.tinybird.co/docs/docs/cli/datafiles/overview) . - Learn how advanced templates can help you. See[ Advanced templates](https://www.tinybird.co/docs/docs/cli/advanced-templates) . - Browse the full CLI reference. See[ Command reference](https://www.tinybird.co/docs/docs/cli/command-ref) . --- URL: https://www.tinybird.co/docs/cli/workspaces Last update: 2024-11-12T11:45:41.000Z Content: --- title: "Manage Workspaces using the CLI · Tinybird Docs" theme-color: "#171612" description: "Learn how to switch between different Tinybird Workspaces and how to manage members using the CLI." --- # Manage Workspaces using the CLI¶ If you are a member of different Workspaces, you might need to frequently switch between Workspaces when working on a project using Tinybird CLI. This requires to authenticate and select the right Workspace. ## Authenticate¶ Authenticate using the admin Token. For example: ##### Authenticate tb auth --token ## List Workspaces¶ List the Workspaces you have access to, and the one that you're currently authenticated to: ##### List Workspaces tb workspace ls ## Create a Workspace¶ You can create new empty Workspaces or create a Workspace from a template. To create Workspaces using Tinybird CLI, you need [your user Token](https://www.tinybird.co/docs/docs/concepts/auth-tokens#your-user-token). 
Run the following command to create a Workspace following instructions: ##### Authenticate tb workspace create You can create a Workspace directly by defining the user Token using the `--user_token` flag: ##### Authenticate tb workspace create workspace_name --user_token ## Switch to another Workspace¶ You can switch to another Workspace using `--use` . For example: ##### Switch to a Workspace using the Workspace id or the Workspace name # Use the Workspace ID tb workspace use 841717b1-2472-44f9-9a81-42f1263cabe7 # Use the Workspace name tb workspace use Production To find out the IDs and names of available Workspaces, run `tb workspace ls`: tb workspace ls You can also check which Workspace you're currently in: ##### Show current Workspace tb workspace current ## Manage Workspace members¶ You can manage Workspace members using the `workspace members` commands. ### List members¶ To list members, run `tb workspace members ls` . For example: ##### Listing current Workspace members tb workspace members ls ### Add members¶ To add members, run `tb workspace members add` . For example: ##### Adding users to the current Workspace tb workspace members add "user1@example.com,user2@example.com,user3@example.com" ### Remove members¶ To remove members, run `tb workspace members rm` . For example: ##### Removing members from current Workspace tb workspace members rm user3@example.com You can also manage roles. For example, to set a user as admin: ##### Add admin role to user tb workspace members set-role admin user@example.com --- URL: https://www.tinybird.co/docs/compliance Last update: 2024-10-24T13:35:00.000Z Content: --- title: "Compliance and certifications · Tinybird Docs" theme-color: "#171612" description: "Tinybird is committed to the highest data security and safety. See what compliance certifications are available." --- # Compliance and certifications¶ Data security and privacy are paramount in today's digital landscape. Tinybird's commitment to protecting your sensitive information is backed by the following compliance certifications, which ensure that we meet rigorous industry standards for data security, privacy, and operational excellence. ## SOC 2 Type II¶ Tinybird has obtained a SOC 2 Type II certification, in accordance with attestation standards established by the American Institute of Certified Public Accountants (AICPA), that are relevant to security, availability, processing integrity, confidentiality, and privacy for Tinybird's real-time platform for user-facing analytics. Compliance is monitored continually—with reports published annually—to confirm the robustness of Tinybird's data security. This independent assessment provides Tinybird users with assurance that their sensitive information is being handled responsibly and securely. ## HIPAA¶ Tinybird supports its customers’ Health Insurance Portability and Accountability Act (HIPAA) compliance efforts by offering Business Associate Agreements (BAAs). Additionally, Tinybird’s offering allows customers to process their data constituting personal health information (PHI) in AWS, Azure, or Google Cloud—entities which themselves have entered into BAAs with Tinybird. ## Trust center¶ To learn more about Tinybird security controls and certifications, visit the [Tinybird Trust Center](https://trust.tinybird.co/). 
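Your user Token also lets you delete a Workspace where you're an admin. The following is a minimal sketch that assumes a `tb workspace delete` subcommand following the same pattern as `create`:
##### Delete a Workspace
tb workspace delete workspace_name --user_token <your_user_token>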
--- URL: https://www.tinybird.co/docs/concepts/auth-tokens Last update: 2024-10-28T11:06:14.000Z Content: --- title: "Tokens · Tinybird Docs" theme-color: "#171612" description: "Tokens allow you to secure your Tinybird resources, providing fine-grained access control to your users. Learn all about Tokens here!" --- # Tokens¶ ## What is a Token?¶ Tokens protect access to your Tinybird resources. Any operations to manage your Tinybird resources via the CLI or REST API require a valid Token with the necessary permissions. Access to the APIs you publish in Tinybird are also protected with Tokens. Tokens can have different scopes. This means you can limit which operations a specific Token can do. You can create Tokens that are, for example, only able to do admin operations on Tinybird resources, or only have `READ` permission for a specific Data Source. Tinybird represents tokens using the icon. ## What should I use Tokens for?¶ There are **two types** of Tokens. - When performing operations on your account (like importing data, creating Data Sources, or publishing APIs via the CLI or REST API) you need a Token. For this purpose,** use a Static[ Token](https://www.tinybird.co/docs/about:blank#tokens)** . - When publishing an API that exposes your data to an application, you need a Token to successfully interact with the API. For this purpose,** use a[ JWT](https://www.tinybird.co/docs/about:blank#json-web-tokens-jwts) .** ### Monitor Token usage¶ You can monitor Token usage using Tinybird's Service Data Sources. See the guide ["Monitor API Performance"](https://www.tinybird.co/docs/docs/guides/monitoring/analyze-endpoints-performance#example-4-monitor-usage-of-tokens) for more. ## Tokens¶ Tinybird Tokens, also known as "Static Tokens", are permanent and meant to be long-term. They are stored inside Tinybird and don't have an expiration date or time. They are useful for backend-to-backend integrations (where you call Tinybird as another service). ### Token scopes¶ When a Token is created, you have the choice to give it a set of zero or more scopes that define which tables can be accessed by that Token, and which methods can be used to access them. A `READ` Token can be augmented with a SQL filter. This allows you to further restrict what data a Token grants access to. Using this, you can [implement row-level security](https://www.tinybird.co/blog-posts/row-level-security-in-tinybird) on a `READ` -scoped Token. **Available scopes syntax** | Value | Description | | --- | --- | | `DATASOURCES:CREATE` | Enables your Token to create and append data to Data Sources. | | `DATASOURCES:APPEND:datasource_name` | Allows your Token to append data to the defined Data Sources. | | `DATASOURCES:DROP:datasource_name` | Allows your Token to delete the specified Data Sources. | | `DATASOURCES:READ:datasource_name` | Gives your Token read permissions for the specified Data Sources. Also gives read for the quarantine Data Source. | | `DATASOURCES:READ:datasource_name:sql_filter` | Gives your Token read permissions for the specified table with the `sql_filter` applied. | | `PIPES:CREATE` | Allows your Token to create new Pipes and manipulate existing ones. | | `PIPES:DROP:pipe_name` | Allows your Token to delete the specified Pipe. | | `PIPES:READ:pipe_name` | Gives your Token read permissions for the specified Pipe. | | `PIPES:READ:pipe_name:sql_filter` | Gives your Token read permissions for the specified Pipe with the `sql_filter` applied. | | `TOKENS` | Gives your Token the capacity of managing Tokens. 
| | `ADMIN` | All permissions will be granted. You should not use this Token except in really specific cases. Use it carefully! | When adding the `DATASOURCES:READ` scope to a Token, it automatically gives read permissions to the [quarantine Data Source](https://www.tinybird.co/docs/docs/concepts/data-sources#the-quarantine-data-source) associated with it. Applying Tokens with filters to queries that use the FINAL clause is not supported. If you need to apply auth filters to deduplications, use an alternative strategy (see [deduplication strategies](https://www.tinybird.co/docs/docs/guides/querying-data/deduplication-strategies#different-alternatives-based-on-your-requirements) ). ### Default Workspace Tokens¶ All Workspaces are created with a set of basic Tokens that you can add to by creating additional Tokens: - `admin token` for that Workspace (for signing JWTs). - `admin token` for that Workspace that belongs specifically to your user account (for using with Tinybird CLI). - `create datasource token` for creating Data Sources in that Workspace. - `user token` for creating new Workspaces or deleting ones where are an admin (see below). Some Tokens are created automatically by Tinybird during certain operations like scheduled copies, and [these Tokens can be updated](https://www.tinybird.co/docs/docs/publish/copy-pipes#change-copy-pipe-token-reference). ### Your User Token¶ Your User Token is specific to your user account. It's a permanent Token that enables you to perform operations that are not limited to a single Workspace, such as creating new Workspaces. You can only obtain your User Token from your Workspace in the Tinybird UI > Tokens > `user token`. ### Create a Token¶ In the Tinybird UI, navigate to Tokens > Plus (+) icon. Descriptively rename the new Token and update its scopes using the table above as a guide. ## JSON Web Tokens (JWTs) BETA¶ JWTs are currently in public beta. They are not feature-complete and may change in the future. If you have any feedback or suggestions, contact Tinybird at [support@tinybird.co](https://www.tinybird.co/docs/mailto:support@tinybird.co) or in the [Community Slack](https://www.tinybird.co/docs/docs/community). JWTs are signed tokens that allow you to securely authorize and share data between your application and Tinybird. If you want to read more about JWTs, check out the [JWT.io](https://jwt.io/introduction) website. Importantly, JWTs are **not stored in Tinybird** . They are created by you, inside your application, and signed with a shared secret between your application and Tinybird. Tinybird validates the signature of the JWT, using the shared secret, to ensure it's authentic. ### When to use JWTs¶ The primary purpose for JWTs is to allow your application to call Tinybird API Endpoints from the frontend without proxying via your backend. If you are building an application where a frontend component needs data from a Tinybird API Endpoint, you can use JWTs to authorize the request directly from the frontend. The typical pattern looks like this: 1. A user starts a session in your application. 2. The frontend requests a JWT from your backend. 3. Your backend generates a new JWT, signed with the Tinybird shared secret, and returns to the frontend. 4. The frontend uses the JWT to call the Tinybird API Endpoints directly. 
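As a sketch of step 4, the frontend passes the JWT in the same way as any other Token when calling the API Endpoint, for example as a `token` query parameter. The host and the `requests_per_day` Pipe name below are placeholders taken from the payload examples that follow:
##### Call an API Endpoint with a JWT
curl "https://api.tinybird.co/v0/pipes/requests_per_day.json?token=$TINYBIRD_JWT"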
### JWT payload¶ The payload of a JWT is a JSON object that contains the following fields: | Key | Example Value | Required | Description | | --- | --- | --- | --- | | workspace_id | workspaces_id | Yes | The UUID of your Tinybird Workspace, found in the Workspace list (locate the Workspace name in the nav bar > use `Copy ID` button), the Settings section (the three dots), or by using `tb workspace current` from the CLI. | | name | frontend_jwt | Yes | Used to identify the token in the `tinybird.pipe_stats_rt` table, useful for analytics, does not need to be unique. | | exp | 123123123123 | Yes | The Unix timestamp (UTC) showing the expiry date & time. Once a token has expired, Tinybird returns a 403 HTTP status code. | | scopes | [{"type": "PIPES:READ", "resource": "requests_per_day", "fixed_params": {"org_id": "testing"}}] | Yes | Used to pass data to Tinybird, including the Tinybird scope, resources and fixed parameters. | | scopes.type | PIPES:READ | Yes | The type of scope, e.g., READ. See[ JWT scopes](https://www.tinybird.co/docs/about:blank#jwt-scopes) for supported scopes. | | scopes.resource | t_b9427fe2bcd543d1a8923d18c094e8c1 or top_airlines | Yes | The ID or name of the Pipe that the scope applies to, i.e., which API Endpoint the token can access. | | scopes.fixed_params | {"org_id": "testing"} | No | Pass arbitrary fixed values to the API Endpoint. These values can be accessed by Pipe templates to supply dynamic values at query time. | | limits | {"rps": 10} | No | You can limit the number of requests per second (rps) the JWT can perform. Learn more in the[ JWT rate limit section](https://www.tinybird.co/docs/about:blank#rate-limits-for-jwt-tokens) . | Check out the [JWT example](https://www.tinybird.co/docs/about:blank#jwt-example) below to see what a complete payload looks like. ### JWT algorithm¶ Tinybird always uses HS256 as the algorithm for JWTs and does not read the `alg` field in the JWT header. You can skip the `alg` field in the header. ### JWT scopes¶ | Value | Description | | --- | --- | | `PIPES:READ:pipe_name` | Gives your Token read permissions for the specified Pipe | ### JWT expiration¶ JWTs can have an expiration time that gives each Token a finite lifespan. Setting the `exp` field in the JWT payload is mandatory, and not setting it will result in a 403 HTTP status code from Tinybird when requesting the API Endpoint. Tinybird validates that a JWT has not expired before allowing access to the API Endpoint. If a Token has expired, Tinybird returns a 403 HTTP status code. ### JWT fixed parameters¶ Fixed parameters allow you to pass arbitrary values to the API Endpoint. These values can be accessed by Pipe templates to supply dynamic values at query time. For example, consider the following fixed parameter: ##### Example fixed parameters { "fixed_params": { "org_id": "testing" } } This passes a parameter called `org_id` with the value `testing` to the API Endpoint. You can then use this value in your SQL queries: ##### Example SQL query SELECT fieldA, fieldB FROM my_pipe WHERE org_id = '{{ String(org_id) }}' This is particularly useful when you want to pass dynamic values to an API Endpoint that are set by your backend and must be safe from user tampering. A good example is multi-tenant applications that require row-level security, where you need to filter data based on a user or tenant ID. The value `org_id` will always be the one specified in the `fixed_params` . 
Even if you specify a new value in the URL when requesting the endpoint, Tinybird will always use the one specified in the JWT. You can use JWT fixed parameters in combination with Pipe [dynamic parameters](https://www.tinybird.co/docs/docs/query/query-parameters). ### JWT example¶ For example, take the following payload with all [required and optional fields](https://www.tinybird.co/docs/about:blank#jwt-payload): ##### Example payload { "workspace_id": "workspaces_id", "name": "frontend_jwt", "exp": 123123123123, "scopes": [ { "type": "PIPES:READ", "resource": "requests_per_day", "fixed_params": { "org_id": "testing" } } ], "limits": { "rps": 10 } } Then use the Admin Token from your Workspace to sign the payload, for example: ##### Example Workspace Admin Token p.eyJ1IjogIjA1ZDhiYmI0LTdlYjctNDAzZS05NGEyLWM0MzFhNDBkMWFjZSIsICJpZCI6ICI3NzUxMDUzMC0xZjE4LTRkNzMtOTNmNS0zM2MxM2NjMDUxNTUiLCAiaG9zdCI6ICJldV9zaGFyZWQifQ.Xzh4Qjz0FMRDXFuFIWPI-3DWEC6y-RFBfm_wE3_Qp2M With the payload and Admin Token above, the signed JWT payload would look like this: ##### Example JWT eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJ3b3Jrc3BhY2VfaWQiOiIzMTA0OGI3Ni01MmU4LTQ5N2ItOTBhNC0wYzZhNTUxMzkyMGQiLCJuYW1lIjoiZnJvbnRlbmRfand0IiwiZXhwIjoxMjMxMjMxMjMxMjMsInNjb3BlcyI6W3sidHlwZSI6IlBJUEVTOlJFQUQiLCJyZXNvdXJjZSI6ImVhNDdmZDlkLWJjNDgtNDIwZC1hNmY2LTk1NDgxZmJiM2Y3YyIsImZpeGVkX3BhcmFtcyI6eyJvcmdfaWQiOiJ0ZXN0aW5nIn19XSwiaWF0IjoxNzE3MDYzNzQwfQ.t-9BRLI6MrhOAuvt1mBSTBTU7TOdJFunBjr78TuqpVg You can use the [JWT.io debugger](https://jwt.io/#debugger-io) to verify the above example. ### JWT limitations¶ - You cannot refresh JWTs individually from inside Tinybird as they are not stored in Tinybird. You must do this from your application, or you can globally invalidate all JWTs by refreshing your Admin Token. - If you refresh your Admin Token, all the tokens will be invalidated. - If your token is expired or invalidated, you'll get a 403 HTTP status code from Tinybird when requesting the API Endpoint. ### Create a JWT¶ #### Create a JWT in production¶ There is wide support for creating JWTs in many programming languages and frameworks. Any library that supports JWTs should work with Tinybird. A common library to use with Python is [PyJWT](https://github.com/jpadilla/pyjwt/tree/master) . Common libraries for JavaScript are [jsonwebtoken](https://github.com/auth0/node-jsonwebtoken#readme) and [jose](https://github.com/panva/jose). - JavaScript (Next.js) - Python ##### Create a JWT in Python using pyjwt import jwt import datetime import os TINYBIRD_SIGNING_KEY = os.getenv('TINYBIRD_SIGNING_KEY') def generate_jwt(): expiration_time = datetime.datetime.utcnow() + datetime.timedelta(hours=3) payload = { "workspace_id": "workspaces_id", "name": "frontend_jwt", "exp": expiration_time, "scopes": [ { "type": "PIPES:READ", "resource": "requests_per_day", "fixed_params": { "org_id": "testing" } } ] } return jwt.encode(payload, TINYBIRD_SIGNING_KEY, algorithm='HS256') #### Create a JWT using the CLI or the API¶ If for any reason you don't want to generate a JWT on your own, Tinybird provides an API and a CLI utility to create JWTs. - API - CLI ##### Create a JWT with the Tinybird CLI tb token create jwt my_jwt --ttl 1h --scope PIPES:READ --resource my_pipe --filters "column_name=value" ### Error handling¶ There are many reasons why a request might return a `403` status code. When a `403` is received, check the following: 1. Confirm the JWT is valid and has not expired (the expiration time is in the `exp` field in the JWT's payload). 2. 
The generated JWTs can only read Tinybird API Endpoints. Confirm you're not trying to use the JWT to access other APIs. 3. Confirm the JWT has a scope to read the endpoint you are trying to read. Check the payload of the JWT at[ https://jwt.io/](https://jwt.io/) . 4. If you generated the JWT outside of Tinybird (without using the API or the CLI), make sure you are using the** Workspace** `admin token` , not your personal one. ### Rate limits for JWTs¶ Check the [limits page](https://www.tinybird.co/docs/docs/support/limits) for limits on ingestion, queries, API Endpoints, and more. When you specify a `limits.rps` field in the payload of the JWT, Tinybird will use the name specified in the payload of the JWT to track the number of requests being done. If the number of requests goes above the limit, Tinybird starts rejecting new requests and returns an "HTTP 429 Too Many Requests" error. Check the [limits docs](https://www.tinybird.co/docs/docs/support/limits) for more information. The following example shows the tracking of all requests done by `frontend_jwt` . Once you reach 10 requests per second, Tinybird would start rejecting requests: ##### Example payload with global rate limit { "workspace_id": "workspaces_id", "name": "frontend_jwt", "exp": 123123123123, "scopes": [ { "type": "PIPES:READ", "resource": "requests_per_day", "fixed_params": { "org_id": "testing" } } ], "limits": { "rps": 10 } } If `rps <= 0` , Tinybird ignores the limit and assumes there is no limit. As the `name` field does not have to be unique, all the tokens generated using the `name=frontend_jwt` would be under the same umbrella. This can be very useful if you want to have a global limit in one of your apps or components. If you want to just limit for each specific user, you could generate a JWT using the following payload. In this case, you would specify a unique name so the limits only apply to each user: ##### Example of a payload with isolated rate limit { "workspace_id": "workspaces_id", "name": "frontend_jwt_user_", "exp": 123123123123, "scopes": [ { "type": "PIPES:READ", "resource": "requests_per_day", "fixed_params": { "org_id": "testing" } } ], "limits": { "rps": 10 } } ## Next steps¶ - Follow a walkthrough guide:[ How to consume APIs in a Next.js frontend with JWTs](https://www.tinybird.co/docs/docs/guides/integrations/consume-apis-nextjs) . - Read about the Tinybird[ Tokens API](https://www.tinybird.co/docs/docs/api-reference/token-api) . - Understand[ Branches](https://www.tinybird.co/docs/docs/concepts/branches) in Tinybird. --- URL: https://www.tinybird.co/docs/concepts/branches Last update: 2024-11-12T11:45:41.000Z Content: --- title: "Branches · Tinybird Docs" theme-color: "#171612" description: "Branches allow you to create an isolated copy of your Tinybird project, make changes and develop new features, then merge those changes back into the Workspace." --- # Branches¶ ## What is a Branch?¶ Branches are isolated, temporary copies of your Tinybird project and the resources it contains (including [Data Sources](https://www.tinybird.co/docs/docs/concepts/data-sources), [Pipes](https://www.tinybird.co/docs/docs/concepts/pipes), [API Endpoints](https://www.tinybird.co/docs/docs/publish/api-endpoints/overview) ). They are inspired by Git branches, allowing you to make changes and develop new features, then merge those changes back into the Workspace. 
Branches are only available in Tinybird Workspaces [connected to a remote Git provider](https://www.tinybird.co/docs/docs/production/working-with-version-control) (for example, GitHub). Tinybird represents branches using the icon. ## What should I use Branches for?¶ Use Branches to develop new features without affecting production or other developers. Tinybird Branches mirror the intent behind Git branches, and should be used for the same reasons. Branches are intended to be temporary, rather than long lived, so they are ideal for building new features and prototyping. Branches should be removed when their work is complete (either merged or deleted). Common uses for Branches include: - Designing new Data Sources or Pipes. - Iterating the schema of a Data Source. - Testing the backfill strategy of a new Data Source. - Changing the output of a Pipe. - Modifying API Endpoint query parameters. Branches are temporary. If you need a long-lived non-production environment, take a look at the docs on [staging and production Workspaces](https://www.tinybird.co/docs/docs/production/staging-and-production-workspaces). ## Data in Branches¶ Creating a Branch copies all the **resources** from the Workspace to the new Branch, but it does not copy all the **production data**. When creating a Branch, you can choose to copy a segment of data from production into the new Branch. This will duplicate the latest active partition (up to a maximum of 50GB) of data from all production Data Sources, and attach the copies to the new Branch Data Sources. This data is a copy (or clone), and is physically separate from the production data. This means you can ingest or drop data from the Branch and it will not affect production data. If you choose not to copy data, the new Branch will contain no data, and you will need to ingest data into the Branch Data Sources. When a Branch is deleted, the Branch data is also deleted. There is a soft limit on the number of Branches you can create per Workspace (3). Contact us at [support@tinybird.co](https://www.tinybird.co/docs/mailto:support@tinybird.co) if you need to increase this limit. ## Branch users¶ When a new Branch is created, all Workspace users are added to the new Branch with their current roles. Since the new Branch is completely independent, user management is also independent. You can update the roles of users in the new Branch without affecting the main Workspace. Users with the Viewer role are restricted from making changes in the Workspace, but not in Branches. These users will have the same permissions in a Branch as standard users. ## Branch Tokens¶ When a new Branch is created, all [Tokens](https://www.tinybird.co/docs/docs/concepts/auth-tokens) from the Workspace are mirrored in the new Branch. These are **new** Tokens that are completely independent of the production Tokens, but with the same names and scopes. This means you can delete or update the scopes of the Tokens in the new Branch without affecting production. When you are on a branch in the CLI using the command `tb branch use BRANCH_NAME_OR_ID` the admin token of the branch is on the [.tinyb file](https://www.tinybird.co/docs/docs/cli/install#authentication) . If you need this token (usually on CI/CD processes) you can extract it from this auth file. ## Create a Branch¶ A Branch can be created via the Tinybird UI, Tinybird CLI, or automatically as part of your CI/CD process. 
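For CI/CD scripts that need the Branch admin token described in the Branch Tokens section above, the following is a minimal sketch. It assumes the `.tinyb` auth file is JSON with a `token` field; check the contents of your own file, since its exact layout isn't covered here.

##### Read the Branch admin token from the .tinyb file (sketch)

import json

# After `tb branch use BRANCH_NAME_OR_ID`, the CLI stores the Branch
# credentials in the .tinyb auth file (assumed here to be JSON).
with open(".tinyb") as auth_file:
    auth = json.load(auth_file)

branch_admin_token = auth["token"]  # assumed field name
print(branch_admin_token)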
You can only create Branches in Workspaces [connected to a remote Git provider](https://www.tinybird.co/docs/docs/production/working-with-version-control) (for example, GitHub). ### Using the Tinybird UI¶ To create a new Branch with the Tinybird UI, click the `Branch` dropdown in the top bar, and click on the `Branch` tab. Click the `Create Branch` button. ### Using the Tinybird CLI¶ To create a new Branch with the Tinybird CLI, use the `tb branch create` command. See the [tb branch Command Reference](https://www.tinybird.co/docs/docs/cli/command-ref#tb-branch) for more information. ### Using a Pull Request¶ To create a new Branch with a Pull Request (PR), push your local feature branch to your Git provider and open a new PR. The CI/CD actions should sync the branch from your PR to Tinybird. ## Delete a Branch¶ A Branch can be deleted via the Tinybird UI, Tinybird CLI, or automatically as part of your CI/CD process. You can only delete Branches in Workspaces [connected to a remote Git provider](https://www.tinybird.co/docs/docs/production/working-with-version-control) (for example, GitHub). ### Using the Tinybird UI¶ To delete a Branch with the Tinybird UI, click the Branch dropdown in the top bar, and click on the Branch tab. Find your Branch in the list, and click the trash button to the right. ### Using the Tinybird CLI¶ To delete a Branch with the Tinybird CLI, use the `tb branch rm` command. See the [tb branch Command Reference](https://www.tinybird.co/docs/docs/cli/command-ref#tb-branch) for more information. ### Using a Pull Request¶ To delete a Branch from a Pull Request (PR), close or merge the associated PR in your Git provider. The automated CI/CD actions delete the Branch from Tinybird. ## Limitations¶ **JOIN Engine Data Sources** Data Sources using the JOIN Engine are not fully supported in Branches. You can create a Branch with JOIN Engine Data Sources, but the data will not be copied into the new Branch. The JOIN Engine has been deprecated and will be fully removed in a future release. You should not create new JOIN Engine Data Sources. If you need support, contact Tinybird at [support@tinybird.co](https://www.tinybird.co/docs/mailto:support@tinybird.co) or in the [Community Slack](https://www.tinybird.co/docs/docs/community). --- URL: https://www.tinybird.co/docs/concepts/data-sources Last update: 2024-11-15T16:02:17.000Z Content: --- title: "Data Sources · Tinybird Docs" theme-color: "#171612" description: "Data Sources contain all the data you bring into Tinybird, acting like tables in a database." --- # Data Sources¶ When you get data into Tinybird, it's stored in a Data Source. You then write SQL queries to explore the data from the Data Source. Tinybird represents Data Sources using the icon. For example, if your event data lives in a Kafka topic, you can create a Data Source that connects directly to Kafka and writes the events to Tinybird. You can then [create a Pipe](https://www.tinybird.co/docs/docs/concepts/pipes#creating-pipes-in-the-ui) to query fresh event data. A Data Source can also be the result of materializing a SQL query through a [Pipe](https://www.tinybird.co/docs/docs/concepts/pipes#creating-pipes-in-the-ui). ## Create Data Sources¶ You can use Tinybird's UI, CLI, and API to create Data Sources. ### Using the UI¶ Follow these steps to create a new Data Source: 1. In your Workspace, go to** Data Sources** . 2. Select** +** to add a new Data Source. ### Using the CLI¶ You can create Data Source using the `tb datasource` command. 
See [tb datasource](https://www.tinybird.co/docs/docs/cli/command-ref#tb-datasource) in the CLI reference. ## Set the Data Source TTL¶ You can apply a TTL (Time To Live) to a Data Source in Tinybird. Use a TTL to define how long you want to store data. For example, you can define a TTL of 7 Days, which means that any data older than 7 Days should be deleted. Data older than the defined TTL is deleted automatically. You must define the TTL at the time of creating the Data Source and your data must have a column with a type that represents a date. Valid types are any of the `Date` or `Int` types. ### Using the UI¶ If you are using the Tinybird Events API and want to use a TTL, create the Data Source with a TTL first before sending data. Follow these steps to set a TTL using the Tinybird UI: 1. Select** Advanced Settings** . 2. Open the** TTL** menu. 3. Select a column that represents a date. 4. Define the TTL period in days. If you need to apply transformations to the date column, or want to use more complex logic, select the **Code editor** tab and enter SQL code to define your TTL. ### Using the CLI¶ Follow these steps to set a TTL using the Tinybird CLI: 1. Create a new Data Source and .datasource file using the `tb datasource` command. 2. Edit the .datasource file you've created. 3. Go to the Engine settings. 4. Add a new setting called `ENGINE_TTL` and enter your TTL string enclosed in double quotes. 5. Save the file. The following example shows a .datasource file with TTL defined: SCHEMA > `date` DateTime, `product_id` String, `user_id` Int64, `event` String, `extra_data` String ENGINE "MergeTree" ENGINE_PARTITION_KEY "toYear(date)" ENGINE_SORTING_KEY "date, user_id, event, extra_data" ENGINE_TTL "date + toIntervalDay(90)" ## Change Data Source TTL¶ You can modify the TTL of an existing Data Source, either by adding a new TTL or by updating an existing TTL. ### Using in the UI¶ Follow these steps to modify a TTL using the Tinybird UI: 1. Go to the Data Source details page by clicking on the Data Source with the TTL you wish to change. 2. Select the** Schema** tab. 3. Select the TTL text. 4. A dialog opens. Select the menu. 5. Select the field to use for the TTL. 6. Change the TTL interval. 7. Select** Save** . The updated TTL value appears in the Data Source's schema page. ### Using the CLI¶ Follow these steps to modify a TTL using the Tinybird CLI: 1. Open the .datasource file. 2. Go to the Engine settings. 3. If `ENGINE_TTL` doesn't exist, add it and enter your TTL enclosed in double quotes. 4. If a TTL is already defined, modify the existing setting. The following is an example TTL setting: ENGINE_TTL "date + toIntervalDay(90)" When ready, save the .datasource file and push the changes to Tinybird using the CLI: tb push DATA_SOURCE_FILE -f ## Supported engines and settings¶ Don't change the following settings unless you are familiar with ClickHouse® and understand their impact. If you're unsure, contact Tinybird at [support@tinybird.co](https://www.tinybird.co/docs/mailto:support@tinybird.co) or in the [Community Slack](https://www.tinybird.co/docs/docs/community). Tinybird uses ClickHouse® as the underlying storage engine. ClickHouse features different strategies to store data, which define where and how the data is stored and also what kind of data access, queries, and availability your data has. In ClickHouse terms, a Tinybird Data Source uses a [Table Engine](https://clickhouse.tech/docs/en/engines/table_engines/) that determines those factors. 
With Tinybird you can select the Table Engine for your Data Source. Tinybird supports the following engines: - `MergeTree` - `ReplacingMergeTree` - `SummingMergeTree` - `AggregatingMergeTree` - `CollapsingMergeTree` - `VersionedCollapsingMergeTree` - `Null` If you need to use any other Table Engine, contact Tinybird at [support@tinybird.co](https://www.tinybird.co/docs/mailto:support@tinybird.co) or in the [Community Slack](https://www.tinybird.co/docs/docs/community). You can use the `engine` parameter in the [Data Sources API](https://www.tinybird.co/docs/docs/api-reference/datasource-api) to specify the name of any of the available engines, for example `engine=ReplacingMergeTree` . To set the engine parameters and the engine options, use as many `engine_*` request parameters as needed. You can also configure settings in .datasource files. The supported parameters for each engine are: | ENGINE | SIGNATURE | PARAMETER | DESCRIPTION | | --- | --- | --- | --- | | [ ReplacingMergeTree](https://clickhouse.tech/docs/en/engines/table-engines/mergetree-family/replacingmergetree/#creating-a-table) | `(ver, is_deleted)` | `engine_ver` | Optional. The column with the version. If not set, the last row is kept during a merge. If set, the maximum version is kept. | | | | `engine_is_deleted` | Active only when `ver` is used. The name of the column used to determine whether the row is to be deleted; `1` is a `deleted` row, `0` is a `state` row. | | [ SummingMergeTree](https://clickhouse.tech/docs/en/engines/table-engines/mergetree-family/summingmergetree/#creating-a-table) | `([columns])` | `engine_columns` | Optional. The names of columns where values are summarized. | | [ CollapsingMergeTree](https://clickhouse.tech/docs/en/engines/table-engines/mergetree-family/collapsingmergetree/#creating-a-table) | `(sign)` | `engine_sign` | Name of the column for computing the state. | | [ VersionedCollapsingMergeTree](https://clickhouse.tech/docs/en/engines/table-engines/mergetree-family/versionedcollapsingmergetree/#creating-a-table) | `(sign, version)` | `engine_sign` | Name of the column for computing the state. | | | | `engine_version` | Name of the column with the version of the object state. | The engine options, in particular the MergeTree engine options, match ClickHouse terminology: `engine_partition_key`, `engine_sorting_key`, `engine_primary_key`, `engine_sampling_key`, `engine_ttl` and `engine_settings`. If `engine_partition_key` is empty or not passed as a parameter, the underlying Data Source doesn't have any partition unless there's a `Date` column. In that case the Data Source is partitioned by year. If you want to create a Data Source with no partitions, send `engine_partition_key=tuple()`. `engine_settings` allows for fine-grained control over the parameters of the underlying Table Engine. In general, we do not recommend changing these settings unless you are absolutely sure about their impact. The supported engine settings are: - `index_granularity` - `merge_with_ttl_timeout` - `ttl_only_drop_parts` - `min_bytes_for_wide_part` - `min_rows_for_wide_part` See the [Data Sources API](https://www.tinybird.co/docs/docs/api-reference/datasource-api) for examples of creating Data Sources with custom engine settings using the Tinybird REST API. 
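As an illustrative sketch of passing these `engine_*` parameters through the Data Sources API (the Data Source name, schema, and `TINYBIRD_TOKEN` environment variable below are placeholders):

##### Create a Data Source with engine parameters (sketch)

import os
import requests

TOKEN = os.environ["TINYBIRD_TOKEN"]  # token with permission to create Data Sources

# Hypothetical deduplicating Data Source using ReplacingMergeTree.
response = requests.post(
    "https://api.tinybird.co/v0/datasources",
    headers={"Authorization": f"Bearer {TOKEN}"},
    data={
        "mode": "create",
        "name": "events_dedup",
        "schema": "id UInt64, updated_at DateTime, value String",
        "engine": "ReplacingMergeTree",
        "engine_sorting_key": "id",
        "engine_ver": "updated_at",
    },
)
print(response.status_code, response.json())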
## Supported data types¶ The supported [ClickHouse data types](https://clickhouse.com/docs/en/sql-reference/data-types/) are: - `Int8` , `Int16` , `Int32` , `Int64` , `Int128` , `Int256` - `UInt8` , `UInt16` , `UInt32` , `UInt64` , `UInt128` , `UInt256` - `Float32` , `Float64` - `Decimal` , `Decimal(P, S)` , `Decimal32(S)` , `Decimal64(S)` , `Decimal128(S)` , `Decimal256(S)` - `String` - `FixedString(N)` - `UUID` - `Date` , `Date32` - `DateTime([TZ])` , `DateTime64(P, [TZ])` - `Bool` - `Array(T)` - `Map(K, V)` - `Tuple(K, V)` - `SimpleAggregateFunction` , `AggregateFunction` - `LowCardinality` - `Nullable` - `Nothing` If you are ingesting using the NDJSON format and would like to store `Decimal` values containing 15 or more digits, send the values as strings instead of numbers to avoid precision issues. In the following example, the first value has a high chance of losing accuracy during ingestion, while the second one is stored correctly: {"decimal_value": 1234567890.123456789} # Last digits might change during ingestion {"decimal_value": "1234567890.123456789"} # Will be stored correctly ### Set a different codec¶ Tinybird applies compression codecs to data types to optimize performance. You can override the default compression codecs by adding the `CODEC()` statement after the type declarations in your .datasource schema. For example: SCHEMA > `product_id` Int32 `json:$.product_id`, `timestamp` DateTime64(3) `json:$.timestamp` CODEC(DoubleDelta, ZSTD(1)), For a list of available codecs, see [Compression](https://clickhouse.com/docs/en/data-compression/compression-in-clickhouse#choosing-the-right-column-compression-codec) in the ClickHouse documentation. ## Supported file types and compression formats for ingest¶ The following file types and compression formats are supported at ingest time: | File type | Method | Accepted extensions | Compression formats supported | | --- | --- | --- | --- | | CSV | File upload, URL | `.csv` , `.csv.gz` | `gzip` | | NDJSON | File upload, URL, Events API | `.ndjson` , `.ndjson.gz` | `gzip` | | Parquet | File upload, URL | `.parquet` , `.parquet.gz` | `gzip` | | Avro | Kafka | | `gzip` | ## Quarantine Data Sources¶ Every Data Source you create in your Workspace has an associated quarantine Data Source that stores data that doesn't fit the schema. If you send rows that don't fit the Data Source schema, they're automatically sent to the quarantine table so that the ingest process doesn't fail. By convention, a quarantine Data Source is named `{datasource_name}_quarantine` . You can review quarantined rows at any time or perform operations on them using Pipes. This is a useful source of information when fixing issues in the origin source or applying changes during ingest. The quarantine Data Source schema contains the columns of the original row and the following columns with information about the issues that caused the quarantine: - `c__error_column` Array(String) contains an array of all the columns that contain an invalid value. - `c__error` Array(String) contains an array of all the errors that caused the ingestion to fail and led to the values being stored in quarantine. Together with `c__error_column` , this column makes it easy to identify which columns have problems and what the errors are. - `c__import_id` Nullable(String) contains the job's identifier if the row was imported through a job. - `insertion_date` (DateTime) contains the timestamp at which the ingestion happened.
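For a quick programmatic look at quarantined rows, here's a sketch that assumes a Data Source named `events` and uses Tinybird's Query API at `/v0/sql`; reviewing the rows in the UI or in a Pipe works just as well:

##### Inspect quarantined rows (sketch)

import os
import requests

TOKEN = os.environ["TINYBIRD_TOKEN"]  # token with read access to the Data Source

# By convention, the quarantine Data Source is named <datasource_name>_quarantine.
query = (
    "SELECT insertion_date, c__error_column, c__error "
    "FROM events_quarantine ORDER BY insertion_date DESC LIMIT 10 FORMAT JSON"
)

response = requests.get(
    "https://api.tinybird.co/v0/sql",
    headers={"Authorization": f"Bearer {TOKEN}"},
    params={"q": query},
)
print(response.json())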
See the [Quarantine guide](https://www.tinybird.co/docs/docs/guides/ingesting-data/recover-from-quarantine) for practical examples on using the quarantine Data Source. ## Partitioning¶ Use partitions for data manipulation. Partitioning isn't intended to speed up `SELECT` queries: experiment with more efficient sorting keys ( `ENGINE_SORTING_KEY` ) for that. A bad partition key, or creating too many partitions, can negatively impact query performance. Partitioning is configured using the `ENGINE_PARTITION_KEY` setting. When choosing a partition key: - Leave the `ENGINE_PARTITION_KEY` key empty. If the table is small or you aren't sure what the best partition key should be, leave it empty: the data is placed in a single partition. - Use a date column. Depending on the filter, you can opt for more or less granularity based on your needs. `toYYYYMM(date_column)` or `toYear(date_column)` are valid default choices. If you have questions about choosing a partition key, contact Tinybird at [support@tinybird.co](https://www.tinybird.co/docs/mailto:support@tinybird.co) or in the [Community Slack](https://www.tinybird.co/docs/docs/community). ### Examples¶ The following examples show how to define partitions. ##### Using an empty tuple to create a single partition ENGINE_PARTITION_KEY "tuple()" ##### Using a Date column to create monthly partitions ENGINE_PARTITION_KEY "toYYYYMM(date_column)" ##### Using a column to partition by event types ENGINE_PARTITION_KEY "event_type % 8" ## Upserts and deletes¶ See [this guide](https://www.tinybird.co/docs/docs/guides/ingesting-data/replace-and-delete-data) . Depending on the frequency needed, you might want to convert upserts and deletes into an append operation that you can solve through [deduplication](https://www.tinybird.co/docs/docs/guides/querying-data/deduplication-strategies). ## Limits¶ There is a limit of 100 Data Sources per Workspace. Tinybird is not affiliated with, associated with, or sponsored by ClickHouse, Inc. ClickHouse® is a registered trademark of ClickHouse, Inc. --- URL: https://www.tinybird.co/docs/concepts/pipes Last update: 2024-11-08T11:23:54.000Z Content: --- title: "Pipes · Tinybird Docs" theme-color: "#171612" description: "Pipes help you to bring your SQL queries together to achieve a purpose, like publishing an API Endpoint. Learn all about Pipes here!" --- # Pipes¶ ## What is a Pipe?¶ A Pipe is a collection of one or more SQL queries (each query is called a [Node](https://www.tinybird.co/docs/about:blank#nodes) ). Tinybird represents Pipes using the icon. ## What should I use Pipes for?¶ Use Pipes to build features over your data. Write SQL that joins, aggregates, or otherwise transforms your data and publish the result. You have three options to publish the result of a Pipe: [API Endpoints](https://www.tinybird.co/docs/docs/publish/api-endpoints/overview), [Materialized Views](https://www.tinybird.co/docs/docs/publish/materialized-views/overview) , and [Copy Pipes](https://www.tinybird.co/docs/docs/publish/copy-pipes). A Pipe can only have a single output at one time. This means that you cannot create a Materialized View and an API Endpoint from the same Pipe, at the same time. Once your Pipe is published as an API Endpoint, you can create [Tinybird Charts](https://www.tinybird.co/docs/docs/publish/charts) : Interactive, customizable charts of your data. ## Creating Pipes¶ You can create as many Pipes in your Workspace as needed. 
Top tip: press `⌘+K` or `CTRL+K` at any time in the Tinybird UI to open the Command Bar and view all your Workspace resources. ### Creating Pipes in the UI¶ To add a Pipe, click the Plus (+) icon in the left side navigation bar next to the Pipes section (see Mark 1 below). <-figure-> ![image](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fconcepts-pipes-creating-a-pipe-ui-1.png&w=3840&q=75) At the top of your new Pipe, you can change the name & description. **Click on the name or description** to start entering text (see Mark 1 below). <-figure-> ![image](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fconcepts-pipes-creating-a-pipe-ui-2.png&w=3840&q=75) In time, you might end up with a lot of Pipes. Tinybird doesn't yet offer a way to organize your Pipes into folders, but a quick and easy alternative is to group your Pipe names by name - like `mktg-` so all your marketing Pipes are together. Pipes are ordered alphabetically, and must always start with a letter (but you can use numbers, dots, and underscores in the rest of the name). ### Creating Pipes in the CLI¶ #### tb pipe generate¶ You can use `tb pipe generate` to generate a .pipe file. You must provide a name for the Pipe & a single SQL statement. This command will generate the necessary syntax for a single-Node Pipe inside the file. You can open the file in any text editor to change the name, description, query and add more Nodes. Defining your Pipes in files allows you to version control your Tinybird resources with git. The SQL statement must be wrapped in quotes or the command will fail. tb pipe generate my_pipe_name "SELECT 1" Generating the .pipe file does **not** create the Pipe in Tinybird. When you are finished editing the file, you must push the Pipe to Tinybird. tb push my_pipe_name.pipe If you list your Pipes, you'll see that your Pipe exists in Tinybird. tb pipe ls ** Pipes: -------------------------------------------------------------------- | version | name | published date | nodes | -------------------------------------------------------------------- | | my_pipe_name | 2022-11-30 21:44:55 | 1 | -------------------------------------------------------------------- ## Format descriptions using Markdown¶ It is possible to use Markdown syntax in Pipe description fields so you can add richer formatting. Here's an example using headings, bold, and external links: ## This is my first Pipe! You can use **Markdown** in descriptions. Add [links](https://www.tinybird.co) to other bits of info! This will be rendered in the UI like this: <-figure-> ![image](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fconcepts-pipes-markdown-in-pipes-desc-1.png&w=3840&q=75) ## Nodes¶ ### What is a Node?¶ A Node is a container for a single SQL `SELECT` statement. Nodes live within Pipes, and you can have many sequential Nodes inside the same Pipe. A query in a Node can read data from a [Data Source](https://www.tinybird.co/docs/docs/concepts/data-sources) , other Nodes inside the same Pipe, or from [API Endpoint](https://www.tinybird.co/docs/docs/publish/api-endpoints/overview) Nodes in other Pipes. ### What should I use Nodes for?¶ Nodes allow you to break your query logic down into multiple smaller queries. You can then chain Nodes together to build the logic incrementally. Each Node can be developed & tested individually. This makes it much easier to build complex query logic in Tinybird as you avoid creating large monolithic queries with many sub-queries. 
--- URL: https://www.tinybird.co/docs/concepts/workspaces Last update: 2024-11-08T11:23:54.000Z Content: --- title: "Workspaces · Tinybird Docs" theme-color: "#171612" description: "Workspaces are containers for all of your Tinybird resources. Learn all about Workspaces here!" --- # Workspaces¶ ## What is a Workspace?¶ A Workspace is an area that contains a set of Tinybird resources, including Data Sources, Pipes, Nodes, API Endpoints, and Tokens. Tinybird represents Workspaces using the icon. ## What should I use Workspaces for?¶ Workspaces allow you to separate different projects, use cases, and dev/staging/production environments for the things you work on in Tinybird. You can choose exactly how you organize your Workspaces, but the two common ways are a Workspace per project, or per team. ## Create a Workspace¶ ### Create a Workspace in the UI¶ To create a new Workspace, select the name of the existing Workspace. In the menu, select **Create Workspace (+)**. Complete the dialog with the details of your new Workspace, and select "Create Workspace". Workspaces must have unique names within a region. ### Create a Workspace in the CLI¶ To create a new Workspace from the CLI, you can use the following command: tb workspace create You can use this command interactively or by providing the required inputs with flags. To use it interactively, run the command without any flags. For example, `tb workspace create` ). You will see several prompts that you need to complete. Supply [your user Token](https://www.tinybird.co/docs/docs/concepts/auth-tokens#your-user-token) by pasting it into the prompt. In order to create a new workspace we need your user token. Copy it from https://app.tinybird.co/tokens and paste it here: You can select whether you want to use a Starter Kit for your new Workspace. Using the `blank` option gives you an empty Workspace. ------------------------------------------------------------------------------------ | Idx | Id | Description | ------------------------------------------------------------------------------------ | 1 | blank | Empty Workspace | | 2 | web-analytics | Starting Workspace ready to track visits and custom events | ------------------------------------------------------------------------------------ [0] to cancel Use starter kit [1]: 1 Next, supply a name for your Workspace. Workspaces must have unique names within a region. Workspace name [new_workspace_9479]: internal_creating_new_workspaces_example A successful creation will give you the following output: ** Workspace 'internal_creating_new_workspaces_example' has been created If you are using the CLI in an automated system, you probably don't want to use the interactive command. You can instead provide these options as flags: tb workspace create --user_token --starter-kit 1 internal_creating_new_workspaces_example ## Delete a Workspace¶ Deleting a Workspace deletes all resources within the Workspace, including Data Sources, any ingested data, Pipes and published APIs. Deleted Workspaces cannot be recovered. Be careful with this operation. ### Delete a Workspace in the UI¶ To delete a Workspace in the UI, select the Cog icon at top right of the UI. In the modal that appears, select the "Advanced Settings" tab and select "Delete Workspace". You will be required to type "delete workspace" to confirm. 
### Delete a Workspace in the CLI¶ To delete a Workspace in the CLI, you can use the following command: tb workspace delete You will need to provide the name of the Workspace and [your user Token](https://www.tinybird.co/docs/docs/concepts/auth-tokens#your-user-token) , for example: tb workspace delete my_workspace --user_token ## Manage Workspace members¶ Workspace users are referred to as "members". Their access to resources in Tinybird is managed at the Workspace level. You can invite as many members to a Workspace as you want, and a member can belong to multiple Workspaces. Member capabilities are controlled with roles. A user's role can be changed at any time. There are three roles in Tinybird: **Roles and capabilities** | Role | Manage resources | Manage users | Access to billing information | Create a branch | | --- | --- | --- | --- | --- | | `Admin` | Yes | Yes | Yes | Yes | | `Guest` | Yes | No | No | Yes | | `Viewer` | No | No | No | Yes | ### Manage Workspace members in the UI¶ In the top right corner of the Tinybird UI, select the Cog icon. In the modal, navigate to "Members" to review any members already part of your Workspace. Add a new member by entering their email address and confirming their role from the dropdown options. You can invite multiple users at the same time by adding multiple email addresses separated by a comma. The users you invite will get an email notifying them that they have been invited. If they don't already have a Tinybird account, they will be prompted to create one to accept your invite. Invited users appear in the user management modal and by default have the **Guest** role. If the user loses their invite link, you can resend it here too, or copy the link to your clipboard. You can also remove members from here using the "..." menu and selecting "Remove". ### Adding Workspace users in the CLI¶ To add new users, use the following command: tb workspace members add You will need to supply your [user Token](https://www.tinybird.co/docs/docs/concepts/auth-tokens#your-user-token) with the `--user_token` flag. Add the email address of the user you want to invite as the final argument to the command. tb workspace members add --user_token my_new_team_member@example.com A successful invite returns the following output: ** User my_new_team_member@example.com added to workspace 'internal_getting_started_guide' ## Share Data Sources between Workspaces¶ While it's useful to separate your work, sometimes you want to share resources between different teams or projects. For example, you might have a Data Source of events data that multiple teams want to use to build features with. Rather than duplicating this data, you can share it between multiple Workspaces. Let's say you have a Workspace called `Ingest` . In this Workspace, you have your `events` Data Source. You have two teams and each team has their own Workspace to work in, named `Team_A` and `Team_B` . Both teams want access to the `events` Data Source. From the `Ingest` Workspace you can share the `events` Data Source to both the `Team_A` and `Team_B` Workspaces, giving them access to the `events` data. This means that while both teams can access the data, the `Team_A` and `Team_B` Workspaces have no other connection with the `Ingest` Workspace. Modifications of the `Ingest` Data Sources can not be made from the other Workspaces. If a member of the `Team_A` or `Team_B` Workspaces needs to manage the `Ingest` Data Sources, they can be added as a member of the `Ingest` Workspace. 
You cannot share Data Sources to Workspaces in different regions. ### Sharing Data Sources in the UI¶ This section describes sharing a Data Source between your Workspaces. On the left side navigation, find the Data Source that you want to share. When you hover over the Data Source, you will see a 3-dot icon for the Data Source Actions menu, select the 3 dots (see Mark 1 below). <-figure-> ![image](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fconcepts-workspaces-sharing-ds-workspaces-1.png&w=3840&q=75) In the context menu that appears, select the **Share** action (see Mark 1 below). <-figure-> ![image](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fconcepts-workspaces-sharing-ds-workspaces-2.png&w=3840&q=75) This will open a dialog window. In the **Search** box, you can type the name of the Workspace that you want to share with (see Mark 1 below). As you type, the search box will suggest Workspaces that you are a member of. You can only share Data Sources with Workspaces that you are a member of. <-figure-> ![image](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fconcepts-workspaces-sharing-ds-workspaces-3.png&w=3840&q=75) You can select from the suggestions, or type the name and press Enter. When you have selected the Workspace, you will see the Workspace in the **Shared** section (see Mark 1 below). Select **Done** to close the dialog (see Mark 2 below). <-figure-> ![image](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fconcepts-workspaces-sharing-ds-workspaces-4.png&w=3840&q=75) ### Sharing Data Sources in the CLI¶ This section describes how to share Data Sources from one Workspace to another Workspace in the Tinybird CLI. To share a Data Source, you can use the following command: tb datasource share You will need to supply [your user Token](https://www.tinybird.co/docs/docs/concepts/auth-tokens#your-user-token) with the `--user_token` flag. You then need to supply the Data Source name and target Workspace name as the following arguments. tb datasource share --user_token shopping_data my_second_workspace The Data Source that you want to share must exist in the Workspace that your Tinybird CLI is authenticated against. To check which Workspace your CLI is currently authenticated with, you can use: tb auth info In addition, you could also do `tb push --user_token ` to push a Data Source with a `SHARED_WITH` parameter to share it with another Workspace. ## Regions¶ A Workspace belongs to one region. Tinybird has three public regions: `EU` and `US East` in Google Cloud Platform and `US East` in AWS. Additional regions for GCP, AWS and Azure are available for Enterprise customers. The following table lists the available regions and their corresponding API base URLs: **Current Tinybird regions** | Region | Provider | Provider region | API base URL | | --- | --- | --- | --- | | Europe | GCP | europe-west3 | [ https://api.tinybird.co](https://api.tinybird.co/) | | US East | GCP | us-east4 | [ https://api.us-east.tinybird.co](https://api.us-east.tinybird.co/) | | Europe | AWS | eu-central-1 | [ https://api.eu-central-1.aws.tinybird.co](https://api.eu-central-1.aws.tinybird.co/) | | US East | AWS | us-east-1 | [ https://api.us-east.aws.tinybird.co](https://api.us-east.aws.tinybird.co/) | | US West | AWS | us-west-2 | [ https://api.us-west-2.aws.tinybird.co](https://api.us-west-2.aws.tinybird.co/) | Tinybird documentation uses `https://api.tinybird.co` as the default example API base URL. 
If you are not using the Europe GCP region, you will need to replace this with the API base URL for your region. If you would like to request a new region, you can do so inside the Tinybird UI from the region selector, or contact Tinybird at [support@tinybird.co](https://www.tinybird.co/docs/mailto:support@tinybird.co) or in the [Community Slack](https://www.tinybird.co/docs/docs/community). ## Single Sign-On (SSO)¶ Out of the box, Tinybird has email & OAuth providers for logging in to the platform. However, if you have a requirement to integrate Tinybird with your own SSO provider, such as Azure Entra ID (formerly Azure Active Directory or AD), this is available to Enterprise customers. If you have a requirement for SSO integration, contact Tinybird at [support@tinybird.co](https://www.tinybird.co/docs/mailto:support@tinybird.co) or in the [Community Slack](https://www.tinybird.co/docs/docs/community). ## Secure cloud connections¶ Tinybird supports TLS across all ingest connectors, providing encryption on the wire for incoming data. However, if you have a requirement to connect to Tinybird via a secure cloud gateway, such as AWS PrivateLink or Google Private Service Connect, this is available to Enterprise customers. If you have a requirement for secure cloud connections, contact Tinybird at [support@tinybird.co](https://www.tinybird.co/docs/mailto:support@tinybird.co) or in the [Community Slack](https://www.tinybird.co/docs/docs/community). --- URL: https://www.tinybird.co/docs/core-concepts Last update: 2024-11-12T11:45:41.000Z Content: --- title: "Core concepts · Tinybird Docs" theme-color: "#171612" description: "Find Tinybird-related core terms and their definitions." --- # Core concepts¶ Familiarize yourself with Tinybird's core concepts and terminology to get a better understanding of how Tinybird works and how you can make the most of its features. ## Workspaces¶ Workspaces help you organize and collaborate on your Tinybird projects. You can have more than one Workspace. A Workspace contains the project resources, data, and state. You can share resources, such as Pipes or Data Sources, between Workspaces. You can also invite users to your Workspaces and define their role and permissions. A typical usage of Workspaces is to provide a team or project with a space to work in. User roles: - ** Admins** can do everything in the Workspace. - ** Guests** can do most things, but they can't delete Workspaces, invite or remove users, or share Data Sources across Workspaces. - ** Viewers** can't edit anything in the main Workspace[ Branch](https://www.tinybird.co/docs/docs/concepts/branches) , but they can use[ Playgrounds](https://www.tinybird.co/docs/docs/query/overview#use-the-playground) to query the data, as well as create or edit Branches. [Read more about Workspaces.](https://www.tinybird.co/docs/docs/concepts/workspaces) ## Data Sources¶ Data Sources are how you ingest and store data in Tinybird. All your data lives inside a Data Source, and you write SQL queries against Data Sources. You can bulk upload or stream data into a Data Source, and they support several different incoming data formats, such as CSV, JSON, and Parquet. [Read more about Data Sources.](https://www.tinybird.co/docs/docs/concepts/data-sources) ## Pipes¶ Pipes are how you write SQL logic in Tinybird. Pipes are a collection of one or more SQL queries chained together and compiled into a single query. Pipes let you break larger queries down into smaller queries that are easier to read. 
You can publish Pipes as API Endpoints, copy them, and create Materialized Views. [Read more about Pipes.](https://www.tinybird.co/docs/docs/concepts/pipes) ## Nodes¶ A Node is a single SQL `SELECT` statement that selects data from a Data Source or another Node or API Endpoint. Nodes live within [Pipes.](https://www.tinybird.co/docs/docs/concepts/pipes) ## API Endpoints¶ You can build your SQL logic inside a Pipe and then publish the result of your query as an HTTP API Endpoint. [Read more about API Endpoints.](https://www.tinybird.co/docs/docs/publish/api-endpoints/overview) ## Charts¶ Charts visualize your data. You can create and publish Charts in Tinybird from your published API Endpoints. [Read more about Charts.](https://www.tinybird.co/docs/docs/publish/charts) ## Tokens¶ Tokens authorize requests. Tokens can be static for back-end integrations, or custom JWTs for front-end applications. [Read more about Tokens.](https://www.tinybird.co/docs/docs/concepts/auth-tokens) ## Branches¶ Branches let you create a copy of your Workspace where you can make changes, run tests, and develop new features. You can then merge the changes back into the original Workspace. [Read more about Branches.](https://www.tinybird.co/docs/docs/concepts/branches) ## CLI¶ Use the Tinybird command line interface (CLI) to interact with Tinybird from the terminal. You can install it on your local machine or embed it into your CI/CD pipelines. [Read more about the Tinybird CLI.](https://www.tinybird.co/docs/docs/cli/quick-start) ## ClickHouse®¶ ClickHouse is an open source OLAP database that serves as Tinybird's real-time analytics database and SQL engine. The SQL queries that you write inside Tinybird use the ClickHouse SQL dialect. [Read more about ClickHouse.](https://clickhouse.com/docs/en/intro) ## Next steps¶ - Understand Tinybird's[ underlying architecture](https://www.tinybird.co/docs/docs/architecture) . - Check out the[ Tinybird Quick Start](https://www.tinybird.co/docs/docs/quick-start) . Tinybird is not affiliated with, associated with, or sponsored by ClickHouse, Inc. ClickHouse® is a registered trademark of ClickHouse, Inc. --- URL: https://www.tinybird.co/docs/guides/ingesting-data/ingest-from-aws-kinesis Last update: 2024-11-13T15:42:17.000Z Content: --- title: "Stream from AWS Kinesis · Tinybird Docs" theme-color: "#171612" description: "In this guide, you'll learn how to send data from AWS Kinesis to Tinybird." --- # Stream from AWS Kinesis¶ In this guide, you'll learn how to send data from AWS Kinesis to Tinybird. If you have a [Kinesis Data Stream](https://aws.amazon.com/kinesis/data-streams/) that you want to send to Tinybird, it should be pretty quick thanks to [Kinesis Firehose](https://aws.amazon.com/kinesis/data-firehose/) . This page explains how to integrate Kinesis with Tinybird using Firehose. ## 1. Push messages From Kinesis To Tinybird¶ ### Create a Token with the right scope¶ In your Workspace, create a Token with the `Create new Data Sources or append data to existing ones` scope: <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fingest-from-aws-kinesis-1.png&w=3840&q=75) <-figcaption-> Create a Token with the right scope ### Create a new Data Stream¶ Start by creating a new Data Stream in AWS Kinesis (see the [AWS documentation](https://docs.aws.amazon.com/streams/latest/dev/working-with-streams.html) for more information). 
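If you prefer to script this step instead of using the AWS console, the following is a minimal boto3 sketch (the region, stream name, and shard count are placeholders; refer to the AWS documentation for the full set of options):

##### Create a Kinesis Data Stream with boto3 (sketch)

import boto3

kinesis = boto3.client("kinesis", region_name="eu-central-1")  # use your AWS region

# Creates a provisioned-mode stream; adjust the name and shard count as needed.
kinesis.create_stream(StreamName="tinybird-demo-stream", ShardCount=1)

# Wait until the stream is ACTIVE before attaching Firehose to it.
kinesis.get_waiter("stream_exists").wait(StreamName="tinybird-demo-stream")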
<-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fingest-from-aws-kinesis-2.png&w=3840&q=75) <-figcaption-> Create a Kinesis Data Stream ### Create a Firehose Delivery Stream¶ Next, [create a Kinesis Data Firehose Delivery Stream](https://docs.aws.amazon.com/firehose/latest/dev/basic-create.html). Set the **Source** to **Amazon Kinesis Data Streams** and the **Destination** to **HTTP Endpoint**. In the **Destination Settings** , set **HTTP Endpoint URL** to point to the [Tinybird Events API](https://www.tinybird.co/docs/docs/guides/ingesting-data/ingest-from-the-events-api). https://api.tinybird.co/v0/events?name=&wait=true&token= This example is for Workspaces in the `GCP` --> `europe-west3` region. If necessary, replace with the [correct region for your Workspace](https://www.tinybird.co/docs/docs/api-reference/overview#regions-and-endpoints) . Additionally, note the `wait=true` parameter. Learn more about it [in the Events API docs](https://www.tinybird.co/docs/docs/guides/ingesting-data/ingest-from-the-events-api#wait-for-acknowledgement). You don't need to create the Data Source in advance; it will automatically be created for you. ### Send sample messages and check that they arrive in Tinybird¶ If you don't have an active data stream, use [this Python script](https://gist.github.com/GnzJgo/f1a80186a301cd8770a946d02343bafd) to generate dummy data. Back in Tinybird, you should see 3 columns filled with data in your Data Source. `timestamp` and `requestId` are self explanatory, and your messages are in `records__data` : <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fingest-from-aws-kinesis-3.png&w=3840&q=75) <-figcaption-> Firehose Data Source ## 2. Decode message data¶ The `records__data` column contains an array of encoded messages. To get one row per element of the array, use the [ARRAY JOIN Clause](https://clickhouse.com/docs/en/sql-reference/statements/select/array-join/) . You'll also need to decode the messages with the [base64Decode() function](https://clickhouse.com/docs/en/sql-reference/functions/string-functions/#base64decodes). Now that the raw JSON is in a column, you can use [JSONExtract functions](https://clickhouse.com/docs/en/sql-reference/functions/json-functions/) to extract the desired fields: ##### Decoding messages NODE decode_messages SQL > SELECT base64Decode(encoded_m) message, fromUnixTimestamp64Milli(timestamp) kinesis_ts FROM firehose ARRAY JOIN records__data as encoded_m NODE extract_message_fields SQL > SELECT kinesis_ts, toDateTime64(JSONExtractString(message, 'datetime'), 3) datetime, JSONExtractString(message, 'event') event, JSONExtractString(message, 'product') product FROM decode_messages <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fingest-from-aws-kinesis-4.png&w=3840&q=75) <-figcaption-> Decoding messages ## Performance optimizations¶ It is highly recommended to persist the decoded and unrolled result in a different Data Source. You can do it with a [Materialized View](https://www.tinybird.co/docs/docs/publish/materialized-views/overview) : a combination of a Pipe and a Data Source that writes the transformed data into the destination Data Source as soon as new data arrives in the Firehose Data Source. Don't store what you won't need. In this example, some of the extra columns could be skipped. [Add a TTL](https://www.tinybird.co/docs/docs/concepts/data-sources#setting-data-source-ttl) to the Firehose Data Source to prevent keeping more data than you need.
Another alternative is to create the Firehose Data Source with a Null Engine. This way, data ingested there can be transformed to fill the destination Data Source without being persisted in the Null Engine Data Source. ## Next steps¶ - Ingest from other sources - see the[ Overview page](https://www.tinybird.co/docs/docs/ingest/overview) and explore. - Build your first[ Tinybird Pipe](https://www.tinybird.co/docs/docs/concepts/pipes) . Tinybird is not affiliated with, associated with, or sponsored by ClickHouse, Inc. ClickHouse® is a registered trademark of ClickHouse, Inc. --- URL: https://www.tinybird.co/docs/guides/ingesting-data/ingest-from-csv-files Last update: 2024-07-31T16:54:46.000Z Content: --- title: "Ingest CSV files · Tinybird Docs" theme-color: "#171612" description: "In this guide, you'll learn how to ingest data into Tinybird using CSV (comma-separated values) files." --- # Ingest CSV files¶ CSV (comma-separated values) is one of the most widely used formats out there. However, it's used in different ways: some people don't use commas, others escape values differently or are unsure about using headers. The Tinybird platform is smart enough to handle many scenarios. If your data doesn't comply with format and syntax best practices, Tinybird will still aim to understand your file and ingest it, but following certain best practices can speed up your CSV processing by up to 10x. ## Syntax best practices¶ By default, Tinybird processes your CSV file assuming the file follows the most common standard ( [RFC4180](https://datatracker.ietf.org/doc/html/rfc4180#section-2) ). Key points: - Separate values with commas. - Each record is a line (with CRLF as the line break). The last line may or may not have a line break. - First line as a header is optional (though not using one is faster in Tinybird). - Double quotes are optional but using them means you can escape values (for example, if your content has commas or line breaks). Example: Instead of using the backslash `\` as an escape character, like this: 1234567890,0,0,0,0,2021-01-01 10:00:00,"{\"authorId\":\"123456\",\"handle\":\"aaa\"}" Use two double quotes: ##### More performant 1234567890,0,0,0,0,2021-01-01 10:00:00,"{""authorId"":""123456"",""handle"":""aaa""}" - Fields containing line breaks, double quotes, and commas should be enclosed in double quotes. - Double quotes can also be escaped by using another double quote (""aaa"",""b""""bb"",""ccc"") In addition to the previous points, it's also recommended to: 1. Format `DateTime` columns as `YYYY-MM-DD HH:MM:SS` and `Date` columns as `YYYY-MM-DD` . 2. Send the encoding in the `charset` part of the `content-type` header, if it's different from UTF-8. The expectation is UTF-8, so it should look like this: `Content-Type: text/html; charset=utf-8` . 3. You can set values as `null` in different ways, for example `""[]""` , `""""` (empty space), `N` and `"N"` . 4. If you use a delimiter other than a comma, explicitly define it with the API parameter `dialect_delimiter` . 5. If you use an escape character other than a double quote ( `"` ), explicitly define it with the API parameter `dialect_escapechar` . 6. If you have no option but to use a different line break character, explicitly define it with the API parameter `dialect_new_line` . For more information, check the [Data Sources API docs](https://www.tinybird.co/docs/docs/api-reference/datasource-api).
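To make the dialect parameters above concrete, here's a sketch that appends a semicolon-delimited CSV to an existing Data Source (the Data Source name, file URL, and `TINYBIRD_TOKEN` environment variable are placeholders):

##### Append a semicolon-delimited CSV via the Data Sources API (sketch)

import os
import requests

TOKEN = os.environ["TINYBIRD_TOKEN"]  # token with append permissions

response = requests.post(
    "https://api.tinybird.co/v0/datasources",
    headers={"Authorization": f"Bearer {TOKEN}"},
    params={
        "mode": "append",
        "name": "events",
        "url": "https://example.com/events.csv",
        "dialect_delimiter": ";",
    },
)
print(response.status_code, response.json())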
## Append data¶ Once the Data Source schema has been created, you can optimize your performance by not including the header. Just keep the data in the same order. However, if the header is included and it contains all the names present in the Data Source schema the ingestion will still work (even if the columns follow a different order to the initial creation). ## Next steps¶ - Got your schema sorted and ready to make some queries? Understand[ how to work with time](https://www.tinybird.co/docs/docs/guides/querying-data/working-with-time) . - Learn how to[ monitor your ingestion](https://www.tinybird.co/docs/docs/guides/monitoring/monitor-your-ingestion) . --- URL: https://www.tinybird.co/docs/guides/ingesting-data/ingest-from-dynamodb-single-table-design Last update: 2024-11-13T15:42:17.000Z Content: --- title: "Working with DynamoDB Single-Table Design · Tinybird Docs" theme-color: "#171612" description: "In this guide, you'll learn how to work with data that follows DynamoDB Single-Table Design." --- # Working with DynamoDB Single-Table Design¶ Single-Table Design is a common pattern [recommended by AWS](https://aws.amazon.com/blogs/compute/creating-a-single-table-design-with-amazon-dynamodb/) in which different table schemas are stored in the same table. Single-table design makes it easier to support many-to-many relationships and avoid the need for JOINs, which DynamoDB doesn't support. Single-Table Design is a good pattern for DynamoDB, but it's not optimal for analytics. To achieve higher performance in Tinybird, normalize data from DynamoDB into multiple tables that support the access patterns of your analytical queries. The normalization process is achieved entirely within Tinybird by ingesting the raw DynamoDB data into a landing Data Source and then creating Materialized Views to extract items into separate tables. This guide assumes you're familiar with DynamoDB, Tinybird, creating DynamoDB Data Sources in Tinybird, and Materialized Views. ## Example DynamoDB Table¶ For example, if Tinybird metadata were stored in DynamoDB using Single-Table Design, the table might look like this: - ** Partition Key** : `Org#Org_name` , example values:** Org#AWS** or** Org#Tinybird** . - ** Sort Key** : `Item_type#Id` , example values:** USER#1** or** WS#2** . - ** Attributes** : the information stored for each kind of item, like user email or Workspace cores. ## Create the DynamoDB Data Source¶ Use the [DynamoDB Connector](https://www.tinybird.co/docs/docs/ingest/dynamodb) to ingest your DynamoDB table into a Data Source. Rather than defining all columns in this landing Data Source, set only the Partition Key (PK) and Sort Key (SK) columns. The rest of the attributes are stored in the `_record` column as JSON. You don't need to define the `_record` column in the schema, as it's created automatically. SCHEMA > `PK` String `json:$.Org#Org_name`, `SK` String `json:$.Item_type#Id` ENGINE "ReplacingMergeTree" ENGINE_SORTING_KEY "PK, SK" ENGINE_VER "_timestamp" ENGINE_IS_DELETED "_is_deleted" IMPORT_SERVICE 'dynamodb' IMPORT_CONNECTION_NAME IMPORT_TABLE_ARN IMPORT_EXPORT_BUCKET The following image shows how data looks. 
The DynamoDB Connector creates some additional columns, such as `_timestamp` , that aren't in the .datasource file: <-figure-> ![DynamoDB Table storing users and workspaces information](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fguides-ddb-std-2.png&w=3840&q=75) <-figcaption-> DynamoDB Table storing users and workspaces information ## Use a Pipe to filter and extract items¶ Data is now available in your landing Data Source. However, you need to use the `JSONExtract` function to access attributes from the `_record` column. To optimize performance, use [Materialized Views](https://www.tinybird.co/docs/docs/publish/materialized-views/overview) to extract and store item types in separate Data Sources with their own schemas. Create a Pipe, use the PK and SK columns as needed to filter for a particular item type, and parse the attributes from the JSON in the `_record` column. The example table has User and Workspace items, requiring a total of two Materialized Views, one for each item type. <-figure-> ![Workspace Data Flow showing std connection, landing DS and users and workspaces Materialized Views](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fguides-ddb-std-4.png&w=3840&q=75) <-figcaption-> Two Materialized Views from landing DS To extract the Workspace items, the Pipe uses the SK to filter for Workspace items, and parses the attributes from the JSON in the `_record` column. For example: SELECT toLowCardinality(splitByChar('#', PK)[2]) org, toUInt32(splitByChar('#', SK)[2]) workspace_id, JSONExtractString(_record,'ws_name') ws_name, toUInt16(JSONExtractUInt(_record,'cores')) cores, JSONExtractUInt(_record,'storage_tb') storage_tb, _record, _old_record, _timestamp, _is_deleted FROM dynamodb_ds_std WHERE splitByChar('#', SK)[1] = 'WS' ## Create the Materialized Views¶ Create a Materialized View from the Pipe to store the extracted data in a new Data Source. The Materialized View must use the ReplacingMergeTree engine to handle the deduplication of rows, supporting updates and deletes from DynamoDB. Use the following engine settings and configure them as needed for your table: - `ENGINE "ReplacingMergeTree"` : the ReplacingMergeTree engine is used to deduplicate rows. - `ENGINE_SORTING_KEY "key1, key2"` : the columns used to identify unique items; can be one or more columns, typically the part of the PK and SK that doesn't identify the item type. - `ENGINE_VER "_timestamp"` : the column used to identify the most recent row for each key. - `ENGINE_IS_DELETED "_is_deleted"` : the column used to identify if a row has been deleted. For example, the Materialized View for the Workspace items uses the following schema and engine settings: SCHEMA > `org` LowCardinality(String), `workspace_id` UInt32, `ws_name` String, `cores` UInt16, `storage_tb` UInt64, `_record` String, `_old_record` Nullable(String), `_timestamp` DateTime64(3), `_is_deleted` UInt8 ENGINE "ReplacingMergeTree" ENGINE_SORTING_KEY "org, workspace_id" ENGINE_VER "_timestamp" ENGINE_IS_DELETED "_is_deleted" Repeat the same process for each item type. <-figure-> ![Materialized View for extracting Users attributes](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fguides-ddb-std-3.png&w=3840&q=75) <-figcaption-> Materialized View for extracting Users attributes You now have your Data Sources with the extracted columns ready to be queried. ## Review performance gains¶ This process offers significant performance gains over querying the landing Data Source.
To demonstrate this, you can use a Playground to compare the performance of querying the raw data vs the extracted data. For the example table, the following queries aggregate the total number of users, workspaces, cores, and storage per organization using the unoptimized raw data and the optimized extracted data. The query over raw data took 335 ms, while the query over the extracted data took 144 ms, for a 2.3x improvement. NODE users_stats SQL > SELECT org, count() total_users FROM ddb_users_mv FINAL GROUP BY org NODE ws_stats SQL > SELECT org, count() total_workspaces, sum(cores) total_cores, sum(storage_tb) total_storage_tb FROM ddb_workspaces_mv FINAL GROUP BY org NODE users_stats_raw SQL > SELECT toLowCardinality(splitByChar('#', PK)[2]) org, count() total_users FROM dynamodb_ds_std FINAL WHERE splitByChar('#', SK)[1] = 'USER' GROUP BY org NODE ws_stats_raw SQL > SELECT toLowCardinality(splitByChar('#', PK)[2]) org, count() total_ws, sum(toUInt16(JSONExtractUInt(_record,'cores'))) total_cores, sum(JSONExtractUInt(_record,'storage_tb')) total_storage_tb FROM dynamodb_ds_std FINAL WHERE splitByChar('#', SK)[1] = 'WS' GROUP BY org NODE org_stats SQL > SELECT * FROM users_stats JOIN ws_stats using org NODE org_stats_raw SQL > SELECT * FROM users_stats_raw JOIN ws_stats_raw using org This is how the outcome looks in Tinybird: <-figure-> ![Comparison of same query](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fguides-ddb-std-5.png&w=3840&q=75) <-figcaption-> Same info, faster and more efficient from Materialized Views --- URL: https://www.tinybird.co/docs/guides/ingesting-data/ingest-from-google-gcs Last update: 2024-10-28T11:06:14.000Z Content: --- title: "Ingest from Google Cloud Storage · Tinybird Docs" theme-color: "#171612" description: "In this guide, you'll learn how to automatically synchronize all the CSV files in a Google GCS bucket to a Tinybird Data Source." --- # Ingest from Google Cloud Storage¶ In this guide, you'll learn how to automatically synchronize all the CSV files in a Google GCS bucket to a Tinybird Data Source. ## Prerequisites¶ This guide assumes you have familiarity with [Google GCS buckets](https://cloud.google.com/storage/docs/buckets) and the basics of [ingesting data into Tinybird](https://www.tinybird.co/docs/docs/ingest/overview). ## Perform a one-off load¶ When building on Tinybird, people often want to load historical data that comes from another system (called 'seeding' or 'backfilling'). A very common pattern is exporting historical data by creating a dump of CSV files into a Google GCS bucket, then ingesting these CSV files into Tinybird. You can append these files to a Data Source in Tinybird using the Data Sources API. 
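The rest of this guide assumes the destination Data Source already exists. If it doesn't, one way to create it up front is the Data Sources API's `create` mode. The following is a minimal sketch: the Data Source name ( `events` ) matches the rest of this guide, while the schema columns are purely illustrative, so adjust them to match your CSV files.

##### Example: create the destination Data Source before appending
# Hypothetical schema: replace the column names and types with the ones in your CSV files
curl \
  -H "Authorization: Bearer $TB_TOKEN" \
  -X POST "https://api.tinybird.co/v0/datasources" \
  -d "mode=create" \
  -d "name=events" \
  --data-urlencode "schema=date DateTime, user_id String, event String"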
Let's assume you have a set of CSV files in your GCS bucket: ##### List of events files tinybird-assets/datasets/guides/events/events_0.csv tinybird-assets/datasets/guides/events/events_1.csv tinybird-assets/datasets/guides/events/events_10.csv tinybird-assets/datasets/guides/events/events_11.csv tinybird-assets/datasets/guides/events/events_12.csv tinybird-assets/datasets/guides/events/events_13.csv tinybird-assets/datasets/guides/events/events_14.csv tinybird-assets/datasets/guides/events/events_15.csv tinybird-assets/datasets/guides/events/events_16.csv tinybird-assets/datasets/guides/events/events_17.csv tinybird-assets/datasets/guides/events/events_18.csv tinybird-assets/datasets/guides/events/events_19.csv tinybird-assets/datasets/guides/events/events_2.csv tinybird-assets/datasets/guides/events/events_20.csv tinybird-assets/datasets/guides/events/events_21.csv tinybird-assets/datasets/guides/events/events_22.csv tinybird-assets/datasets/guides/events/events_23.csv tinybird-assets/datasets/guides/events/events_24.csv tinybird-assets/datasets/guides/events/events_25.csv tinybird-assets/datasets/guides/events/events_26.csv tinybird-assets/datasets/guides/events/events_27.csv tinybird-assets/datasets/guides/events/events_28.csv tinybird-assets/datasets/guides/events/events_29.csv tinybird-assets/datasets/guides/events/events_3.csv tinybird-assets/datasets/guides/events/events_30.csv tinybird-assets/datasets/guides/events/events_31.csv tinybird-assets/datasets/guides/events/events_32.csv tinybird-assets/datasets/guides/events/events_33.csv tinybird-assets/datasets/guides/events/events_34.csv tinybird-assets/datasets/guides/events/events_35.csv tinybird-assets/datasets/guides/events/events_36.csv tinybird-assets/datasets/guides/events/events_37.csv tinybird-assets/datasets/guides/events/events_38.csv tinybird-assets/datasets/guides/events/events_39.csv tinybird-assets/datasets/guides/events/events_4.csv tinybird-assets/datasets/guides/events/events_40.csv tinybird-assets/datasets/guides/events/events_41.csv tinybird-assets/datasets/guides/events/events_42.csv tinybird-assets/datasets/guides/events/events_43.csv tinybird-assets/datasets/guides/events/events_44.csv tinybird-assets/datasets/guides/events/events_45.csv tinybird-assets/datasets/guides/events/events_46.csv tinybird-assets/datasets/guides/events/events_47.csv tinybird-assets/datasets/guides/events/events_48.csv tinybird-assets/datasets/guides/events/events_49.csv tinybird-assets/datasets/guides/events/events_5.csv tinybird-assets/datasets/guides/events/events_6.csv tinybird-assets/datasets/guides/events/events_7.csv tinybird-assets/datasets/guides/events/events_8.csv tinybird-assets/datasets/guides/events/events_9.csv ### Ingest a single file¶ To ingest a single file, [generate a signed URL in GCP](https://cloud.google.com/storage/docs/access-control/signed-urls) , and send the URL to the Data Sources API using the `append` mode flag: ##### Example POST request with append mode flag curl -H "Authorization: Bearer " \ -X POST "https://api.tinybird.co/v0/datasources?name=&mode=append" \ --data-urlencode "url=" ### Ingest multiple files¶ If you want to ingest multiple files, you probably don't want to manually write each cURL. Instead, create a script to iterate over the files in the bucket and generate the cURL commands automatically. The following script example requires the [gsutil tool](https://cloud.google.com/storage/docs/gsutil) and assumes you have already created your Tinybird Data Source. 
You can use the `gsutil` tool to list the files in the bucket, extract the name of each CSV file, and create a signed URL. Then, generate a cURL command to send the signed URL to Tinybird. To avoid hitting [API rate limits](https://www.tinybird.co/docs/docs/api-reference/overview#limits), wait 15 seconds between requests. Here's an example script in bash: ##### Ingest CSV files from a Google Cloud Storage Bucket to Tinybird TB_HOST= TB_TOKEN= BUCKET=gs:// DESTINATION_DATA_SOURCE= GOOGLE_APPLICATION_CREDENTIALS= REGION= for url in $(gsutil ls $BUCKET | grep csv) do echo $url SIGNED=`gsutil signurl -r $REGION $GOOGLE_APPLICATION_CREDENTIALS $url | tail -n 1 | python3 -c "import sys; print(sys.stdin.read().split('\t')[-1])"` curl -H "Authorization: Bearer $TB_TOKEN" \ -X POST "$TB_HOST/v0/datasources?name=$DESTINATION_DATA_SOURCE&mode=append" \ --data-urlencode "url=$SIGNED" echo sleep 15 done The script uses the following variables: - `TB_HOST` as the corresponding URL for[ your region](https://www.tinybird.co/docs/docs/api-reference/overview#regions-and-endpoints) . - `TB_TOKEN` as a Tinybird[ Token](https://www.tinybird.co/docs/docs/concepts/auth-tokens) with `DATASOURCES:CREATE` or `DATASOURCES:APPEND` scope. See the[ Tokens API](https://www.tinybird.co/docs/docs/api-reference/token-api) for more information. - `BUCKET` as the GCS URI of the bucket containing the events CSV files. - `DESTINATION_DATA_SOURCE` as the name of the Data Source in Tinybird, in this case `events` . - `GOOGLE_APPLICATION_CREDENTIALS` as the local path of a Google Cloud service account JSON file. - `REGION` as the Google Cloud region name. ## Automatically sync files with Google Cloud Functions¶ The previous scenario covered a one-off dump of CSV files in a bucket to Tinybird. A slightly more complex scenario is appending to a Tinybird Data Source each time a new CSV file is dropped into a GCS bucket, which can be done using Google Cloud Functions. That way you can have your ETL process exporting data from your Data Warehouse (such as Snowflake or BigQuery) or any other origin, and you don't have to think about manually synchronizing those files to Tinybird. Imagine you have a GCS bucket named `gs://automatic-ingestion-poc/` and each time you put a CSV there you want to sync it automatically to an `events` Data Source previously created in Tinybird: 1. Clone this GitHub repository ( `gcs-cloud-function` ) . 2. Install and configure the `gcloud` command line tool. 3. Run `cp .env.yaml.sample .env.yaml` and set the `TB_HOST` and `TB_TOKEN` variables. 4.
Run: ##### Syncing from GCS to Tinybird with Google Cloud Functions # set some environment variables before deploying PROJECT_NAME= SERVICE_ACCOUNT_NAME= BUCKET_NAME= REGION= TB_FUNCTION_NAME= # grant permissions to deploy the cloud function and read from storage to the service account gcloud projects add-iam-policy-binding $PROJECT_NAME --member serviceAccount:$SERVICE_ACCOUNT_NAME --role roles/storage.admin gcloud projects add-iam-policy-binding $PROJECT_NAME --member serviceAccount:$SERVICE_ACCOUNT_NAME --role roles/iam.serviceAccountTokenCreator gcloud projects add-iam-policy-binding $PROJECT_NAME --member serviceAccount:$SERVICE_ACCOUNT_NAME --role roles/editor # deploy the cloud function gcloud functions deploy $TB_FUNCTION_NAME \ --runtime python38 \ --trigger-resource $BUCKET_NAME \ --trigger-event google.storage.object.finalize \ --region $REGION \ --env-vars-file .env.yaml \ --service-account $SERVICE_ACCOUNT_NAME It deploys a Google Cloud Function with name `TB_FUNCTION_NAME` to your Google Cloud account, which listens for new files in the `BUCKET_NAME` provided (in this case `automatic-ingestion-poc` ), and automatically appends them to the Tinybird Data Source described by the `FILE_REGEXP` environment variable. <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fsyncing-data-from-s3-or-gcs-buckets-3.png&w=3840&q=75) <-figcaption-> Cloud function to sync a GCS bucket to Tinybird Now you can drop CSV files into the configured bucket: <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fsyncing-data-from-s3-or-gcs-buckets-4.gif&w=3840&q=75) <-figcaption-> Drop files to a GCS bucket and check the datasources_ops_log A recommended pattern is naming the CSV files in the format `datasourcename_YYYYMMDDHHMMSS.csv` so they are automatically appended to `datasourcename` in Tinybird. For instance, `events_20210125000000.csv` will be appended to the `events` Data Source. ## Next steps¶ - Got your schema sorted and ready to make some queries? Understand[ how to work with time](https://www.tinybird.co/docs/docs/guides/querying-data/working-with-time) . - Learn how to[ monitor your ingestion](https://www.tinybird.co/docs/docs/guides/monitoring/monitor-your-ingestion) . --- URL: https://www.tinybird.co/docs/guides/ingesting-data/ingest-from-google-pubsub Last update: 2024-10-28T11:06:14.000Z Content: --- title: "Ingest from Google Pub/Sub · Tinybird Docs" theme-color: "#171612" description: "In this guide you'll learn how to send data from Google Pub/Sub to Tinybird." --- # Stream from Google Pub/Sub¶ In this guide you'll learn how to send data from Google Pub/Sub to Tinybird. ## Overview¶ Tinybird is a Google Cloud partner & supports integrating with Google Cloud services. [Google Pub/Sub](https://cloud.google.com/pubsub) is often used as a messaging middleware that decouples event stream sources from the end destination. Pub/Sub streams are usually consumed by Google's DataFlow which can send events on to destinations such as BigQuery, BigTable, or Google Cloud Storage. This DataFlow pattern works with Tinybird too, however, Pub/Sub also has a feature called [Push subscriptions](https://cloud.google.com/pubsub/docs/push) which can forward messages directly from Pub/Sub to Tinybird. The following guide steps use the subscription approach. ## Push messages from Pub/Sub to Tinybird¶ ### 1. Create a Pub/Sub topic¶ Start by creating a topic in Google Pub/Sub following the [Google Pub/Sub documentation](https://cloud.google.com/pubsub/docs/admin#create_a_topic). ### 2. 
Create a push subscription¶ Next, [create a Push subscription in Pub/Sub](https://cloud.google.com/pubsub/docs/create-subscription#push_subscription). Set the **Delivery Type** to **Push**. In the **Endpoint URL** field, use the following snippet (which uses the [Tinybird Events API](https://www.tinybird.co/docs/docs/guides/ingesting-data/ingest-from-the-events-api) ) and pass your own Token, which you can find in your Workspace > Tokens: ##### Endpoint URL https://api.tinybird.co/v0/events?wait=true&name=&token= Replace the Tinybird API hostname/region with the [right API URL region](https://www.tinybird.co/docs/docs/api-reference/overview#regions-and-endpoints) that matches your Workspace. Your Token lives in the Workspace under "Tokens". If you are sending single-line JSON payloads through Pub/Sub, tick the **Enable payload unwrapping** option to enable unwrapping. This means that data is not base64 encoded before it is sent to Tinybird. If you are sending any other format via Pub/Sub, leave this unchecked (you'll need to follow the decoding steps at the bottom of this guide). Set **Retry policy** to **Retry after exponential backoff delay** . Set the **Minimum backoff** to **1** and **Maximum backoff** to **60**. You don't need to create the Data Source in advance; it will be created automatically for you. This snippet also includes the `wait=true` parameter, which is explained in the [Events API docs](https://www.tinybird.co/docs/docs/guides/ingesting-data/ingest-from-the-events-api#wait-for-acknowledgement). ### 3. Send sample messages¶ Generate and send some sample messages to test your connection. If you don't have your own messages to test, use [this script](https://gist.github.com/alejandromav/dec8e092ef62d879e6821da06f6459c2). ### 4. Check the Data Source¶ Pub/Sub will start to push data to Tinybird. Check the Tinybird UI to see that the Data Source has been created and events are arriving. ### (Optional) Decode the payload¶ If you enabled the **Enable payload unwrapping** option, there is nothing else to do. However, if you are not sending single-line JSON payloads (NDJSON, JSONL) through Pub/Sub, you'll need to continue to base64 encode the data before sending it to Tinybird. When the data arrives in Tinybird, you can decode it using the `base64Decode` function, like this: SELECT message_message_id as message_id, message_publish_time, base64Decode(message_data) as message_data FROM events_demo ## Next steps¶ - Explore other Google <> Tinybird integrations like[ how to query Google Sheets with SQL](https://www.tinybird.co/blog-posts/query-google-sheets-with-sql-in-real-time) . - Ready to start querying your data? Make sure you're familiar with[ how to work with time](https://www.tinybird.co/docs/docs/guides/querying-data/working-with-time) . --- URL: https://www.tinybird.co/docs/guides/ingesting-data/ingest-from-mongodb Last update: 2024-11-06T09:36:14.000Z Content: --- title: "Ingest data from MongoDB · Tinybird Docs" theme-color: "#171612" description: "In this guide, you'll learn how to ingest data into Tinybird from MongoDB." --- # Connect MongoDB to Tinybird¶ In this guide, you'll learn how to ingest data into Tinybird from MongoDB. You'll use: - MongoDB Atlas as the source MongoDB database. - Confluent Cloud's MongoDB Atlas Source connector to capture change events from MongoDB Atlas and push them to Kafka. - The Tinybird Confluent Cloud connector to ingest the data from Kafka. This guide uses Confluent Cloud as a managed Kafka service, and MongoDB Atlas as a managed MongoDB service.
You can use any Kafka service and MongoDB instance, but the setup steps may vary. ## Prerequisites¶ This guide assumes you have: - An existing Tinybird account & Workspace - An existing Confluent Cloud account - An existing MongoDB Atlas account & collection ## 1. Create Confluent Cloud MongoDB Atlas Source¶ [Create a new MongoDB Atlas Source in Confluent Cloud](https://docs.confluent.io/cloud/current/connectors/cc-mongo-db-source.html#get-started-with-the-mongodb-atlas-source-connector-for-ccloud) . Use the following template to configure the Source: { "name": "", "config": { "name": "", "connection.host": "", "connection.user": "", "connection.password": "", "database": "", "collection": "", "cloud.provider": "", "cloud.environment": "", "kafka.region": "", "kafka.auth.mode": "KAFKA_API_KEY", "kafka.api.key": "", "kafka.api.secret": "", "kafka.endpoint": "", "topic.prefix": "", "errors.deadletterqueue.topic.name": "", "startup.mode": "copy_existing", "copy.existing": "true", "copy.existing.max.threads": "1", "copy.existing.queue.size": "16000", "poll.await.time.ms": "5000", "poll.max.batch.size": "1000", "heartbeat.interval.ms": "10000", "errors.tolerance": "all", "max.batch.size": "100", "connector.class": "MongoDbAtlasSource", "output.data.format": "JSON", "output.json.format": "SimplifiedJson", "json.output.decimal.format": "NUMERIC", "change.stream.full.document": "updateLookup", "change.stream.full.document.before.change": "whenAvailable", "tasks.max": "1" } } When the Source is created, you should see a new Kafka topic in your Confluent Cloud account. This topic will contain the change events from your MongoDB collection. ## 2. Create Tinybird Data Source (CLI)¶ Using the Tinybird CLI, create a new Kafka connection `tb connection create kafka` The CLI will prompt you to enter the connection details to your Kafka service. You'll also provide a name for the connection, which is used by Tinybird to reference the connection, and you'll need it below. Next, create a new file called `kafka_ds.datasource` (you can use any name you want, just use the .datasource extension). Add the following content to the file: SCHEMA > `_id` String `json:$.documentKey._id` DEFAULT JSONExtractString(__value, '_id._id'), `operation_type` LowCardinality(String) `json:$.operationType`, `database` LowCardinality(String) `json:$.ns.db`, `collection` LowCardinality(String) `json:$.ns.coll` ENGINE "MergeTree" ENGINE_PARTITION_KEY "toYYYYMM(__timestamp)" ENGINE_SORTING_KEY "__timestamp, _id" KAFKA_CONNECTION_NAME '' KAFKA_TOPIC '' KAFKA_GROUP_ID '' KAFKA_AUTO_OFFSET_RESET 'earliest' KAFKA_STORE_RAW_VALUE 'True' KAFKA_STORE_HEADERS 'False' KAFKA_STORE_BINARY_HEADERS 'True' KAFKA_TARGET_PARTITIONS 'auto' KAFKA_KEY_AVRO_DESERIALIZATION '' Now push the Data Source to Tinybird using: tb push kafka_ds.datasource ## 3. Validate the Data Source¶ Go to the Tinybird UI and validate that a Data Source has been created. As changes occur in MongoDB, you should see the data being ingested into Tinybird. Note that this is an append log of all changes, so you will see multiple records for the same document as it is updated. ## 4. Deduplicate with ReplacingMergeTree¶ To deduplicate the data, you can use a `ReplacingMergeTree` engine on a Materialized View. This is explained in more detail in the [deduplication guide](https://www.tinybird.co/docs/docs/guides/querying-data/deduplication-strategies#use-the-replacingmergetree-engine). 
We will create a new Data Source using the ReplacingMergeTree engine to store the deduplicated data, and a Pipe to process the data from the original Data Source and write to the new Data Source. First, create a new Data Source to store the deduplicated data. Create a new file called `deduped_ds.datasource` and add the following content: SCHEMA > `fullDocument` String, `_id` String, `database` LowCardinality(String), `collection` LowCardinality(String), `k_timestamp` DateTime, `is_deleted` UInt8 ENGINE "ReplacingMergeTree" ENGINE_SORTING_KEY "_id" ENGINE_VER "k_timestamp" ENGINE_IS_DELETED "is_deleted" Now push the Data Source to Tinybird using: tb push deduped_ds.datasource Then, create a new file called `dedupe_mongo.pipe` and add the following content: NODE mv SQL > SELECT JSONExtractRaw(__value, 'fullDocument') as fullDocument, _id, database, collection, __timestamp as k_timestamp, if(operation_type = 'delete', 1, 0) as is_deleted FROM TYPE materialized DATASOURCE Now push the Pipe to Tinybird using: tb push dedupe_mongo.pipe As new data arrives via Kafka, it will be processed automatically through the Materialized View, writing it into the `ReplacingMergeTree` Data Source. Query this new Data Source to access the deduplicated data: SELECT * FROM deduped_ds FINAL --- URL: https://www.tinybird.co/docs/guides/ingesting-data/ingest-from-rudderstack Last update: 2024-11-12T11:45:41.000Z Content: --- title: "Stream from RudderStack · Tinybird Docs" theme-color: "#171612" description: "In this guide, you'll learn two different methods to send events from RudderStack to Tinybird." --- # Stream from RudderStack¶ In this guide, you'll learn two different methods to send events from RudderStack to Tinybird. To better understand the behavior of their customers, companies need to unify timestamped data coming from a wide variety of products and platforms. Typical events to track would be 'sign up', 'login', 'page view' or 'item purchased'. A customer data platform can be used to capture complete customer data like this from wherever your customers interact with your brand. It defines events, collects them from different platforms and products, and routes them to where they need to be consumed. [RudderStack](https://www.rudderstack.com/) is an open-source customer data pipeline tool. It collects, processes and routes data from your websites, apps, cloud tools, and data warehouse. By using Tinybird's event ingestion endpoint for [high-frequency ingestion](https://www.tinybird.co/docs/docs/guides/ingesting-data/ingest-from-the-events-api) as a Webhook in RudderStack, you can stream customer data in real time to Data Sources. ## Option 1: A separate Data Source for each event type¶ This is the preferred approach. It sends each type of event to a corresponding Data Source. This [2-minute video](https://www.youtube.com/watch?v=z3TkPvo5CRQ) shows you how to set up high-frequency ingestion through RudderStack using these steps. The advantages of this method are: - Your data is well organized from the start. - Different event types can have different attributes (columns in their Data Source). - Whenever new attributes are added to an event type you will be prompted to add new columns. - New event types will get a new Data Source. Start by generating a Token in the UI to allow RudderStack to write to Tinybird. ### Create a Tinybird Token¶ Go to the Workspace in Tinybird where you want to receive data and select "Tokens" in the side panel. Create a new Token by selecting "Create Token" (top right). 
Give your Token a descriptive name. In the section "DATA SOURCES SCOPES", mark the "Data Sources management" checkbox (Enabled) to give your Token permission to create Data Sources. Select "Save changes". ### Create a RudderStack Destination¶ In RudderStack, select "Destinations" in the side panel and then "New destination" (top right). Select Webhook: 1. Give the destination a descriptive name. 2. Connect your source(s); you can test with the Rudderstack Sample HTTP Source. 3. Input the following Connection Settings: - Webhook URL: https://api.tinybird.co/v0/events - URL Method: POST - Headers Key: Authorization - Headers Value: Bearer TINYBIRD_AUTH_TOKEN <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fstreaming-via-rudderstack-1.png&w=3840&q=75) <-figcaption-> Webhook connection settings for high-frequency ingestion On the next page, select "Create new transformation". You can code a function in the box to apply to events when this transformation is active, using the example snippet below (feel free to update it to suit your needs). In this function, you can dynamically append the target Data Source to the target URL of the Webhook. Give your transformation a descriptive name and a helpful description. ##### Transformation code export function transformEvent(event, metadata){ event.appendPath="?name=rudderstack_"+event.event.toLowerCase().replace(/[\s\.]/g, '_') return event; } This example snippet uses the prefix `rudderstack_` followed by the name of the event in lower case, with its words separated by underscores (for instance, a "Product purchased" event would go to a Data Source named `rudderstack_product_purchased` ). Save the transformation. Your destination has been created successfully! ### Test Ingestion¶ In RudderStack, select Sources --> Rudderstack Sample HTTP --> Live events (top right) --> "Send test event" and paste the provided curl command into your terminal. The event will appear on the screen and be sent to Tinybird. If, after sending some events through RudderStack, you see that your Data Source in Tinybird exists but is empty (0 rows after sending a few events), you will need to authorize the Token that you created to **append** data to the Data Source. In the UI, navigate to "Tokens", select the Token you created, select "Data Sources management" --> "Add Data Source scope", and choose the name of the Data Source that you want to write to. Mark the "Append" checkbox and save the changes. ## Option 2: All events in the same Data Source¶ This alternative approach consists of sending all events into a single Data Source and then splitting them using Tinybird. By pre-configuring the Data Source, any events that RudderStack sends will be ingested with the JSON object in full as a String in a single column. This is very useful when you have complex JSON objects, as explained in the [ingesting NDJSON docs](https://www.tinybird.co/docs/docs/guides/ingesting-data/ingest-ndjson-data#jsonpaths) , but be aware that using JSONExtract to parse data from the JSON object after ingestion has an impact on performance. New columns from parsing the data will be detected and you will be asked if you want to save them. You can adjust the inferred data types before saving any new columns. Pipes can be used to filter the Data Source by different events. The following example assumes you have already installed the Tinybird CLI. If you're not familiar with how to use or install it, [read the CLI docs](https://www.tinybird.co/docs/docs/cli/install).
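If you just want the quick version: the CLI is typically installed from PyPI (a minimal sketch; check the [CLI docs](https://www.tinybird.co/docs/docs/cli/install) for the current, recommended installation method for your platform). Authentication with `tb auth` is covered in the next step.

##### Install the Tinybird CLI (sketch)
# Assumes a working Python 3 environment; see the CLI docs if the package name or method changes
pip install tinybird-cli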
### Pre-configure a Data Source¶ Authenticate to your Workspace by typing **tb auth** and entering your Token for the Workspace into which you want to ingest data from RudderStack. Create a new file in your local Workspace, named `rudderstack_events.datasource` for example, to configure the empty Data Source. ##### Data Source schema SCHEMA > `value` String `json:$` ENGINE "MergeTree" ENGINE_SORTING_KEY "value" Push the file to your Workspace using `tb push rudderstack_events.datasource`. Note that this pre-configured Data Source is only required if you need a column containing the JSON object in full as a String. Otherwise, just skip this step and let Tinybird infer the columns and data types when you send the first event. You will then be able to select which columns you wish to save and adjust their data types. Create the Token as in method 1. ### Create a Tinybird Token¶ Go to the Workspace in Tinybird where you want to receive data and select "Tokens" in the side panel. Create a new Token by selecting "Create Token" (top right). Give your Token a descriptive name. In the section "DATA SOURCES SCOPES", select "Add Data Source scope", choose the name of the Data Source that you just created, and mark the "Append" checkbox. Select "Save changes". ### Create a RudderStack Destination¶ In RudderStack, select "Destinations" in the side panel and then "New destination" (top right). Select Webhook: 1. Give the destination a descriptive name. 2. Connect your source(s); you can test with the Rudderstack Sample HTTP Source. 3. Input the following Connection Settings: - Webhook URL: https://api.tinybird.co/v0/events?name=rudderstack_events - URL Method: POST - Headers Key: Authorization - Headers Value: Bearer TINYBIRD_AUTH_TOKEN <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fstreaming-via-rudderstack-3.png&w=3840&q=75) <-figcaption-> Webhook connection settings with Data Source name Select 'No transformation needed' and save. Your destination has been created successfully! ### Test Ingestion¶ Select Sources --> Rudderstack Sample HTTP --> "Live events" (top right) --> "Send test event" and paste the provided curl command into your terminal. The event will appear on the screen and be sent to Tinybird. The `value` column contains the full JSON object. You will also have the option of having the data parsed into columns. When viewing the new columns, you can select which ones to save and adjust their data types. <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fstreaming-via-rudderstack-4.png&w=3840&q=75) <-figcaption-> New columns detected not in schema Whenever new columns are detected in the stream of events, you will be asked if you want to save them. ## Next steps¶ - Need to[ iterate a Data Source, including the schema](https://www.tinybird.co/docs/docs/guides/ingesting-data/iterate-a-data-source) ? Read how here. - Want to schedule your data ingestion? Read the docs on[ cron and GitHub Actions](https://www.tinybird.co/docs/docs/guides/ingesting-data/scheduling-with-github-actions-and-cron) . --- URL: https://www.tinybird.co/docs/guides/ingesting-data/ingest-from-snowflake-via-unloading Last update: 2024-11-13T15:42:17.000Z Content: --- title: "Ingest from Snowflake via unloading · Tinybird Docs" theme-color: "#171612" description: "In this guide you'll learn how to send data from Snowflake to Tinybird via unloading."
--- # Ingest from Snowflake via unloading¶ In this guide you'll learn how to send data from Snowflake to Tinybird, for scenarios where the [native connector](https://www.tinybird.co/docs/docs/ingest/snowflake) can't be used, such as anything beyond a one-off load or periodic full replaces of the table, or where [limits](https://www.tinybird.co/docs/docs/ingest/snowflake#limits) apply. This process relies on [unloading](https://docs.snowflake.com/en/user-guide/data-unload-overview) (aka bulk exporting) data as gzipped CSVs and then ingesting it via the [Data Sources API](https://www.tinybird.co/docs/docs/ingest/datasource-api). This guide explains the process using Azure Blob Storage, but it's easy to replicate using Amazon S3, Google Cloud Storage, or any storage service where you can unload data from Snowflake and share presigned URLs to access the files. This guide is a walkthrough of the most common, basic process: Unload the table from Snowflake, then ingest this export into Tinybird. ## Prerequisites¶ This guide assumes you have a Tinybird account and that you are familiar with creating a Tinybird Workspace and pushing resources to it. You will also need access to Snowflake, and permissions to create SAS Tokens for Azure Blob Storage or its equivalents in AWS S3 and Google Cloud Storage. ## 1. Unload the Snowflake table¶ Snowflake makes it really easy to [unload](https://docs.snowflake.com/en/user-guide/data-unload-overview) query results to flat files in an external storage service. COPY INTO 'azure://myaccount.blob.core.windows.net/unload/' FROM mytable CREDENTIALS = ( AZURE_SAS_TOKEN='****' ) FILE_FORMAT = ( TYPE = CSV COMPRESSION = GZIP ) HEADER = FALSE; The most basic implementation is [unloading directly](https://docs.snowflake.com/en/sql-reference/sql/copy-into-location#unloading-data-from-a-table-directly-to-files-in-an-external-location) , but for production use cases consider adding a [named stage](https://docs.snowflake.com/en/user-guide/data-unload-azure#unloading-data-into-an-external-stage) as suggested in the docs. Stages will give you more fine-grained control over access rights. ## 2. Create a SAS token for the file¶ Using the [Azure CLI](https://learn.microsoft.com/en-us/cli/azure/install-azure-cli) , generate a [shared access signature (SAS) token](https://learn.microsoft.com/en-us/azure/ai-services/translator/document-translation/how-to-guides/create-sas-tokens?tabs=blobs) so Tinybird can read the file: az storage blob generate-sas \ --account-name myaccount \ --account-key '****' \ --container-name unload \ --name data.csv.gz \ --permissions r \ --expiry \ --https-only \ --output tsv \ --full-uri > 'https://myaccount.blob.core.windows.net/unload/data.csv.gz?se=2024-05-31T10%3A57%3A41Z&sp=r&spr=https&sv=2022-11-02&sr=b&sig=PMC%2E9ZvOFtKATczsBQgFSsH1%2BNkuJvO9dDPkTpxXH0g%5D' Follow the same approach in S3 and GCS to generate presigned URLs. ## 3. Ingest into Tinybird¶ Take that generated URL and make a call to Tinybird. You'll need a [Token](https://www.tinybird.co/docs/concepts/auth-tokens#tokens) with `DATASOURCES:CREATE` permissions: curl \ -H "Authorization: Bearer " \ -X POST "https://api.tinybird.co/v0/datasources?name=my_datasource_name" \ -d url='https://myaccount.blob.core.windows.net/unload/data.csv.gz?se=2024-05-31T10%3A57%3A41Z&sp=r&spr=https&sv=2022-11-02&sr=b&sig=PMC%2E9ZvOFtKATczsBQgFSsH1%2BNkuJvO9dDPkTpxXH0g%5D' You should now have your Snowflake table in Tinybird.
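To double-check the import, you can run a quick query against the new Data Source using the Query API. This is a minimal sketch: `my_datasource_name` matches the example above, and the Token you use needs read access to that Data Source.

##### Example: verify the import with the Query API
# Returns the row count of the freshly created Data Source
curl \
  -G -H "Authorization: Bearer $TOKEN" \
  "https://api.tinybird.co/v0/sql" \
  --data-urlencode "q=SELECT count() FROM my_datasource_name"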
## Automation¶ To adapt to more "real-life" scenarios (like appending data on a regular basis, replacing data that has been updated in Snowflake, etc.), you may need to define scheduled actions to move the data. You can see examples in the [Ingest from Google Cloud Storage guide](https://www.tinybird.co/docs/docs/guides/ingesting-data/ingest-from-google-gcs#automatically-sync-files-with-google-cloud-functions) and in the [Schedule data ingestion with cron and GitHub Actions guide](https://www.tinybird.co/docs/docs/guides/ingesting-data/scheduling-with-github-actions-and-cron). ## Limits¶ You will be using the Data Sources API, so its [limits](https://www.tinybird.co/docs/docs/api-reference/overview#limits) apply: | Description | Limit | | --- | --- | | Append/Replace data to Data Source | 5 times per minute | | Max file size (uncompressed), Build plan | 10 GB | | Max file size (uncompressed), Pro and Enterprise plans | 32 GB | As a result of these limits, you may need to adjust your [COPY INTO](https://docs.snowflake.com/en/sql-reference/sql/copy-into-location) expression by adding `PARTITION BY` or `MAX_FILE_SIZE = 5000000000`. COPY INTO 'azure://myaccount.blob.core.windows.net/unload/' FROM mytable CREDENTIALS=( AZURE_SAS_TOKEN='****') FILE_FORMAT = ( TYPE = CSV COMPRESSION = GZIP ) HEADER = FALSE MAX_FILE_SIZE = 5000000000; ## Next steps¶ These resources may be useful: - [ Tinybird Snowflake Connector](https://www.tinybird.co/docs/ingest/snowflake) - [ Tinybird S3 Connector](https://www.tinybird.co/docs/ingest/s3) - [ Guide: Ingest from Google Cloud Storage](https://www.tinybird.co/docs/guides/ingesting-data/ingest-from-google-gcs) --- URL: https://www.tinybird.co/docs/guides/ingesting-data/ingest-from-the-events-api Last update: 2024-11-06T08:19:40.000Z Content: --- title: "Stream with HTTP Requests · Tinybird Docs" theme-color: "#171612" description: "In this guide, you'll learn how to use the Tinybird Events API to ingest thousands of JSON messages per second." --- # Stream with HTTP Requests¶ In this guide, you'll learn how to use the Tinybird Events API to ingest thousands of JSON messages per second with HTTP Requests. For more details about the Events API endpoint, read the [Events API](https://www.tinybird.co/docs/docs/api-reference/events-api) docs. ## Setup: Create the target Data Source¶ First, you need to create an NDJSON Data Source. You can use the [API](https://www.tinybird.co/docs/docs/api-reference/datasource-api) , or simply drag & drop a file on the UI. Even though you can add new columns later on, you have to upload an initial file. The Data Source will be created and ordered based upon those initial values. As an example, upload this NDJSON file: {"date": "2020-04-05 00:05:38", "city": "New York"} ## Ingest from the browser: JavaScript¶ Ingesting from the browser requires making a standard POST request; see below for an example. Input your own Token and change the name of the target Data Source to the one you created. Check that your URL ( `const url` ) is the corresponding [URL for your region](https://www.tinybird.co/docs/docs/api-reference/overview#regions-and-endpoints).
##### Browser High-Frequency Ingest async function sendEvents(events){ const date = new Date(); events.forEach(ev => { ev.date = date.toISOString() }); const headers = { 'Authorization': 'Bearer TOKEN_HERE', }; const url = 'https://api.tinybird.co/' // you may be on a different host const rawResponse = await fetch(`${url}v0/events?name=hfi_multiple_events_js`, { method: 'POST', body: events.map(ev => JSON.stringify(ev)).join('\n'), headers: headers, }); const content = await rawResponse.json(); console.log(content); } sendEvents([ { 'city': 'Jamaica', 'action': 'view'}, { 'city': 'Jamaica', 'action': 'click'}, ]); Remember: Publishing your Admin Token on a public website is a security vulnerability. It is **highly recommended** that you [create a new Token](https://www.tinybird.co/docs/docs/concepts/auth-tokens#create-a-token) with a more restricted scope. ## Ingest from the backend: Python¶ Ingesting from the backend is a similar process to ingesting from the browser. Use the following Python snippet and replace the Auth Token and Data Source name, as in the example above. ##### Python High-Frequency Ingest import requests import json import datetime def send_events(events): params = { 'name': 'hfi_multiple_events_py', 'token': 'TOKEN_HERE', } for ev in events: ev['date'] = datetime.datetime.now().isoformat() data = '\n'.join([json.dumps(ev) for ev in events]) r = requests.post('https://api.tinybird.co/v0/events', params=params, data=data) print(r.status_code) print(r.text) send_events([ {'city': 'Pretoria', 'action': 'view'}, {'city': 'Pretoria', 'action': 'click'}, ]) ## Ingest from the command line: curl¶ The following curl snippet sends two events in the same request: ##### curl High-Frequency Ingest curl -i -d $'{"date": "2020-04-05 00:05:38", "city": "Chicago"}\n{"date": "2020-04-05 00:07:22", "city": "Madrid"}\n' -H "Authorization: Bearer $TOKEN" 'https://api.tinybird.co/v0/events?name=hfi_test' ## Add new columns from the UI¶ As you add extra information in the form of new JSON fields, the UI will prompt you to include those new columns on the Data Source. For instance, if you send a new event with an extra field: ##### curl High-Frequency Ingest curl -i -d '{"date": "2020-04-05 00:05:38", "city": "Chicago", "country": "US"}' -H "Authorization: Bearer $TOKEN" 'https://api.tinybird.co/v0/events?name=hfi_test' and then navigate to the UI's Data Source screen, you'll be asked if you want to add the new column: <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fhigh-frequency-ingestion-1.png&w=3840&q=75) Here, you'll be able to select the desired columns and adjust the types: <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fhigh-frequency-ingestion-2.png&w=3840&q=75) After you confirm the addition of the column, it will be populated by new events: <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fhigh-frequency-ingestion-3.png&w=3840&q=75) ## Error handling and retries¶ Read more about the possible [responses returned by the Events API](https://www.tinybird.co/docs/docs/api-reference/events-api). When using the Events API to send data to Tinybird, you can choose to 'fire and forget' by sending a POST request and ignoring the response. This is a common choice for non-critical data, such as tracking page hits if you're building Web Analytics, where some level of loss is acceptable. However, if you're sending data where you cannot tolerate events being missed, you must implement some error handling & retry logic in your application.
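The sections below explain when and how to retry. As a rough illustration of that pattern (a sketch, not an official snippet), curl's built-in retry behavior already approximates it: `--retry` only retries transient failures (timeouts and HTTP 408, 429, 500, 502, 503, and 504 responses) and doubles the wait between attempts, which matches the exponential backoff described below. It reuses the `hfi_test` Data Source from the examples above and adds `wait=true`, explained in the next section.

##### Example: curl with retries and exponential backoff
# Retries up to 5 times on transient errors, waiting 1s, 2s, 4s, ... between attempts,
# and gives up once 120 seconds of retrying have passed
curl \
  --retry 5 \
  --retry-max-time 120 \
  -H "Authorization: Bearer $TOKEN" \
  -d '{"date": "2020-04-05 00:05:38", "city": "Chicago"}' \
  'https://api.tinybird.co/v0/events?name=hfi_test&wait=true'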
### Wait for acknowledgement¶ When you send data to the Events API, you'll usually receive a `HTTP202` response, which indicates that the request was successful. However, it's important to note that this response only indicates that the Events API successfully **accepted** your HTTP request. It does not confirm that the data has been **committed** into the underlying database. Using the `wait` parameter with your request will ask the Events API to wait for acknowledgement that the data you sent has been committed into the underlying database. If you use the `wait` parameter, you will receive a `HTTP200` response that confirms data has been committed. To use this, your Events API request should include `wait` as a query parameter, with a value of `true`. For example: https://api.tinybird.co/v0/events?wait=true It is good practice to log your requests to, and responses from, the Events API. This will help give you visibility into any failures for reporting or recovery. ### When to retry¶ Failures are indicated by a `HTTP4xx` or `HTTP5xx` response. It's recommended to only implement automatic retries for `HTTP5xx` responses, which indicate that a retry might be successful. `HTTP4xx` responses should be logged and investigated, as they often indicate issues that cannot be resolved by simply retrying with the same request. For HTTP2 clients, you may receive the `0x07 GOAWAY` error. This indicates that there are too many alive connections. It is safe to recreate the connection and retry these errors. ### How to retry¶ You should aim to retry any requests that fail with a `HTTP5xx` response. In general, you should retry these requests 3-5 times. If the failure persists beyond these retries, log the failure, and attempt to store the data in a buffer to resend later (for example, in Kafka, or a file in S3). It's recommended to use an exponential backoff between retries. This means that, after a retry fails, you should increase the amount of time you wait before sending the next retry. If the issue causing the failure is transient, this gives you a better chance of a successful retry. Be careful when calculating backoff timings, so that you do not run into memory limits on your application. ## Next steps¶ - Learn more about[ the schema](https://www.tinybird.co/docs/docs/ingest/overview#create-your-schema) and why it's important. - Ingested your data and ready to go? Start[ querying your Data Sources](https://www.tinybird.co/docs/docs/query/overview) and build some Pipes! --- URL: https://www.tinybird.co/docs/guides/ingesting-data/ingest-ndjson-data Last update: 2024-11-04T12:15:35.000Z Content: --- title: "Ingest NDJSON data · Tinybird Docs" theme-color: "#171612" description: "In this guide you'll learn how to ingest unstructured data, like NDJSON to Tinybird." --- # Ingest NDJSON data¶ In this guide you'll learn how to ingest unstructured NDJSON data into Tinybird. ## Overview¶ A common scenario is having a document-based database, using nested records on your data warehouse or generated events in JSON format from a web application. For cases like this, the process used to be: Export the `JSON` objects as if they were a `String` in a CSV file, ingest them to Tinybird, and then use the built-in `JSON` functions to prepare the data for real-time analytics as it was being ingested. But this is not needed anymore, as Tinybird now accepts JSON imports by default! 
Although Tinybird allows you to ingest `.json` and `.ndjson` files, it only accepts the [Newline Delimited JSON](https://github.com/ndjson/ndjson-spec) as content. Each line must be a valid JSON object and every line has to end with `\n` . The API will return an error if each line isn't a valid JSON value. ## Ingest to Tinybird¶ This guide will use an example scenario including this [100k rows NDJSON file](https://storage.googleapis.com/tinybird-assets/datasets/guides/how-to-ingest-ndjson-data/events_100k.ndjson) , which contains events from an ecommerce website with different properties. ### With the API¶ Ingesting NDJSON files using the API is similar to the CSV process. There are only two differences to be managed in the query parameters: - ** format** : It has to be "ndjson" - ** schema** : Usually, the name and the type are provided for every column but in this case it needs an additional property, called the `jsonpath` (see the[ JSONPath syntax](https://www.tinybird.co/docs/docs/guides/ingesting-data/ingest-ndjson-data#jsonpaths) ). Example:* "schema=event_name String `json:$.event.name`"* You can guess the `schema` by first calling the [Analyze API](https://www.tinybird.co/docs/docs/api-reference/analyze-api) . It's a very handy way to not have to remember the `schema` and `jsonpath` syntax: Just send a sample of your file and the Analyze API will describe what's inside (columns, types, schema, a preview, etc.). ##### Analyze API request curl \ -H "Authorization: Bearer $TOKEN" \ -G -X POST "https://api.tinybird.co/v0/analyze" \ --data-urlencode "url=https://storage.googleapis.com/tinybird-assets/datasets/guides/how-to-ingest-ndjson-data/events_100k.ndjson" Take the `schema` attribute in the response and either use it right away in the next API request to create the Data Source, or modify as you wish: Column names, types, remove any columns, etc. ##### Analyze API response excerpt { "analysis": { "columns": [ { "path": "$.date", "recommended_type": "DateTime", "present_pct": 1, "name": "date" }, ... ... ... "schema": "date DateTime `json:$.date`, event LowCardinality(String) `json:$.event`, extra_data_city LowCardinality(String) `json:$.extra_data.city`, product_id String `json:$.product_id`, user_id Int32 `json:$.user_id`, extra_data_term Nullable(String) `json:$.extra_data.term`, extra_data_price Nullable(Float64) `json:$.extra_data.price`" }, "preview": { "meta": [ { "name": "date", "type": "DateTime" }, ... ... ... } Now you've analyzed the file, create the Data Source. In the example below, you will ingest the 100k rows NDJSON file only taking 3 columns from it: date, event, and product_id. The `jsonpath` allows Tinybird to match the Data Source column with the JSON property path: ##### Ingest NDJSON to Tinybird TOKEN= curl \ -H "Authorization: Bearer $TOKEN" \ -X POST "https://api.tinybird.co/v0/datasources" \ -G --data-urlencode "name=events_example" \ -G --data-urlencode "mode=create" \ -G --data-urlencode "format=ndjson" \ -G --data-urlencode "schema=date DateTime \`json:$.date\`, event String \`json:$.event\`, product_id String \`json:$.product_id\`" \ -G --data-urlencode "url=https://storage.googleapis.com/tinybird-assets/datasets/guides/how-to-ingest-ndjson-data/events_100k.ndjson" ### With the Command Line Interface¶ There are no changes in the CLI in order to ingest an NDJSON file. 
Just run the command you are used to with CSV: ##### Generate Data Source schema from NDJSON tb datasource generate https://storage.googleapis.com/tinybird-assets/datasets/guides/how-to-ingest-ndjson-data/events_100k.ndjson Once it's finished, it automatically generates a .datasource file with all the columns, their proper types, and `jsonpaths` . For example: ##### Generated Data Source schema DESCRIPTION generated from https://storage.googleapis.com/tinybird-assets/datasets/guides/how-to-ingest-ndjson-data/events_100k.ndjson SCHEMA > date DateTime `json:$.date`, event String `json:$.event`, extra_data_city String `json:$.extra_data.city`, product_id String `json:$.product_id`, user_id Int32 `json:$.user_id`, extra_data_price Nullable(Float64) `json:$.extra_data.price`, extra_data_term Nullable(String) `json:$.extra_data.term` You can then push that .datasource file to Tinybird and start using it in your Pipes or append new data to it: ##### Push Data Source to Tinybird and append new data tb push events_100k.datasource tb datasource append events_100k https://storage.googleapis.com/tinybird-assets/datasets/guides/how-to-ingest-ndjson-data/events_100k.ndjson ### With the UI¶ To create a new Data Source from an NDJSON file, navigate to your Workspace and select the **Add Data Source** button. In the modal, select "File upload" and upload the NDJSON/JSON file or drag and drop it onto the modal. You can also provide a URL, such as the one used in this guide. Confirm you're happy with the schema and data, and select "Create Data Source". Once your data is imported, you will have a Data Source with your JSON data structured in columns, which are easy to transform and consume in any Pipe. Ingest just the columns you need. After exploring your data, always remember to create a Data Source that only has the columns needed for your analyses. That will help make your ingestion, materialization, and your real-time data project faster. ## When new JSON fields are added¶ Tinybird can automatically detect when a new JSON property is added as new data is ingested. Using the Data Source import example from the previous paragraph, you can include a new property to know the origin country of the event, complementing the city. Append new JSON data with the extra property ( [using this example file](https://storage.googleapis.com/tinybird-assets/datasets/guides/how-to-ingest-ndjson-data/events_with_country.ndjson) ). After finishing the import, open the Data Source modal and confirm that a new blue banner appears, warning you about the new properties detected in the last ingestion: <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fhow-to-ingest-ndjson-data-4.png&w=3840&q=75) <-figcaption-> Automatically suggesting new columns Once you accept viewing those new columns, the application will allow you to add them and change the column types and names, as it did in the preview step during the import: <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fhow-to-ingest-ndjson-data-5.png&w=3840&q=75) <-figcaption-> Accepting new columns From now on, whenever you append new data where the new column is defined and has a value, it will appear in the Data Source and will be available to be consumed from your Pipes: <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fhow-to-ingest-ndjson-data-6.png&w=3840&q=75) <-figcaption-> New column receiving data Tinybird automatically detects if there are new columns available.
If you ingest data periodically into your NDJSON Data Source (from a file or a Kafka connection) and new columns are coming in, you will see a blue dot in the Data Source icon that appears in the sidebar (see Mark 1 below). Click on the Data Source to view the new columns and add them to the schema, following the steps above. <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fhow-to-ingest-ndjson-data-7.png&w=3840&q=75) <-figcaption-> New columns detected, notified by a blue dot ## JSONPaths¶ This section applies to both NDJSON **and** Parquet data. When creating a Data Source using NDJSON/Parquet data, for each column in the `schema` you have to provide a JSONPath using the [JSONPath syntax](https://goessner.net/articles/JsonPath). This is easy for simple schemas, but it can get complex if you have nested fields and arrays. For example, given this NDJSON object: { "field": "test", "nested": { "nested_field": "bla" }, "an_array": [1, 2, 3], "a_nested_array": { "nested_array": [1, 2, 3] } } The schema would be something like this: ##### schema with jsonpath a_nested_array_nested_array Array(Int16) `json:$.a_nested_array.nested_array[:]`, an_array Array(Int16) `json:$.an_array[:]`, field String `json:$.field`, nested_nested_field String `json:$.nested.nested_field` Tinybird's JSONPath syntax support has some limitations: it supports nested objects at multiple levels, but it supports nested arrays only at the first level, as in the example above. To ingest and transform more complex JSON objects, use the root object JSONPath syntax as described in the next section. ### JSONPaths and the root object¶ Defining a column as "column_name String `json:$` " in the Data Source schema will ingest each line in the NDJSON file as a String in the `column_name` column. This is very useful in some scenarios, for example when you have nested arrays, such as polygons: ##### Nested arrays { "id": 49518, "polygon": [ [ [30.471785843000134, -1.066836591999916], [30.463855835000118, -1.075127054999925], [30.456156047000093, -1.086082457999908], [30.453003785000135, -1.097347919999962], [30.456311076000134, -1.108096617999891], [30.471785843000134, -1.066836591999916] ] ] } You can parse the `id` and then add the whole JSON string to the root column to extract the polygon with [JSON functions](https://clickhouse.com/docs/en/sql-reference/functions/json-functions/). ##### schema definition id String `json:$.id`, root String `json:$` When you have complex objects: ##### Complex JSON objects { "elem": { "payments": [ { "users": [ { "user_id": "Admin_XXXXXXXXX", "value": 4 } ] } ] } } Or if you have variable schema ("schemaless") events: ##### Schemaless events { "user_id": "1", "data": { "whatever": "bla", "whatever2": "bla" } } { "user_id": "1", "data": [1, 2, 3] } You can simply put the whole event in the root column and parse it as needed: ##### schema definition root String `json:$` ## Next steps¶ - Learn more about the[ Data Sources API](https://www.tinybird.co/docs/docs/ingest/datasource-api) . - Want to schedule your data ingestion? Read the docs on[ cron and GitHub Actions](https://www.tinybird.co/docs/docs/guides/ingesting-data/scheduling-with-github-actions-and-cron) . Tinybird is not affiliated with, associated with, or sponsored by ClickHouse, Inc. ClickHouse® is a registered trademark of ClickHouse, Inc.
--- URL: https://www.tinybird.co/docs/guides/ingesting-data/ingest-with-estuary Last update: 2024-08-22T15:53:45.000Z Content: --- title: "Ingest with Estuary · Tinybird Docs" theme-color: "#171612" description: "In this guide, you'll learn how to use Estuary to push data streams to Tinybird." --- # Ingest with Estuary¶ In this guide, you'll learn how to use Estuary to push data streams to Tinybird. [Estuary](https://estuary.dev/) is a real-time ETL tool that allows you capture data from a range of source, and push it to a range of destinations. Using Estuary's Dekaf, you can connect Tinybird to Estuary as if it was a Kafka broker - meaning you can use Tinybird's native Kafka Connector to consume data from Estuary. [Read more about Estuary Dekaf.](https://docs.estuary.dev/guides/dekaf_reading_collections_from_kafka/#connection-details) ## Prerequisites¶ - An Estuary account & collection - A Tinybird account & Workspace ## Connecting to Estuary¶ In Estuary, create a new token to use for the Tinybird connection. You can do this from the Estuary Admin Dashboard. In your Tinybird Workspace, create a new Data Source and use the [Kafka Connector](https://www.tinybird.co/docs/docs/ingest/kafka). To configure the connection details, use the following settings (these can also be found in the [Estuary Dekaf docs](https://docs.estuary.dev/guides/dekaf_reading_collections_from_kafka/#connection-details) ). - Bootstrap servers: `dekaf.estuary.dev` - SASL Mechanism: `PLAIN` - SASL Username: `{}` - SASL Password: Estuary Refresh Token (Generate your token in the Estuary Admin Dashboard) Tick the `Decode Avro messages with Schema Register` box, and use the following settings: - URL: `https://dekaf.estuary.dev` - Username: `{}` - Password: The same Estuary Refresh Token as above Click **Next** and you will see a list of topics. These topics are the collections you have in Estuary. Select the collection you want to ingest into Tinybird, and click **Next**. Configure your consumer group as needed. Finally, you will see a preview of the Data Source schema. Feel free to make any modifications as required, then click **Create Data Source**. This will complete the connection with Estuary, and new data from the Estuary collection will arrive in your Tinybird Data Source in real-time. --- URL: https://www.tinybird.co/docs/guides/ingesting-data/iterate-a-data-source Last update: 2024-11-13T15:42:17.000Z Content: --- title: "Iterate a Data Source · Tinybird Docs" theme-color: "#171612" description: "In this guide, you'll learn how to change the schema of a Data Source without using version control." --- # Iterating a Data Source (change or update schema)¶ Creating a Data Source for the first time is really straightforward. However, when iterating data projects, sometimes you need to edit the Data Source schema. This can be challenging when the data is already in production, and there are a few different scenarios. With Tinybird you can easily add more columns, but other operations (such as changing the sorting key or changing a column type) require you to fully recreate the Data Source. This guide is for Workspaces that **aren't** using version control. If your Workspace is linked using the Git<>Tinybird integration, see the repo of [common use cases for iterating when using version control](https://github.com/tinybirdco/use-case-examples). ## Overview¶ This guide walks through the iteration process for 4 different scenarios. 
Pick the one that's most relevant for you: - ** Scenario 1: I'm not in production** - ** Scenario 2: I can stop/pause data ingestion** - ** Scenario 3: I need to change a Materialized View & I can't stop data ingest** - ** Scenario 4: It's too complex and I can't figure it out** ## Prerequisites¶ You'll need to be familiar with the Tinybird CLI to follow along with this guide. Never used it before? [Read the docs here](https://www.tinybird.co/docs/docs/cli/quick-start). All of the guide examples have the same setup - a Data Source with a `nullable(Int64)` column that the user wants to change to a `Int64` for performance reasons. This requires editing the schema and, to keep the existing data, replacing any occurrences of `NULL` with a number, like `0`. ## Scenario 1: I'm not in production¶ This scenario assumes that you are not in production and can accept losing any data you have already ingested. If you are not in production, and you can accept losing data, use the Tinybird CLI to pull your Data Source down to a file, modify it, and push it back into Tinybird. Begin with `tb pull` to pull your Tinybird resources down to files. Then, modify the .datasource file for the Data Source you want to change. When you're finished modifying the Data Source, delete the existing Data Source from Tinybird, either in the CLI with `tb datasource rm` or through the UI. Finally, push the new Data Source to Tinybird with `tb push`. See a screencast example: [https://www.youtube.com/watch?v=gzpuQfk3Byg](https://www.youtube.com/watch?v=gzpuQfk3Byg). ## Scenario 2: I can stop data ingestion¶ This scenario assumes that you have stopped all ingestion into the affected Data Sources. ### 1. Use the CLI to pull your Tinybird resources down into files¶ Use `tb pull --auto` to pull your Tinybird resources down into files. The `--auto` flag will organize the resources into directories, with your Data Sources being places into a `datasources` directory. ### 2. Create the new Data Source¶ Create a copy of the Data Source file that you want to modify and rename it. For example, `datasources/original_ds.datasource` -> `datasources/new_ds.datasource`. Modify the new Data Source schema in the file to make the changes you need. Now push the new Data Source to Tinybird with `tb push datasources/new_ds.datasource`. ### 3. Backfill the new Data Source with existing data¶ If you want to move the existing data from the original Data Source to the new Data Source, use a Copy Pipe or a Pipe that materializes data into the new Data Source. #### 3.1 Recommended option: Copy Pipe¶ A [Copy Pipe](https://www.tinybird.co/docs/docs/publish/copy-pipes) is a Pipe used to copy data from one Data Source to another Data Source. This method is useful for one-time moves of data or scheduled executions. Move your data using the following Copy Pipe, paying particular attention to the `TYPE`, `TARGET_DATASOURCE` and `COPY_SCHEDULE` configs at the end: NODE copy_node SQL > SELECT * EXCEPT (my_nullable_column), toInt64(coalesce(my_nullable_column,0)) as my_column -- adjust query to your changes FROM original_ds TYPE COPY TARGET_DATASOURCE new_ds COPY_SCHEDULE @on-demand Push it to the Workspace: tb push pipes/temp_copy.pipe And run the Copy: tb pipe copy run temp_copy When it's done, remove the Pipe: tb pipe rm temp_copy #### 3.2 Alternative option: A Populate¶ Alternatively, you can create a Materialized View Pipe and run a Populate to transform data from the original schema into the modified schema of the new Data Source. 
Do this using the following Pipe, paying particular attention to the `TYPE` and `DATASOURCE` configs at the end: NODE temp_populate SQL > SELECT * EXCEPT (my_nullable_column), toInt64(coalesce(my_nullable_column,0)) as my_column FROM original_ds TYPE materialized DATASOURCE new_ds Then push the Pipe to Tinybird, passing the `--populate` flag to force it to immediately start processing data: tb push pipes/temp.pipe --populate --wait When it's done, remove the Pipe: tb pipe rm temp At this point, review your new Data Source and ensure that everything is as expected. ### 4. Delete the original Data Source and rename the new Data Source¶ You can now go to the UI, delete the original Data Source, and rename the new Data Source to use the name of the original Data Source. By renaming the new Data Source to use the same name as the original Data Source, any SQL in your Pipes or Endpoints that referred to the original Data Source will continue to work. If you have a Materialized View that depends on the Data Source, you must unlink the Pipe that is materializing data before removing the Data Source. You can modify and reconnect your Pipe after completing the steps above. ## Scenario 3: I need to change a Materialized View & I can't interrupt service¶ This scenario assumes you want to modify a Materialized View that is actively receiving data and serving API Endpoints, *and* you want to avoid service downtime. ### Before you begin¶ Because this is a complex scenario, let's introduce some names for the example resources to make it a bit easier to follow along. Let's assume that you have a Data Source that is actively receiving data; let's call this the `Landing Data Source` . From the `Landing Data Source` , you have a Pipe that is writing to a Materialized View; let's call these the `Materializing Pipe` and `Materialized View Data Source` respectively. ### 1. Use the CLI to pull your Tinybird resources down into files¶ Use `tb pull --auto` to pull your Tinybird resources down into files. The `--auto` flag organizes the resources into directories, with your Data Sources being places into a `datasources` directory. ### 2. Duplicate the Materializing Pipe & Materialized View Data Source¶ Duplicate the `Materializing Pipe` & `Materialized View Data Source`. For example: pipes/original_materializing_pipe.pipe -> pipes/new_materializing_pipe.pipe datasources/original_materialized_view_data_source.datasource -> datasources/new_materialized_view_data_source.datasource Modify the new files to change the schema as needed. Lastly, you'll need to add a `WHERE` clause to the new `Materializing Pipe` . This clause is going to filter out old rows, so that the `Materializing Pipe` is only materializing rows newer than a specific time. For the purpose of this guide, let's call this the `Future Timestamp` . Do **not** use variable time functions for this timestamp (e.g. `now()` ). Pick a static time that is in the near future; five to fifteen minutes should be enough. The condition should be `>` , for example: WHERE … AND my_timestamp > "2024-04-12 13:15:00" ### 3. Push the Materializing Pipe & Materialized View Data Source¶ Push the `Materializing Pipe` & `Materialized View Data Source` to Tinybird: tb push datasources/new_materialized_view_data_source.datasource tb push pipes/new_materializing_pipe.pipe ### 4. Create a new Pipe to transform & materialize the old schema to the new schema¶ You now have two Materialized Views: the one with the original schema, and the new one with the new schema. 
You need to take the data from the original Materialized View, transform it into the new schema, and write it into the new Materialized View. To do this, create a new Pipe. In this guide, it's called the `Transform Pipe` . In your `Transform Pipe` create the SQL `SELECT` logic that transforms the old schema to the new schema. Lastly, your `Transform Pipe` should have a `WHERE` clause that only selects rows that are **older** than our `Future Timestamp` . The condition should be `<=` , for example: WHERE … AND my_timestamp <= "2024-01-12 13:00:00" ### 5. Wait until after the Future Timestamp, then push & populate with the Transform Pipe¶ Now, to avoid any potential for creating duplicates or missing rows, wait until after the `Future Timestamp` time has passed. This means that there should no longer be any rows arriving that have a timestamp that is **older** than the `Future Timestamp`. Then, push the `Transform Pipe` and force a populate: tb push pipes/new_materializing_pipe.pipe --populate --wait ### 6. Wait for the populate to finish, then change your API Endpoint to read from the new Materialized View Data Source¶ Wait until the previous command has completed to ensure that all data from the original Materialized View has been written to the new `Materialized View Data Source`. When it is complete, modify the API Endpoint that is querying the old Materialized View to query from the new `Materialized View Data Source`. For example: SELECT * from original_materialized_view_data_source Would become: SELECT * from new_materialized_view_data_source ### 7. Test, then clean up old resources¶ Test that your API Endpoint is serving the correct data. If everything looks good, you can tidy up your Workspace by deleting the original Materialized View & the new `Transform Pipe`. ## Scenario 4: It's too complex and I can't figure it out¶ If you are dealing with a very complex scenario, don't worry! Contact Tinybird support ( [support@tinybird.co](https://www.tinybird.co/docs/mailto:support@tinybird.co) ). ## Next steps¶ - Got your schema sorted and ready to make some queries? Understand[ how to work with time](https://www.tinybird.co/docs/docs/guides/querying-data/working-with-time) . - Learn how to[ monitor your ingestion](https://www.tinybird.co/docs/docs/guides/monitoring/monitor-your-ingestion) . --- URL: https://www.tinybird.co/docs/guides/ingesting-data/recover-from-quarantine Last update: 2024-11-07T09:52:34.000Z Content: --- title: "Recover data in quarantine · Tinybird Docs" theme-color: "#171612" description: "Learn how to recover data from quarantine, and how to fix common errors that cause data to be sent to quarantine." --- # Recover data from quarantine¶ In this guide you'll learn about the quarantine Data Source, and how to use it to detect and fix errors on your Data Sources. The quarantine Data Source is named `{datasource_name}_quarantine` and can be queried using Pipes like a regular Data Source. ## Prerequisites¶ This guide assumes you're familiar with the concept of the [quarantine Data Source](https://www.tinybird.co/docs/docs/concepts/data-sources#the-quarantine-data-source). ## Example scenario¶ This guide uses the Tinybird CLI, but all steps can be performed in the UI as well. ### Setup¶ This example uses an NDJSON Data Source that looks like this: { "store_id": 1, "purchase": { "product_name": "shoes", "datetime": "2022-01-05 12:13:14" } } But you could use any ingestion method. 
Let's say you generate a Data Source file from this JSON snippet, push the Data Source to Tinybird, and ingest the JSON as a single row: ##### Push the NDJSON_DS Data Source echo '{"store_id":1,"purchase":{"product_name":"shoes","datetime":"2022-01-05 12:13:14"}}' > ndjson_ds.ndjson tb datasource generate ndjson_ds.ndjson tb push --fixtures datasources/ndjson_ds.datasource tb sql "select * from ndjson_ds" The schema generated from the JSON will look like this: ##### NDJSON_DS.DATASOURCE DESCRIPTION > Generated from ndjson_ds.ndjson SCHEMA > purchase_datetime DateTime `json:$.purchase.datetime`, purchase_product_name String `json:$.purchase.product_name`, store_id Int16 `json:$.store_id` At this point, you can check in the UI and confirm that your Data Source has been created and the row ingested. Hooray! <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fguides-quarantine-1.png&w=3840&q=75) <-figcaption-> Data Source details can be accessed from your Sidebar ### Add data that doesn't match the schema¶ Now, if you append some rows that don't match the Data Source schema, these rows will end up in the quarantine Data Source. ##### Append rows with wrong schema echo '{"store_id":2,"purchase":{"datetime":"2022-01-05 12:13:14"}}\n{"store_id":"3","purchase":{"product_name":"shirt","datetime":"2022-01-05 12:13:14"}}' > ndjson_quarantine.ndjson tb datasource append ndjson_ds ndjson_quarantine.ndjson tb sql "select * from ndjson_ds_quarantine" This time, if you check in the UI, you'll see a notification warning you about quarantined rows: <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fguides-quarantine-2.png&w=3840&q=75) <-figcaption-> The quarantine Data Source is always accessible (if it contains any rows) from the Data Source modal window In the Data Source view you'll find the Log tab, which shows you details about all operations performed on a Data Source. If you're following the steps of this guide, you should see a row with `event_type` as **append** and `written_rows_quarantine` as **2**. From the quarantine warning notification, navigate to the quarantine Data Source page, and review the problematic rows: <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fguides-quarantine-3.png&w=3840&q=75) <-figcaption-> Within the quarantine view you can see both a summary of errors and the rows that have failed The **Errors** view shows you a summary of all the errors and the number of occurrences for each of them, so you can prioritize fixing the most common ones. The **Rows** view shows you all the rows that have failed, so you can investigate further. ## Fix quarantine errors¶ There are generally three ways of fixing quarantine errors: ### 1. Modify your data producer¶ Usually, the best solution is to fix the problem at the source. This means updating the applications or systems that are producing the data, before they send it to Tinybird. The benefit of this is that you don't need to do additional processing to normalize the data after it has been ingested, which helps to save cost and reduce overall latency. However, it can come at the cost of having to push changes into a production application, which can be complex or have side effects on other systems. ### 2. Modify the Data Source schema¶ Often, the issue that causes a row to end up in quarantine is a mismatch of data types. A simple solution is to [modify the Data Source schema](https://www.tinybird.co/docs/docs/guides/ingesting-data/iterate-a-data-source) to accept the new type.
For example, if an application is starting to send integers that are too large for `Int8` , you might update the schema to use `Int16`. Avoid Nullable columns, as they can have significantly worse performance. Instead, send alternative values like `0` for any `Int` type, or an empty string for a `String` type. ### 3. Transform data with Pipes and Materialized Views¶ This is one of the most powerful capabilities of Tinybird. If you are not able to modify the data producer, you can apply a transformation to the erroring columns at ingestion time and materialize the result into another Data Source. You can read more about this in the [Materialized Views docs](https://www.tinybird.co/docs/docs/publish/materialized-views/overview). ## Recover rows from quarantine¶ The quickest way to recover rows from quarantine is to fix the cause of the errors and then re-ingest the data. However, that is not always possible. You can recover rows from the quarantine using a recovery Pipe and the Tinybird API: ### Create a recovery Pipe¶ You can create a Pipe to select the rows from the quarantine Data Source and transform them into the appropriate schema. The previous example showed rows where the `purchase_product_name` contained `null` or the `store_id` contained a `String` rather than an `Int16`: <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fguides-quarantine-5.png&w=3840&q=75) <-figcaption-> Remember that quarantined columns are Nullable(String) All columns in a quarantine Data Source are `Nullable()` , which means that you must use the [coalesce()](https://clickhouse.com/docs/en/sql-reference/functions/functions-for-nulls/#coalesce) function if you want to transform them into a non-nullable type. This example uses coalesce to set a default value of `DateTime(0)`, `''` , or `0` for `DateTime`, `String` and `Int16` types respectively. Additionally, all columns in a quarantine Data Source are stored as `String` . This means that you must specifically transform any non-String column into its desired type as part of the recovery Pipe. This example transforms the `purchase_datetime` and `store_id` columns to `DateTime` and `Int16` types respectively. The quarantine Data Source contains additional meta-columns `c__error_column`, `c__error`, `c__import_id` , and `insertion_date` with information about the errors and the rows, so you should not use `SELECT *` to recover rows from quarantine. The following SQL transforms the quarantined rows from this example into the original Data Source schema: SELECT coalesce( parseDateTimeBestEffortOrNull( purchase_datetime ), toDateTime(0) ) as purchase_datetime, coalesce( purchase_product_name, '' ) as purchase_product_name, coalesce( coalesce( toInt16(store_id), toInt16(store_id) ), 0 ) as store_id FROM ndjson_ds_quarantine <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fguides-quarantine-6.png&w=3840&q=75) <-figcaption-> Recover endpoint Just as with any other Pipe, you can publish the results of this recovery Pipe as an API Endpoint. 
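If you manage your Tinybird resources as files, the same query could be saved as a Pipe file and pushed with the CLI; a minimal sketch follows (the Node name is illustrative, and the Pipe name matches the `quarantine_recover` Endpoint used in the next step): ##### pipes/quarantine_recover.pipe (sketch)
NODE recover_rows
SQL >
    SELECT
        coalesce(parseDateTimeBestEffortOrNull(purchase_datetime), toDateTime(0)) AS purchase_datetime,
        coalesce(purchase_product_name, '') AS purchase_product_name,
        coalesce(toInt16(store_id), 0) AS store_id
    FROM ndjson_ds_quarantine
Push it with `tb push pipes/quarantine_recover.pipe` and publish the Node as an API Endpoint, just as you would with any other Pipe.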
### Ingest the fixed rows and truncate quarantine¶ You can then use the Tinybird CLI to append the fixed data back into the original Data Source, by hitting the API Endpoint published from the recovery Pipe: tb datasource append To avoid dealing with JSONPaths, you can hit the recovery Pipe's CSV endpoint: tb datasource append ndjson_ds https://api.tinybird.co/v0/pipes/quarantine_recover.csv?token= Check that your Data Source now has the fixed rows, either in the UI, or from the CLI using: tb sql "select * from ndjson_ds" Finally, truncate the quarantine Data Source to clear out the recovered rows, either in the UI, or from the CLI using: tb datasource truncate ndjson_ds_quarantine --yes You should see that your Data Source now has all of the rows, and the quarantine notification has disappeared. <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fguides-quarantine-7.png&w=3840&q=75) <-figcaption-> Data Source with the recovered rows and truncated quarantine If your quarantine has too many rows, you may need to add pagination based on the `insertion_date` and/or `c__import_id` columns. If you're using a Kafka Data Source, remember to add the Kafka metadata columns. ## Recover rows from quarantine with CI/CD¶ When you connect your Workspace to Git and it becomes read-only you want all your workflows to go through CI/CD. This is how you recover rows from quarantine in your data project using Git and automating the workflow. ### Prototype the process in a Branch¶ This step is optional, but it's good practice. When you need to perform a change to your data project and it's read-only, you can create a new Branch and prototype the changes there, then later bring them to Git. To test this process: 1. Create a Branch 2. Ingest a file that creates rows in quarantine 3. Prototype a Copy Pipe 4. Run it 5. Validate data is recovered ### A practical example with Git¶ There is an additional guide showing how to [recover quarantine rows from Git using CI/CD](https://github.com/tinybirdco/use-case-examples/tree/main/recover_data_from_quarantine) , where the data project is the [Web Analytics Starter Kit](https://github.com/tinybirdco/web-analytics-starter-kit). When your rows end up in quarantine, you receive an e-mail like this: <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fgit-quarantine.jpg&w=3840&q=75) In this additional example, the issue is the `timestamp` column - instead of being a DateTime, it's String Unix time, so the rows can't be properly ingested. 
{"timestamp":"1697393030","session_id":"b7b1965c-620a-402a-afe5-2d0eea0f9a34","action":"page_hit","version":"1","payload":"{ \"user-agent\":\"Mozilla\/5.0 (Linux; Android 13; SM-A102U) AppleWebKit\/537.36 (KHTML, like Gecko) Chrome\/106.0.5249.118 Mobile Safari\/537.36\", \"locale\":\"en-US\", \"location\":\"FR\", \"referrer\":\"https:\/\/www.github.com\", \"pathname\":\"\/pricing\", \"href\":\"https:\/\/www.tinybird.co\/pricing\"}"} To convert the `timestamp` values in quarantine to a `DateTime` , you'd build a Copy Pipe like this: NODE copy_quarantine SQL > SELECT toDateTime(fromUnixTimestamp64Milli(toUInt64(assumeNotNull(timestamp)) * 1000)) timestamp, assumeNotNull(session_id) session_id, assumeNotNull(action) action, assumeNotNull(version) version, assumeNotNull(payload) payload FROM analytics_events_quarantine TYPE COPY TARGET_DATASOURCE analytics_events To test the changes, you'd need to do a custom deployment: #!/bin/bash # use set -e to raise errors for any of the commands below and make the CI pipeline to fail set -e tb datasource append analytics_events datasources/fixtures/analytics_events_errors.ndjson tb deploy tb pipe copy run analytics_events_quarantine_to_final --wait --yes sleep 10 First append a sample of the quarantined rows, then deploy the Copy Pipe, and finally run the copy operation. Once changes have been deployed in a test Branch, you can write data quality tests to validate the rows are effectively being copied: - analytics_events_quarantine: max_bytes_read: null max_time: null sql: | SELECT count() as c FROM analytics_events_quarantine HAVING c <= 0 - copy_is_executed: max_bytes_read: null max_time: null sql: | SELECT count() c, sum(rows) rows FROM tinybird.datasources_ops_log WHERE datasource_name = 'analytics_events' AND event_type = 'copy' HAVING rows != 74 and c = 1 `analytics_events_quarantine` checks that effectively some of the rows are in quarantine while `copy_is_executed` tests that the rows in quarantine have been copied to the `analytics_events` Data Source. Lastly, you need to deploy the Branch: # use set -e to raise errors for any of the commands below and make the CI pipeline to fail set -e tb deploy tb pipe copy run analytics_events_quarantine_to_final --wait You can now merge the Pull Request, the Copy Pipe will be deployed to the Workspace and the copy operation will be executed ingesting all rows in quarantine. After that you can optionally truncate the quarantine Data Source using `tb datasource truncate analytics_events_quarantine`. This is a [working Pull Request](https://github.com/tinybirdco/use-case-examples/pull/6) with all the steps mentioned above. ## Next steps¶ - Make sure you're familiar with the[ challenges of backfilling real-time data](https://www.tinybird.co/docs/docs/production/backfill-strategies#the-challenge-of-backfilling-real-time-data) - Learn how to[ monitor your ingestion](https://www.tinybird.co/docs/docs/guides/monitoring/monitor-your-ingestion) . Tinybird is not affiliated with, associated with, or sponsored by ClickHouse, Inc. ClickHouse® is a registered trademark of ClickHouse, Inc. --- URL: https://www.tinybird.co/docs/guides/ingesting-data/replace-and-delete-data Last update: 2024-10-28T11:06:14.000Z Content: --- title: "Replace and delete data · Tinybird Docs" theme-color: "#171612" description: "Update & delete operations are common in transactional databases over operational data, but sometimes you also need to make these changes on your analytical data in Tinybird." 
--- # Replace and delete data in your Tinybird Data Sources¶ Update & delete operations are common in transactional databases over operational data, but sometimes you also need to make these changes on your analytical data in Tinybird. Sometimes, you need to delete or replace some of your data in Tinybird. Perhaps there was a bug in your application, a transient error in your operational database, or simply an evolution of requirements due to product or regulatory changes. It is **not safe** to replace data in the partitions where you are actively ingesting data. You may lose the data inserted during the process. While real-time analytical databases like Tinybird are optimized for SELECTs and INSERTs, Tinybird fully supports replacing & deleting data. The the tricky complexities of data replication, partition management and mutations rewriting are abstracted away, allowing you to focus on your data engineering flows and not the internals of real-time analytical databases. This guide will show you, with different examples, how to selectively delete or update data in Tinybird using the REST API. You can then adapt these processes for your own needs. All operations on this page require a Token with the correct scope. In the code snippets, replace `` by a Token whose [scope](https://www.tinybird.co/docs/docs/api-reference/token-api) is `DATASOURCES:CREATE` or `ADMIN`. ## Delete data selectively¶ To delete data that is within a condition, send a POST request to the [Data Sources /delete API](https://www.tinybird.co/docs/docs/api-reference/datasource-api#post--v0-datasources-(.+)-delete) , providing the name of one of your Data Sources in Tinybird and a `delete_condition` parameter, which is an SQL expression filter. Delete operations do not automatically cascade to downstream Materialized Views. You will need to perform separate delete operations on Materialized Views. Imagine you have a Data Source called `events` and you want to remove all the transactions for November 2019. You'd send a POST request like this: - CLI - API ##### Delete data selectively tb datasource delete events --sql-condition "toDate(date) >= '2019-11-01' and toDate(date) <= '2019-11-30'" Once you make the request, you'll see that the `POST` request to the delete API Endpoint is asynchronous. It returns a [job response](https://www.tinybird.co/docs/docs/api-reference/jobs-api#jobs-api-getting-information-about-jobs) , indicating an ID for the job, the status of the job, the `delete_condition` , and some other metadata. Although the delete operation runs asynchronously (hence the job response), the operation waits synchronously for all the mutations to be rewritten and data replicas to be deleted. { "id": "64e5f541-xxxx-xxxx-xxxx-00524051861b", "job_id": "64e5f541-xxxx-xxxx-xxxx-00524051861b", "job_url": "https://api.tinybird.co/v0/jobs/64e5f541-xxxx-xxxx-xxxx-00524051861b", "job": { "kind": "delete_data", "id": "64e5f541-xxxx-xxxx-xxxx-00524051861b", "job_id": "64e5f541-xxxx-xxxx-xxxx-00524051861b", "status": "waiting", "created_at": "2023-04-11 13:52:32.423207", "updated_at": "2023-04-11 13:52:32.423213", "started_at": null, "is_cancellable": true, "datasource": { "id": "t_c45d5ae6781b41278fcee365f5bxxxxx", "name": "shopping_data" }, "delete_condition": "event = 'search'" }, "status": "waiting", "delete_id": "64e5f541-xxxx-xxxx-xxxx-00524051861b" } You can periodically poll the `job_url` with the given ID to check the status of the deletion process. 
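For example, a minimal sketch of checking on the job using the `job_url` returned above (`$TOKEN` stands for the same Token used for the delete request): ##### Poll the status of the delete job
curl \
  -H "Authorization: Bearer $TOKEN" \
  "https://api.tinybird.co/v0/jobs/64e5f541-xxxx-xxxx-xxxx-00524051861b"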
When it's `done` it means the data matching the SQL expression filter has been removed and all your Pipes and API Endpoints will continue running with the remaining data in the Data Source. ### Truncate a Data Source¶ Sometimes you just want to delete all data contained in a Data Source. Most of the time starting from zero. You can perform this action from the UI and API. Using the API, the [truncate](https://www.tinybird.co/docs/docs/api-reference/datasource-api#post--v0-datasources-(.+)-truncate) endpoint will delete all rows in a Data Source and can be done as follows: - CLI - API ##### Truncate a Data Source tb datasource truncate You can also truncate a Data Source directly from the UI: <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Freplacing-and-deleting-data-1.png&w=3840&q=75) <-figcaption-> Deleting selectively is only available via API, but truncating it to delete all of its data can be done via the UI. ## Replace data selectively¶ The ability to update data is often not the top priority when designing analytical databases, but there are always scenarios where you need to update or replace your analytical data. For example, you might have reconciliation processes over your transactions that affect your original data. Or maybe your ingestion process was simply faulty, and you ingested inaccurate data for a period of time. In Tinybird, you can specify a condition under which only a part of the data is replaced during the ingestion process. For instance, let's say you want to reingest a CSV with the data for November 2019 and update your Data Source accordingly. In order to update the data, you'll need to pass the `replace_condition` parameter with the `toDate(date) >= '2019-11-01' and toDate(date) <= '2019-11-30'` condition. - CLI - API ##### Replace data selectively tb datasource replace events \ https://storage.googleapis.com/tinybird-assets/datasets/guides/events_1M_november2019_1.csv \ --sql-condition "toDate(date) >= '2019-11-01' and toDate(date) <= '2019-11-30'" The response to the previous API call looks like this: ##### Response after replacing data { "id": "a83fcb35-8d01-47b9-842c-a288d87679d0", "job_id": "a83fcb35-8d01-47b9-842c-a288d87679d0", "job_url": "https://api.tinybird.co/v0/jobs/a83fcb35-8d01-47b9-842c-a288d87679d0", "job": { "kind": "import", "id": "a83fcb35-8d01-47b9-842c-a288d87679d0", "job_id": "a83fcb35-8d01-47b9-842c-a288d87679d0", "import_id": "a83fcb35-8d01-47b9-842c-a288d87679d0", "status": "waiting", "statistics": null, "datasource": { ... }, "quarantine_rows": 0, "invalid_lines": 0 }, "status": "waiting", "import_id": "a83fcb35-8d01-47b9-842c-a288d87679d0" } As in the case of the selective deletion, selective replacement also runs as an asynchronous request, so it's recommended to [check the status of the job](https://www.tinybird.co/docs/docs/api-reference/jobs-api#jobs-api-getting-information-about-jobs) periodically. You can see the status of the job by going to the `job_url` returned in the previous response. ### About the replace condition¶ Conditional replaces are applied over partitions. Partitions are selected for replaces based on the rows that match the condition in the new data. The partitions involved are the ones where these remaining rows would be stored. The replace condition is applied to filter the new data that's going to be appended, meaning rows not matching the condition won't be inserted. 
The condition is also applied for the selected partitions in the Data Source, so rows that don't match the condition in these partitions will be removed. But rows that don't match the condition and may be present in other partitions won't be deleted. If you are trying to delete rows in the target data source in your workflow, have a look at the "Replace data removing non-matching rows from the data source" example below. ### Linked Materialized Views¶ If you have several connected Materialized Views, then selective replaces are done in cascade. For example, if Data Source A materializes data in a cascade to Data Source B and from there to Data Source C, then when you replace data in Data Source A, Data Sources B and C will automatically be updated accordingly. All three Data Sources need to have compatible partition keys since replaces are done by partition. The command `tb dependencies --datasource the_data_source --check-for-partial-replace` returns the dependencies that would be recalculated, both for Data Sources and Materialized Views, and raises an error if any of the dependencies have incompatible partition keys. Remember: The provided Token must have the `DATASOURCES:CREATE` [scope](https://www.tinybird.co/docs/docs/api-reference/token-api). ### Example¶ For this example, consider this Data Source: <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Freplacing-example-1.jpeg&w=3840&q=75) Its partition key is `ENGINE_PARTITION_KEY "profession"` . If you wanted to replace the last two rows with new data, you can send this request with the replace condition `replace_condition=(profession='Jedi')`: - CLI - API ##### Replace with partition in condition echo "50,Mace Windu,Jedi" > jedi.csv tb datasource replace characters jedi.csv --sql-condition "profession='Jedi'" Since the replace condition column matches the partition key, the result is: <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Freplacing-example-2.jpeg&w=3840&q=75) However, consider what happens if you create the Data Source with `ENGINE_PARTITION_KEY "name"`: ##### characters.datasource SCHEMA > `age` Int16, `name` String, `profession` String ENGINE "MergeTree" ENGINE_SORTING_KEY "age, name, profession" ENGINE_PARTITION_KEY "name" If you were to run the same replace request, the result probably doesn't make sense: <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Freplacing-example-3.jpeg&w=3840&q=75) Why were the existed rows not removed? Because the `replace` process uses the payload rows to identify which partitions to work on. The Data Source is now partitioned by name (not profession), so the process didn't delete the other "Jedi" rows. They're in different partitions because they have different names. The rule of thumb is this: **Always make sure the replace condition uses the partition key as the filter field**. ## Replace a Data Source completely¶ To replace a complete Data Source, make an API call similar to the previous example, without providing a `replace_condition`: - CLI - API ##### Replace Data Source completely tb datasource replace events https://storage.googleapis.com/tinybird-assets/datasets/guides/events_1M_november2019_1.csv The request above is replacing a Data Source with the data found in a given URL pointing to a CSV file. You can also replace a Data Source in the Tinybird UI: <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Freplacing-and-deleting-data-2.png&w=3840&q=75) <-figcaption-> Replacing a Data Source completely can also be done through the User Interface Schemas must be identical. 
When replacing data (either selectively or entirely) the schema of the new inbound data must match that of the original Data Source. Rows not containing the same schema will go to quarantine. ## Next steps¶ - Learn how[ to get rows out of quarantine](https://www.tinybird.co/docs/docs/concepts/data-sources#the-quarantine-data-source) . - Need to[ iterate a Data Source, including the schema](https://www.tinybird.co/docs/docs/guides/ingesting-data/iterate-a-data-source) ? Read how here. --- URL: https://www.tinybird.co/docs/guides/ingesting-data/scheduling-with-github-actions-and-cron Last update: 2024-10-28T11:06:14.000Z Content: --- title: "Schedule data ingestion · Tinybird Docs" theme-color: "#171612" description: "Cronjobs are the universal way of scheduling tasks. In this guide, you'll learn how to keep your data in sync with cron jobs or GitHub Actions and the Tinybird API." --- # Schedule data ingestion with cron and GitHub Actions¶ Cronjobs are the universal way of scheduling tasks. In this guide, you'll learn how to keep your data in sync with cronjobs or GitHub actions and the Tinybird REST API. ## Overview¶ For this example, let's assume you've already imported a Data Source to your Tinybird account and that you have properly defined its schema and partition key. Once everything is set, you can easily perform some operations using the [Data Sources API](https://www.tinybird.co/docs/docs/api-reference/datasource-api) to **periodically append to or replace data** in your Data Sources. This guide shows you some examples. ## About crontab¶ Crontab is a native Unix tool that schedules command execution at a specified time or time interval. It works by defining the schedule, and the command to execute, in a text file. This can be achieved using `sudo crontab -e` . You can learn more about using crontab using many online resources like [crontab.guru](https://crontab.guru/crontab.5.html) and [the man page for crontab](https://man7.org/linux/man-pages/man5/crontab.5.html). ### The cron table format¶ Cron follows a table format like the following (note that you can also use [external tools like crontab.guru](https://crontab.guru/) to help you define the cron job schedule): ##### Cron syntax explanation * * * * * Command_to_execute | | | | | | | | | Day of the Week ( 0 - 6 ) ( Sunday = 0 ) | | | | | | | Month ( 1 - 12 ) | | | | | Day of Month ( 1 - 31 ) | | | Hour ( 0 - 23 ) | Min ( 0 - 59 ) Using this format, the following would be typical cron schedules to execute commands at different times: - Every five minutes: `0/5 \* \* \* \*` - Every day at midnight: `0 0 \* \* \*` - Every first day of month: `\* \* 1 \* \*` - Every Sunday at midnight: `0 0 \* \* 0` Be sure you save your scripts in the right location. Save your shell scripts in the `/opt/cronjobs/` folder. ## Append data periodically¶ It's very common to have a Data Source that grows over time. There is often is also an ETL process extracting this data from the transactional database and generating CSV files with the last X hours or days of data, therefore you might want to append those recently-generated rows to your Tinybird Data Source. For this example, imagine you generate new CSV files every day at 00:00 that you want to append to Tinybird everyday at 00:10. 
### Option 1: With a shell script¶ First, you need to create a shell script file containing the Tinybird API request operation: ##### Contents of append.sh #!/bin/bash TOKEN=your_token CSV_URL="http://your_url.com" curl \ -H "Authorization: Bearer $TOKEN" \ -X POST \ -d url=$CSV_URL \ -d mode='append' \ -d name='events' \ https://api.tinybird.co/v0/datasources Then, add a new line to your crontab file (using `sudo crontab -e` ): 10 0 * * * sh -c /opt/cronjobs/append.sh ### Option 2: Using GitHub Actions¶ If your project is hosted on GitHub, you can also use GitHub Actions to schedule periodic jobs. Create a new file called `.github/workflows/append.yml` with the following code to append data from a CSV given its URL every day at 00:10. ##### Contents of .github/workflows/append.yml name: Append data every day at 00:10 on: push: workflow_dispatch: schedule: - cron: '10 0 * * *' jobs: scheduled: runs-on: ubuntu-latest steps: - name: Check out this repo uses: actions/checkout@v2 - name: Append new data run: |- curl \ -H "Authorization: Bearer $TOKEN" \ -X POST \ -d url=$CSV_URL \ -d mode='append' \ -d name='events' \ https://api.tinybird.co/v0/datasources ## Replace data periodically¶ Let's use another example. With this new fictional Data Source, imagine a scenario where you want to replace the whole Data Source with a CSV file sitting at a publicly accessible URL every first day of the month. ### Option 1: With a shell script¶ ##### Contents of replace.sh #!/bin/bash TOKEN=your_token CSV_URL="http://your_url.com" curl \ -H "Authorization: Bearer $TOKEN" \ -X POST \ -d url=$CSV_URL \ -d mode='replace' \ -d name='events' \ https://api.tinybird.co/v0/datasources Then edit the crontab file which takes care of periodically executing your script. Run `sudo crontab -e`: ##### Setting up a crontab to run a script periodically 0 0 1 * * sh -c /opt/cronjobs/replace.sh ### Option 2: With GitHub Actions¶ Create a new file called `.github/workflows/replace.yml` with the following code to replace all your data with the CSV at a given URL every day at 00:10. ##### Contents of .github/workflows/replace.yml name: Replace all data every day at 00:10 on: push: workflow_dispatch: schedule: - cron: '10 0 * * *' jobs: scheduled: runs-on: ubuntu-latest steps: - name: Check out this repo uses: actions/checkout@v2 - name: Replace all data run: |- curl \ -H "Authorization: Bearer $TOKEN" \ -X POST \ -d url=$CSV_URL \ -d mode='replace' \ -d name='events' \ https://api.tinybird.co/v0/datasources ## Replace just one month of data¶ Having your API call inside a shell script allows you to script more complex ingestion processes. For example, imagine you want to replace the last month of events data, every day. Then each day, you would export a CSV file to a publicly accessible URL and name it something like `events_YYYY-MM-DD.csv`.
### Option 1: With a shell script¶ You could script a process that would do a conditional data replacement as follows: ##### Script to replace data selectively on Tinybird #!/bin/bash TODAY=`date +"%Y-%m-%d"` ONE_MONTH_AGO=`date -v -1m +%Y-%m-%d` TOKEN=your_token DATASOURCE=events CSV_URL="http://your_url.com" curl \ -H "Authorization: Bearer $TOKEN" \ -X POST \ -d url=$CSV_URL \ -d mode='replace' \ -d "replace_condition=(created_at+BETWEEN+'${ONE_MONTH_AGO}'+AND+'${TODAY}')" \ -d name=$DATASOURCE \ https://api.tinybird.co/v0/datasources Then, after saving that file to `/opt/cronjobs/daily_replace.sh` , add the following line to `crontab` to run it every day at midnight: ##### Setting up a crontab to run a script periodically 0 0 * * * sh -c /opt/cronjobs/daily_replace.sh ### Option 2: With GitHub Actions¶ Create a new file called `.github/workflows/replace_last_month.yml` with the following code to replace all the data for the last month every day at 00:10. ##### Contents of .github/workflows/replace_last_month.yml name: Replace last month of data every day at 00:10 on: push: workflow_dispatch: schedule: - cron: '10 0 * * *' jobs: scheduled: runs-on: ubuntu-latest steps: - name: Check out this repo uses: actions/checkout@v2 - name: Replace last month of data run: |- TODAY=`date +"%Y-%m-%d"` ONE_MONTH_AGO=`date -v -1m +%Y-%m-%d` DATASOURCE=events # could also be set via github secrets CSV_URL="http://your_url.com" # could also be set via github secrets curl \ -H "Authorization: Bearer $TOKEN" \ -X POST \ -d url=$CSV_URL \ -d mode='replace' \ -d "replace_condition=(created_at+BETWEEN+'${ONE_MONTH_AGO}'+AND+'${TODAY}')" \ -d name=$DATASOURCE \ https://api.tinybird.co/v0/datasources Use GitHub secrets: Store `TOKEN` as an [encrypted secret](https://docs.github.com/en/actions/reference/encrypted-secrets) to avoid hardcoding secret keys in your repositories, and replace `DATASOURCE` and `CSV_URL` by their values or save them as secrets as well. ## Next steps¶ - Learn more about[ GitHub Actions and CI/CD processes on Tinybird](https://www.tinybird.co/docs/docs/production/continuous-integration) . - Understand how to[ work with time](https://www.tinybird.co/docs/docs/guides/querying-data/working-with-time) . --- URL: https://www.tinybird.co/docs/guides/integrations/add-charts-to-nextjs Last update: 2024-11-12T11:45:41.000Z Content: --- title: "Add Tinybird Charts to a Next.js frontend · Tinybird Docs" theme-color: "#171612" description: "Tinybird Charts make it easy to create interactive charts. In this guide, we'll show you how to add Tinybird Charts to a Next.js frontend." --- # Add Tinybird Charts to a Next.js frontend¶ In this guide, you'll learn how to create Tinybird Charts from the UI, and add them to your Next.js frontend. Tinybird Charts make it easy to visualize your data and create interactive charts. You can create a chart from the UI, and then embed it in your frontend application. This guide will show you how to add Tinybird Charts to a Next.js frontend. You can view the [live demo](https://guide-tinybird-charts.vercel.app/) or browse the [GitHub repo (guide-tinybird-charts)](https://github.com/tinybirdco/guide-tinybird-charts). <-figure-> ![Tinybird charts demo](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fcharts-demo.png&w=3840&q=75) ## Prerequisites¶ This guide assumes that you have a Tinybird account, and you are familiar with creating a Tinybird Workspace and pushing resources to it. You'll need a working familiarity with JavaScript and Next.js.
## Run the demo¶ These steps cover running the GitHub demo locally. [Skip to the next section](https://www.tinybird.co/docs/about:blank#build-from-scratch) to build the demo from scratch. ### 1. Clone the GitHub repo¶ Clone the [GitHub repo (guide-tinybird-charts)](https://github.com/tinybirdco/guide-tinybird-charts) to your local machine. ### 2. Push Tinybird resources¶ The repo contains a `tinybird` folder which includes sample Tinybird resources: - `events.datasource` : The Data Source for incoming events. - `airline_market_share.pipe` : An API Endpoint giving a count of bookings per airline. - `bookings_over_time.pipe` : An API Endpoint giving a time series of booking volume over time. - `bookings_over_time_by_airline.pipe` : An API Endpoint giving a time series of booking volume over time with an `airline` filter. - `meal_choice_distribution.pipe` : An API Endpoint giving a count of meal choices across bookings. - `top_airlines.pipe` : An API Endpoint giving a list of the top airlines by booking volume. Make a new Tinybird Workspace in the region of your choice. Then, configure the [Tinybird CLI](https://www.tinybird.co/docs/docs/cli/install) (install and authenticate) and `tb push` the resources to your Workspace. Alternatively, you can drag and drop the files onto the UI to upload them. ### 3. Generate some fake data¶ Use [Mockingbird](https://tbrd.co/mockingbird-tinybird-charts-guide) to generate fake data for the `events` Data Source. Using this link ^ provides a pre-configured schema, and you'll just need to enter your Workspace Admin Token and select the Host region that matches your Workspace. When configured, select `Save` , then scroll down and select `Start Generating!`. In the Tinybird UI, confirm that the `events` Data Source is successfully receiving data. ### 4. Install dependencies¶ In the cloned repo, navigate to `/app` and install the dependencies with `npm install`. ### 5. Configure .env¶ First create a new file `.env.local` ##### Create the .env.local file in /app cp .env.example .env.local From the Tinybird UI, copy the read Token for the Charts (if you deployed the resources from this repo, it will be called `CHART_READ_TOKEN` ). Paste the Token into the `.env.local` file in your directory: ##### In the .env.local file NEXT_PUBLIC_TINYBIRD_STATIC_READ_TOKEN="STATIC READ TOKEN" ### Run the demo app¶ Run it locally: npm run dev Then open [http://localhost:3000](http://localhost:3000/) with your browser. ## Build from scratch¶ This section will take you from a fresh Tinybird Workspace to a Next.js app with a Tinybird Chart. ### 1. Set up a Workspace¶ Create a new Workspace. This guide uses the `EU GCP` region, but you can use any region. Save [this .datasource file](https://github.com/tinybirdco/guide-tinybird-charts/blob/main/tinybird/datasources/events.datasource) locally, and upload it to the Tinybird UI - you can either drag and drop, or use **Create new (+)** to add a new Data Source. You now have a Workspace with an `events` Data Source and specified schema. Time to generate some data to fill the Data Source! ### 2. Generate some fake data¶ Use [Mockingbird](https://tbrd.co/mockingbird-tinybird-charts-guide) to generate fake data for the `events` Data Source. Using this link ^ provides a pre-configured schema, and you'll just need to enter your Workspace Admin Token and select the Host region that matches your Workspace. When configured, select `Save` , then scroll down and select `Start Generating!`. 
In the Tinybird UI, confirm that the `events` Data Source is successfully receiving data. ### 3. Create and publish an API Endpoint¶ In the Tinybird UI, select the `events` Data Source and then select `Create Pipe` in the top right. In the new Pipe, change the name to `top_airlines`. In the first SQL Node,paste the following SQL: SELECT airline, count() as bookings FROM events GROUP BY airline ORDER BY bookings DESC LIMIT 5 Name this Node `endpoint` and select `Run`. Now, publish the Pipe by selecting `Create API Endpoint` and selecting the `endpoint` Node. Congratulations! You have a published API Endpoint. ### 4. Create a Chart¶ Publishing the API Endpoint takes you to the API Endpoint overview page. Scroll down to the `Output` section and select the `Charts` tab. Select `Create Chart`. On the `General` tab, set the name to `Top Airlines` then choose `Bar List` as the Chart type. On the `Data` tab, choose the `airline` column for the `Index` and check the `bookings` box for the `Categories`. Select Save. ### 5. View the Chart component code¶ After saving your Chart, you'll be returned to the API Endpoint overview page and you'll see your Chart in the `Output` section. To the view the component code for the Chart, select the code symbol ( `<>` ) above it. You'll see the command to install the `tinybird-charts` library as well as the React component code. ### 6. Create a new Next.js app¶ On your local machine, create a new working directory and navigate to it. For this example, we'll call it `myapp`. ##### Make a new working directory for your Next.js frontend app mkdir myapp cd myapp In the `myapp` dir, create a new Next.js app with the following command: ##### Initialize a new Next.js app npx create-next-app@latest You will see some prompts to configure the app. For this guide, we'll use the following settings: ##### Example new Next.js app settings ✔ What is your project named? … tinybird-demo ✔ Would you like to use TypeScript? … No / [Yes] ✔ Would you like to use ESLint? … No / [Yes] ✔ Would you like to use Tailwind CSS? … No / [Yes] ✔ Would you like to use `src/` directory? … No / [Yes] ✔ Would you like to use App Router? (recommended) … No / [Yes] ✔ Would you like to customize the default import alias (@/*)? … [No] / Yes After the app is created, navigate to the app directory (this will be the same as the project name you entered, in this example, `tinybird-demo` ). cd tinybird-demo ### 7. Add the Tinybird Charts library¶ Add the Tinybird Charts library to your project npm install @tinybirdco/charts ### 8. Add the Chart component¶ Create a new subfolder and file `src/app/components/Chart.tsx` . This will contain the component code for the Chart. [Copy the component from the Tinybird UI](https://www.tinybird.co/docs/about:blank#5-view-the-chart-component-code) and paste it here. It should look like this: ##### Example Chart.tsx code copied from Tinybird UI Chart 'use client' import { BarList } from '@tinybirdco/charts' export function TopAirlines() { return ( ) } Save the file. ### 9. Add the Chart to your page¶ In your `src/app/page.tsx` file, delete the default contents so you have an empty file. Then, import the `TopAirlines` component and add it to the page: ##### src/app/page.tsx import { TopAirlines } from "./components/Chart"; export default function Home() { return (
) } ### 10. Run the app¶ Run the app with `npm run dev` and open [http://localhost:3000](http://localhost:3000/) in your browser. ### 11. You're done\!¶ You've successfully added a Tinybird Chart to a Next.js frontend. Your Next.js frontend should now show a single bar line Chart. See the [live demo](https://guide-tinybird-charts.vercel.app/) and browse the [GitHub repo (guide-tinybird-charts)](https://github.com/tinybirdco/guide-tinybird-charts) for inspiration on how to combine more Chart components to make a full dashboard. ## Next steps¶ - Interested in dashboards? Explore Tinybird's many applications in the[ Use Case Hub](https://www.tinybird.co/docs/docs/use-cases) . --- URL: https://www.tinybird.co/docs/guides/integrations/charts-using-iframes-and-jwt-tokens Last update: 2024-10-28T11:06:14.000Z Content: --- title: "Build charts with iframes and JWTs. · Tinybird Docs" theme-color: "#171612" description: "Tinybird Charts make it easy to create interactive charts. In this guide, you'll learn how build them using iframes and JWTs." --- # Build charts with iframes and JWTs¶ In this guide, you'll learn how to build Tinybird Charts using inline frames (iframes) and secure them using JSON Web Tokens (JWTs) [Tinybird Charts](https://www.tinybird.co/docs/docs/publish/charts) make it easy to visualize your data and create interactive charts. As soon as you've published an API Endpoint, you can create a Chart from the Tinybird UI, then immediately embed it in your frontend application. Check out the [live demo](https://guide-tinybird-charts.vercel.app/) to see an example of Charts in action. [JWTs](https://www.tinybird.co/docs/concepts/auth-tokens#json-web-tokens-jwts) are signed tokens that allow you to securely authorize and share data between your application and Tinybird. ## Prerequisites¶ This guide assumes that you have a Tinybird Workspace with active data and one or more published API Endpoints. You'll need a basic familiarity with JavaScript and Python. ## 1. Create a Chart¶ [Create a new Chart](https://www.tinybird.co/docs/docs/publish/charts) based on any of your API Endpoints. ## 2. View the Chart component code¶ After saving your Chart, you're on the API Endpoint "Overview" page. Your Chart should be visible in the "Output" section. To the view the component code for the Chart, select the code symbol ( `<>` ) above it. Select the dropdown and select the iframe example, instead of the default React one. Now select the JWT so the Static Token shown in the iframe's URL is replaced by a `` placeholder. <-figure-> ![Get the iframe code](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fguides-iframes-jwts__get-iframe-code.png&w=3840&q=75) ## 3. Insert your iframe into a new page¶ Now you have your Chart code, create a new `index.html` file to paste the code into: ##### index.html Tinybird Charts In the next step, you'll generate a JWT to replace the `` placeholder. ## 4. Create a JWT¶ ### Understanding the token exchange¶ <-figure-> ![Generate a new token](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fguides-iframes-jwts__token-exchange.png&w=3840&q=75) For each user session (or any other approach you want to follow), your frontend application will send a request with a JWT to your backend. It can be a new or an existing one. Your backend will self-sign and return the token. From that point onwards, you can use this token for any Chart or API call made to Tinybird, directly from your frontend application. 
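As a rough sketch of the frontend side of that exchange (the `/generate-token` route matches the backend endpoint defined in the next section; the iframe element id and the `<JWT>` placeholder string are illustrative): ##### Fetch a JWT from your backend and load the Chart iframe (sketch)
const CHART_IFRAME_URL = 'PASTE_THE_IFRAME_URL_COPIED_FROM_THE_TINYBIRD_UI';

async function loadChart() {
  // Ask your backend for a fresh, self-signed JWT
  const response = await fetch('/generate-token');
  const { token } = await response.json();
  // Swap the token placeholder in the copied iframe URL for the real JWT
  document.getElementById('tinybird-chart').src =
    CHART_IFRAME_URL.replace('<JWT>', token);
}

loadChart();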
JWTs support TTL and includes multi-tenancy capabilities, which makes them safe to use without creating any complex middleware. ### Create a JWT endpoint¶ Create a new endpoint that your frontend will use to retrieve your token. Remember to set your `TINYBIRD_SIGNING_KEY`: ##### Generate a new token from flask import Flask, jsonify, render_template import jwt import datetime import os app = Flask(__name__) # Get your Tinybird admin Token TINYBIRD_SIGNING_KEY= # Use your admin Token as the signing key, or use process.env.TB_TOKEN / similar if you have it set locally # Generate Token function for a specific pipe_id def generate_jwt(): expiration_time = datetime.datetime.utcnow() + datetime.timedelta(hours=48) workspace_id = "1f484a32-6966-4f63-9312-aadad64d3e12" token_name = "charts_token" pipe_id = "t_b9427fe2bcd543d1a8923d18c094e8c1" payload = { "workspace_id": workspace_id, "name": token_name, "exp": expiration_time, "scopes": [ { "type": "PIPES:READ", "resource": pipe_id }, ], } return jwt.encode(payload, TINYBIRD_SIGNING_KEY, algorithm='HS256') @app.route('/generate-token', methods=['GET']) def get_token(): token = generate_jwt() return jsonify({"token": token}) if __name__ == '__main__': app.run(host='0.0.0.0', port=5151) ## 5. Use the JWT in your iframe¶ Edit your `index.html` file using JavaScript to retrieve a JWT from your API Endpoint, and include this token in your iframe. ##### Update the index.html file Tinybird Charts ## Next steps¶ - Learn more about[ JSON Web Tokens (JWTs)](https://www.tinybird.co/docs/concepts/auth-tokens#json-web-tokens-jwts) - Learn more about[ Tinybird Charts](https://www.tinybird.co/docs/publish/charts) - [ Consume APIs in a Next.js frontend using JWTs](https://www.tinybird.co/docs/guides/integrations/consume-apis-nextjs) - [ Add Tinybird Charts to a Next.js frontend](https://www.tinybird.co/docs/guides/integrations/add-charts-to-nextjs) --- URL: https://www.tinybird.co/docs/guides/integrations/consume-api-endpoints-in-grafana Last update: 2024-09-05T16:19:00.000Z Content: --- title: "Consume API Endpoints in Grafana · Tinybird Docs" theme-color: "#171612" description: "Grafana is an awesome open source analytics & monitoring tool. In this guide, you'll learn how to create dashboards consuming Tinybird API Endpoints." --- # Consume API Endpoints in Grafana¶ [Grafana](https://grafana.com/grafana/) is an awesome open source analytics & monitoring tool. In this guide, you'll learn how to create Grafana Dashboards and Alerts consuming Tinybird API Endpoints. ## Prerequisites¶ This guide assumes you have a Tinybird Workspace with an active Data Source, Pipes, and at least one API Endpoint. You'll also need a Grafana account. ## 1. Install the Infinity plugin¶ Follow the steps in [Grafana Infinity plugin installation](https://grafana.com/grafana/plugins/yesoreyeram-infinity-datasource/?tab=installation). ## 2. Create a Grafana data source¶ Create a new Data Source using the Infinity plugin. Connections > Data sources > Add new data source > Infinity. <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fgrafana-ds-infinity.png&w=3840&q=75) Edit the name and complete the basic setup: **Authentication** : choose Bearer Token. Pick a token with access to all the needed endpoints or create different data sources per token. Feel free to also add a restrictive list of allowed hosts. 
<-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fgrafana-ds-auth.png&w=3840&q=75) That's basically it, but it is good practice to add a **Health check** with an endpoint that the Token has access to, so you can verify connectivity is OK. ## 3. Configure the Query in Grafana¶ Create a new Dashboard, edit the suggested Panel, and use the Data Source you just created. For this example we'll consume the endpoint shown in this picture: a time series of sensor temperature and humidity readings. <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fgrafana-endpoint-sample.png&w=3840&q=75) The configuration in the Query editor is:
- Type: JSON
- Parser: Backend (needed for alerting and JSON parsing)
- Source: URL
- Format: Table
- Method: GET
- URL: `https://api.eu-central-1.aws.tinybird.co/v0/pipes/api_readings.json`
- Parsing options & Result fields > Rows/root: `data` . Needed because the Tinybird response includes metadata, data, statistics...
- Parsing options & Result fields > Columns: add the needed fields, select types, and adjust time formats.
<-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fgrafana-query.png&w=3840&q=75) It is important to use the Backend Parser for the Alerts to work and for compatibility with the root and field selectors, as mentioned in the [plugin docs](https://grafana.com/docs/plugins/yesoreyeram-infinity-datasource/latest/query/backend/#root-selector--field-selector). For the Time Formats you can use the plugin options. By default it uses *Default ISO*, so you can simply add `formatDateTime(t,'%FT%TZ') as t` to your Tinybird API Endpoint and don't need to configure the option in Grafana. Save the dashboard and you should now be able to see a chart. <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fgrafana-dashboard.png&w=3840&q=75) ## 4. Using time ranges¶ When you have millions of rows, it is better to filter time ranges in Tinybird than to retrieve all the data and filter later when showing the chart in Grafana. You will get faster responses and more efficient use of resources. Edit the Tinybird Pipe to accept `start_ts` and `end_ts` parameters:
%
SELECT formatDateTime(timestamp,'%FT%TZ') t, temperature, humidity
FROM readings
{% if defined(start_ts) and defined(end_ts) %}
WHERE timestamp BETWEEN parseDateTimeBestEffort({{String(start_ts)}}) AND parseDateTimeBestEffort({{String(end_ts)}})
{% end %}
ORDER BY t ASC
In the Query editor, next to URL, click on Headers, Request params and fill in the URL Query Params. Use Grafana's global variables [$__from and $__to](https://grafana.com/docs/grafana/v9.0/variables/variable-types/global-variables/#__from-and-__to) defined by the time range selector at the top right of the dashboard. <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fgrafana-time-ranges.png&w=3840&q=75) As mentioned, filtering makes dashboards more efficient; here you can see how the scan size decreases when filters are applied. <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fgrafana-stats-filter.png&w=3840&q=75) ## 5. Dashboard variables¶
You can also define dashboard variables and use them in the query: <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fgrafana-dashboard-variable.png&w=3840&q=75) That will make it interactive: <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fgrafana-variables.gif&w=3840&q=75) Note that you have to edit the Pipe:
%
SELECT formatDateTime(timestamp,'%FT%TZ') t,
{% if defined(magnitude) and magnitude != 'all' %}
    {{column(magnitude)}}
{% else %}
    temperature, humidity
{% end %}
FROM readings
{% if defined(start_ts) and defined(end_ts) %}
WHERE timestamp BETWEEN parseDateTimeBestEffort({{String(start_ts)}}) AND parseDateTimeBestEffort({{String(end_ts)}})
{% end %}
ORDER BY t ASC
## 6. Alerts¶ The Infinity plugin supports alerting, so you can create your own [rules](https://grafana.com/docs/grafana/latest/alerting/alerting-rules/) . In this example, the alert triggers when the temperature goes outside a defined range. <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fgrafana-alert-definition.png&w=3840&q=75) And this is what you will see in the dashboard. <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fgrafana-alert-trigger.png&w=3840&q=75) Be sure to correctly set up [notifications](https://grafana.com/docs/grafana/latest/alerting/configure-notifications/) if needed. ## Note¶ Note: a previous version of this guide referred to the [JSON API plugin](https://grafana.com/grafana/plugins/marcusolsson-json-datasource/) but it was migrated to use the [Infinity plugin](https://grafana.com/grafana/plugins/yesoreyeram-infinity-datasource/) since it is now the [default supported](https://grafana.com/blog/2024/02/05/infinity-plugin-for-grafana-grafana-labs-will-now-maintain-the-versatile-data-source-plugin/) one. --- URL: https://www.tinybird.co/docs/guides/integrations/consume-apis-in-a-notebook Last update: 2024-07-31T16:54:46.000Z Content: --- title: "Consume APIs in a Notebook · Tinybird Docs" theme-color: "#171612" description: "Notebooks are a great resource for exploring data and generating plots. In this guide, you'll learn how to consume Tinybird APIs in a notebook." --- # Consume APIs in a notebook¶ Notebooks are a great resource for exploring data and generating plots. In this guide, you'll learn how to consume Tinybird APIs in a notebook. ## Prerequisites¶ This [Colab notebook](https://github.com/tinybirdco/examples/blob/master/notebook/consume_from_apis.ipynb) uses a Data Source of updates to Wikipedia to show how to consume data from queries. There are two options: using the [Query API](https://www.tinybird.co/docs/docs/api-reference/query-api) , and using API Endpoints via the [Pipes API](https://www.tinybird.co/docs/docs/api-reference/pipe-api/overview) with parameters. The full code for every example in this guide can be found in the notebook. This guide assumes some familiarity with Python. ## Setup¶ Follow the setup steps in the [notebook file](https://github.com/tinybirdco/examples/blob/master/notebook/consume_from_apis.ipynb) and use the linked CSV file of Wikipedia updates to create a new Data Source in your Workspace. For less than 100 MB of data, you can fetch all the data in one call. For calls returning more than 100 MB of data, you need to do it sequentially, with not more than 100 MB per API call. The solution is to get batches using Data Source sorting keys. Selecting the data by columns used in the sorting key keeps it fast. In this example, the Data Source is sorted on the `timestamp` column, so you can use batches of a fixed amount of time. In general, time is a good way to batch.
The functions `fetch_table_streaming_query` and `fetch_table_streaming_endpoint` in the notebook work as generators. They should always be used in a `for` loop or as the input for another generator. You should process each batch as it arrives and discard unwanted fetched data. Only fetch the data you need for the processing. The idea here is not to recreate a Data Source in the notebook, but to process each batch as it arrives and write less data to your DataFrame. ## Fetch data with the Query API¶ This guide uses the [requests library for Python](https://pypi.org/project/requests/) . The SQL query pulls in an hour less of data than the full Data Source. A DataFrame is created from the text part of the response. ##### DataFrame from the query API
# Imports used throughout these examples; `token` is set in the notebook's setup steps
import requests
import pandas as pd
from io import StringIO
from urllib.parse import urlencode

table_name = 'wiki'
host = 'api.tinybird.co'
format = 'CSVWithNames'
time_column = 'toDateTime(timestamp)'
date_end = 'toDateTime(1644754546)'

s = requests.Session()
s.headers['Authorization'] = f'Bearer {token}'
URL = f'https://{host}/v0/sql'
sql = f'select * from {table_name} where {time_column} <= {date_end}'
params = {'q': sql + f" FORMAT {format}"}
r = s.get(f"{URL}?{urlencode(params)}")
df = pd.read_csv(StringIO(r.text))
Replace the Tinybird API hostname/region with the [right API URL region](https://www.tinybird.co/docs/docs/api-reference/overview#regions-and-endpoints) that matches your Workspace. Your Token lives in the Workspace under "Tokens". ## Fetch data from an API Endpoint & parameters¶ This Endpoint Node in the Pipe `endpoint_wiki` selects from the Data Source within a range of dates, using the parameters `date_start` and `date_end`. ##### Endpoint wiki
%
SELECT *
FROM wiki
WHERE timestamp
    BETWEEN toInt64(toDateTime({{String(date_start, '2022-02-13 10:30:00')}}))
    AND toInt64(toDateTime({{String(date_end, '2022-02-13 11:00:00')}}))
These parameters are passed in the call to the API Endpoint to select only the data within the range. A DataFrame is created from the text part of the response. ##### DataFrame from API Endpoint
host = 'api.tinybird.co'
api_endpoint = 'endpoint_wiki'
format = 'csv'
date_start = '2022-02-13 10:30:00'
date_end = '2022-02-13 11:30:00'

s = requests.Session()
s.headers['Authorization'] = f'Bearer {token}'
URL = f'https://{host}/v0/pipes/{api_endpoint}.{format}'
params = {'date_start': date_start, 'date_end': date_end}
r = s.get(f"{URL}?{urlencode(params)}")
df = pd.read_csv(StringIO(r.text))
Replace the Tinybird API hostname/region with the [right API URL region](https://www.tinybird.co/docs/docs/api-reference/overview#regions-and-endpoints) that matches your Workspace. Your Token lives in the Workspace under "Tokens". ## Fetch batches of data using the Query API¶ The function `fetch_table_streaming_query` in the notebook accepts more complex queries than a date range. Here you choose what you filter and sort by. This example reads in batches of 5 minutes to create a small DataFrame, which should then be processed, with the results of the processing appended to the final DataFrame.
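The notebook's helpers aren't reproduced in this guide, so as a rough illustration only (not the notebook's actual implementation), a time-batched generator over the Query API could look something like the sketch below. It assumes a numeric Unix-seconds `timestamp` column, as in the `wiki` Data Source used here.

##### Example: a simplified time-batched generator (illustrative sketch)
from urllib.parse import urlencode
import requests

def query_api_batches(token, table, batch_seconds, start_ts, end_ts,
                      time_column='timestamp', host='api.tinybird.co'):
    """Yield one CSV chunk per fixed-size time window via the Query API."""
    s = requests.Session()
    s.headers['Authorization'] = f'Bearer {token}'
    url = f'https://{host}/v0/sql'
    for batch_start in range(start_ts, end_ts, batch_seconds):
        batch_end = min(batch_start + batch_seconds, end_ts)
        sql = (f"SELECT * FROM {table} "
               f"WHERE {time_column} > {batch_start} AND {time_column} <= {batch_end} "
               "FORMAT CSVWithNames")
        r = s.get(f"{url}?{urlencode({'q': sql})}")
        yield r.text

Each yielded chunk can be read with `pd.read_csv(StringIO(chunk))`, processed, and discarded, which is exactly how the notebook's own functions are used in the examples below.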
<-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fconsume-apis-in-a-notebook-1.png&w=3840&q=75) <-figcaption-> 5-minute batches of data using the index ##### DataFrames from batches returned by the Query API tinybird_stream = fetch_table_streaming_query(token, 'wiki', 60*5, 1644747337, 1644758146, sorting='timestamp', filters="type IN ['edit','new']", time_column="timestamp", host='api.tinybird.co') df_all=pd.DataFrame() for x in tinybird_stream: df_batch = pd.read_csv(StringIO(x)) # TO DO: process batch and discard fetched data df_proc=process_dataframe(df_batch) df_all = df_all.append(df_proc) # Careful: appending dfs means keeping a lot of data in memory ## Fetch batches of data from an API Endpoint & parameters¶ The function `fetch_table_streaming_endpoint` in the notebook sends a call to the API with parameters for the `batch size`, `start` and `end` dates, and, optionally, filters on the `bot` and `server_name` columns. This example reads in batches of 5 minutes to create a small DataFrame, which should then be processed, with the results of the processing appended to the final DataFrame. ‍The API Endpoint `wiki_stream_example` first selects data for the range of dates, then for the batch, and then applies the filters on column values. ##### API Endpoint wiki\_stream\_example % SELECT * from wiki --DATE RANGE WHERE timestamp BETWEEN toUInt64(toDateTime({{String(date_start, '2022-02-13 10:30:00', description="start")}})) AND toUInt64(toDateTime({{String(date_end, '2022-02-13 10:35:00', description="end")}})) --BATCH BEGIN AND timestamp BETWEEN toUInt64(toDateTime({{String(date_start, '2022-02-13 10:30:00', description="start")}}) + interval {{Int16(batch_no, 1, description="batch number")}} * {{Int16(batch_size, 10, description="size of the batch")}} second) --BATCH END AND toUInt64(toDateTime({{String(date_start, '2022-02-13 10:30:00', description="start")}}) + interval ({{Int16(batch_no, 1, description="batch number")}} + 1) * {{Int16(batch_size, 10, description="size of the batch")}} second) --FILTERS {% if defined(bot) %} AND bot = {{String(bot, description="is a bot")}} {% end %} {% if defined(server_name) %} AND server_name = {{String(server_name, description="server")}} {% end %} These parameters are passed in the call to the API Endpoint to select only the data for the batch. A DataFrame is created from the text part of the response. ##### DataFrames from batches from the API Endpoint tinybird_stream = fetch_table_streaming_endpoint(token, 'csv', 60*5, '2022-02-13 10:15:00', '2022-02-13 13:15:00', bot = False, server_name='en.wikipedia.org' ) df_all=pd.DataFrame() for x in tinybird_stream: df_batch = pd.read_csv(StringIO(x)) # TO DO: process batch and discard fetched data df_proc=process_dataframe(df_batch) df_all = df_all.append(df_proc) # Careful: appending dfs means keeping a lot of data in memory ## Next steps¶ - Explore more use cases for Tinybird in the[ Use Case Hub](https://www.tinybird.co/docs/docs/use-cases) . - Looking for other ways to integrate? Try[ consume APIs in a Next.js frontend](https://www.tinybird.co/docs/docs/guides/integrations/consume-apis-nextjs) . --- URL: https://www.tinybird.co/docs/guides/integrations/consume-apis-nextjs Last update: 2024-11-12T11:45:41.000Z Content: --- title: "Consume APIs in a Next.js frontend with JWTs · Tinybird Docs" theme-color: "#171612" description: "In this guide, you'll learn how to generate self-signed JWTs from your backend, and call Tinybird APIs directly from your frontend, using Next.js." 
--- # Consume APIs in a Next.js frontend with JWTs¶ In this guide, you'll learn how to generate self-signed JWTs from your backend, and call Tinybird APIs directly from your frontend, using Next.js. JWTs are signed tokens that allow you to securely authorize and share data between your application and Tinybird. If you want to read more about JWTs, check out the [JWT.io](https://jwt.io/) website. You can view the [live demo](https://guide-nextjs-jwt-auth.vercel.app/) or browse the [GitHub repo (guide-nextjs-jwt-auth)](https://github.com/tinybirdco/guide-nextjs-jwt-auth). ## Prerequisites¶ This guide assumes that you have a Tinybird account, and you are familiar with creating a Tinybird Workspace and pushing resources to it. Make sure you understand the concept of Tinybird's [Static Tokens](https://www.tinybird.co/docs/docs/concepts/auth-tokens#what-should-i-use-tokens-for). You'll need a working familiarity with JWTs, JavaScript, and Next.js. ## Run the demo¶ These steps cover running the GitHub demo locally. [Skip to the next section](https://www.tinybird.co/docs/about:blank#understand-the-code) for a breakdown of the code. ### 1. Clone the GitHub repo¶ Clone the [GitHub repo (guide-nextjs-jwt-auth)](https://github.com/tinybirdco/guide-nextjs-jwt-auth) to your local machine. ### 2. Push Tinybird resources¶ The repo includes two sample Tinybird resources: - `events.datasource` : The Data Source for incoming events. - `top_airlines.pipe` : An API Endpoint giving a list of top 10 airlines by booking volume. Configure the [Tinybird CLI](https://www.tinybird.co/docs/docs/cli/install) and `tb push` the resources to your Workspace. Alternatively, you can drag and drop the files onto the UI to upload them. ### 3. Generate some fake data¶ Use [Mockingbird](https://tbrd.co/mockingbird-nextjs-jwt-demo) to generate fake data for the `events` Data Source. Using this link ^ provides a pre-configured schema, but you will need to enter your Workspace admin Token and Host. When configured, scroll down and select `Start Generating!`. In the Tinybird UI, confirm that the `events` Data Source is successfully receiving data. ### 4. Install dependencies¶ Navigate to the cloned repo and install the dependencies with `npm install`. ### 5. Configure .env¶ First create a new file `.env.local` cp .env.example .env.local Copy your [Tinybird host](https://www.tinybird.co/docs/docs/api-reference/overview#regions-and-endpoints) and admin Token (used as the `TINYBIRD_SIGNING_TOKEN` ) to the `.env.local` file: TINYBIRD_SIGNING_TOKEN="TINYBIRD_SIGNING_TOKEN>" # Use your Admin Token as the signing Token TINYBIRD_WORKSPACE="YOUR_WORKSPACE_ID" # The UUID of your Workspace NEXT_PUBLIC_TINYBIRD_HOST="YOUR_TINYBIRD_API_REGION e.g. https://api.tinybird.co" # Your regional API host Replace the Tinybird API hostname/region with the [right API URL region](https://www.tinybird.co/docs/docs/api-reference/overview#regions-and-endpoints) that matches your Workspace. Your Token lives in the Workspace under "Tokens". ### Run the demo app¶ Run it locally: npm run dev Then open [http://localhost:3000](http://localhost:3000/) with your browser. ## Understand the code¶ This section breaks down the key parts of code from the example. ### .env¶ The `.env` file contains the environment variables used in the application. ##### .env file TINYBIRD_SIGNING_TOKEN="YOUR SIGNING TOKEN" TINYBIRD_WORKSPACE="YOUR WORKSPACE ID" NEXT_PUBLIC_TINYBIRD_HOST="YOUR API HOST e.g. 
https://api.tinybird.co" #### TINYBIRD\_SIGNING\_TOKEN¶ `TINYBIRD_SIGNING_TOKEN` is the token used to sign JWTs. **You must use your admin Token** . It is a shared secret between your application and Tinybird. Your application uses this Token to sign JWTs, and Tinybird uses it to verify the JWTs. It should be kept secret, as exposing it could allow unauthorized access to your Tinybird resources. It is best practice to store this in an environment variable instead of hardcoding it in your application. #### TINYBIRD\_WORKSPACE¶ `TINYBIRD_WORKSPACE` is the ID of your Workspace. It is used to identify the Workspace that the JWT is generated for. The Workspace ID is included inside the JWT payload. Workspace IDs are UUIDs and can be found using the CLI `tb workspace current` command or from the Tinybird UI. #### NEXT\_PUBLIC\_TINYBIRD\_HOST¶ `NEXT_PUBLIC_TINYBIRD_HOST` is the base URL of the Tinybird API. It is used to construct the URL for the Tinybird API Endpoints. You must use the correct URL for [your Tinybird region](https://www.tinybird.co/docs/docs/api-reference/overview#regions-and-endpoints) . The `NEXT_PUBLIC_` prefix is required for Next.js to expose the variable to the client side. ### token.ts¶ The `token.ts` file contains the logic to generate and sign JWTs. It uses the `jsonwebtoken` library to create the Token. ##### token.ts "use server"; import jwt from "jsonwebtoken"; const TINYBIRD_SIGNING_TOKEN = process.env.TINYBIRD_SIGNING_TOKEN ?? ""; const WORKSPACE_ID = process.env.TINYBIRD_WORKSPACE ?? ""; const PIPE_ID = "top_airlines"; export async function generateJWT() { const next10minutes = new Date(); next10minutes.setTime(next10minutes.getTime() + 1000 * 60 * 10); const payload = { workspace_id: WORKSPACE_ID, name: "my_demo_jwt", exp: Math.floor(next10minutes.getTime() / 1000), scopes: [ { type: "PIPES:READ", resource: PIPE_ID, }, ], }; return jwt.sign(payload, TINYBIRD_SIGNING_TOKEN, {noTimestamp: true}); } This code runs on the backend to generate JWTs without exposing secrets to the user. It pulls in the `TINYBIRD_SIGNING_TOKEN` and `WORKSPACE_ID` from the environment variables. As this example only exposes a single API Endpoint ( `top_airlines.pipe` ), the `PIPE_ID` is hardcoded to its deployed ID. If you had multiple API Endpoints, you would need to create an item in the `scopes` array for each one. The `generateJWT` function handles creation of the JWT. A JWT has various [required fields](https://www.tinybird.co/docs/docs/concepts/auth-tokens#jwt-payload). The `exp` field sets the expiration time of the JWT in the form a UTC timestamp. In this case, it is set to 10 minutes in the future. You can adjust this value to suit your needs. The `name` field is a human-readable name for the JWT. This value is only used for logging. The `scopes` field defines what the JWT can access. This is an array, which allows you create one JWT that can access multiple API Endpoints. In this case, we only have one API Endpoint. Under `scopes` , the `type` field is always `PIPES:READ` for reading data from a Pipe. The `resource` field is the ID or name of the Pipe you want to access. If required, you can also add `fixed_parameters` here to supply parameters to the API Endpoint. Finally, the payload is signed using the `jsonwebtoken` library and the `TINYBIRD_SIGNING_TOKEN`. ### useFetch.tsx¶ The `useFetch.tsx` file contains a custom React hook that fetches data from the Tinybird API using a JWT. It also handles refreshing the token if it expires. 
##### useFetch.tsx import { generateJWT } from "@/server/token"; import { useState } from "react"; export function useFetcher() { const [token, setToken] = useState(""); const refreshToken = async () => { const newToken = await generateJWT(); setToken(newToken); return newToken; }; return async (url: string) => { let currentToken = token; if (!currentToken) { currentToken = await refreshToken(); } const response = await fetch(url + "?token=" + currentToken); if (response.status === 200) { return response.json(); } if (response.status === 403) { const newToken = await refreshToken(); return fetch(url + "?token=" + newToken).then((res) => res.json()); } }; } This code runs on the client side and is used to fetch data from the Tinybird API. It uses the `generateJWT` function from the `token.ts` file to get a JWT. The JWT is stored in the `token` state. Most importantly, it uses the standard `fetch` API to make requests to the Tinybird API. The JWT is passed as a `token` query parameter in the URL. If the request returns a `403` status code, the hook then calls `refreshToken` to get a new JWT and retries the request. However, note that this is a simple implementation and there are other reasons why a request might fail with a `403` status code (e.g., the JWT is invalid, the API Endpoint has been removed, etc.). ### page.tsx¶ The `page.tsx` file contains the main logic for the Next.js page. It is responsible for initiating the call to the Tinybird API Endpoints and rendering the data into a chart. ##### page.tsx "use client"; import { BarChart, Card, Subtitle, Text, Title } from "@tremor/react"; import useSWR from "swr"; import { getEndpointUrl } from "@/utils"; import { useFetcher } from "@/hooks/useFetch"; const REFRESH_INTERVAL_IN_MILLISECONDS = 5000; // five seconds export default function Dashboard() { const endpointUrl = getEndpointUrl(); const fetcher = useFetcher(); let top_airline, latency, errorMessage; const { data } = useSWR(endpointUrl, fetcher, { refreshInterval: REFRESH_INTERVAL_IN_MILLISECONDS, onError: (error) => (errorMessage = error), }); if (!data) return; if (data?.error) { errorMessage = data.error; return; } top_airline = data.data; latency = data.statistics?.elapsed; return ( Top airlines by bookings Ranked from highest to lowest {top_airline && ( )} {latency && Latency: {latency * 1000} ms} {errorMessage && (

Oops, something happens: {errorMessage}

Check your console for more information

)}
); } It uses [SWR](https://swr.vercel.app/) and the `useFetcher` hook from [useFetch.tsx](https://www.tinybird.co/docs/about:blank#usefetch-tsx) to fetch data from the Tinybird API. When the API Endpoint returns data, it is rendered as bar chart using the `BarChart` component from the `@tremor/react` library. ## Next steps¶ - Read the[ blog post on JWTs](https://www.tinybird.co/blog-posts/jwt-api-endpoints-public-beta) . - Explore more use cases that use this approach, like[ building a real-time, user-facing dashboard](https://www.tinybird.co/docs/docs/use-cases/user-facing-dashboards) . --- URL: https://www.tinybird.co/docs/guides/integrations/integrating-vercel Last update: 2024-07-31T16:54:46.000Z Content: --- title: "Vercel Integration · Tinybird Docs" theme-color: "#171612" description: "In this guide you'll learn how to integrate your Vercel Project with a Tinybird Workspace and sync your Tokens." --- # Integrating Vercel with Tinybird¶ This integration will allow you to link your Tinybird Workspaces with your Vercel projects to sync Tinybird [Static Tokens](https://www.tinybird.co/docs/docs/concepts/auth-tokens#what-should-i-use-tokens-for) into Vercel Environment Variables. This integration makes it easy to use Tinybird as a purpose-built analytics backend from within the Vercel Marketplace. Build any kind of analytics, be it web analytics, structured logging, telemetry, anomaly detection, personalization or anything else you can think of - and never have to worry about infrastructure, scale or security. ## Add the Tinybird integration¶ There's basically two stages to this process - adding Tinybird to your Vercel account to set the overall scope of the integration, and then connecting specific Vercel Projects to Tinybird Workspaces. To get started, you can either search for Tinybird in the Vercel integration marketplace, which looks like this: <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fguides-integrating-vercel-1.png&w=3840&q=75) Or you can just use this [link](https://vercel.com/integrations/tinybird/new) to go straight to creating a new Tinybird integration. <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fguides-integrating-vercel-2.png&w=3840&q=75) Either way, click **Add integration** to get started and then select your target Vercel account. <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fguides-integrating-vercel-3.png&w=3840&q=75) Next, select the scope of which Projects the Tinybird integration will be added to - this controls which Projects will be presented as options to connect to a Tinybird Workspace. We're selecting All Projects here, but you can limit it to a specific one if you want. <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fguides-integrating-vercel-4.png&w=3840&q=75) Finally in this section, review and approve the required permissions to allow this integration to function. We need to be able to read Project and Account information, and manage the environment variables in the projects in order to provide Tokens and URLS for connectivity. <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fguides-integrating-vercel-5.png&w=3840&q=75) Great! The integration workflow will launch straight into connecting a Project to a Workspace, but you can come back to it later if you want via the Integrations dashboard in your Vercel Account ## Integrate a Vercel project¶ You can set up a new Workspace integration from the Vercel Account integrations control panel by selecting the Tinybird integration, and then the **Configure** button. 
Note that you'll already be in this workflow the first time you add the integration to the account. <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fguides-integrating-vercel-6.png&w=3840&q=75) Once in this control panel, you'll be able to review any integrations present, for now let's add our first one. <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fguides-integrating-vercel-7.png&w=3840&q=75) First in this workflow, select which Tinybird region you want to connect to <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fguides-integrating-vercel-8.png&w=3840&q=75) Next, select at least one Vercel Project to include - they'll all get the Environment variables added <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fguides-integrating-vercel-9.png&w=3840&q=75) Now we'll need a Tinybird Workspace to integrate with - you can either pick one you already have, or we can create a shiny new one by popping out into the Tinybird UI. <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fguides-integrating-vercel-10.png&w=3840&q=75) Now to select which Static Tokens you want to have sync'd into the Vercel Projects as Environment Variables. If you don't see the Token you want in the list, you can pop out into the Tinybird UI and create one with the scope you want. If this is your first time, you can just use the default admin Token to get started with and reduce the scope later. <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fguides-integrating-vercel-11.png&w=3840&q=75) And that's it! Now your Project(s) will have these Tokens available ready to use. Note that the Environment variables follow a convenient naming format `TB__` In your Tinybird UI, at the bottom of the main dropdown menu, you'll see a new option for Integrations - You can review your various integrations here. If you want to update and Integration, rerun the workflow by clicking the **Add project** button. <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fguides-integrating-vercel-14.png&w=3840&q=75) And if you head to your Tokens page within the Tinybird UI, you'll see a handy notification if that Token is being actively synchronized. <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fguides-integrating-vercel-12.png&w=3840&q=75) ## Removing the integration¶ It's cool, we can just be friends. To remove Tinybird from your Vercel Account, go to the [Integrations management page](https://vercel.com/dashboard/integrations) . At the bottom you'll see the Remove integration button. <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fguides-integrating-vercel-13.png&w=3840&q=75) This will remove the Integration and environment variables, but note that your Tinybird Workspaces and Tokens will not be removed. ## BYO Web Analytics¶ Well now you might be thinking, what can I do with this shiny new integration? You're probably familiar with the built-in Vercel analytics - we've made a starter kit to get you going on expanding into your own custom analysis of your site traffic. Give a go [here!](https://www.tinybird.co/docs/docs/starter-kits/web-analytics) --- URL: https://www.tinybird.co/docs/guides/integrations/team-integration-governance Last update: 2024-07-31T16:54:46.000Z Content: --- title: "Team integration and data governance · Tinybird Guides · Tinybird Docs" theme-color: "#171612" description: "In this guide you'll learn about how different teams work with Tinybird, and how we support your team to manage data." 
--- # Team integration and data governance¶ In this guide you'll learn about how different teams in a single organization usually work with Tinybird, and how data is best managed and governed. Tinybird supports a wide range of industries and products - and this spectrum expands every day. Our customers organize themselves and their businesses in different ways, but there are over-arching principles you can adopt and adapt. Knowing how to integrate your team with Tinybird (and vice versa) is important to getting the most value out of the platform. ## Foundational concepts¶ ### What you'll need for this guide¶ You don't need an active Workspace, you just to be familiar with the following [Tinybird concepts](https://www.tinybird.co/docs/docs/core-concepts): - Data Source: Where data is ingested and stored - Pipe: How data is transformed - Workspace: How data projects are organized, containing Data Sources and Pipes - Shared Data Source: A Data Source shared between Workspaces - Roles: Each Workspace has "Admin", "Guest", "Viewer" roles - Organizations: Tinybird Enterprise customers with multiple Workspaces can view/monitor/manage them in their Organization Bringing it all together: An Organization has multiple Workspaces. Each Workspace ingests data from a Data Source/Sources, and each Data Source can provide data to multiple Workspaces. Within a Workspace, after the data is ingested it gets transformed by Pipes using SQL logic. Individual members of each Workspace are assigned roles, managed at the Organization level, that give them different levels of access to the data. ### What Tinybird is (and isn't)¶ Tinybird is not a data governance tool, but - on top of all the other cool things it does - it provides a way to **manage who uses the data** . This also helps ensure data quality and enables visibility about everything that’s happening at the Organization level. ## Roles and responsibilities¶ It's not possible to provide a single, clear-cut "Data Engineers do X, and Software Devs do Y" - it really depends on the company structure, preferences, and Tinybird use case. There is no single solution, and sometimes it can be a controversial topic. The good news, though, is that there are a lot of possibilities and your use case is likely to be fairly straightforward. We know, because we have customers who do it every day! (If it turns out to be unusual, don't worry: We have an amazing Customer Support team to help you iron out any issues and get you going *fast* 🚀). ### Responsibilities: Who does what?¶ It's not really about job roles or professional personas: It's about what the **different responsibilities** are. Let's step back for a moment. Tinybird allows you to **share Data Sources across Workspaces** . This means you can create Workspaces that map your organization, and not have to duplicate the Data Sources. In general, most Tinybird users have an ingestion Workspace where the owners of the data ingest the data, clean the data, and prepare it for onward consumption. They then share these ready-to-use Data Sources with other internal teams using the Tinybird feature [Sharing Data Sources](https://www.tinybird.co/docs/docs/concepts/workspaces#sharing-data-sources-between-workspaces). You can have as many ingestion Workspaces as you need; bigger organizations group their Workspaces by domain, some organizations group them by team. 
For instance, in a bank, you'd find different teams managing their own data and therefore several ingestion Workspaces where data is curated and exposed to other teams. In this case, each team maps to a domain: <-figure-> ![Diagram showing each ingestion team mapping to a specific domain](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fguide-team-integration-governance-team-1.png&w=3840&q=75) However, in other companies where data is centralized and managed by "the data platform", you'd find just one ingestion Workspace. Here, all the data is ingested and shared with other onward Workspaces where specific domain users build their own use cases: <-figure-> ![Diagram showing each ingestion team mapping to a specific domain](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fguide-team-integration-governance-team-2.png&w=3840&q=75) Some organizations rely on a hybrid solution, where certain data is provided by the data platform but each domain group also ingest their own data: <-figure-> ![Diagram showing each ingestion team mapping to a specific domain](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fguide-team-integration-governance-team-3.png&w=3840&q=75) Whatever your approach, it's an established pattern to have an ingestion or "data-platform" Workspace or team who own ingestion and data preparation, and share the desired Data Sources with other teams. These downstream, domain-specific teams then create the Pipe logic specific to their own area (usually in a Workspace specifically for that domain). That way, the responsibilities reflect a manageable, clear separation of concerns. ## Enforcing data governance¶ No matter which industry you're in, good data governance is essential. Tinybird isn't a data governance tool, but is built to support you in the following ways: - ** Availability** is assured by the platform's uptime SLA (see[ Terms and conditions](https://www.tinybird.co/terms-and-conditions) ). Tinybird wants all your teams to be able to access all the data they need, any time they need. You can monitor availability using our monitoring tools (see[ Monitoring](https://www.tinybird.co/docs/about:blank#monitoring) below), which includes but is not limited to monitoring ingestion, API Endpoints, and quarantined rows. Compared to other tools, Tinybird offers a really straightforward way to reingest quarantined rows and maintain Materialized Views to automatically reingest data, which we know is a problem for users wanting to maintain maximum data availability. - ** Control** over data access is managed through a single Organization page in Tinybird. You can enforce the principle of least privilege by assigning different roles to Workspace members, and easily check data quality and consumption using Tinybird's Service Data Sources. Tinybird also supports schema evolution and you can[ keep multiple schema versions running](https://www.tinybird.co/docs/docs/guides/ingesting-data/iterate-a-data-source) at the same time so consumers can adjust at their own pace. - ** Usability** is maximized by having ingestion Workspaces. They allow you to share cleaned, curated data, with specific and adjusted schemas giving consumers precisely what they need. Workspace members have the flexibility to create as many Workspaces as they need, and[ use the Playground feature](https://www.tinybird.co/docs/docs/query/overview#use-the-playground) to sandbox new ideas. 
- ** Consistency** is a similar topic: Data owners have responsibility over what they want to share with others. We cannot block teams from ingesting in their own Workspaces but the Organization has the tools to monitor who (which Workspace) is ingesting data. - ** Data integrity/quality** , especially at scale and at speed, is simply essential. Just like availability, it's a perfect use case for leveraging Tinybird's monitoring capabilities (see[ Additional ecosystem tools](https://www.tinybird.co/docs/about:blank#additional-ecosystem-tools) below). Ingestion teams can build Pipes to monitor everything about their inbound data and create alerts. These alerts can be technical or business-related - see the[ Monitor your ingestion](https://www.tinybird.co/docs/docs/guides/monitoring/monitor-your-ingestion) guide for an example. - ** Data security** : This information is available at the top-level Organizations page and also in individual Workspaces. ## Additional ecosystem tools¶ Check the [limits page](https://www.tinybird.co/docs/docs/support/limits) for limits on ingestion, queries, API Endpoints, and more. Tinybird is built around the idea of data that changes or grows continuously. We provide additional tools (as standard) as part of Tinybird - more "process governance" tools than data governance, but still very useful. They help you get insights on and monitor your Workspaces, data, and resources. ### Operations log¶ The [Operations log](https://www.tinybird.co/docs/docs/monitoring/health-checks#operations-log) shows information on each **individual Data Source** : Its size, the number of rows, the number of rows in the quarantine Data Source (if any) and when it was last updated. The Operations log contains details of the events for the Data Source, which are displayed as the results of the query. It allows you to see every single call made to the system (API call, ingestion, jobs). This is really helpful if you're concerned about one specific Data Source, and you want to see under the hood. ### Monitoring¶ Enterprise customers can also use the [Organizations UI](https://www.tinybird.co/docs/docs/monitoring/organizations) for managing Workspaces and Members, and monitoring their entire consolidated Tinybird consumption in one place. For example, you can track costs and usage for each individual Workspace. ### Testing¶ To ensure that all production load is efficient and accurate, all Tinybird API Endpoints that you create in your Workspaces can be tested before going to Production. You can do this by [using version control](https://www.tinybird.co/docs/docs/production/overview) - just like how you manage your codebase. ### Alerts and health checks¶ To ensure everything is working as expected once you're in production, any team can create alerts and health checks on top of Tinybird's [Service Data Sources](https://www.tinybird.co/docs/docs/monitoring/service-datasources). ## Next steps¶ - Watch this video to understand how Factorial sets "domain boundaries" and[ organizes their teams](https://youtu.be/8rctUKRXcdw?t=574) . - Build something fast and fun - follow the[ quick start tutorial](https://www.tinybird.co/docs/docs/quick-start) . --- URL: https://www.tinybird.co/docs/guides/migrations/migrate-from-doublecloud Last update: 2024-11-08T07:59:24.000Z Content: --- title: "Migrate from DoubleCloud · Tinybird Docs" theme-color: "#171612" description: "In this guide, you'll learn how to migrate from DoubleCloud to Tinybird, and the overview of how to quickly & safely recreate your setup." 
--- # Migrate from DoubleCloud¶ In this guide, you'll learn how to migrate from DoubleCloud to Tinybird, with an overview of how to quickly and safely recreate your setup. DoubleCloud, a managed data services platform that offers ClickHouse® as a service, is [shutting down operations](https://double.cloud/blog/posts/2024/10/doublecloud-final-update/) . As of October 1, 2024, you cannot create new DoubleCloud accounts, and all existing DoubleCloud services must be migrated by March 1, 2025. Tinybird offers a [managed ClickHouse](https://www.tinybird.co/clickhouse) solution that can be a suitable alternative for existing users of DoubleCloud's ClickHouse service. Follow this guide to learn two approaches for migrating data from your DoubleCloud instance to Tinybird: 1. Option 1: Use the S3 table function to export data from DoubleCloud Managed ClickHouse to Amazon S3, then use the Tinybird S3 Connector to import data from S3. 2. Option 2: Export your ClickHouse tables locally, then import the files into Tinybird using the Datasources API. Wondering how to create a Tinybird account? It's free! [Start here](https://www.tinybird.co/signup) . Need DoubleCloud migration assistance? Please [contact us](https://www.tinybird.co/doublecloud). ## Prerequisites¶ You don't need an active Tinybird Workspace to read through this guide, but it's a good idea to understand the foundational concepts and how Tinybird integrates with your team. If you're new to Tinybird, read the [team integration guide](https://www.tinybird.co/docs/guides/integrations/team-integration-governance). ## At a high level¶ Tinybird is a great alternative to DoubleCloud's managed ClickHouse implementation. Tinybird is a data platform built on top of ClickHouse for data and engineering teams to solve complex real-time, operational, and user-facing analytics use cases at any scale, with end-to-end latency in milliseconds for streaming ingest and high QPS workloads. It offers the same or comparable ClickHouse performance as DoubleCloud, with additional features such as native, managed ingest connectors, multi-node SQL notebooks, and scalable REST APIs for public use or secured with JWTs. Tinybird is a managed platform that scales transparently, requiring no cluster operations, shard management, or worrying about replicas. See how Tinybird is used by industry-leading companies today in the [Customer Stories](https://www.tinybird.co/customer-stories). ## Migrate from DoubleCloud to Tinybird using Amazon S3¶ In this approach, you'll use the `s3` table function in ClickHouse to export tables to an Amazon S3 bucket, and then import them into Tinybird with the S3 Connector. This guide assumes that you already have IAM Roles with the necessary permissions to write to (from DoubleCloud) and read from (to Tinybird) the S3 bucket. ### Export your table to Amazon S3¶ In this guide, we're using a table on our DoubleCloud ClickHouse Cluster called `timeseriesdata` . The data has 3 columns and 1M rows. <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fmigrate-from-doublecloud-1.png&w=3840&q=75) <-figcaption-> Example timeseries data table in DoubleCloud You can export data in your DoubleCloud ClickHouse tables to Amazon S3 with the `s3` table function.
Note: If you don't want to expose your AWS credentials in the query, use a [named collection](https://double.cloud/docs/en/managed-clickhouse/integrations/s3#use-the-s3-table-function).
INSERT INTO FUNCTION s3(
    'https://tmp-doublecloud-migration.s3.us-east-1.amazonaws.com/exports/timeseriesdata.csv',
    'AWS_ACCESS_KEY_ID',
    'AWS_SECRET_ACCESS_KEY',
    'CSV'
)
SELECT * FROM timeseriesdata
SETTINGS s3_create_new_file_on_insert = 1
### Import to Tinybird with the S3 Connector¶ Once your table is exported to Amazon S3, import it to Tinybird using the [Amazon S3 Connector](https://www.tinybird.co/docs/docs/ingest/s3). <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fmigrate-from-doublecloud-2.png&w=3840&q=75) <-figcaption-> Select the S3 Connector in the Tinybird UI. The basic steps for using the S3 Connector are: 1. Define an S3 Connection with an IAM Policy and Role that allow Tinybird to read from S3. Tinybird will automatically generate the JSON for these policies. <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fmigrate-from-doublecloud-3.png&w=3840&q=75) <-figcaption-> Create an S3 Connection with automatically generated IAM policies 2. Supply the file URI (with wildcards as necessary) to define the file(s) containing the contents of your ClickHouse table(s). <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fmigrate-from-doublecloud-4.png&w=3840&q=75) <-figcaption-> Specify the file URI for the files containing your ClickHouse tables 3. Create an On Demand (one-time) sync. 4. Define the schema of the resulting table in Tinybird. You can do this within the S3 Connector UI… <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fmigrate-from-doublecloud-5.png&w=3840&q=75) <-figcaption-> Define the schema of your tables within the Tinybird UI ...or by creating a .datasource file and pushing it to Tinybird. An example .datasource file for the `timeseriesdata` table to match the DoubleCloud schema and create the import job from the existing S3 Connection would look like this:
SCHEMA >
    `tank_id` String,
    `volume` Float32,
    `usage` Float32

ENGINE "MergeTree"
ENGINE_SORTING_KEY "tank_id"

IMPORT_SERVICE 's3_iamrole'
IMPORT_CONNECTION_NAME 'DoubleCloudS3'
IMPORT_BUCKET_URI 's3://tmp-doublecloud-migration/timeseriesdata.csv'
IMPORT_STRATEGY 'append'
IMPORT_SCHEDULE '@on-demand'
5. Tinybird will then create and run a batch import job to ingest the data from Amazon S3 and create a new ClickHouse table that matches your table in DoubleCloud. You can monitor the job from the `datasource_ops_log` Service Data Source. ## Migrate from DoubleCloud to Tinybird using local exports¶ Depending on the size of your tables, you might be able to simply export your tables to a local file using `clickhouse-client` and ingest them to Tinybird directly. ### Export your tables from DoubleCloud using clickhouse-client¶ First, use `clickhouse-client` to export your tables into local files. Depending on the size of your data, you can choose to compress as necessary. Tinybird can ingest CSV (including Gzipped CSV), NDJSON, and Parquet files.
./clickhouse client --host your_doublecloud_host --port 9440 --secure --user your_doublecloud_user --password your_doublecloud_password --query "SELECT * FROM timeseriesdata" --format CSV > timeseriesdata.csv ### Import your files to Tinybird¶ You can drag and drop files into the Tinybird UI… <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fmigrate-from-doublecloud-6.png&w=3840&q=75) <-figcaption-> Drag and drop a file into the Tinybird UI to create a file-based Data Source or upload them using the Tinybird CLI: tb datasource generate timeseriesdata.csv tb push datasources/timeseriesdata.datasource tb datasource append timeseriesdata timeseriesdata.csv Note that Tinybird will automatically infer the appropriate schema from the supplied file, but you may need to change the column names, data types, table engine, and sorting key to match your table in DoubleCloud. ## Migration support¶ If your migration is more complex, involving many or very large tables, materialized views + populates, or other complex logic, please [contact us](https://www.tinybird.co/doublecloud) and we will assist with your migration. ## Tinybird Pricing vs DoubleCloud¶ Tinybird's Build plan is free, with no time limit or credit card required. The Build Plan includes 10 GB of data storage (compressed) and 1,000 published API requests per day. Tinybird's paid plans are available with both infrastructure-based pricing and usage-based pricing. DoubleCloud customers will likely be more familiar with infrastructure-based pricing. For more information about infrastructure-based pricing and to get a quote based on your existing DoubleCloud cluster, please [contact us](https://www.tinybird.co/doublecloud). If you are interested in usage-based pricing, you can learn more about [usage-based billing here](https://www.tinybird.co/docs/docs/support/billing). ### ClickHouse Limits¶ Note that Tinybird takes a different approach to ClickHouse deployment than DoubleCloud. Rather than provide a full interface to a hosted ClickHouse cluster, Tinybird provides a serverless ClickHouse implementation and abstracts the database interface via our [APIs](https://www.tinybird.co/docs/docs/api-reference/overview) , UI, and CLI, only exposing the SQL Editor within our [Pipes](https://www.tinybird.co/docs/docs/concepts/pipes) interface. Additionally, not all ClickHouse SQL functions, data types, and table engines are supported out of the box. You can find a full list of [supported engines and settings here](https://www.tinybird.co/docs/docs/concepts/data-sources#supported-engines-settings) . If your use case requires engines or settings that are not listed, please [contact us](https://www.tinybird.co/doublecloud). ## Useful resources¶ Migrating to a new tool, especially at speed, can be challenging. Here are some helpful resources to get started on Tinybird: - [ Read how Tinybird compares to ClickHouse (especially ClickHouse Cloud)](https://www.tinybird.co/blog-posts/tinybird-vs-clickhouse) . - [ Read how Tinybird compares to other Managed ClickHouse offerings](https://www.tinybird.co/blog-posts/managed-clickhouse-options) . - Join our[ Slack Community](https://www.tinybird.co/community) for help understanding Tinybird concepts. - [ Contact us](https://www.tinybird.co/doublecloud) for migration assistance. ## Next steps¶ If you’d like assistance with your migration, please [contact us](https://www.tinybird.co/doublecloud). - Set up a free Tinybird account and build a working prototype:[ Sign up here](https://www.tinybird.co/signup) . 
- Run through a quick example with your free account: Tinybird[ quick start](https://www.tinybird.co/docs/docs/quick-start) . - Read the[ billing docs](https://www.tinybird.co/docs/docs/support/billing) to understand plans and pricing on Tinybird. Tinybird is not affiliated with, associated with, or sponsored by ClickHouse, Inc. ClickHouse® is a registered trademark of ClickHouse, Inc. --- URL: https://www.tinybird.co/docs/guides/migrations/migrate-from-postgres Last update: 2024-11-13T15:42:17.000Z Content: --- title: "Migrate from Postgres · Tinybird Docs" theme-color: "#171612" description: "In this guide, you'll learn how to migrate events from Postgres to Tinybird so that you can begin building performant, real-time analytics over your event data." --- # Migrate from Postgres¶ In this guide, you'll learn how to migrate events from Postgres to Tinybird so that you can begin building performant, real-time analytics over your event data. Need to create a Tinybird account? It's free! [Start here](https://www.tinybird.co/signup). ## Prerequisites¶ You'll need a [free Tinybird account](https://www.tinybird.co/signup) and a Workspace. ## At a high level¶ Postgres is an incredible general purpose database, and it can even be extended to support columnar functionality for analytics. That said, Tinybird can be a great alternative to Postgres extensions for a few reasons: - It uses ClickHouse® as its underlying database, which is one of the fastest real-time analytics databases in the world. - It provides additional services on top of the database - like an integrated API backend, ingestion load balancing, and native connectors - that will keep you from having to spin up additional services and infrastructure for your analytics service. Tinybird is a data platform for data and engineering teams to solve complex real-time, operational, and user-facing analytics use cases at any scale, with end-to-end latency in milliseconds for streaming ingest and high QPS workloads. It's a SQL-first analytics engine, purpose-built for the cloud, with real-time data ingest and full JOIN support. Native, managed ingest connectors make it easy to ingest data from a variety of sources. SQL queries can be published as production-grade, scalable REST APIs for public use or secured with JWTs. Tinybird is a managed platform that scales transparently, requiring no cluster operations, shard management, or worrying about replicas. See how Tinybird is used by industry-leading companies today in the [Customer Stories](https://www.tinybird.co/customer-stories) hub. ## Follow these steps to migrate from Postgres to Tinybird¶ Below you'll find an example walkthrough migrating 100M rows of events data from Postgres to Tinybird. You can apply the same workflow to your existing Postgres instance. If at any point you get stuck and would like assistance with your migration, contact Tinybird at [support@tinybird.co](https://www.tinybird.co/docs/mailto:support@tinybird.co) or in the [Slack Community](https://www.tinybird.co/docs/docs/community). 
### The Postgres table¶ Suppose you have a table in Postgres that looks like this:
postgres=# CREATE TABLE events (
    id SERIAL PRIMARY KEY,
    timestamp TIMESTAMPTZ NOT NULL,
    user_id TEXT NOT NULL,
    session_id TEXT NOT NULL,
    action TEXT NOT NULL,
    version TEXT NOT NULL,
    payload TEXT NOT NULL
);
The table contains 100 million rows totalling about 15 GB of data:
postgres=# SELECT pg_size_pretty(pg_relation_size('events')) AS size;
 size
-------
 15 GB
(1 row)
The table stores website click events, including an unstructured JSON `payload` column. ### Setup¶ Within your Postgres, create a user with read-only permissions over the table (or tables) you need to export:
postgres=# CREATE USER tb_read_user WITH PASSWORD '';
postgres=# GRANT CONNECT ON DATABASE test_db TO tb_read_user;
postgres=# GRANT USAGE ON SCHEMA public TO tb_read_user;
postgres=# GRANT SELECT ON TABLE events TO tb_read_user;
#### Limits¶ To perform this migration, we'll be running a series of [Copy Jobs](https://www.tinybird.co/docs/publish/copy-pipes) to incrementally migrate the events from Postgres to Tinybird. We break it up into chunks so as to remain under the limits of both Tinybird and Postgres. There are two limits to take into account: 1. [ Copy Pipe limits](https://www.tinybird.co/docs/support/limits#copy-pipe-limits) : Copy Pipes have a default max execution time of 20s for Build plans, 30s for Pro plans, and 30m for Enterprise plans. If you're on a Free or Pro plan and need to temporarily extend your limits to perform the migration, please reach out to us at[ support@tinybird.co](https://www.tinybird.co/docs/mailto:support@tinybird.co) . 2. The max execution time of queries in Postgres. This is controlled by the `statement_timeout` setting. We recommend that you set this value in Postgres equal to (or similar to) the max execution time of the Copy Pipe in Tinybird. For this example, we'll use three minutes:
postgres=# ALTER ROLE tb_read_user SET statement_timeout = '180000'; -- 3 minutes
#### Create a local Tinybird project¶ Install [Tinybird CLI](https://www.tinybird.co/docs/docs/cli/install) , then create a new Data Project:
export TB_ADMIN_TOKEN=
export TB_HOST=https://api.us-east.aws.tinybird.co #replace with your host
tb auth --host $TB_HOST --token $TB_ADMIN_TOKEN
tb init
Create the target Data Source in Tinybird:
touch datasources/events.datasource
Define a schema that matches your Postgres schema, keeping in mind that Tinybird may use different data types. For our example:
# datasources/events.datasource
SCHEMA >
    `id` Int32,
    `timestamp` DateTime64(6),
    `user_id` String,
    `session_id` String,
    `action` String,
    `version` String,
    `payload` String

ENGINE "MergeTree"
ENGINE_PARTITION_KEY "toYear(timestamp)"
ENGINE_SORTING_KEY "timestamp, session_id, user_id"
Push the Data Source to the Tinybird server:
tb push datasources/events.datasource
### Backfilling your existing Postgres data¶ We're going to create a parameterized Copy Pipe to perform the initial backfill in chunks. We'll use a script to run the Copy Job on demand. #### Storing secrets in Tinybird¶ Start by adding two secrets to Tinybird using the [Environment Variables API](https://www.tinybird.co/docs/docs/api-reference/environment-variables-api) . This will prevent hard-coded credentials in your Copy Pipe.
Create one for your Postgres username: curl \ -X POST "${TB_HOST}/v0/variables" \ -H "Authorization: Bearer ${TB_ADMIN_TOKEN}" \ -d "type=secret" \ -d "name=tb_read_user" \ -d "value=tb_read_user" And one for the password: curl \ -X POST "${TB_HOST}/v0/variables" \ -H "Authorization: Bearer ${TB_ADMIN_TOKEN}" \ -d "type=secret" \ -d "name=tb_read_password" \ -d "value=" #### Define the Copy Pipe¶ Create a new Pipe: touch pipes/backfill_postgres.pipe And paste the following code, changing the url/port, name, and table name of your Postgres based on your specific setup: NODE migrate SQL > % SELECT * FROM postgresql( 'https://your.postgres.url::port', 'your_postgres_instance_name', 'your_postgres_table name', {{tb_secret('tb_read_user')}}, {{tb_secret('tb_read_password')}}, 'public' ) WHERE timestamp > {{DateTime(from_date, '2020-01-01 00:00:00')}} --adjust based on your data AND timestamp <= {{DateTime(to_date, '2020-01-01 00:00:01')}} --use a small default range TYPE COPY TARGET_DATASOURCE events This uses the [PostgreSQL Table Function](https://www.tinybird.co/docs/docs/ingest/postgresql) to select data from the remote Postgres table. It pushes the timestamp filters down to Postgres, incrementally querying your Postgres table and copying them into your `events` Data Source in Tinybird. Push this Pipe to the server: tb push pipes/backfill_postgres.pipe ### Backfill in one go¶ Depending on the size of your Postgres table, you may be able to perform the migration in a single Copy Job. For example, get the minimum timestamp from Postgres (and the current datetime): postgres=# SELECT min(timestamp) FROM events; min ------------------------ 2023-01-01 00:00:00+00 (1 row) ❯ date -u +"%Y-%m-%d %H:%M:%S" 2024-08-29 10:20:57 And run the Copy Job with those parameters: tb pipe copy run migrate_pg_to_events --param from_date="2023-01-01 00:00:00" --param to_date="2024-08-29 10:20:57" --wait --yes If it succeeds, you'll see something like this: ** Running migrate_pg_to_events ** Copy to 'events' job created: https://api.us-east.aws.tinybird.co/v0/jobs/4dd482f9-168b-44f7-a4c9-d1b64fc9665d ** Copying data [████████████████████████████████████] 100% ** Data copied to 'events' And you'll be able to query the resulting Data Source: tb sql "select count() from events" ------------- | count() | ------------- | 100000000 | ------------- tb sql "select count() as c, action from events group by action order by c asc" --stats ** Query took 0.228730096 seconds ** Rows read: 100,000,000 ** Bytes read: 1.48 GB ----------------------- | c | action | ----------------------- | 19996881 | logout | | 19997421 | signup | | 20000982 | purchase | | 20001649 | view | | 20003067 | click | ----------------------- Note that Copy operations in Tinybird are atomic, so a bulk backfill will either succeed or fail completely with some error. For instance, if the `statement_timeout` in Postgres is not large enough to export the table with a single query, you'll get an error like this: ** Copy to 'copy_migrate_events_from_pg' job created: https://api.us-east.aws.tinybird.co/v0/jobs/ec58749a-f4c3-4302-9236-f8036f0cb67b ** Copying data Error: ** Failed creating copy job: ** Error while running job: There was a problem while copying data: [Error] Query cancelled due to statement timeout in postgres. Make sure you use a user with a proper statement timeout to run this type of query. In this case you can try to increaste the `statement_timeout` or try the backfilling in chunks. 
As a reference, copying 100M rows from Postgres to Tinybird takes about 150s if Postgres and Tinybird are in the same cloud and region. The Tinybird PostgreSQL Table Function internally uses a PostgreSQL `COPY TO` statement. You can tweak some other settings in Postgres if necessary, but it's usually not needed; refer to your Postgres provider or admin if in doubt. ### Backfilling in chunks¶ If you find that you're hitting the limits of either your Postgres or Tinybird's Copy Pipes, you can backfill in chunks. First of all, make sure your Postgres table is indexed by the column you are filtering on, in this case `timestamp`: postgres=# CREATE INDEX idx_events_timestamp ON events (timestamp); postgres=# VACUUM ANALYZE events; And make sure a query like the one sent from Tinybird will use the indexes (see the Index Scan below): postgres=# explain select * from events where timestamp > '2024-01-01 00:00:00' and timestamp <= '2024-01-02 00:00:00'; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------ Index Scan using idx_events_timestamp on events (cost=0.57..607150.89 rows=151690 width=115) Index Cond: (("timestamp" > '2024-01-01 00:00:00+00'::timestamp with time zone) AND ("timestamp" <= '2024-01-02 00:00:00+00'::timestamp with time zone)) JIT: Functions: 2 Options: Inlining true, Optimization true, Expressions true, Deforming true (5 rows) Then run multiple Copy Jobs, adjusting the amount of data copied to stay within your Postgres statement timeout and Tinybird max execution time. This is a trial-and-error process that depends on the granularity of your data. For example, here's a migration script that first tries a full backfill, and if it fails uses daily chunks:

#!/bin/bash

HOST="YOUR_TB_HOST"
TOKEN="YOUR_TB_TOKEN"
PIPE_NAME="backfill_postgres"
FROM_DATE="2023-01-01 00:00:00"
TO_DATE="2024-08-31 00:00:00"
LOG_FILE="pipe_copy.log"

run_command() {
    local from_date="$1"
    local to_date="$2"
    echo "Copying from $from_date to $to_date" | tee -a $LOG_FILE
    if output=$(tb --host $HOST --token $TOKEN pipe copy run $PIPE_NAME --param from_date="$from_date" --param to_date="$to_date" --wait --yes 2>&1); then
        echo "Success $from_date - $to_date" | tee -a $LOG_FILE
        return 0
    else
        echo "Error $from_date - $to_date" | tee -a $LOG_FILE
        echo "Error detail: $output" | tee -a $LOG_FILE
        return 1
    fi
}

iterate_chunks() {
    local from_date="$1"
    local to_date="$2"
    local current_from="$from_date"
    local next_to=""

    while [[ "$(date -d "$current_from" +"%s")" -lt "$(date -d "$to_date" +"%s")" ]]; do
        # End of current day (23:59:59)
        next_to=$(date -d "$current_from +1 day -1 second" +"%Y-%m-%d")" 23:59:59"

        # Adjust next_to if it's bigger than to_date
        if [[ "$(date -d "$next_to" +"%s")" -ge "$(date -d "$to_date" +"%s")" ]]; then
            next_to="$to_date"
        fi

        # Create copy job for one single day
        if ! run_command "$current_from" "$next_to"; then
            echo "Error processing $current_from to $next_to"
            return 1
        fi

        # Go to next day (starting at 00:00:00)
        current_from=$(date -d "$(date -d "$current_from" +'%Y-%m-%d') +1 day $(date -d "$current_from" +'%H:%M:%S')" +'%Y-%m-%d %H:%M:%S')
    done
}

# Step 1: Try full backfill
echo "Running full backfill..." | tee -a $LOG_FILE
if ! run_command "$FROM_DATE" "$TO_DATE"; then
    echo "Full backfill failed, iterating in daily chunks..." | tee -a $LOG_FILE
    iterate_chunks "$FROM_DATE" "$TO_DATE"
fi
echo "Process completed." | tee -a $LOG_FILE
Using either a full backfill or backfilling in chunks, you can successfully migrate your data from Postgres to Tinybird. ### Syncing new events from Postgres to Tinybird¶ The next step is keeping your Tinybird Data Source in sync with events in your Postgres as new events arrive. The steps below show you how to use Tinybird's PostgreSQL Table Function and scheduled Copy Jobs to continually sync data from Postgres to Tinybird. However, you should consider sending future events to Tinybird directly using either the [Events API](https://www.tinybird.co/docs/docs/ingest/events-api) or another streaming Data Source connector, as this will be more resource-efficient (and more real-time). #### Create the incremental Copy Pipe¶ Create another Copy Pipe to perform the incremental syncs: touch pipes/sync_events_from_pg.pipe Paste in this code, again updating your Postgres details as well as the desired schedule to sync. Note that the Copy Pipe limits apply here. NODE sync_from_pg SQL > % SELECT * FROM postgresql( 'your.postgres.url:port', 'your_postgres_instance_name', 'your_postgres_table_name', {{tb_secret('tb_read_user')}}, {{tb_secret('tb_read_password')}}, 'public' ) WHERE timestamp > (SELECT max(timestamp) FROM events) TYPE COPY TARGET_DATASOURCE events COPY_SCHEDULE */5 * * * * Push this to the Tinybird server: tb push pipes/sync_events_from_pg.pipe It's important to first complete the backfill operation before pushing the sync Pipe. The sync Pipe uses the latest timestamp in the Tinybird Data Source to perform a filtered select from Postgres. Failure to backfill will result in a full scan of your Postgres table on your configured schedule. Once you've pushed this Pipe, Tinybird will sync with your Postgres updates based on the schedule you set. ## Next steps¶ If you’d like assistance with your migration, contact Tinybird at [support@tinybird.co](https://www.tinybird.co/docs/mailto:support@tinybird.co) or in the [Community Slack](https://www.tinybird.co/docs/docs/community). - Set up a free Tinybird account and build a working prototype:[ Sign up here](https://www.tinybird.co/signup) . - Run through a quick example with your free account: Tinybird[ quick start](https://www.tinybird.co/docs/docs/quick-start) . - Read the[ billing docs](https://www.tinybird.co/docs/docs/support/billing) to understand plans and pricing on Tinybird. Tinybird is not affiliated with, associated with, or sponsored by ClickHouse, Inc. ClickHouse® is a registered trademark of ClickHouse, Inc. --- URL: https://www.tinybird.co/docs/guides/migrations/migrate-from-rockset Last update: 2024-11-13T15:42:17.000Z Content: --- title: "Migrate from Rockset · Tinybird Docs" theme-color: "#171612" description: "In this guide, you'll learn how to migrate from Rockset to Tinybird, and the overview of how to quickly & safely recreate your setup." --- # Migrate from Rockset¶ In this guide, you'll learn how to migrate from Rockset to Tinybird, and get an overview of how to quickly and safely recreate your setup. Rockset will [no longer be active](https://docs.rockset.com/documentation/docs/faq) after September 30th, 2024. This guide explains the parallels between Rockset and Tinybird features, and how to migrate to Tinybird. Wondering how to create an account? It's free! [Start here](https://www.tinybird.co/signup). ## Prerequisites¶ You don't need an active Tinybird Workspace to read through this guide, but it's a good idea to understand the foundational concepts and how Tinybird integrates with your team.
If you're new to Tinybird, read the [team integration guide](https://www.tinybird.co/docs/guides/integrations/team-integration-governance). ## At a high level¶ Tinybird is a great alternative to Rockset's analytical capabilities. Tinybird is a data platform for data and engineering teams to solve complex real-time, operational, and user-facing analytics use cases at any scale, with end-to-end latency in milliseconds for streaming ingest and high QPS workloads. It's a SQL-first analytics engine, purpose-built for the cloud, with real-time data ingest and full JOIN support. Native, managed ingest connectors make it easy to ingest data from a variety of sources. SQL queries can be published as production-grade, scalable REST APIs for public use or secured with JWTs. Tinybird is a managed platform that scales transparently, requiring no cluster operations, shard management or worrying about replicas. See how Tinybird is used by industry-leading companies today in the [Customer Stories](https://www.tinybird.co/customer-stories) hub. ## Concepts¶ A lot of concepts are the same between Rockset and Tinybird, and there are a handful of others that have a 1:1 mapping. In Tinybird: - Data Source: Where data is ingested and stored - Pipe: How data is transformed - Workspace: How data projects are organized, containing Data Sources and Pipes - Shared Data Source: A Data Source shared between Workspaces - Roles: Each Workspace has "Admin", "Guest", "Viewer" roles - Organizations: Tinybird Enterprise customers with multiple Workspaces can view/monitor/manage them in their Organization Bringing it all together: An Organization has multiple Workspaces. Each Workspace ingests data from a Data Source/Sources, and each Data Source can provide data to multiple Workspaces. Within a Workspace, after the data is ingested it gets transformed by Pipes using SQL logic. Individual members of each Workspace are assigned roles, managed at the Organization level, that give them different levels of access to the data. ### Key concept comparison¶ #### Data Sources¶ Super similar. Rockset and Tinybird both support ingesting data from many types of data sources. You ingest into Tinybird and create a Tinybird **Data Source** that you then have control over - you can iterate the schema, monitor your ingestion, and more. See the [Data Sources docs](https://www.tinybird.co/docs/docs/concepts/data-sources). #### Workspaces¶ Again, very similar. In Rockset, Workspaces contain resources like Collections, Aliases, Views, and Query Lambdas. In Tinybird, **Workspaces** serve the same purpose (holding resources), and you can also share Data Sources between *multiple* Workspaces. Enterprise users monitor and manage Workspaces using the [Organizations feature](https://www.tinybird.co/docs/docs/monitoring/organizations) . See the [Workspace docs](https://www.tinybird.co/docs/docs/concepts/workspaces#what-is-a-workspace). #### Ingest Transformations¶ These are analogous to Tinybird's **Pipes** . It's where you transform your data. The difference is that Rockset does this on initial load (on raw data), whereas Tinybird lets you create and manage a Data Source first, then transform it however you need. See the [Pipes docs](https://www.tinybird.co/docs/docs/concepts/pipes). #### Views¶ Similar to Tinybird's **Nodes** - the modular, chainable "bricks" of SQL queries that compose a Pipe. Like Views, Nodes can reference resources like other Nodes, Pipes, Data Sources, and more. 
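To make the parallel concrete, here's a minimal sketch of two chained Nodes in a .pipe file (hypothetical Pipe, Node, and Data Source names, not taken from this guide):

NODE filter_purchases
SQL >
    SELECT user_id, timestamp
    FROM events
    WHERE action = 'purchase'

NODE daily_purchases
SQL >
    SELECT toDate(timestamp) AS day, count() AS purchases
    FROM filter_purchases
    GROUP BY day

The second Node queries the first by name, much like a View can build on another View.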
See the [Pipes > Nodes docs](https://www.tinybird.co/docs/docs/concepts/pipes#nodes). #### Rollups¶ The Tinybird equivalent of rollups is **Materialized Views** . Materialized Views give you a way to pre-aggregate and pre-filter large Data Sources incrementally, adding simple logic using SQL to produce a more relevant Data Source with significantly fewer rows. Put simply, Materialized Views shift computational load from query time to ingestion time, so your API Endpoints stay fast. See the [Materialized Views docs](https://www.tinybird.co/docs/docs/publish/materialized-views/overview). #### Query Lambdas¶ The Tinybird equivalent of Query Lambdas is **API Endpoints** . You can publish the result of any SQL query in your Tinybird Workspace as an HTTP API Endpoint. See the [API Endpoint docs](https://www.tinybird.co/docs/docs/publish/api-endpoints/overview). ### Schemaless ingestion¶ You can do schemaless/variable schema event ingestion on Tinybird by storing the whole JSON in a column. Use the following schema in your Data Source definition and use [JSONExtract functions](https://clickhouse.com/docs/en/sql-reference/functions/json-functions#jsonextract-functions) to parse the result afterwards. ##### schemaless.datasource SCHEMA > `root` String `json:$` ENGINE "MergeTree" If your data has some common fields, be sure to extract them and add them to the sorting key. It's definitely possible to do schemaless, but having a defined schema is a great idea. Tinybird provides you with an easy way to manage your schema [using .datasource schema files](https://www.tinybird.co/docs/docs/ingest/overview#create-your-schema). Read the docs on using the [JSONPath syntax in Tinybird](https://www.tinybird.co/docs/guides/ingesting-data/ingest-ndjson-data#jsonpaths-and-the-root-object) for more information. ## Ingest data and build a POC¶ Tinybird allows you to ingest your data from a variety of sources, then create Tinybird Data Sources in your Workspace that can be queried, published, materialized, and more. Just like Rockset, Tinybird supports ingestion from: - Data streams (Kafka, Kinesis). - OLTP databases (DynamoDB, MongoDB, MySQL, PostgreSQL). - Data lakes (S3, GCS). A popular option is connecting DynamoDB to Tinybird. Follow [the guide here](https://www.tinybird.co/docs/docs/ingest/dynamodb) or pick another source from the side nav under "Ingest". ## Useful resources¶ Migrating to a new tool, especially at speed, can be challenging. Here are some helpful resources to get started on Tinybird: - Set up a[ DynamoDB Data Source](https://www.tinybird.co/docs/docs/ingest/dynamodb) to start streaming data today. - Read the blog post[ "Migrating from Rockset? See how Tinybird features compare"](https://www.tinybird.co/blog-posts/migrating-from-rockset-feature-comparison) . - Read the blog post[ "A practical guide to real-time CDC with MongoDB"](https://www.tinybird.co/blog-posts/mongodb-cdc) . ## Billing and limits¶ Read the [billing docs](https://www.tinybird.co/docs/docs/support/billing) to understand how Tinybird charges for different data operations.
Remember, [UI usage is free](https://www.tinybird.co/docs/docs/support/billing#exceptions) (Pipes, Playgrounds, Time Series - anywhere you can hit a "Run" button) as is anything on a [Build plan](https://www.tinybird.co/docs/docs/plans) so get started today for free and iterate ***fast***. Check the [limits page](https://www.tinybird.co/docs/docs/support/limits) for limits on ingestion, queries, API Endpoints, and more. ## Next steps¶ If you’d like assistance with your migration, contact Tinybird at [support@tinybird.co](https://www.tinybird.co/docs/mailto:support@tinybird.co) or in the [Community Slack](https://www.tinybird.co/docs/docs/community). - Set up a free Tinybird account and build a working prototype:[ Sign up here](https://www.tinybird.co/signup) . - Run through a quick example with your free account: Tinybird[ quick start](https://www.tinybird.co/docs/docs/quick-start) . - Read the[ billing docs](https://www.tinybird.co/docs/docs/support/billing) to understand plans and pricing on Tinybird. Tinybird is not affiliated with, associated with, or sponsored by ClickHouse, Inc. ClickHouse® is a registered trademark of ClickHouse, Inc. --- URL: https://www.tinybird.co/docs/guides/monitoring/analyze-endpoints-performance Last update: 2024-11-18T21:34:21.000Z Content: --- title: "Analyze the performance of your API Endpoints · Tinybird Docs" theme-color: "#171612" description: "Learn more about how to measure the performance of your API Endpoints" --- # Analyze the performance of your API Endpoints¶ This guide explains how to use `pipe_stats` and `pipe_stats_rt` , giving several practical examples that show what you can do with these Service Data Sources. Tinybird is all about speed. It gives you tools to make real-time queries really quickly, and then even more tools to optimize those queries to make your API Endpoints faster. Of course, before you optimize, you need to know *what* to optimize. That's where the Tinybird `pipe_stats` and `pipe_stats_rt` Data Sources come in. Whether you're trying to speed up your API Endpoints, track error rates, or reduce scan size and subsequent usage costs, `pipe_stats` and `pipe_stats_rt` let you see how your API Endpoints are performing, so you can find performance offenders and get them up to speed. These Service Data Sources provide performance data and consumption data for every single request, plus you can filter and sort results by Tokens to see who is accessing your API Endpoints and how often. Confused about the difference between `pipe_stats_rt` and `pipe_stats`? `pipe_stats` provides **aggregate stats** - like average request duration and total read bytes - per day, whereas `pipe_stats_rt` offers the same information but without aggregation. Every single request is stored in `pipe_stats_rt` . The examples in this guide use `pipe_stats_rt` , but you can use the same logic with `pipe_stats` if you need more than 7 days of lookback. ## Prerequisites¶ You need a high-level understanding of Tinybird's [Service Data Sources](https://www.tinybird.co/docs/docs/monitoring/service-datasources). ### Understand the core stats¶ In particular, this guide focuses on the following fields in the `pipe_stats_rt` Service Data Source: - `pipe_name` (String): Pipe name as returned in Pipes API. - `duration` (Float): the duration in seconds of each specific request. - `read_bytes` (UInt64): How much data was scanned for this particular request. - `read_rows` (UInt64): How many rows were scanned. 
- `token_name` (String): The name of the Token used in a particular request. - `status_code` (Int32): The HTTP status code returned for this particular request. You can find the full schema for `pipe_stats_rt` in the [API docs](https://www.tinybird.co/docs/docs/monitoring/service-datasources#tinybird-pipe-stats-rt). When a request is made through the Query API, the value of `pipe_name` in the event is "query_api". The following section covers how to monitor query performance when using the Query API. ### Using the Query API with metadata parameters¶ If you are using the [Query API](https://www.tinybird.co/docs/docs/api-reference/query-api) to run queries in Tinybird, you can still track query performance using the `pipe_stats_rt` Service Data Source. You add metadata related to the query as request parameters in addition to any existing parameters already used in your query. For example, when running a query against the Query API you can leverage a parameter called `app_name` to track all queries from the "explorer" application. Here's an example using curl: ##### Using the metadata parameters with the Query API curl -X POST \ -H "Authorization: Bearer " \ --data "% SELECT * FROM events LIMIT {{Int8(my_limit, 10)}}" \ "https://api.tinybird.co/v0/sql?my_limit=10&app_name=explorer" In your monitoring queries, you can then use the `parameters` attribute to access those queries where `app_name` equals "explorer": ##### Simple Parameterized Query SELECT * FROM tinybird.pipe_stats_rt WHERE parameters['app_name'] = 'explorer' ## Example 1: Detect errors in your API Endpoints¶ If you want to monitor the number of errors per Endpoint over the last hour, you could do the following: ##### Errors in the last hour SELECT pipe_name, status_code, count() as error_count FROM tinybird.pipe_stats_rt WHERE status_code >= 400 AND start_datetime > now() - INTERVAL 1 HOUR GROUP BY pipe_name, status_code ORDER BY status_code desc If you have errors, this would return something like: ##### OUTPUT Pipe_a | 404 | 127 Pipe_b | 403 | 32 With one query, you can see in real time if your API Endpoints are experiencing errors, and investigate further if so. ## Example 2: Analyze the performance of API Endpoints over time¶ You can also use `pipe_stats_rt` to track how long API calls take using the `duration` field, and see how that changes over time. **API performance is directly related to how much data you are reading per request** , so if your API Endpoint is dynamic, request duration varies. For instance, it might receive start and end date parameters that alter how long a period is being read. ##### API Endpoint performance over time SELECT toStartOfMinute(start_datetime) t, pipe_name, avg(duration) avg_duration, quantile(.95)(duration) p95_duration, count() requests FROM tinybird.pipe_stats_rt WHERE start_datetime >= {{DateTime(start_date_time, '2022-05-01 00:00:00', description="Start date time")}} AND start_datetime < {{DateTime(end_date_time, '2022-05-25 00:00:00', description="End date time")}} GROUP BY t, pipe_name ORDER BY t desc, pipe_name <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fanalyze-endpoints-performance-1.png&w=3840&q=75) ## Example 3: Find the Endpoints that process the most data¶ You might want to find Endpoints that repeatedly scan large amounts of data. These are your best candidates for optimization to reduce time and spend.
Here's an example of using `pipe_stats_rt` to find the API Endpoints that have processed the most data as a percentage of all processed data in the last 24 hours: ##### Most processed data last 24 hours WITH ( SELECT sum(read_bytes) FROM tinybird.pipe_stats_rt WHERE start_datetime >= now() - INTERVAL 24 HOUR ) as total, sum(read_bytes) as processed_byte SELECT pipe_id, quantile(0.9)(duration) as p90, formatReadableSize(processed_byte) AS processed_formatted, processed_byte*100/total as percentage FROM tinybird.pipe_stats_rt WHERE start_datetime >= now() - INTERVAL 24 HOUR GROUP BY pipe_id ORDER BY percentage DESC <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fanalyze-endpoints-performance-2.png&w=3840&q=75) ### Modifying to include consumption of the Query API¶ If you use Tinybird's Query API to query your Data Sources directly, you probably want to include in your analysis which queries are consuming the most. Whenever you use the Query API, the field `pipe_name` contains the value `query_api` . The actual query is included as part of the `q` parameter in the `url` field. You can modify the query in the previous section to extract the actual SQL query that's processing the data. ##### Using the Query API WITH ( SELECT sum(read_bytes) FROM tinybird.pipe_stats_rt WHERE start_datetime >= now() - INTERVAL 24 HOUR ) as total, sum(read_bytes) as processed_byte SELECT if(pipe_name = 'query_api', normalizeQuery(extractURLParameter(decodeURLComponent(url), 'q')),pipe_name) as pipe_name, quantile(0.9)(duration) as p90, formatReadableSize(processed_byte) AS processed_formatted, processed_byte*100/total as percentage FROM tinybird.pipe_stats_rt WHERE start_datetime >= now() - INTERVAL 24 HOUR GROUP BY pipe_name ORDER BY percentage DESC <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fanalyze-endpoints-performance-3.png&w=3840&q=75) ## Example 4: Monitor usage of Tokens¶ If you use your API Endpoint with different Tokens, for example to allow different customers to check their own data, you can track and control which Tokens are being used to access these Endpoints. Here's an example that shows, for the last 24 hours, the number and size of requests per Token: ##### Token usage last 24 hours SELECT count() requests, formatReadableSize(sum(read_bytes)) as total_read_bytes, token_name FROM tinybird.pipe_stats_rt WHERE start_datetime >= now() - INTERVAL 24 HOUR GROUP BY token_name ORDER BY requests DESC <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fanalyze-endpoints-performance-4.png&w=3840&q=75) To obtain this information, you can request the Token name ( `token_name` column) or id ( `token` column). Check the [limits page](https://www.tinybird.co/docs/docs/support/limits) for limits on ingestion, queries, API Endpoints, and more. ## Next steps¶ - Want to optimize further? Read[ Monitor your ingestion](https://www.tinybird.co/docs/docs/guides/monitoring/monitor-your-ingestion) . - Learn how to[ monitor jobs in your Workspace](https://www.tinybird.co/docs/docs/monitoring/jobs) . - Monitor the[ latency of your API Endpoints](https://www.tinybird.co/docs/docs/monitoring/latency) . - Learn how to[ build Charts of your data](https://www.tinybird.co/docs/docs/publish/charts) . --- URL: https://www.tinybird.co/docs/guides/monitoring/monitor-your-ingestion Last update: 2024-11-06T08:19:40.000Z Content: --- title: "Monitor ingestion · Tinybird Docs" theme-color: "#171612" description: "In this guide you'll learn more about how to monitor your data source ingestion in Tinybird."
--- # Monitor your ingestion¶ In this guide, you can learn the basics of how to monitor your Data Source ingestion. By being aware of your ingestion pipeline and leveraging Tinybird's features, you can monitor for any issues with the [Data Flow Graph](https://www.tinybird.co/docs/docs/query/overview#data-flow). Remember: Every Tinybird use case is slightly different. This guide provides guidelines and an example scenario. If you have questions or want to explore more complicated ingestion monitoring scenarios, for instance looking for outliers by using the z-score or other anomaly detection processes, contact Tinybird at [support@tinybird.co](https://www.tinybird.co/docs/mailto:support@tinybird.co) or in the [Community Slack](https://www.tinybird.co/docs/docs/community). ## Prerequisites¶ You don't need an active Workspace to follow this guide, only an awareness of the [core Tinybird concepts](https://www.tinybird.co/docs/docs/core-concepts). ## Key takeaways¶ 1. Understand and visualize your data pipeline. 2. Leverage the Tinybird platform and tools. 3. Be proactive: Build alerts. ### Understand your data pipeline and flow¶ The first step to monitoring your ingestion to Tinybird is to understand **what** you're monitoring at a high level. As a data team, the most common complaint you may get from your stakeholders is “the data is outdated”, closely followed by "my dashboard is broken" (but that's another matter). When stakeholders complain about outdated data, you and your data engineers start investigating, putting on the intellectual diving suit and checking the data pipelines upstream until you find the problem. Understanding how data flows through those pipelines from the origin to the end is essential, and you should always know what your data flow "landscape" looks like. ### Use the tools¶ Tinybird provides several tools to help you: - The[ Data Flow Graph](https://www.tinybird.co/docs/docs/query/overview#data-flow) is Tinybird’s data lineage diagram. It visualizes how data flows within your project. It shows all the levels of dependencies, so you can see how all your Pipes, Data Sources, and Materialized Views connect. - [ Service Data Sources](https://www.tinybird.co/docs/docs/monitoring/service-datasources) are logs that allow you to keep track of almost everything happening data-wise within your system. - Use[ Time Series](https://www.tinybird.co/docs/docs/query/overview#time-series) in combination with Service Data Sources to visualize data ingestion trends and issues over time. ### Build alerts¶ Lastly, you can create a personalized alert system by integrating your Pipes and Endpoints that point to certain key Service Data Sources with third-party services. ## Example scenario: From spotting birds to spotting errors¶ ### Overview¶ In this example, a user with a passion for ornithology (the study of birds 🤓) has built a Workspace called `bird_spotter` ( [GitHub repository here](https://github.com/tinybirdco/bird-spotter-project) ). They're using it to analyze the number of birds they spot in their garden and when out on hikes. It uses Tinybird’s high frequency ingestion (Events API) and an updated legacy table in BigQuery, so the Data Sources are as follows: 1.
`bird_records` : A dataset containing bird viewings describing the time and bird details, which the[ Events API](https://www.tinybird.co/docs/docs/ingest/events-api) populates every day: <-figure-> ![Screenshot showing a dataset populated by ingesting from the Events API](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fguides-monitor-ingestion-dataset-1.png&w=3840&q=75) 1. `birds_by_hour_and_country_from_copy` : An aggregated dataset of the bird views per hour and country, which a[ Copy Pipe](https://www.tinybird.co/docs/docs/publish/copy-pipes) populates every hour: <-figure-> ![Screenshot showing a dataset populated by ingesting from a Copy Pipe](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fguides-monitor-ingestion-dataset-2.png&w=3840&q=75) 1. `tiny_bird_records` : A dataset with a list of tiny birds (e.g. hummingbirds), which Tinybird's[ BigQuery Connector](https://www.tinybird.co/docs/docs/ingest/bigquery) replaces every day: <-figure-> ![Screenshot showing a dataset populated by ingesting a BigQuery connector](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fguides-monitor-ingestion-dataset-3.png&w=3840&q=75) As you can see, the three Data Sources rely on three different methods of ingestion: Appending data using the high frequency API, aggregating and copying, and syncing from BigQuery. To make sure that each of these processes is happening at the scheduled time, and without errors, this user needs to implement some monitoring. ### Monitoring ingestion and spotting errors¶ Remember all those tools Tinybird offers? Here's how this user fits them together: You can filter the **Service Data Source** called `datasources_ops_log` by Data Source and ingestion method. By building a quick **Time Series** , they can immediately see the "shape" of their ingestion: <-figure-> ![Screenshot showing a Time Series](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fguides-monitor-ingestion-time-series.png&w=3840&q=75) It shows yellow bars (High-Frequency Ingest) and green bars (BigQuery sync) every day, and blue bars (copy operation) every hour. Now, the user can build a robust system for monitoring. Instead of only focusing on the ingestion method, they can create 3 different Pipes that have specific logic, and expose each Pipe as a queryable Endpoint. Each Endpoint aggregates key information about each ingestion method, and counts and flags errors.
#### Endpoint 1: Check append-hfi operations in bird\_records¶ SELECT toDate(timestamp) as date, sum(if(result = 'error', 1, 0)) as error_count, count() as append_count, if(append_count > 0, 1, 0) as append_flag FROM tinybird.datasources_ops_log WHERE datasource_name = 'bird_records' AND event_type = 'append-hfi' GROUP BY date ORDER BY date DESC <-figure-> ![Screenshot showing SQL Pipe logic](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fguides-monitor-ingestion-endpoint-1.png&w=3840&q=75) #### Endpoint 2: Check copy operations in birds\_by\_hour\_and\_country\_from\_copy¶ SELECT toDate(timestamp) as date, sum(if(result = 'error', 1, 0)) as error_count, count() as copy_count, if(copy_count >= 24, 1, 0) as copy_flag FROM tinybird.datasources_ops_log WHERE datasource_name = 'birds_by_hour_and_country_from_copy' AND event_type = 'copy' GROUP BY date ORDER BY date DESC <-figure-> ![Screenshot showing SQL Pipe logic](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fguides-monitor-ingestion-endpoint-2.png&w=3840&q=75) #### Endpoint 3: Check replace operations in tiny\_bird\_records¶ SELECT toDate(timestamp) as date, sum(if(result = 'error', 1, 0)) as error_count, count() as replace_count, if(replace_count > 0, 1, 0) as replace_flag FROM tinybird.datasources_ops_log WHERE datasource_name = 'tiny_bird_records' AND event_type = 'replace' GROUP BY date ORDER BY date DESC <-figure-> ![Screenshot showing SQL Pipe logic](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fguides-monitor-ingestion-endpoint-3.png&w=3840&q=75) ### Using the output¶ Because these Pipes expose API Endpoints, they can be consumed by any third-party app to build real-time alerts. This could be something like DataDog by following this [helpful integration guide](https://www.tinybird.co/blog-posts/how-to-monitor-tinybird-using-datadog-with-vector-dev) , Grafana using [this plugin](https://www.tinybird.co/docs/docs/guides/integrations/consume-api-endpoints-in-grafana) , PagerDuty, Uptime Robot, or GitHub Actions with a cron job system checking for errors. Check the [limits page](https://www.tinybird.co/docs/docs/support/limits) for limits on ingestion, queries, API Endpoints, and more. ### Example GitHub Actions implementation¶ In the `bird_spotter` example repo, you can see the `scripts` and `workflows` that the user has built: - `ingest.py` and `monitor.py` are Python scripts that run daily. The first ingests data (in this case from a sample CSV) and the second checks whether the append, copy, and sync operations have happened and are error-free. Because this guide is an example scenario, there's a function that randomly chooses not to ingest, so there's always an error present. - `ingest.yml` and `monitor.yml` are YAML files that schedule those daily runs. The output of a daily check would look something like this: INFO:__main__:Alert! Ingestion operation missing. Last ingestion date is not today: 2024-04-16 INFO:__main__:Last copy_count count is equal to 9. All fine! INFO:__main__:Last replace_count count is equal to 1. All fine! INFO:__main__:Alerts summary: INFO:__main__:Append error count: 1 INFO:__main__:Copy error count: 0 INFO:__main__:Replace error count: 0 In this instance, the ingestion script has randomly failed to append new data, triggering an alert that the user can action. In contrast, copy operations and replace counts have run as expected: 9 copies and 1 BigQuery sync occurred since 00:00.
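If you'd rather expose a single Endpoint than three, a variation on the queries above (a sketch reusing the same `datasources_ops_log` fields; the per-method flag logic is omitted here) can check all three Data Sources at once:

##### Check all three ingestion methods at once
SELECT
    datasource_name,
    toDate(timestamp) AS date,
    countIf(result = 'error') AS error_count,
    count() AS operation_count
FROM tinybird.datasources_ops_log
WHERE datasource_name IN ('bird_records', 'birds_by_hour_and_country_from_copy', 'tiny_bird_records')
  AND event_type IN ('append-hfi', 'copy', 'replace')
GROUP BY datasource_name, date
ORDER BY date DESC, datasource_name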
## Example scenario: Detect out-of-sync Data Sources¶ ### Overview¶ Some Tinybird Connectors like BigQuery or Snowflake use **async jobs** to keep your Data Sources up to date. These jobs produce records with the result sent to the `datasources_ops_log` Service Data Source, both for successful and failed runs. The following example configures a new Tinybird Endpoint that reports Data Sources that are out of sync. It's then possible to leverage that data in your monitoring tool of choice (Grafana, Datadog, UptimeRobot, etc.). #### Endpoint: Get out of sync Data Sources using datasources\_ops\_log¶ To get the Data Sources that haven't been successfully updated in the last hour, check their sync jobs results in the `datasources_ops_log`: select datasource_id, argMax(datasource_name, timestamp) as datasource_name, max(case when result = 'ok' then timestamp end) as last_successful_sync from tinybird.datasources_ops_log where arrayExists(x -> x in ('bigquery','snowflake'), Options.Values) and toDate(timestamp) >= today() - interval 30 days and result = 'ok' group by datasource_id having max(event_type = 'delete') = false and last_successful_sync < now() - interval 1 hour ## Next steps¶ - Read the in-depth docs on Tinybird's[ Service Data Sources](https://www.tinybird.co/docs/docs/monitoring/service-datasources) . - Learn how to[ Optimize your data project](https://www.tinybird.co/docs/docs/guides/optimizations/overview) . - Learn about the difference between log* analytics* and log* analysis* in the blog[ "Log Analytics: how to identify trends and correlations that Log Analysis tools cannot"](https://www.tinybird.co/blog-posts/log-analytics-how-to-identify-trends-and-correlations-that-log-analysis-tools-cannot) . --- URL: https://www.tinybird.co/docs/guides/optimizations/opt101-detect-inefficiencies Last update: 2024-11-07T09:31:09.000Z Content: --- title: "How to detect inefficient resources · Tinybird Docs" theme-color: "#171612" description: "In this guide, you'll learn where to look when looking to optimize your Tinybird project." --- # Optimizations 101: Detect inefficient resources¶ This guide shows you how to find your most inefficient resources - typically the slowest API Endpoints, and the ones processing the most data. ## Prerequisites¶ You don't need an active Workspace to understand this guide, but it can be useful to have one open to compare and apply the ideas to your own situation. ## 1. Orient yourself with the Overview¶ First, navigate to the Overview page of your Workspace: <-figure-> ![Overview of a Workspace showing stats about Pipes performance](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Foptimiz101-overview.png&w=3840&q=75) <-figcaption-> Workspace Overview page To find the slowest API Endpoints: **Set the period to show performance information using the top right dropdown:** - Try to cover a large period for a good average. - If you've made changes recently, try to only include the period for the current version, to make sure you're analyzing the performance of the new version. **Then use the different Consumption summaries to examine your usage:** - BI Connector (if you're using it). - Query API for direct API queries. - A separate section for Pipes. **Check the dedicated Pipes section:** - In Pipes, you get a view of the total number of requests over the period, the average latency, the total data processed for that Pipe, and the average data processed. - Sort by `Processed` to bring the Pipes processing the most data over the period to the top. 
This effectively shows which Pipe is the "most expensive" and gives you a list of where to start your investigations. Additionally, you can sort by `avg. latency` if you're interested in finding the slowest Endpoints. The `requests` column can also help you decide which Pipes to optimize; for example, you could ignore Pipes with few requests. - Finally, select the name of a Pipe to open it in the UI for further investigation. ## 2. Analyze a Pipe Endpoint¶ When you open up a Pipe in the UI, you have a few different options to gain insights into its performance: <-figure-> ![Pipe page with a spotlight on View API and performance metrics](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Foptimiz101-pipe-analysis-spotlight.png&w=3840&q=75) <-figcaption-> Pipe page: View API and performance metrics The **View API** button (top right corner) takes you to the Pipe's performance and testing page. Under each Node in the Pipe, there's **performance information** about the query that Node is running (processed bytes, rows, columns, time). Note that these Node performance metrics are from executing the query with a `LIMIT 20` , so they may give you smaller results unless the query reads all the rows for processing. You can modify the query in the Node to force it to process all the data, such as by hashing a column or forcing a count of all rows. By doing this, you'll have a clear indication of data processed and execution time. If you select **Explain** under any Node, you'll see how the Nodes are integrated into the query that is run through the engine: <-figure-> ![Pipe page](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Foptimiz101-pipe-analysis-explain.png&w=3840&q=75) <-figcaption-> Pipe page This can be a very good way to spot if you're making a common mistake, such as not filtering your data before you aggregate it. These common mistakes will be explained later. The Node with the tick icon is the one published as the Endpoint: The performance metrics under this Node are still shown with `LIMIT 20` . Select the **View API** button (or just select the tick icon) to bring up the Endpoint performance page for further analysis. ## 3. Use the View API Performance page¶ Selecting the **View API** button from the Pipe page takes you to the Endpoint page. <-figure-> ![Endpoint page](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Foptimiz101-endpoint-page.png&w=3840&q=75) <-figcaption-> Endpoint page Here you can see the specific performance metrics of this Pipe as it is called by your application. This is a good place to see if there’s a lot of variance in requests (in other words, to check if the performance of the Endpoint changes with time for any reason). Towards the bottom of that page, you see the sample snippets: <-figure-> ![Endpoint snippet](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Foptimiz101-endpoint-snippet.png&w=3840&q=75) <-figcaption-> Endpoint snippet You can copy this sample code and run it in a new browser tab. This executes the query exactly as your application would, and the performance metrics will be recorded and reported to you on the Endpoint page. This allows you to see the exact impact of any changes you make to your Pipe. ## 4.
Use params to find common Pipe usage patterns¶ You'll need to be familiar with [dynamic parameters](https://www.tinybird.co/docs/docs/query/query-parameters) and [Service Data Sources](https://www.tinybird.co/docs/docs/monitoring/service-datasources) for the next 2 sections. Once you have a dynamic API Endpoint with parameters in use, it can be really useful to observe common parameter patterns from the application (for instance, which parameters and values force the Endpoints to process more data). You can then use these observations to drive new optimizations on the Pipe, such as ensuring that a column that is commonly used as a parameter is in the sorting key. Use the `pipe_stats` and `pipe_stats_rt` [Service Data Sources](https://www.tinybird.co/docs/docs/monitoring/service-datasources) . You can also explore some common Pipe usage queries in the [Monitor API performance guide](https://www.tinybird.co/docs/docs/guides/monitoring/analyze-endpoints-performance). Specifically, you can execute the following query to get the number of requests, average processed data, and average duration of your API Endpoints and parameters in the last 24 hours: ##### check pipe\_stats\_rt SELECT pipe_name, parameters, count() AS total_requests, formatReadableSize(avg(read_bytes)) AS avg_read, avg(duration) AS avg_duration FROM tinybird.pipe_stats_rt WHERE start_datetime > now() - INTERVAL 1 DAY GROUP BY pipe_name, parameters ORDER BY total_requests DESC ## 5. Measure Materialized Views and Copy Pipes performance¶ You should also check the Pipes that move data from one Data Source to another, like Materialized Views and Copy Pipes. You can track them in `tinybird.datasources_ops_log` and in the Materialized View page or Copy page. <-figure-> ![Materialized View page](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Foptimiz101-mv-log.png&w=3840&q=75) <-figcaption-> Materialized View page ##### check processed data for data source operations SELECT datasource_name, formatReadableSize(sum(read_bytes) + sum(written_bytes)) AS f_processed_data, formatReadableSize(sum(read_bytes)) AS f_read_bytes, formatReadableSize(sum(written_bytes)) AS f_written_bytes, round(sum(read_bytes) / sum(written_bytes), 2) AS f_ratio FROM tinybird.datasources_ops_log WHERE timestamp > now() - INTERVAL 1 DAY AND pipe_name != '' GROUP BY datasource_name ORDER BY sum(read_bytes) + sum(written_bytes) DESC Pay special attention to **Materialized Views with JOINs** , since they are prone to scanning more data than needed if your SQL is not optimized. Basically, JOINs should use subqueries of pre-filtered data. ## Next steps¶ - Read[ Optimizations 201: Fix common mistakes](https://www.tinybird.co/docs/docs/guides/optimizations/opt201-fix-mistakes) . - Check out the Monitoring docs and guides for more tips, like[ using Time Series to analyze patterns](https://www.tinybird.co/docs/docs/monitoring/latency#time-series) . - Explore[ this example repo](https://github.com/tinybirdco/tb-usage-cost-metrics/tree/main) to analyze Processed Data. It may not be 100% accurate to[ billing](https://www.tinybird.co/docs/docs/support/billing) , as Tinybird tracks certain operations differently in Service Data Sources, but it's a great proxy. --- URL: https://www.tinybird.co/docs/guides/optimizations/opt201-fix-mistakes Last update: 2024-11-15T09:17:57.000Z Content: --- title: "How to fix common mistakes · Tinybird Docs" theme-color: "#171612" description: "In this guide, you'll learn more tips on how to optimize your Tinybird project."
--- # Optimizations 201: Fix common mistakes¶ In this guide, you'll learn the top 5 questions to ask yourself to fix common pitfalls in your data project. ## Prerequisites¶ You'll need to read [Optimizations 101: Detect inefficiencies](https://www.tinybird.co/docs/opt101-detect-inefficiencies) first. It'll give you an idea of which Pipe is the worst performing, and the particular characteristics of that performance, so you can start to look into common causes. It's also a really good idea to read [Best practices for faster SQL](https://www.tinybird.co/docs/docs/query/sql-best-practices) and the [Thinking in Tinybird](https://www.tinybird.co/blog-posts/thinking-in-tinybird) blog post. This guide walks through 5 common questions (the "usual suspects"). If you find a badly-performing Pipe, ask yourself these 5 questions first before exploring other, more nuanced problem areas. ## 1. Are you aggregating or transforming data at query time?¶ Calculating the `count()`, `sum()` , or `avg()` , or casting to another data type is common in a published API Endpoint. **As your data scales, you may be processing more data than necessary** , as you run the same query each time there is a request to that API Endpoint. If this is the case, you should create a [Materialized View](https://www.tinybird.co/docs/docs/publish/materialized-views/overview). In a traditional database, you have to schedule Materialized Views to run on a regular cadence. Although this helps pre-process large amounts of data, the batch nature renders your data stale. In Tinybird, **Materialized Views let you incrementally pre-aggregate, transform, and filter large Data Sources upon ingestion** . By shifting the computational load from query time to ingestion time, you'll scan less data and keep your API Endpoints fast. Read the docs to [create a Materialized View](https://www.tinybird.co/docs/docs/publish/materialized-views#creating-a-materialized-view-in-the-tinybird-ui). ## 2. Are you filtering by the fields in the sorting key?¶ The sorting key is important. **It determines the way data is indexed and stored in your Data Source** , and is crucial for the performance of your API Endpoints. Setting the right sorting key allows you to have all the data you need, for any given query, as close as possible. In all databases (including Tinybird), indexing lets you skip reading the data that you don’t need, which speeds up some operations (like filtering) hugely. The goal of sorting keys (SKs) is to reduce scan size and discard as much data as possible when examining the `WHERE` clauses in the queries. In short, a good Sort Key is what will help you [avoid expensive, time-consuming full scans](https://www.tinybird.co/docs/docs/query/sql-best-practices#avoid-full-scans). Some good rules of thumb for setting Sorting Keys: - Order matters: Data will be stored based on the Sort Key order. - Between 3 and 5 columns is good enough. More will probably penalize performance. - `timestamp` is often a** bad candidate for being the first element** of the SK. - If you have a multi-tenant app, `customer_id` is a** great candidate for being the first element** of the SK. One common mistake is to use the [partition key](https://www.tinybird.co/docs/docs/concepts/data-sources#partitioning) for filtering. Use the sorting key, not the partition key. ## 3. Are you using the best data types?¶ If you do need to read data, you should try to use the smallest types that can get the job done. A common example is timestamps. Do you really need millisecond precision?
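As a quick illustration (a sketch using ClickHouse's `byteSize` function, not part of the original guide; these are uncompressed in-memory sizes, and on-disk data is also compressed), you can compare the per-value cost of each level of precision:

##### Per-value size of date and time types
SELECT
    byteSize(now64(6)) AS datetime64_bytes, -- 8 bytes per value
    byteSize(now())    AS datetime_bytes,   -- 4 bytes per value
    byteSize(today())  AS date_bytes        -- 2 bytes per value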
Often when users start doing data analytics, they aren't sure what the data will look like or how to query it in their application. But now that you have your API Endpoint or Pipe, you can (and should!) go back and see if your schema best supports your resulting use-case. It's common to use *simple* types to begin, like `String`, `Int` , and `DateTime` , but as you go further in the application implementation, you should review the data types you selected at the beginning. When reviewing your data types, focus on the following points: - ** Downsizing types** , selecting a different data type with a smaller size. For instance, UUID fields can use the UUID type instead of a string type, you can use unsigned integers (UInt) instead of signed integers (Int) where there aren’t negative values, or you could use a Date instead of DateTime. - Examine string** cardinality** to perhaps use `LowCardinality()` if there are fewer than 100k unique values. - ** Nullable** columns are** bigger and slower** and can’t be sorting keys, so use `coalesce()` . **Sorting key** and **data type** changes are done by changing your schema, which means [iterating the Data Source](https://www.tinybird.co/docs/docs/guides/ingesting-data/iterate-a-data-source) . See an example of these types of changes in the `thinking-in-tinybird` demo repo. ## 4. Are you doing complex operations early in the processing pipeline?¶ Operations such as joins or aggregations get increasingly expensive as your data grows. Filter your data first to reduce the number of rows, then **perform the more complex operations later in the pipeline**. Follow this example: [Rule 5 for faster SQL](https://www.tinybird.co/docs/docs/query/sql-best-practices#perform-complex-operations-later-in-the-processing-pipeline). ## 5. Are you joining two or more data sources?¶ It's a common scenario to want to **enrich your events with some dimension tables** , so you end up materializing a JOIN. This kind of approach can process more data than you need, so here are some tips to reduce it: - Try to switch out JOINs and replace them with a** subquery** : `WHERE column IN (SELECT column FROM dimensions)` . - If the join is needed, try to** filter the right table first** (better if you can use a field in the sorting key). - Remember that the** Materialization is only triggered when you ingest data in the left Data Source** (the one you use to do a `SELECT … FROM datasource` ). So, if you need to recalculate data from the past, creating a Materialized View is not the right approach. Instead, check this guide about[ Copy Pipes](https://www.tinybird.co/docs/docs/publish/copy-pipes) . ### Understanding the Materialized JOIN issue¶ #### The issue¶ There’s a [common pitfall](https://www.tinybird.co/docs/docs/publish/materialized-views#limitations) when working with Materialized Views: *Materialized Views generated using JOIN clauses are tricky. The resulting Data Source will be only automatically updated if and when a new operation is performed over the Data Source in the FROM.* Since Materialized Views work with the result of a SQL query, you can use JOINs and any other SQL feature. But JOINs should be used with caution. SELECT a.id, a.value, b.value FROM a LEFT JOIN b USING id If you insert data in `a` (LEFT SIDE), data will be processed as expected... but what happens if you add data to `b` (RIGHT SIDE)? It will not be processed. This is because a Materialized View is only triggered when its source table receives inserts.
It's just a trigger on the source table and knows nothing about the joined table. Note that this doesn't only apply to JOIN queries, and is relevant when introducing **any** table external to the Materialized View's SELECT statement, e.g. using an `IN (SELECT ...)` clause. It can become more complex if you need to deal with [stream joins](https://github.com/tinybirdco/streaming_join_demo) . However, this guide focuses on the basic setup, as doing JOINs has implications most people don't realize. These JOINs can be very expensive because you’re reading the small number of rows being ingested (LEFT SIDE) plus performing a full scan of the joined table (RIGHT SIDE), as you don’t know anything about it. #### The optimization¶ Sometimes, to easily detect these cases, it’s useful to review the `read_bytes` / `written_bytes` ratio. If you're reading way more than writing, most likely you’re doing some JOINs within the MV. You can easily change this by **adding a filter on the right side** , rewriting the previous query as follows: SELECT a.id, a.value, b.value FROM a LEFT JOIN ( SELECT id, value FROM b WHERE b.id IN (SELECT id FROM a) ) b USING id This might sound counter-intuitive when writing a query the first time because you’re basically reading `a` twice. However, keep in mind that `a` is usually smaller than `b` because you’re only reading the block of rows you’re ingesting. To see a real improvement, you will need the fields you’re using to filter to be in the sorting key of `b` . Most of the time you'll use the "joining key", but you can use any other potential field that allows you to hit the index in `b` and filter the right side of the JOIN. ## Next steps¶ - Check out the Monitoring docs and guides for more tips, like[ using Time Series to analyze patterns](https://www.tinybird.co/docs/docs/monitoring/latency#time-series) . - Explore[ this example repo](https://github.com/tinybirdco/tb-usage-cost-metrics/tree/main) to analyze Processed Data. It may not be 100% accurate to[ billing](https://www.tinybird.co/docs/docs/support/billing) , as Tinybird tracks certain operations differently in Service Data Sources, but it's a great proxy. --- URL: https://www.tinybird.co/docs/guides/optimizations/overview Last update: 2024-10-14T12:07:26.000Z Content: --- title: "Optimizations guides · Tinybird Docs" theme-color: "#171612" description: "In this set of guides, you'll learn where to look when looking to optimize your Tinybird project, what to edit, and how to monitor changes." --- # Optimizations¶ This compilation of guides assumes some familiarity with using Tinybird (ingesting data, building query Pipes, publishing API Endpoints), particularly with using [Service Data Sources](https://www.tinybird.co/docs/docs/monitoring/service-datasources). The good news is that Tinybird is so fast that even for unoptimized projects, response times are excellent. Many projects, especially younger or smaller ones, won't necessarily or immediately *need* to optimize right now. However, it's never too soon, especially because data moves fast. It's better to have 30ms latency than 300ms, and better to process less data so your [bills are smaller](https://www.tinybird.co/docs/docs/support/billing#reduce-your-bill).
## About this section¶ The guides in this Optimizations section curate the best applied knowledge across Tinybird's docs ( [Best practices for faster SQL](https://www.tinybird.co/docs/docs/query/sql-best-practices) ), videos ( [Materialization process saves $40K](https://youtu.be/rfckIMbLWBg?si=dY1dP_ctx5tOSyhq), [Tips & Tricks to Keep Your Queries under 100ms](https://www.youtube.com/watch?v=MN2M6HAoO64) ), blog posts ( [Thinking in Tinybird](https://www.tinybird.co/blog-posts/thinking-in-tinybird) ), and the deep expertise of our Data Engineering and Customer Support teams. It gives you both practical examples *and* a framework of questions you can ask in your own unique scenario. This combination should empower you with the tools, tips, tricks, and approach to build the best-optimized projects. So, if you want to feel like [Marc](https://x.com/mfts0/status/1797651962692767801) or [Thibault](https://x.com/thibaultleouay/status/1699492486488498270) , start digging into the fascinating world of optimizing Tinybird projects. ## Optimizations mantra¶ Tinybird gives you the platform to manage your real-time data. Measure what matters, detect inefficiencies, fix and eliminate common (or unusual!) mistakes, and move faster. Remember, speed wins! ## Next steps¶ - Dive in with[ Optimizations 101: Detecting inefficient resources](https://www.tinybird.co/docs/opt101-detect-inefficiencies) . - Improve the efficiency of your project with[ Optimizations 201: Fixing common mistakes](https://www.tinybird.co/docs/opt201-fix-mistakes) . - Understand how to[ monitor your ingestion](https://www.tinybird.co/docs/docs/guides/monitoring/monitor-your-ingestion) . --- URL: https://www.tinybird.co/docs/guides/overview Last update: 2024-07-31T16:54:46.000Z Content: --- title: "Overview · Tinybird Docs" theme-color: "#171612" description: "Follow the Tinybird Guides to get hands-on experience of different processes." --- # Tinybird guides¶ ## Docs vs guides¶ Tinybird offers you multiple ways to learn: written information, [screencasts and videos](https://www.tinybird.co/screencasts) , a [Slack community](https://www.tinybird.co/docs/docs/community) to ask questions in, [starter kit repos](https://www.tinybird.co/docs/docs/starter-kits) , live coding webinars, and a [Use Case Hub](https://www.tinybird.co/docs/docs/use-cases) to explore different industry applications. Within the written information, knowledge is split up as follows: 1. The** docs** give you information about the Tinybird platform, its features, and capabilities. 2. The** guides** give you real-life, real-time examples in an easy-to-follow tutorial or "How To..." format. For instance: The docs tell you about Materialized Views on Tinybird as a concept or capability. The associated guides tell you how to make your own Materialized View. There's also a "Tutorials" subsection in Guides that offers full, end-to-end walkthroughs of specific use cases and example solutions. ## Where to start¶ Don't be put off if you're new. Pick a guide that interests you, and read through it as many times as you need. You're bound to pick it up and learn as you go! Remember, you can always contact Tinybird at [support@tinybird.co](https://www.tinybird.co/docs/mailto:support@tinybird.co) or in the [Community Slack](https://www.tinybird.co/docs/docs/community). ## Next steps¶ Let's go! Pick your next guide from the navigation on the left.
--- URL: https://www.tinybird.co/docs/guides/publishing-data/advanced-dynamic-endpoints-functions Last update: 2024-11-19T13:41:56.000Z Content: --- title: "Advanced template functions for dynamic API Endpoint · Tinybird Docs" theme-color: "#171612" description: "Learn more about creating dynamic API Endpoint using advanced templates." --- # Advanced template functions for dynamic API Endpoint¶ The [Template functions section](https://www.tinybird.co/docs/docs/cli/advanced-templates#template-functions) of the [Advanced templates docs](https://www.tinybird.co/docs/docs/cli/advanced-templates) explains functions that help you create more advanced dynamic templates. On this page, you'll learn how these templates can be used to create dynamic API Endpoints with Tinybird. ## Prerequisites¶ Before continuing, make sure you're familiar with [template functions](https://www.tinybird.co/docs/docs/cli/advanced-templates#template-functions) and [query parameters](https://www.tinybird.co/docs/docs/query/query-parameters). ## The data¶ This guide uses the ecommerce events data enriched with products. The data looks like this: ##### Events and products data SELECT *, price, city, day FROM events_mat ANY LEFT JOIN products_join_sku ON product_id = sku 17.84MB, 131.07k x 12 ( 9.10ms ) ## Tips and tricks¶ When the complexity of Pipes and API Endpoints grows, developing them and knowing what's going on when you need to debug problems can become challenging. Here are some tricks that we use ourselves and with our clients, and that you might find useful as well: ### WHERE 1=1¶ When you filter by different criteria, given by dynamic parameters that can be omitted, you'll need a `WHERE` clause, but if none of the parameters are present there's nothing to put in it. The trick is to add a `WHERE` statement with a dummy condition (like `1=1` ) that's always true, and then add the other filter statements dynamically if the parameters are defined, like we do in the [defined](https://www.tinybird.co/docs/about:blank#defined) example of this guide. ### set¶ The [set](https://www.tinybird.co/docs/docs/cli/advanced-templates#variables-vs-parameters) function present in the previous snippet lets you set the value of a parameter in a Node, so that you can check the output of a query depending on the value of the parameters it takes. Otherwise, you'd have to publish an API Endpoint and make requests to it with different parameters. Using `set` , you don't have to exit the Tinybird UI while creating an API Endpoint and the whole process is faster, without needing to go back and forth between your browser or IDE and Postman (or whatever you use to make requests). Another example of its usage: ##### Using set to try out different parameter values % {% set select_cols = 'date,user_id,event,city' %} SELECT {{columns(select_cols)}} FROM events_mat 2.39MB, 49.15k x 4 ( 3.96ms ) You can use more than one `set` statement. Just put each one on a separate line at the beginning of a Node. `set` is also a way to set defaults for parameters. If you used `set` statements to test your API Endpoint while developing, remember to remove them before publishing your code, because otherwise the `set` will override any incoming parameter. ### The default argument¶ Another way to set default values for parameters is using the `default` argument that most Tinybird template functions accept.
The previous code could be rewritten as follows: ##### Using the default argument % SELECT {{columns(select_cols, 'date,user_id,event,city')}} FROM events_mat 3.58MB, 73.73k x 4 ( 3.87ms ) Keep in mind that defining the same parameter in more than one place in your code in different ways can lead to inconsistent behavior. Here's a solution to avoid that: ### Using WITH statements to avoid duplicating code¶ If you are going to use the same dynamic parameters more than once in a Node of a Pipe, it's good to define them in one place only to avoid duplicating code. It also makes it clearer knowing which parameters are going to appear in the Node. This can be done with one or more statements at the beginning of a Node, using the `WITH` clause. The [WITH](https://clickhouse.tech/docs/en/sql-reference/statements/select/with/) clause in ClickHouse® supports CTEs. They're preprocessed before executing the query, and they can only return one row (this is different to other databases such as Postgres). This is better seen with a live example: ##### DRY with the with clause % {% set terms='orchid' %} WITH {{split_to_array(terms, '1,2,3')}} AS needles SELECT *, joinGet(products_join_sku, 'color', product_id) color, joinGet(products_join_sku, 'title', product_id) title FROM events WHERE multiMatchAny(lower(color), needles) OR multiMatchAny(lower(title), needles) 4.61MB, 40.96k x 7 ( 14.43ms ) ### Documenting your API Endpoints¶ Tinybird creates auto-generated documentation for all your published API Endpoints, taking the information from the dynamic parameters found in the Pipe. It's best practice to set default values and descriptions for every parameter in one place (also because some functions don't accept a description, for example). We normally do that in the final Node, with `WITH` statements at the beginning. See how we'd do it in the [last section](https://www.tinybird.co/docs/about:blank#putting-it-all-together) of this guide. ### Hidden parameters¶ If you use some functions like `enumerate_with_last` in the example [below](https://www.tinybird.co/docs/about:blank#enumerate-with-last) , you'll end up with some variables (called `x`, `last` in that code snippet) that Tinybird will interpret as if they were parameters that you can set, and they will appear in the auto-generated documentation page. To avoid that, add a leading underscore to their name, renaming `x` to `_x` and `last` to `_last`. ### Debugging any query¶ We have an experimental feature that lets you see how the actual SQL code that will be run on ClickHouse for any published API Endpoint looks, interpolating the query string parameters that you pass in the request URL. If you have a complex query and you'd like to know what is the SQL that will be run, [let us know](https://www.tinybird.co/docs/mailto:support@tinybird.co) and we'll give you access to this feature to debug a query. Now let's explore some of the Tinybird advanced template functions, what they allow you to do, and some tricks that will improve your experience creating dynamic API Endpoints on Tinybird. ## Advanced functions¶ Most of these functions also appear in the [Advanced templates](https://www.tinybird.co/docs/docs/cli/advanced-templates?#template-functions) section of our docs. Here we'll provide practical examples of their usage so that it's easier for you to understand how to use them. ### defined¶ The `defined` function lets you check if a query string parameter exists in the request URL or not. 
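For example, a minimal check looks like the following sketch, where the `category` parameter name is made up for illustration and `events` is the Data Source used throughout this guide:

##### Checking a parameter with defined (sketch)
%
SELECT count() AS total
FROM events
-- the filter is only added when the hypothetical "category" parameter is passed in the request URL
{% if defined(category) %}
WHERE event = {{String(category)}}
{% end %}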
Imagine you want to filter events with a price within a minimum or a maximum price, set by two dynamic parameters that could be omitted. A way to define the API Endpoint would be like this: ##### filter by price % {% set min_price=20 %} {% set max_price=50 %} SELECT *, price FROM events_mat WHERE 1 = 1 {% if defined(min_price) %} AND price >= {{Float32(min_price)}} {% end %} {% if defined(max_price) %} AND price <= {{Float32(max_price)}} {% end %} 2.23MB, 16.38k x 8 ( 8.05ms ) To see the effect of having a parameter not defined, use `set` to set its value to `None` like this: ##### filter by price, price not defined % {% set min_price=None %} {% set max_price=None %} SELECT *, price FROM events_mat WHERE 1 = 1 {% if defined(min_price) %} AND price >= {{Float32(min_price)}} {% end %} {% if defined(max_price) %} AND price <= {{Float32(max_price)}} {% end %} 4.46MB, 32.77k x 8 ( 6.36ms ) You could also provide some smart defaults to avoid needing to use the `defined` function at all: ##### filter by price with default values % SELECT *, price FROM events_mat_cols WHERE price >= {{Float32(min_price, 0)}} AND price <= {{Float32(max_price, 999999999)}} ### Array(variable\_name, 'type', \[default\])¶ Transforms a comma-separated list of values into a Tuple. You can provide a default value for it or not: % SELECT {{Array(code, 'UInt32', default='13412,1234123,4123')}} AS codes_1, {{Array(code, 'UInt32', '13412,1234123,4123')}} AS codes_2, {{Array(code, 'UInt32')}} AS codes_3 To filter events whose type belongs to the ones provided in a dynamic parameter, separated by commas, you'd define the API Endpoint like this: ##### Filter by list of elements % SELECT * FROM events WHERE event IN {{Array(event_types, 'String', default='buy,view')}} 1.84MB, 16.38k x 5 ( 5.84ms ) And then the URL of the API Endpoint would be something like `{% user("apiHost") %}/v0/pipes/your_pipe_name.json?event_types=buy,view` ### sql\_and¶ `sql_and` lets you create a filter with `AND` operators and several expressions dynamically, taking into account whether the dynamic parameters used in the template are present in the request URL. It's not possible to use ClickHouse functions inside the `{{ }}` brackets in templates. `sql_and` can only be used with the `{column_name}__{operand}` syntax. This function does the same as what you saw in the previous query: it filters a column by the values present in a tuple generated by `Array(...)` when the operand is `in` , or by values greater than (with the `gt` operand) or less than (with the `lt` operand) a given value. Let's see an example to make it clearer: - Endpoint template code - Generated SQL ##### SQL\_AND AND COLUMN\_\_IN % SELECT *, joinGet(products_join_sku, 'section_id', product_id) section_id FROM events WHERE {{sql_and(event__in=Array(event_types, 'String', default='buy,view'), section_id__in=Array(sections, 'Int16', default='1,2'))}} 11.06MB, 98.30k x 6 ( 8.47ms ) You don't have to provide default values. If you set the `defined` argument of `Array` to `False` , when that parameter is not provided, no SQL expression will be generated.
You can see this in the next code snippet: - Endpoint template code - Generated SQL ##### defined=False % SELECT *, joinGet(products_join_sku, 'section_id', product_id) section_id FROM events WHERE {{sql_and(event__in=Array(event_types, 'String', default='buy,view'), section_id__in=Array(sections, 'Int16', defined=False))}} 2.76MB, 24.58k x 6 ( 5.94ms ) ### split\_to\_array(name, \[default\])¶ This works similarly to `Array` , but it returns an Array of Strings (instead of a tuple). You'll have to cast the result to the type you want after. As you can see here too, they behave in a similar way: ##### array and split\_to\_array % SELECT {{Array(code, 'UInt32', default='1,2,3')}}, {{split_to_array(code, '1,2,3')}}, arrayMap(x->toInt32(x), {{split_to_array(code, '1,2,3')}}), 1 in {{Array(code, 'UInt32', default='1,2,3')}}, '1' in {{split_to_array(code, '1,2,3')}} 1.00B, 1.00 x 5 ( 4.04ms ) One thing that you'll want to keep in mind is that you can't pass non-constant values (arrays, for example) to operations that require them. For example, this would fail: ##### using a non-constant expression where one is required % SELECT 1 IN arrayMap(x->toInt32(x), {{split_to_array(code, '1,2,3')}}) [Error] Element of set in IN, VALUES, or LIMIT, or aggregate function parameter, or a table function argument is not a constant expression (result column not found): arrayMap(lambda(tuple(x), toInt32(x)), ['1', '2', '3']): While processing 1 IN arrayMap(x -> toInt32(x), ['1', '2', '3']). (BAD_ARGUMENTS) If you find an error like this, you should use a Tuple instead (remember that `{{Array(...)}}` returns a tuple). This will work: ##### Use a tuple instead % SELECT 1 IN {{Array(code, 'Int32', default='1,2,3')}} 1.00B, 1.00 x 1 ( 1.63ms ) `split_to_array` is often used with [enumerate_with_last](https://www.tinybird.co/docs/about:blank#enumerate-with-last). ### column and columns¶ They let you select one or several columns from a Data Source or Pipe, given their name. You can also provide a default value. ##### columns % SELECT {{columns(cols, 'date,user_id,event')}} FROM events 1.27MB, 40.96k x 3 ( 4.66ms ) ##### column % SELECT date, {{column(user, 'user_id')}} FROM events 1.57MB, 130.82k x 2 ( 4.03ms ) ### enumerate\_with\_last¶ As the docs say, it creates an iterable array, returning a Boolean value that allows checking if the current element is the last element in the array. Its most common usage is to select several columns, or compute some function over them. We can see an example of `columns` and `enumerate_with_last` here: - Endpoint template code - Generated SQL ##### enumerate\_with\_last \+ columns % SELECT {% if defined(group_by) %} {{columns(group_by)}}, {% end %} sum(price) AS revenue, {% for last, x in enumerate_with_last(split_to_array(count_unique_vals_columns, 'section_id,city')) %} uniq({{symbol(x)}}) as {{symbol(x)}} {% if not last %},{% end %} {% end %} FROM events_enriched {% if defined(group_by) %} GROUP BY {{columns(group_by)}} ORDER BY {{columns(group_by)}} {% end %} If you use the `defined` function around a parameter it doesn't make sense to give it a default value because if it's not provided, that line will never be run. ### error and custom\_error¶ They let you return customized error responses. 
With `error` you can customize the error message: ##### error % {% if not defined(event_types) %} {{error('You need to provide a value for event_types')}} {% end %} SELECT *, joinGet(products_join_sku, 'section_id', product_id) section_id FROM events WHERE event IN {{Array(event_types, 'String')}} ##### error response using error {"error": "You need to provide a value for event_types"} And with `custom_error` you can also customize the response code: ##### custom\_error % {% if not defined(event_types) %} {{custom_error({'error': 'You need to provide a value for event_types', 'code': 400})}} {% end %} SELECT *, joinGet(products_join_sku, 'section_id', product_id) section_id FROM events WHERE event IN {{Array(event_types, 'String')}} ##### error response using custom\_error {"error": "You need to provide a value for event_types", "code": 400} **Note:** `error` and `custom_error` have to be placed at the start of a Node or they won't work. The order should be: 1. `set` lines, to give some parameter a default value (optional) 2. Parameter validation functions: `error` and `custom_error` definitions 3. The SQL query itself ## Putting it all together¶ We've created a Pipe where we use most of these advanced techniques to filter ecommerce events. You can see its live documentation page [here](https://app.tinybird.co/gcp/europe-west3/endpoints/t_e06de80c854d45298d566b93f50840d9?token=p.eyJ1IjogIjdmOTIwMmMzLWM1ZjctNDU4Ni1hZDUxLTdmYzUzNTRlMTk5YSIsICJpZCI6ICI0NDI5OWRkZi1lY2JmLTRkZGItYmM5MS1mMWNmZjNlMjdiNDgifQ.tZ5aOMy9Vp2L2R5qCZpiwysHp9v6bnQBW9aApl1Z3F8) and play with it on Swagger [here.](https://app.tinybird.co/gcp/europe-west3/openapi?url=https%253A%252F%252Fapi.tinybird.co%252Fv0%252Fpipes%252Fopenapi.json%253Ftoken%253Dp.eyJ1IjogIjdmOTIwMmMzLWM1ZjctNDU4Ni1hZDUxLTdmYzUzNTRlMTk5YSIsICJpZCI6ICI0NDI5OWRkZi1lY2JmLTRkZGItYmM5MS1mMWNmZjNlMjdiNDgifQ.tZ5aOMy9Vp2L2R5qCZpiwysHp9v6bnQBW9aApl1Z3F8) This is its code: ##### advanced\_dynamic\_endpoints.pipe NODE events_enriched SQL > SELECT *, price, city, day FROM events_mat_cols ANY LEFT JOIN products_join_sku ON product_id = sku NODE filter_by_price SQL > % SELECT * FROM events_enriched WHERE 1 = 1 {% if defined(min_price) %} AND price >= {{Float32(min_price)}} {% end %} {% if defined(max_price) %} AND price <= {{Float32(max_price)}} {% end %} NODE filter_by_event_type_and_section_id SQL > % SELECT * FROM filter_by_price {% if defined(event_types) or defined(section_ids) %} ... WHERE {{sql_and(event__in=Array(event_types, 'String', defined=False, enum=['remove_item_from_cart','view','search','buy','add_item_to_cart']), section_id__in=Array(section_ids, 'Int32', defined=False))}} {% end %} NODE filter_by_title_or_color SQL > % SELECT * FROM filter_by_event_type_and_section_id {% if defined(search_terms) %} WHERE multiMatchAny(lower(color), {{split_to_array(search_terms)}}) OR multiMatchAny(lower(title), {{split_to_array(search_terms)}}) {% end %} NODE group_by_or_not SQL > % SELECT {% if defined(group_by) %} {{columns(group_by)}}, sum(price) AS revenue, {% for _last, _x in enumerate_with_last(split_to_array(count_unique_vals_columns)) %} uniq({{symbol(_x)}}) as {{symbol(_x)}} {% if not _last %},{% end %} {% end %} {% else %} * {% end %} FROM filter_by_title_or_color {% if defined(group_by) %} GROUP BY {{columns(group_by)}} ORDER BY {{columns(group_by)}} {% end %} NODE pagination SQL > % WITH {{Array(group_by, 'String', '', description='Comma-separated name of columns. If defined, group by and order the results by these columns. 
The sum of revenue will be returned')}}, {{Array(count_unique_vals_columns, 'String', '', description='Comma-separated name of columns. If both group_by and count_unique_vals_columns are defined, the number of unique values in the columns given in count_unique_vals_columns will be returned as well')}}, {{Array(search_terms, 'String', '', description='Comma-separated list of search terms present in the color or title of products')}}, {{Array(event_types, 'String', '', description="Comma-separated list of event name types", enum=['remove_item_from_cart','view','search','buy','add_item_to_cart'])}}, {{Array(section_ids, 'String', '', description="Comma-separated list of section IDs. The minimum value for an ID is 0 and the max is 50.")}} SELECT * FROM group_by_or_not LIMIT {{Int32(page_size, 100)}} OFFSET {{Int32(page, 0) * Int32(page_size, 100)}} To replicate it in your account, copy the previous code to a new file called `advanced_dynamic_endpoints.pipe` locally and run `tb push pipes/advanced_dynamic_endpoints.pipe` with our [CLI](https://www.tinybird.co/docs/docs/cli/install) to push it to your Tinybird account. Tinybird is not affiliated with, associated with, or sponsored by ClickHouse, Inc. ClickHouse® is a registered trademark of ClickHouse, Inc. --- URL: https://www.tinybird.co/docs/guides/publishing-data/serverless-analytics-api Last update: 2024-10-28T11:06:14.000Z Content: --- title: "Handling data privacy in serverless analytics APIs · Tinybird Docs" theme-color: "#171612" description: "Creating an Analytics Dashboard where each user is able to access only certain parts of the data is really easy with Tinybird. You don't need to build anything specific from scratch. Tinybird is able to provide dynamic API Endpoints, including specific security requirements per-user." --- # Handling data privacy in serverless analytics APIs¶ Creating an analytics dashboard where each user is able to access only certain parts of the data is really easy with Tinybird. You don't need to build anything specific from scratch. Tinybird provides dynamic API Endpoints, including specific security requirements per-user. ## The serverless approach to real-time analytics¶ Let's assume you have just two components - the simplest possible stack: - ** A frontend application:** Code that runs in the browser. - ** A backend application:** Code that runs in the server and manages both the user authentication and the authorization. Very probably, the backend will also expose an API from where the frontend fetches the information needed. This guide covers the different workflows that will handle each user operation with the right permissions, by integrating your backend with [Static Tokens](https://www.tinybird.co/docs/docs/concepts/auth-tokens#what-should-i-use-tokens-for) in a very simple way. ## Create Tokens on user sign-up¶ The only thing you need (to ensure that your users have the right permissions on your data) is a created Tinybird [Static Token](https://www.tinybird.co/docs/docs/concepts/auth-tokens#what-should-i-use-tokens-for) every time you create a new user in your backend. ##### Creating a Token with filter scope TOKEN= curl -H "Authorization: Bearer $TOKEN" \ -d "name=user_692851_token" \ -d "scope=PIPES:READ:ecommerce_example" \ -d "scope=DATASOURCES:READ:events:user_id=692851" \ https://api.tinybird.co/v0/tokens/ Use a Token with the right scope. Replace `` with a Token whose [scope](https://www.tinybird.co/docs/docs/api-reference/token-api) is `TOKENS` or `ADMIN`. 
This Token will let a given user query their own transactions stored in an `events` Data Source and exposed in an `ecommerce_example` API Endpoint. Some other noteworthy things you can see here: - You can give a `name` to every Token you create. In this case, the name contains the `user_id` , so that it's easier to see what Token is assigned to each user. - You can assign as many scopes to each Token as you want, and `DATASOURCES:READ:datasource_name` and `PIPES:READ:pipe_name` can take an optional SQL filter (like this example does) to restrict the rows that queries authenticated with the Token will have access to. If everything runs successfully, your call will return JSON containing a Token with the specified scopes: ##### Creating a Token with filter scope: Response { "token": "p.eyJ1IjogImI2Yjc1MDExLWNkNGYtNGM5Ny1hMzQxLThhNDY0ZDUxMWYzNSIsICJpZCI6ICI0YTYzZDExZC0zNjg2LTQwN2EtOWY2My0wMzU2ZGE2NmU5YzQifQ.2QP1BRN6fNfgS8EMxqkbfKasDUD1tqzQoJXBafa5dWs", "scopes": [ { "type": "PIPES:READ", "resource": "ecommerce_example", "filter": "" }, { "type": "DATASOURCES:READ", "resource": "events", "filter": "user_id=692851" } ], "name": "user_692851_token" } All the Tokens you create are also visible in your [Workspace > Tokens page](https://app.tinybird.co/tokens) in the UI, where you can create, update and delete them. ## Modify Tokens when user permissions are changed¶ Imagine one of your users is removed from a group, which makes them lose some permissions on the data they can consume. Once that is reflected in your backend, you can [update the user admin Token](https://www.tinybird.co/docs/docs/api-reference/token-api#put--v0-tokens-(.+)) accordingly as follows: ##### Modify an existing Token TOKEN= USER_TOKEN= curl -X PUT \ -H "Authorization: Bearer $TOKEN" \ -d "name=user_692851_token" \ -d "scope=PIPES:READ:ecommerce_example" \ -d "scope=DATASOURCES:READ:events:user_id=692851 and event in ('buy', 'add_item_to_cart')" \ https://api.tinybird.co/v0/tokens/$USER_TOKEN Pass the Token you previously created as a path parameter. Replace `` by the value of `token` from the previous response, or [copy it from the UI](https://app.tinybird.co/tokens). In this example you'd be restricting the SQL filter of the `DATASOURCES:READ:events` scope to restrict the type of events the user will be able to read from the `events` Data Source. This is the response you'd see from the API: ##### Modify an existing Token: Response { "token": "p.eyJ1IjogImI2Yjc1MDExLWNkNGYtNGM5Ny1hMzQxLThhNDY0ZDUxMWYzNSIsICJpZCI6ICI0YTYzZDExZC0zNjg2LTQwN2EtOWY2My0wMzU2ZGE2NmU5YzQifQ.2QP1BRN6fNfgS8EMxqkbfKasDUD1tqzQoJXBafa5dWs", "scopes": [ { "type": "PIPES:READ", "resource": "ecommerce_example", "filter": "" }, { "type": "DATASOURCES:READ", "resource": "events", "filter": "user_id=692851 and event in ('buy', 'add_item_to_cart')" } ], "name": "user_692851_token" } ## Delete Tokens after user deletion¶ Whenever a user is removed from your system, you should also [remove the Token from Tinybird](https://www.tinybird.co/docs/docs/api-reference/token-api#delete--v0-tokens-(.+)) . That will make things easier for you in the future. ##### Remove a token TOKEN= USER_TOKEN= curl -X DELETE \ -H "Authorization: Bearer $TOKEN" \ https://api.tinybird.co/v0/tokens/$USER_TOKEN If the Token is successfully deleted, this request will respond with no content and a 204 status code. ## Refresh Tokens¶ It's a good practice to change Tokens from time to time, so you can automate this in your backend as well. 
Refreshing a Token requires executing this request for every one of your users: ##### Refresh a token TOKEN= USER_TOKEN= curl -X POST \ -H "Authorization: Bearer $TOKEN" \ https://api.tinybird.co/v0/tokens/$USER_TOKEN/refresh ## Next steps¶ - Learn more about the[ Tokens API](https://www.tinybird.co/docs/docs/api-reference/token-api) . - Understand the concept of[ Static Tokens](https://www.tinybird.co/docs/docs/concepts/auth-tokens#what-should-i-use-tokens-for) . --- URL: https://www.tinybird.co/docs/guides/publishing-data/share-endpoint-documentation Last update: 2024-07-31T16:54:46.000Z Content: --- title: "Share API Endpoints documentation · Tinybird Docs" theme-color: "#171612" description: "In this guide you'll learn how to share your Tinybird API Endpoint documentation with development teams." --- # Share Tinybird API Endpoint documentation¶ In this guide, you'll learn how to share your Tinybird API Endpoint documentation with development teams. ## The Tinybird API Endpoint page¶ When you publish an API Endpoint, Tinybird generates a documentation page for you that is ready to share and OpenAPI-compatible (v3.0). It contains your API Endpoint description, information about the dynamic parameters you can use when querying this Endpoint, and code snippets for quickly integrating your API in 3rd party applications. To share your published API Endpoint, navigate to the "Create Chart" button (top right of the UI) > "Share this API Endpoint" modal. ## Use Static Tokens to define API Endpoint subsets¶ Tinybird authentication is based on [Tokens](https://www.tinybird.co/docs/docs/concepts/auth-tokens) which contain different scopes for specific resources. For example, a Token lets you read from one or many API Endpoints, or get write permissions for a particular Data Source. If you take a closer look at the URLs generated for sharing a public API Endpoint page, you'll see that after the Endpoint ID, it includes a Token parameter. This means that this page is only accessible if the Token provided in the URL has read permissions for it: https://api.tinybird.co/endpoint/t_bdcad2252e794c6573e21e7e?token= For security, Tinybird automatically generates a read-only Token when sharing a public API Endpoint page for the first time. If you don't explicitly use it, your Admin Token won't ever get exposed. ### The API Endpoints list page¶ Tinybird also allows you to render the API Endpoints information for a given Token. https://app.tinybird.co///endpoints?token= Enter the URL above (with your Token and the provider and region where the API Endpoint is published) into the browser, and it'll return a list that shows all API Endpoints that this Token can read from. <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fsharing-endpoints-documentation-with-development-teams-2.png&w=3840&q=75) <-figcaption-> The API Endpoints list page is extremely useful for sharing your API Endpoint documentation with development teams When integrating your API Endpoint in your applications, it is highly recommended that you manage dedicated Tokens. The easiest way is to create a Token for every application environment, so that you can also track the different requests to your API Endpoints by application, and choose which API Endpoints are accessible for them. Once you do that, you can share auto-generated documentation with ease, without compromising your data privacy and security. API Endpoint docs pages include a read Token by default.
In the "Share this API Endpoint" modal, you can also see public URLs for every Token with read permissions for your Pipe. ## Browse your docs in Swagger 😎¶ As mentioned above, all Tinybird's documentation is compatible with OpenAPI 3.0 and accessible via API. A quick way of generating documentation in Swagger is navigating to the "Create Chart" button > "Share this API Endpoint" modal > "OpenAPI 3.0" tab, copying the "Shareable link" URL, and using it in your preferred Swagger installation. <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fsharing-endpoints-documentation-with-development-teams-3.png&w=3840&q=75) <-figcaption-> You can generate as many URLs as you need by using different Tokens If you use a Token with permissions for more than one API Endpoint, the Swagger documentation will contain information about all the API Endpoints at once. ## Next steps¶ - You've got Endpoints, now make them pretty: Use[ Tinybird Charts](https://www.tinybird.co/docs/docs/publish/charts) . - Learn how to[ monitor and analyze your API performance](https://www.tinybird.co/docs/docs/guides/monitoring/analyze-endpoints-performance) . --- URL: https://www.tinybird.co/docs/guides/querying-data/adapt-postgres-queries Last update: 2024-11-19T13:41:56.000Z Content: --- title: "Adapt Postgres queries · Tinybird Docs" theme-color: "#171612" description: "Postgres is great as an OLTP database, but if you need real-time queries on hundreds of millions of rows, it’s not going to be fast enough. ClickHouse® is. Most queries from Postgres will look very similar on ClickHouse. Here you'll learn how to adapt most of the Postgres queries to run on ClickHouse." --- # Adapt Postgres queries¶ In this guide, you'll learn how to adapt Postgres queries to run on ClickHouse®, looking at a specific example. Postgres is great as an OLTP database, but if you need real-time queries on hundreds of millions of rows, it's not going to be fast enough. ClickHouse is. Most queries from Postgres will look very similar on ClickHouse. [Haki Benita](https://twitter.com/be_haki) has a [great blog post](https://hakibenita.com/sql-for-data-analysis) on how to do lots of operations like pivot tables, subtotals, linear regression, binning, or interpolation on Postgres, coming from a Pandas background. In this guide, you'll see how to adapt most of the Postgres queries from Haki's post to run on ClickHouse. ## Prerequisites¶ You don't need an active Tinybird Workspace to read through this guide, but it's a good idea to read Haki's post first so you're familiar with the examples. In addition, a working knowledge of ClickHouse and SQL is required. ## Common table expressions¶ ClickHouse [supports CTEs](https://clickhouse.tech/docs/en/sql-reference/statements/select/with/) . Both the `WITH AS ` as well as the `WITH AS ` syntaxes are supported. WITH emails AS ( SELECT 'ME@hakibenita.com' AS email ) SELECT * FROM emails ; WITH emails AS ( SELECT 'ME@hakibenita.com' AS email ) SELECT * FROM emails Query id: 6b234b03-6dc4-4ddf-8454-b03d34b75b60 ┌─email─────────────┐ │ ME@hakibenita.com │ └───────────────────┘ 1 rows in set. Elapsed: 0.014 sec. 
##### A CHAINED CTE ON POSTGRES WITH emails AS ( SELECT 'ME@hakibenita.com' AS email ), normalized_emails AS ( SELECT lower(email) AS email FROM emails ) SELECT * FROM normalized_emails; WITH emails AS ( SELECT 'ME@hakibenita.com' AS email ), normalized_emails AS ( SELECT lower(email) AS email FROM emails ) SELECT * FROM normalized_emails Query id: c511a113-1852-4a9f-90bf-99d33eba8254 ┌─email─────────────┐ │ me@hakibenita.com │ └───────────────────┘ 1 rows in set. Elapsed: 0.127 sec. ### On Tinybird¶ For now, we only support the `WITH <expression> AS <identifier>` syntax. So the previous queries would have to be rewritten like this: ##### CTES ON TINYBIRD WITH (SELECT 'ME@hakibenita.com') AS email SELECT email 2.00B, 2.00 x 1 ( 1.35ms ) There's a difference between CTEs on Postgres and on ClickHouse. In Postgres, as the original post says, "CTEs are a great way to split a big query into smaller chunks, perform recursive queries and even to cache intermediate results". On ClickHouse, CTEs can only return one row, so those intermediate results can't have multiple rows. For a similar result, on ClickHouse you have to use subqueries. A common pattern is returning a tuple of groupArrays in the CTE, so you can return more than one row in the form of arrays, and then consuming the results in the main query, for instance with transform or arrayJoin. In Tinybird, Pipes act like notebooks where each Node is a subquery and you can refer to the results of one Node in another Node. It's great to see intermediate results and reduce the complexity of your queries. If you'd like to try it out, [sign up here](https://www.tinybird.co/signup?utm_source=blog). ## Generating data¶ As you see in the original article, [in Postgres](https://hakibenita.com/sql-for-data-analysis#generating-data) there are several ways to generate data: ## Union all¶ This works the same in ClickHouse as in Postgres: ##### UNION ALL ALSO WORKS WITH dt AS ( SELECT 1 AS id, 'haki' AS name UNION ALL SELECT 2, 'benita' ) SELECT * FROM dt; WITH dt AS ( SELECT 1 AS id, 'haki' AS name UNION ALL SELECT 2, 'benita' ) SELECT * FROM dt Query id: e755e5a5-5e5b-4e8a-a262-935f9946d45d ┌─id─┬─name───┐ │ 2 │ benita │ └────┴────────┘ ┌─id─┬─name─┐ │ 1 │ haki │ └────┴──────┘ 2 rows in set. Elapsed: 0.051 sec. The VALUES keyword won't work on ClickHouse to select data, only to insert it. ## Joining data¶ The join syntax from Postgres will work on ClickHouse, but typically the kinds of analytical data that you'll store on ClickHouse will be orders of magnitude bigger than what you'd store on Postgres, and this would make your joins slow. There are ways to make JOINs faster; check the [best practices for writing faster SQL queries](https://www.tinybird.co/docs/docs/query/sql-best-practices) or contact us for guidance. ## Unnest - arrayJoin¶ [arrayJoin](https://clickhouse.tech/docs/en/sql-reference/functions/array-join/) is the ClickHouse equivalent of unnest on Postgres. So this Postgres query: ##### ON POSTGRES, UNNEST EXPANDS AN ARRAY INTO ROWS WITH dt AS ( SELECT unnest(array[1, 2]) AS n ) SELECT * FROM dt; Would be rewritten on ClickHouse like this: ##### ON CLICKHOUSE, ARRAYJOIN EXPANDS AN ARRAY INTO ROWS SELECT arrayJoin([1, 2]) AS dt 1.00B, 1.00 x 1 ( 0.79ms ) ## Generating series of data¶ ### Generate\_series¶ The generate_series function doesn't exist on ClickHouse, but with the [numbers](https://www.tinybird.co/docs/docs/) function we can accomplish a lot as well.
This is its basic usage: ##### NUMBERS PRODUCES ROWS SELECT * FROM numbers(10) 80.00B, 10.00 x 1 ( 0.71ms ) A similar result can be obtained with the [range](https://clickhouse.tech/docs/en/sql-reference/functions/array-functions/#rangeend-rangestart-end-step) function, which returns arrays. If you only provide one argument, it behaves like numbers, and with range you can also specify a start, end and step: ##### RANGE PRODUCES ARRAYS SELECT range(10), range(0, 10, 2) 1.00B, 1.00 x 2 ( 0.82ms ) This, combined with arrayJoin, lets us do the same as generate_series: ##### RANGE OF INTEGERS USING START, END AND STEP SELECT arrayJoin(range(0, 10, 2)) AS number 1.00B, 1.00 x 1 ( 0.98ms ) ## Generating time series¶ generate_series can produce results with types other than integers, while range only outputs integers. But with some smart logic we can achieve the same results. For example, on Postgres you'd generate a series with a datetime for each hour in a day this way, as in the original [post](https://hakibenita.com/sql-for-data-analysis#generating-data): ##### THE GENERATE\_SERIES FUNCTION OF POSTGRES WITH daterange AS ( SELECT * FROM generate_series( '2021-01-01 UTC'::timestamptz, -- start '2021-01-02 UTC'::timestamptz, -- stop interval '1 hour' -- step ) AS t(hh) ) SELECT * FROM daterange; hh ──────────────────────── 2021-01-01 00:00:00+00 2021-01-01 01:00:00+00 2021-01-01 02:00:00+00 ... 2021-01-01 22:00:00+00 2021-01-01 23:00:00+00 2021-01-02 00:00:00+00 ### Generate a time series specifying the start date and the number of intervals¶ On ClickHouse, you can achieve the same this way: ##### TIME SERIES ON CLICKHOUSE GIVEN START DATE, NUMBER OF INTERVALS AND INTERVAL SIZE WITH toDate('2021-01-01') as start SELECT addHours(toDate(start), number) AS hh FROM ( SELECT arrayJoin(range(0, 24)) AS number ) 1.00B, 1.00 x 1 ( 1.45ms ) ### Generate a time series specifying the start and end date and the step¶ Another way of doing the same thing: ##### TIME SERIES ON CLICKHOUSE GIVEN THE START AND END DATE AND THE STEP SIZE WITH toStartOfDay(toDate('2021-01-01')) AS start, toStartOfDay(toDate('2021-01-02')) AS end SELECT arrayJoin(arrayMap(x -> toDateTime(x), range(toUInt32(start), toUInt32(end), 3600))) as hh 1.00B, 1.00 x 1 ( 1.78ms ) ### Generate a time series using timeSlots¶ Using the [timeSlots](https://clickhouse.tech/docs/en/sql-reference/functions/date-time-functions/#timeslotsstarttime-duration-size) function, we can specify the start (DateTime), duration (seconds) and step (seconds) and it generates an array of DateTime values. ##### TIME SERIES ON CLICKHOUSE USING THE TIMESLOTS FUNCTION WITH toDateTime('2021-01-01 00:00:00') AS start SELECT arrayJoin(timeSlots(start, toUInt32(24 * 3600), 3600)) AS hh 1.00B, 1.00 x 1 ( 1.25ms ) ## Generating a random value¶ The [rand](https://clickhouse.tech/docs/en/sql-reference/functions/random-functions/#rand) function in ClickHouse is akin to random in Postgres, with the difference that rand returns a random UInt32 number between 0 and 4294967295 (2^32 - 1). So to get random floats between 0 and 1 like random, you have to divide the result by 4294967295.
##### GENERATING A RANDOM VALUE ON CLICKHOUSE WITH THE RAND FUNCTION SELECT rand() random_int, random_int / 4294967295 random_float 1.00B, 1.00 x 2 ( 0.75ms ) To get more than one row, you'd simply do: ##### GENERATING SEVERAL RANDOM VALUES ON CLICKHOUSE SELECT rand() random_int, random_int / 4294967295 random_float FROM numbers(100) 800.00B, 100.00 x 2 ( 1.37ms ) ## Generating random integers within a range¶ You'd apply the floor or ceil function (not round, for the reasons explained [here](https://hakibenita.com/sql-for-data-analysis#random) ) to the result of rand, normalized and multiplied by the max of the range of integers you want to generate, like this: ##### GENERATING SEVERAL RANDOM INTEGERS IN A GIVEN RANGE ON CLICKHOUSE SELECT ceil(rand() / 4294967295 * 3) AS n FROM numbers(10) 80.00B, 10.00 x 1 ( 1.38ms ) And here you can see that the distribution is uniform (this wouldn't happen if you had used round): ##### THE DISTRIBUTION IS UNIFORM SELECT ceil(rand() / 4294967295 * 3) AS n, count(*) FROM numbers(10000) GROUP BY n ORDER BY n 80.00KB, 10.00k x 2 ( 3.07ms ) ## Sampling data from a list¶ This is how you'd take samples with replacement from a list in Postgres: ##### RANDOM CHOICE IN POSTGRES SELECT (array['red', 'green', 'blue'])[ceil(random() * 3)] AS color FROM generate_series(1, 5); In ClickHouse, this is how you'd do it: ##### RANDOM CHOICE IN CLICKHOUSE SELECT ['red', 'green', 'blue'][toInt32(ceil(rand() / 4294967295 * 3))] AS color FROM numbers(5) 40.00B, 5.00 x 1 ( 2.26ms ) To get only one value, you'd remove the FROM numbers(5) part. Note that to define an array on ClickHouse you can either call array('red', 'green', 'blue') or use ['red', 'green', 'blue'] like in the code snippet. ## Sampling data from a table¶ Sorting data by rand() can be used to get a random sample, like here: ##### SAMPLE DATA USING RAND SELECT * FROM events_mat ORDER BY rand() LIMIT 100 6.80GB, 50.00m x 8 ( 398.46ms ) But this is slow, as a full scan of the table has to be run here. A more efficient way to do it is using the [SAMPLE](https://clickhouse.tech/docs/en/sql-reference/statements/select/sample/) clause. You can pass an integer to it (it should be large enough, typically above 1000000): ##### THE SAMPLE CLAUSE, PASSING AN INTEGER SELECT * FROM events_mat SAMPLE 1000000 2.23MB, 16.38k x 8 ( 5.46ms ) And you can also pass a float between 0 and 1, to indicate the fraction of the data that will be sampled. ##### THE SAMPLE CLAUSE, PASSING A FLOAT SELECT * FROM events_mat SAMPLE 0.01 53.50MB, 393.22k x 8 ( 43.28ms ) ## Descriptive statistics on a numeric series¶ ClickHouse also comes with lots of statistical functions, like Postgres does (see [this section](https://hakibenita.com/sql-for-data-analysis#describing-a-series) of the original post).
The first query, written on Postgres this way: ##### DESCRIPTIVE STATISTICS ON POSTGRES WITH s AS ( SELECT * FROM (VALUES (1), (2), (3)) AS t(n) ) SELECT count(*), avg(n), stddev(n), min(n), percentile_cont(array[0.25, 0.5, 0.75]) WITHIN GROUP (ORDER BY n), max(n) FROM s; count │ avg │ stddev │ min │ percentile_cont │ max ──────┼────────┼───────────┼─────┼─────────────────┼───── 3 │ 2.0000 │ 1.0000000 │ 1 │ {1.5,2,2.5} │ 3 Would be done on ClickHouse like this: ##### DESCRIPTIVE STATISTICS ON CLICKHOUSE SELECT count(*), avg(n), stddevSamp(n), min(n), quantiles(0.25, 0.5, 0.75)(n), max(n) FROM (SELECT arrayJoin([1,2,3]) AS n) 1.00B, 1.00 x 6 ( 3.35ms ) ## Descriptive statistics on categorical series¶ ClickHouse can also be used to get some statistics from discrete values. While on Postgres you'd do this: ##### DESCRIPTIVE STATISTICS OF CATEGORICAL VALUES ON POSTGRES WITH s AS (SELECT unnest(array['a', 'a', 'b', 'c']) AS v) SELECT count(*), count(DISTINCT V) AS unique, mode() WITHIN GROUP (ORDER BY V) AS top FROM s; count │ unique │ top ───────┼────────┼───── 4 │ 3 │ a On ClickHouse you'd do: ##### DESCRIPTIVE STATISTICS OF CATEGORICAL VALUES ON CLICKHOUSE SELECT count(*) AS count, uniq(v) AS unique, topK(1)(v) top FROM (SELECT arrayJoin(['a', 'b', 'c', 'd']) AS v) 1.00B, 1.00 x 3 ( 3.17ms ) [uniq](https://clickhouse.tech/docs/en/sql-reference/aggregate-functions/reference/uniq/) will provide approximate results when your data is very big. If you need exact results you can use [uniqExact](https://clickhouse.tech/docs/en/sql-reference/aggregate-functions/reference/uniqexact/) , but be aware that uniq will generally be faster than uniqExact. Check out the [topK](https://clickhouse.tech/docs/en/sql-reference/aggregate-functions/reference/topk/) docs as well if you're interested. As a side note, if you have categorical columns, most likely you'll get [better performance](https://altinity.com/blog/2019/3/27/low-cardinality) and lower storage cost with the LowCardinality data type. The performance of using LowCardinality will be better than using the base data types even on columns with more than a few million different values. This is what Instana found out as well - read [their full post here](https://www.instana.com/blog/reducing-clickhouse-storage-cost-with-the-low-cardinality-type-lessons-from-an-instana-engineer/): *"When we came across the LowCardinality data type the first time, it seemed like nothing we could use. We assumed that our data is just not homogeneous enough to be able to use it. But when looking at it recently again, it turns out we were very wrong. The name LowCardinality is slightly misleading. It actually can be understood as a dictionary. And according to our tests, it still performs better and is faster even when the column contains millions of different values"* ## Subtotals and aggregations¶ The same operations done in [this section](https://hakibenita.com/sql-for-data-analysis#subtotals) of Haki's post can be done with ClickHouse. Given a table that contains this data: ##### EMPLOYEES SELECT * FROM employees 261.00B, 6.00 x 3 ( 2.36ms ) Finding the number of employees with each role is straightforward, with the same syntax as on Postgres: ##### EMPLOYEES PER ROLE SELECT department, role, count(*) count FROM employees GROUP BY department, role 182.00B, 6.00 x 3 ( 3.02ms ) ### Using rollup and cube¶ The ROLLUP modifier is also available on ClickHouse, although the syntax is slightly different than on Postgres.
This query on Postgres: ##### GROUP BY WITH ROLLUP ON POSTGRES SELECT department, role, COUNT(*) FROM employees GROUP BY ROLLUP(department, role); would be written on ClickHouse like this: ##### GROUP BY WITH ROLLUP ON CLICKHOUSE SELECT department, role, COUNT(*) FROM employees GROUP BY department, role WITH ROLLUP 182.00B, 6.00 x 3 ( 2.79ms ) ROLLUP gives you additional subtotals (but not all of them). To have all the subtotals for all the possible combinations of grouping keys, you need to use the [CUBE](https://clickhouse.tech/docs/en/sql-reference/statements/select/group-by/#with-cube-modifier) modifier: ##### GROUP BY WITH CUBE ON CLICKHOUSE SELECT department, role, COUNT(*) FROM employees GROUP BY department, role WITH CUBE 182.00B, 6.00 x 3 ( 3.45ms ) ## Pivot tables and conditional expressions¶ Pivot tables let you reshape data: typically you have a column with keys, a column with categories and a column with values, and you want to aggregate those values and use the categories as the columns of a new table. On Postgres you could do it this way: ##### PIVOT TABLE CREATED MANUALLY ON POSTGRES SELECT role, COUNT(*) FILTER (WHERE department = 'R&D') as "R&D", COUNT(*) FILTER (WHERE department = 'Sales') as "Sales" FROM employees GROUP BY role; role │ R&D │ Sales ───────────┼─────┼─────── Manager │ 1 │ 1 Developer │ 2 │ 2 On ClickHouse, you could do the same this way: ##### PIVOT TABLE CREATED MANUALLY ON CLICKHOUSE SELECT role, countIf(department = 'R&D') as "R&D", countIf(department = 'Sales') as "Sales" FROM employees GROUP BY role 182.00B, 6.00 x 3 ( 3.43ms ) The original [post](https://hakibenita.com/sql-for-data-analysis#pivot-tables) doesn't mention this, but Postgres has a very convenient [crosstab function](https://www.postgresql.org/docs/9.1/tablefunc.html) that lets you do what we've done here in one line. If the number of categories to pivot is large, you can imagine how long this query could become if done manually and how handy the crosstab function can be. Something like this is not available yet on ClickHouse, unfortunately. ## Running and Cumulative Aggregations¶ Aggregations over sliding windows are a common need. This can be done with window functions, or with the [groupArrayMovingSum](https://clickhouse.tech/docs/en/sql-reference/aggregate-functions/reference/grouparraymovingsum/) and [groupArrayMovingAvg](https://clickhouse.tech/docs/en/sql-reference/aggregate-functions/reference/grouparraymovingavg/) functions, which have been available in stable releases for a long time. This is an example of their usage. Given this dataset: ##### GOOGLE TRENDS DATA FOR THE TERM 'AMAZON' SELECT date, amazon as value FROM amazon_trends 8.42KB, 1.40k x 2 ( 1.79ms ) We could compute a 7-day moving average of value like this: ##### 7-DAY MOVING AVERAGE SELECT * FROM (SELECT groupArray(date) as date_arr, groupArray(value) as value_arr, groupArrayMovingAvg(7)(value) mov_avg FROM (SELECT date, amazon as value FROM amazon_trends)) ARRAY JOIN * 8.42KB, 1.40k x 3 ( 11.75ms ) The periods parameter is optional. If you omit it, all the previous rows are used for the aggregation.
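On ClickHouse versions with window function support, the same 7-day moving average can also be written with an `OVER` clause. This is a sketch over the same `amazon_trends` data; the frame bounds are the only assumption here:

##### 7-DAY MOVING AVERAGE WITH A WINDOW FUNCTION (SKETCH)
SELECT
    date,
    value,
    -- average of the current row and the 6 preceding rows, ordered by date
    avg(value) OVER (ORDER BY date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) AS mov_avg
FROM (SELECT date, amazon AS value FROM amazon_trends)
ORDER BY date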
## Linear regression¶ Given this data ##### DATA TO PERFORM LINEAR REGRESSION SELECT arrayJoin([[1.2, 1], [2, 1.8], [3.1, 2.9]])[1] x, arrayJoin([[1.2, 1], [2, 1.8], [3.1, 2.9]])[2] y 1.00B, 1.00 x 2 ( 2.71ms ) On Postgres, we can see in the [original post](https://hakibenita.com/sql-for-data-analysis#linear-regression) that you'd do linear regression like this: ##### LINEAR REGRESSION ON POSTGRES WITH t AS (SELECT * FROM (VALUES (1.2, 1.0), (2.0, 1.8), (3.1, 2.9) ) AS t(x, y)) SELECT regr_slope(y, x) AS slope, regr_intercept(y, x) AS intercept, sqrt(regr_r2(y, x)) AS r FROM t; slope │ intercept │ r ────────────────────┼──────────────────────┼─── 1.0000000000000002 │ -0.20000000000000048 │ 1 On ClickHouse, the simpleLinearRegression aggregate function gives you the slope and the intercept. There's no function like regr_r2 that gives you the R2 coefficient, but it wouldn't be hard to calculate it yourself as the formula is [simple](https://www.google.com/search?q=r2+formula). ## Filling null values.¶ This part is called " [interpolation](https://hakibenita.com/sql-for-data-analysis#interpolation) " in Haki's post. Filling null values with Pandas is a one-liner. Imagine we have this table ##### A TABLE WITH AN INT AND A STRING COLUMN, WITH SOME NULL VALUES SELECT * FROM num_str 88.00B, 7.00 x 2 ( 2.62ms ) ## Fill null values with a constant value¶ The way to replace all the null values with a constant value is the [coalesce](https://clickhouse.tech/docs/en/sql-reference/functions/functions-for-nulls/#coalesce) function, which works in ClickHouse the same way it does in Postgres: ##### FILLING NULL VALUES WITH A CONSTANT SELECT n, coalesce(v, 'X') AS v FROM num_str 88.00B, 7.00 x 2 ( 3.86ms ) ## Back and forward filling data¶ This is also a one-liner in Pandas. In the corresponding [section](https://hakibenita.com/sql-for-data-analysis#back-and-forward-fill) of the original post, the author does this using correlated subqueries, but those [aren't supported yet](https://github.com/ClickHouse/ClickHouse/issues/6697) on ClickHouse. Fortunately, ClickHouse comes with a lot of powerful array functions like [groupArray](https://clickhouse.tech/docs/en/sql-reference/aggregate-functions/reference/grouparray/), [arrayFill](https://clickhouse.tech/docs/en/sql-reference/functions/array-functions/#array-fill) and [arrayReverseFill](https://clickhouse.tech/docs/en/sql-reference/functions/array-functions/#array-reverse-fill). groupArray, like the other [aggregate functions](https://clickhouse.tech/docs/en/sql-reference/aggregate-functions/) on ClickHouse, skips null values. So the solution involves replacing them by another value (make sure that the new value doesn't already appear in the column). This is done with the ifNull function. Add some array magic in, and this is how you'd do it: ##### BACK AND FORWARD FILLING VALUES SELECT values.1 n, values.2 v, values.3 v_ffill, values.4 v_bfill FROM (SELECT arrayJoin( arrayZip( groupArray(n) AS n, arrayMap(x -> x != 'wadus' ? x : null, groupArray(v_nulls_replaced)) AS v, arrayFill(x -> x != 'wadus', groupArray(v_nulls_replaced)) AS v_ffill, arrayReverseFill(x -> x != 'wadus', groupArray(v_nulls_replaced)) AS v_bfill ) ) values FROM (SELECT *, ifNull(v, 'wadus') v_nulls_replaced FROM num_str ORDER BY n ASC) ) 88.00B, 7.00 x 4 ( 12.43ms ) To understand what's going on here, import [this Pipe](https://www.tinybird.co/docs/docs/assets/pipes/back_and_forward_fill_on_clickhouse.pipe) with a step-by-step explanation and the results of the transformations that are taking place.
Tinybird lets you run each subquery in a Node of a notebook-like UI (we call them Pipes). This lets you build and debug complex queries in a cleaner way. If you'd like to use it, sign up [here](https://www.tinybird.co/signup). ## Filling gaps in time-series, reshaping indexes.¶ Sometimes, you'll group a time-series by a Date or DateTime column, and it can happen that the intervals between rows are not always the same because there were no values found for some dates or datetimes. In Pandas, the solution would be creating a new date_range index and then re-indexing the original Series/DataFrame with that index, as explained [here](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.reindex.html). On ClickHouse, the same can be accomplished with the [WITH FILL modifier](https://clickhouse.tech/docs/en/sql-reference/statements/select/order-by/#orderby-with-fill) . Here's a simple example of it: ##### FILLING GAPS IN TIME ON CLICKHOUSE SELECT toDate((number * 2) * 86400) AS d1, 'string_value' as string_col, toInt32(rand() / exp2(32) * 100) as n FROM numbers(10) ORDER BY d1 WITH FILL STEP 1 80.00B, 10.00 x 3 ( 11.19ms ) The `STEP 1` part is not necessary here as it's the default, but know that you can set a different value than 1. ## Linear interpolation¶ Imagine you have a table like this, containing a time-series with some rows missing. We could fill those missing gaps with the WITH FILL expression previously shown, but that way we'd just get zeroes when there's a missing value, while the actual missing value is probably closer to the previous and the next values than to zero. ##### TIME-SERIES WITH MISSING ROWS SELECT *, bar(value, 0, 100, 20) FROM trends_with_gaps ORDER BY date WITH FILL STEP 1 5.68KB, 947.00 x 3 ( 2.72ms ) Linear interpolation is the simplest way to fill those missing values. In it, missing values are replaced by the average of the previous and the next known values. On Postgres, Haki's post explains how to do it [here](https://hakibenita.com/sql-for-data-analysis#linear-interpolation). On ClickHouse, this can be done with arrays. Check out [this](https://altinity.com/blog/harnessing-the-power-of-clickhouse-arrays-part-3) great post by Altinity for an in-depth explanation. This is how it'd be done with this dataset: ##### INTERPOLATING VALUES SELECT date, value, value_interpolated, bar(value_interpolated, 0, 100, 20) AS value_interpolated_bar FROM ( SELECT groupArray(date) AS dt_arr, groupArray(value) AS value_arr, arrayFill(x -> ((x.1) > 0), arrayZip(value_arr, dt_arr)) AS value_lower, arrayReverseFill(x -> ((x.1) > 0), arrayZip(value_arr, dt_arr)) AS value_upper, arrayMap((l, u, v, dt) -> if(v > 0, v, (l.1) + ((((u.1) - (l.1)) / ((u.2) - (l.2))) * (dt - (l.2)))), value_lower, value_upper, value_arr, dt_arr) AS value_interpolated FROM ( SELECT * FROM trends_with_gaps ORDER BY date WITH FILL STEP 1 ) ) ARRAY JOIN dt_arr AS date, value_interpolated, value_arr AS value 5.68KB, 947.00 x 4 ( 10.23ms ) For a step-by-step explanation of how this works, and to see how you could construct this query iteratively with a notebook-like interface on Tinybird, import [this Pipe](https://www.tinybird.co/docs/docs/assets/pipes/interpolating_gaps.pipe). ## Binning and histograms¶ The original post talks about custom binning, equal-width and equal-height binning. The way to do custom binning is very similar on ClickHouse, which also supports CASE statements. 
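For instance, this is a sketch of custom binning over the `trends_with_gaps` table used above; the bucket edges are arbitrary assumptions:

##### CUSTOM BINNING WITH A CASE EXPRESSION (SKETCH)
SELECT
    -- hand-picked, arbitrary bucket edges
    CASE
        WHEN value < 25 THEN 'low'
        WHEN value < 75 THEN 'medium'
        ELSE 'high'
    END AS bucket,
    count() AS rows_in_bucket
FROM trends_with_gaps
GROUP BY bucket
ORDER BY bucket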
Equal height binning could be achieved with the [quantiles](https://clickhouse.tech/docs/en/sql-reference/aggregate-functions/reference/quantiles/) function, already described in the descriptive statistics section. The most interesting use-case of equal-width binning is creating histograms, which is very easy on ClickHouse. It even comes with a [histogram](https://clickhouse.tech/docs/en/sql-reference/aggregate-functions/parametric-functions/#histogram) function, which receives a number of bins and the data, and returns a list of tuples containing the lower and upper bounds of each bucket, as well as its height: ##### THE HISTOGRAM FUNCTION SELECT histogram(10)(value) AS values FROM trends_with_gaps 3.79KB, 947.00 x 1 ( 2.29ms ) You can then extract the values from each tuple and even build a visual representation of the data, for example with the bar function used in the interpolation examples above. ## Running ClickHouse without worrying about it¶ [Tinybird](https://www.tinybird.co/) lets you do real-time analytics on huge amounts of data, powered by ClickHouse, without having to worry about scalability, hosting or maintaining any ClickHouse clusters. With it, you can ingest huge datasets and streaming data, analyze it with SQL and publish dynamic API Endpoints on top of those queries in a couple of clicks. Our [product](https://www.tinybird.co/product) is already being used by some big companies and we've recently been featured on [Techcrunch](https://techcrunch.com/2021/07/05/tinybird-turns-raw-data-into-realtime-api-at-scale) . To use Tinybird, [sign up here](https://www.tinybird.co/signup?utm_source=blog) . Tinybird is not affiliated with, associated with, or sponsored by ClickHouse, Inc. ClickHouse® is a registered trademark of ClickHouse, Inc. --- URL: https://www.tinybird.co/docs/guides/querying-data/deduplication-strategies Last update: 2024-11-13T15:42:17.000Z Content: --- title: "Deduplicate data in your Data Source · Tinybird Docs" theme-color: "#171612" description: "Learn several strategies for deduplicating data in Tinybird." --- # Deduplicate data in your Data Source¶ Sometimes you might need to deduplicate data, for example when receiving updates or data from a transactional database through CDC. You might want to retrieve only the latest data point, or keep a historic record of the evolution of the attributes of an object over time. Because ClickHouse® doesn't enforce uniqueness for primary keys when inserting rows, you need to follow different strategies to deduplicate data with minimal side effects. ## Deduplication strategies¶ You can use one of the following strategies to deduplicate your data. | Method | When to use | | --- | --- | | [ Deduplicate at query time](https://www.tinybird.co/docs/about:blank#deduplicate-at-query-time) | Deduplicate data at query time if you are still prototyping or the Data Source is small. | | [ Use ReplacingMergeTree](https://www.tinybird.co/docs/about:blank#use-the-replacingmergetree-engine) | Use `ReplacingMergeTree` or `AggregatingMergeTree` for greater performance. | | [ Snapshot based deduplication](https://www.tinybird.co/docs/about:blank#snapshot-based-deduplication) | If data freshness isn't required, generate periodic snapshots of the data and take advantage of subsequent Materialized Views for rollups. | | [ Hybrid approach using Lambda architecture](https://www.tinybird.co/docs/about:blank#hybrid-approach-using-lambda-architecture) | When you need to overcome engine approach limitations while preserving freshness, combine approaches in a Lambda architecture.
| For dimensional and small tables, a periodical full replace is usually the best option. ## Example case¶ Consider a dataset from a social media analytics company that wants to track some data content over time. You receive an event with the latest info for each post, identified by `post_id` . The three fields, `views`, `likes`, `tags` , vary from event to event. For example: ##### post.ndjson { "timestamp": "2024-07-02T02:22:17", "post_id": 956, "views": 856875, "likes": 2321, "tags": "Sports" } ## Deduplicate at query time¶ Imagine you're only interested in the latest value of views for each post. In that case, you can deduplicate data on `post_id` and get the latest value with these strategies: - Get the max date for each post in a subquery and then filter by its results. - Group data by `post_id` and use the `argMax` function. - Use the `LIMIT BY` clause. Select `Subquery`, `argMax` , or `LIMIT BY` to see the example queries for each. - Subquery - argMax - LIMIT BY ##### Deduplicating data on post\_id using Subquery SELECT * FROM posts_info WHERE (post_id, timestamp) IN ( SELECT post_id, max(timestamp) FROM posts_info GROUP BY post_id ) Depending on your data and how you define the sorting keys in your Data Sources to store it on disk, one approach is faster than the others. In general, deduplicating at query time is fine if the size of your data is small. If you have lots of data, use a specific Engine that takes care of deduplication for you. ## Use the ReplacingMergeTree engine¶ If you've lots of data and you're interested in the latest insertion for each unique key, use the ReplacingMergeTree engine with the following options: `ENGINE_SORTING_KEY`, `ENGINE_VER` , and `ENGINE_IS_DELETED`. - Rows with the same `ENGINE_SORTING_KEY` are deduplicated. You can select one or more columns. - If you specify a type for `ENGINE_VER` , the row with the highest `ENGINE_VER` for each unique `ENGINE_SORTING_KEY` is kept, for example a timestamp. - `ENGINE_IS_DELETED` is only active if you use `ENGINE_VER` . This column determines whether the row represents the state or is to be deleted; `1` is a deleted row, `0` is a state row. The type must be `UInt8` . - You can omit `ENGINE_VER` , so that the last inserted row for each unique `ENGINE_SORTING_KEY` is kept. Aggregation or rollups in Materialized Views built on top of ReplacingMergeTree queries always contain duplicated data. ### Define a Data Source¶ Define a Data Source like the following: ##### post\_views\_rmt.datasource DESCRIPTION > Data Source to save post info. ReplacingMergeTree Engine. SCHEMA > `post_id` Int32 `json:$.post_id`, `views` Int32 `json:$.views`, `likes` Int32 `json:$.likes`, `tag` String `json:$.tag`, `timestamp` DateTime `json:$.timestamp`, `_is_deleted` UInt8 `json:$._is_deleted` ENGINE "ReplacingMergeTree" ENGINE_PARTITION_KEY "" ENGINE_SORTING_KEY "post_id" ENGINE_VER "timestamp" ENGINE_IS_DELETED "_is_deleted" ReplacingMergeTree deduplicates during a merge, and merges can't be controlled. Consider adding the `FINAL` clause, or an alternative deduplication method, to apply the merge at query time. Note also that rows are masked, not removed, when using `FINAL`. - FINAL - Subquery - argMax - LIMIT BY ##### Deduplicating data on post\_id using FINAL SELECT * FROM posts_info_rmt FINAL You can define the `posts_info_rmt` as the landing Data Source, the one you send events to, or as a Materialized View from `posts_info` . 
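If you'd rather avoid `FINAL` , the same deduplication can be expressed at query time with `argMax` over `posts_info_rmt` . This is a sketch that ignores `_is_deleted` handling for brevity:

##### Deduplicating posts_info_rmt at query time with argMax (sketch)
-- rows flagged with _is_deleted = 1 would still need to be filtered out
SELECT
    post_id,
    argMax(views, timestamp) AS views,
    argMax(likes, timestamp) AS likes,
    argMax(tag, timestamp) AS tag,
    max(timestamp) AS timestamp
FROM posts_info_rmt
GROUP BY post_id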
You can also create a Data Source with an `AggregatingMergeTree` Engine using `maxState(ts)` and `argMaxState(field,ts)`. ## Snapshot based deduplication¶ Use [Copy Pipes](https://www.tinybird.co/docs/docs/publish/copy-pipes) to take a query result and write it to a new Data Source in the following situations: - You need other Sorting Keys that might change with updates. - You need to do rollups and want to use Materialized Views. - Response times are too long with a `ReplacingMergeTree` . The following is an example snapshot: ##### post\_generate\_snapshot.pipe NODE gen_snapshot SQL > SELECT post_id, argMax(views, timestamp) views, argMax(likes, timestamp) likes, argMax(tag, timestamp) tag, max(timestamp) as ts, toStartOfMinute(now()) - INTERVAL 1 MINUTE as snapshot_ts FROM posts_info WHERE timestamp <= toStartOfMinute(now()) - INTERVAL 1 MINUTE GROUP BY post_id TYPE copy TARGET_DATASOURCE post_snapshot COPY_MODE replace COPY_SCHEDULE 0 * * * * Because the `TARGET_DATASOURCE` engine is a MergeTree, you can use fields that you expect to be updated as sorting keys, something that isn't possible with the ReplacingMergeTree. ##### post\_snapshot.datasource SCHEMA > `post_id` Int32, `views` Int32, `likes` Int32, `tag` String, `ts` DateTime, `snapshot_ts` DateTime ENGINE "MergeTree" ENGINE_PARTITION_KEY "" ENGINE_SORTING_KEY "tag, post_id" ## Hybrid approach using Lambda architecture¶ Snapshots might decrease data freshness, and running Copy Pipes too frequently might be more expensive than Materialized Views. A way to mitigate these issues is to combine batch and real-time processing, reading the latest snapshot and incorporating the changes that happened since then. This pattern is described in the [Lambda architecture](https://www.tinybird.co/docs/docs/guides/querying-data/lambda-architecture) guide. See a practical example in the [CDC using Lambda](https://www.tinybird.co/docs/docs/guides/querying-data/lambda-example-cdc) guide. Using the `post_snapshot` Data Source created before, the real-time Pipe would look like the following: ##### latest\_values.pipe NODE get_latest_changes SQL > SELECT max(timestamp) last_ts, post_id, argMax(views, timestamp) views, argMax(likes, timestamp) likes, argMax(tag, timestamp) tag FROM posts_info_rmt WHERE timestamp > (SELECT max(snapshot_ts) FROM post_snapshot) GROUP BY post_id NODE get_snapshot SQL > SELECT ts AS last_ts, post_id, views, likes, tag FROM post_snapshot WHERE snapshot_ts = (SELECT max(snapshot_ts) FROM post_snapshot) AND post_id NOT IN (SELECT post_id FROM get_latest_changes) NODE combine_both SQL > SELECT * FROM get_snapshot UNION ALL SELECT * FROM get_latest_changes ## Next steps¶ - Read the[ Materialized Views docs](https://www.tinybird.co/docs/docs/publish/materialized-views#creating-a-materialized-view-in-the-tinybird-ui) . - Read the[ Lambda architecture guide](https://www.tinybird.co/docs/docs/guides/querying-data/lambda-architecture) . - Visualize your data using[ Tinybird Charts](https://www.tinybird.co/docs/docs/publish/charts) . Tinybird is not affiliated with, associated with, or sponsored by ClickHouse, Inc. ClickHouse® is a registered trademark of ClickHouse, Inc. --- URL: https://www.tinybird.co/docs/guides/querying-data/dynamic-aggregation Last update: 2024-10-28T11:06:14.000Z Content: --- title: "Rollup aggregations with query parameters · Tinybird Docs" theme-color: "#171612" description: "Learn how to dynamically aggregate time series data by different time intervals (rollups) to optimize frontend performance." 
--- # Rollup aggregations with query parameters¶ In this guide, you'll learn how to dynamically aggregate time series data by different time intervals (rollups) to optimize frontend performance. These days, it is not uncommon to have datasets with a per-second resolution for a few years' worth of data. This creates some demands at the storage and the query layers, and it also presents some challenges for the data consumers. Aggregating this data dynamically and on-the-fly is key for resolving the specific demands at scale. In this guide, you'll create an API Endpoint that aggregates the data in different time-frames depending on the amount of data, so only the needed rows are sent to the frontend, thereby gaining performance and speed. When preparing the API Endpoint you'll focus on 3 things: 1. Keep the API Endpoint interface extremely simple: "I want events data from this** start date** to this** end date** ". 2. Make the API Endpoint return enough data with adequate resolution for the selected date range. 3. Don't add logic to the frontend to do the aggregation of the returned data. You simply request the desired date range knowing that you will receive an amount of data that won't swamp your rendering pipeline. ## Prerequisites¶ You'll need to have at least read through the [quick start guide](https://www.tinybird.co/docs/docs/quick-start) to be familiar with the scenario. The following guide uses a Data Source called `events` with 100M rows, for a ~5-year timespan. ## 1. Build the Pipe¶ In this step, you'll learn how to use Tinybird's [templating language](https://www.tinybird.co/docs/docs/query/query-parameters) to add more logic to your API Endpoints. In addition to the main docs on [using the templating language to pass query parameters](https://www.tinybird.co/docs/docs/query/query-parameters) to Endpoints, you can also learn about variables and the functions that are available within templates in the [CLI > Advanced Templates](https://www.tinybird.co/docs/docs/cli/advanced-templates#template-functions) docs. The Endpoint created in the [quick start guide](https://www.tinybird.co/docs/docs/quick-start) returns **sales per day** of an ecommerce store, and takes a start and end date. The problem of having a fixed period of one day to aggregate data is that the amount of data transferred will vary a lot, depending on the selected dates, as well as the work that the client will have to do on the frontend to render that data. Also, if the start and end dates are close to each other, the grouping window will be too big and the backend won't return data with a high enough level of detail. Fortunately, you can add **conditional logic** when defining API Endpoints in Tinybird and, depending on the range of dates selected, the Endpoint will return data grouped by one of these periods: - Weekly - Daily - Every 4 hours - Every 1 hour - Every 15 minutes To do this, you can create a new Pipe named `ecommerce_events_dynamic` . In the first Node, add the following code to 1) keep only the `buy` events, and 2) cast the `price` column to `Float32` changing its name to `buy_events_with_price`: ##### Filtering raw events Data Source to keep only BUY events SELECT date, product_id, user_id, toFloat32(JSONExtractFloat(extra_data, 'price')) price FROM events WHERE event='buy' And then, add another transformation Node containing the following query. 
Name this Node `buy_events_dynamic_agg`: ##### Using dynamic parameters to aggregate data depending on the time range % SELECT {% set days_interval = day_diff(Date(start_date, '2018-01-01'), Date(end_date, '2018-01-31')) %} {% if days_interval > 180 %} toStartOfWeek(date) {% elif days_interval > 31 %} toStartOfDay(date) {% elif days_interval > 7 %} toStartOfInterval(date, INTERVAL 4 HOUR) {% elif days_interval > 2 %} toStartOfHour(date) {% else %} toStartOfFifteenMinutes(date) {% end %} AS t, round(sum(price), 2) AS sales FROM buy_events_with_price WHERE date BETWEEN toDateTime(toDate({{Date(start_date, '2018-10-01')}})) AND toDateTime(toDate({{Date(end_date, '2020-11-01')}}) + 1) GROUP BY t ORDER BY t This query makes use of Tinybird's templating language: - It defines a couple of Date parameters and adds some default values to be able to test the Pipe while you build it ( `{{Date(start_date, '2018-01-01')}}` and `{{Date(end_date, '2018-12-31')}}` ). - It computes the number of days in the interval defined by `start_date` and `end_date` : `days_interval = day_diff(...)` . - It uses the `days_interval` variable to decide the best granularity for the data. Using the templating language additions might look a little complicated at a first glance, but you'll quickly become familiar with it! ## 2. Publish your API Endpoint¶ Selecting the `Publish` button > `buy_events_dynamic_agg` Node makes your API accessible immediately. Once it's published you can directly test it using the snippets available in the API Endpoint page or using Tinybird's REST API. Just change the `start_date` and the `end_date` parameters to see how the aggregation window changes dynamically. ##### Testing the API Endpoint using cURL TOKEN= curl -s "https://api.tinybird.co/v0/pipes/ecommerce_events_dynamic.json?start_date=2018-10-01&end_date=2018-11-01&token=$TOKEN" \ | jq '.data[:2]' [ { "t": "2018-10-01 00:00:00", "sales": 66687.38 }, { "t": "2018-10-01 04:00:00", "sales": 50821.24 } ] Use a Token with the right scope. Replace `` with a Token whose [scope](https://www.tinybird.co/docs/docs/api-reference/token-api) is `READ` or higher. To sum up: With Tinybird, you can dynamically return different responses from your analytics API Endpoints depending on the request's parameters. This can give you more granular control over the data you send to the client, either for performance or privacy reasons. ## Next steps¶ - Learn more about the[ using the templating language to pass query parameters](https://www.tinybird.co/docs/docs/query/query-parameters) . - Need to deduplicate your data? Read the[ deduplication strategies guide](https://www.tinybird.co/docs/docs/guides/querying-data/deduplication-strategies) . --- URL: https://www.tinybird.co/docs/guides/querying-data/lambda-architecture Last update: 2024-11-07T09:52:34.000Z Content: --- title: "Build a lambda architecture in Tinybird · Tinybird Docs" theme-color: "#171612" description: "In this guide, you'll learn a useful alternative processing pattern for when the typical Tinybird flow doesn't fit." --- # Build a lambda architecture in Tinybird¶ In this guide, you'll learn a useful alternative processing pattern for when the typical Tinybird flow doesn't fit. This page introduces a useful data processing pattern for when the typical Tinybird flow (Data Source --> incremental transformation through Materialized Views --> and API Endpoint publication) does not fit. 
Sometimes, the way Materialized Views work means you need to use **Copy Pipes** to create the intermediate Data Sources that will keep your API Endpoints performant. ## The ideal Tinybird flow¶ You ingest data (usually streamed in, but can also be in batch), transform it using SQL, and serve the results of the queries via [parameterizable](https://www.tinybird.co/docs/docs/query/query-parameters) API Endpoints. Tinybird provides freshness, low latency, and high concurrency: Your data is ready to be queried as soon as it arrives. <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fguides-lambda-1.png&w=3840&q=75) <-figcaption-> Data flow with Data Source and API Endpoint Sometimes, transforming the data at query time is not ideal. Some operations - doing aggregations, or extracting fields from JSON - are better if done at ingest time, then you can query that prepared data. [Materialized Views](https://www.tinybird.co/docs/docs/publish/materialized-views/overview) are perfect for this kind of situation. They're triggered at ingest time and create intermediate tables (Data Sources in Tinybird lingo) to keep your API Endpoints performance super efficient. <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fguides-lambda-2.png&w=3840&q=75) <-figcaption-> Data flow with Data Source, MV, and API Endpoint The best practice for this approach is usually having a Materialized View (MV) per use case: <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fguides-lambda-3.png&w=3840&q=75) <-figcaption-> Materialized Views for different use cases If your use case fits in these first two paragraphs, stop reading. No need to over-engineer it. ## When the ideal flow isn't enough¶ There are some cases where you may need intermediate Data Sources (tables) and Materialized Views do not fit. - Most common: Things like Window Functions where you need to check the whole table to make calculations. - Fairly common: Needing an Aggregation MV over a deduplication table (ReplacingMergeTree). - Scenarios where Materialized Views fit but are not super efficient (hey `uniqState` ). - And lastly, one of the hardest problems in syncing OLTP and OLAP databases: Change data capture (CDC). Want to know more about *why* Materialized Views don't work in these cases? [Read the docs.](https://www.tinybird.co/docs/docs/publish/materialized-views/overview#what-shouldnt-i-use-materialized-views-for) As an example, let's look at the *Aggregation Materialized Views over deduplication DS* scenario. Deduplication in ClickHouse® happens asynchronously, during merges, which you cannot force in Tinybird. That's why you always have to add `FINAL` or the `-Merge` combinator when querying. Plus, Materialized Views only see the block of data that is being processed at the time, so when materializing an aggregation, it will process any new row, no matter if it was a new id or a duplicated id. That's why this pattern fails. <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fguides-lambda-4.png&w=3840&q=75) <-figcaption-> Aggregating MV over deduplication DS will not work as expected ## Solution: Use an alternative architecture with Copy Pipes¶ Tinybird has another kind of Pipe that will help here: [Copy Pipes](https://www.tinybird.co/docs/docs/publish/copy-pipes). At a high level, they're a helpful `INSERT INTO SELECT` , and they can be set to execute following a cron expression. You write your query, and (either on a recurring basis or on demand), the Copy Pipe appends the result in a different table. 
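To make that concrete, here is a minimal sketch of the shape of a Copy Pipe (the Data Source and column names are hypothetical, not part of this guide): it is just a query, a target Data Source, and a schedule.

##### snapshot\_example.pipe (hypothetical sketch)

NODE deduplicate_snapshot
SQL >
    -- keep the latest version of each id at the time the copy runs
    SELECT
        id,
        argMax(value, updated_at) AS value,
        max(updated_at) AS updated_at
    FROM my_cdc_events -- hypothetical Data Source
    GROUP BY id

TYPE copy
TARGET_DATASOURCE my_snapshot -- hypothetical target Data Source
COPY_SCHEDULE 0 * * * * -- hourly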
So, in this example, you can have a clean, deduplicated snapshot of your data, with the correct Sorting Keys, and can use it to materialize: <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fguides-lambda-5.png&w=3840&q=75) <-figcaption-> Copy Pipes to the rescue ## Avoid loss of freshness¶ *"But if you recreate the snapshot every hour/day/whatever... Aren’t you losing freshness?"* Yes - you're right. That's when the lambda architecture comes into play: <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fguides-lambda-6.png&w=3840&q=75) <-figcaption-> Lambda/Kappa Architecture You'll be combining the already-prepared data with the same operations over the fresh data being ingested at that moment. This means you end up with higher performance despite quite complex logic over both fresh and old data. ## Next steps¶ - Check an[ example implementation of a CDC use case](https://www.tinybird.co/docs/docs/guides/querying-data/lambda-example-cdc) with this architecture. - Read more about[ Copy Pipes](https://www.tinybird.co/docs/docs/publish/copy-pipes) . - Read more about[ Materialized Views](https://www.tinybird.co/docs/docs/publish/materialized-views/overview) . Tinybird is not affiliated with, associated with, or sponsored by ClickHouse, Inc. ClickHouse® is a registered trademark of ClickHouse, Inc. --- URL: https://www.tinybird.co/docs/guides/querying-data/lambda-example-cdc Last update: 2024-11-13T15:42:17.000Z Content: --- title: "Change Data Capture project with lambda architecture · Tinybird Docs" theme-color: "#171612" description: "In this guide, you'll learn a useful implementation to consume CDC streams from Kafka and get an updated view of the changes." --- # Lambda CDC processing with Tinybird¶ This guide outlines a practical implementation of CDC processing with Tinybird using a [lambda](https://www.tinybird.co/docs/docs/guides/querying-data/lambda-architecture) approach. It produces an API that returns the freshest deduplicated view of the data by combining a scheduled batch job with new rows since the last batch. This is more complex than a simple deduplication query or Materialized View, recommended as an optimization where the dataset or processing SLAs demand it. ## Prerequisites¶ This is a read-through guide, explaining an example, so you don't need an active Workspace to try it out in. Use the concepts and apply them to your own scenario. To understand the guide, you'll need familiarity with Change Data Capture concepts and Tinybird [deduplication strategies](https://www.tinybird.co/docs/docs/guides/querying-data/deduplication-strategies). ## Data characteristics¶ This implementation focuses on fast filtering and aggregation of slowly-changing dimensions over a long history with high cardinality. The test dataset is a Postgres Debezium CDC to Kafka with an event history of tens of millions of updates into ~5M active records, receiving up to 75k new events per hour. Tinybird provides low-latency, high-concurrency responses with real-time data freshness. In this example, CDC Source is configured as *partial* mode, i.e. only new and changed records are sent as data and deletes as a null. In *full* CDC you would get the old and new data in each change, which can be very helpful in OLAP processing. The test dataset exhibits high cardinality over many years, optimized for ElasticSearch with nested JSON arrays. Updates are sparse over data dimensions and time, leading to specific decisions in this implementation. 
It is also worth noting that the JSON is up to 100kB per document, but for the analysis only a small part is needed. Any given primary key of the upstream data source can be deleted or the Kafka topic compacted or reloaded, resulting in many ‘null’ records to handle. ## Solution features¶ - Lambda processing architecture - Split data + deletes table processing - Null events as deletes by Kafka partition key - Batch + speed layer CDC upsert - Full data history table - Full delete history table - Batch table with good sorting keys - Latest data as reusable API ## Solution technical commentary¶ This implementation doesn't use `AggregatingMergeTree` or `ReplacingMergeTree` due to sorting key limitations. Instead, it uses a `MergeTree` table with subquery deduplication. The data history and delete tables are split to avoid bloat and null processing, improving performance. It focuses on the Kafka event timestamp and partition key for deduplication. Various ClickHouse® functions are used for JSON extraction, avoiding nulls to speed up processing. ## Data pipeline lineage¶ 1. ** Raw Kafka Table ``** 2. ** Initial Materialized Views** - Data History extraction `` - Delete History extraction `` 3. ** Historical Data Sources** - All insert/update events `` - All delete events `` 4. ** Batching Copy Pipe ``** 5. ** Batches Data Source ``** 6. ** Lambda ‘Upsert’ API ``** <-figure-> ![An overview of the Data Flow, with a Kafka Data source, two Materialized Views to keep track of changes and deletes, a Copy Pipe to deduplicate in batches, and a Pipe to combine all Data Sources](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fguides-lambda-cdc-overview.png&w=3840&q=75) <-figcaption-> Lambda CDC overview ## Landing Data Source¶ Where the CDC events from the Kafka topic are consumed: ##### raw\_kafka\_ds.datasource SCHEMA > `__value` String --`__topic` LowCardinality(String), --`__partition` Int16, --`__offset` Int64, --`__timestamp` DateTime, --`__key` String ENGINE "MergeTree" ENGINE_PARTITION_KEY "toYYYYMM(__timestamp)" ENGINE_SORTING_KEY "__offset" KAFKA_CONNECTION_NAME '' KAFKA_TOPIC '' KAFKA_GROUP_ID '' KAFKA_AUTO_OFFSET_RESET 'earliest' KAFKA_STORE_RAW_VALUE 'True' KAFKA_STORE_HEADERS 'True' KAFKA_STORE_BINARY_HEADERS 'True' KAFKA_TARGET_PARTITIONS 'auto' KAFKA_KEY_AVRO_DESERIALIZATION '' Helpful notes and top tips for your own implementation: - The other Kafka metadata fields (commented above) like `__timestamp` etc. are automatically added by Tinybird's Kafka connector. - Always increment the `KAFKA_GROUP_ID` if you reprocess the topic! - The `__value` may be `null` in the case of a `DELETE` for many CDC setups, so do not parse the JSON values in the raw table. - You can get the operation ( `INSERT` , `UPDATE` , `DELETE` ...) from the `KAFKA_STORE_HEADERS` for many CDC sources and read them in the `__headers` field, though we don’t need it for this implementation as `INSERT` and `UPDATE` are equivalent for our purposes, and `DELETE` is always the `null` record. - The sorting key should definitely be `__offset` or `__partition` . CDC Data can often have high density bursts of activity, which results in a lot of changes being written in a short time window. For this reason it is often better to partition and sort `raw_kafka_ds` data by `__partition` and /or `__offset` to avoid the skew of using `__timestamp` . - Remember that you have to pair the `__key` with the `__offset` to get a unique pairing, as each partition has its own offsets. 
This is why `__timestamp` is a good boundary for multi-partition topics. - This implementation does not set a TTL as it only partially processes the `__value` schema for the given use case. If you want to create other tables out of it you'd need the data source. This also allows reprocessing if you decide you need something else out of the raw JSON. - You could optionally run a delete operation on every `__offset` before the `__offset` of the first `__value` that isn’t a `null` in each Partition, which would effectively truncate the table of all old compactions. This can be done in the CLI with a `tb datasource truncate ` command. Remember to[ filter by partitions with no active ingest](https://www.tinybird.co/docs/docs/guides/ingesting-data/replace-and-delete-data) . ## Full data history¶ This section contains all the `INSERT` and `UPDATE` operations. To generate the Materialized View you'd first need a Pipe that will result in the `data_history_mv` Data Source: ##### mat\_data.pipe NODE mv SQL > SELECT toInt64(__key) as k_key, __offset as k_offset, __timestamp as k_timestamp, __partition as k_partition, __value as k_value, FROM raw_kafka_ds WHERE __value != 'null' TYPE materialized DATASOURCE data_history_mv Notes: - This example treats `__key` and `__timestamp` as the primary concern, and then parses out all the various fields the customer wants. - You want this table to have the same extracted columns as the batch table, as the Lambda process UNIONs them. - Use the various ClickHouse functions for JSON extraction, and avoid Nullable fields as it slows processing and bloats tables. Here is the Data Source definition of the MV: ##### data\_history\_mv.datasource SCHEMA > `k_key` Int64, `k_offset` Int64, `k_timestamp` DateTime, `k_partition` Int16, `k_value` String ENGINE "MergeTree" ENGINE_PARTITION_KEY "toYYYYMM(k_timestamp)" ENGINE_SORTING_KEY "k_key, k_timestamp" Helpful notes: - `__timestamp` and `__key` are critical for lambda processing, so while they are not low cardinality they are the permanent filters in later queries. - Because this example mostly cares about quick access to the most recent events, it uses `k_timestamp` for partitioning. - This example keeps the raw JSON of the rest of the record in `k_value` against some field being wanted for later indexing. You can ignore the column in daily processing, and use it for backfill without reprocessing the entire raw topic again if you need to later. You can obviously partially extract from this for stable column indexing. - Additional sorting keys for customer queries are not retained because you need offset and key here, however other approaches could be considered if necessary. - All other columns are based on fields extracted from the kafka `__value` JSON. - If the sorting key columns for customer queries were limited to columns that did not change during a CDC update, then a `ReplacingMergeTree` may work here. However customer updates are often over required columns including date fields making it impractical. ## Deletes history¶ This section contains the `DELETE` operations. As above, you need a Pipe to get the deletes and materialize them: ##### mat\_deletes.datasource NODE mv SQL > SELECT toInt64(__key) as k_key, -- key used as deduplication identity __timestamp as k_timestamp -- ts used for deduplication incrementor FROM raw_kafka_ds WHERE __value = 'null' TYPE materialized DATASOURCE deletes_history_mv Helpful notes: - Nothing fancy - just parses out the null records by tracking the `__key` . 
- Note converting to Int64 instead of String for better performance, as you know the key and offset are auto-incrementing integers from Postgres and Kafka respectively. This may not always be true for other CDC sources. It results in a table with all the deletes: ##### deletes\_history\_mv.datasource SCHEMA > `k_key` Int64, `k_timestamp` DateTime ENGINE "MergeTree" ENGINE_PARTITION_KEY "toYYYYMM(k_timestamp)" ENGINE_SORTING_KEY "k_timestamp, k_key" Notes: - Sorting key order is deliberate. You want to be able to fetch only the `__timestamp` s since the last batch was run for lambda processing, and then you want the `__key` s that are to be deleted from the dataset as a list for delete processing. - You do not need to have a separate deletes table for an efficient implementation, however if you have a lot of deletes (such as in compacted and reloaded Kafka Topics for CDC) then you may find this more efficient at the cost of a little more implementation complexity. A recommended approach is to start with a single history table including both data and deletes and then optimize later if necessary. ## Consolidation batches¶ Using Copy Pipes, you can generate snapshots of the state of the original table at several points in time. This will help you speed up compute later, only having to account for changes that arrived since the last snapshot/batch happened. ##### copy\_batches.pipe NODE get_ts_boundary SQL > WITH ( (SELECT max(k_timestamp) FROM data_history_mv) AS max_history, (SELECT max(k_timestamp) FROM deletes_history_mv) AS max_deletes ) SELECT least(max_history, max_deletes, now()) as batch_ts_ NODE dedupe_and_delete_history SQL > WITH latest_rows AS ( SELECT k_key, max(k_timestamp) AS latest_ts FROM data_history_mv GROUP BY k_key ), ts_boundary AS ( SELECT batch_ts_ AS batch_ts FROM get_ts_boundary ) SELECT f.*, assumeNotNull((SELECT batch_ts FROM ts_boundary)) AS batch_ts FROM data_history_mv f INNER JOIN latest_rows lo ON f.k_key = lo.k_key AND f.k_timestamp = lo.latest_ts WHERE f.k_key NOT IN ( SELECT k_key FROM deletes_history_mv WHERE k_timestamp <= (SELECT batch_ts FROM ts_boundary) ) AND f.k_timestamp <= (SELECT batch_ts FROM ts_boundary) TYPE copy TARGET_DATASOURCE batches_ds COPY_SCHEDULE 0,30 * * * * Notes: - You use the slowest data processing stream as the batch timestamp boundary, in case one of the Kafka partitions is lagging and other typical stream processing challenges. - This example uses a subquery and self-join to deduplicate because testing showed it performing the best over the dataset used. Each dataset will have unique characteristics that may drive a different approach such as `LIMIT 1 BY` etc. - Note the deduplication method works fine in batch, and it's the same one used later in the lambda Pipe. - The schedule should be adjusted to match the customer cadence requirements. - Note that this example uses `<=` in the row selection here, and `>` in the selection later to ensure it doesn't duplicate the boundary row. ##### batches\_ds.datasource SCHEMA > `k_key` Int64, `k_offset` Int64, `k_timestamp` DateTime, `k_partition` Int16, `k_value` String, `batch_ts` DateTime ENGINE "MergeTree" ENGINE_PARTITION_KEY "toDate(batch_ts)" ENGINE_SORTING_KEY "batch_ids, " ENGINE_TTL "batch_ts + toIntervalDay(1)" Notes: - You cannot use a TTL to simply keep the last 3 versions, so you must pick a date and monitor that batches are running as expected (ClickHouse won’t consider the whole table in a TTL query, just that row). 
- This Table forms the bulk of the rows used for the actual query process, so it’s important that the sorting keys are optimized for data results. - The `batch_id` remains at the head of the sorting key so you can quickly select the latest batch for use. - The `batch_id` is a simple timestamp boundary of all rows across all Partitions included in the batch, including all Deletes already applied to the batch. This is important when understanding the logic of the Lambda processing later. - Partition key is by day on the `batch_ts` so you can read as few rows as possible, but all sequentially. - Analysis of the customer sorting keys may yield good optimization information, such as a need for controlling index granularity if they handle a lot of multi-tenant data, for example. ## Lambda Pipe¶ Lastly, you get the latest snapshot plus all the changes since then, and consolidate. This API Endpoint can also be used as a "view" so that other Pipes query it. ##### latest\_values.pipe NODE get_batch_info SQL > SELECT max(batch_ts) AS batch_ts FROM batches_ds NODE new_deletes SQL > select k_key FROM deletes_history_mv WHERE k_timestamp > (SELECT batch_ts from get_batch_info) -- only rows since last batch NODE filter_new_rows SQL > % SELECT {{columns(cols, 'k_key, k_offset, k_timestamp, ')}} FROM data_history_mv WHERE 1 AND k_timestamp > (SELECT batch_ts from get_batch_info) -- only rows since last batch AND k_key not in (select k_key from new_deletes) -- remove newly deleted rows from new rows AND NODE dedup_new_rows_by_subquery SQL > WITH latest_rows AS ( SELECT k_key, max(k_timestamp) AS latest_ts FROM filter_new_rows GROUP BY k_key ) SELECT f.* FROM filter_new_rows f INNER JOIN latest_rows lo ON f.k_key = lo.k_key AND f.k_timestamp = lo.latest_ts NODE get_and_filter_batch SQL > % SELECT {{columns(cols, 'k_key, k_offset, k_timestamp, ')}} FROM batches_ds PREWHERE batch_ts = (SELECT batch_ts from get_batch_info) -- get latest batch WHERE 1 AND k_key not in (select k_key from new_deletes) -- filter by new deletes since last batch AND k_key not in (select k_key from dedup_new_rows_by_subquery) -- omit rows already updated since batch NODE batch_and_latest SQL > SELECT * FROM get_and_filter_batch UNION ALL SELECT * FROM dedup_new_rows_by_subquery Notes: - This is a longer Pipe which is published as an API. - It starts by determining which batch to use, which also gives you the boundary timestamp. It then fetches and processes all new rows since the selected batch, including deletes processing and deduplication. It then backfills this with all other rows from the batch, and UNIONs the results. - It uses the same deduplication strategy as the batch processing Pipe for consistency of results. - Note the use of the `columns` Parameter. This then defaults to returning all columns, but a user can specify a subset to reduce the data fetched and processed. - This API can then be called as a Data Source by other Pipes, which can also use the same parameter names to pass through filters like columns or other customer filters that may be required. ## Conclusion¶ This image explains, in detail, the full overview of this approach: A Kafka Data Source, two Materialized Views to keep track of changes and deletes, a Copy Pipe to deduplicate in batches, and a Pipe to combine all Data Sources. 
<-figure-> ![An overview of the data flow, with a Kafka Data Source, two Materialized Views to keep track of changes and deletes, a Copy Pipe to deduplicate in batches, and a Pipe to combine all Data Sources](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fguides-lambda-cdc-detail.png&w=3840&q=75) <-figcaption-> Lambda CDC overview ## Possible improvements¶ This example deliberately kept the full history, but you could speed up and store less data if the history Materialized Views are `ReplacingMergeTree` , or if you add a TTL long enough to be sure the changes have been incorporated into the batch Data Source. ## Alternatives¶ ### A simpler way of achieving the latest view¶ This example is solving for many peculiarities of the test dataset, like not having a simple key for deduplication, and a large number of delete operations bloating the resulting tables. As a comparison, here's a solution you can use when your CDC case is very simple. It's possibly less performant (since you need to extract fields at query time, filter deletes over the whole data source...) but easier and perfectly functional if volumes are not too big. Just define the Kafka Data Source as a `ReplacingMergeTree` (or create a MV from raw), query with FINAL, and exclude deletes: ##### raw\_kafka\_ds\_rmt.datasource SCHEMA > `__value` String --`__topic` LowCardinality(String), --`__partition` Int16, --`__offset` Int64, --`__timestamp` DateTime, --`__key` String ENGINE "ReplacingMergeTree" ENGINE_PARTITION_KEY "" ENGINE_SORTING_KEY "__key" ENGINE_VER "__offset" --or "__timestamp" An example query to consolidate latest updates and exclude duplicates: ##### Querying raw\_kafka\_ds\_rmt with FINAL SELECT * , JSONExtract(__value, 'field', type) as field --this for every field FROM raw_kafka_ds_rmt FINAL WHERE __value!='null' ## Next steps¶ - Read more about using a[ lambda approach](https://www.tinybird.co/docs/docs/guides/querying-data/lambda-architecture) in your Tinybird architecture. - Understand Tinybird[ deduplication strategies](https://www.tinybird.co/docs/docs/guides/querying-data/deduplication-strategies) . Tinybird is not affiliated with, associated with, or sponsored by ClickHouse, Inc. ClickHouse® is a registered trademark of ClickHouse, Inc. --- URL: https://www.tinybird.co/docs/guides/querying-data/working-with-time Last update: 2024-11-07T09:52:34.000Z Content: --- title: "Working with time · Tinybird Docs" theme-color: "#171612" description: "Learn how to work with time in Tinybird." --- # Working with time¶ ## Overview¶ With real-time data, nothing is more important to understand than the impact of time, including time zones and Daylight Saving Time (DST). This page explains the different functions for working with time in ClickHouse® (and therefore Tinybird) when using, storing, filtering, and querying data. It also explains functions that may behave in ways you don't expect when coming from other databases, and includes example queries and some test data to play with. Read Tinybird's ["10 Commandments for Working With Time" blog post](https://www.tinybird.co/blog-posts/database-timestamps-timezones) to understand best practices and top tips for working with time. ### Resources¶ If you want to follow along, you can find all the sample queries & data in the `timezone_analytics_guide` repo. 
The repo contains several examples that this page walks through: <-figure-> ![Data flow showing different examples](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fworking-with-time-2.png&w=3840&q=75) ## Time-related types in ClickHouse¶ In the companion Workspace, this section is in the Pipe `simple_types_and_transforms`. There are basically two main Types, with several sub-types, when dealing with time in ClickHouse - the first for **dates** , and the second for **specific time instants**. For dates, you can choose one type or another depending on the range of dates you need to store. For time instants, it will depend not only on the range, but also on the precision. The default records seconds, and you can also work with micro or nanoseconds. **Date and time types** | Type | Range of values | Parameters | | --- | --- | --- | | Date | [1970-01-01, 2149-06-06] | | | Date32 | [1900-01-01, 2299-12-31] | | | DateTime | [1970-01-01 00:00:00, 2106-02-07 06:28:15] | Time zone: More about that later. | | DateTime64 | [1900-01-01 00:00:00, 2299-12-31 23:59:59.99999999] | Precision: [0-9]. Usually: 3(milliseconds), 6(microseconds), 9(nanoseconds). Time zone: More about that later. | Note that this standard of appending `-32` or `-64` carries through to most other functions dealing with `Date` and `DateTime` , as you'll soon see. The main [ClickHouse docs](https://clickhouse.com/docs/en/sql-reference/functions/date-time-functions) have an exhaustive listing of the different types and functions available - but it can be tricky to figure out which one you should use, and how you should use it. This Tinybird guide should help make it clearer. ## Transform your data into these types¶ ### Date from a String¶ Various ways of transforming a `String` into a `Date`: SELECT '2023-04-05' AS date, toDate(date), toDate32(date), DATE(date), CAST(date, 'Date')| time | toDate(time) | toDate32(time) | DATE(time) | CAST(time, 'Date') | | --- | --- | --- | --- | --- | | 2023-04-05 | 2023-04-05 | 2023-04-05 | 2023-04-05 | 2023-04-05 | ### Store a date before 1970¶ You can use the `toDate32` function, which has a range of [1900-01-01, 2299-12-31]. If you try to use the normal `toDate` function, you'll get the boundary of the available range - this might trip up users coming from other databases who would expect to get an Error. select toDate('1903-04-15'), toDate32('1903-04-16')| toDate('1903-04-15') | toDate32('1903-04-16') | | --- | --- | | 1970-01-01 | 1903-04-16 | ### Parse other formats into DateTime¶ You can use the `parseDateTimeBestEffortOrNull` function to parse a string into a `DateTime` , and if it fails for some reason, it will return a `NULL` value. This is your best default option, and there are several others which give different outcome behaviors, like `parseDateTimeBestEffort` (gives an error on a bad input), or `parseDateTimeBestEffortOrZero` (returns 0 on bad input). You can always check the main [ClickHouse documentation](https://clickhouse.com/docs/en/sql-reference/functions/type-conversion-functions#type_conversion_functions-parseDateTime) for functions mapping to specific edge cases. Also remember to use the `-32` or `-64` suffixes for the functions that return a `DateTime` , depending on the range of dates you need to store. Tinybird's Data Engineers know these to generally work with typical output strings experienced in the wild, such as those produced by most JavaScript libraries, Python logging, the default formats produced by AWS and GCP, and many others. 
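For instance, a quick sanity check over a handful of common input formats (a sketch; swap in samples from your own data):

##### Checking parseDateTimeBestEffortOrNull against common formats

SELECT
    parseDateTimeBestEffortOrNull('2023-04-05T09:00:00Z') AS iso_8601,
    parseDateTimeBestEffortOrNull('2023-04-05 09:00:00') AS sql_style,
    parseDateTimeBestEffortOrNull('Sat, 18 Aug 2018 07:22:16 GMT') AS rfc_1123,
    parseDateTimeBestEffortOrNull('1680685200') AS unix_seconds,
    parseDateTimeBestEffortOrNull('not a date') AS bad_input -- returns NULL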
But they can't be guaranteed to work, so test your data! And if something doesn't work as expected, contact Tinybird at [support@tinybird.co](https://www.tinybird.co/docs/mailto:support@tinybird.co) or in the [Community Slack](https://www.tinybird.co/docs/docs/community). ### How soon is Now()?¶ In the companion Workspace, this section is in a Pipe called `what_is_now`. The function `now` gives a seconds-accuracy current timestamp. You can use the `now64` function to get precision down to nanoseconds if necessary. They will default to the time zone of the server, which is UTC in Tinybird. You can use the `toDateTime` function to convert to a different time zone, or `toDate` to convert to a date. Be mindful of using `toDate` and ensuring you are picking the calendar day relative to the time zone you are thinking about, otherwise you will end up with the UTC day instead by default. You might want to use the convenience functions `today` and `yesterday` instead. SELECT now(), toDate(now()), toDateTime(now()), toDateTime64(now(),3), now64(), toDateTime64(now64(),3), toUnixTimestamp64Milli(toDateTime64(now64(),3)), today(), yesterday()| now() | toDate(now()) | toDateTime(now()) | toDateTime64(now(),3) | now64() | toDateTime64(now64(),3) | toUnixTimestamp64Milli(toDateTime64(now64(),3)) | today() | yesterday() | | --- | --- | --- | --- | --- | --- | --- | --- | --- | | 2021-03-19 19:15:00 | 2021-03-19 | 2021-03-19 19:15:00 | 2021-03-19 19:15:00.000 | 2021-03-19 19:15:00.478 | 2021-03-19 19:15:00.478 | 1616181300 | 2021-03-19 | 2021-03-18 | ## Filter by date in ClickHouse¶ In the companion Workspace, this section is in a Pipe called `filter_by_date`. To filter by date in ClickHouse, you need to make sure that you are comparing dates of the same type on both sides of the condition. In the example Workspace, the timestamps are stored in a column named `timestamp_utc`: 2023-03-19 19:15:00 2023-03-19 19:17:23 2023-03-20 00:02:59 It is extremely common that you want to filter your data by a given day. It is quite simple, but there are a few gotchas to be aware of. Let's examine them step by step. The first thing you could imagine is that ClickHouse is clever enough to understand what you want to do if we pass a `String` containing a date. As you can see, this query: SELECT timestamp_utc FROM sales_data_raw WHERE timestamp_utc = '2023-03-19' Produces no results. This actually makes sense when you realize you're comparing different types, a `DateTime` and a `String`. What if you force your parameter to be a proper date? You can easily do it: SELECT timestamp_utc FROM sales_data_raw WHERE timestamp_utc = toDate('2023-03-19') Nope. You may realize that you are comparing a `Date` and a `DateTime`. Ok, so what if you convert the `Date` into a `DateTime`? SELECT timestamp_utc FROM sales_data_raw WHERE timestamp_utc = toDateTime('2023-03-19') No results. This is because the `DateTime` produced has a time component of midnight, which is still just a single moment in time, not a range of time - which is exactly what a `Date` is - all the moments between the midnights on a given calendar day. So what you really have to do is **make sure you're comparing the same kind of dates in both sides** . 
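An equivalent approach that avoids wrapping the stored column in a function is to compare against an explicit `DateTime` range covering the whole day; a sketch over the same table:

##### Filtering a full day with a DateTime range

SELECT timestamp_utc
FROM sales_data_raw
WHERE timestamp_utc >= toDateTime('2023-03-19 00:00:00')
  AND timestamp_utc < toDateTime('2023-03-20 00:00:00')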
But it will be instructive to see the results of all the transformations so far: SELECT timestamp_utc, toDate(timestamp_utc), toDate('2023-03-19'), toDateTime('2023-03-19') FROM sales_data_raw WHERE toDate(timestamp_utc) = toDate('2023-03-19') You can see that: | timestamp_utc | toDate(timestamp_utc) | toDate('2023-03-19') | toDateTime('2023-03-19') | | --- | --- | --- | --- | | 2023-03-19 19:15:00 | 2023-03-19 | 2023-03-19 | 2023-03-19 00:00:00 | In order to select the day you want, you need to convert both sides of the condition to `Date` . In this example, you do it in the `WHERE` clause. Remember that, if filtering by date, you must have a date in both sides of the condition. ## Example of a good timestamp schema in ClickHouse¶ Here's a few tips for the Types to use when storing timestamps in ClickHouse: 1. It's a good idea to include the expected time zone, or any modifications like normalization, in the `timestamp` column name. This will help you avoid confusion when you're working with data from different time zones, particularly when joining multiple data sources. 2. Your time zone names are low cardinality strings, so you can use the `LowCardinality(String)` modifier to save space. Same goes for the offset, which is an `Int32` . 3. You can use the `String` type to store the local timestamp in a format that is easy to parse, but isn't going to be eagerly converted by ClickHouse unexpectedly. The length of the string can vary by the precision you need, but it's a good idea to** keep it consistent across all your Data Sources** . There is also the `FixedString` type, which is a fixed length string, but while it's more efficient to process it's also not universally supported by all ClickHouse drivers, so `String` is a better default. Here's an example: timestamp_utc [DateTime('UTC')], timezone [lowCardinalityString], offset [Int32], timestamp_local [String], ## The details of ClickHouse DateTime functions¶ In the companion Workspace, this section is in a Pipe called `NittyGritty_DateTime_Operations`. This uses more of the test data, specifically the recording of metadata about Store Hours in Tokyo, to see what happens with the various functions you would expect to use, and what the outcomes are. ### Things that work as expected¶ SELECT -- these are simple timezone, store_open_time_utc, store_open_time_local, -- This next one will be converted to the ClickHouse server's default time zone parseDateTimeBestEffort(store_open_time_local) as parse_naive, -- This will be in the specified time zone, but it must be a Constant - you can't look it up! parseDateTimeBestEffort(store_open_time_local, 'Asia/Tokyo') as parse_tz_lookup, -- this next one stringify's the DateTime so you can pick it in UTC without TZ conversion parseDateTimeBestEffort(substring(store_open_time_local, 1, 19)) as parse_notz, -- These next ones work, but you must get the offsets right for each DST change! 
store_open_time_utc + store_open_timezone_offset as w_plus, date_add(SECOND, store_open_timezone_offset, store_open_time_utc) as w_date_add, addSeconds(store_open_time_utc, store_open_timezone_offset) as w_add_seconds, --- This gives you nice UTC formatting as you'd expect formatDateTime(store_open_time_utc, '%FT%TZ'), -- Because BestEffort will convert your string to UTC, you're going to get UTC displayed again here formatDateTime(parseDateTimeBestEffort(store_open_time_local), '%FT%T') FROM store_hours_raw where store = 'Tokyo' Limit 1| timezone | store_open_time_utc | store_open_time_local | parse_naive | parse_tz_lookup | w_plus | w_date_add | w_add_seconds | formatDateTime(store_open_time_utc, '%FT%TZ') | formatDateTime(parseDateTimeBestEffort(store_open_time_local), '%FT%T') | | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | | Asia/Tokyo | 2023-03-20 00:00:00 | 2023-03-20T09:00:00+09:00 | 2023-03-20 00:00:00 | 2023-03-20 09:00:00 | 2023-03-20 09:00:00 | 2023-03-20 09:00:00 | 2023-03-20 09:00:00 | 2023-03-20T00:00:00Z | 2023-03-20T00:00:00 | ### Things that might surprise you¶ Here you have a column with a time in UTC, and another column with the local time zone. ClickHouse has an opinion that a `DateTime` column must always be one specific time zone. It also has the opinion that you must always supply the time zone as a `String` , and cannot lookup from another column on a per-row basis. There's also a [bug](https://github.com/ClickHouse/ClickHouse/issues/33382) in older ClickHouse versions where `toTimezone` will silently give the wrong answer if you give it a lookup instead of a `String` , so use `toDateTime` instead. SELECT -- toString(store_open_time_utc, timezone) as w_tostr_lookup, << This causes an error -- toDateTime(store_open_time_utc, timezone), << This also causes an error toString(store_open_time_utc, 'Asia/Tokyo') as w_tostr_const, -- This is correct. toTimezone(store_open_time_utc, timezone) as w_totz_lookup, -- This silently gives the wrong answer due to the bug! toDateTime(store_open_time_utc, 'Asia/Tokyo') as todt_const, -- This is the best method toTimezone(store_open_time_utc, 'Asia/Tokyo') as w_totz_const, -- this works, but toTimezone can fail if misused, so better to use toDateTime -- Because BestEffort will convert your string to UTC, you might think you'll get the local time -- But as the ClickHouse column is UTC, that is what you're going to get formatDateTime(parseDateTimeBestEffort(store_open_time_local), '%FT%T'), -- formatDateTime(store_open_time_utc, '%FT%T%z') -- << Supported in ClickHouse 22.9 toDate32('1903-04-15'), -- This gives the expected date toDate('1903-04-15') -- This silently hits the date boundary at 1970-01-01, so be careful ## Understand ClickHouse time zone handling¶ ClickHouse DateTimes always have a time zone. Here are the key points to note: - A DateTime is stored in ClickHouse as a Unix timestamp that represents the number of seconds in UTC since 1970-01-01. - ClickHouse stores a single time zone associated with a column and uses it to handle transformations during representation/export or other calculations. - ClickHouse enforces the single time zone per column rule in the behavior of native functions it provides for manipulating DateTime data. - ClickHouse assumes that any time zone offset or DST change is a multiple of 15 minutes. This is true for most modern time zones, but not for various historical corner cases. Let's explore some areas to be aware of. 
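A small illustration of the first point above: attaching a different time zone changes only how a value is rendered, not the underlying Unix timestamp. A minimal sketch, with no test data required:

##### Same instant, different rendering

SELECT
    toDateTime('2023-03-20 00:00:00', 'UTC') AS utc_value,
    toDateTime(utc_value, 'Asia/Tokyo') AS tokyo_value, -- rendered as 2023-03-20 09:00:00
    toUnixTimestamp(utc_value) AS unix_utc,
    toUnixTimestamp(tokyo_value) AS unix_tokyo -- identical to unix_utc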
### How a time zone is selected for the Column¶ Here are some general rules to observe when selecting a time zone for a column: - If no time zone is specified, the column will represent the data in the ClickHouse server's time zone, which is UTC in Tinybird. - If a time zone is specified as a string, the column will represent the data as that time zone. - If multiple time zones are in a query that produces a column without already creating the column with a time zone, the first time zone in the query wins (e.g. a CASE statement to pick different time zones). - If the time zone is not represented as a constant (e.g. by lookup to another column or table), you should get an error message. ### DateTime Operations without time zone conversion¶ ClickHouse provides some native operations that work with DateTime without handling time zone conversion. Mostly these are for adding or subtracting some unit of time. These operations behave exactly as you would expect. Remember: You, as the programmer, are responsible for correctly selecting the amount to add or subtract for a given timestamp. Many incorrect datasets are produced here by incorrect chaining of time zone translations or handling DST incorrectly. ### DateTime to String with a time zone¶ In older versions of ClickHouse, you cannot convert a DateTime to an ISO8601 String with time zone information. However, the `%z` operator was introduced in version 22.9, allowing you to use `formatDateTime(timestamp_utc, '%FT%T%z')`. ## How to Normalize your Data in ClickHouse¶ In the companion Workspace, this section is in a Pipe called `sales_normalized`. The process of converting timestamps on the data so that they are all in the same time zone is called "time zone normalization" or "time zone conversion". This is a common practice in data analytics to ensure that data from multiple sources, which may be in different time zones, can be accurately compared and analyzed. By converting all timestamps to a common time zone, it is possible to compare data points from different sources and derive meaningful insights from the data. Consider a sample dataset containing sales data from multiple stores which are each in different time zones. Each store is open from 0900 to 1700 local time, and you're going to shift them all to UTC: <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fworking-with-time-1.png&w=3840&q=75) The simplest way to do this in ClickHouse is to chain standard conversion functions - first, convert from your canonical stored UTC timestamp to the store-local time zone, then extract a `String` representation of that local time, then convert that back into a `DateTime` object. ClickHouse will by default present this in the server time zone, so we have to know that this dataset has been normalized and name it appropriately. In this example you are using a CASE statement to handle each store, and the result is presented as a Materialized View in Tinybird. This has the benefit of pre-calculating the time zone shifts, you could also pre-aggregate to some time interval like 5 or 15 mins, resulting in a highly efficient dataset for further analysis. 
In your schema you have a `String` identifying the `store` , an `Int32` for the `sale_amount` , and a `DateTime` in UTC for `timestamp_utc` of the transaction: SELECT store, sale_amount, CASE WHEN store = 'New York' THEN toDateTime(toString(toTimezone(timestamp_utc, 'America/New_York'))) WHEN store = 'Tokyo' THEN toDateTime(toString(toTimezone(timestamp_utc, 'Asia/Tokyo'))) WHEN store = 'London' THEN toDateTime(toString(toTimezone(timestamp_utc,'Europe/London'))) WHEN store = 'Madrid' THEN toDateTime(toString(toTimezone(timestamp_utc, 'Europe/Madrid'))) WHEN store = 'Auckland' THEN toDateTime(toString(toTimezone(timestamp_utc, 'Pacific/Auckland'))) WHEN store = 'Chatham' THEN toDateTime(toString(toTimezone(timestamp_utc, 'Pacific/Chatham'))) else toDateTime('1970-01-01 00:00:00') END AS timestamp_normalized FROM sales_data_raw Note that the time zones used are all input as Strings, as ClickHouse requires this. We use an else statement so that ClickHouse doesn't mark the `timestamp_normalized` column as Nullable, which drastically impacts performance. ## Good time zone test data¶ Our users do *all sorts* of things with their data. Issues with time-based aggregations, particularly when time zone conversions are involved, are one of the most common gotchas. The Tinybird Data Engineers amalgamated the these into a test data set, which is comprised of: 1. A facts table listing generated transactions for a set of retail stores across the world. 2. Some additional columns to aid in checking processing correctness. 3. A dimension table giving details about the store hours and time zones, again with additional information for correctness checking. In the companion Workspace, these are the `sales_data_raw` and `store_hours_raw` tables respectively. The dataset is generated by [this Notebook](https://github.com/tinybirdco/timezone_analytics_guide/blob/main/sample_data_generator.ipynb) in the repo, and [pregenerated fixtures](https://github.com/tinybirdco/timezone_analytics_guide/tree/main/tinybird/datasources/fixtures) from it are also in the repo. ### The tricky use case¶ There is also one extra store in what is deliberately the most complex case. Let's say you have a store in the Chatham Islands (New Zealand), which typically has a +1345hr time zone. This store records the start of the new business day at 0900 each day instead of 0000 (midnight) due to some 'local regulations'. Let's also say that this store remains open 24hrs a day, and you are recording sales through a DST change. The Chatham Islands DST changes back one hour at 0345 local time on April 2nd, so the April 2nd calendar day will have 25hrs elapsed, however due to the store business day changeover at 0900, this will actually occur during the April 1st store business day (open at 0900 Apr1, close at 0859 Apr2). Confusing? Yes, exactly. In order to more easily observe this behavior, our test data records a fixed price transaction at a fixed interval of every 144 seconds throughout the business day, so your tests can have patterns to observe. Why 144 seconds is a useful number here is left as an exercise for the reader. ### Why this is a good test scenario¶ - When viewed in UTC, the DST change is at 2015hrs on April 1st, and the day has 25hrs elapsed, which may confuse an inexperienced developer. - You also have a requirement that the business day windowing does not map to the calendar day (the 'local regulations' above), which mimics a typical unusual requirement in business analysis. 
- You can also see that the offset changes from +1345 to +1245 between Apr1 and Apr2 - this is what trips up naive addSeconds time zone conversions twice a year, as they would only be correct for part of the day if not carefully bounded. - In addition to this, the Chatham Islands time zone has a delightful offset in seconds which looks a lot like a typo when it changes from 49500 to 45900. - It is also a place where almost nobody has a store as the local population is only 800 people, so great for something that can be filtered from real data. This is what makes it an excellent time zone for test data. ### Test data schema¶ Let's take a look at the Schema, and why the various fields are useful. Note that we have used simple Types here to make it easier to understand, but you could also use the FixedString and other more complex types we described earlier in your real use cases. **Store Hours:** `store` String `json:$.store` , `store_close_time_local` String `json:$.store_close_time_local` , `store_close_time_utc` DateTime `json:$.store_close_time_utc` , `store_close_timezone_offset` Int32 `json:$.store_close_timezone_offset` , `store_date` Date `json:$.store_date` , `store_open_seconds_elapsed` Int32 `json:$.store_open_seconds_elapsed` , `store_open_time_local` String `json:$.store_open_time_local` , `store_open_time_utc` DateTime `json:$.store_open_time_utc` , `store_open_timezone_offset` Int32 `json:$.store_open_timezone_offset` , `timezone` String `json:$.timezone` , **Sales Data:** `store` String `json:$.store` , `sale_amount` Int32 `json:$.sale_amount` , `timestamp_local` String `json:$.timestamp_local` , `timestamp_utc` DateTime `json:$.timestamp_utc` , `timestamp_offset` Int32 `json:$.timestamp_offset` ### Schema notes¶ The tables are joined on the `store` column, in this case a `String` for readability but likely to be a more efficient `LowCardinalityString` or an `Int32` in a real use case. Local times are stored as Strings so that the underlying database is never tempted to change them to some other time representation. Having a static value to check your computations against is very useful in test data. In the Sales Data, note that the `String` of the local timestamp in the facts table is kept. In a large dataset this would bloat the table and probably not be necessary. Strictly speaking, the offset is also unnecessary as you should be able to recalculate what it was given the timestamp in UTC and the exact time zone of the event producing service. Practically speaking however, if you are going to need this information a lot for your analysis, you are simply trading off later compute time against storage. Note that a lot of metadata about the store business day in the Store Hours table is also kept - this helps to ensure that not only the analytic calculations are correct, but that the generated data is correct (see "Test data validation" below). This is a surprisingly common issue where test data is produced - it looks good to the known edge cases, but doesn't end up correct to the unknown edge cases. It's in exactly these scenarios that you want to know more about how it was produced, so you can fix it. ### Test data validation¶ In the [Data Generator Notebook](https://github.com/tinybirdco/timezone_analytics_guide/blob/main/sample_data_generator.ipynb) , you can also inspect the generated data to ensure that it conforms to the changes expected over the difficult time window of 1st April to 3rd April in the Chatham Islands. 
Review the following output and consider what you've read so far about the changes over these few days, and then take a look at the timestamps and calculations to see how each plays out: Day: 2023-04-01 opening time UTC: 2023-03-31T19:15:00+00:00 opening time local: 2023-04-01T09:00:00+13:45 timezone offset at store open: 49500 total hours elapsed during store business day: 25 total count of sales during business day: 625 and sales amount in cents: 31250 total hours elapsed during calendar day: 24 total count of sales during calendar day: 600 and sales amount in cents: 30000 closing time UTC: 2023-04-01T20:14:59+00:00 closing time local: 2023-04-02T08:59:59+12:45 timezone offset at store close: 45900 Day: 2023-04-02 opening time UTC: 2023-04-01T20:15:00+00:00 opening time local: 2023-04-02T09:00:00+12:45 timezone offset at store open: 45900 total hours elapsed during store business day: 24 total count of sales during business day: 600 and sales amount in cents: 30000 total hours elapsed during calendar day: 25 total count of sales during calendar day: 625 and sales amount in cents: 31250 closing time UTC: 2023-04-02T20:14:59+00:00 closing time local: 2023-04-03T08:59:59+12:45 timezone offset at store close: 45900 Day: 2023-04-03 opening time UTC: 2023-04-02T20:15:00+00:00 opening time local: 2023-04-03T09:00:00+12:45 timezone offset at store open: 45900 total hours elapsed during store business day: 24 total count of sales during business day: 600 and sales amount in cents: 30000 total hours elapsed during calendar day: 24 total count of sales during calendar day: 600 and sales amount in cents: 30000 closing time UTC: 2023-04-03T20:14:59+00:00 closing time local: 2023-04-04T08:59:59+12:45 timezone offset at store close: 45900 This test data generator has been through several iterations to capture the specifics of various scenarios that Tinybird customers have raised; hopefully you will also find it helpful. ## Query correctly with time zones and DST¶ Now that you've have looked at the test data, let's work through understanding some examples of the correct way to query it, and where it can go wrong. In the companion Workspace, this section is in the Pipe `Aggregate_Querying`. ### Naively querying by UTC¶ If you naively query by UTC here, you get an answer that looks correct - sales start at 9am and stop at 5pm when London is on winter time, which matches UTC. SELECT store, toStartOfFifteenMinutes(timestamp_utc) AS period_utc, sum(sale_amount) AS sale_amount, count() as sale_count FROM sales_data_raw where store = 'London' AND toDate(period_utc) = toDate('2023-03-24') GROUP BY store, period_utc ORDER BY period_utc ASC| store | period_utc | sale_amount | sale_count | | --- | --- | --- | --- | | London | 2023-03-24 09:00:00 | 88418 | 7 | | London | 2023-03-24 09:15:00 | 29159 | 3 | | London | 2023-03-24 09:30:00 | 35509 | 3 | The essential mistake here is blindly converting the UTC timestamp in `period_utc` to a date, however as the time zone aligns with UTC, it has no impact on these results. ### Naively querying a split time zone¶ However if we naively query Auckland applying a calendar day to the UTC periods, we get two split blocks at the start and end of day. This is because you're actually getting the end of the day you want, and the start of the next day, because of the ~+13 UTC offset. 
select * from sales_15min_utc_mv where store = 'Auckland' AND toDate(period_utc) = toDate('2023-03-24') ORDER BY period_utc ASC| store | period_utc | sale_amount | sale_count | | --- | --- | --- | --- | | Auckland | 2023-03-24 00:00:00 | 71538 | 6 | | Auckland | 2023-03-24 00:15:00 | 51795 | 5 | | Auckland | 2023-03-24 00:30:00 | 67106 | 7 | The same mistake is made as above, with the blind conversion of the UTC timestamp to a date, however in this case it's more obvious that it's wrong as very few time zones map 0900 local exactly to midnight. Try the same query again with the Tokyo store, and see how the mistake is easily hidden. ### Aggregating by timestamp over a period¶ A convenience of modern time zone and DST definitions is that they can all be expressed in 15 minute increments (in fact, underlying ClickHouse assumes this is true to drive processing efficiency), so **if you aren't sure what aggregation period is best for you, start with 15mins and see how your data set grows** , as you can query any larger period as groups of 15min periods. To remind you again of the test data: You have a table containing sales data for a set of stores. Your data has columns for a `store` , a `timestamp` in UTC, and a `sale_amount` . In Tinybird, you can quickly and efficiently pre-calculate the 15min periods as a Materialized View with the following query: SELECT store, toStartOfFifteenMinutes(timestamp_utc) AS period_utc, sumSimpleState(sale_amount) AS sale_amount, countState() AS sale_count FROM sales_data_raw GROUP BY store, period_utc ORDER BY period_utc ASC Using this, you can then produce any larger period for a given window of that period. See the next example for how to break it into local days. ### Querying by day and time zone¶ In the example Workspace, see the [sales_15min_utc_mv Materialized View](https://github.com/tinybirdco/timezone_analytics_guide/blob/main/tinybird/datasources/sales_15min_utc_mv.datasource). Building on the previous example, this query produces a correct aggregate by the local time zone day for a given `store`, `timezone`, `start_day` and `end_day` (inclusive). We also feature use of Tinybird parameters, such as you would use in an API Endpoint. Note that you're querying the tricky Chatham store here, and over the days with the DST change, so you can see the pattern of transactions emerge. This example uses Tinybird parameters for the likely user-submitted values of `timezone`, `start_day` and `end_day` , and you are selecting the data from the `sales_15min_utc_mv` Materialized View described above. % Select store, toDate(toTimezone(period_utc, {{String(timezone, 'Pacific/Chatham')}})) as period_local, sum(sale_amount) as sale_amount, countMerge(sale_count) as sale_count from sales_15min_utc_mv where store = {{String(store, 'Chatham')}} and period_local >= toDate({{String(start_day, '20230331')}}) and period_local <= toDate({{String(end_day, '20230403')}}) group by store, period_local order by period_local asc| store | period_local | sale_amount | sale_count | | --- | --- | --- | --- | | Chatham | 2023-03-31 | 30000 | 600 | | Chatham | 2023-04-01 | 30000 | 600 | | Chatham | 2023-04-02 | 31250 | 625 | | Chatham | 2023-04-03 | 30000 | 600 | Note that the `DateTime` has been converted to the target time zone **before** reducing to the local calendar day using `toDate` , and that the user input is converted to the same `Date` type for comparison, and the column has been renamed to reflect this change. 
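To make the ordering explicit, here is a minimal side-by-side sketch (reusing `sales_15min_utc_mv` from above and dropping the parameters); only the second query buckets rows by the store's local day:

-- Naive: toDate() on the raw UTC timestamp groups by the UTC calendar day
SELECT store, toDate(period_utc) AS period_day, sum(sale_amount) AS sale_amount
FROM sales_15min_utc_mv
WHERE store = 'Chatham'
GROUP BY store, period_day
ORDER BY period_day ASC

-- Correct: convert to the local time zone first, then reduce to a Date
SELECT store, toDate(toTimezone(period_utc, 'Pacific/Chatham')) AS period_local, sum(sale_amount) AS sale_amount
FROM sales_15min_utc_mv
WHERE store = 'Chatham'
GROUP BY store, period_local
ORDER BY period_local ASC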
Also note that the time zone parameter is a `String` , which is required by ClickHouse. It's up to you to ensure you supply the right time zone string and Store filters here. Because the Data Source in this case is a Materialized View, you should also be careful to use the correct `CountMerge` function to get the final results of the incremental field. ### Validating the results¶ You can validate the results by comparing them to the same query over the raw data, and see that the results are identical. select store, toDate(substring(timestamp_local, 1, 10)) as period_local, sum(sale_amount) as sale_amount, count() as sale_count from sales_data_raw where store = 'Chatham' and toDate(substring(timestamp_local, 1, 10)) >= toDate('20230331') and toDate(substring(timestamp_local, 1, 10)) <= toDate('20230403') group by store, period_local order by period_local asc| store | period_local | sale_amount | sale_count | | --- | --- | --- | --- | | Chatham | 2023-03-31 | 30000 | 600 | | Chatham | 2023-04-01 | 30000 | 600 | | Chatham | 2023-04-02 | 31250 | 625 | | Chatham | 2023-04-03 | 30000 | 600 | ### Querying with a lookup table¶ In our test data, the store day does not match the local calendar day - if you recall it was specified that it starts at 0900 local time. Fortunately, we can use a lookup table to map the store day to our UTC timestamps, and then use that to query the data. This is often a good idea if you have a lot of data, as it can be more efficient than using a function to calculate the mapping each time. select store, store_date, sum(sale_amount), countMerge(sale_count) as sale_count from sales_15min_utc_mv join store_hours_raw using store where period_utc >= store_hours_raw.store_open_time_utc and period_utc < store_hours_raw.store_close_time_utc and store_date >= toDate('2023-03-31') and store_date <= toDate('2023-04-03') and store = 'Chatham' group by store, store_date order by store_date asc| store | period_local | sale_amount | sale_count | | --- | --- | --- | --- | | Chatham | 2023-03-31 | 30000 | 600 | | Chatham | 2023-04-01 | 31250 | 625 | | Chatham | 2023-04-02 | 30000 | 600 | | Chatham | 2023-04-03 | 30000 | 600 | Note that the extra hour of business moves to the 1st of April, as that store day doesn't close until 0859 on the 2nd of April, which is after the DST change. ## Use Materialized Views to pre-calculate the data¶ Several of these examples have pointed to a [Materialized View](https://www.tinybird.co/docs/docs/publish/materialized-views/overview) , which is a pre-calculated table that is updated incrementally as new data arrives. These are also prepared for you in the companion Workspace, specifically for the UTC 15min rollup, and the normalized 15min rollup. ## Next steps¶ - Read Tinybird's[ "10 Commandments for Working With Time" blog post](https://www.tinybird.co/blog-posts/database-timestamps-timezones) to understand best practices and top tips for working with time. - Understand[ Materialized Views](https://www.tinybird.co/docs/docs/publish/materialized-views/overview) . - Learn how to dynamically aggregate time series data by[ different time intervals (rollups)](https://www.tinybird.co/docs/docs/guides/querying-data/dynamic-aggregation) . - Got a tricky use case that you want help with? Connect with us in the[ Tinybird Slack community](https://www.tinybird.co/docs/docs/community) . Tinybird is not affiliated with, associated with, or sponsored by ClickHouse, Inc. ClickHouse® is a registered trademark of ClickHouse, Inc. 
--- URL: https://www.tinybird.co/docs/guides/tutorials/analytics-with-confluent Last update: 2024-10-28T11:06:14.000Z Content: --- title: "Build user-facing analytics apps with Confluent · Tinybird Docs" theme-color: "#171612" description: "Learn how to build a user-facing web analytics application with Confluent and Tinybird." --- # Build a user-facing web analytics application with Confluent and Tinybird¶ In this guide you'll learn how to take data from Kafka and build a user-facing web analytics dashboard using Confluent and Tinybird. [GitHub Repository](https://github.com/tinybirdco/demo_confluent_charts/tree/main) <-figure-> ![Tinybird Charts showing e-commerce events](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Ftutorial-confluent-chart-1.png&w=3840&q=75) In this tutorial, you will learn how to: 1. Connect Tinybird to a Kafka topic. 2. Build and publish Tinybird API Endpoints using SQL. 3. Create 2 Charts without having to code from scratch. ## Prerequisites¶ To complete this tutorial, you'll need: 1. A[ free Tinybird account](https://www.tinybird.co/signup) 2. An empty Tinybird Workspace 3. A Confluent account 4. Node.js >=20.11 This tutorial includes a [Next.js](https://nextjs.org/) app for frontend visualization, but you don't need working familiarity with TypeScript - just copy & paste the code snippets. ## 1. Setup¶ Clone the `demo_confluent_charts` repo. ## 2. Create your data¶ ### Option 1: Use your own existing data¶ In Confluent, create a Kafka topic with simulated e-commerce events data. Check [this file](https://github.com/tinybirdco/demo_confluent_charts/blob/main/tinybird/datasources/ecomm_events.datasource) for the schema outline to follow. ### Option 2: Mock the data¶ Use Tinybird's [Mockingbird](https://mockingbird.tinybird.co/docs) , an open source mock data stream generator, to stream mock web events instead. In the repo, navigate to `/datagen` and run `npm i` to install the dependencies. Create an `.env` and replace the default Confluent variables: cp .env.example .env Run the mock generator script: node mockConfluent.js ## 3. Connect Confluent <> Tinybird¶ In your Tinybird Workspace, create a new [Data Source](https://www.tinybird.co/docs/docs/concepts/data-sources) using the native [Confluent connector](https://www.tinybird.co/docs/docs/ingest/confluent) . Paste in the bootstrap server, rename the connection to `tb_confluent` , then paste in your API key and secret. Select "Next". Search for and select your topic, and select "Next". Ingest from the earliest offset, then under "Advanced settings" > "Sorting key" select `timestamp`. Rename the Data Source to `ecomm_events` and select "Create". Your Data Source is now ready, and you've connected Confluent to Tinybird! You now have something that's like a database table and a Kafka consumer ***combined*** . Neat. ## 4. Transform your data¶ ### Query your data stream¶ Your data should now be streaming in, so let's do something with it. In Tinybird, you can transform data using straightforward SQL in chained Nodes that form a Pipe. Create a new [Pipe](https://www.tinybird.co/docs/docs/concepts/pipes) and rename it `sales_trend` . In the first Node space, paste the following SQL: SELECT timestamp, sales FROM ecomm_events WHERE timestamp >= now() - interval 7 day This gets the timestamp and sales from just the last 7 days. Run the query and rename the Node `filter_data`. 
In the second Node space, paste the following: SELECT toDate(timestamp) AS ts, sum(sales) AS total_sales FROM filter_data GROUP BY ts ORDER BY ts This casts the timestamp to a date as `ts` , and sums up the sales - meaning you can get a trend of sales by day. Run the query and rename the Node `endpoint`. ### Publish your transformed data¶ In the top right of the screen, select "Create API Endpoint" and select the `endpoint` Node. Congratulations! It's published and ready to be consumed. ## 5. Create a Tinybird Chart¶ In the top right of the screen, select "Create Chart". Rename the chart "Sales Trend" and select an Area Chart. Under the "Data" tab, select `ts` as the index and `total_sales` as the category. You should see a Chart magically appear! In the top right of the screen, select "Save". ## 6. Run an app locally¶ View the component code for the Chart by selecting the code symbol ( `<>` ) above it. Copy this code and paste it into a new file in the `components` folder called `SalesTrend.tsx`. In `page.tsx` , replace `
Chart 1
` with your new Chart `` . Save and view in the browser with `npm run dev` . You should see your Chart! ### Create a second Pipe --> Chart¶ Create a second Pipe in Tinybird called `utm_sales`: SELECT utm_source, sum(sales) AS total_sales FROM ecomm_events WHERE timestamp >= now() - interval 7 day GROUP BY utm_source ORDER BY total_sales DESC This gets sales by utm over the last 7 days. Run the query and rename the Node `endpoint` .... Then, you guessed it! Publish it as an Endpoint, create a Chart, and get the code. This time, create a donut Chart called "UTM Sales" with `utm_source` as the index and `total_sales` as the category. Check the "Legend" box and play around with the colors to create clear differentiators. Create a new component file called `UTMSales.tsx` and import in `page.tsx` replacing Chart 2. You did it! <-figure-> ![Tinybird Charts showing e-commerce events](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Ftutorial-confluent-chart-1.png&w=3840&q=75) ## Next steps¶ - Read more about[ Tinybird Charts](https://www.tinybird.co/docs/docs/publish/charts) . - Use Charts internally to[ monitor latency](https://www.tinybird.co/docs/docs/monitoring/latency#how-to-visualize-latency) in your own Workspace. --- URL: https://www.tinybird.co/docs/guides/tutorials/bigquery-dashboard Last update: 2024-11-07T09:52:34.000Z Content: --- title: "Build user-facing dashboard with BigQuery · Tinybird Docs" theme-color: "#171612" description: "Learn how to build a user-facing dashboard using Tinybird and BigQuery." --- # Build a user-facing analytics dashboard with BigQuery and Tinybird¶ In this guide you'll learn how to take data from BigQuery and build a user-facing analytics dashboard using Tinybird, Next.js, and Tremor components. You'll end up with a dashboard and enough familiarity with Tremor to adjust the frontend & data visualization for your own projects in the future. Google BigQuery is a serverless data warehouse, offering powerful online analytical processing (OLAP) computations over large data sets with a familiar SQL interface. Since its launch in 2010, it’s been widely adopted by Google Cloud users to handle long-running analytics queries to support strategic decision-making through business intelligence (BI) visualizations. Sometimes, however, you want to extend the functionality of your BigQuery data beyond business intelligence: For instance, real-time data visualizations that can be integrated into user-facing applications. As outlined in [the Tinybird blog post on BigQuery dashboard options](https://www.tinybird.co/blog-posts/bigquery-real-time-dashboard) , you can build Looker Studio dashboards over BigQuery data, but they'll struggle to support user-facing applications that require high concurrency, fresh data, and low-latency API responses. Tinybird is the smart option for fast and real-time. Let's get building! [GitHub Repository](https://github.com/tinybirdco/bigquery-dashboard) <-figure-> ![Analytics dashboard build with BigQuery data, Tinybird Endpoints, and Tremor components in a Next.js app](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Ftutorial-bigquery-dashboard.png&w=3840&q=75) Imagine you're a ***huge*** baseball fan. You want to build a real-time dashboard that aggregates up-to-the-moment accurate baseball stats from teams around the world, and gives you the scoop on all your favorite players. This tutorial explains how to build a really nice-looking prototype version. In this tutorial, you'll learn how to: 1. 
Ingest your existing BigQuery data into Tinybird. 2. Process and transform that data with accessible SQL. 3. Publish the transformations as real-time APIs. 4. Use Tremor components in a Next.js app to build a clean, responsive, real-time dashboard that consumes those API Endpoints. ## Prerequisites¶ To complete this tutorial, you'll need: 1. A[ free Tinybird account](https://www.tinybird.co/signup) 2. A BigQuery account 3. Node.js >=20.11 This tutorial includes a [Next.js](https://nextjs.org/) app and [Tremor](https://www.tremor.so/components) components for frontend visualization, but you don't need working familiarity with TypeScript or JavaScript - just copy & paste the code snippets. ## 1. Create a Tinybird Workspace¶ Navigate to the Tinybird web UI ( [app.tinybird.co](https://app.tinybird.co/) ) and create an empty Tinybird Workspace (no starter kit) called `bigquery_dashboard` in your preferred region. ## 2. Connect your BigQuery dataset to Tinybird¶ To get your BigQuery data into Tinybird, you’ll use the [Tinybird BigQuery Connector](https://www.tinybird.co/docs/docs/ingest/bigquery). Download [this sample dataset](https://github.com/tinybirdco/bigquery-dashboard/blob/main/baseball_stats.csv) that contains 20,000 rows of fake baseball stats. Upload it to your BigQuery project as a new CSV dataset. Next, follow the [steps in the documentation](https://www.tinybird.co/docs/docs/ingest/bigquery#load-a-bigquery-table-in-the-ui) to authorize Tinybird to view your BigQuery tables, select the table you want to sync, and set a sync schedule. Call the Data Source `baseball_game_stats`. Tinybird will copy the contents of your BigQuery table into a Tinybird Data Source and ensure the Data Source stays in sync with your BigQuery table. Tinybird can sync BigQuery tables as often as every 5 minutes. If you need fresher data in your real-time dashboards, consider sending data to Tinybird via alternative sources such as [Apache Kafka](https://www.tinybird.co/docs/docs/ingest/kafka), [Confluent Cloud](https://www.tinybird.co/docs/docs/ingest/confluent), [Google Pub/Sub](https://www.tinybird.co/docs/docs/guides/ingesting-data/ingest-from-google-pubsub) , or Tinybird’s native [HTTP streaming endpoint](https://www.tinybird.co/docs/docs/guides/ingesting-data/ingest-from-the-events-api). ## 3. Create some Pipes¶ In Tinybird, a [Pipe](https://www.tinybird.co/docs/docs/concepts/pipes) is a transformation definition comprised of a series of SQL statements. You can build metrics through a series of short, composable [Nodes](https://www.tinybird.co/docs/docs/concepts/pipes#nodes) of SQL. Think of Pipes as a way to build SQL queries without always needing to write common table expressions or subqueries, as these can be split out into reusable, independent Nodes. For example, here's a simple single-Node Pipe definition that calculates the season batting average for each player: ##### player\_batting\_percentages.pipe SELECT player_name AS "Player Name", sum(stat_hits)/sum(stat_at_bats) AS "Batting Percentage" FROM baseball_game_stats GROUP BY "Player Name" ORDER BY "Batting Percentage" DESC Create your first Pipe from your newly-created BigQuery Data Source by selecting “Create Pipe” in the top right corner of the Tinybird UI. Paste in the SQL above and run the query. Rename the Pipe `player_batting_percentages`. Naming your Pipe something descriptive is important, as the Pipe name will be used as the URL slug for your API Endpoint later on. ## 4. 
Extend Pipes with Query Parameters¶ Every good dashboard is interactive. You can make your Tinybird queries interactive using Tinybird’s templating language to [generate query parameters](https://www.tinybird.co/docs/docs/query/query-parameters) . In Tinybird, you add query parameters using `{{(,}}` , defining the data type of the parameter, its name, and an optional default value. For example, you can extend the SQL query in the previous step to dynamically change the number of results returned from the Pipe, by using a `limit` parameter and a default value of 10: ##### player\_batting\_percentages.pipe plus query parameters SELECT player_name AS "Player Name", sum(stat_hits)/sum(stat_at_bats) AS "Batting Percentage" FROM baseball_game_stats GROUP BY "Player Name" ORDER BY "Batting Percentage" DESC LIMIT {{UInt16(limit, 10, description="The number of results to display")}} Replace the SQL in your Pipe with this code snippet. Run the query and rename the Node `endpoint`. The `%` character at the start of a Tinybird SQL query shows there's a [dynamic query parameter](https://www.tinybird.co/docs/docs/query/query-parameters#define-dynamic-parameters) coming up. ## 5. Publish your Pipes as APIs¶ The magic of Tinybird is that you can instantly publish your Pipes as fully-documented, scalable REST APIs instantly. From the Pipe definition in the Tinybird UI, select “Create API Endpoint” in the top right corner, select the `endpoint` Node. Congratulations! You just ingested BigQuery data, transformed it, and published it as a Tinybird API Endpoint! ### Create additional Pipes¶ Create these additional 5 Pipes (they can also be found in the [project repository](https://github.com/tinybirdco/bigquery-dashboard/tree/main/data-project/pipes) ). Rename them as they are titled in each snippet, and call each Node `endpoint` . Read through the SQL to get a sense of what each query does, then run and publish each one as its own API Endpoint: ##### batting\_percentage\_over\_time % SELECT game_date AS "Game Date", sum(stat_hits)/sum(stat_at_bats) AS "Batting Percentage" FROM baseball_game_stats WHERE player_team = {{String(team_name, 'BOS', required=True)}} GROUP BY "Game Date" ORDER BY "Game Date" ASC ##### most\_hits\_by\_type % SELECT player_name AS name, sum({{ column(hit_type, 'stat_hits') }}) AS value FROM baseball_game_stats GROUP BY name ORDER BY value DESC LIMIT 7 ##### opponent\_batting\_percentages % SELECT game_opponent AS "Team", sum(stat_hits) / sum(stat_at_bats) AS "Opponent Batting Percentage" FROM baseball_game_stats GROUP BY "Team" ORDER BY "Opponent Batting Percentage" ASC LIMIT {{ UInt16(limit, 10) }} ##### player\_batting\_percentages % SELECT player_name AS "Player Name", sum(stat_hits)/sum(stat_at_bats) AS "Batting Percentage" FROM baseball_game_stats GROUP BY "Player Name" ORDER BY "Batting Percentage" DESC LIMIT {{UInt16(limit, 10)}} ##### team\_batting\_percentages % SELECT player_team AS "Team", sum(stat_hits) / sum(stat_at_bats) AS "Batting Percentage" FROM baseball_game_stats GROUP BY "Team" ORDER BY "Batting Percentage" DESC LIMIT {{ UInt16(limit, 10) }} 1 Data Source, 6 Pipes: Perfect. Onto the next step. ## 6. Create a Next.js app¶ This tutorial uses Next.js, but you can visualize Tinybird APIs just about anywhere, for example with an app-building tool like [Retool](https://www.tinybird.co/blog-posts/service-data-sources-and-retool) or a monitoring platform like [Grafana](https://www.tinybird.co/blog-posts/tinybird-grafana-plugin-launch). 
In your terminal, create a project folder and inside it create your Next.js app, using all the default options: ##### Create a Next app mkdir bigquery-tinybird-dashboard cd bigquery-tinybird-dashboard npx create-next-app Tinybird APIs are accessible via [Tokens](https://www.tinybird.co/docs/docs/concepts/auth-tokens) . In order to run your dashboard locally, you'll need to create a `.env.local` file at the root of your new project: ##### Create .env.local at root of my-app touch .env.local And include the following: ##### Set up environment variables NEXT_PUBLIC_TINYBIRD_HOST="YOUR TINYBIRD API HOST" # Your regional API host e.g. https://api.tinybird.co NEXT_PUBLIC_TINYBIRD_TOKEN="YOUR SIGNING TOKEN" # Use your Admin Token as the signing token Replace the Tinybird API hostname/region with the [right API URL region](https://www.tinybird.co/docs/docs/api-reference/overview#regions-and-endpoints) that matches your Workspace. Your Token lives in the Workspace under "Tokens". ## 7. Define your APIs in code¶ To support the dashboard components you’re about to build, it's a great idea to create a helper file that contains all your Tinybird API references. In the project repo, that’s called `tinybird.js` and it looks like this: ##### tinybird.js helper file const playerBattingPercentagesURL = (host, token, limit) => `https://${host}/v0/pipes/player_batting_percentages.json?limit=${limit}&token=${token}` const teamBattingPercentagesURL = (host, token, limit) => `https://${host}/v0/pipes/team_batting_percentages.json?limit=${limit}&token=${token}` const opponentBattingPercentagesURL = (host, token, limit) => `https://${host}/v0/pipes/opponent_batting_percentages.json?limit=${limit}&token=${token}` const battingPercentageOverTimeURL = (host, token, team_name) => `https://${host}/v0/pipes/batting_percentage_over_time.json?team_name=${team_name}&token=${token}` const hitsByTypeURL = (host, token, hit_type) => `https://${host}/v0/pipes/most_hits_by_type.json?hit_type=${hit_type}&token=${token}` const fetchTinybirdUrl = async (fetchUrl, setData, setLatency) => { const data = await fetch(fetchUrl) const jsonData = await data.json(); setData(jsonData.data); setLatency(jsonData.statistics.elapsed) } export { fetchTinybirdUrl, playerBattingPercentagesURL, teamBattingPercentagesURL, opponentBattingPercentagesURL, battingPercentageOverTimeURL, hitsByTypeURL } Inside `/src/app` , create a new subfolder called `/services` and paste the snippet into a new `tinybird.js` helper file. ## 8. Build your dashboard components¶ This tutorial uses the [Tremor React library](https://tremor.so/) because it provides a clean UI out of the box with very little code. You could easily use [ECharts](https://echarts.apache.org/en/index.html) or something similar if you prefer. ### Add Tremor to your Next.js app¶ You're going to use Tremor to create a simple bar chart that displays the signature count for each organization. Tremor provides stylish React chart components that you can deploy easily and customize as needed. Inside your app folder, install Tremor with the CLI: ##### Install Tremor npx @tremor/cli@latest init Select Next as your framework and allow Tremor to overwrite your existing `tailwind.config.js`. ### Create dashboard component files¶ Your final dashboard contains 3 Bar Charts, 1 Area Chart, and 1 Bar List. You’ll use Tremor Cards to display these components, and each one will have an interactive input. 
In addition, you'll show the API response latency underneath the Chart (just so you can show off about how "real-timey" the dashboard is). Here's the code for the Player Batting Percentages component ( `playerBattingPercentages.js` ). It sets up the file, defines the limit parameter, then renders the Chart components:

"use client";
import { Card, Title, Subtitle, BarChart, Text, NumberInput, Flex } from '@tremor/react'; // Tremor components
import React, { useState, useEffect } from 'react';
import { fetchTinybirdUrl, playerBattingPercentagesURL } from '../services/tinybird.js' // Tinybird API

// utilize useState/useEffect to get data from Tinybird APIs on change
const PlayerBattingPercentages = ({host, token}) => {
  const [player_batting_percentages, setData] = useState([{
    "Player Name": "",
    "Batting Percentage": 0,
  }]);
  // set latency from the API response
  const [latency, setLatency] = useState(0);
  // set limit parameter when the component input is changed
  const [limit, setLimit] = useState(10);
  // format the numbers on the component
  const valueFormatter = (number) => `${new Intl.NumberFormat("us").format(number).toString()}`;
  // set the Tinybird API URL with query parameters
  let player_batting_percentages_url = playerBattingPercentagesURL(host, token, limit)
  useEffect(() => {
    fetchTinybirdUrl(player_batting_percentages_url, setData, setLatency)
  }, [player_batting_percentages_url]);
  // build the Tremor component
  return (
    <Card>
      <Title>Player Batting Percentages</Title>
      <Subtitle>All Players</Subtitle>
      <Flex>
        <Text># of Results</Text>
        <NumberInput onValueChange={(value) => setLimit(value)} />
      </Flex>
// Build the bar chart with the data received from the Tinybird API Latency: {latency*1000} ms // Add the latency metric ); }; export default PlayerBattingPercentages; In the project repo, you’ll find the 5 dashboard components you need, inside the `src/app/components` directory. Each one renders a dashboard component to display the data received by one of the Tinybird APIs. It's time to build them out. For this tutorial, just recreate the same files in your app, pasting in the JavaScript (or downloading the files and dropping them in to your app directory). When building your own dashboard in future, use this as a template and build to fit your needs! ## 9. Compile components into a dashboard¶ Final step! Update your `page.tsx` file to render a nicely-organized dashboard with your 5 components: Replace the contents of `page.tsx` with [this file](https://github.com/tinybirdco/bigquery-dashboard/blob/main/src/app/page.js). The logic in this page gets your Tinybird Token from your local environment variables to be able to access the Tinybird APIs, then renders the 5 components you just built in a Tremor [Grid](https://blocks.tremor.so/blocks/grid-lists). To visualize your dashboard, run it locally with `npm run dev` and open http://localhost:3000. You’ll see your complete real-time dashboard! <-figure-> ![Analytics dashboard build with BigQuery data, Tinybird Endpoints, and Tremor components in a Next.js app](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Ftutorial-bigquery-dashboard.png&w=3840&q=75) Notice the latencies in each dashboard component. This is the Tinybird API request latency. This is not using any sort of cache or query optimization; each request is directly querying the 20,000 rows in the table and returning a response. As you interact with the dashboard and change inputs, the APIs respond. In this case, that’s happening in just a few milliseconds. Now ***that’s*** a fast dashboard. ### Optional: Expand your dashboard¶ You've got the basics: An active Workspace and Data Source, knowledge of how to build Pipes, and access to the [Tremor docs](https://www.tremor.so/docs/getting-started/installation) . Build out some more Pipes, API Endpoints, and visualizations! You can also spend some time [optimizing your data project](https://www.tinybird.co/docs/docs/query/sql-best-practices) for faster responses and minimal data processing using fine-tuned indexes, [Materialized Views](https://www.tinybird.co/docs/docs/publish/materialized-views/overview) , and more. ## Next steps¶ - Investigate the[ GitHub repository for this project](https://github.com/tinybirdco/bigquery-dashboard) in more depth. - Understand today's real-time analytics landscape with[ Tinybird's definitive guide](https://www.tinybird.co/blog-posts/real-time-analytics-a-definitive-guide) . - Learn how to implement[ multi-tenant security](https://www.tinybird.co/blog-posts/multi-tenant-saas-options) in your user-facing analytics. --- URL: https://www.tinybird.co/docs/guides/tutorials/leaderboard Last update: 2024-11-13T15:42:17.000Z Content: --- title: "Build a real-time game leaderboard · Tinybird Docs" theme-color: "#171612" description: "Learn how to build a real-time leaderboard using Tinybird." --- # Build a real-time game leaderboard¶ In this guide you'll learn how to build a real-time leaderboard using Tinybird. Leaderboards are a visual representation that ranks things by one or more attributes. 
For gaming use cases, commonly-displayed attributes include total points scored, high game scores, and number of games played. But leaderboards are used for far more than games. For example, app developers use leaderboards to display miles biked, donations raised, documentation pages visited most often, and countless other examples - basically, anywhere there is some user attribute that can be ranked to compare results. This tutorial is a great starting point for building your own leaderboard. [GitHub Repository](https://github.com/tinybirdco/demo-user-facing-leaderboard) <-figure-> ![A fast, fun leaderboard built on Tinybird](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fleaderboard-tutorial-2.png&w=3840&q=75) In this tutorial, you'll add a gaming leaderboard to the [Flappybird game](https://flappy.tinybird.co/) . Not only is this game fun to play, it's also a vehicle for demonstrating user-facing features. For example, the game features a leaderboard so you can see how your most recent score compares to other top players. You will: 1. Generate a mock game event stream that mimics a high-intensity Flappybird global tournament. 2. Post these mock events to your Tinybird Workspace using the Events API. 3. Transform (rank) this data using Tinybird Pipes and SQL. 4. Optimize your data handling with a Materialized View. 5. Publish the results as a Tinybird API Endpoint. 6. Generate a leaderboard that makes calls to your API Endpoint securely and directly from the browser. Each time a leaderboard request is made, up-to-the-second results are returned for the leaderboard app to render. When embedded in the game, a `leaderboard` API Endpoint is requested when a game ends. For this tutorial, the app will make requests on a specified interval and have a button for ad-hoc requests. Game events consist of three values: - `core` - Generated when a point is scored. - `game_over` - Generated when a game ends. - `purchase` - Generated when a ‘make-the-game-easier’ coupon is redeemed. Each event object has the following JSON structure: ##### Example JSON event object { "session_id": "1f2c8bcf-8a5b-4eb1-90bf-8726e63d81b7", "name": "Marley", "timestamp": "2024-06-20T19:06:15.373Z", "type": "game_over", "event": "Mockingbird" } Here's how it all fits together: <-figure-> ![Events and process of the leaderboard](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fleaderboard-tutorial-1.png&w=3840&q=75) ## Prerequisites¶ To complete this tutorial, you'll need the following: 1. A[ free Tinybird account](https://www.tinybird.co/signup) 2. An empty Tinybird Workspace 3. Node.js >=20.11 4. Python >=3.8 ## 1. Create a Tinybird Workspace¶ Navigate to the Tinybird web UI ( [app.tinybird.co](https://app.tinybird.co/) ) and create an empty Tinybird Workspace (no starter kit) called `tiny_leaderboard` in your preferred region. ### Create a Data Source for events¶ The first step with any Tinybird project is to create Data Sources to work with. For this tutorial, you have two options. The first is to create a Data Source based on a schema that you define. The alternative is to rely on the [Mockingbird](https://mockingbird.tinybird.co/docs) tool used to stream mock data to create the Data Source for you. While the Mockingbird method is faster, building your own Data Source gives you more control and introduces some fundamental concepts along the way. #### Option 1: Create a Data Source using a written schema¶ In the Tinybird UI, add a new Data Source and use the `Write schema` option. 
In the schema editor, use the [following schema](https://github.com/tinybirdco/demo-user-facing-leaderboard/blob/main/tinybird/datasources/game_events.datasource): ##### Data Source schema SCHEMA > `name` String `json:$.name`, `session_id` String `json:$.session_id`, `timestamp` DateTime64(3) `json:$.timestamp`, `type` LowCardinality(String) `json:$.type`, `event` String `json:$.event` ENGINE "MergeTree" ENGINE_PARTITION_KEY "toYear(timestamp)" ENGINE_SORTING_KEY "event, name, timestamp" Name the Data Source `game_events` and select `Create Data Source`. This schema definition shows how the incoming JSON events are parsed and assigned to each of schema fields. The definition also defines [database table ‘engine’ details](https://www.tinybird.co/docs/concepts/data-sources#supported-engines-settings) of the underlying ClickHouse® instance. Tinybird projects are made of Data Source and Pipe definition files like this example, and they can be managed like any other code project using Git. #### Option 2: Create a Data Source with Mockingbird¶ As part of the Mockingbird configuration (see below), you'll provide the name of the Data Source to write the events to. If that Data Source does not already exist, a new Data Source with that name will be created with an automatically-generated schema. This auto-inferred schema may match your expectations, but it may lack important features. For example, automatically-generated schema will not apply the `LowCardinality` operator, a commonly-used operator that can make data lookups more efficient. Having Mockingbird auto-create your exploratory Data Source is a great way to explore Tinybird. As you begin to prototype and design production systems, you should anticipate creating new Data Sources by providing a schema design. Now you've created your main Data Source, it's ready to receive events! ## 2. Create a mock data stream¶ In a real-life scenario, you'd stream your game events into the `game_events` Data Source. For this tutorial, you'll use [Mockingbird](https://mockingbird.tinybird.co/docs) , an open source mock data stream generator, to stream mock events instead. Mockingbird generates a JSON payload based on a predefined schema and posts it to the [Tinybird Events API](https://www.tinybird.co/docs/docs/ingest/events-api) , which then writes the data to your Data Source. ### Generate fake data¶ Use [this Mockingbird link](https://mockingbird.tinybird.co/?host=eu_gcp&datasource=game_events&eps=10&withLimit=on&generator=Tinybird&endpoint=eu_gcp&limit=-1&generatorName=Tinybird&template=Flappybird&schema=Preset) to generate fake data for the `game_events` Data Source. Using this link ^ provides a pre-configured schema. Enter your Workspace admin Token and select the Host region that matches your Workspace region. Select `Save` , then scroll down and select `Start Generating!`. Replace the Tinybird API hostname/region with the [right API URL region](https://www.tinybird.co/docs/docs/api-reference/overview#regions-and-endpoints) that matches your Workspace. Your Token lives in the Workspace under "Tokens". In the Tinybird UI, confirm that the `game_events` Data Source is successfully receiving data. Leaderboards typically leverage a concise data schema with just a user/item name, the ranked attribute, and a timestamp. 
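As a quick illustration of that shape, a bare-bones ranking over the `game_events` Data Source you just created is a single aggregation; this is only a sketch, and the `leaderboard` Pipe you build in step 3 refines it per session and adds query parameters:

SELECT name AS player_id, count() AS score
FROM game_events
WHERE type = 'score'
GROUP BY player_id
ORDER BY score DESC
LIMIT 10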
This tutorial is based on this schema: - `name` String - `session_id` String - `timestamp` DateTime64(3) - `type` LowCardinality(String) - `event` String Ranking algorithms can be based on a single score, time-based metrics, or weighted combinations of factors. ## 3. Transform and publish your data¶ Your Data Source is collecting events, so now it's time to create some [Pipes](https://www.tinybird.co/docs/docs/concepts/pipes) . Pipes are made up of chained, reusable SQL [Nodes](https://www.tinybird.co/docs/docs/concepts/pipes#nodes) and form the logic that will rank the results. You'll start by creating a `leaderboard` Pipe with two Nodes. The first Node will return all ‘score’ events. The second Node will take those results and count these events by player and session (which defines a single game), and return the top 10 results. ### Create a Pipe¶ In the Tinybird UI, create a new Pipe called `leaderboard` . To begin with, you'll use some basic SQL that isn't fully optimized, and that's ok! You'll optimize it later. Paste in the following SQL and rename the first Node `get_all_scores`: ##### get\_all\_scores Node % SELECT name AS player_id, timestamp, session_id, event FROM game_events WHERE type = 'score' AND event == {{ String(event_param, 'Mockingbird', description="Event to filter on") }} This query returns all events where the type is `score`. Note that this Node creates a query parameter named `event_param` using the [Tinybird templating syntax](https://www.tinybird.co/docs/query/query-parameters) . This instance of Flappybird supports an ‘event’ attribute that supports organizing players, games, and events into separate groups. As shown above, incoming Mockingbird game events have a `"event": "Mockingbird"` attribute. Select "Run" and add a new Node underneath, called `endpoint` . Paste in: ##### endpoint Node SELECT player_id, session_id, event, count() AS score FROM get_all_scores GROUP BY player_id, session_id, event ORDER BY score DESC LIMIT 10 Select "Run", then select "Create API Endpoint". Congrats! Your data is now ranked, published, and available for consuming. ## 4. Optimize with Materialized Views¶ Before you run the frontend for your leaderboard, there are a few optimizations to make. Even with small datasets, it's a great habit to get into. [Materialized Views](https://www.tinybird.co/docs/docs/publish/materialized-views/overview) are updated as data is ingested, and create intermediate states that are merged with already-processed data. This ability to keep track of already-processed data, and combine it with recently arrived data, helps keep your API Endpoints performance super efficient. The Materialized View (MV) continuously re-evaluates queries as new events are inserted, reducing both latency and processed-data-per-query. In this case, the MV you create will pre-calculate the top scores, and merge those with recently-received events. This significantly improves query performance by reducing the amount of data that needs to be processed for each leaderboard request. To create a new Materialized View, begin by adding a new Pipe and call it `user_stats_mv` . 
Then paste the following SQL into the first Node: SELECT event, name AS player_id, session_id, countIfState(type = 'score') AS scores, countIfState(type = 'game_over') AS games, countIfState(type = 'purchase') AS purchases, minState(timestamp) AS start_ts, maxState(timestamp) AS end_ts FROM game_events GROUP BY event, player_id, session_id This query relies on the `countIfState` function, which includes the `-State` operator to maintain intermediate states containing recent data. When triggered by a `-Merge` operator (see below), these intermediate states are combined with the pre-calculated data. The `countIfState` function is used to maintain counts of each type of game event. Name this Node `populate_mv` , then [publish it as a Materialized View](https://www.tinybird.co/docs/publish/materialized-views/overview) . Name your Materialized View `user_stats`. You now have a new Data Source called `user_stats` , which is a Materialized View that is continuously updated with the latest game events. As you will see next, the `-State` modifier that maintains intermediate states as new data arrives will be paired with a `-Merge` modifier in Pipes that pull from the `user_stats` Data Source.
## 5. Update leaderboard Pipe¶ Now that `user_stats` is available, you can now rebuild the `leaderboard` Pipe to take advantage of this more efficient Data Source. This step will help prepare your leaderboard feature to handle massive amounts of game events while serving requests to thousands of users. The updated leaderboard Pipe will consist of three Nodes: - `rank_games` - Applies the countMerge(scores) function to get the current total from the `user_stats` Data Source. - `last_game` - Retrieves the score from the player's most recent game and determines the player's rank. - `endpoint` - Combines the results of these two Nodes and ranks by score. Note that the `last_game` Node introduces the *user-facing* aspect of the leaderboard. As seen below, this Node retrieves a specific user's data and blends it into the leaderboard results. To get started, update the `leaderboard` Pipe to use the `user_stats` Materialized View. Return to the `leaderboard` Pipe and un-publish it. Now, change the name of the first Node to `rank_games` and update the SQL to: ##### rank\_games Node % SELECT ROW_NUMBER() OVER (ORDER BY total_score DESC, t) AS rank, player_id, session_id, countMerge(scores) AS total_score, maxMerge(end_ts) AS t FROM user_stats GROUP BY player_id, session_id ORDER BY rank A few things to notice here: 1. The `rank_games` Node now uses the `user_stats` Materialized View instead of the `game_events` Data Source. 2. The use of the `countMerge(scores)` function. The `-Merge` operator triggers the MV-based `user_stats` Data Source to combine any intermediate states with the pre-calculated data and return the results. 3. The use of the `ROW_NUMBER()` window function that returns a ranking of top scores. These rankings are based on the merged scores (aliased as `total_score` ) retrieved from the `user_stats` Data Source. Next, change the name of the second Node to `last_game` and update the SQL to: ##### last\_game Node % SELECT argMax(rank, t) AS rank, player_id, argMax(session_id, t) AS session_id, argMax(total_score, t) AS total_score FROM rank_games WHERE player_id = {{ String(player_id, 'Jim', description="Player to filter on", required=True) }} GROUP BY player_id This query returns the highest rank of a specified player and introduces a `player_id` query parameter.
To combine these results, add a new Node called `endpoint` and paste the following SQL: ##### endpoint Node SELECT * FROM ( SELECT rank, player_id, session_id, total_score FROM rank_games WHERE (player_id, session_id) NOT IN (SELECT player_id, session_id FROM last_game) LIMIT 10 UNION ALL SELECT rank, player_id, session_id, total_score FROM last_game ) ORDER BY rank ASC This query applies the `UNION ALL` statement to combine the two result sets. Note that the selected attribute data types must match to be combined. This completes the `leaderboard` Pipe. Publish it as an API Endpoint. Now that the final version of the 'leaderboard' Endpoint has been published, create one last Pipe in the UI. This one gets the overall stats for the leaderboard, the number of players and completed games. Name the Pipe `get_stats` and create a single Node named `endpoint`: ##### endpoint Node in the get\_stats Pipe WITH player_count AS ( SELECT COUNT(DISTINCT player_id) AS players FROM user_stats ), game_count AS ( SELECT COUNT(*) AS games FROM game_events WHERE type == 'game_over' ) SELECT players, games FROM player_count, game_count Publish this Node as an API Endpoint. You're ready to get it all running! ## 6. Run your app¶ Clone the `demo-user-facing-leaderboard` repo locally. Install the app dependencies by running this command from the `app` dir of the cloned repo: npm install ### Add your Tinybird settings as environment variables¶ Create a new `.env.local` file: touch .env.local Copy your Tinybird Admin Token, Workspace UUID (Workspace > Settings > Advanced settings > `...` ), and API host url from your Tinybird Workspace into the new `.env.local`: TINYBIRD_SIGNING_TOKEN="YOUR SIGNING TOKEN" # Use your Admin Token as the signing token TINYBIRD_WORKSPACE="YOUR WORKSPACE ID" # The UUID of your Workspace NEXT_PUBLIC_TINYBIRD_HOST="YOUR TINYBIRD API HOST e.g. https://api.tinybird.co" # Your regional API host Replace the Tinybird API hostname/region with the [right API URL region](https://www.tinybird.co/docs/docs/api-reference/overview#regions-and-endpoints) that matches your Workspace. Your Token lives in the Workspace under "Tokens". ### Run your app¶ Run your app locally and navigate to [http://localhost:3000](http://localhost:3000/): npm run dev <-figure-> ![A fast, fun leaderboard built on Tinybird](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fleaderboard-tutorial-2.png&w=3840&q=75) Congrats! You now have an optimized gaming leaderboard ingesting real-time data! Have a think about how you'd adapt or extend it for your own use case. ## Next steps¶ - Read the in-depth blog post on[ building a real-time leaderboard](https://www.tinybird.co/blog-posts/building-real-time-leaderboards-with-tinybird) . - Understand today's real-time analytics landscape with[ Tinybird's definitive guide](https://www.tinybird.co/blog-posts/real-time-analytics-a-definitive-guide) . - Learn how to implement[ multi-tenant security](https://www.tinybird.co/blog-posts/multi-tenant-saas-options) in your user-facing analytics. Tinybird is not affiliated with, associated with, or sponsored by ClickHouse, Inc. ClickHouse® is a registered trademark of ClickHouse, Inc. --- URL: https://www.tinybird.co/docs/guides/tutorials/real-time-dashboard Last update: 2024-11-12T11:45:41.000Z Content: --- title: "Build a real-time dashboard with Tremor & Next.js · Tinybird Docs" theme-color: "#171612" description: "Learn how to build a user-facing web analytics dashboard using Tinybird, Tremor, and Next.js." 
--- # Build a real-time dashboard¶ In this guide you'll learn how to build a real-time analytics dashboard from scratch, for free, using just 3 tools: Tinybird, Tremor, and Next.js. You'll end up with a dashboard and enough familiarity with Tremor to adjust the frontend & data visualization for your own projects in the future. [GitHub Repository](https://github.com/tinybirdco/demo-user-facing-saas-dashboard-signatures) Imagine you’re a [DocuSign](https://www.docusign.com/) competitor. You’re building a SaaS to disrupt the document signature space, and as a part of that, you want to give your users a real-time data analytics dashboard so they can monitor how, when, where, and what is happening with their documents in real time. In this tutorial, you'll learn how to: 1. Use Tinybird to capture events (like a document being sent, signed, or received) using the Tinybird Events API. 2. Process them with SQL. 3. Publish the transformations as real-time APIs. 4. Use Tremor components in a Next.js app to build a clean, responsive, real-time dashboard. Here's how it all fits together: <-figure-> ![Diagram showing the data flow from Tinybird --> Next.js --> Tremor](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Ftutorial-real-time-dashboard-data-flow.png&w=3840&q=75) ## Prerequisites¶ To complete this tutorial, you'll need the following: 1. A[ free Tinybird account](https://www.tinybird.co/signup) 2. Node.js >=18 3. Python >=3.8 4. Working familiarity with JavaScript This tutorial uses both the Tinybird web UI and the Tinybird CLI. If you're not familiar with the Tinybird CLI, [read the CLI docs](https://www.tinybird.co/docs/docs/cli/install) or just give it a go! You can copy and paste every code snippet and command in this tutorial - each step is clearly explained. ## 1. Create a Tinybird Workspace¶ Navigate to the Tinybird web UI ( [app.tinybird.co](https://app.tinybird.co/) ) and create an empty Tinybird Workspace (no starter kit) called `signatures_dashboard` in your preferred region. ## 2. Create the folder structure¶ In your terminal, create a folder called `tinybird-signatures-dashboard` . This folder is going to contain all your code. Inside it, create a bunch of folders to keep things organized: ##### Create the folder structure mkdir tinybird-signatures-dashboard && cd tinybird-signatures-dashboard mkdir datagen datagen/utils app tinybird The final structure will be: ##### Folder structure └── tinybird-signatures-dashboard ├── app ├── datagen │ └── utils └── tinybird ## 3. Install the Tinybird CLI¶ The Tinybird CLI is a command-line tool that allows you to interact with Tinybird’s API. You will use it to create and manage the data project resources that underpin your real-time dashboard. Run the following commands to prepare the virtual environment, install the CLI, and authenticate (the `-i` flag is for "interactive"): ##### Install the Tinybird CLI python -m venv .venv source .venv/bin/activate pip install tinybird-cli tb auth -i Choose the region that matches your Workspace region (if you're not sure which region you chose, don't worry: In the Tinybird UI, select the same of the Workspace (top left) and it will say the region under your email address). You’ll then be prompted for your [user admin Token](https://www.tinybird.co/docs/docs/concepts/auth-tokens) , which lives in the Tinybird UI under "Tokens". Paste it into the CLI and press enter. 
You’re now authenticated to your Workspace from the CLI, and your auth details are saved in a `.tinyb` file in the current working directory. Your user admin Token has full read/write privileges for your Workspace. Don't share it or publish it in your application. You can find more detailed info about Static Tokens [in the Tokens docs](https://www.tinybird.co/docs/docs/concepts/auth-tokens). Ensure that the `.tinyb` file and the `.venv` folder are not publicly exposed by creating a `.gitignore` file and adding it: ##### Housekeeping: Hide your Token\! touch .gitignore echo ".tinyb" >> .gitignore echo ".venv" >> .gitignore ## 4. Create a mock data stream¶ Now download the [mockDataGenerator.js](https://github.com/tinybirdco/demo-user-facing-saas-dashboard-signatures/blob/main/datagen/mockDataGenerator.js) file and place it in the `datagen` folder. ##### Mock data generator cd datagen curl -O https://raw.githubusercontent.com/tinybirdco/demo-user-facing-saas-dashboard-signatures/refs/heads/main/datagen/mockDataGenerator.js ### What this file does¶ The `mockDataGenerator.js` script generates mock user accounts, with fields like `account_id`, `organization`, `phone_number` , and various certification statuses related to the account’s means of identification: ##### Create fake account data const generateAccountPayload = () => { const status = ["active", "inactive", "pending"]; const id = faker.number.int({ min: 10000, max: 99999 }); account_id_list.push(id); return { account_id: id, organization: faker.company.name(), status: status[faker.number.int({ min: 0, max: 2 })], role: faker.person.jobTitle(), certified_SMS: faker.datatype.boolean(), phone: faker.phone.number(), email: faker.internet.email(), person: faker.person.fullName(), certified_email: faker.datatype.boolean(), photo_id_certified: faker.datatype.boolean(), created_on: (faker.date.between({ from: '2020-01-01', to: '2023-12-31' })).toISOString().substring(0, 10), timestamp: Date.now(), } } In addition, the code generates mock data events about the document signature process, with variable status values such as `in_queue`, `signing`, `expired`, `error` , and more: const generateSignaturePayload = (account_id, status, signatureType, signature_id, since, until, created_on) => { return { signature_id, account_id, status, signatureType, since: since.toISOString().substring(0, 10), until: until.toISOString().substring(0, 10), created_on: created_on.toISOString().substring(0, 10), timestamp: Date.now(), uuid: faker.string.uuid(), } } Lastly, the generator creates and sends a final status for the signature using weighted values: const finalStatus = faker.helpers.weightedArrayElement([ { weight: 7.5, value: 'completed' }, { weight: 1, value: 'expired' }, { weight: 0.5, value: 'canceled' }, { weight: 0.5, value: 'declined' }, { weight: 0.5, value: 'error' }, ]) // 7.5/10 chance of being completed, 1/10 chance of being expired, 0.5/10 chance of being canceled, declined or error ### Download the helper functions¶ This script also utilizes a couple of helper functions to access your Tinybird Token and send the data to Tinybird with an HTTP request using the Tinybird Events API. These helper functions are located in the `tinybird.js` file in the repo. [Download that file](https://github.com/tinybirdco/demo-user-facing-saas-dashboard-signatures/blob/main/datagen/utils/tinybird.js) and add it to the `datagen/utils` directory. 
##### Helper functions cd datagen/utils curl -O https://raw.githubusercontent.com/tinybirdco/demo-user-facing-saas-dashboard-signatures/refs/heads/main/datagen/utils/tinybird.js The Tinybird Events API is useful for two reasons: 1. It allows for the flexible and efficient ingestion of data, representing various stages of signatures, directly into the Tinybird platform without needing complex streaming infrastructure. 2. It allows you to stream events directly from your application instead of relying on batch ETLs or change data capture, which require the events to first be logged in a transactional database and can add lag to the data pipeline. ### Install the Faker library¶ Run this command: ##### Install Faker cd datagen npm init --yes npm install @faker-js/faker To run this file and start sending mock data to Tinybird, you need to add a custom script to the `package.json` file that npm generated inside the `datagen` folder. Open up that file and add the following to the `scripts` section: ##### Add seed npm script "seed": "node ./mockDataGenerator.js" Note that since your code is using ES modules, you’ll need to add `"type": "module"` to the `package.json` file to be able to run the script and access the modules. For more information on why, [read this helpful post](https://www.codeconcisely.com/posts/nextjs-esm/). Your package.json should now look something like this: ##### package.json { "name": "datagen", "version": "1.0.0", "description": "", "main": "index.js", "type": "module", "scripts": { "seed": "node ./mockDataGenerator.js" }, "dependencies": { "@faker-js/faker": "^8.4.1" }, "license": "ISC", "author": "" } Okay: You're ready to start sending mock data to Tinybird. Open up a new terminal tab or window in this local project directory and, from the `datagen` folder, run: ##### Generate mock data! npm run seed Congratulations! You should see the seed output in your terminal. Let this run in the background so you have some data for the next steps. Return to your original terminal tab or window and move on to the next steps. ### Verify your mock data stream¶ To verify that the data is flowing properly into Tinybird, inspect the Tinybird Data Sources. In the Tinybird UI, navigate to the `signatures` and `accounts` [Data Sources](https://www.tinybird.co/docs/docs/concepts/data-sources) to confirm that the data has been received. The latest records should be visible. You can also confirm using the CLI, by running a SQL command on your Data Source: tb sql "select count() from signatures" If you run this a few times, and your mock data stream is still running, you'll see this number increase. Neat. This project uses mock data streams to simulate data generated by a hypothetical document signatures app. If you have your own app that’s generating data, you don’t need to do this! You can just add the helper functions to your codebase and call them to send data directly from your app to Tinybird. ## 5. Build dashboard metrics with SQL¶ You now have a Data Source with events streaming into Tinybird, which ensures your real-time dashboard has access to fresh data. The next step is to build real-time metrics using [Tinybird Pipes](https://www.tinybird.co/docs/docs/concepts/pipes). A Pipe is a set of chained, composable Nodes of SQL that process, transform, and enrich data in your Data Sources. Create a new Pipe in the Tinybird UI by selecting the + icon in the left-hand nav bar and selecting "Pipe". Rename your new Pipe `ranking_of_top_organizations_creating_signatures`. Next, it's time to make your first Node!
Remove the placeholder text from the Node, and paste the following SQL in: % SELECT account_id, {% if defined(completed) %} countIf(status = 'completed') total {% else %} count() total {% end %} FROM signatures WHERE fromUnixTimestamp64Milli(timestamp) BETWEEN {{ Date( date_from, '2023-01-01', description="Initial date", required=True, ) }} AND {{ Date( date_to, '2024-01-01', description="End date", required=True ) }} GROUP BY account_id HAVING total > 0 ORDER BY total DESC Key points to understand in this snippet: 1. As well as standard SQL, it uses the Tinybird [templating language and query parameters](https://www.tinybird.co/docs/docs/query/query-parameters) - you can tell when query params are used, because the `%` symbol appears at the top of the query. This makes the query *dynamic*, so instead of hardcoding the date range, the user can select a range and have the results refresh in real time. 2. It has an `if defined` statement. In this case, if a boolean query parameter called `completed` is passed, the Pipe calculates the number of completed signatures. Otherwise, it calculates all signatures. Select "Run" to run and save this Node, then rename it `retrieve_signatures` . Below this Node, create a second one. Remove the placeholder text and paste the following SQL in: ##### Second Node SELECT organization, sum(total) AS org_total FROM retrieve_signatures LEFT JOIN accounts ON accounts.account_id = retrieve_signatures.account_id GROUP BY organization ORDER BY org_total DESC LIMIT {{Int8(limit, 10, description="The number of accounts to retrieve", required=False)}} Name this Node `endpoint` and select "Run" to save it. You now have a 2-Node Pipe that gets the top `limit` organizations by number of signatures within a date range, either completed or total depending on whether a `completed` query parameter is passed or not. ## 6. Publish metrics as APIs¶ You're now ready to build a low-latency, high-concurrency REST API Endpoint from your Pipe - with just 2 clicks! Select the "Create API Endpoint" button at top right, then select the `endpoint` Node. You’ll be greeted with an API page that contains a usage monitoring chart, parameter documentation, and sample usage. In addition, the API has been secured through an automatically-generated, read-only Token. ### Test your API¶ Copy the HTTP API Endpoint from the "Sample usage" box and paste it directly into a new browser tab to see the response. In the URL, you can manually adjust the `date_from` and `date_to` parameters and see the different responses. You can also adjust the `limit` parameter, which controls how many rows are returned. If you request the data in JSON format (the default behavior), you’ll also receive some metadata about the response, including statistics about the query latency: ##### Example Tinybird API statistics "statistics": { "elapsed": 0.001110996, "rows_read": 4738, "bytes_read": 101594 } You'll notice that the API response in this example took barely 1 millisecond (which is... pretty fast), so your dashboards are in good hands when it comes to being ultra responsive. When building out your own projects in the future, use this metadata [and Tinybird's other tools](https://www.tinybird.co/docs/docs/monitoring/health-checks) to monitor and optimize your dashboard query performance. ### Optional: Pull the Tinybird resources into your local directory¶ At this point, you've created a bunch of Tinybird resources: A Workspace, a Data Source, Pipes, and an API Endpoint.
You can pull these resources down locally, so that you can manage this project with Git. In your terminal, start by pulling the Tinybird data project: ##### In the root directory tb pull --auto You’ll see a confirmation that 3 resources ( `signatures.datasource`, `accounts.datasource` , and `ranking_of_top_organizations_creating_signatures.pipe` ) were written into two subfolders, `datasources` and `pipes` , which were created by using the `--auto` flag. Move them into the `tinybird` directory: ##### Move to /tinybird directory mv datasources pipes tinybird/ As you add additional resources in the Tinybird UI, use the `tb pull --auto` command to pull files from Tinybird. You can then add them to your Git commits and push them to your remote repository. If you create data project resources locally using the CLI, you can push them to the Tinybird server with `tb push` . For more information on managing Tinybird data projects in the CLI, check out [this CLI overview](https://www.tinybird.co/docs/docs/cli/quick-start). ## 7. Create real-time dashboard¶ Now that you have a low-latency API with real-time dashboard metrics, you're ready to create the visualization layer using Next.js and Tremor. These two tools provide a scalable and responsive interface that integrates with Tinybird's APIs to display data dynamically. Plus, they look great. ### Initialize your Next.js project¶ In your terminal, navigate into the `app` folder you created earlier and create your Next.js app with this command. In this tutorial you'll use plain JavaScript files and Tailwind CSS: ##### Create a Next app cd app npx create-next-app . --js --tailwind --eslint --src-dir --app --import-alias "@/*" ### Add Tremor to your Next.js app¶ You're going to use Tremor to create a simple bar chart that displays the signature count for each organization. Tremor provides stylish React chart components that you can deploy easily and customize as needed. Install Tremor with the CLI: ##### Install Tremor npx @tremor/cli@latest init Select Next as your framework and allow Tremor to overwrite your existing `tailwind.config.js`. ### Add SWR to your Next.js app¶ You're going to use [SWR](https://swr.vercel.app/) to handle the API Endpoint data and refresh it every 5 seconds. SWR is a great React library to avoid dealing with data caching and revalidation complexity on your own. Plus, you can define what refresh policy you want to follow. Take a look at [its docs](https://swr.vercel.app/docs/revalidation) to learn about the different revalidation strategies. ##### Install SWR npm i swr ### Set up environment variables¶ Next, you need to add your Tinybird host and user admin Token as environment variables so you can run the project locally. Create a `.env.local` file in the root of your Next.js app (the `app` folder) and add the following: ##### Set up environment variables NEXT_PUBLIC_TINYBIRD_HOST="YOUR TINYBIRD API HOST" # Your regional API host e.g. https://api.tinybird.co NEXT_PUBLIC_TINYBIRD_TOKEN="YOUR SIGNING TOKEN" # Use your Admin Token as the signing token Replace the Tinybird API hostname/region with the [right API URL region](https://www.tinybird.co/docs/docs/api-reference/overview#regions-and-endpoints) that matches your Workspace. Your Token lives in the Workspace under "Tokens". ### Set up your page.js¶ Next.js created a `page.js` as part of the bootstrap process. Open it in your preferred code editor and clear the contents.
Paste in the snippets in order from the following sections, understanding what each one does: ### Import UI libraries¶ To build your dashboard component, you will need to import various UI elements and functionality from these libraries at the beginning of your file. Note the use of the `"use client"` directive to render the components on the client side. For more details on this, check out the [Next.js docs](https://nextjs.org/docs/app/building-your-application/rendering#network-boundary). ##### Start building page.js "use client"; import { BarChart, Card, Subtitle, Text, Title } from "@tremor/react"; import React from "react"; import useSWR from "swr"; ### Define constants¶ Below the imports, define the constants required for this specific component: ##### Add environment variables and constants // Get your Tinybird host and Token from the .env file const TINYBIRD_HOST = process.env.NEXT_PUBLIC_TINYBIRD_HOST; // The host URL for the Tinybird API const TINYBIRD_TOKEN = process.env.NEXT_PUBLIC_TINYBIRD_TOKEN; // The access Token for authentication with the Tinybird API const REFRESH_INTERVAL_IN_MILLISECONDS = 5000; // five seconds ### Connect your dashboard to your Tinybird API¶ You’ll need to write a function to fetch data from Tinybird. Note that for the sake of brevity, this snippet hardcodes the dates and uses the default limit in the Tinybird API. You could set up a Tremor datepicker and/or number input if you wanted to dynamically update the dashboard components from within the UI. ##### Define query parameters and Tinybird fetch function export default function Dashboard() { // Define date range for the query const today = new Date(); // Get today's date const dateFrom = new Date(today.setMonth(today.getMonth() - 1)); // Set the query's dateFrom to one month before today const dateTo = new Date(today.setMonth(today.getMonth() + 1)); // Roll the mutated date forward again, so the query's dateTo is today // Format for passing as a query parameter const dateFromFormatted = dateFrom.toISOString().substring(0, 10); const dateToFormatted = dateTo.toISOString().substring(0, 10); // Construct the URL for fetching data, including host, token, and date range const endpointUrl = new URL( "/v0/pipes/ranking_of_top_organizations_creating_signatures.json", TINYBIRD_HOST ); endpointUrl.searchParams.set("token", TINYBIRD_TOKEN); endpointUrl.searchParams.set("date_from", dateFromFormatted); endpointUrl.searchParams.set("date_to", dateToFormatted); // Initialize variables for storing data let ranking_of_top_organizations_creating_signatures, latency, errorMessage; try { // Function to fetch data from Tinybird URL and parse JSON response const fetcher = (url) => fetch(url).then((r) => r.json()); // Use the SWR hook to handle state and refresh the result every five seconds const { data, error } = useSWR(endpointUrl.toString(), fetcher, { refreshInterval: REFRESH_INTERVAL_IN_MILLISECONDS, }); if (error) { errorMessage = error; return; } if (!data) return; if (data?.error) { errorMessage = data.error; return; } ranking_of_top_organizations_creating_signatures = data.data; // Store the fetched data latency = data.statistics?.elapsed; // Store the query latency reported by Tinybird } catch (e) { console.error(e); errorMessage = e; } ### Render the Component¶ Finally, include the rendering code to display the "Ranking of the top organizations creating signatures" in the component's return statement: ##### Render the dashboard component return (
    // Reconstructed markup (the original was lost in rendering); exact elements/props may differ from the repo.
    <Card>
      <Title>Top Organizations Creating Signatures</Title>
      <Subtitle>Ranked from highest to lowest</Subtitle>
      {ranking_of_top_organizations_creating_signatures && (
        <BarChart
          data={ranking_of_top_organizations_creating_signatures}
          index="organization"
          categories={["org_total"]}
        />
      )}
      {latency && <Text>Latency: {latency * 1000} ms</Text>}
      {errorMessage && (
        <div>
          <Text>Oops, something happened: {errorMessage}</Text>
          <Text>Check your console for more information</Text>
        </div>
      )}
    </Card>
); } ### View your dashboard\!¶ It's time! Run `npm run dev` and navigate to `http://localhost:3000/` in your browser. You should see something like this: <-figure-> ![Diagram showing the data flow from Tinybird --> Next.js --> Tremor](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Ftutorial-real-time-dashboard-data-flow.png&w=3840&q=75) Congratulations! You’ve created a real-time dashboard component using Tinybird, Tremor, and Next.js. You’ll notice the dashboard is rendering very quickly by taking a peek at the latency number below the component. In this example case, Tinybird returned the data for the dashboard in a little over 40 milliseconds aggregating over about a million rows. Not too bad for a relatively un-optimized query! ### Optional: Expand your dashboard¶ You've got the basics: An active Workspace and Data Source, knowledge of how to build Pipes, and access to the [Tremor docs](https://www.tremor.so/docs/getting-started/installation) . Build out some more Pipes, API Endpoints, and visualizations! <-figure-> ![Dashboard showing more visualizations](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Ftutorial-real-time-dashboard-further-examples.png&w=3840&q=75) You can also spend some time [optimizing your data project](https://www.tinybird.co/docs/docs/query/sql-best-practices) for faster responses and minimal data processing using fine-tuned indexes, [Materialized Views](https://www.tinybird.co/docs/docs/publish/materialized-views/overview) , and more. ## Next steps¶ - Investigate the[ GitHub repository for this project](https://github.com/tinybirdco/demo-user-facing-saas-dashboard-signatures) in more depth. - Understand today's real-time analytics landscape with[ Tinybird's definitive guide](https://www.tinybird.co/blog-posts/real-time-analytics-a-definitive-guide) . - Learn how to implement[ multi-tenant security](https://www.tinybird.co/blog-posts/multi-tenant-saas-options) in your user-facing analytics. --- URL: https://www.tinybird.co/docs/guides/tutorials/tinybird-101-tutorial Last update: 2024-10-28T11:06:14.000Z Content: --- title: "Tinybird 101 Tutorial · Tinybird Docs" theme-color: "#171612" description: "Tinybird provides you with an easy way to ingest and query large amounts of data with low-latency, and instantly create API Endpoints to consume those queries. This makes it extremely easy to build fast and scalable applications that query your data; no backend needed!" --- # Tinybird 101¶ Tinybird provides you with a simple way to ingest and query large amounts of data with low latency, and instantly create API Endpoints to consume those queries. This means you can easily build fast and scalable applications that query your data; no backend needed! ## Example use case: ecommerce¶ This walkthrough demonstrates how to build an API Endpoint that returns the top 10 most searched products in an ecommerce website. It follows the process of "ingest > query > publish". 1. First, you'll** ingest** a set of ecommerce events based on user actions, such as viewing an item, adding items to their cart, or going through the checkout. This data is available as a CSV file with 50 million rows. 2. Next, you'll** write queries** to filter, aggregate, and transform the data into the top 10 list. 3. Finally, you'll** publish** that top 10 result as an HTTP Tinybird API Endpoint. ## Prerequisites¶ There are no specific prerequisites - you can start from the very beginning, without a Tinybird account. 
If you need help, reach out to us on [support@tinybird.co](https://www.tinybird.co/docs/mailto:support@tinybird.co). This quick start walkthrough uses the **web UI** (not the Tinybird CLI), and **doesn't use version control features** . Both are great additions to your next Tinybird project, but not necessary to get up and running. Here's why: **Using version control:** There are two distinct workflows you can use when building Tinybird projects: with version control and without version control. Tinybird has an extensive feature set that supports Git-based, version-controlled development, including creating feature [Branches](https://www.tinybird.co/docs/docs/core-concepts#branches) , managing Pull Requests with integrated CI/CD scripts. As your Tinybird projects get more complex and your team of collaborators grows, the version control workflow becomes more critical. However, this walkthrough focuses on understanding the platform and its capabilities, so it's fine to build **without** the version control features. **Using the web UI versus the CLI:** In addition to the Tinybird UI, there is also a Tinybird [command line interface (CLI)](https://www.tinybird.co/docs/docs/core-concepts#cli) for building and managing projects. If you're using the version control workflow, the CLI is the best option, and comes complete with methods for creating and navigating [Branches](https://www.tinybird.co/docs/docs/core-concepts#branches). This example uses the web UI. ## Your first Workspace¶ Wondering how to create an account? It's free! [Start here](https://www.tinybird.co/signup). After [creating your account](https://www.tinybird.co/signup) , select a region, and name your Workspace. You can call the Workspace whatever you want; generally, people name their Workspace after the project they are working on. As you learn Tinybird, you are likely to create multiple Workspaces, so descriptive names are helpful. You can always rename a Workspace in the future. Leave the Starter Kit selection dropdown blank. Welcome to your new Workspace! ## 1. Create a Data Source¶ Tinybird can import data from many different sources, but let's start off simple with a CSV file that Tinybird has posted online for you. ### Add the new Data Source¶ In your Workspace, find the Data Sources section at the bottom of the left side navigation. **Select the Plus (+) icon** to add a new Data Source (see Mark 1 below). <-figure-> ![image](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fquickstart-creating-data-source-1.png&w=3840&q=75) In the modal that opens, **select the Remote URL connector** (see Mark 1 below). <-figure-> ![image](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fquickstart-remote-url-1.png&w=3840&q=75) In the next window, ensure that `csv` is selected (see Mark 1 below), and then **paste the following URL into the text box** (see Mark 2 below). https://storage.googleapis.com/tinybird-assets/datasets/guides/events_50M_1.csv Finally, **select the Add button** to finish (see Mark 3 below). <-figure-> ![image](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fquickstart-remote-url-2.png&w=3840&q=75) On the next screen, give the Data Source a name and description (see Mark 1 below). Tinybird also shows you a preview of the schema and data (see Mark 2 below). Change the name to something more descriptive. **Select the name field** and enter the name `shopping_data`. 
<-figure-> ![image](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fquickstart-schema-preview-1.png&w=3840&q=75) ### Start the data import¶ After setting the name of your first Data Source, **select Create Data Source** to start importing the data (see Mark 1 below). <-figure-> ![image](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fquickstart-complete-datasource-1.png&w=3840&q=75) Your Data Source has now been created in your Workspace. If it doesn't show up, refresh the window. Loading the data does not take very long, and you do not need to wait for it to finish. Your Workspace should look like this: <-figure-> ![image](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fquickstart-review-datasource-1.png&w=3840&q=75) OK: You've **ingested** your data. Now, you can move on to creating your first Pipe! ## 2. Create a Pipe¶ In Tinybird, SQL queries are written inside *Pipes* . One Pipe can be made up of many individual SQL queries called *Nodes* ; each Node is a single SQL SELECT statement. A Node can query the output of another Node in the same Pipe. This means that you can break large queries down into a multiple smaller, more modular, queries and chain them together - making it much easier to build and maintain in the future. ### Add the new Pipe¶ Add a new Pipe by **selecting the Plus (+) icon** next to the Pipes category in the left side navigation (see Mark 1 below). <-figure-> ![image](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fquickstart-create-pipe-1.png&w=3840&q=75) This adds a new Pipe with an auto-generated default name. Just like with a Data Source, you can select the name and description to change it (see Mark 1 below). Call this Pipe `top_10_searched_products`. <-figure-> ![image](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fquickstart-rename-pipe-1.png&w=3840&q=75) ### Filter the data¶ At the top of your new Pipe is the first Node, which is pre-populated with a simple SELECT over the data in your Data Source (see Mark 1 below). Before you start modifying the query in the Node, **select the Run button** (see Mark 2 below). <-figure-> ![image](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fquickstart-first-node-1.png&w=3840&q=75) Hitting Run executes the query in the Node, and shows us a preview of the query result (see Mark 1 below). You can individually execute any Node in your Pipe to see the result, so it is a great way to check that your queries are generating the output you expect. <-figure-> ![image](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fquickstart-first-pipe-run-1.png&w=3840&q=75) Now, you can modify the query to do something more interesting. In this Pipe, you want to create a list of the top 10 most searched products. If you take a look at the data, you will notice an `event` column, which describes what kind of event happened. This column has various values, including `view`, `search` , and `buy` . You are only interested in rows where the `event` is `search` , so modify the query to filter the rows. Replace the Node SQL with the following query: SELECT * FROM shopping_data WHERE event == 'search' **Select Run** again when you are done updating the Node. This Node is now applying a filter to the data, so you only see the rows of interest. 
<-figure-> ![image](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fquickstart-first-node-query-1.png&w=3840&q=75) At the top of the Node, you will notice that it has been named after the Pipe. Just as before, you can **rename the Node** and give it a description, to help us remember what the Node is doing (see Mark 1 below). Call this Node `search_events`. <-figure-> ![image](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fquickstart-rename-first-node-1.png&w=3840&q=75) ### Aggregate the data¶ Next, you want to work out how many times each individual product has been searched for. This means you are going to want to do a count and aggregate by the *product id* . To keep your queries more simple, create a second Node to do this aggregation. Scroll down a little, and you will see a second Node is suggested at the bottom of the page (see Mark 1 below). <-figure-> ![image](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fquickstart-find-second-node-1.png&w=3840&q=75) The really cool thing here is that this second Node can query the result of the `search_events` Node you just created. This means you do not need to duplicate the WHERE filter in this next query, as you are already querying the filtered rows in the previous Node. Use the following query for the next Node: SELECT product_id, count() as total FROM search_events GROUP BY product_id ORDER BY total DESC **Select Run** again to see the results of the query. Do not forget to **rename this Node** ... Your future self will thank you! Call it `aggregate_by_product_id`. <-figure-> ![image](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fquickstart-second-node-query-1.png&w=3840&q=75) ### Transform the result¶ Finally, create the last Node that you'll use to publish as a *Tinybird API Endpoint* and limit the results to the top 10 products. Create a third Node and use the following query: SELECT product_id, total FROM aggregate_by_product_id LIMIT 10 Just as before, you could theoretically name this Node whatever you want. However, as you'll publish it as an API Endpoint, follow the common convention to name this Node `endpoint` . When you get to the "Publish API Endpoint" step next, all your Nodes are listed and the `endpoint` name makes it clear which one to select. **Select Run** to preview the results. <-figure-> ![image](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fquickstart-endpoint-node-1.png&w=3840&q=75) With that, you've built the logic required to **query your ingested data** and return the top 10 most searched products. Nice! Top tip: Now you've got some Data Sources and Pipes, press `⌘+K` or `CTRL+K` at any time in the Tinybird UI to open the Command Bar and view all those Workspace resources. No matter how many you build out in the future, you'll always have a quick way to navigate to them! ## 3. Publish and use your API Endpoint¶ Now, let's say you have an application that wants to be able to access and use this top 10 result. The magic of Tinybird is that you can choose any query and instantly publish the results as a RESTful API Endpoint. Applications can simply hit the API Endpoint and get the very latest results. ### Publish the API Endpoint¶ At this point you encounter a core Tinybird object, the *API Endpoint* . Tinybird API Endpoints are published directly from selected Nodes. API Endpoints come with an extensive feature set, including support for dynamic query parameters and auto-generated docs complete with code samples. 
To publish a Node as an API Endpoint, **select the Create API Endpoint button** in the top right corner of the Workspace (see Mark 1 below). Then, select the **Create API Endpoint** option. You will then see a list of your Nodes - **select the endpoint Node** (see Mark 2 below). <-figure-> ![image](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fquickstart-publish-endpoint-1.png&w=3840&q=75) And now you have your very first API Endpoint! This page shows all the details about your new API Endpoint including observability charts, links to auto-generated API Endpoint documentation, and code snippets to help you integrate it with your application. ### Test the API Endpoint¶ On your API overview page, scroll down to the Sample usage section, and **copy the HTTP URL** from the snippet box. Open this URL in a new tab in your browser. Hitting the API Endpoint triggers your Pipe to execute, and you get a JSON formatted response with the results. <-figure-> ![image](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fquickstart-view-api-browser-1.png&w=3840&q=75) ## 4. Build Charts showing your data¶ On your API overview page, select **Create Chart** (using the dropdown, you can also share the docs for this API Endpoint, too - see Mark 1 below): <-figure-> ![image](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fquickstart-publish-build-charts-1.png&w=3840&q=75) You can also build Charts to embed in your own application. See the [Charts documentation](https://www.tinybird.co/docs/docs/publish/charts) for more. ## Celebrate¶ Congrats! You have finished creating your first API Endpoint in Tinybird! While working through these steps, you have accomplished a lot in just a few minutes. You have imported 50 million events, built a variety of queries with latencies measured in milliseconds, and stood up an API Endpoint that can serve thousands of concurrent requests. ## Next steps¶ ### Import and build with your own data¶ Ready to start building with your own data? If you have a variety of data stores and sources, Tinybird is a great platform to unify them. Tinybird provides built-in connectors to easily ingest data from Kafka, Confluent Cloud, Big Query, Amazon S3, and Snowflake. If you want to stream data over HTTP, you can send data directly to Tinybird's [Events API](https://www.tinybird.co/docs/docs/ingest/events-api) with no additional infrastructure. ### Using version control¶ This walkthrough covered the basics of Tinybird without integrating version control into the process. You're ready to take it to the next level! Head over to the [version control overview](https://www.tinybird.co/docs/docs/production/overview). ### Build Charts¶ Tinybird Charts make it easy to create interactive, customizable charts of your data. Read more about them in the [Charts docs](https://www.tinybird.co/docs/docs/publish/charts). --- URL: https://www.tinybird.co/docs/guides/tutorials/user-facing-web-analytics Last update: 2024-11-13T15:42:17.000Z Content: --- title: "Build a user-facing web analytics dashboard · Tinybird Docs" theme-color: "#171612" description: "Learn how to build a user-facing web analytics dashboard using Tinybird for real-time, user-facing analytics." --- # Build a user-facing web analytics dashboard¶ In this guide you'll learn how to build a user-facing web analytics dashboard. You'll use Tinybird to capture web clickstream events, process the data in real-time, and expose metrics as APIs. You'll then deploy a Next.js app to visualize your metrics. 
[GitHub Repository](https://github.com/tinybirdco/demo-user-facing-web-analytics) <-figure-> ![A user-facing web analytics dashboard built with Tinybird and Next.js](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Ftutorial-user-facing-web-analytics-screenshot.png&w=3840&q=75) In this tutorial, you will learn how to: 1. Stream unstructured events data to Tinybird with the[ Events API](https://www.tinybird.co/docs/docs/ingest/events-api) . 2. Parse those events with a global SQL Node that you can reuse in all your subsequent[ Tinybird Pipes](https://www.tinybird.co/docs/docs/concepts/pipes) . 3. Build performant queries to calculate user-facing analytics metrics. 4. Optimize query performance with[ Materialized Views](https://www.tinybird.co/docs/docs/publish/materialized-views/overview) . 5. Publish your metrics as[ API Endpoints](https://www.tinybird.co/docs/docs/publish/api-endpoints/overview) and integrate them into a user-facing Next.js app. ## Prerequisites¶ To complete this tutorial, you'll need the following: 1. A[ free Tinybird account](https://www.tinybird.co/signup) 2. An empty Tinybird Workspace 3. Node.js >=20.11 4. Python >=3.8 This tutorial includes a [Next.js](https://nextjs.org/) app for frontend visualization. For more information about how the Next.js app is designed and deployed, read the [repository README](https://github.com/tinybirdco/demo-user-facing-web-analytics/tree/main/app/README.md). The steps in this tutorial are completed using the Tinybird Command Line Interface (CLI). If you're not familiar with it, [read the CLI docs](https://www.tinybird.co/docs/docs/cli/install) or just give it a go! You can copy and paste every code snippet and command in this tutorial. ## 1. Create a Tinybird Data Source to store your events¶ First, you need to create a Tinybird Data Source to store your web clickstream events. Create a new directory called `tinybird` in your project folder and install the Tinybird CLI: ##### Install the Tinybird CLI mkdir tinybird cd tinybird python -m venv .venv source .venv/bin/activate pip install tinybird-cli [Copy the user admin Token](https://www.tinybird.co/docs/docs/concepts/auth-tokens) and authenticate the CLI: ##### Authenticate the Tinybird CLI tb auth --token Initialize an empty Tinybird project and navigate to the `/datasources` directory, then create a new file called `analytics_events.datasource`: ##### Create a Data Source tb init cd datasources touch analytics_events.datasource Open the file in your preferred code editor and paste the following contents: ##### analytics\_events.datasource DESCRIPTION > Analytics events landing data source SCHEMA > `timestamp` DateTime `json:$.timestamp`, `session_id` String `json:$.session_id`, `action` LowCardinality(String) `json:$.action`, `version` LowCardinality(String) `json:$.version`, `payload` String `json:$.payload` ENGINE MergeTree ENGINE_PARTITION_KEY toYYYYMM(timestamp) ENGINE_SORTING_KEY timestamp ENGINE_TTL timestamp + toIntervalDay(60) If you pass a non-existent Data Source name to the Events API (as you're about to do), Tinybird automatically creates a new Data Source of that name with an [inferred schema](https://www.tinybird.co/docs/docs/ingest/overview#get-started) . By creating the Data Source ahead of time in this file, you have more control over the schema definition, including column types and sorting keys. For more information about creating Tinybird Data Sources, read the [Data Source docs](https://www.tinybird.co/docs/docs/concepts/data-sources). 
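For context, the rows that land in this Data Source are just JSON events posted to the Tinybird Events API, usually from a small tracking snippet on your site. As a rough illustration of how a single `page_hit` event maps onto the schema above, here's a hedged browser-side sketch (the Web Analytics Starter Kit ships a more complete tracker; the token placeholder and helper name here are illustrative):

##### Sketch: sending one page_hit event from the browser

// Illustrative sketch only; field names follow the analytics_events schema above.
const TINYBIRD_HOST = "https://api.tinybird.co"; // use your Workspace's regional API host
const TINYBIRD_TRACKER_TOKEN = "<a Token with append scope>"; // illustrative placeholder

function trackPageHit() {
  const event = {
    timestamp: new Date().toISOString().replace("T", " ").substring(0, 19), // "YYYY-MM-DD HH:MM:SS"
    session_id: crypto.randomUUID(), // a real tracker would reuse one id per session
    action: "page_hit",
    version: "1",
    // Anything page-specific stays inside the unstructured payload column
    payload: JSON.stringify({
      "user-agent": navigator.userAgent,
      locale: navigator.language,
      referrer: document.referrer,
      pathname: window.location.pathname,
      href: window.location.href,
    }),
  };
  return fetch(
    `${TINYBIRD_HOST}/v0/events?name=analytics_events&token=${TINYBIRD_TRACKER_TOKEN}`,
    { method: "POST", body: JSON.stringify(event) } // one event per JSON line
  );
}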
In the `/tinybird` directory, save and push the file to Tinybird: ##### Push the Data Source to Tinybird cd .. tb push datasources/analytics_events.datasource Confirm that you have a new Data Source: tb datasource ls You should see `analytics_events` in the result. Congrats, you have a Tinybird Data Source! ## 2. Stream mock data to your Data Source¶ This tutorial uses [Mockingbird](https://mockingbird.tinybird.co/docs) , an open source mock data stream generator, to stream mock web clickstream events to your Data Source. Mockingbird generates a JSON payload based on a predefined schema and posts it to the [Tinybird Events API](https://www.tinybird.co/docs/docs/ingest/events-api) , which then writes the data to your Data Source. You can explore the [Mockingbird web UI](https://mockingbird.tinybird.co/) , or follow the steps below to complete the same actions using the Mockingbird CLI. In a separate terminal window (outside your current virtual environment) install the Mockingbird CLI: ##### Install Mockingbird npm install -g @tinybirdco/mockingbird-cli Run the following command to stream 50,000 mock web clickstream events to your `analytics_events` Data Source at 50 events per second via the Events API. This command uses the predefined "Web Analytics Starter Kit" Mockingbird template schema to generate mock web clickstream events. Copy your User Admin Token to the clipboard with `tb token copy dashboard` , and use it in the following command (and change the `endpoint` argument depending on your Workspace region if required): ##### Stream to Tinybird with a template mockingbird-cli tinybird --template "Web Analytics Starter Kit" --eps 50 --limit 50000 --datasource analytics_events --token --endpoint gcp_europe_west3 Confirm that events are being written to the `analytics_events` Data Source by running the following command a few times: tb sql 'select count() from analytics_events' You should see the count incrementing up by 50 every second or so. Congratulations, you're ready to start processing your events data! ## 3. Parse the raw JSON events¶ The `analytics_events` Data Source has a `payload` column which stores a string of JSON data. To begin building your analytics metrics, you need to parse this JSON data using a Tinybird [Pipe](https://www.tinybird.co/docs/docs/concepts/pipes). When you're dealing with unstructured data that's likely to change in the future, it's a wise design pattern to retain the unstructured data as a JSON string in a single column. This gives you flexibility to change your upstream producers without breaking ingestion. You can then parse (and materialize) this data downstream. Navigate to the `/pipes` directory and create a new file called `analytics_hits.pipe`: ##### Create a Pipe touch analytics_hits.pipe Open the file and paste the following contents: ##### analytics\_hits.pipe DESCRIPTION > Parsed `page_hit` events, implementing `browser` and `device` detection logic. 
TOKEN "dashboard" READ NODE parsed_hits DESCRIPTION > Parse raw page_hit events SQL > SELECT timestamp, action, version, coalesce(session_id, '0') as session_id, JSONExtractString(payload, 'locale') as locale, JSONExtractString(payload, 'location') as location, JSONExtractString(payload, 'referrer') as referrer, JSONExtractString(payload, 'pathname') as pathname, JSONExtractString(payload, 'href') as href, lower(JSONExtractString(payload, 'user-agent')) as user_agent FROM analytics_events where action = 'page_hit' NODE endpoint SQL > SELECT timestamp, action, version, session_id, location, referrer, pathname, href, case when match(user_agent, 'wget|ahrefsbot|curl|urllib|bitdiscovery|\+https://|googlebot') then 'bot' when match(user_agent, 'android') then 'mobile-android' when match(user_agent, 'ipad|iphone|ipod') then 'mobile-ios' else 'desktop' END as device, case when match(user_agent, 'firefox') then 'firefox' when match(user_agent, 'chrome|crios') then 'chrome' when match(user_agent, 'opera') then 'opera' when match(user_agent, 'msie|trident') then 'ie' when match(user_agent, 'iphone|ipad|safari') then 'safari' else 'Unknown' END as browser FROM parsed_hits This Pipe contains two [Nodes](https://www.tinybird.co/docs/docs/concepts/pipes#nodes) . The first Node, called `parsed_hits` , extracts relevant information from the JSON `payload` using the `JSONExtractString()` ClickHouse® function and filters to only include `page_hit` actions. The second Node, called `endpoint` , selects from the `parsed_hits` Node and further parses the `user_agent` to get the `device` and `browser` for each event. Additionally, this code gives the Pipe a description, and creates a [Token](https://www.tinybird.co/docs/docs/concepts/auth-tokens) called `dashboard` with `READ` scope for this Pipe. Navigate back up to the `/tinybird` directory and push the Pipe to Tinybird: ##### Push the Pipe to Tinybird tb push pipes/analytics_hits.pipe When you push a Pipe file, Tinybird automatically publishes the last Node as an API Endpoint unless you specify the Pipe as something else (more on that below), so it's best practice to call your final Node "endpoint". You can unpublish an API Endpoint at any time using `tb pipe unpublish `. You now have a public REST API that returns the results of the `analytics_hits` Pipe! Get your Admin Token again with `tb token copy dashboard` and test your API with the command: curl "https://api.tinybird.co/v0/pipes/analytics_hits.json?token=" Replace the Tinybird API hostname/region with the [right API URL region](https://www.tinybird.co/docs/docs/api-reference/overview#regions-and-endpoints) that matches your Workspace. Your Token lives in the Workspace under "Tokens". 
You should see a JSON response that looks something like this: ##### Example API response { "meta": [ { "name": "timestamp", "type": "DateTime" }, { "name": "action", "type": "LowCardinality(String)" }, { "name": "version", "type": "LowCardinality(String)" }, { "name": "session_id", "type": "String" }, { "name": "location", "type": "String" }, { "name": "referrer", "type": "String" }, { "name": "pathname", "type": "String" }, { "name": "href", "type": "String" }, { "name": "device", "type": "String" }, { "name": "browser", "type": "String" } ], "data": [ { "timestamp": "2024-04-24 18:24:21", "action": "page_hit", "version": "1", "session_id": "713355c6-6b98-4c7a-82a9-e19a7ace81fe", "location": "", "referrer": "https:\/\/www.kike.io", "pathname": "\/blog-posts\/data-market-whitebox-replaces-4-data-stack-tools-with-tinybird", "href": "https:\/\/www.tinybird.co\/blog-posts\/data-market-whitebox-replaces-4-data-stack-tools-with-tinybird", "device": "bot", "browser": "chrome" }, ... ] "rows": 150, "statistics": { "elapsed": 0.006203411, "rows_read": 150, "bytes_read": 53609 } } In Tinybird, you can `SELECT FROM` any API Endpoint published in your Workspace in any Pipe. Tinybird won't call the API Endpoint directly, rather it will treat the Endpoint as additional Node(s) and construct a final query. In this tutorial, you'll query the `analytics_hits` API Endpoint in subsequent Pipes. ## 4. Calculate aggregates for pageviews, sessions, and sources¶ Next, you'll create three [Materialized Views](https://www.tinybird.co/docs/docs/publish/materialized-views/overview) to store aggregates for the following: 1. pageviews 2. sessions 3. sources Later on, you'll query from the Materialized Views that you're creating here. From the `/datasources` directory in the Tinybird project, create three new Data Source files: touch analytics_pages_mv.datasource analytics_sessions_mv.datasource analytics_sources_mv.datasource Open the `analytics_pages_mv.datasource` file and paste in the following contents: ##### analytics\_pages\_mv.datasource SCHEMA > `date` Date, `device` String, `browser` String, `location` String, `pathname` String, `visits` AggregateFunction(uniq, String), `hits` AggregateFunction(count) ENGINE AggregatingMergeTree ENGINE_PARTITION_KEY toYYYYMM(date) ENGINE_SORTING_KEY date, device, browser, location, pathname Do the same for `analytics_sessions_mv.datasource` and `analytics_sources_mv.datasource` , copying the code from the [GitHub repository](https://github.com/tinybirdco/demo-user-facing-web-analytics/tree/main/tinybird/datasources) for this tutorial. These Materialized View Data Sources use an [AggregatingMergeTree](https://clickhouse.com/docs/en/engines/table-engines/mergetree-family/aggregatingmergetree) Engine. If you're new to Tinybird, don't worry about this for now. To learn more about Table Engines in ClickHouse, [read this](https://clickhouse.com/docs/en/engines/table-engines). Next, you'll create three Pipes that calculate the aggregates and store the data in the Materialized View Data Sources you just created. 
From the `/pipes` directory, create three new Pipe files: touch analytics_pages.pipe analytics_sessions.pipe analytics_sources.pipe Open `analytics_pages.pipe` and paste the following: ##### analytics\_pages.pipe NODE analytics_pages_1 DESCRIPTION > Aggregate by pathname and calculate session and hits SQL > SELECT toDate(timestamp) AS date, device, browser, location, pathname, uniqState(session_id) AS visits, countState() AS hits FROM analytics_hits GROUP BY date, device, browser, location, pathname TYPE MATERIALIZED DATASOURCE analytics_pages_mv This code calculates aggregates for pageviews, and designates the Pipe as a Materialized View with `analytics_pages_mv` as the target Data Source. Note that you're using the `-State` modifier on your aggregate functions in this Pipe. Tinybird stores intermediate aggregate states in the Materialized View, which you will merge at query time. For more information on how this process works, [read this guide on Materialized Views](https://www.tinybird.co/docs/docs/publish/materialized-views/best-practices). Do this for the remaining two Pipes, copying the code from the [GitHub repository](https://github.com/tinybirdco/demo-user-facing-web-analytics/tree/main/tinybird/pipes). Back in the `/tinybird` directory, push these new Pipes and Data Sources to Tinybird. This populates the Materialized Views with your Mockingbird data: ##### Push to Tinybird tb push pipes --push-deps --populate Now, as new events arrive in the `analytics_events` Data Source, these Pipes will process the data and update the aggregate states in your Materialized Views as new data arrives. ## 6. Generate session count trend for the last 30 minutes¶ In this step, you'll use Tinybird Pipes to calculate user-facing analytics metrics and publish them as REST APIs. The first Pipe you create, called `trend` , will calculate the number of sessions over the last 30 minutes, grouped by 1 minute intervals. From the `/pipes` directory, create a file called `trend.pipe`: ##### Create trend.pipe touch trend.pipe Open this file and paste the following: ##### trend.pipe DESCRIPTION > Visits trend over time for the last 30 minutes, filling in the blanks. TOKEN "dashboard" READ NODE timeseries DESCRIPTION > Generate a timeseries for the last 30 minutes, so we call fill empty data points SQL > with (now() - interval 30 minute) as start select addMinutes(toStartOfMinute(start), number) as t from (select arrayJoin(range(1, 31)) as number) NODE hits DESCRIPTION > Get last 30 minutes metrics grouped by minute SQL > select toStartOfMinute(timestamp) as t, uniq(session_id) as visits from analytics_hits where timestamp >= (now() - interval 30 minute) group by toStartOfMinute(timestamp) order by toStartOfMinute(timestamp) NODE endpoint DESCRIPTION > Join and generate timeseries with metrics for the last 30 minutes SQL > select a.t, b.visits from timeseries a left join hits b on a.t = b.t order by a.t This Pipe contains three Nodes: 1. The first Node, called `timeseries` , generates a simple result set with 1-minute intervals for the last 30 minutes. 2. The second Node, called `hits` , calculates total sessions over the last 30 minutes, grouped by 1-minute intervals. 3. The third Node, called `endpoint` , performs a left join between the first two Nodes, retaining all of the 1-minute intervals from the `timeseries` Node. ## 7. Calculate the top pages visited¶ Next, you'll create a Pipe called `top_pages` to calculate a sorted list of the top pages visited over a specified time range. 
This Pipe will query the `analytics_pages_mv` Data Source you created in the prior steps, and it will use Tinybird's templating language to define [query parameters](https://www.tinybird.co/docs/docs/query/query-parameters) that you can use to dynamically select a time range and implement pagination in the response. From the `/pipes` directory, create the `top_pages.pipe` file: ##### Create top\_pages.pipe touch top_pages.pipe Open the file and paste the following: DESCRIPTION > Most visited pages for a given period. Accepts `date_from` and `date_to` date filter. Defaults to last 7 days. Also `skip` and `limit` parameters for pagination. TOKEN "dashboard" READ NODE endpoint DESCRIPTION > Group by pagepath and calculate hits and visits SQL > % select pathname, uniqMerge(visits) as visits, countMerge(hits) as hits from analytics_pages_mv where {% if defined(date_from) %} date >= {{ Date(date_from, description="Starting day for filtering a date range", required=False) }} {% else %} date >= timestampAdd(today(), interval -7 day) {% end %} {% if defined(date_to) %} and date <= {{ Date(date_to, description="Finishing day for filtering a date range", required=False) }} {% else %} and date <= today() {% end %} group by pathname order by visits desc limit {{ Int32(skip, 0) }},{{ Int32(limit, 50) }} Note the use of the `-Merge` modifiers on the end of the aggregate function. This modifier is used to perform a final merge on the aggregate states in the Materialized View. [Read this Guide](https://www.tinybird.co/docs/docs/publish/materialized-views/best-practices) for more details. ## 8. Create the remaining API Endpoints¶ In the [GitHub repository](https://github.com/tinybirdco/demo-user-facing-web-analytics/tree/main/tinybird/pipes) , you'll find five additional Pipe files that calculate other various user-facing metrics: - `kpis.pipe` - `top_browsers.pipe` - `top_devices.pipe` - `top_locations.pipe` - `top_sources.pipe` Create those into your `/pipes` directory: touch kpis.pipe top_browsers.pipe top_devices.pipe top_locations.pipe top_sources.pipe And copy the file contents from the GitHub examples into your files. Finally, in the `/tinybird` directory, push all these new Pipes to Tinybird: tb push pipes Congrats! You now have seven API Endpoints that you will integrate into your Next.js app to provide data to your dashboard components. ## 9. Deploy the Next.js app¶ You can easily deploy the accompanying Next.js app to Vercel by clicking in this button (you'll need a Vercel account): [Deploy with Vercel](https://vercel.com/new/clone?repository-url=https%253A%252F%252Fgithub.com%252Ftinybirdco%252Fdemo-user-facing-web-analytics%252Ftree%252Fmain%252Fapp&env=NEXT_PUBLIC_TINYBIRD_AUTH_TOKEN,NEXT_PUBLIC_TINYBIRD_HOST,NEXT_PUBLIC_BASE_URL&envDescription=Tinybird%2520configuration&project-name=user-facing-web-analytics&repository-name=user-facing-web-analytics) First, select the Git provider where you'll clone the Git repository: ![Select Git provider](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Ftutorial-user-facing-web-analytics-deploy-1.png&w=3840&q=75) Next, set the following environment variables: - `NEXT_PUBLIC_TINYBIRD_AUTH_TOKEN` : your[ Tinybird Token](https://www.tinybird.co/docs/docs/concepts/auth-tokens) - `NEXT_PUBLIC_TINYBIRD_HOST` : your Tinybird Region (e.g. `https://api.tinybird.co` ) - `NEXT_PUBLIC_BASE_URL` : The URL where you will publish your app (e.g. 
`https://my-analytics.com` ) ![Set env variables](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Ftutorial-user-facing-web-analytics-deploy-2.png&w=3840&q=75) Click "Deploy" and you're done! Explore your dashboard and have a think about how you'd like to adapt or extend it in the future. ## Next steps¶ - Understand today's real-time analytics landscape with[ Tinybird's definitive guide](https://www.tinybird.co/blog-posts/real-time-analytics-a-definitive-guide) . - Learn how to implement[ multi-tenant security](https://www.tinybird.co/blog-posts/multi-tenant-saas-options) in your user-facing analytics. Tinybird is not affiliated with, associated with, or sponsored by ClickHouse, Inc. ClickHouse® is a registered trademark of ClickHouse, Inc. --- URL: https://www.tinybird.co/docs/guides/tutorials/vector-search-recommendation Last update: 2024-10-28T11:06:14.000Z Content: --- title: "Build a content recommendation API using vector search · Tinybird Docs" theme-color: "#171612" description: "Learn how to compute embeddings in Python and use vector search SQL functions in Tinybird to build a content recommendation API." --- # Build a content recommendation API using vector search¶ In this guide you'll learn how to calculate vector embeddings using HuggingFace models and use Tinybird to perform vector search to find similar content based on vector distances [GitHub Repository](https://github.com/tinybirdco/demo_vector_search_recommendation/tree/main) <-figure-> ![Tinybird blog related posts uses vector search recommendation algorithm.](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Ftutorial-vector-search-recommendation-1.png&w=3840&q=75) In this tutorial, you will learn how to: 1. Use Python to fetch content from an RSS feed 2. Calculate vector embeddings on long form content (blog posts) using SentenceTransformers in Python 3. Post vector embeddings to a Tinybird Data Source using the Tinybird Events API 4. Write a dynamic SQL query to calculate the closest content matches to a given blog post based on vector distances 5. Publish your query as an API and integrate it into a frontend application ## Prerequisites¶ To complete this tutorial, you'll need: 1. A[ free Tinybird account](https://www.tinybird.co/signup) 2. An empty Tinybird Workspace 3. Python >= 3.8 This tutorial does not include a frontend, but we provide an example snippet below on how you might integrate the published API into a React frontend. ## 1. Setup¶ Clone the `demo_vector_search_recommendation` repo . We'll use the repository as reference throughout this tutorial. Authenticate the Tinybird CLI using your user admin token from your Tinybird Workspace: cd tinybird tb auth --token $USER_ADMIN_TOKEN ## 2. Fetch content and calculate embeddings¶ In this tutorial we fetch blog posts from the Tinybird Blog using the [Tinybird Blog RSS feed](https://www.tinybird.co/blog-posts/rss.xml) . You can use any `rss.xml` feed to fetch blog posts and calculate embeddings from their content. You can fetch and parse the RSS feed using the `feedparser` library in Python, get a list of posts, and then fetch each post and parse the content with the `BeautifulSoup` library. Once you've fetched each post, you can calculate an embedding using the HuggingFace `sentence_transformers` library. In this demo, we use the `all-MiniLM-L6-v2` model, which maps sentences & paragraphs to a 384 dimensional dense vector space. You can browse other models [here](https://huggingface.co/models). 
You can achieve this and the following step (fetch posts, calculate embeddings, and send them to Tinybird) by running `load.py` from the code repository. We walk through what that script does below so you can understand how it works. from bs4 import BeautifulSoup from sentence_transformers import SentenceTransformer import datetime import feedparser import requests import json timestamp = datetime.datetime.now().isoformat() url = "https://www.tinybird.co/blog-posts/rss.xml" # Update to your preferred RSS feed feed = feedparser.parse(url) model = SentenceTransformer("all-MiniLM-L6-v2") posts = [] for entry in feed.entries: doc = BeautifulSoup(requests.get(entry.link).content, features="html.parser") if (content := doc.find(id="content")): embedding = model.encode([content.get_text()]) posts.append(json.dumps({ "timestamp": timestamp, "title": entry.title, "url": entry.link, "embedding": embedding.mean(axis=0).tolist() })) ## 3. Post content metadata and embeddings to Tinybird¶ Once you've calculated the embeddings, you can push them along with the content metadata to Tinybird using the [Events API](https://www.tinybird.co/docs/docs/ingest/events-api). First, set up some environment variables for your Tinybird host and a token with `DATASOURCES:WRITE` scope (these names match the script below): export TB_HOST=your_tinybird_host export TB_APPEND_TOKEN=your_tinybird_append_token Next, you'll need to set up a Tinybird Data Source to receive your data. Note that if the Events API doesn't find a Tinybird Data Source by the supplied name, it will create one. But since we want control over our schema, we're going to create an empty Data Source first. In the `tinybird/datasources` folder of the repository, you'll find a `posts.datasource` file that looks like this: SCHEMA > `timestamp` DateTime `json:$.timestamp`, `title` String `json:$.title`, `url` String `json:$.url`, `embedding` Array(Float32) `json:$.embedding[:]` ENGINE ReplacingMergeTree ENGINE_PARTITION_KEY "" ENGINE_SORTING_KEY title, url ENGINE_VER timestamp This Data Source will receive the updated post metadata and calculated embeddings and deduplicate based on the most up-to-date retrieval. The `ReplacingMergeTree` is used to deduplicate, relying on the `ENGINE_VER` setting, which in this case is set to the `timestamp` column. This tells the engine that the versioning of each entry is based on the `timestamp` column, and only the entry with the latest timestamp will be kept in the Data Source. The Data Source has the `title` column as its primary sorting key, because we will be filtering by title to retrieve the embedding for the current post. Having `title` as the primary sorting key makes that filter more performant. Push this Data Source to Tinybird: cd tinybird tb push datasources/posts.datasource Then, you can use a Python script to push the post metadata and embeddings to the Data Source using the Events API: import os import requests TB_APPEND_TOKEN=os.getenv("TB_APPEND_TOKEN") TB_HOST=os.getenv("TB_HOST") def send_posts(posts): params = { "name": "posts", "token": TB_APPEND_TOKEN } data = "\n".join(posts) # ndjson r = requests.post(f"{TB_HOST}/v0/events", params=params, data=data) print(r.status_code) send_posts(posts) To keep embeddings up to date, you should retrieve new content on a schedule and push it to Tinybird.
In the repository, you'll find a GitHub Action called [tinybird_recommendations.yml](https://github.com/tinybirdco/demo_vector_search_recommendation/blob/main/.github/workflows/tinybird_recommendations.yml) that fetches new content from the Tinybird blog every 12 hours and pushes it to Tinybird. The Tinybird Data Source in this project uses a ReplacingMergeTree to deduplicate blog post metadata and embeddings as new data arrives.

## 4. Calculate distances in SQL using Tinybird Pipes¶

If you've completed the steps above, you should have a `posts` Data Source in your Tinybird Workspace containing the last fetched timestamp, title, url, and embedding for each blog post fetched from your RSS feed. You can verify that you have data from the Tinybird CLI with:

tb sql 'SELECT * FROM posts'

This tutorial includes a single-node SQL Pipe to calculate the vector distance of each post to a specific post supplied as a query parameter. The Pipe config is contained in the `similar_posts.pipe` file in the `tinybird/pipes` folder, and the SQL is copied below for reference and explanation.

%
WITH (
    SELECT embedding FROM (
        SELECT 0 AS id, embedding
        FROM posts
        WHERE title = {{ String(title) }}
        ORDER BY timestamp DESC
        LIMIT 1
        UNION ALL
        SELECT 999 AS id, arrayWithConstant(384, 0.0) embedding
    )
    ORDER BY id
    LIMIT 1
) AS post_embedding
SELECT title, url, L2Distance(embedding, post_embedding) similarity
FROM posts FINAL
WHERE title <> {{ String(title) }}
ORDER BY similarity ASC
LIMIT 10

This query first fetches the embedding of the requested post, and returns an array of 0s in the event an embedding can't be fetched. It then calculates the Euclidean vector distance between each additional post and the specified post using the `L2Distance()` function, sorts them by ascending distance, and limits to the top 10 results.

You can push this Pipe to your Tinybird server with:

cd tinybird
tb push pipes/similar_posts.pipe

When you push it, Tinybird will automatically publish it as a scalable, dynamic REST API Endpoint that accepts a `title` query parameter.

You can test your API Endpoint with cURL. First, create an environment variable with a token that has `PIPES:READ` scope for your Pipe. You can get this token from your Workspace UI or in the CLI with `tb token` [commands](https://www.tinybird.co/docs/docs/cli/command-ref#tb-token).

export TB_READ_TOKEN=your_read_token

Then request your Endpoint:

curl --compressed -H "Authorization: Bearer $TB_READ_TOKEN" https://api.tinybird.co/v0/pipes/similar_posts.json?title='Some blog post title'

You will get a JSON object containing the 10 most similar posts to the post whose title you supplied in the request.

## 5. Integrate into the frontend¶

Integrating your vector search API into the frontend is relatively straightforward, as it's just a RESTful Endpoint. Here's an example implementation (pulled from the actual code used to fetch related posts in the Tinybird Blog):

export async function getRelatedPosts(title: string) {
  const recommendationsUrl = `${host}/v0/pipes/similar_posts.json?token=${token}&title=${title}`;
  const recommendationsResponse = await fetch(recommendationsUrl).then(
    function (response) {
      return response.json();
    }
  );
  if (!recommendationsResponse.data) return;
  return Promise.all(
    recommendationsResponse.data.map(async ({ url }) => {
      const slug = url.split("/").pop();
      return await getPost(slug);
    })
  ).then((data) => data.filter(Boolean));
}

## 6. See it in action¶

You can see how this looks by checking out any blog post in the [Tinybird Blog](https://www.tinybird.co/blog).
At the bottom of each post, you'll find a Related Posts section that's powered by a real Tinybird API using the method described here! <-figure-> ![Tinybird blog related posts uses vector search recommendation algorithm.](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Ftutorial-vector-search-recommendation-1.png&w=3840&q=75) ## Next steps¶ - Read more about[ vector search](https://www.tinybird.co/docs/docs/use-cases/vector-search) and[ content recommendation](https://www.tinybird.co/docs/docs/use-cases/content-recommendation) use cases. - Join the[ Tinybird Slack Community](https://www.tinybird.co/community) for additional support. --- URL: https://www.tinybird.co/docs/index Last update: 2024-11-19T08:08:52.000Z Content: --- title: "Overview of Tinybird · Tinybird Docs" theme-color: "#171612" description: "Tinybird provides you with an easy way to ingest and query large amounts of data with low-latency, and automatically create APIs to consume those queries. This makes it extremely easy to build fast and scalable applications that query your data; no backend needed!" --- # Welcome to Tinybird¶ Tinybird is a data platform for data and engineering teams to solve complex real-time, operational, and user-facing analytics use cases at any scale. Tinybird makes it easy to import data from a variety of sources, use SQL to filter, aggregate, and join that data, and publish low-latency, high-concurrency RESTful API Endpoints. You can build and manage your Tinybird data projects as you would any software, using [version control](https://www.tinybird.co/docs/docs/production/overview). ## Create an account¶ Tinybird has a time-unlimited free tier, so you can start building today and learn at your own pace. [Create an account](https://www.tinybird.co/signup) using a Google, Microsoft, or GitHub account, or your email address. After you create your account, pick the cloud region that works best for you, then set your [Workspace](https://www.tinybird.co/docs/docs/concepts/workspaces) name. ## Try out your Workspace¶ The [Quick start](https://www.tinybird.co/docs/docs/quick-start) gets you building immediately. Learn the basics of ingesting data into Tinybird, writing SQL, and publishing APIs, starting from an empty Workspace and walking through building an example use case. You can also use one of the [ready-to-deploy Starter Kits](https://www.tinybird.co/docs/docs/starter-kits) . The template in each Starter Kit includes pre-built Data Sources, Pipes, and API Endpoints ready to serve data. ## Watch the videos¶ Watch the following videos to get familiar with Tinybird's user interface and CLI. - [ The Tinybird basics in 3 minutes (UI)](https://www.youtube.com/watch?v=cvay_LW685w) - [ Get started with the CLI](https://www.youtube.com/watch?v=OOEe84ly7Cs) - [ Ingest data from a file (UI vs CLI)](https://www.youtube.com/watch?v=1R0G1EolSEM) ## Next steps¶ - Browse the[ Use Case Hub](https://www.tinybird.co/docs/docs/use-cases) and see how to boost your project with real-time, user-facing analytics. - Learn the Tinybird[ core concepts](https://www.tinybird.co/docs/docs/core-concepts) . - Start building using the[ quick start](https://www.tinybird.co/docs/docs/quick-start) . - Read how to build a[ user-facing web analytics dashboard](https://www.tinybird.co/docs/docs/guides/tutorials/user-facing-web-analytics) . 
--- URL: https://www.tinybird.co/docs/ingest/bigquery Last update: 2024-11-12T11:45:41.000Z Content: --- title: "BigQuery Connector · Tinybird Docs" theme-color: "#171612" description: "Documentation for how to use the Tinybird BigQuery Connector" --- # BigQuery Connector¶ Use the BigQuery Connector to load data from BigQuery into Tinybird so that you can quickly turn it into high-concurrency, low-latency API Endpoints. You can load full tables or the result of an SQL query. The BigQuery Connector is fully managed and requires no additional tooling. You can define a sync schedule inside Tinybird and execution is taken care of for you. With the BigQuery Connector you can: - Connect to your BigQuery database with a handful of clicks. Select which tables to sync and set the schedule. - Use an SQL query to get the data you need from BigQuery and then run SQL queries on that data in Tinybird. - Use authentication tokens to control access to API endpoints. Implement access policies as you need, with support for row-level security. Check the [use case examples](https://github.com/tinybirdco/use-case-examples) repository for examples of BigQuery Data Sources iteration using Git integration. The BigQuery Connector can't access BigQuery external tables, like connected Google Sheets. If you need this functionality, reach out to [support@tinybird.co](https://www.tinybird.co/docs/mailto:support@tinybird.co). ## Prerequisites¶ - Tinybird CLI. See[ the Tinybird CLI quick start](https://www.tinybird.co/docs/docs/cli/quick-start) . - Tinybird CLI[ authenticated with the desired Workspace](https://www.tinybird.co/docs/docs/cli/install) . You can switch the Tinybird CLI to the correct Workspace using `tb workspace use `. To use version control, connect your Tinybird Workspace with [your repository](https://www.tinybird.co/docs/production/working-with-version-control#connect-your-workspace-to-git-from-the-cli) , and set the [CI/CD configuration](https://www.tinybird.co/docs/production/continuous-integration) . For testing purposes, use a different connection than in the main branches or Workspaces. For instance to create the connections in the main branch or Workspace using the CLI: tb auth # Use the main Workspace admin Token tb connection create bigquery # Prompts are interactive and ask you to insert the necessary information You can only create connections in the main Workspace. Even when creating the connection in the branch or as part of a Data Source creation flow, it's created in the main workspace and from there it's available for every branch. ## Load a BigQuery table¶ ### Load a BigQuery table in the UI¶ Open the [Tinybird UI](https://app.tinybird.co/) and add a new Data Source by clicking **Create new (+)** next to the Data Sources section on the left hand side navigation bar (see Mark 1 below). <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fguides-bigquery-connector-sync-first-table-ui-1.png&w=3840&q=75) In the modal, select the BigQuery option from the list of Data Sources. The next modal screen shows the **Connection details** . Follow the instructions and configure access to your BigQuery. Access the GCP IAM Dashboard by selecting the **IAM & Admin** link, and use the provided **principal** name from this modal. In the GCP IAM Dashboard, click the **Grant Access** button (see Mark 1 below). 
<-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fguides-bigquery-connector-sync-first-table-ui-4.png&w=3840&q=75) In the box that appears on the right-hand side, paste the **principal** name you just copied into the **New principals** box (see Mark 1 below). Next, in the **Role** box, find and select the role **BigQuery Data Viewer** (see Mark 2 below). <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fguides-bigquery-connector-sync-first-table-ui-5.png&w=3840&q=75) Click **Save** to complete. The principal should now be listed in the **View By Principals** list (see Mark 1 below). <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fguides-bigquery-connector-sync-first-table-ui-7.png&w=3840&q=75) OK! Now return to the Tinybird UI. In the modal, click **Next** (see Mark 1 below). Note: It can take a few seconds for the GCP permissions to apply. The next screen allows you to browse the tables available in BigQuery, and select the table you wish to load. Start by selecting the **project** that the table belongs to (see Mark 1 below), then the **dataset** (see Mark 2 below) and finally the **table** (see Mark 3 below). Finish by clicking **Next** (see Mark 4 below). <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fguides-bigquery-connector-sync-first-table-ui-9.png&w=3840&q=75) Note: the maximum allowed table size is 50 million rows, the result will be truncated if it exceeds that limit. You can now configure the schedule on which you wish to load data. You can configure a schedule in minutes, hours, or days by using the drop down selector, and set the value for the schedule in the text field (see Mark 1 below). The screenshot below shows a schedule of 10 minutes. Next, you can configure the **Import Strategy** . The strategy **Replace data** is selected by default (see Mark 2 below). Finish by clicking **Next** (see Mark 3 below). <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fguides-bigquery-connector-sync-first-table-ui-10.png&w=3840&q=75) Note: the maximum allowed frequency is 5 minutes. The final screen of the modal shows you the interpreted schema of the table, which you can edit as needed. You can also modify what the Data Source in Tinybird will be called by changing the name at the top (see Mark 1 below). To finish, click **Create Data Source** (see Mark 2 below). <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fguides-bigquery-connector-sync-first-table-ui-11.png&w=3840&q=75) You are now on the Data Source data page, where you can view the data that has been loaded (see Mark 1 below) and a status chart showing executions of the loading schedule (see Mark 2 below). <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fguides-bigquery-connector-sync-first-table-ui-12.png&w=3840&q=75) ### Load a BigQuery table in the CLI¶ You need to create a connection before you can load a BigQuery table into Tinybird using the CLI. Creating a connection grants your Tinybird Workspace the appropriate permissions to view data from BigQuery. [Authenticate your CLI](https://www.tinybird.co/docs/docs/cli/install#authentication) and switch to the desired Workspace. Then run: tb connection create bigquery The output of this command includes instructions to configure a GCP principal with read only access to your data in BigQuery. The instructions include the URL to access the appropriate page in GCP's IAM Dashboard. Copy the **principal name** shown in the output. In the GCP IAM Dashboard, select the **Grant Access** button (see Mark 1 below). 
<-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fguides-bigquery-connector-sync-first-table-ui-4.png&w=3840&q=75) In the box that appears on the right-hand side, paste the **principal** name you just copied into the **New principals** box (see Mark 1 below). Next, in the **Role** box, find and select the role **BigQuery Data Viewer** (see Mark 2 below). <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fguides-bigquery-connector-sync-first-table-ui-5.png&w=3840&q=75) Click **Save** to complete. The principal should now be listed in the **View By Principals** list (see Mark 1 below). <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fguides-bigquery-connector-sync-first-table-ui-7.png&w=3840&q=75) Note: It can take a few seconds for the GCP permissions to apply. Once done, select **yes** (y) to create the connection. A new `bigquery.connection` file is created in your project files. Note: At the moment, the `.connection` file is not used and cannot be pushed to Tinybird. It is safe to delete this file. A future release will allow you to push this file to Tinybird to automate creation of connections, similar to Kafka connection. Now that your connection is created, you can create a Data Source and configure the schedule to import data from BigQuery. The BigQuery import is configured using the following options, which can be added at the end of your .datasource file: - `IMPORT_SERVICE` : name of the import service to use, in this case, `bigquery` - `IMPORT_SCHEDULE` : a cron expression (UTC) with the frequency to run imports, must be higher than 5 minutes, e.g. `*/5 * * * *` - `IMPORT_STRATEGY` : the strategy to use when inserting data, either `REPLACE` or `APPEND` - `IMPORT_EXTERNAL_DATASOURCE` : (optional) the fully qualified name of the source table in BigQuery e.g. `project.dataset.table` - `IMPORT_QUERY` : (optional) the SELECT query to extract your data from BigQuery when you don't need all the columns or want to make a transformation before ingestion. The FROM must reference a table using the full scope: `project.dataset.table` Both `IMPORT_EXTERNAL_DATASOURCE` and `IMPORT_QUERY` are optional, but you must provide one of them for the connector to work. Note: For `IMPORT_STRATEGY` only `REPLACE` is supported today. The `APPEND` strategy will be enabled in a future release. For example: ##### bigquery.datasource file DESCRIPTION > bigquery demo data source SCHEMA > `timestamp` DateTime `json:$.timestamp`, `id` Integer `json:$.id`, `orderid` LowCardinality(String) `json:$.orderid`, `status` LowCardinality(String) `json:$.status`, `amount` Integer `json:$.amount` ENGINE "MergeTree" ENGINE_PARTITION_KEY "toYYYYMM(timestamp)" ENGINE_SORTING_KEY "timestamp" ENGINE_TTL "timestamp + toIntervalDay(60)" IMPORT_SERVICE bigquery IMPORT_SCHEDULE */5 * * * * IMPORT_EXTERNAL_DATASOURCE mydb.raw.events IMPORT_STRATEGY REPLACE IMPORT_QUERY > select timestamp, id, orderid, status, amount from mydb.raw.events The columns you select in the `IMPORT_QUERY` must match the columns defined in the Data Source schema. For example, if your Data Source has the columns `ColumnA, ColumnB` then your `IMPORT_QUERY` must contain `SELECT ColumnA, ColumnB FROM ...` . A mismatch of columns causes data to arrive in the [quarantine Data Source](https://www.tinybird.co/docs/docs/guides/ingesting-data/recover-from-quarantine). With your connection created and Data Source defined, you can now push your project to Tinybird using: tb push The first run of the import will begin on the next lapse of the CRON expression. 
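Once the Data Source is pushed, you may want to confirm that the scheduled imports are actually running. The Logs section later on this page shows the underlying query; the following is a minimal Python sketch of the same check using the Query API, assuming a `TB_TOKEN` environment variable with permission to read the `datasources_ops_log` Service Data Source and your own Data Source id in place of `t_1234`:

##### Check recent BigQuery import executions (sketch)

import os
import requests

TB_HOST = "https://api.tinybird.co"  # adjust to your region's API host
TB_TOKEN = os.getenv("TB_TOKEN")     # assumed: a token that can read Service Data Sources

query = """
SELECT timestamp, event_type, result, error, job_id
FROM tinybird.datasources_ops_log
WHERE datasource_id = 't_1234'
ORDER BY timestamp DESC
LIMIT 5
FORMAT JSON
"""

r = requests.get(
    f"{TB_HOST}/v0/sql",
    params={"q": query},
    headers={"Authorization": f"Bearer {TB_TOKEN}"},
)
print(r.json().get("data", []))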
## Configure granular permissions¶ If you need to configure more granular permissions for BigQuery, you can always grant access at dataset or individual object level. The first step is creating a new role in your [IAM & Admin Console in GCP](https://console.cloud.google.com/iam-admin/roles/create) , and assigning the `resourcemanager.projects.get` permission. <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fguides-bigquery-connector-custom-role-1.png&w=3840&q=75) The Connector needs this permission to list the available projects the generated Service Account has access to, so you can explore the BigQuery tables and views in the Tinybird UI. After that, you can grant permissions to specific datasets to the Service Account by clicking on **Sharing** > **Permissions**: <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fguides-bigquery-connector-custom-role-2.png&w=3840&q=75) Then **ADD PRINCIPAL**: <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fguides-bigquery-connector-custom-role-3.png&w=3840&q=75) And finally paste the **principal** name copied earlier into the **New principals** box. Next, in the **Role** box, find and select the role **BigQuery Data Viewer**: <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fguides-bigquery-connector-custom-role-4.png&w=3840&q=75) Now the Tinybird Connector UI only shows the specific resources you've granted permissions to. ## Schema evolution¶ The BigQuery Connector supports backwards compatible changes made in the source table. This means that, if you add a new column in BigQuery, the next sync job will automatically add it to the Tinybird Data Source. Non-backwards compatible changes, such as dropping or renaming columns, are not supported and will cause the next sync to fail. ## Iterate a BigQuery Data Source¶ To iterate a BigQuery Data Source, use the Tinybird CLI and the version control integration to handle your resources. You can only create connections in the main Workspace. When creating the connection in a Branch, it's created in the main Workspace and from there is available to every Branch. ### Add a new BigQuery Data Source¶ You can add a new Data Source directly with the UI or the CLI tool, following [the load of a BigQuery table section](https://www.tinybird.co/docs/about:blank#load-a-BigQuery-table). When adding a Data Source in a Tinybird Branch, it will work for testing purposes, but won't have any connection details internally. You must add the connection and BigQuery configuration in the .datasource Datafile when moving to production. To add a new Data Source using the recommended version control workflow check the instructions in the [examples repository](https://github.com/tinybirdco/use-case-examples). ### Update a Data Source¶ - BigQuery Data Sources can't be modified directly from UI - When you create a new Tinybird Branch, the existing BigQuery Data Sources won't be connected. You need to re-create them in the Branch. - In Branches, it's usually useful to work with[ fixtures](https://www.tinybird.co/docs/production/implementing-test-strategies#fixture-tests) , as they'll be applied as part of the CI/CD, allowing the full process to be deterministic in every iteration and avoiding quota consume from external services. 
BigQuery Data Sources can be modified from the CLI tool: tb auth # modify the .datasource Datafile with your editor tb push --force {datafile} # check the command output for errors To update it using the recommended version control workflow check the instructions in the [examples repository](https://github.com/tinybirdco/use-case-examples). ### Delete a Data Source¶ BigQuery Data Sources can be deleted directly from UI or CLI like any other Data Source. To delete it using the recommended version control workflow check the instructions in the [examples repository](https://github.com/tinybirdco/use-case-examples). ## Logs¶ Job executions are logged in the `datasources_ops_log` [Service Data Source](https://www.tinybird.co/docs/docs/monitoring/service-datasources) . This log can be checked directly in the Data Source view page in the UI. Filter by `datasource_id` to monitor ingestion through the BigQuery Connector from the `datasources_ops_log`: SELECT timestamp, event_type, result, error, job_id FROM tinybird.datasources_ops_log WHERE datasource_id = 't_1234' AND event_type = 'replace' ORDER BY timestamp DESC ## Limits¶ See [BigQuery Connector limits](https://www.tinybird.co/docs/docs/support/limits#bigquery-connector-limits). --- URL: https://www.tinybird.co/docs/ingest/confluent Last update: 2024-11-08T11:23:54.000Z Content: --- title: "Confluent Connector · Tinybird Docs" theme-color: "#171612" description: "Use the Confluent Connector to bring data from Confluent to Tinybird." --- # Confluent Connector¶ Use the Confluent Connector to bring data from your existing Confluent Cloud cluster into Tinybird so that you can quickly turn them into high-concurrency, low-latency REST API Endpoints and query using SQL. The Confluent Connector is fully managed and requires no additional tooling. Connect Tinybird to your Confluent Cloud cluster, select a topic, and Tinybird automatically begins consuming messages from Confluent Cloud. ## Prerequisites¶ You need to grant `READ` permissions to both the Topic and the Consumer Group to ingest data from Confluent into Tinybird. The Confluent Cloud Schema Registry is only supported for decoding Avro messages. When using Confluent Schema Registry, the Schema name must match the Topic name. For example, if you're ingesting the Kafka Topic `my-kafka-topic` using a Connector with Schema Registry enabled, it expects to find a Schema named `my-kafka-topic-value`. ## Create the Data Source using the UI¶ To connect Tinybird to your Confluent Cloud cluster, select **Create new (+)** next to the data project section, select **Data Source** , and then select **Confluent** from the list of available Data Sources. Enter the following details: - ** Connection name** : A name for the Confluent Cloud connection in Tinybird. - ** Bootstrap Server** : The comma-separated list of bootstrap servers, including port numbers. - ** Key** : The key component of the Confluent Cloud API Key. - ** Secret** : The secret component of the Confluent Cloud API Key. - ** Decode Avro messages with schema registry** : (Optional) Turn on Schema Registry support to decode Avro messages. Enter the Schema Registry URL, username, and password. After you've entered the details, select **Connect** . This creates the connection between Tinybird and Confluent Cloud. A list of your existing topics appears and you can select the topic to consume from. Tinybird creates a **Group ID** that specifies the name of the consumer group that this Kafka consumer belongs to. 
You can customize the Group ID, but ensure that your Group ID has **Read** permissions to the topic. After you've chosen a topic, you can select the starting offset to consume from. You can consume from the earliest offset or the latest offset: - If you consume from the earliest offset, Tinybird consumes all messages from the beginning of the topic. - If you consume from the latest offset, Tinybird only consumes messages that are produced after the connection is created. After selecting the offset, select **Next** . Tinybird consumes a sample of messages from the topic and displays the schema. You can adjust the schema and Data Source settings as needed, then select **Create Data Source**. Tinybird begins consuming messages from the topic and loading them into the Data Source. ## Configure the connector using .datasource files¶ If you are managing your Tinybird resources in files, there are several settings available to configure the Confluent Connector in .datasource files. See the [datafiles docs](https://www.tinybird.co/docs/docs/cli/datafiles/datasource-files#kafka-confluent-redpanda) for more information. The following is an example of Kafka .datasource file for an already existing connection: ##### Example data source for Confluent Connector SCHEMA > `__value` String, `__topic` LowCardinality(String), `__partition` Int16, `__offset` Int64, `__timestamp` DateTime, `__key` String `__headers` Map(String,String) ENGINE "MergeTree" ENGINE_PARTITION_KEY "toYYYYMM(timestamp)" ENGINE_SORTING_KEY "timestamp" # Connection is already available. If you # need to create one, add the required fields # on an include file with the details. KAFKA_CONNECTION_NAME my_connection_name KAFKA_TOPIC my_topic KAFKA_GROUP_ID my_group_id KAFKA_STORE_HEADERS true ### Columns of the Data Source¶ When you connect a Kafka producer to Tinybird, Tinybird consumes optional metadata columns from that Kafka record and writes them to the Data Source. The following fields represent the raw data received from Kafka: - `__value` : A String representing the entire unparsed Kafka record inserted. - `__topic` : The Kafka topic that the message belongs to. - `__partition` : The kafka partition that the message belongs to. - `__offset` : The Kafka offset of the message. - `__timestamp` : The timestamp stored in the Kafka message received by Tinybird. - `__key` : The key of the kafka message. - `__headers` : Headers parsed from the incoming topic messages. See[ Using custom Kafka headers for advanced message processing](https://www.tinybird.co/blog-posts/using-custom-kafka-headers) . Metadata fields are optional. Omit the fields you don't need to reduce your data storage. ### Use INCLUDE to store connection settings¶ To avoid configuring the same connection settings across many files, or to prevent leaking sensitive information, you can store connection details in an external file and use `INCLUDE` to import them into one or more .datasource files. You can find more information about `INCLUDE` in the [Advanced Templates](https://www.tinybird.co/docs/docs/cli/advanced-templates) documentation. For example, you might have two Confluent Cloud .datasource files, which re-use the same Confluent Cloud connection. You can create an include file which stores the Confluent Cloud connection details. 
The Tinybird project might use the following structure: ##### Tinybird data project file structure ecommerce_data_project/ datasources/ connections/ my_connector_name.incl my_confluent_datasource.datasource another_datasource.datasource endpoints/ pipes/ Where the file `my_connector_name.incl` has the following content: ##### Include file containing Confluent Cloud connection details KAFKA_CONNECTION_NAME my_connection_name KAFKA_BOOTSTRAP_SERVERS my_server:9092 KAFKA_KEY my_username KAFKA_SECRET my_password And the Confluent Cloud .datasource files look like the following: ##### Data Source using includes for Confluent Cloud connection details SCHEMA > `value` String, `topic` LowCardinality(String), `partition` Int16, `offset` Int64, `timestamp` DateTime, `key` String ENGINE "MergeTree" ENGINE_PARTITION_KEY "toYYYYMM(timestamp)" ENGINE_SORTING_KEY "timestamp" INCLUDE "connections/my_connection_name.incl" KAFKA_TOPIC my_topic KAFKA_GROUP_ID my_group_id When using `tb pull` to pull a Confluent Cloud Data Source using the CLI, the `KAFKA_KEY` and `KAFKA_SECRET` settings are not included in the file to avoid exposing credentials. ### Internal fields¶ The `__` fields stored in the Kafka datasource represent the raw data received from Kafka: - `__value` : A String representing the whole Kafka record inserted. - `__topic` : The Kafka topic that the message belongs to. - `__partition` : The kafka partition that the message belongs to. - `__offset` : The Kafka offset of the message. - `__timestamp` : The timestamp stored in the Kafka message received by Tinybird. - `__key` : The key of the kafka message. ## Compressed messages¶ Tinybird can consume from Kafka topics where Kafka compression is enabled, as decompressing the message is a standard function of the Kafka Consumer. However, if you compressed the message before passing it through the Kafka Producer, then Tinybird can't do post-Consumer processing to decompress the message. For example, if you compressed a JSON message through gzip and produced it to a Kafka topic as a `bytes` message, it's ingested by Tinybird as `bytes` . If you produced a JSON message to a Kafka topic with the Kafka Producer setting `compression.type=gzip` , while it's stored in Kafka as compressed bytes, it's decoded on ingestion and arrive to Tinybird as JSON. --- URL: https://www.tinybird.co/docs/ingest/datasource-api Last update: 2024-11-07T15:40:07.000Z Content: --- title: "Data Sources API · Tinybird Docs" theme-color: "#171612" description: "Use the Data Sources API to create and manage your Data Sources as well as importing data into them." --- # Data Sources API¶ Use Tinybird's Data Sources API to import files into your Tinybird Data Sources. With the Data Sources API you can use files to create new Data Sources, and append data to, or replace data from, an existing Data Source. See [Data Sources](https://www.tinybird.co/docs/docs/concepts/data-sources). The following examples show how to use the Data Sources API to perform various tasks. See the [Data Sources API Reference](https://www.tinybird.co/docs/docs/api-reference/datasource-api) for more information. ## Import a file into a new Data Source¶ Tinybird can create a Data Source from a file. This operation supports CSV, NDJSON, and Parquet files. You can create a Data Source from local or remote files. Automatic schema inference is supported for CSV files, but isn't supported for NDJSON or Parquet files. 
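For NDJSON and Parquet files, where automatic inference isn't available, you can generate the schema with the Analyze API, described in more detail at the end of this page. As a quick illustration, here is a hedged Python sketch (the file name and the `TB_TOKEN` environment variable are assumptions):

##### Generate an NDJSON schema with the Analyze API (sketch)

import os
import requests

TB_HOST = "https://api.tinybird.co"  # adjust to your region's API host
TB_TOKEN = os.getenv("TB_TOKEN")     # assumed: a token that can create Data Sources

# Send a local NDJSON file to the Analyze API
with open("local_file.ndjson", "rb") as f:
    r = requests.post(
        f"{TB_HOST}/v0/analyze",
        headers={"Authorization": f"Bearer {TB_TOKEN}"},
        files={"ndjson": f},
    )

# The guessed schema can be passed to the Data Sources API as the `schema` parameter
print(r.json()["analysis"]["schema"])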
### CSV files¶ CSV files must follow these requirements: - One line per row - Comma-separated Tinybird supports Gzip compressed CSV files with .csv.gz extension. The Data Sources API automatically detects and optimizes your column types, so you don't need to manually define a schema. You can use the `type_guessing=false` parameter to force Tinybird to use `String` for every column. CSV headers are optional. When creating a Data Source from a CSV file, if your file contains a header row, Tinybird uses the header to name your columns. If no header is present, your columns receive default names with an incrementing number. When appending a CSV file to an existing Data Source, if your file has a header, Tinybird uses the headers to identify the columns. If no header is present, Tinybird uses the order of columns. If the order of columns in the CSV file is always the same, you can omit the header line. For example, to create a new Data Source from a local file using cURL: ##### Creating a Data Source from a local CSV file curl \ -H "Authorization: Bearer " \ -X POST "https://api.tinybird.co/v0/datasources?name=my_datasource_name" \ -F csv=@local_file.csv From a remote file: ##### Creating a Data Source from a remote CSV file curl \ -H "Authorization: Bearer " \ -X POST "https://api.tinybird.co/v0/datasources?name=my_datasource_name" \ -d url='https://s3.amazonaws.com/nyc-tlc/trip+data/fhv_tripdata_2018-12.csv' When importing a remote file from a URL, the response contains the details of an import Job. To see the status of the import, use the [Jobs API](https://www.tinybird.co/docs/docs/api-reference/jobs-api). ### NDJSON and Parquet files¶ The Data Sources API doesn't support automatic schema inference for NDJSON and Parquet files. You must specify the `schema` parameter with a valid schema to parse the files. The schema for both NDJSON and Parquet files uses [JSONPaths](https://www.tinybird.co/docs/docs/guides/ingesting-data/ingest-ndjson-data#jsonpaths) to identify columns in the data. You can add default values to the schema. Tinybird supports compressed NDJSON and Parquet files with .ndjson.gz and .parquet.gz extensions. You can use the [Analyze API](https://www.tinybird.co/docs/about:blank#generate-schemas-with-the-analyze-api) to automatically generate a schema definition from a file. For example, assume your NDJSON or Parquet data looks like this: ##### Simple NDJSON data example { "id": 123, "name": "Al Brown"} Your schema definition must provide the JSONPath expressions to identify the columns `id` and `name`: ##### Simple NDJSON schema definition id Int32 `json:$.id`, name String `json:$.name` To create a new Data Source from a local file using cURL, you must URL encode the Schema as a query parameter. The following examples use NDJSON. 
To use Parquet, adjust the `format` parameter to `format=parquet`:

##### Creating a Data Source from a local NDJSON file

curl \
    -H "Authorization: Bearer " \
    -X POST "https://api.tinybird.co/v0/datasources?name=events&mode=create&format=ndjson&schema=id%20Int32%20%60json%3A%24.id%60%2C%20name%20String%20%60json%3A%24.name%60" \
    -F ndjson=@local_file.ndjson

From a remote file:

##### Creating a Data Source from a remote NDJSON file

curl \
    -H "Authorization: Bearer " \
    -X POST "https://api.tinybird.co/v0/datasources?name=events&mode=create&format=ndjson" \
    --data-urlencode "schema=id Int32 \`json:$.id\`, name String \`json:$.name\`" \
    -d url='http://example.com/file.json'

Note the escape characters in this example are only required due to backticks in cURL.

When importing a remote file from a URL, the response contains the details of an import Job. To see the status of the import, you must use the [Jobs API](https://www.tinybird.co/docs/docs/api-reference/jobs-api).

To add default values to the schema, use the `DEFAULT` parameter after the JSONPath expressions. For example:

##### Simple NDJSON schema definition with default values

id Int32 `json:$.id` DEFAULT 1,
name String `json:$.name` DEFAULT 'Unknown'

## Append a file into an existing Data Source¶

If you already have a Data Source, you can append the contents of a file to the existing data. This operation supports CSV, NDJSON, and Parquet files. You can append data from local or remote files.

When appending CSV files, you can improve performance by excluding the CSV Header line. However, in this case, you must ensure the CSV columns are ordered. If you can't guarantee the order of columns in your CSV, include the CSV Header.

For example, to append data into an existing Data Source from a local file using cURL:

##### Appending data to a Data Source from a local CSV file

curl \
    -H "Authorization: Bearer " \
    -X POST "https://api.tinybird.co/v0/datasources?mode=append&name=my_datasource_name" \
    -F csv=@local_file.csv

From a remote file:

##### Appending data to a Data Source from a remote CSV file

curl \
    -H "Authorization: Bearer " \
    -X POST "https://api.tinybird.co/v0/datasources?mode=append&name=my_datasource_name" \
    -d url='https://s3.amazonaws.com/nyc-tlc/trip+data/fhv_tripdata_2018-12.csv'

If the Data Source has dependent Materialized Views, data is appended in cascade.

## Replace data in an existing Data Source with a file¶

If you already have a Data Source, you can replace existing data with the contents of a file. You can replace all data or a selection of data. This operation supports CSV, NDJSON, and Parquet files. You can replace with data from local or remote files.

For example, to replace all the data in a Data Source with data from a local file using cURL:

##### Replacing a Data Source from a local CSV file

curl \
    -H "Authorization: Bearer " \
    -X POST "https://api.tinybird.co/v0/datasources?mode=replace&name=data_source_name&format=csv" \
    -F csv=@local_file.csv

From a remote file:

##### Replacing a Data Source from a URL

curl \
    -H "Authorization: Bearer " \
    -X POST "https://api.tinybird.co/v0/datasources?mode=replace&name=data_source_name&format=csv" \
    --data-urlencode "url=http://example.com/file.csv"

Rather than replacing all data, you can also replace specific partitions of data. This operation is atomic. To do this, use the `replace_condition` parameter. This parameter defines the filter that's applied, where all matching rows are deleted before finally ingesting the new file. Only the rows matching the condition are ingested.
If the source file contains rows that don't match the filter, the rows are ignored. Replacements are made by partition, so make sure that the `replace_condition` filters on the partition key of the Data Source. To replace filtered data in a Data Source with data from a local file using cURL, you must URL encode the `replace_condition` as a query parameter. For example: ##### Replace filtered data in a Data Source with data from a local file curl \ -H "Authorization: Bearer " \ -X POST "https://api.tinybird.co/v0/datasources?mode=replace&name=data_source_name&format=csv&replace_condition=my_partition_key%20%3E%20123" \ -F csv=@local_file.csv From a remote file: ##### Replace filtered data in a Data Source with data from a remote file curl \ -H "Authorization: Bearer " \ -X POST "https://api.tinybird.co/v0/datasources?mode=replace&name=data_source_name&format=csv" \ -d replace_condition='my_partition_key > 123' \ --data-urlencode "url=http://example.com/file.csv" All the dependencies of the Data Source, for example Materialized Views, are recalculated so that your data is consistent after the replacement. If you have n-level dependencies, they're also updated by this operation. Taking the example `A --> B --> C` , if you replace data in A, Data Sources B and C are automatically updated. The Partition Key of Data Source C must also be compatible with Data Source A. You can find more examples in the [Replace and delete data](https://www.tinybird.co/docs/docs/guides/ingesting-data/replace-and-delete-data#replace-data-selectively) guide. Although replacements are atomic, Tinybird can't assure data consistency if you continue appending data to any related Data Source at the same time the replacement takes place. The new incoming data is discarded. ## Creating an empty Data Source from a schema¶ When you want to have more granular control about the Data Source schema, you can manually create the Data Source with a specified schema. For example, to create an empty Data Source with a set schema using cURL: ##### Create an empty Data Source with a set schema curl \ -H "Authorization: Bearer " \ -X POST "https://api.tinybird.co/v0/datasources?name=stocks" \ -d "schema=symbol String, date Date, close Float32" To create an empty Data Source, you must pass a `schema` with your desired column names and types and leave the `url` parameter empty. ## Generate schemas with the Analyze API¶ The Analyze API can analyze a given NDJSON or Parquet file to produce a valid schema. The column names, types, and JSONPaths are inferred from the file. For example, to analyze a local NDJSON file using cURL: ##### analyze a NDJSON file to get a valid schema curl \ -H "Authorization: Bearer " \ -X POST "https://api.tinybird.co/v0/analyze" \ -F "ndjson=@local_file_path" The response contains a `schema` field that can be used to create your Data Source. 
For example: ##### Successful analyze response { "analysis": { "columns": [{ "path": "$.a_nested_array.nested_array[:]", "recommended_type": "Array(Int16)", "present_pct": 3, "name": "a_nested_array_nested_array" }, { "path": "$.an_array[:]", "recommended_type": "Array(Int16)", "present_pct": 3, "name": "an_array" }, { "path": "$.field", "recommended_type": "String", "present_pct": 1, "name": "field" }, { "path": "$.nested.nested_field", "recommended_type": "String", "present_pct": 1, "name": "nested_nested_field" } ], "schema": "a_nested_array_nested_array Array(Int16) `json:$.a_nested_array.nested_array[:]`, an_array Array(Int16) `json:$.an_array[:]`, field String `json:$.field`, nested_nested_field String `json:$.nested.nested_field`" }, "preview": { "meta": [{ "name": "a_nested_array_nested_array", "type": "Array(Int16)" }, { "name": "an_array", "type": "Array(Int16)" }, { "name": "field", "type": "String" }, { "name": "nested_nested_field", "type": "String" } ], "data": [{ "a_nested_array_nested_array": [ 1, 2, 3 ], "an_array": [ 1, 2, 3 ], "field": "test", "nested_nested_field": "bla" }], "rows": 1, "statistics": { "elapsed": 0.00032175, "rows_read": 2, "bytes_read": 142 } } } ## Error handling¶ Most errors return an HTTP Error code, for example `HTTP 4xx` or `HTTP 5xx`. However, if the imported file is valid, but some rows failed to ingest due to an incompatible schema, you might still receive an `HTTP 200` . In this case, the Response body contains two keys, `invalid_lines` and `quarantine_rows` , which tell you how many rows failed to ingest. Additionally, an `error` key is present with an error message. ##### Successful ingestion with errors { "import_id": "e9ae235f-f139-43a6-7ad5-a1e17c0071c2", "datasource": { "id": "t_0ab7a11969fa4f67985cec481f71a5c2", "name": "your_datasource_name", "cluster": null, "tags": {}, "created_at": "2019-03-12 17:45:04", "updated_at": "2019-03-12 17:45:04", "statistics": { "bytes": 1397, "row_count": 4 }, "replicated": false, "version": 0, "project": null, "used_by": [] }, "error": "There was an error with file contents: 2 rows in quarantine and 2 invalid lines", "quarantine_rows": 2, "invalid_lines": 2 } --- URL: https://www.tinybird.co/docs/ingest/dynamodb Last update: 2024-11-13T15:42:17.000Z Content: --- title: "DynamoDB Connector · Tinybird Docs" theme-color: "#171612" description: "Bring your DynamoDB data to Tinybird using the DynamoDB Connector." --- # DynamoDB Connector¶ Use the DynamoDB Connector to ingest historical and change stream data from Amazon DynamoDB to Tinybird. The DynamoDB Connector is fully managed and requires no additional tooling. Connect Tinybird to DynamoDB, select your tables, and Tinybird keeps in sync with DynamoDB. With the DynamoDB Connector you can: - Connect to your DynamoDB tables and start ingesting data in minutes. - Query your DynamoDB data using SQL and enrich it with dimensions from your streaming data, warehouse, or files. - Use Auth tokens to control access to API endpoints. Implement access policies as you need. Support for row-level security. DynamoDB Connector only works with Workspaces created in AWS Regions. ## Prerequisites¶ - Tinybird CLI version 5.3.0 or higher. See[ the Tinybird CLI quick start](https://www.tinybird.co/docs/docs/cli/quick-start) . - Tinybird CLI[ authenticated with the desired Workspace](https://www.tinybird.co/docs/docs/cli/install) . - DynamoDB Streams is active on the target DynamoDB tables with `NEW_IMAGE` or `NEW_AND_OLD_IMAGE` type. 
- Point-in-time recovery (PITR) is active on the target DynamoDB table.

You can switch the Tinybird CLI to the correct Workspace using `tb workspace use `.

Supported characters for column names are letters, numbers, underscores, and dashes. Tinybird automatically sanitizes invalid characters like dots or dollar signs.

## Required permissions¶

The DynamoDB Connector requires certain permissions to access your tables. The IAM Role needs the following permissions:

- `dynamodb:Scan`
- `dynamodb:DescribeStream`
- `dynamodb:DescribeExport`
- `dynamodb:GetRecords`
- `dynamodb:GetShardIterator`
- `dynamodb:DescribeTable`
- `dynamodb:DescribeContinuousBackups`
- `dynamodb:ExportTableToPointInTime`
- `dynamodb:UpdateTable`
- `dynamodb:UpdateContinuousBackups`

The following is an example of an AWS Access Policy. When configuring the connector, the UI, CLI and API all provide the necessary policy templates.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "dynamodb:Scan",
                "dynamodb:DescribeStream",
                "dynamodb:DescribeExport",
                "dynamodb:GetRecords",
                "dynamodb:GetShardIterator",
                "dynamodb:DescribeTable",
                "dynamodb:DescribeContinuousBackups",
                "dynamodb:ExportTableToPointInTime",
                "dynamodb:UpdateTable",
                "dynamodb:UpdateContinuousBackups"
            ],
            "Resource": [
                "arn:aws:dynamodb:*:*:table/",
                "arn:aws:dynamodb:*:*:table//stream/*",
                "arn:aws:dynamodb:*:*:table//export/*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": ["s3:PutObject", "s3:GetObject", "s3:ListBucket"],
            "Resource": [
                "arn:aws:s3:::",
                "arn:aws:s3:::/*"
            ]
        }
    ]
}

The following is an example trust policy:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "sts:AssumeRole",
            "Principal": {
                "AWS": "arn:aws:iam::473819111111111:root"
            },
            "Condition": {
                "StringEquals": {
                    "sts:ExternalId": "ab3caaaa-01aa-4b95-bad3-fff9b2ac789f8a9"
                }
            }
        }
    ]
}

## Load a table using the CLI¶

To load a DynamoDB table into Tinybird using the CLI, create a connection and then a Data Source. The connection grants your Tinybird Workspace the necessary permissions to access AWS and your tables in DynamoDB. The Data Source then maps a table in DynamoDB to a table in Tinybird and manages the historical and continuous sync.

### Create the DynamoDB connection¶

The connection grants your Tinybird Workspace the necessary permissions to access AWS and your tables in DynamoDB. To connect, run the following command:

tb connection create dynamodb

This command initiates the process of creating a connection. When prompted, type `y` to proceed.

### Create a new IAM Policy in AWS¶

The Tinybird CLI provides a policy template.

1. Replace `` with the name of your DynamoDB table.
2. Replace `` with the name of the S3 bucket you want to use for the initial load.
3. In AWS, go to **IAM**, **Policies**, **Create Policy**.
4. Select the **JSON** tab and paste the modified policy text.
5. Save and create the policy.

### Create a new IAM Role in AWS¶

1. Return to the Tinybird CLI to get a trust policy template.
2. In AWS, go to **IAM**, **Roles**, **Create Role**.
3. Select **Custom Trust Policy** and paste the trust policy copied from the CLI.
4. In the **Permissions** tab, attach the policy created in the previous step.
5. Complete the role creation process.

### Complete the connection¶

In the AWS IAM console, find the role you've created. Copy its Amazon Resource Name (ARN), which looks like `arn:aws:iam::111111111111:role/my-awesome-role`.
Provide the following information to the Tinybird CLI:

- The Role ARN
- AWS region of your DynamoDB tables
- Connection name

Tinybird uses the connection name to identify the connection. The name can only contain alphanumeric characters `a-zA-Z` and underscores `_`, and must start with a letter.

When the CLI prompts are completed, Tinybird creates the connection. The CLI will generate a `.connection` file in your project directory. This file is not used and is safe to delete. A future release will allow you to push this file to Tinybird to automate the creation of connections, similar to Kafka connections.

### Create a DynamoDB Data Source file¶

The Data Source maps a table in DynamoDB to a table in Tinybird and manages the historical and continuous sync. [Data Source files](https://www.tinybird.co/docs/docs/cli/datafiles/datasource-files) contain the table schema, and specific DynamoDB properties to target the table that Tinybird imports.

Create a Data Source file called `mytable.datasource`. There are two approaches to defining the schema for a DynamoDB Data Source:

1. Define the Partition Key and Sort Key from your DynamoDB table, and access other properties from JSON at query time.
2. Define all DynamoDB item properties as columns.

The Partition Key and Sort Key, if any, from your DynamoDB table must be defined in the Data Source schema. These are the only properties that are mandatory to define, as they're used for deduplication of records (upserts and deletes).

#### Approach 1: Define only the Partition Key and Sort Key¶

If you don't want to map all properties from your DynamoDB table, you can define only the Partition Key and Sort Key. The entire DynamoDB item is stored as JSON in a `_record` column, and you can extract properties using `JSONExtract*` functions.

For example, if you have a DynamoDB table with `transaction_id` as the Partition Key, you can define your Data Source schema like this:

##### mytable.datasource

SCHEMA >
    transaction_id String `json:$.transaction_id`

IMPORT_SERVICE "dynamodb"
IMPORT_CONNECTION_NAME
IMPORT_TABLE_ARN
IMPORT_EXPORT_BUCKET

Replace the `` with the name of the connection created in the first step. Replace `` with the ARN of the table you'd like to import. Replace `` with the name of the S3 bucket you want to use for the initial sync.

#### Approach 2: Define all DynamoDB item properties as columns¶

If you want to strictly define all your properties and their types, you can map them into your Data Source as columns. You can map properties to [any of the supported types in Tinybird](https://www.tinybird.co/docs/docs/concepts/data-sources#supported-data-types). Properties can also be arrays of the previously mentioned types, and nullable. Use the nullable type when there are properties that might not have a value in every item within your DynamoDB table.
For example, if you have a DynamoDB with items like this: { "timestamp": "2024-07-25T10:46:37.380Z", "transaction_id": "399361d5-10fc-4777-8187-88aaa4623569", "name": "Chris Donnelly", "passport_number": 4904040, "flight_from": "Burien", "flight_to": "Sanford", "airline": "BrianAir" } Where `transaction_id` is the partition key, you can define your Data Source schema like this: ##### mytable.datasource SCHEMA > `timestamp` DateTime64(3) `json:$.timestamp`, `transaction_id` String `json:$.transaction_id`, `name` String `json:$.name`, `passport_number` Int64 `json:$.passport_number`, `flight_from` String `json:$.flight_from`, `flight_to` String `json:$.flight_to`, `airline` String `json:$.airline` IMPORT_SERVICE "dynamodb" IMPORT_CONNECTION_NAME IMPORT_TABLE_ARN IMPORT_EXPORT_BUCKET Replace `` with the name of the connection created in the first step. Replace `` with the ARN of the table you'd like to import. Replace `` with the name of the S3 bucket you want to use for the initial sync. You can map properties with basic types (String, Number, Boolean, Binary, String Set, Number Set) at the root item level. Follow this schema definition pattern: `json:$.` - `PropertyName` is the name of the column within your Tinybird Data Source. - `PropertyType` is the type of the column within your Tinybird Data Source. It must match the type in the DynamoDB Data Source: - Strings correspond to `String` columns. - All `Int` , `UInt` , or `Float` variants correspond to `Number` columns. - `Array(String)` corresponds to `String Set` columns. - `Array(UInt)` and all numeric variants correspond to `Number Set` columns. - `PropertyNameInDDB` is the name of the property in your DynamoDB table. It must match the letter casing. Map properties within complex types, like `Maps` , using JSONPaths. For example, you can map a property at the first level in your Data Source schema like: MyString String `json:$..`. For `Lists` , standalone column mapping isn't supported. Those properties require extraction using `JSONExtract*` functions or consumed after a transformation with a Materialized View. ### Push the Data Source¶ With your connection created and Data Source defined, push your Data Source to Tinybird using `tb push`. For example, if your Data Source file is `mytable.datasource` , run: tb push mytable.datasource Due to how Point-in-time recovery works, data might take some minutes before it appears in Tinybird. This delay only happens the first time Tinybird retrieves the table. ## Load a table using the UI¶ To load a DynamoDB table into Tinybird using the UI, select the DynamoDB option in the Data Source dialog. You need an existing connection to your DynamoDB table. The UI guides you through the process of creating a connection and finally creating a Data Source that imports the data from your DynamoDB table. ### Create a DynamoDB connection¶ When you create a connection, provide the following information: - AWS region of your DynamoDB tables - ARN of the table you want to import - Name of the S3 bucket you want to use for the initial sync. In the next step, provide the ARN of the IAM Role you created in AWS. This role must have the necessary permissions to access your DynamoDB tables and S3 bucket. ### Create a Data Source¶ After you've created the connection, a preview of the imported data appears. You can change the schema columns, the sorting key, or the TTL. Due to the schemaless nature of DynamoDB, the preview might not show all the columns in your table. 
You can manually add columns to the schema in the **Code Editor** tab. When you're ready, select **Create Data Source**.

Due to how Point-in-time recovery works, data might take some minutes before it appears in Tinybird. This delay only happens the first time Tinybird retrieves the table.

## Columns added by Tinybird¶

When loading a DynamoDB table, Tinybird automatically adds the following columns:

| Column | Type | Description |
| --- | --- | --- |
| `_record` | `Nullable(String)` | Contents of the event, in JSON format. Added to `NEW_IMAGE` and `NEW_AND_OLD_IMAGES` streams. |
| `_old_record` | `Nullable(String)` | Stores the previous state of the record. Added to `NEW_AND_OLD_IMAGES` streams. |
| `_timestamp` | `DateTime64(3)` | Date and time of the event. |
| `_event_type` | `LowCardinality(String)` | Type of the event. |
| `_is_deleted` | `UInt8` | Whether the record has been deleted. |

If an existing table with stream type `NEW_AND_OLD_IMAGES` is missing the `_old_record` column, add it manually with the following configuration: `_old_record` Nullable(String) `json:$.OldImage`.

## Iterate a Data Source¶

To iterate a DynamoDB Data Source, use the Tinybird CLI and the [version control integration](https://www.tinybird.co/docs/docs/production/working-with-version-control) to handle your resources.

You can only create connections in the main Workspace. When creating the connection in a Branch, it's created in the main Workspace and from there is available to every Branch. DynamoDB Data Sources created in a Branch aren't connected to your source. AWS DynamoDB documentation discourages [reading the same Stream from various processes](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Streams.html#Streams.Processing), because it can result in throttling. This can affect the ingestion in the main Branch.

Browse the [use case examples](https://github.com/tinybirdco/use-case-examples) repository to find basic instructions and examples to handle DynamoDB Data Sources iteration using git integration.

### Add a new DynamoDB Data Source¶

You can add a new Data Source directly with the Tinybird CLI. See [load of a DynamoDB table](https://www.tinybird.co/docs/about:blank#load-a-table-using-the-cli). To add a new Data Source using the recommended version control workflow, see the [examples repository](https://github.com/tinybirdco/use-case-examples/tree/main/iterate_dynamodb).

When you add a Data Source to a Tinybird Branch, it doesn't have any connection details. You must add the Connection and DynamoDB configuration in the .datasource Datafile when moving to a production environment or Branch.

### Update a Data Source¶

You can modify DynamoDB Data Sources using the Tinybird CLI. For example:

tb auth
# modify the .datasource Datafile with your editor
tb push --force {datafile}
# check the command output for errors

When updating an existing DynamoDB Data Source, the first sync isn't repeated; only the new item modifications are synchronized by the CDC process. To update a Data Source using the recommended version control workflow, see the [examples repository](https://github.com/tinybirdco/use-case-examples/tree/main/iterate_dynamodb).

In Branches, work with [fixtures](https://www.tinybird.co/docs/docs/production/implementing-test-strategies#fixture-tests), as they're applied as part of the CI/CD, allowing the full process to be deterministic in every iteration and avoiding quota usage from external services.
### Delete a Data Source¶

You can delete DynamoDB Data Sources like any other Data Source. To delete it using the recommended version control workflow, see the [examples repository](https://github.com/tinybirdco/use-case-examples/tree/main/iterate_dynamodb).

## Retrieve logs¶

You can find DynamoDB logs in the `datasources_ops_log` Service Data Source. You can check logs directly in the Data Source screen. Filter by `datasource_id` to select the correct datasource, and use `event_type` to select between initial synchronization logs ( `sync-dynamodb` ) or update logs ( `append-dynamodb` ).

To select all DynamoDB related logs in the last day, run the following query:

SELECT *
FROM tinybird.datasources_ops_log
WHERE
    datasource_id = 't_1234'
    AND event_type in ['sync-dynamodb', 'append-dynamodb']
    AND timestamp > now() - INTERVAL 1 day
ORDER BY timestamp DESC

## Connector architecture¶

AWS provides two free, default functions for DynamoDB:

- DynamoDB Streams captures change events for a given DynamoDB table and provides an API to access events as a stream. This allows CDC-like access to the table for continuous updates.
- You can use Point-in-time recovery (PITR) to take snapshots of your DynamoDB table and save the export to S3. This allows historical access to table data for batch uploads.

The DynamoDB Connector uses the following functions to send DynamoDB data to Tinybird:

<-figure->

![Connecting DynamoDB to Tinybird architecture](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fassets%2Fguides%2Fingest-from-dynamodb%2Ftinybird-dynamodb-connector-arch.png&w=3840&q=75)

<-figcaption->

Connecting DynamoDB to Tinybird architecture

## Schema evolution¶

The DynamoDB Connector supports backwards compatible changes made in the source table. This means that, if you add a new column in DynamoDB, the next sync job automatically adds it to the Tinybird Data Source. Non-backwards compatible changes, such as dropping or renaming columns, aren't supported by default and might cause the next sync to fail.

## Considerations on queries¶

The DynamoDB Connector uses the ReplacingMergeTree engine to remove duplicate entries with the same sorting key. Deduplication occurs during a merge, which happens at an unknown time in the background. Doing `SELECT * FROM ddb_ds` might yield duplicated rows after an insertion. To account for this, force the merge at query time by adding `FINAL` to the query. For example, `SELECT * FROM ddb_ds FINAL`. Adding `FINAL` also filters out the rows where `_is_deleted = 1`.

## Override sort and partition keys¶

The DynamoDB Connector automatically sets values for the Sorting Key and the Partition Key properties based on the source DynamoDB table. You might want to override the default values to fit your needs.

To override Sorting and Partition key values, open your .datasource file and edit the values for `ENGINE_PARTITION_KEY` and `ENGINE_SORTING_KEY`. For the Sorting key, you must append the additional columns and leave `pk` and `sk` in place. For example:

ENGINE "ReplacingMergeTree"
ENGINE_PARTITION_KEY "toYYYYMM(toDateTime64(_timestamp, 3))"
ENGINE_SORTING_KEY "pk, sk, "
ENGINE_VER "_timestamp"

The Sorting key is used for deduplication of data. When adding columns to `ENGINE_SORTING_KEY`, make sure they contain the same value across record changes.

You can then push the new .datasource configuration using `tb push`:

tb push updated-ddb.datasource

Don't edit the values for `ENGINE` or `ENGINE_VER`.
The DynamoDB Connector requires the ReplacingMergeTree engine and a version based on the timestamp. ## Limits¶ See [DynamoDB Connector limits](https://www.tinybird.co/docs/docs/support/limits#dynamodb-connector-limits). Tinybird is not affiliated with, associated with, or sponsored by ClickHouse, Inc. ClickHouse® is a registered trademark of ClickHouse, Inc. --- URL: https://www.tinybird.co/docs/ingest/events-api Last update: 2024-10-28T11:06:14.000Z Content: --- title: "Events API · Tinybird Docs" theme-color: "#171612" description: "Documentation for the Tinybird Events API" --- # Events API¶ The Events API enables high-throughput streaming ingestion into Tinybird from an easy-to-use HTTP API. This page gives examples of how to use the Events API to perform various tasks. For more information, read the [Events API Reference](https://www.tinybird.co/docs/docs/api-reference/events-api) docs. ## Send individual JSON events¶ You can send individual JSON events to the Events API by including the JSON event in the Request body. For example, to send an individual JSON event using cURL: ##### Sending individual JSON events curl \ -H "Authorization: Bearer " \ -d '{"date": "2020-04-05 00:05:38", "city": "Chicago"}' \ 'https://api.tinybird.co/v0/events?name=events_test' The `name` parameter defines the name of the Data Source in which to insert events. If the Data Source does not exist, Tinybird creates the Data Source by inferring the schema of the JSON. The Token used to send data to the Events API needs the appropriate scopes. To append data to an existing Data Source, the `DATASOURCE:APPEND` scope is required. If the Data Source does not already exist, the `DATASOURCE:CREATE` scope is required to create the new Data Source. ### Define the schema¶ Defining your schema allows you to set data types, sorting key, TTL and more. Read the [schema definition docs here](https://www.tinybird.co/docs/docs/ingest/overview#define-the-schema-yourself). ## Send batches of JSON events¶ Sending batches of events enables you to achieve much higher total throughput than sending individual events. You can send batches of JSON events to the Events API by formatting the events as NDJSON (newline delimited JSON). Each individual JSON event should be separated by a newline ( `\n` ) character. ##### Sending batches of JSON events curl \ -H "Authorization: Bearer " \ -d $'{"date": "2020-04-05 00:05:38", "city": "Chicago"}\n{"date": "2020-04-05 00:07:22", "city": "Madrid"}\n' \ 'https://api.tinybird.co/v0/events?name=events_test' The `name` parameter defines the name of the Data Source in which to insert events. If the Data Source does not exist, Tinybird creates the Data Source by inferring the schema of the JSON. The Token used to send data to the Events API must have the appropriate scopes. To append data to an existing Data Source, the `DATASOURCE:APPEND` scope is required. If the Data Source does not already exist, the `DATASOURCE:CREATE` scope is required to create the new Data Source. ## Limits¶ The Events API delivers a default capacity of: - Up to 1000 requests/second per Data Source - Up to 20MB/s per Data Source - Up to 10MB per request per Data Source Throughput beyond these limits is offered as best-effort. The Events API is able to scale beyond these limits. If you are reaching these limits, contact [support@tinybird.co](https://www.tinybird.co/docs/mailto:support@tinybird.co). 
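If you're backfilling from a large local NDJSON file, one way to stay under the per-request size limit is to split the file into smaller batches and send each one as a separate Events API request. The following is a minimal sketch; the file name, chunk size, and Data Source name are illustrative placeholders.

# split a local NDJSON file into 10,000-line chunks and send each chunk as a
# single Events API request
split -l 10000 events.ndjson chunk_
for f in chunk_*; do
  curl -X POST "https://api.tinybird.co/v0/events?name=events_test" \
    -H "Authorization: Bearer $TB_TOKEN" \
    --data-binary "@$f"
done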
**Rate limit headers** | Header Name | Description | | --- | --- | | `X-RateLimit-Limit` | The maximum number of requests you're permitted to make in the current limit window. | | `X-RateLimit-Remaining` | The number of requests remaining in the current rate limit window. | | `X-RateLimit-Reset` | The time in seconds after the current rate limit window resets. | | `Retry-After` | The time to wait before making another request. Only present on 429 responses. | The Events API is a high-throughput, distributed streaming ingestion system, so the values in these headers are offered as best-effort. ## Compression¶ NDJSON events sent to the Events API can be compressed with Gzip. However, it is only recommended to do this when necessary, such as when you have big events that are grouped into large batches. Compressing events adds overhead to the ingestion process, which can introduce latency, although it is typically minimal. Here is an example of sending a JSON event compressed with Gzip from the command line: echo '{"timestamp":"2022-10-27T11:43:02.099Z","transaction_id":"8d1e1533-6071-4b10-9cda-b8429c1c7a67","name":"Bobby Drake","email":"bobby.drake@pressure.io","age":42,"passport_number":3847665,"flight_from":"Barcelona","flight_to":"London","extra_bags":1,"flight_class":"economy","priority_boarding":false,"meal_choice":"vegetarian","seat_number":"15D","airline":"Red Balloon"}' | gzip > body.gz curl \ -X POST 'https://api.tinybird.co/v0/events?name=gzip_events_example' \ -H "Authorization: Bearer " \ -H "Content-Encoding: gzip" \ --data-binary @body.gz ## Write acknowledgements¶ When you send data to the Events API, you usually receive an `HTTP202` response, which indicates that the request was successful; however, it does not confirm that the data has been committed into the underlying database. This is useful when guarantees on writes are not strictly necessary. Typically, it should take under 2 seconds to receive a response from the Events API in this case. curl \ -X POST 'https://api.tinybird.co/v0/events?name=events_example' \ -H "Authorization: Bearer " \ -d $'{"timestamp":"2022-10-27T11:43:02.099Z"}' < HTTP/2 202 < content-type: application/json < content-length: 42 < {"successful_rows":2,"quarantined_rows":0} However, if your use case requires absolute guarantees that data is committed, use the `wait` parameter. The `wait` parameter is a boolean that accepts a value of `true` or `false` . A value of `false` is the default behavior, equivalent to omitting the parameter entirely. Using `wait=true` with your request will ask the Events API to wait for acknowledgement that the data you sent has been committed into the underlying database. You will receive an `HTTP200` response that confirms data has been committed. Note that adding `wait=true` to your request can result in a slower response time, and we recommend having a time-out of at least 10 seconds when waiting for the response. For example: curl \ -X POST 'https://api.tinybird.co/v0/events?name=events_example&wait=true' \ -H "Authorization: Bearer " \ -d $'{"timestamp":"2022-10-27T11:43:02.099Z"}' < HTTP/2 200 < content-type: application/json < content-length: 42 < {"successful_rows":2,"quarantined_rows":0} It is good practice to log your requests to, and responses from, the Events API. This helps give you visibility into any failures for reporting or recovery.
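The following is a minimal sketch of that logging pattern from a shell, combined with a back-off on the `Retry-After` header when the request is rate limited. The Data Source name, file names, and single-retry policy are illustrative placeholders.

# send a batch with wait=true, log the status and body, and back off using the
# Retry-After header when the request returns HTTP 429
status=$(curl -s -o response.json -D headers.txt -w '%{http_code}' \
  -X POST "https://api.tinybird.co/v0/events?name=events_example&wait=true" \
  -H "Authorization: Bearer $TB_TOKEN" \
  --data-binary @batch.ndjson)
echo "$(date -u +%FT%TZ) status=$status body=$(cat response.json)" >> events_api.log
if [ "$status" = "429" ]; then
  wait_for=$(awk 'tolower($1)=="retry-after:" {print $2}' headers.txt | tr -d '\r')
  sleep "${wait_for:-1}"   # then resend the same request
fi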
--- URL: https://www.tinybird.co/docs/ingest/kafka Last update: 2024-11-11T09:57:52.000Z Content: --- title: "Kafka Connector · Tinybird Docs" theme-color: "#171612" description: "Documentation for the Tinybird Kafka Connector" --- # Kafka Connector¶ Use the Kafka Connector to ingest data streams from your Kafka cluster into Tinybird so that you can quickly turn them into high-concurrency, low-latency REST APIs. The Kafka Connector is fully managed and requires no additional tooling. Connect Tinybird to your Kafka cluster, select a topic, and Tinybird automatically begins consuming messages from Kafka. You can transform or enrich your Kafka topics with JOINs using serverless Data Pipes. Auth tokens control access to API endpoints. Secure connections through AWS PrivateLink or Multi-VPC for MSK are available for Enterprise customers on a Dedicated infrastructure plan. Reach out to support@tinybird.co for more information. ## Prerequisites¶ Grant `READ` permissions to both the Topic and the Consumer Group to ingest data from Kafka into Tinybird. You must secure your Kafka brokers with SSL/TLS and SASL. Tinybird uses `SASL_SSL` as the security protocol for the Kafka consumer. Connections are rejected if the brokers only support `PLAINTEXT` or `SASL_PLAINTEXT`. Kafka Schema Registry is supported only for decoding Avro messages. ## Add a Kafka connection¶ You can create a connection to Kafka using the Tinybird CLI or the UI. ### Using the CLI¶ Run the following commands to add a Kafka connection: ##### Adding a Kafka connection in the main Workspace tb auth # Use the main Workspace admin Token tb connection create kafka --bootstrap-servers --key --secret --connection-name --ssl-ca-pem ### Using the UI¶ Follow these steps to add a new connection using the UI: 1.Go to **Data Project**. 2. Select the **+** icon, then select **Data Source**. 3. Select **Kafka**. 4. Follow the steps to configure the connection. ### Add a CA certificate¶ You can add a CA certificate in PEM format when configuring your Kafka connection from the UI. Tinybird checks the certificate for issues before creating the connection. To add a CA certificate using the Tinybird CLI, pass the `--ssl-ca-pem ` argument to `tb connection create` , where `` is the location or value of the CA certificate. CA certificates don't work with Kafka Sinks and Streaming Queries. #### Aiven Kafka¶ Aiven for Apache Kafka service instances expose multiple SASL ports with two different kinds of SASL certificates: Private CA (self-signed) and Public CA, signed by Let's Encrypt. If you are using the Public CA port, you can connect to Aiven Kafka without any additional configuration. However, if you are using the Private CA port, you need to provide the CA certificate by pointing to the path of the CA certificate file using the `KAFKA_SSL_CA_PEM` setting. ## Update a Kafka connection¶ You can update your credentials or cluster details only from the Tinybird UI. Follow these steps: 1. Go to** Data Project** , select the** +** icon, then select** Data Source** . 2. Select** Kafka** and then the connection you want to edit or delete using the three-dots menu. Any Data Source that depends on this connection is affected by updates. ## Use .datasource files¶ You can configure the Kafka Connector using .datasource files. See the [datafiles documentation](https://www.tinybird.co/docs/docs/cli/datafiles/datasource-files#kafka-confluent-redpanda). 
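The .datasource example below assumes the Kafka connection already exists. If you still need to create it, the following is a minimal sketch of the CLI command described earlier; every value passed to the flags is an illustrative placeholder.

tb auth   # use the main Workspace admin Token
tb connection create kafka \
  --bootstrap-servers my-broker-1:9092,my-broker-2:9092 \
  --key my_sasl_username \
  --secret my_sasl_password \
  --connection-name my_connection_name \
  --ssl-ca-pem ./ca.pem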
The following is an example of a Kafka .datasource file for an already existing connection: ##### Example data source for Kafka Connector SCHEMA > `__value` String, `__topic` LowCardinality(String), `__partition` Int16, `__offset` Int64, `__timestamp` DateTime, `__key` String, `__headers` Map(String,String) ENGINE "MergeTree" ENGINE_PARTITION_KEY "toYYYYMM(__timestamp)" ENGINE_SORTING_KEY "__timestamp" # Connection is already available. If you # need to create one, add the required fields # on an include file with the details. KAFKA_CONNECTION_NAME my_connection_name KAFKA_TOPIC my_topic KAFKA_GROUP_ID my_group_id KAFKA_STORE_HEADERS true To add connection details in an INCLUDE file, see [Use INCLUDE to store connection settings](https://www.tinybird.co/docs/about:blank#use-include-to-store-connection-settings). ### Columns of the Data Source¶ When you connect a Kafka producer to Tinybird, Tinybird consumes optional metadata columns from that Kafka record and writes them to the Data Source. The following fields represent the raw data received from Kafka: - `__value` : A String representing the entire unparsed Kafka record inserted. - `__topic` : The Kafka topic that the message belongs to. - `__partition` : The Kafka partition that the message belongs to. - `__offset` : The Kafka offset of the message. - `__timestamp` : The timestamp stored in the Kafka message received by Tinybird. - `__key` : The key of the Kafka message. - `__headers` : Headers parsed from the incoming topic messages. See [Using custom Kafka headers for advanced message processing](https://www.tinybird.co/blog-posts/using-custom-kafka-headers). Metadata fields are optional. Omit the fields you don't need to reduce your data storage. ### Use INCLUDE to store connection settings¶ To avoid configuring the same connection settings across many files, or to prevent leaking sensitive information, you can store connection details in an external file and use `INCLUDE` to import them into one or more .datasource files. You can find more information about `INCLUDE` in the [Advanced Templates](https://www.tinybird.co/docs/docs/cli/advanced-templates) documentation. For example, you might have two Kafka .datasource files that reuse the same Kafka connection. You can create an include file that stores the Kafka connection details.
The Tinybird project would use the following structure: ##### Tinybird data project file structure ecommerce_data_project/ datasources/ connections/ my_connector_name.incl ca.pem # CA certificate (optional) my_kafka_datasource.datasource another_datasource.datasource endpoints/ pipes/ Where the file `my_connector_name.incl` has the following content: ##### Include file containing Kafka connection details KAFKA_CONNECTION_NAME my_connection_name KAFKA_BOOTSTRAP_SERVERS my_server:9092 KAFKA_KEY my_username KAFKA_SECRET my_password KAFKA_SSL_CA_PEM ca.pem # CA certificate (optional) And the Kafka .datasource files look like the following: ##### Data Source using includes for Kafka connection details SCHEMA > `__value` String, `__topic` LowCardinality(String), `__partition` Int16, `__offset` Int64, `__timestamp` DateTime, `__key` String ENGINE "MergeTree" ENGINE_PARTITION_KEY "toYYYYMM(__timestamp)" ENGINE_SORTING_KEY "__timestamp" INCLUDE "connections/my_connection_name.incl" KAFKA_TOPIC my_topic KAFKA_GROUP_ID my_group_id When using `tb pull` to pull a Kafka Data Source using the CLI, the `KAFKA_KEY`, `KAFKA_SECRET`, `KAFKA_SASL_MECHANISM` and `KAFKA_SSL_CA_PEM` settings aren't included in the file to avoid exposing credentials. ## Iterate a Kafka Data Source¶ The following instructions use Branches. Be sure you're familiar with the behavior of Branches in Tinybird when using the Kafka Connector. See [Prerequisites](https://www.tinybird.co/docs/about:blank#prerequisites). Use Branches to test different Kafka connections and settings. See [Branches](https://www.tinybird.co/docs/docs/concepts/branches). Connections created using the UI are created in the main Workspace, so if you create a new Branch from a Workspace with existing Kafka Data Sources, the Branch Data Sources don't receive that streaming data automatically. Use the CLI to recreate the Kafka Data Source. ### Update a Kafka Data Source¶ When you create a Branch that has existing Kafka Data Sources, the Data Sources in the Branch aren't connected to Kafka. Therefore, if you want to update the schema, you need to recreate the Kafka Data Source in the Branch. In Branches, Tinybird automatically appends `_{BRANCH}` to the Kafka group ID to prevent collisions. It also forces the consumers in Branches to always consume the `latest` messages, to reduce the performance impact. ### Add a new Kafka Data Source¶ To create and test a Kafka Data Source in a Branch, start by using an existing connection. You can create and use existing connections from the Branch using the UI: these connections are always created in the main Workspace. You can create a Kafka Data Source in a Branch as in production. This Data Source doesn't have any connection details internally, so it's useful for testing purposes. Define the connection and the Kafka parameters that are used in production in the .datafile. To move the Data Source to production, include the connection settings in the Data Source .datafile, as explained in the [.datafiles documentation](https://www.tinybird.co/docs/docs/cli/datafiles/datasource-files#kafka-confluent-redpanda). ### Delete a Kafka Data Source¶ If you've created a Data Source in a Branch, the Data Source is active until it's removed from the Branch or the entire Branch is removed. If you delete an existing Kafka Data Source in a Branch, it isn't deleted in the main Workspace. To delete a Kafka Data Source, do it against the main Workspace. You can also use the CLI and include it in the CI/CD workflows as necessary.
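With the include file and the .datasource files above in place, the following is a minimal sketch of pushing them from the project root. The paths follow the example structure, and the same flow applies when recreating a Kafka Data Source in a Branch after pointing the CLI at that Branch.

tb auth                                              # authenticate
tb push datasources/my_kafka_datasource.datasource   # the INCLUDE is resolved when the file is pushed
tb push datasources/another_datasource.datasource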
## Limits¶ The limits for the Kafka connector are: - Minimum flush time: 4 seconds - Throughput (uncompressed): 20MB/s - Up to 3 connections per Workspace If you're regularly hitting these limits, contact [support@tinybird.co](https://www.tinybird.co/docs/mailto:support@tinybird.co) for support. ## Troubleshooting¶ ### If you aren't receiving data¶ When Kafka commits an offset for a topic and a group id, it always sends data from the latest committed offset. In Tinybird, each Kafka Data Source receives data from a topic and uses a group id. The combination of `topic` and `group id` must be unique. If you remove a Kafka Data Source and recreate it with the same settings after having received data, you'll only get data from the latest committed offset, even if `KAFKA_AUTO_OFFSET_RESET` is set to `earliest`. This happens both in the main Workspace and in Branches, if you're using them, because connections are always created in the main Workspace and are shared across Branches. Recommended next steps: - Always use a different group id when testing Kafka Data Sources. - Check the `tinybird.kafka_ops_log` [Service Data Source](https://www.tinybird.co/docs/docs/monitoring/service-datasources) to see if you've already used a group id to ingest data from a topic. ### Compressed messages¶ Tinybird can consume from Kafka topics where Kafka compression is enabled, as decompressing the message is a standard function of the Kafka Consumer. If you compressed the message before passing it through the Kafka Producer, Tinybird can't do post-Consumer processing to decompress the message. For example, if you compressed a JSON message through gzip and produced it to a Kafka topic as a `bytes` message, it would be ingested by Tinybird as `bytes` . If you produced a JSON message to a Kafka topic with the Kafka Producer setting `compression.type=gzip` , while it would be stored in Kafka as compressed bytes, it would be decoded on ingestion and arrive in Tinybird as JSON. --- URL: https://www.tinybird.co/docs/ingest/overview Last update: 2024-11-12T11:45:41.000Z Content: --- title: "Ingest data · Tinybird Docs" theme-color: "#171612" description: "Tinybird allows you to ingest your data from a variety of sources, then create Tinybird Data Sources that can be queried, published, materialized, and more." --- ## Ingest your data into Tinybird¶ Tinybird allows you to ingest your data from a variety of sources, then create Tinybird Data Sources that can be queried, published, materialized, and more. ## Get started¶ Pick your Data Source, get connected, and start ingesting *fast*! ## Create your schema¶ Schemas are defined using a specific file type: .datasource files. Read more about them in the [.datasource file docs](https://www.tinybird.co/docs/docs/cli/datafiles/datasource-files). You can choose to define the schema first or send data straight away and let Tinybird infer the schema. ### Define the schema yourself¶ If you want to **define your schema** first, use one of the following methods: #### Option 1: Create an empty Data Source in the UI¶ On the left-hand nav, select **Create new (+)** then "Data Source". Select the "Write schema" option and review the .datasource file that's presented. Update the generated parameters to match your desired schema. Make sure you understand the definition and syntax for the schema, JSONPaths, engine, partition key, sorting key(s), and TTL - these are all essential for efficient data operations later down the line.
The input here is responsive, so if you change the engine, the UI automatically updates with any other definitions you need to provide. Select "Next", review your column names and types, rename the Data Source, and when you're happy with it, select "Create Data Source". #### Option 2: Upload a .datasource file with the desired schema¶ Alternatively, you can define your schema locally in a .datasource file. Drag and drop the file onto the UI and Tinybird will prompt you to import the resource. You can also select the + icon followed by "Write schema", drag and drop the file onto the modal, and continue editing in the UI. #### Option 3: Use the Data Sources API¶ #### Option 4: Use the Tinybird CLI¶ ### Send data and use inferred schema¶ If you want to **send data straight away** and let Tinybird infer the schema, then use one of the many Connectors, upload a CSV, NDJSON, or Parquet file, or send data to the [Events API](https://www.tinybird.co/docs/docs/ingest/events-api#send-individual-json-events). To use the Events API to infer the schema, send a single event over the API and allow Tinybird to create the schema for you automatically. If the schema is incorrect, go to the Data Source schema page in the UI and download the schema as a .datasource file. Edit the file to adjust the schema if required, then drag & drop the file back on to the UI. You cannot override an existing Data Source, so you will have to delete the existing one or rename the new Data Source. ## Supported data types, file types, and compression formats¶ See [Concepts > Data Sources](https://www.tinybird.co/docs/docs/concepts/data-sources#supported-data-types) for more information on supported types and formats. ## Update your schema¶ Realized you have the wrong schema for your Data Source? If your data does not match the schema, it will end up in the [quarantine Data Source](https://www.tinybird.co/docs/docs/guides/ingesting-data/recover-from-quarantine) . You may also want to change your schema for optimization purposes. In both cases, read the [Iterate a Data Source](https://www.tinybird.co/docs/docs/guides/ingesting-data/iterate-a-data-source) guide. ## Limits¶ Check the [limits page](https://www.tinybird.co/docs/docs/support/limits) for limits on ingestion, queries, API Endpoints, and more. ## Looking for inspiration?¶ If you're new to Tinybird and looking to learn a simple flow of ingest data > query it > publish an API Endpoint, check out our [quick start](https://www.tinybird.co/docs/docs/quick-start)! --- URL: https://www.tinybird.co/docs/ingest/postgresql Last update: 2024-11-13T15:42:17.000Z Content: --- title: "PostgreSQL table function · Tinybird Docs" theme-color: "#171612" description: "Documentation for the Tinybird PostgreSQL table function" --- # PostgreSQL table function BETA¶ The Tinybird `postgresql` table function is currently in public beta. The Tinybird `postgresql()` table function allows you to read data from your existing PostgreSQL database into Tinybird, then schedule a regular Copy Pipe to orchestrate synchronization. You can load full tables, and every run performs a full replace on the Data Source. Based on ClickHouse® `postgresql` table function , the Tinybird table function uses all the same syntax, requiring no additional tooling. To use it, define a Node using standard SQL and the `postgresql` function keyword, then publish the Node as a Copy Pipe that does a sync on every run. 
## Set up¶ ### Prerequisites¶ Your postgres database needs to be open and public (exposed to the internet, with publicly-signed certs), so you can connect it to Tinybird via the hostname and port using your username and password. You'll also need familiarity with making cURL requests to [manage your secrets](https://www.tinybird.co/docs/about:blank#about-secrets). ### Type support and inference¶ Since this table functions is based on ClickHouse's `postgresql` table function , Tinybird inherits the same types support and inference. Here's a detailed conversion table: | PostgreSQL Data Type | ClickHouse Data Type | | --- | --- | | BOOLEAN | UInt8 or Bool | | SMALLINT | Int16 | | INTEGER | Int32 | | BIGINT | Int64 | | REAL | Float32 | | DOUBLE PRECISION | Float64 | | NUMERIC or DECIMAL | Decimal(p, s) | | CHAR(n) | FixedString(n) | | VARCHAR (n) | String | | TEXT | String | | BYTEA | String | | TIMESTAMP | DateTime | | TIMESTAMP WITH TIME ZONE | DateTime (with appropriate timezone handling) | | DATE | Date | | TIME | String (since there is no direct TIME type) | | TIME WITH TIME ZONE | String | | INTERVAL | String | | UUID | UUID | | ARRAY | Array(T) where T is the array element type | | JSON | String or JSON (ClickHouse's JSON type for some versions) | | JSONB | String | | INET | String | | CIDR | String | | MACADDR | String | | ENUM | Enum8 or Enum16 | | GEOMETRY | String | Notes: - ClickHouse does not support all PostgreSQL types directly, so some types are mapped to String in ClickHouse, which is the most flexible type for arbitrary data. - For the NUMERIC and DECIMAL types, Decimal(p, s) in ClickHouse requires specifying precision (p) and scale (s). - Time zone support in ClickHouse's DateTime can be managed via additional functions or by ensuring consistent storage and retrieval time zones. - Some types like INTERVAL do not have a direct equivalent in ClickHouse and are usually stored as String or decomposed into separate fields. ## About secrets¶ The Environment Variables API is currently only accessible at API level. UI support will be released in the near future. Pasting your credentials into a Pipe Node or `.datafile` as plain text is a security risk. Instead, use the Environment Variables API to [create two new secrets](https://www.tinybird.co/docs/docs/api-reference/environment-variables-api#post-v0secrets) for your postgres username and password. In the next step, you'll then be ready to interpolate your new secrets using the `tb_secret` function: {{tb_secret('pg_username')}} {{tb_secret('pg_password')}} ## Load a PostgreSQL table¶ In the Tinybird UI, create a new Pipe Node. Call the `postgresql` table function and pass the hostname & port, database, table, user, and password: ##### Example Node logic with actual values SELECT * FROM postgresql( 'aws-0-eu-central-1.TODO.com:3866', 'postgres', 'orders', {{tb_secret('pg_username')}}, {{tb_secret('pg_password')}}, ) Publish this Node as a Copy Pipe, thereby running the query manually. You can choose to append only new data, or replace all data. ### Alternative: Use datafiles¶ As well as using the UI, you can also define Node logic in Pipe `.datafile` files . 
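Before pushing datafiles like the example that follows, make sure the two secrets they interpolate exist in your Workspace. The following is a minimal sketch using the Environment Variables API referenced above; the `name` and `value` form fields are assumptions here, so check that API reference for the exact parameters, and the credential values are placeholders.

# create the two secrets interpolated by the Pipes in this guide (field names
# are an assumption; see the Environment Variables API reference)
curl -X POST "https://api.tinybird.co/v0/secrets" \
  -H "Authorization: Bearer $TB_ADMIN_TOKEN" \
  -d "name=pg_username" -d "value=my_pg_user"
curl -X POST "https://api.tinybird.co/v0/secrets" \
  -H "Authorization: Bearer $TB_ADMIN_TOKEN" \
  -d "name=pg_password" -d "value=my_pg_password"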
An example for an ecommerce `orders_backfill` scenario, with a Node called `all_orders` , would be: NODE all_orders SQL > % SELECT * FROM postgresql( 'aws-0-eu-central-1.TODO.com:3866', 'postgres', 'orders', {{tb_secret('pg_username')}}, {{tb_secret('pg_password')}}, ) TYPE copy TARGET_DATASOURCE orders COPY_SCHEDULE @on-demand COPY_MODE replace ## Include filters¶ You can use a source column in postgres and filter by a value in Tinybird, for example: ##### Example Copy Pipe with postgresql function and filters SELECT * FROM postgresql( 'aws-0-eu-central-1.TODO.com:3866', 'postgres', 'orders', {{tb_secret('pg_username')}}, {{tb_secret('pg_password')}}, ) WHERE orderDate > (select max(orderDate) from orders) ## Schedule runs¶ When publishing as a Copy Pipe, most users set it to run at a frequent interval using a cron expression. It's also possible to trigger it manually: curl -H "Authorization: Bearer " \ -X POST "https://api.tinybird.co/v0/pipes//run" Having manual Pipes in your Workspace is helpful, as you can run a full sync manually any time you need it; sometimes delta updates are not 100% accurate. Some users also leverage them for weekly full syncs. ## Synchronization strategies¶ When copying data from PostgreSQL to Tinybird you can use one of the following strategies: - Use `COPY_MODE replace` to synchronize small dimension tables, up to a few million rows, on a frequent schedule (1 to 5 minutes). - Use `COPY_MODE append` to do incremental appends. For example, you can append events data tagged with a timestamp. Combine it with `COPY_SCHEDULE` and filters in the Copy Pipe SQL to sync the new events. ### Timeouts¶ When synchronizing dimension tables with `COPY_MODE replace` and a 1 minute schedule, the copy job might time out because it can't ingest the whole table in the defined schedule. Timeouts depend on several factors: - The `statement_timeout` configured in PostgreSQL. - The PostgreSQL database load. - Network connectivity, for example when copying data from different cloud regions. Follow these steps to avoid timeouts using incremental appends: 1. Make sure your PostgreSQL dimension rows are tagged with an updated timestamp. Use the column to filter the copy Pipe SQL. In the following example, the column is `updated_at`: CREATE TABLE users ( created_at TIMESTAMPTZ(6) NOT NULL, updated_at TIMESTAMPTZ(6) NOT NULL, name TEXT, user_id TEXT PRIMARY KEY ); 1. Create the target Data Source as a [ReplacingMergeTree](https://www.tinybird.co/docs/docs/guides/querying-data/deduplication-strategies#use-the-replacingmergetree-engine) using a unique or primary key in the Postgres table as the `ENGINE_SORTING_KEY` . Rows with the same `ENGINE_SORTING_KEY` are deduplicated. SCHEMA > `created_at` DateTime64(6), `updated_at` DateTime64(6), `name` String, `user_id` String ENGINE "ReplacingMergeTree" ENGINE_SORTING_KEY "user_id" 1. Configure the Copy Pipe with an incremental append strategy and a 1 minute schedule. That way you make sure only new records in the last minute are ingested, thus optimizing the copy job duration.
NODE copy_pg_users_rmt_0 SQL > % SELECT * FROM postgresql( 'aws-0-eu-central-1.TODO.com:6543', 'postgres', 'users', {{ tb_secret('pg_username') }}, {{ tb_secret('pg_password') }} ) WHERE updated_at > (SELECT max(updated_at) FROM pg_users_rmt)::String TYPE copy TARGET_DATASOURCE pg_users_rmt COPY_MODE append COPY_SCHEDULE * * * * * Optionally, you can create an index in the PostgreSQL table to speed up filtering: -- Create an index on updated_at for faster queries CREATE INDEX idx_updated_at ON users (updated_at); 1. A Data Source with `ReplacingMergeTree` engine deduplicates records based on the sorting key in batch mode. As you can't ensure when deduplication is going to happen, use the `FINAL` keyword when querying the Data Source to force deduplication at query time. SELECT * FROM pg_users FINAL 1. You can combine this approach with an hourly or daily replacement to get rid of deleted rows. Learn about[ how to handle deleted rows](https://www.tinybird.co/docs/docs/guides/querying-data/deduplication-strategies#use-the-replacingmergetree-engine) when using `ReplacingMergeTree` . Learn more about [how to migrate from Postgres to Tinybird](https://www.tinybird.co/docs/docs/guides/migrations/migrate-from-postgres). ## Observability¶ Job executions are logged in the `datasources_ops_log` [Service Data Source](https://www.tinybird.co/docs/docs/monitoring/service-datasources) . This log can be checked directly in the Data Source view page in the UI. Filter by `datasource_id` to monitor ingestion through the PostgreSQL table function from the `datasources_ops_log`: ##### Example query to the datasources\_ops\_log Service Data Source SELECT timestamp, event_type, result, error, job_id FROM tinybird.datasources_ops_log WHERE datasource_id = 't_1234' AND event_type = 'copy' ORDER BY timestamp DESC ## Limits¶ The table function inherits all the [limits of Copy Pipes](https://www.tinybird.co/docs/docs/support/limits#copy-pipe-limits). Secrets are created at a Workspace level, so you will be able to connect one PostgreSQL database per Tinybird Workspace. Check the [limits page](https://www.tinybird.co/docs/docs/support/limits) for limits on ingestion, queries, API Endpoints, and more. ## Billing¶ When set up, this functionality is a Copy Pipe with a query (processed data). There are no additional or specific costs for the table function itself. See the [billing docs](https://www.tinybird.co/docs/docs/support/billing) for more information on data operations and how they're charged. Tinybird is not affiliated with, associated with, or sponsored by ClickHouse, Inc. ClickHouse® is a registered trademark of ClickHouse, Inc. --- URL: https://www.tinybird.co/docs/ingest/redpanda Last update: 2024-11-08T11:23:54.000Z Content: --- title: "Redpanda Connector · Tinybird Docs" theme-color: "#171612" description: "Documentation for the Tinybird Redpanda Connector" --- # Redpanda Connector¶ The Redpanda Connector allows you to ingest data from your existing Redpanda cluster and load it into Tinybird so that you can quickly turn them into high-concurrency, low-latency REST APIs. The Redpanda Connector is fully managed and requires no additional tooling. Connect Tinybird to your Redpanda cluster, choose a topic, and Tinybird will automatically begin consuming messages from Redpanda. The Redpanda Connector is: - ** Easy to use** . Connect to your Redpanda cluster in seconds. Choose your topics, define your schema, and ingest millions of events per second into a fully-managed OLAP. - ** SQL-based** . 
Using nothing but SQL, query your Redpanda data and enrich it with dimensions from your database, warehouse, or files. - ** Secure** . Use Auth tokens to control access to API endpoints. Implement access policies as you need. Support for row-level security. Note that you need to grant READ permissions to both the Topic and the Consumer Group to ingest data from Redpanda into Tinybird. ## Using the UI¶ To connect Tinybird to your Redpanda cluster, click the `+` icon next to the data project section on the left navigation menu, select **Data Source** , and select **Redpanda** from the list of available Data Sources. Enter the following details: - ** Connection name** : A name for the Redpanda connection in Tinybird. - ** Bootstrap Server** : The comma-separated list of bootstrap servers (including Port numbers). - ** Key** : The** Key** component of the Redpanda API Key. - ** Secret** : The** Secret** component of the Redpanda API Key. - ** Decode Avro messages with schema registry** : Optionally, you can enable Schema Registry support to decode Avro messages. You will be prompted to enter the Schema Registry URL, username and password. Once you have entered the details, select **Connect** . This creates the connection between Tinybird and Redpanda. You will then see a list of your existing topics and can select the topic to consume from. Tinybird will create a **Group ID** that specifies the name of the consumer group this consumer belongs to. You can customize the Group ID, but ensure that your Group ID has **read** permissions to the topic. Once you have chosen a topic, you can select the starting offset to consume from. You can choose to consume from the **latest** offset or the **earliest** offset. If you choose to consume from the earliest offset, Tinybird will consume all messages from the beginning of the topic. If you choose to consume from the latest offset, Tinybird will only consume messages that are produced after the connection is created. Select the offset, and click **Next**. Tinybird will then consume a sample of messages from the topic and display the schema. You can adjust the schema and Data Source settings as needed, then click **Create Data Source** to create the Data Source. Tinybird will now begin consuming messages from the topic and loading them into the Data Source. ## Using .datasource files¶ If you are managing your Tinybird resources in files, there are several settings available to configure the Redpanda Connector in .datasource files. See the [datafiles docs](https://www.tinybird.co/docs/docs/cli/datafiles/datasource-files#kafka-confluent-redpanda) for more information. The following is an example of a Kafka .datasource file for an already existing connection: ##### Example data source for Redpanda Connector SCHEMA > `__value` String, `__topic` LowCardinality(String), `__partition` Int16, `__offset` Int64, `__timestamp` DateTime, `__key` String, `__headers` Map(String,String) ENGINE "MergeTree" ENGINE_PARTITION_KEY "toYYYYMM(__timestamp)" ENGINE_SORTING_KEY "__timestamp" # Connection is already available. If you # need to create one, add the required fields # on an include file with the details. KAFKA_CONNECTION_NAME my_connection_name KAFKA_TOPIC my_topic KAFKA_GROUP_ID my_group_id KAFKA_STORE_HEADERS true ### Columns of the Data Source¶ When you connect a Kafka producer to Tinybird, Tinybird consumes optional metadata columns from that Kafka record and writes them to the Data Source.
The following fields represent the raw data received from Kafka: - `__value` : A String representing the entire unparsed Kafka record inserted. - `__topic` : The Kafka topic that the message belongs to. - `__partition` : The kafka partition that the message belongs to. - `__offset` : The Kafka offset of the message. - `__timestamp` : The timestamp stored in the Kafka message received by Tinybird. - `__key` : The key of the kafka message. - `__headers` : Headers parsed from the incoming topic messages. See[ Using custom Kafka headers for advanced message processing](https://www.tinybird.co/blog-posts/using-custom-kafka-headers) . Metadata fields are optional. Omit the fields you don't need to reduce your data storage. ### Using INCLUDE to store connection settings¶ To avoid configuring the same connection settings across many files, or to prevent leaking sensitive information, you can store connection details in an external file and use `INCLUDE` to import them into one or more .datasource files. You can find more information about `INCLUDE` in the [Advanced Templates](https://www.tinybird.co/docs/docs/cli/advanced-templates) documentation. As an example, you may have two Redpanda .datasource files, which re-use the same Redpanda connection. You can create an INCLUDE file that stores the Redpanda connection details. The Tinybird project may use the following structure: ##### Tinybird data project file structure ecommerce_data_project/ ├── datasources/ │ └── connections/ │ └── my_connector_name.incl │ └── my_kafka_datasource.datasource │ └── another_datasource.datasource ├── endpoints/ ├── pipes/ Where the file `my_connector_name.incl` has the following content: ##### Include file containing Redpanda connection details KAFKA_CONNECTION_NAME my_connection_name KAFKA_BOOTSTRAP_SERVERS my_server:9092 KAFKA_KEY my_username KAFKA_SECRET my_password And the Redpanda .datasource files look like the following: ##### Data Source using includes for Redpanda connection details SCHEMA > `value` String, `topic` LowCardinality(String), `partition` Int16, `offset` Int64, `timestamp` DateTime, `key` String ENGINE "MergeTree" ENGINE_PARTITION_KEY "toYYYYMM(timestamp)" ENGINE_SORTING_KEY "timestamp" INCLUDE "connections/my_connection_name.incl" KAFKA_TOPIC my_topic KAFKA_GROUP_ID my_group_id When using `tb pull` to pull a Redpanda Data Source using the CLI, the `KAFKA_KEY` and `KAFKA_SECRET` settings will **not** be included in the file to avoid exposing credentials. --- URL: https://www.tinybird.co/docs/ingest/s3 Last update: 2024-11-12T11:45:41.000Z Content: --- title: "S3 Connector · Tinybird Docs" theme-color: "#171612" description: "Bring your S3 data to Tinybird using the S3 Connector." --- # S3 Connector¶ Use the S3 Connector to ingest files from your Amazon S3 buckets into Tinybird so that you can turn them into high-concurrency, low-latency REST APIs. You can load a full bucket or load files that match a pattern. In both cases you can also set an update date from which the files are loaded. With the S3 Connector you can load your CSV, NDJSON, or Parquet files into your S3 buckets and turn them into APIs. Tinybird detects new files in your buckets and ingests them automatically. You can then run serverless transformations using Data Pipes or implement auth tokens in your API Endpoints. ## Prerequisites¶ The S3 Connector requires permissions to access objects in your Amazon S3 bucket. 
The IAM Role needs the following permissions: - `s3:GetObject` - `s3:ListBucket` - `s3:ListAllMyBuckets` The following is an example of AWS Access Policy: When configuring the connector, the UI, CLI and API all provide the necessary policy templates. { "Version": "2012-10-17", "Statement": [ { "Action": [ "s3:GetObject", "s3:ListBucket" ], "Resource": [ "arn:aws:s3:::", "arn:aws:s3:::/*" ], "Effect": "Allow" }, { "Sid": "Statement1", "Effect": "Allow", "Action": [ "s3:ListAllMyBuckets" ], "Resource": [ "*" ] } ] } The following is an example trust policy: { "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": "sts:AssumeRole", "Principal": { "AWS": "arn:aws:iam::473819111111111:root" }, "Condition": { "StringEquals": { "sts:ExternalId": "ab3caaaa-01aa-4b95-bad3-fff9b2ac789f8a9" } } } ] } ## Supported file types¶ The S3 Connector supports the following file types: | File type | Accepted extensions | Compression formats supported | | --- | --- | --- | | CSV | `.csv` , `.csv.gz` | `gzip` | | NDJSON | `.ndjson` , `.ndjson.gz` | `gzip` | | | `.jsonl` , `.jsonl.gz` | | | | `.json` , `.json.gz` | | | Parquet | `.parquet` , `.parquet.gz` | `snappy` , `gzip` , `lzo` , `brotli` , `lz4` , `zstd` | You can upload files with .json extension, provided they follow the Newline Delimited JSON (NDJSON) format. Each line must be a valid JSON object and every line has to end with a `\n` character. Parquet schemas use the same format as NDJSON schemas, using [JSONPath](https://www.tinybird.co/docs/docs/guides/ingesting-data/ingest-ndjson-data#jsonpaths) syntax. ## S3 file URI¶ Use the full S3 File URI and wildcards to select files. The file extension is required. The S3 Connector supports the following wildcard patterns: | File path | S3 File URI with wildcard | Will match? | | --- | --- | --- | | example.ndjson | `s3://bucket-name/*.ndjson` | Yes | | example.ndjson.gz | `s3://bucket-name/**/*.ndjson.gz` | No | | example.ndjson | `s3://bucket-name/example.ndjson` | Yes | | pending/example.ndjson | `s3://bucket-name/*.ndjson` | Yes | | pending/example.ndjson | `s3://bucket-name/**/*.ndjson` | Yes | | pending/example.ndjson | `s3://bucket-name/pending/example.ndjson` | Yes | | pending/example.ndjson | `s3://bucket-name/pending/*.ndjson` | Yes | | pending/example.ndjson | `s3://bucket-name/pending/**/*.ndjson` | No | | pending/example.ndjson | `s3://bucket-name/**/pending/example.ndjson` | No | | pending/example.ndjson | `s3://bucket-name/customers/pending/example.ndjson` | No | | pending/example.ndjson | `s3://bucket-name/other/example.ndjson` | No | | pending/example.ndjson.gz | `s3://bucket-name/*.csv.gz` | No | | pending/o/inner/example.ndjson | `s3://bucket-name/*.ndjson` | Yes | | pending/o/inner/example.ndjson | `s3://bucket-name/**/*.ndjson` | Yes | | pending/o/inner/example.ndjson | `s3://bucket-name/**/inner/example.ndjson` | Yes | | pending/o/inner/example.ndjson | `s3://bucket-name/**/ex*.ndjson` | Yes | | pending/o/inner/example.ndjson | `s3://bucket-name/**/**/*.ndjson` | Yes | | pending/o/inner/example.ndjson | `s3://bucket-name/pending/**/*.ndjson` | Yes | | pending/o/inner/example.ndjson | `s3://bucket-name/inner/example.ndjson` | No | | pending/o/inner/example.ndjson | `s3://bucket-name/pending/example.ndjson` | No | | pending/o/inner/example.ndjson.gz | `s3://bucket-name/pending/*.ndjson.gz` | No | | pending/o/inner/example.ndjson.gz | `s3://bucket-name/other/example.ndjson.gz` | No | Use the **Preview** step in the UI to see an excerpt of the files. 
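If you prefer to create the IAM resources shown above from a terminal instead of the AWS console, the following is a sketch using the AWS CLI. The policy and role names, the JSON file paths, and the account ID are illustrative placeholders; use the exact policies and values Tinybird provides when you configure the connector.

# save the access policy and trust policy shown above as JSON files, then:
aws iam create-policy \
  --policy-name tinybird-s3-read \
  --policy-document file://access-policy.json
aws iam create-role \
  --role-name tinybird-s3-role \
  --assume-role-policy-document file://trust-policy.json
aws iam attach-role-policy \
  --role-name tinybird-s3-role \
  --policy-arn "arn:aws:iam::<your-account-id>:policy/tinybird-s3-read"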
## Set up the connection¶ You can set up an S3 connection using the UI or the CLI. The steps are as follows: 1. Create a new Data Source in Tinybird. 2. Create the AWS S3 connection. 3. Configure the scheduling options and path/file names. 4. Start ingesting the data. ## Load files using the CLI¶ Before you can load files from Amazon S3 into Tinybird using the CLI, you must create a connection. Creating a connection grants your Tinybird Workspace the appropriate permissions to view files in Amazon S3. To create a connection, you need to use the Tinybird CLI version 3.8.3 or higher. [Authenticate your CLI](https://www.tinybird.co/docs/docs/cli/install#authentication) and switch to the desired Workspace. Follow these steps to create a connection: 1. Run `tb connection create s3_iamrole --policy read` command and press `y` to confirm. 2. Copy the suggested policy and replace the bucket placeholder `` with your bucket name. 3. In AWS, create a new policy in** IAM** ,** Policies (JSON)** using the edited policy. 4. Return to the Tinybird CLI, press `y` , and copy the next policy. 5. In AWS, go to** IAM** ,** Roles** and copy the new custom trust policy. Attach the policy you edited in the previous step. 6. Return to the CLI, press `y` , and paste the ARN of the role you've created in the previous step. 7. Enter the region of the bucket. For example, `us-east-1` . 8. Provide a name for your connection in Tinybird. The `--policy` flag allows to switch between write (sink) and read (ingest) policies. Now that you've created a connection, you can add a Data Source to configure the import of files from Amazon S3. Configure the Amazon S3 import using the following options in your .datasource file: - `IMPORT_SERVICE` : name of the import service to use, in this case, `s3_iamrole` . - `IMPORT_SCHEDULE` : either `@auto` to sync once per minute, or `@on-demand` to only execute manually (UTC). - `IMPORT_STRATEGY` : the strategy used to import data. Only `APPEND` is supported. - `IMPORT_BUCKET_URI` : a full bucket path, including the `s3://` protocol , bucket name, object path and an optional pattern to match against object keys. You can use patterns in the path to filter objects. For example, ending the path with `*.csv` matches all objects that end with the `.csv` suffix. - `IMPORT_CONNECTION_NAME` : name of the S3 connection to use. - `IMPORT_FROM_TIMESTAMP` : (optional) set the date and time from which to start ingesting files. Format is `YYYY-MM-DDTHH:MM:SSZ` . When Tinybird discovers new files, it appends the data to the existing data in the Data Source. Replacing data isn't supported. The following is an example of a .datasource file for S3: ##### s3.datasource file DESCRIPTION > Analytics events landing data source SCHEMA > `timestamp` DateTime `json:$.timestamp`, `session_id` String `json:$.session_id`, `action` LowCardinality(String) `json:$.action`, `version` LowCardinality(String) `json:$.version`, `payload` String `json:$.payload` ENGINE "MergeTree" ENGINE_PARTITION_KEY "toYYYYMM(timestamp)" ENGINE_SORTING_KEY "timestamp" ENGINE_TTL "timestamp + toIntervalDay(60)" IMPORT_SERVICE s3_iamrole IMPORT_CONNECTION_NAME connection_name IMPORT_BUCKET_URI s3://bucket-name/*.csv IMPORT_SCHEDULE @auto IMPORT_STRATEGY APPEND With your connection created and Data Source defined, you can now push your project to Tinybird using: tb push ## Load files using the UI¶ ### 1. Create a new Data Source¶ In Tinybird, go to **Data Sources** and select **Create Data Source**. 
Select **Amazon S3** and enter the bucket name and region, then select **Continue**. ### 2. Create the AWS S3 connection¶ Follow these steps to create the connection: 1. Open the AWS console and navigate to IAM. 2. Create and name the policy using the provided copyable option. 3. Create and name the role with the trust policy using the provided copyable option. 4. Select** Connect** . 5. Paste the connection name and ARN. ### 3. Select the data¶ Select the data you want to ingest by providing the [S3 File URI](https://www.tinybird.co/docs/about:blank#s3-file-uri) and selecting **Preview**. You can also set the ingestion to start from a specific date and time, so that the ingestion process ignores all files added or updated before the set date and time: 1. Select** Ingest since ISO date and time** . 2. Write the desired date or datetime in the input, following the format `YYYY-MM-DDTHH:MM:SSZ` . ### 4. Preview and create¶ The next screen shows a preview of the incoming data. You can review and modify any of the incoming columns, adjust their names, change their types, or delete them. You can also configure the name of the Data Source. After reviewing your incoming data, select **Create Data Source** . On the Data Source details page, you can see the sync history in the tracker chart and the current status of the connection. ## Schema evolution¶ The S3 Connector supports adding new columns to the schema of the Data Source using the CLI. Non-backwards compatible changes, such as dropping, renaming, or changing the type of columns, aren't supported. Any rows from these files are sent to the [quarantine Data Source](https://www.tinybird.co/docs/docs/concepts/data-sources#the-quarantine-data-source). ## Iterate an S3 Data Source¶ To iterate an S3 Data Source, use the Tinybird CLI and the [version control integration](https://www.tinybird.co/docs/docs/production/working-with-version-control) to handle your resources. Create a connection using the CLI: tb auth # use the main Workspace admin Token tb connection create s3_iamrole To iterate an S3 Data Source through a Branch, create the Data Source using a connector that already exists. The S3 Connector doesn't ingest any data, as it isn't configured to work in Branches. To test it on CI, you can directly append the files to the Data Source. After you've merged it and are running CD checks, run `tb datasource sync ` to force the sync in the main Workspace. ## Limits¶ The following limits apply to the S3 Connector: - When using the `auto` mode, execution of imports runs once every minute. - Tinybird ingests a maximum of 5 files per minute. This is a Workspace-level limit, so it's shared across all Data Sources. The following limits apply to maximum file size per type: | File type | Max file size | | --- | --- | | CSV | 10 GB for the Free plan, 32 GB for Pro and Enterprise | | NDJSON | 10 GB for the Free plan, 32 GB for Pro and Enterprise | | Parquet | 1 GB | Check the [limits page](https://www.tinybird.co/docs/docs/support/limits) for limits on ingestion, queries, API Endpoints, and more. To adjust these limits, contact Tinybird at [support@tinybird.co](https://www.tinybird.co/docs/mailto:support@tinybird.co) or in the [Community Slack](https://www.tinybird.co/docs/docs/community). ## Monitoring¶ You can follow the standard recommended practices for monitoring Data Sources as explained in our [ingestion monitoring guide](https://www.tinybird.co/docs/docs/guides/monitoring/monitor-your-ingestion) . There are specific metrics for the S3 Connector. 
If a sync finishes unsuccessfully, Tinybird adds a new event to `datasources_ops_log`: - If all the files in the sync failed, the event has the `result` field set to `error` . - If some files failed and some succeeded, the event has the `result` field set to `partial-ok` . Failures in syncs are atomic, meaning that if one file fails, no data from that file is ingested. A JSON object with the list of files that failed is included in the `error` field. Some errors can happen before the file list can be retrieved (for instance, an AWS connection failure), in which case there are no files in the `error` field. Instead, the `error` field contains the error message and the files to be retried in the next execution. In scheduled runs, Tinybird retries all failed files in the next executions, so that rate limits or temporary issues don't cause data loss. In on-demand runs, since there is no next execution, truncate the Data Source and sync again. You can distinguish between individual failed files and failed syncs by looking at the `error` field: - If the `error` field contains a JSON object, the sync failed and the object contains the error message with the list of files that failed. - If the `error` field contains a string, a file failed to ingest and the string contains the error message. You can see the file that failed by looking at the `Options.Values` field. For example, you can use the following query to see the sync error messages for the last day: SELECT JSONExtractString(error, 'message') message, * FROM tinybird.datasources_ops_log WHERE datasource_id = '' AND timestamp > now() - INTERVAL 1 day AND message IS NOT NULL ORDER BY timestamp DESC --- URL: https://www.tinybird.co/docs/ingest/snowflake Last update: 2024-11-12T11:45:41.000Z Content: --- title: "Snowflake Connector · Tinybird Docs" theme-color: "#171612" description: "Documentation for how to use the Tinybird Snowflake Connector" --- # Snowflake Connector¶ Use the Snowflake Connector to load data from your existing Snowflake account into Tinybird so that you can quickly turn them into high-concurrency, low-latency REST APIs. The Snowflake Connector is fully managed and requires no additional tooling. You can define a sync schedule inside Tinybird and execution is taken care of for you. With the Snowflake Connector you can: - Start ingesting data instantly from Snowflake using SQL. - Use SQL to query, shape, and join your Snowflake data with other sources. - Use Auth tokens to control access to API endpoints. Implement access policies as you need. Support for row-level security. Snowflake IP filtering isn't supported by the Snowflake Connector. If you need to filter IPs, use the GCS/S3 Connector. ## Load a Snowflake table¶ ### Load a Snowflake table in the UI¶ To add a Snowflake table as a Data Source, follow these steps. #### Create a connection¶ Create a new Data Source using the Snowflake Connector dialog: 1. Open Tinybird and add a new Data Source by selecting** Create new (+)** next to the** Data Sources** section. 2. In the Data Sources dialog, select the Snowflake connector. 3. Enter your Snowflake Account Identifier. To find this, log into Snowflake, find the account info and then copy the Account Identifier. 4. In Tinybird, in the** Connection details** dialog, configure authentication with your Snowflake account. Enter your user password and Account Identifier. 5. Select the role and warehouse to access your data. 6. Copy the SQL snippet from the text box. 
The snippet creates a new Snowflake Storage Integration linking your Snowflake account with a Tinybird staging area for your Workspace. It also grants permission to the given role to create new Stages to unload data from your Snowflake Account into Tinybird. 7. With the SQL query copied, open a new SQL Worksheet inside Snowflake. Paste the SQL into the Worksheet query editor. You must edit the query and replace the `` fragment with the name of your Snowflake database. 8. Select** Run** . The statement must be executed with a Snowflake `ACCOUNTADMIN` role, since Snowflake Integrations operate at Account level and usually need administrator permissions. #### Select the database, table, and schema¶ After running the query, the `Statement executed successfully` message appears. Return to your Tinybird tab to resume the configuration of the Snowflake connector. Set a name for the Snowflake connection and complete this step by selecting **Next**. The Snowflake Connector now has enough permissions to inspect your Snowflake objects available to the given role. Browse the tables available in Snowflake and select the table you wish to load. Start by selecting the database to which the table belongs, then the schema, and the table. Finish by selecting **Next**. Maximum allowed table size is 50 million rows. The result is truncated if it exceeds that limit. #### Configure the schedule¶ You can configure the schedule on which you wish to load data. By default, the frequency is set to **One-off** , which performs a one-time sync of the table. You can change this by selecting a different option from the menu. To configure a schedule that runs a regular sync, select the **Interval** option. You can configure a schedule in minutes, hours, or days by using the menu, and set the value for the schedule in the text field. You can also select whether the sync should run immediately, or if it should wait until the first scheduled sync. The **Replace data** import strategy is selected by default. Finish by selecting **Next**. Maximum allowed frequency is 5 minutes. #### Complete the configuration¶ The final screen of the dialog shows the interpreted schema of the table, which you can change as needed. You can also modify the name of the Data Source in Tinybird. Select **Create Data Source** to complete the process. After you've created the Data Source, a status chart appears showing executions of the loading schedule. The Data Source takes a moment to create the resources required to perform the first sync. When the first sync has completed, a green bar appears indicating the status. Details about the data, such as storage size and number of rows, are shown. You can also see a preview of the data. ### Load a Snowflake table in the CLI¶ To add a Snowflake table as a Data Source using the Tinybird CLI, follow these steps. #### Create a connection¶ You need to create a connection before you can load a table from Snowflake into Tinybird using the CLI. Creating a connection grants your Tinybird Workspace the appropriate permissions to view data from Snowflake. [Authenticate your CLI](https://www.tinybird.co/docs/docs/cli/install#authentication) and switch to the desired Workspace. Then run: tb connection create snowflake The output includes instructions to configure read-only access to your data in Snowflake. Enter your user, password, account identifier, role, warehouse, and a name for the connection. After entering the required information, copy the SQL block that appears.
** Creating a new Snowflake connection at the xxxx workspace. User (must have create stage and create integration in Snowflake): Password: Account identifier: Role (optional): Warehouse (optional): Connection name (optional, current xxxx): Enter this SQL statement in Snowflake using your admin account to create the connection: ------ create storage integration if not exists "tinybird_integration_role" type = external_stage storage_provider = 'GCS' enabled = true comment = 'Tinybird Snowflake Connector Integration' storage_allowed_locations = ('gcs://tinybird-cdk-production-europe-west3/id'); grant create stage on all schemas in database to role ACCOUNTADMIN; grant ownership on integration "tinybird_integration_ACCOUNTADMIN" to role ACCOUNTADMIN; ------ Ready? (y, N): ** Validating connection... ** xxxx.connection created successfully! Connection details saved into the .env file and referenced automatically in your connection file. With the SQL query copied, open a new SQL Worksheet inside Snowflake. Paste the SQL into the Worksheet query editor. You must edit the query and replace the `` fragment with the name of your Snowflake database. Select **Run** . This statement must be executed with a Snowflake `ACCOUNTADMIN` role, since Snowflake Integrations operate at Account level and usually need administrator permissions. The `Statement executed successfully` message appears. Return to your terminal, select **yes** (y) and the connection is created.A new `snowflake.connection` file appears in your project files. The `.connection` file can be safely deleted. #### Create the Data Source¶ After you've created the connection, you can create a Data Source and configure the schedule to import data from Snowflake. The Snowflake import is configured using the following options, which you can add at the end of your .datasource file: - `IMPORT_SERVICE` : Name of the import service to use. In this case, `snowflake` . - `IMPORT_CONNECTION_NAME` : The name given to the Snowflake connection inside Tinybird. For example, `'my_connection'` . - `IMPORT_EXTERNAL_DATASOURCE` : The fully qualified name of the source table in Snowflake. For example, `database.schema.table` . - `IMPORT_SCHEDULE` : A cron expression (UTC) with the frequency to run imports. Must be higher than 5 minutes. For example, `*/5 * * * *` . - `IMPORT_STRATEGY` : The strategy to use when inserting data, either `REPLACE` or `APPEND` . - `IMPORT_QUERY` : (Optional) The SELECT query to extract your data from Snowflake when you don't need all the columns or want to make a transformation before ingestion. The FROM must reference a table using the full scope: `database.schema.table` . Note: For `IMPORT_STRATEGY` only `REPLACE` is supported today. The `APPEND` strategy will be enabled in a future release. 
The following example shows a configured .datasource file for a Snowflake Data Source: ##### snowflake.datasource file DESCRIPTION > Snowflake demo data source SCHEMA > `timestamp` DateTime `json:$.timestamp`, `id` Integer `json:$.id`, `orderid` LowCardinality(String) `json:$.orderid`, `status` LowCardinality(String) `json:$.status`, `amount` Integer `json:$.amount` ENGINE "MergeTree" ENGINE_PARTITION_KEY "toYYYYMM(timestamp)" ENGINE_SORTING_KEY "timestamp" ENGINE_TTL "timestamp + toIntervalDay(60)" IMPORT_SERVICE snowflake IMPORT_CONNECTION_NAME my_snowflake_connection IMPORT_EXTERNAL_DATASOURCE mydb.raw.events IMPORT_SCHEDULE */5 * * * * IMPORT_STRATEGY REPLACE IMPORT_QUERY > select timestamp, id, orderid, status, amount from mydb.raw.events The columns you select in the `IMPORT_QUERY` must match the columns defined in the Data Source schema. For example, if your Data Source has columns `ColumnA, ColumnB` then your `IMPORT_QUERY` must contain `SELECT ColumnA, ColumnB FROM ...` . A mismatch of columns causes data to arrive in the [quarantine Data Source](https://www.tinybird.co/docs/docs/guides/ingesting-data/recover-from-quarantine). #### Push the configuration to Tinybird¶ With your connection created and Data Source defined, you can now push your project to Tinybird using: tb push The first run of the import begins at the next occurrence of the CRON expression. ## Iterate a Snowflake Data Source¶ ### Prerequisites¶ Use the CLI and the version control integration to handle your resources. To use the advantages of version control, connect your Workspace with [your repository](https://www.tinybird.co/docs/production/working-with-version-control#connect-your-workspace-to-git-from-the-cli) , and set the [CI/CD configuration](https://www.tinybird.co/docs/production/continuous-integration). Check the [use case examples](https://github.com/tinybirdco/use-case-examples) repository where you can find basic instructions and examples to handle Snowflake Data Sources iteration using git integration, under the `iterate_snowflake` section. To use the [Tinybird CLI](https://www.tinybird.co/docs/docs/cli/quick-start), check its documentation. For instance, to create the connections in the main-branch Workspace using the CLI: tb auth # use the main Workspace admin Token tb connection create snowflake # these prompts are interactive and will ask you to insert the necessary information You can only create connections in the main Workspace. When creating the connection in a Branch, it's created in the main Workspace and from there is available to every Branch. For testing purposes, use connections that are different from the ones in the main Workspace. ### Add a new Snowflake Data Source¶ You can add a new Data Source directly with the UI or the CLI tool, following [the load of a Snowflake table section](https://www.tinybird.co/docs/about:blank#load-a-snowflake-table). This works for testing purposes, but doesn't carry any connection details. You must add the connection and Snowflake configuration in the .datasource file when moving to production. To add a new Data Source using the recommended version control workflow check the instructions in the [examples repository](https://github.com/tinybirdco/use-case-examples/tree/main/iterate_snowflake). ### Update a Data Source¶ - Snowflake Data Sources can't be modified directly from the UI - When you create a new Tinybird Branch, the existing Snowflake Data Sources won't be connected. You need to re-create them in the Branch.
- In Branches, it's usually useful to work with[ fixtures](https://www.tinybird.co/docs/production/implementing-test-strategies#fixture-tests) , as they'll be applied as part of the CI/CD, allowing the full process to be deterministic in every iteration and avoiding quota consume from external services. Snowflake Data Sources can be modified from the CLI tool: tb auth # modify the .datasource Datafile with your editor tb push --force {datafile} # check the command output for errors To update it using the recommended version control workflow check the instructions in the [examples repository](https://github.com/tinybirdco/use-case-examples/tree/main/iterate_snowflake). ### Delete a Data Source¶ Snowflake Data Sources can be deleted directly from UI or CLI like any other Data Source. To delete it using the recommended version control workflow check the instructions in the [examples repository](https://github.com/tinybirdco/use-case-examples/tree/main/iterate_snowflake). ## Logs¶ Job executions are logged in the `datasources_ops_log` [Service Data Source](https://www.tinybird.co/docs/docs/monitoring/service-datasources) . You can check this log directly in the Data Source view page in the UI. Filter by `datasource_id` to monitor ingestion through the Snowflake Connector from the `datasources_ops_log`: SELECT timestamp, event_type, result, error, job_id FROM tinybird.datasources_ops_log WHERE datasource_id = 't_1234' AND event_type = 'replace' ORDER BY timestamp DESC ## Schema evolution¶ The Snowflake Connector supports backwards compatible changes made in the source table. This means that, if you add a new column in Snowflake, the next sync job automatically adds it to the Tinybird Data Source. Non-backwards compatible changes, such as dropping or renaming columns, aren't supported and might cause the next sync to fail. ## Limits¶ See [Snowflake Connector limits](https://www.tinybird.co/docs/docs/support/limits#snowflake-connector-limits). --- URL: https://www.tinybird.co/docs/integrations Last update: 2024-10-10T10:03:16.000Z Content: --- title: "Integrations · Tinybird Docs" theme-color: "#171612" description: "Tinybird allows you to create Data Sources that ingest data from many integrations." --- # Integrations¶ You can create Data Sources that ingest data into Tinybird from many different integrations. You can also send data to Tinybird from your application using the [Events API](https://www.tinybird.co/docs/docs/ingest/events-api). Looking for an integration and don’t see it? [Contact us](https://www.tinybird.co/contact-us) or join our [Slack community](https://www.tinybird.co/docs/docs/community) to provide feedback. 
## Core Integrations¶ ## Streaming Data Sources - ### Apache Kafka [ Kafka Connector](https://www.tinybird.co/docs/ingest/kafka) ,[ Screencast](https://www.tinybird.co/docs/screencasts?video=create-rest-apis-from-kafka-streams-in-minutes) ,[ Free Training](https://www.tinybird.co/docs/live/from-kafka-to-analytics-april-2024) - ### Confluent Cloud [ Confluent Connector](https://www.tinybird.co/docs/ingest/confluent) ,[ Screencast](https://www.tinybird.co/docs/screencasts?video=publish-apis-over-confluent-streams) ,[ Screencast: User-Facing Apps](https://www.tinybird.co/docs/screencasts?video=build-real-time-user-facing-applications-with-tinybird-and-confluent) ,[ Tutorial: User-Facing Dashboards](https://www.tinybird.co/docs/use-cases/user-facing-dashboards) ,[ Tutorial: Web Analytics](https://www.tinybird.co/docs/use-cases/web-analytics) - ### Google Pub/Sub [ Guide](https://www.tinybird.co/docs/guides/ingesting-data/ingest-from-google-pubsub) - ### AWS Kinesis [ Guide](https://www.tinybird.co/docs/guides/ingesting-data/ingest-from-aws-kinesis) - ### Redpanda [ Redpanda Connector](https://www.tinybird.co/docs/ingest/redpanda) ## Data Warehouses and Data Lakes - ### Google BigQuery [ BigQuery Connector](https://www.tinybird.co/docs/ingest/bigquery) ,[ Screencast](https://www.tinybird.co/docs/screencasts?video=sync-bigquery-tables-with-the-bigquery-connector) ,[ Tutorial: User-Facing Dashboards](https://www.tinybird.co/docs/use-cases/user-facing-dashboards) ,[ Tutorial: Web Analytics](https://www.tinybird.co/docs/use-cases/web-analytics) - ### Snowflake [ Snowflake Connector](https://www.tinybird.co/docs/ingest/snowflake) ,[ Guide](https://www.tinybird.co/docs/guides/ingesting-data/ingest-from-snowflake-via-unloading) ,[ Screencast](https://www.tinybird.co/docs/screencasts?video=publish-real-time-apis-over-snowflake-data) ## Relational Databases - ### Postgres Table Function [ Postgres Table Function](https://www.tinybird.co/docs/ingest/postgresql) ,[ Launch Blog Post](https://www.tinybird.co/blog-posts/postgresql-table-function-announcement) ## Document Databases - ### Amazon DynamoDB [ DynamoDB Connector](https://www.tinybird.co/docs/ingest/dynamodb) ,[ Guide](https://www.tinybird.co/docs/guides/ingesting-data/ingest-from-dynamodb) - ### MongoDB [ Guide](https://www.tinybird.co/docs/guides/ingesting-data/ingest-from-mongodb) ## Files and Object Storage - ### Amazon S3 [ Amazon S3 Connector](https://www.tinybird.co/docs/ingest/s3) ,[ Amazon S3 Sink](https://www.tinybird.co/docs/publish/s3-sink) - ### Google Cloud Storage [ Guide](https://www.tinybird.co/docs/guides/ingesting-data/ingest-from-google-gcs) - ### CSV Files [ Guide](https://www.tinybird.co/docs/guides/ingesting-data/ingest-from-csv-files) --- URL: https://www.tinybird.co/docs/monitoring/health-checks Last update: 2024-11-13T14:29:47.000Z Content: --- title: "Health checks · Tinybird Docs" theme-color: "#171612" description: "Tinybird is built around the idea of data that changes or grows continuously. Use the built-in Tinybird tools to monitor your data ingestion and API Endpoint processes." --- # Health checks¶ Tinybird is built around the idea of data that changes or grows continuously. Use the built-in Tinybird tools to monitor your data ingestion and API Endpoint processes. ## Data Source health¶ Once you have fixed all the possible errors in your source files, matched the Data Source schema to your needs and done the on-the-fly transformations (if needed), at some point, you'll start ingesting data periodically. 
Knowing the status of your ingestion processes will be key. ### Data Sources Log¶ From the 'Data Sources log' in your Dashboard, you can check whether there are new rows in quarantine, if jobs are failing or if there is any other problem. ![The Data Sources log shows you the operations performed on your data](/docs/_next/image?url=%2Fdocs%2Fimg%2Fon-the-fly-health-checks-1.png&w=3840&q=75) In addition to the tools we provide in our User Interface (UI), there are powerful tools that you can use for advanced monitoring. ### Operations Log¶ By clicking on an individual Data Source in the left-hand panel you can see the size of the Data Source, the number of rows, the number of rows in the [quarantine Data Source](https://www.tinybird.co/docs/docs/guides/ingesting-data/recover-from-quarantine) (if any) and when it was last updated. The Operations log contains details of the events for the Data Source, which are displayed as the results of the query. ## Service Data Sources for continuous monitoring¶ [Service Data Sources](https://www.tinybird.co/docs/docs/monitoring/service-datasources) can help you with ingestion health checks. They can be used like any other Data Source in your Workspace, which means you can create API Endpoints to monitor your ingestion processes. Querying the 'tinybird.datasources_ops_log' directly, you can, for example, list your ingest processes during the last week: ##### LISTING INGESTIONS IN THE LAST 7 DAYS SELECT * FROM tinybird.datasources_ops_log WHERE toDate(timestamp) > now() - INTERVAL 7 DAY ORDER BY timestamp DESC This query calculates the percentage of failed operations and of quarantined rows for a given period of time: ##### CALCULATE % OF ROWS THAT WENT TO QUARANTINE SELECT countIf(result != 'ok') / count() * 100 percentage_failed, sum(rows_quarantine) / sum(rows) * 100 quarantined_rows FROM tinybird.datasources_ops_log This query monitors the average duration of your periodic ingestion processes for a given Data Source: ##### CALCULATING AVERAGE INGEST DURATION SELECT avg(elapsed_time) avg_duration FROM tinybird.datasources_ops_log WHERE datasource_id = 't_8417d5126ed84802aa0addce7d1664f2' If you want to configure or build an external service that monitors these metrics, you just need to create an API Endpoint and raise an alert when passing a threshold. When you receive an alert, you can check the quarantine Data Source or the Operations log to see what's going on and fix your source files or ingestion processes. ## Monitoring API Endpoints¶ You can use the [Service Data Sources](https://www.tinybird.co/docs/docs/monitoring/service-datasources) 'pipe_stats' and 'pipe_stats_rt' to monitor the performance of your API Endpoints. Every request to a Pipe is logged to 'tinybird.pipe_stats_rt' and kept in this Data Source for the last week. This example API Endpoint aggregates the statistics for each hour for the selected Pipe. ##### PIPE\_STATS\_RT\_BY\_HR SELECT toStartOfHour(start_datetime) as hour, count() as view_count, round(avg(duration), 2) as avg_time, arrayElement(quantiles(0.50)(duration),1) as quantile_50, arrayElement(quantiles(0.95)(duration),1) as quantile_95, arrayElement(quantiles(0.99)(duration),1) as quantile_99 FROM tinybird.pipe_stats_rt WHERE pipe_id = 'PIPE_ID' GROUP BY hour ORDER BY hour 'pipe_stats' contains statistics about your Pipe Endpoints' API calls aggregated per day using intermediate states.
##### PIPE\_STATS\_BY\_DATE SELECT date, sum(view_count) view_count, sum(error_count) error_count, avgMerge(avg_duration_state) avg_time, quantilesTimingMerge(0.9, 0.95, 0.99)(quantile_timing_state) quantiles_timing_in_millis_array FROM tinybird.pipe_stats WHERE pipe_id = 'PIPE_ID' GROUP BY date ORDER BY date API Endpoints such as these can be used to raise alerts for further investigation whenever statistics pass certain thresholds. To see how Pipes and Data Sources health can be monitored in a dashboard have a look at the blog [Operational Analytics in Real Time with Tinybird and Retool](https://www.tinybird.co/blog-posts/service-data-sources-and-retool). --- URL: https://www.tinybird.co/docs/monitoring/jobs Last update: 2024-07-31T16:54:46.000Z Content: --- title: "Monitor jobs in your Workspace · Tinybird Docs" theme-color: "#171612" description: "Many of the operations you can run in your Workspace are executed using jobs. jobs_log provides you with an overview of all your jobs." --- # Monitor jobs in your Workspace¶ ## What are jobs?¶ Many operations in your Tinybird Workspace, like Imports, Copy Jobs, Sinks, and Populates, are executed as **background jobs** within the platform. When you trigger these operations via the Tinybird API, they are queued and processed asynchronously on Tinybird's infrastructure. This means the API request itself completes quickly, while the actual operation runs in the background and finishes slightly later. This approach ensures that the system can handle a large volume of requests efficiently without causing timeouts or delays in your workflow. Monitoring and managing these jobs (for instance, querying job statuses, types, and execution details) is essential for maintaining a healthy Workspace. The two mechanisms for generic job monitoring are the [Jobs API](https://www.tinybird.co/docs/docs/api-reference/jobs-api) and the `jobs_log` Data Source. The Jobs API and the `jobs_log` return identical information about job execution. However, the Jobs API has some limitations: It reports only on a single Workspace, returns only 100 records, from the last 48 hours. If you want to monitor jobs outside these parameters, use the `jobs_log` Data Source. You can also track more specific things using dedicated Service Data Sources, such as `datasources_ops_log` for import, replaces, or copy, or `sinks_ops_log` for sink operations, or tracking jobs across [Organizations](https://www.tinybird.co/docs/docs/monitoring/organizations) with `organization.jobs_log` . Read the [Service Data Sources docs](https://www.tinybird.co/docs/docs/monitoring/service-datasources) for more. ## Track a specific job¶ ### Jobs API¶ The Jobs API is a convenient way to programmatically check the status of a job. By sending an HTTP GET request, you can retrieve detailed information about a specific job. This method is particularly useful for integration into scripts or applications. curl \ -X GET "https://$TB_HOST/v0/jobs/{job_id}" \ -H "Authorization: Bearer $TOKEN" Replace `{job_id}` with the actual job ID. Replace the Tinybird API hostname/region with the [right API URL region](https://www.tinybird.co/docs/docs/api-reference/overview#regions-and-endpoints) that matches your Workspace. Your Token lives in the Workspace under "Tokens". ### SQL¶ Alternatively, you can use SQL to query the `jobs_log` Data Source from directly within a Tinybird Pipe. 
This method is ideal for users who are comfortable with SQL and prefer to run queries directly against the data, and then expose them with an Endpoint or perform any other actions with it. SELECT * FROM tinybird.jobs_log WHERE job_id='{job_id}' Replace `{job_id}` with the actual job ID. This query retrieves all columns for the specified job, providing comprehensive details about its execution. ## Track specific job types¶ Tracking jobs by type allows you to monitor and analyze all jobs of a certain category, such as all `copy` jobs. This can help you understand the performance and status of specific job types across your entire Workspace. ### Jobs API¶ Using the Jobs API, fetch all jobs of a specific type by making an HTTP GET request: curl \ -X GET "https://$TB_HOST/v0/jobs?kind=copy" \ -H "Authorization: Bearer $TOKEN" Replace `copy` with the type of job you want to track. Ensure you have set your Tinybird host ( `$TB_HOST` ) and authorization token ( `$TOKEN` ) correctly. ### SQL¶ Alternatively, run an SQL query to fetch all jobs of a specific type from the `jobs_log` Data Source: SELECT * FROM tinybird.jobs_log WHERE job_type='copy' Replace `copy` with the desired job type. This query retrieves all columns for jobs of the specified type. ## Track ongoing jobs¶ To keep track of jobs that are currently running, you can query the status of jobs in progress. This helps in monitoring the real-time workload and managing system performance. ### Jobs API¶ By making an HTTP GET request to the Jobs API, you can fetch all jobs that are currently in the `working` status: curl \ -X GET "https://$TB_HOST/v0/jobs?status=working" \ -H "Authorization: Bearer $TOKEN" This command retrieves jobs that are actively running. Ensure you have set your Tinybird host ( `$TB_HOST` ) and authorization token ( `$TOKEN` ) correctly. ### SQL¶ You can also use an SQL query to fetch currently running jobs from the `jobs_log` Data Source: SELECT * FROM tinybird.jobs_log WHERE status='working' This query retrieves all columns for jobs with the status `working` , allowing you to monitor ongoing operations. ## Track errored jobs¶ Tracking errored jobs is crucial for identifying and resolving issues that may arise during job execution. Jobs API and/or SQL queries to `jobs_log` will help you to efficiently monitor jobs that errored during the execution. ### Jobs API¶ The Jobs API allows you to programmatically fetch details of jobs that have ended in error. Use the following `curl` command to retrieve all jobs that have a status of `error`: curl \ -X GET "https://$TB_HOST/v0/jobs?status=error" \ -H "Authorization: Bearer $TOKEN" This command fetches a list of jobs that are currently in an errored state, providing details that can be used for further analysis or debugging. Ensure you've set your Tinybird host ( `$TB_HOST` ) and authorization token ( `$TOKEN` ) correctly. ### SQL¶ Alternatively, you can use SQL to query the `jobs_log` Data Source directly. 
Use the following SQL query to fetch job IDs, job types, and error messages for jobs that have encountered errors in the past day: SELECT job_id, job_type, error FROM tinybird.jobs_log WHERE status='error' AND created_at > now() - INTERVAL 1 DAY ### Track success rate¶ Extrapolating from errored jobs, you can also use `jobs_log` to calculate the success rate of your Workspace jobs: SELECT job_type, pipe_id, countIf(status='done') AS job_success, countIf(status='error') AS job_error, job_success / (job_success + job_error) as success_rate FROM tinybird.jobs_log WHERE created_at > now() - INTERVAL 1 DAY GROUP BY job_type, pipe_id ## Get job execution metadata¶ In the `jobs_log` Data Source, there is a property called `job_metadata` that contains metadata related to job executions. This includes the execution type (manual or scheduled) for Copy and Sink jobs, or the count of quarantined rows for Append operations, along with many other properties. You can extract and analyze this metadata using JSON functions within SQL queries. This allows you to gain valuable information about job executions directly from the `jobs_log` Data Source. The following SQL query is an example of how to extract specific metadata fields from the `job_metadata` property, such as the import mode and counts of quarantined rows and invalid lines, and how to aggregate this data for analysis: SELECT job_type, JSONExtractString(job_metadata, 'mode') AS import_mode, sum(simpleJSONExtractUInt(job_metadata, 'quarantine_rows')) AS quarantine_rows, sum(simpleJSONExtractUInt(job_metadata, 'invalid_lines')) AS invalid_lines FROM tinybird.jobs_log WHERE job_type='import' AND created_at >= toStartOfDay(now()) GROUP BY job_type, import_mode There are many other use cases you can put together with the properties in the `job_metadata` ; see below. ## Advanced use cases¶ Beyond basic tracking, you can leverage the `jobs_log` Data Source for more advanced use cases, such as gathering statistics and performance metrics. This can help you optimize job scheduling and resource allocation. ### Get queue status¶ The following SQL query returns the number of jobs that are waiting to be executed, the number of jobs that are in progress, and how many of them are done already: SELECT job_type, countIf(status='waiting') AS jobs_in_queue, countIf(status='working') AS jobs_in_progress, countIf(status='done') AS jobs_succeeded, countIf(status='error') AS jobs_errored FROM tinybird.jobs_log WHERE created_at > now() - INTERVAL 1 DAY GROUP BY job_type ### Get statistics on run time grouped by type of job¶ The following SQL query calculates the maximum, minimum, median, and p95 running time (in seconds) grouped by type of job (e.g. import, copy, sinks) over the past day. This helps in understanding the efficiency of different job types: SELECT job_type, max(date_diff('s', started_at, updated_at)) as max_run_time_in_secs, min(date_diff('s', started_at, updated_at)) as min_run_time_in_secs, median(date_diff('s', started_at, updated_at)) as median_run_time_in_secs, quantile(0.95)(date_diff('s', started_at, updated_at)) as p95_run_time_in_secs FROM tinybird.jobs_log WHERE created_at > now() - INTERVAL 1 DAY GROUP BY job_type ### Get statistics on queue time by type of job¶ The following SQL query calculates the average queue time (in seconds) for a specific type of job (e.g., copy) over the past day. 
This can help in identifying bottlenecks in job scheduling: SELECT job_type, max(date_diff('s', created_at, started_at)) as max_queue_time_in_secs, min(date_diff('s', created_at, started_at)) as min_queue_time_in_secs, median(date_diff('s', created_at, started_at)) as median_queue_time_in_secs, quantile(0.95)(date_diff('s', created_at, started_at)) as p95_queue_time_in_secs FROM tinybird.jobs_log WHERE created_at > now() - INTERVAL 1 DAY GROUP BY job_type ### Get statistics on job completion rate¶ The following SQL query calculates the success rate by type of job (e.g., copy) and Pipe over the past day. This can help you assess the reliability and efficiency of your workflows by measuring the completion rate of the jobs, and find potential issues and areas for improvement: SELECT job_type, pipe_id, countIf(status='done') AS job_success, countIf(status='error') AS job_error, job_success / (job_success + job_error) as success_rate FROM tinybird.jobs_log WHERE created_at > now() - INTERVAL 1 DAY GROUP BY job_type, pipe_id ### Get statistics on the amount of manual vs. scheduled job runs¶ The following SQL query compares the number of manual and scheduled job runs. Understanding the distribution of manually executed jobs versus scheduled jobs can help you spot on-demand runs triggered outside the regular schedule: SELECT job_type, countIf(JSONExtractString(job_metadata, 'execution_type')='manual') AS job_manual, countIf(JSONExtractString(job_metadata, 'execution_type')='scheduled') AS job_scheduled FROM tinybird.jobs_log WHERE job_type='copy' AND created_at > now() - INTERVAL 1 DAY GROUP BY job_type ## Next steps¶ - Read up on the `jobs_log` Service Data Source specification. - Learn how to[ monitor your Workspace ingestion](https://www.tinybird.co/docs/docs/guides/monitoring/monitor-your-ingestion) . --- URL: https://www.tinybird.co/docs/monitoring/latency Last update: 2024-11-06T17:38:37.000Z Content: --- title: "How to measure API Endpoint latency · Tinybird Docs" theme-color: "#171612" description: "Latency is an essential metric to monitor in real-time applications. In this guide, you'll learn how to measure and monitor the latency of your API Endpoints in Tinybird." --- # Latency¶ Latency is an essential metric to monitor in real-time applications. This page explains how latency is measured in Tinybird, and how to monitor and visualize the latency of your [API Endpoints](https://www.tinybird.co/docs/publish/api-endpoints/overview) when data is being retrieved. ## What is latency?¶ Latency is the time it takes for a request to travel from the client to the server and back; the time it takes for a request to be sent and received. Latency is usually measured in seconds or milliseconds (ms). The lower the latency, the faster the response time. ## How latency is measured¶ When measuring latency in an end-to-end application, you consider data ingestion, data transformation, and data retrieval. In Tinybird, thanks to the exceptional speed of ClickHouse® ingestion and real-time Materialized Views, the data freshness is guaranteed. Putting this all together: In Tinybird, latency is measured as the time it takes for a request to be sent, processed in ClickHouse (very fast), and the response to be sent back to the client. When calling an API Endpoint, you can check this metric defined as `elapsed` in the `statistics` object of the response: ##### Statistics object within an example Tinybird API Endpoint call { "meta": [ ... ], "data": [ ...
], "rows": 10, "statistics": { "elapsed": 0.001706275, "rows_read": 10, "bytes_read": 180 } } ## How to monitor latency¶ To monitor the latency of your API Endpoints, use the `pipe_stats_rt` and `pipe_stats` [Service Data Sources](https://www.tinybird.co/docs/docs/monitoring/service-datasources): - `pipe_stats_rt` consists of the real-time statistics of your API Endpoints, and has a `duration` field that encapsulates the latency time in seconds. - `pipe_stats` contains the** aggregated** statistics of your API Endpoints by date, and presents a `avg_duration_state` field which is the average duration of the API Endpoint by day in seconds. Because the `avg_duration_state` field is an intermediate state, you'd need to merge it when querying the Data Source using something like `avgMerge`. More for details on building Pipes and Endpoints that monitor the performance of your API Endpoints using the `pipe_stats_rt` and `pipe_stats` Data Sources, follow the [API Endpoint performance guide](https://www.tinybird.co/docs/docs/guides/monitoring/analyze-endpoints-performance#example-2-analyzing-the-performance-of-api-endpoints-over-time). ## How to visualize latency¶ Visualizing the latency of your API Endpoints can make it easier to see the at-a-glance overview. Tinybird has built-in tools to help you do this: Time Series is a quick option for output that lives internally in your Workspace, and Charts give you the option to embed in an external application. ### Time Series¶ In your Workspace, you can create [a Time Series](https://www.tinybird.co/docs/docs/query/overview#time-series) to visualize the latency of your API Endpoints over time. You just need to point to `pipe_stats_rt` and select `duration` and `start_datetime` , or point to or `pipe_stats` and select `avgMerge(avg_duration_state)` and `date`. <-figure-> ![Example Tinybird Time Series showing pipe\_stats\_analysis](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Ftime-series-latency.png&w=3840&q=75) ### Charts¶ If you want to expose your latency metrics in your own application, you can use a Tinybird-generated [Chart](https://www.tinybird.co/docs/docs/publish/charts) to expose the results of an Endpoint that queries the `pipe_stats_rt` or `pipe_stats` Data Source. Then, you can embed the Chart into your application by using the `iframe` code. <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fcharts-latency.png&w=3840&q=75) ## Next steps¶ - Optimize even further by[ monitoring your ingestion](https://www.tinybird.co/docs/docs/guides/monitoring/monitor-your-ingestion) . - Read this blog on[ Monitoring global API latency](https://www.tinybird.co/blog-posts/dev-qa-global-api-latency-chronark) . Tinybird is not affiliated with, associated with, or sponsored by ClickHouse, Inc. ClickHouse® is a registered trademark of ClickHouse, Inc. --- URL: https://www.tinybird.co/docs/monitoring/organizations Last update: 2024-11-13T11:01:06.000Z Content: --- title: "Organizations · Tinybird Docs" theme-color: "#171612" description: "Tinybird Organizations provide enterprise customers with a single pane of glass to monitor usage across multiple Workspaces." --- # Monitor your organization in Tinybird¶ The Tinybird organizations feature is only available to Enterprise or Dedicated plan customers (see ["Tinybird plans"](https://www.tinybird.co/docs/docs/plans) ). Use the organizations section to monitor their consolidated Tinybird usage in one place. 
The section provides a single dashboard for the overview of all Workspaces within an organization, and unlocks specific Workspace-level actions. It includes dashboards for monitoring consumption of storage and processed data, Workspaces, and organization members. Usage data is also available through an HTTP API. The organizations section consists of the following areas: - Overview - Workspaces - Members - Monitoring ## Access the organizations section¶ To access the organizations UI, log in as an administrator and select your Workspace name, then select your organization name. To add another user as an organization administrator, follow these steps: 1. Navigate to the** Your organization** page. 2. Go to the** Members** section. 3. Locate the user you want to make an administrator. 4. Select** Organization Admin** next to their name. This grants administrator access to the selected users. ## Usage overview¶ The **Usage** page shows details about your platform usage against your billing plan commitment followed by a detailed breakdown of your consumption. Only billable Workspaces are in this view. Find non-billable Workspaces in the **Workspaces** tab. ### Processed data¶ The first metric shows an aggregated summary of your processed data. This is aggregated across all billable Workspaces included in your plan. Processed data is cumulative over the plan's billing period. ### Storage¶ The second metric shows an aggregated summary of your current storage. This is aggregated across all billable Workspaces included in your plan. Storage is the maximum storage used in the past day. ### Contract¶ The third metric shows the details of your current contract, including the plan type and start/end dates of your plan period. If your plan includes usage limits, for example commitments on an Enterprise plan, your commitment details are also shown here. For both the **Processed data** and **Storage** metrics, the summary covers the current billing period. For **Enterprise** plans this covers the term of your current contract. For monthly plans, it's the current month. After the summary, the page shows a breakdown of Processed data and Storage per Workspace and Data Source. The calculation of these metrics is the same as previously explained for the summary section, but on an individual basis. ### Organization Service Data Sources¶ The Charts in the Overview page get their data from Organization Service Data Sources. The complete list of available ones is: - `organization.workspaces` : lists all Organization Workspaces and related information (name, IDs, databases, plan, when it was created, and whether it has been soft-deleted). - `organization.processed_data` : information related to all processed data per day per workspace. - `organization.datasources_storage` : equivalent to tinybird.datasources_storage but with data for all Organization Workspaces. - `organization.pipe_stats` : equivalent to tinybird.pipe_stats but with data for all Organization Workspaces. - `organization.pipe_stats_rt` : equivalent to tinybird.pipe_stats_rt but with data for all Organization Workspaces. - `organization.datasources_ops_log` : equivalent to tinybird.datasources_ops_log but with data for all Organization Workspaces. - `organization.data_transfer` : equivalent to tinybird.data_transfer but with data for all Organization Workspaces. - `organization.jobs_log` : equivalent to tinybird.jobs_log but with data for all Organization Workspaces. 
- `organization.sinks_ops_log` : equivalent to tinybird.sinks_ops_log but with data for all Organization Workspaces. - `organization.bi_stats` : equivalent to tinybird.bi_stats but with data for all Organization Workspaces. - `organization.bi_stats_rt` : equivalent to tinybird.bi_stats_rt but with data for all Organization Workspaces. Only Organization Admins are able to run these queries. To query these Organization Service Data Sources, go to any Workspace that belongs to the Organization, and use these as you would a regular Service Data Source from the Playground or within Pipes. Use the admin `user@domain` Token of an Organization Admin. You can also copy your user admin Token and make queries using your preferred method, like `tb sql`. ## Workspaces¶ This page displays details of all your Workspaces, their consumption, and whether they're billable or not. Using the date range selector at the top of the page, you can adjust the time of the data displayed in the table. The table shows the following information: - ** Workspace name** - ** Processed data** : Processed data is cumulative over the selected time range. - ** Storage** : Storage is the maximum storage used on the last day of the selected time range. - ** Plan type** : Billable or free. Usage in a billable Workspace counts towards your billing plan. New Workspaces that are created by a user with an email domain linked to (or matching) an Organization are automatically added to that Organization. The new Workspace then automatically shows up here in your Organization's Consumption metrics and listed Workspaces. If you encounter any challenges with creating a new Workspace in your Organization, contact [support@tinybird.co](https://www.tinybird.co/docs/mailto:support@tinybird.co). To delete Workspace, select the checkbox of a Workspace name, followed by the **Delete** button. You don't need to be a user in that Workspace to delete it. ## Members¶ **Members** shows details of your Organization members, the Workspaces they belong to, and their roles. User roles: - ** Admins** can do everything in the Workspace. - ** Guests** can do most things, but they can't delete Workspaces, invite or remove users, or share Data Sources across Workspaces. - ** Viewers** can't edit anything in the main Workspace[ Branch](https://www.tinybird.co/docs/docs/concepts/branches) , but they can use[ Playgrounds](https://www.tinybird.co/docs/docs/query/overview#use-the-playground) to query the data, as well as create or edit Branches. The table shows the following information: - Email - Workspaces and roles To view the detail of a member’s Workspaces and roles, select the arrow next to the Workspace count. A menu shows all the Workspaces that user is part of, plus their role in each Workspace. To change a user’s role or remove them from a Workspace, hover over the Workspace name and follow the arrow. Select a new role from **Admin**, **Guest** , or **Viewer** , or remove them from the Workspace. You don't need to be a user in that Workspace to make changes to its users. As mentioned, you can also make a user an organization admin from this page. To remove a user from the organization, select **Remove member** in the menu. You can see if there are Workspaces where that user is the only admin and if the Token associated to the email has had activity in the last 7 days. ## Monitoring endpoints¶ To monitor the usage of your Organization use the [Organization Service Data Sources](https://www.tinybird.co/docs/about:blank#organization-service-data-sources). 
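For example, the following sketch queries `organization.pipe_stats_rt` from the Playground of any Workspace in the Organization, using an Organization Admin Token, to summarize API Endpoint traffic across all Workspaces over the last day. It assumes the Organization-level Data Source exposes the same columns as `tinybird.pipe_stats_rt` (such as `start_datetime`, `pipe_name`, `duration`, `error`, and `read_bytes`), as described in the Service Data Sources docs; adapt it to your own needs:

##### ENDPOINT TRAFFIC ACROSS ALL WORKSPACES (LAST 24 HOURS)

-- Assumes organization.pipe_stats_rt mirrors the tinybird.pipe_stats_rt schema
SELECT
    pipe_name,
    count() AS requests,
    countIf(error = 1) AS error_count,
    round(avg(duration), 3) AS avg_duration_s,
    sum(read_bytes) AS total_read_bytes
FROM organization.pipe_stats_rt
WHERE start_datetime > now() - INTERVAL 1 DAY
GROUP BY pipe_name
ORDER BY requests DESC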
The endpoints page shows details about the APIs that allow you to export, or integrate external tools with, your usage data. There are two APIs available: ### Processed data¶ The first API shows a daily aggregated summary of your processed data per Workspaces. ### Storage¶ The last API shows a daily aggregated summary of your current storage per Workspaces. You can select the time by editing the parameters in the URL. **Processed data** | Field | Type | Description | | --- | --- | --- | | day | DateTime | Day of the record | | workspace_id | String | ID of the Workspace. | | read_bytes | UInt64 | Bytes read in the Workspace that day | | written_bytes | UInt64 | Bytes written in the Workspace that day | **Storage** | Field | Type | Description | | --- | --- | --- | | day | DateTime | Day of the record | | workspace_id | String | ID of the Workspace. | | bytes | UInt64 | Maximum Bytes stored in the Workspace that day | | bytes_quarantine | UInt64 | Maximum Bytes stored in the Workspace quarantine that day | ## Dedicated infrastructure monitoring¶ The following features are in public beta and may change without notice. If you have feedback or suggestions, contact Tinybird at [support@tinybird.co](https://www.tinybird.co/docs/mailto:support@tinybird.co) or in the [Community Slack](https://www.tinybird.co/docs/docs/community). If your organization is on an infrastructure commitment plan, Tinybird offers two ways of monitoring the state of your dedicated clusters: using the `organization.metrics_logs` service Data Source, or through the Prometheus endpoint `/v0/metrics` , which you can integrate with the observability platform of your choice. ### Billing dashboard¶ You can track your credits usage from the **Billing** section under **Your organization** . The dashboard shows your cumulative credits usage and estimated trend against the total, and warns you if you're about to run out of credits. For more details, you can access your customer portal using the direct link. <-figure-> ![image](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fcredits-usage.png&w=3840&q=75) ### Cluster load chart¶ You can check the current load of your clusters using the chart under **Your organization**, **Usage** . Select a cluster in the menu to see all its hosts, then select a time. Each line represents the CPU usage of the host. When you select a host, the dotted line represents the total amount of CPUs available to the cluster. <-figure-> ![image](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fcluster-load-chart.png&w=3840&q=75) ### metrics\_logs Service Data Source¶ The `metrics_logs` Service Data Source is available in all the organization's workspaces. As with the rest of Organization Service Data Sources, it's only available to Organization administrators. New records for each of the metrics monitored are added every minute with the following schema: | Field | Type | Description | | --- | --- | --- | | timestamp | DateTime | Timestamp of the metric | | cluster | LowCardinality(String) | Name of the cluster | | host | LowCardinality(String) | Name of the host | | metric | LowCardinality(String) | Name of the metric | | value | String | Value of the metric | | description | LowCardinality(String) | Description of the metric | | organization_id | String | ID of your organization | The available metrics are the following: | Metric | Description | | --- | --- | | MemoryTracking | Total amount of memory, in bytes, allocated by the server. 
| | OSMemoryTotal | The total amount of memory on the host system, in bytes. | | InstanceType | Instance type of the host . | | Query | Number of executing queries. | | NumberCPU | Number of CPUs. | | LoadAverage1 | The whole system load, averaged with exponential smoothing over 1 minute. The load represents the number of threads across all the processes (the scheduling entities of the OS kernel), that are currently running by CPU or waiting for IO, or ready to run but not being scheduled at this point of time. This number includes all the processes, not only clickhouse-server. The number can be greater than the number of CPU cores, if the system is overloaded, and many processes are ready to run but waiting for CPU or IO. | | LoadAverage15 | The whole system load, averaged with exponential smoothing over 15 minutes. The load represents the number of threads across all the processes (the scheduling entities of the OS kernel), that are currently running by CPU or waiting for IO, or ready to run but not being scheduled at this point of time. This number includes all the processes, not only clickhouse-server. The number can be greater than the number of CPU cores, if the system is overloaded, and many processes are ready to run but waiting for CPU or IO. | | CPUUsage | The ratio of time the CPU core was running OS kernel (system) code or userspace code. This is a system-wide metric, it includes all the processes on the host machine, not just clickhouse-server. This includes also the time when the CPU was under-utilized due to the reasons internal to the CPU (memory loads, pipeline stalls, branch mispredictions, running another SMT core). | ### Prometheus metrics endpoint¶ The Prometheus endpoint is available at `/v0/metrics` . Use your organization's observability Token, which you can find in the **Prometheus** tab under **Monitoring**, **Your organization**. Tinybird reports all the metrics available in the service Data Source, except for `InstanceType` , which is reported as a metric label. #### Refreshing your organization's Observability Token¶ If your organization's observability Token gets compromised or is lost, refresh it using the following endpoint: `/v0/organizations//tokens/Observability%20%28builtin%29/refresh?token=` You must use your `user token` for this call, which you can copy from any of your workspaces. Tinybird is not affiliated with, associated with, or sponsored by ClickHouse, Inc. ClickHouse® is a registered trademark of ClickHouse, Inc. --- URL: https://www.tinybird.co/docs/monitoring/service-datasources Last update: 2024-11-18T21:34:21.000Z Content: --- title: "Service Data Sources · Tinybird Docs" theme-color: "#171612" description: "In addition to the Data Sources you upload, Tinybird provides other "Service Data Sources" that allow you to inspect what's going on in your account." --- # Service Data Sources¶ Tinybird provides Service Data Sources that you can use to inspect what's going on in your Tinybird account, diagnose issues, monitor usage, and so on. For example, you can get real time stats about API calls or a log of every operation over your Data Sources. This is similar to using system tables in a database, although Service Data Sources contain information about the usage of the service itself. Queries made to Service Data Sources are free of charge and don't count towards your usage. However, calls to API Endpoints that use Service Data Sources do count towards API rate limits. [Read the billing docs](https://www.tinybird.co/docs/docs/support/billing). 
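As a quick illustration, the following minimal query lists the most recent operations on your Data Sources using `tinybird.datasources_ops_log` (all of its columns are described later on this page). You can run it from a Pipe or the Playground, or publish it as an API Endpoint:

##### LAST OPERATIONS ON YOUR DATA SOURCES

-- Most recent operations first; columns are documented in the tinybird.datasources_ops_log section below
SELECT
    timestamp,
    event_type,
    datasource_name,
    result,
    elapsed_time,
    error
FROM tinybird.datasources_ops_log
ORDER BY timestamp DESC
LIMIT 20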
## Considerations¶ - You can't use Service Data Sources in Materialized View queries. - Pass dynamic[ query parameters](https://www.tinybird.co/docs/docs/query/query-parameters#leverage-dynamic-parameters) to API Endpoints to then query Service Data Sources. - You can only query Organization-level Service Data Sources if you're an administrator. See[ Consumption overview](https://www.tinybird.co/docs/docs/monitoring/organizations#consumption-overview) . ## Service Data Sources¶ The following Service Data Sources are available. ### tinybird.pipe\_stats\_rt¶ Contains information about all requests made to your [API Endpoints](https://www.tinybird.co/docs/docs/publish/api-endpoints/overview) in real time. This Data Source has a TTL of 7 days. If you need to query data older than 7 days you must use the aggregated by day data available at [tinybird.pipe_stats](https://www.tinybird.co/docs/about:blank#tinybird-pipe-stats). | Field | Type | Description | | --- | --- | --- | | `start_datetime` | `DateTime` | API call start date and time. | | `pipe_id` | `String` | Pipe Id as returned in our[ Pipes API](https://www.tinybird.co/docs/docs/api-reference/pipe-api/overview) ( `query_api` in case it is a Query API request). | | `pipe_name` | `String` | Pipe name as returned in our[ Pipes API](https://www.tinybird.co/docs/docs/api-reference/pipe-api/overview) ( `query_api` in case it is a Query API request). | | `duration` | `Float` | API call duration in seconds. | | `read_bytes` | `UInt64` | API call read data in bytes. | | `read_rows` | `UInt64` | API call rows read. | | `result_rows` | `UInt64` | Rows returned by the API call. | | `url` | `String` | URL ( `token` param is removed for security reasons). | | `error` | `UInt8` | `1` if query returned error, else `0` . | | `request_id` | `String` | API call identifier returned in `x-request-id` header. Format is ULID string. | | `token` | `String` | API call token identifier used. | | `token_name` | `String` | API call token name used. | | `status_code` | `Int32` | API call returned status code. | | `method` | `String` | API call method POST or GET. | | `parameters` | `Map(String, String)` | API call parameters used. | | `release` | `String` | Semantic version of the release (deprecated). | | `user_agent` | `Nullable(String)` | User Agent HTTP header from the request. | | `resource_tags` | `Array(String)` | Tags associated with the Pipe when the request was made. | ### tinybird.pipe\_stats¶ Aggregates the request stats in [tinybird.pipe_stats_rt](https://www.tinybird.co/docs/about:blank#tinybird-pipe-stats-rt) by day. | Field | Type | Description | | --- | --- | --- | | `date` | `Date` | Request date and time. | | `pipe_id` | `String` | Pipe Id as returned in our[ Pipes API](https://www.tinybird.co/docs/docs/api-reference/pipe-api/overview) . | | `pipe_name` | `String` | Name of the Pipe. | | `view_count` | `UInt64` | Request count. | | `error_count` | `UInt64` | Number of requests with error. | | `avg_duration_state` | `AggregateFunction(avg, Float32)` | Average duration state in seconds (see[ Querying _state columns](https://www.tinybird.co/docs/about:blank#querying-state-columns) ). | | `quantile_timing_state` | `AggregateFunction(quantilesTiming(0.9, 0.95, 0.99), Float64)` | 0.9, 0.95 and 0.99 quantiles state. Time in milliseconds (see[ Querying _state columns](https://www.tinybird.co/docs/about:blank#querying-state-columns) ). | | `read_bytes_sum` | `UInt64` | Total bytes read. | | `read_rows_sum` | `UInt64` | Total rows read. 
| | `resource_tags` | `Array(String)` | All the tags associated with the resource when the aggregated requests were made. | ### tinybird.bi\_stats\_rt¶ Contains information about all requests to your [BI Connector interface](https://www.tinybird.co/docs/docs/query/bi-connector) in real time. This Data Source has a TTL of 7 days. If you need to query data older than 7 days you must use the aggregated by day data available at [tinybird.bi_stats](https://www.tinybird.co/docs/about:blank#tinybird-bi-stats). | Field | Type | Description | | --- | --- | --- | | `start_datetime` | `DateTime` | Query start timestamp. | | `query` | `String` | Executed query. | | `query_normalized` | `String` | Normalized executed query. This is the pattern of the query, without literals. Useful to analyze usage patterns. | | `error_code` | `Int32` | Error code, if any. `0` on normal execution. | | `error` | `String` | Error description, if any. Empty otherwise. | | `duration` | `UInt64` | Query duration in milliseconds. | | `read_rows` | `UInt64` | Read rows. | | `read_bytes` | `UInt64` | Read bytes. | | `result_rows` | `UInt64` | Total rows returned. | | `result_bytes` | `UInt64` | Total bytes returned. | ### tinybird.bi\_stats¶ Aggregates the stats in [tinybird.bi_stats_rt](https://www.tinybird.co/docs/about:blank#tinybird-bi-stats-rt) by day. | Field | Type | Description | | --- | --- | --- | | `date` | `Date` | Stats date. | | `database` | `String` | Database identifier. | | `query_normalized` | `String` | Normalized executed query. This is the pattern of the query, without literals. Useful to analyze usage patterns. | | `view_count` | `UInt64` | Requests count. | | `error_count` | `UInt64` | Error count. | | `avg_duration_state` | `AggregateFunction(avg, Float32)` | Average duration state in milliseconds (see[ Querying _state columns](https://www.tinybird.co/docs/about:blank#querying-state-columns) ). | | `quantile_timing_state` | `AggregateFunction(quantilesTiming(0.9, 0.95, 0.99), Float64)` | 0.9, 0.95 and 0.99 quantiles state. Time in milliseconds (see[ Querying _state columns](https://www.tinybird.co/docs/about:blank#querying-state-columns) ). | | `read_bytes_sum` | `UInt64` | Total bytes read. | | `read_rows_sum` | `UInt64` | Total rows read. | | `avg_result_rows_state` | `AggregateFunction(avg, Float32)` | Total bytes returned state (see[ Querying _state columns](https://www.tinybird.co/docs/about:blank#querying-state-columns) ). | | `avg_result_bytes_state` | `AggregateFunction(avg, Float32)` | Total rows returned state (see[ Querying _state columns](https://www.tinybird.co/docs/about:blank#querying-state-columns) ). | ### tinybird.block\_log¶ The Data Source contains details about how Tinybird ingests data into your Data Sources. You can use this Service Data Source to spot problematic parts of your data. | Field | Type | Description | | --- | --- | --- | | `timestamp` | `DateTime` | Date and time of the block ingestion. | | `import_id` | `String` | Id of the import operation. | | `job_id` | `Nullable(String)` | Id of the job that ingested the block of data, if it was ingested by URL. In this case, `import_id` and `job_id` must have the same value. | | `request_id` | `String` | Id of the request that performed the operation. In this case, `import_id` and `job_id` must have the same value. Format is ULID string. | | `source` | `String` | Either the URL or `stream` or `body` keywords. | | `block_id` | `String` | Block identifier. 
You can cross this with the `blocks_ids` column from the[ tinybird.datasources_ops_log](https://www.tinybird.co/docs/about:blank#tinybird-datasources-ops-log) Service Data Source. | | `status` | `String` | `done` | `error` . | | `datasource_id` | `String` | Data Source consistent id. | | `datasource_name` | `String` | Data Source name when the block was ingested. | | `start_offset` | `Nullable(Int64)` | The starting byte of the block, if the ingestion was split, where this block started. | | `end_offset` | `Nullable(Int64)` | If split, the ending byte of the block. | | `rows` | `Nullable(Int32)` | How many rows it ingested. | | `parser` | `Nullable(String)` | Whether the native block parser or falling back to row by row parsing is used. | | `quarantine_lines` | `Nullable(UInt32)` | If any, how many rows went into the quarantine Data Source. | | `empty_lines` | `Nullable(UInt32)` | If any, how many empty lines were skipped. | | `bytes` | `Nullable(UInt32)` | How many bytes the block had. | | `processing_time` | `Nullable(Float32)` | How long it took in seconds. | | `processing_error` | `Nullable(String)` | Detailed message in case of error. | When Tinybird ingests data from a URL, it splits the download in several requests, resulting in different ingestion blocks. The same happens when the data upload happens with a multipart request. ### tinybird.datasources\_ops\_log¶ Contains all operations performed to your Data Sources. Tinybird tracks the following operations: | Event | Description | | | --- | --- | | | `create` | A Data Source is created. | | | `sync-dynamodb` | Initial synchronization from a DynamoDB table when using the[ DynamoDB Connector](https://www.tinybird.co/docs/docs/ingest/dynamodb) | | | `append` | Append operation. | | | `append-hfi` | Append operation using the[ High-frequency Ingestion API](https://www.tinybird.co/docs/docs/guides/ingesting-data/ingest-from-the-events-api) . | | | `append-kafka` | Append operation using the[ Kafka Connector](https://www.tinybird.co/docs/docs/ingest/kafka) . | | | `append-dynamodb` | Append operation using the[ DynamoDB Connector](https://www.tinybird.co/docs/docs/ingest/dynamodb) | | | `replace` | A replace operation took place in the Data Source. | | | `delete` | A delete operation took place in the Data Source. | | | `truncate` | A truncate operation took place in the Data Source. | | | `rename` | The Data Source was renamed. | | | `populateview-queued` | A populate operation was queued for execution. | | | `populateview` | A finished populate operation (up to 8 hours after it started). | | | `copy` | A copy operation took place in the Data Source. | | | `alter` | An alter operation took place in the Data Source. | | | `resource_tags` | `Array(String)` | Tags associated with the Pipe when the request was made. | Materializations are logged with same `event_type` and `operation_id` as the operation that triggers them. You can track the materialization Pipe with `pipe_id` and `pipe_name`. Tinybird logs all operations with the following information in this Data Source: | Field | Type | Description | | --- | --- | --- | | `timestamp` | `DateTime` | Date and time when the operation started. | | `event_type` | `String` | Operation being logged. | | `operation_id` | `String` | Groups rows affected by the same operation. Useful for checking materializations triggered by an append operation. | | `datasource_id` | `String` | Id of your Data Source. The Data Source id is consistent after renaming operations. 
You should use the id when you want to track name changes. | | `datasource_name` | `String` | Name of your Data Source when the operation happened. | | `result` | `String` | `ok` | `error` | | `elapsed_time` | `Float32` | How much time the operation took in seconds. | | `error` | `Nullable(String)` | Detailed error message if the result was error. | | `import_id` | `Nullable(String)` | Id of the import operation, if data has been ingested using one of the following operations: `create` , `append` or `replace` | | `job_id` | `Nullable(String)` | Id of the job that performed the operation, if any. If data has been ingested, `import_id` and `job_id` must have the same value. | | `request_id` | `String` | Id of the request that performed the operation. If data has been ingested, `import_id` and `request_id` must have the same value. Format is ULID string. | | `rows` | `Nullable(UInt64)` | How many rows the operations affected. This depends on `event_type` : for the `append` event, how many rows got inserted; for `delete` or `truncate` events, how many rows the Data Source had; for `replace` , how many rows the Data Source has after the operation. | | `rows_quarantine` | `Nullable(UInt64)` | How many rows went into the quarantine Data Source, if any. | | `blocks_ids` | `Array(String)` | List of blocks ids used for the operation. See the[ tinybird.block_log](https://www.tinybird.co/docs/about:blank#tinybird-block-log) Service Data Source for more details. | | `options` | `Nested(Names String, Values String)` | Tinybird stores key-value pairs with extra information for some operations. For the `replace` event, Tinybird uses the `rows_before_replace` key to track how many rows the Data Source had before the replacement happened, the `replace_condition` key shows what condition was used. For `append` and `replace` events, Tinybird stores the data `source` , for example the URL, or body/stream keywords. For `rename` event, `old_name` and `new_name` . For `populateview` you can find there the whole populate `job` metadata as a JSON string. For `alter` events, Tinybird stores `operations` , and dependent pipes as `dependencies` if they exist. | | `read_bytes` | `UInt64` | Read bytes in the operation. | | `read_rows` | `UInt64` | Read rows in the operation. | | `written_rows` | `UInt64` | Written rows in the operation. | | `written_bytes` | `UInt64` | Written bytes in the operation. | | `written_rows_quarantine` | `UInt64` | Quarantined rows in the operation. | | `written_bytes_quarantine` | `UInt64` | Quarantined bytes in the operation. | | `pipe_id` | `String` | If present, materialization Pipe id as returned in our[ Pipes API](https://www.tinybird.co/docs/docs/api-reference/pipe-api/overview) . | | `pipe_name` | `String` | If present, materialization Pipe name as returned in our[ Pipes API](https://www.tinybird.co/docs/docs/api-reference/pipe-api/overview) . | | `release` | `String` | Semantic version of the release (deprecated). | ### tinybird.datasource\_ops\_stats¶ Data from `datasource_ops_log` , aggregated by day. | Field | Type | Description | | | | | --- | --- | --- | | | | | `event_date` | `Date` | Date of the event. | | | | | `workspace_id` | `String` | Unique identifier for the Workspace. | | | | | `event_type` | `String` | Name of your Data Source. | | | | | `pipe_id` | `String` | Identifier of the Pipe. | | | | | `pipe_name` | `String` | Name of the Pipe. | | | | | `error_count` | `UInt64` | Number of requests with an error. | | | | | `executions` | `UInt64` | Number of executions. 
| `avg_elapsed_time_state` | `Float32` | Average elapsed time state (see[ Querying _state columns](https://www.tinybird.co/docs/about:blank#querying-state-columns) ). | | `quantiles_state` | `Float32` | 0.9, 0.95 and 0.99 quantiles state. Time in milliseconds (see[ Querying _state columns](https://www.tinybird.co/docs/about:blank#querying-state-columns) ). | | `read_bytes` | `UInt64` | Read bytes in the operation. | | `read_rows` | `UInt64` | Read rows in the operation. | | `written_rows` | `UInt64` | Written rows in the operation. | | `written_bytes` | `UInt64` | Written bytes in the operation. | | `written_rows_quarantine` | `UInt64` | Quarantined rows in the operation. | | `written_bytes_quarantine` | `UInt64` | Quarantined bytes in the operation. | | `resource_tags` | `Array(String)` | Tags associated with the Pipe when the request was made. | ### tinybird.endpoint\_errors¶ Contains the errors of your published endpoints for the last 30 days. Tinybird logs all errors with additional information in this Data Source. | Field | Type | Description | | --- | --- | --- | | `start_datetime` | `DateTime` | Date and time when the API call started. | | `request_id` | `String` | The id of the request that performed the operation. Format is ULID string. | | `pipe_id` | `String` | If present, Pipe id as returned in our[ Pipes API](https://www.tinybird.co/docs/docs/api-reference/pipe-api/overview) . | | `pipe_name` | `String` | If present, Pipe name as returned in our[ Pipes API](https://www.tinybird.co/docs/docs/api-reference/pipe-api/overview) . | | `params` | `Nullable(String)` | URL query params included in the request. | | `url` | `Nullable(String)` | URL pathname. | | `status_code` | `Nullable(Int32)` | HTTP error code. | | `error` | `Nullable(String)` | Error message. | | `resource_tags` | `Array(String)` | Tags associated with the Pipe when the request was made. | ### tinybird.kafka\_ops\_log¶ Contains all operations performed to your [Kafka Data Sources](https://www.tinybird.co/integrations/kafka-data) during the last 30 days. | Field | Type | Description | | --- | --- | --- | | `timestamp` | `DateTime` | Date and time when the operation took place. | | `datasource_id` | `String` | Id of your Data Source. The Data Source id is consistent after renaming operations. You should use the id when you want to track name changes. | | `topic` | `String` | Kafka topic. | | `partition` | `Int16` | Partition number, or `-1` for all partitions. | | `msg_type` | `String` | 'info' for regular messages, 'warning' for issues related to the user's Kafka cluster, deserialization or Materialized Views, and 'error' for other issues. | | `lag` | `Int64` | Number of messages behind for the partition. This is the difference between the high-water mark and the last commit offset. | | `processed_messages` | `Int32` | Messages processed for a topic and partition. | | `processed_bytes` | `Int32` | Amount of bytes processed. | | `committed_messages` | `Int32` | Messages ingested for a topic and partition. | | `msg` | `String` | Information in the case of warnings or errors. Empty otherwise. | ### tinybird.datasources\_storage¶ Contains stats about your Data Sources storage. Tinybird logs maximum values per hour, the same as when it calculates storage consumption. | Field | Type | Description | | --- | --- | --- | | `datasource_id` | `String` | Id of your Data Source. The Data Source id is consistent after renaming operations.
You should use the id when you want to track name changes. | | `datasource_name` | `String` | Name of your Data Source. | | `timestamp` | `DateTime` | When storage was tracked. By hour. | | `bytes` | `UInt64` | Max number of bytes the Data Source has, not including quarantine. | | `rows` | `UInt64` | Max number of rows the Data Source has, not including quarantine. | | `bytes_quarantine` | `UInt64` | Max number of bytes the Data Source has in quarantine. | | `rows_quarantine` | `UInt64` | Max number of rows the Data Source has in quarantine. | ### tinybird.releases\_log (deprecated)¶ Contains operations performed to your releases. Tinybird tracks the following operations: | Event | Description | | --- | --- | | `init` | First Release is created on Git sync. | | `override` | Release commit is overridden. `tb init --override-commit {{commit}}` . | | `deploy` | Resources from a commit are deployed to a Release. | | `preview` | Release status is changed to preview. | | `promote` | Release status is changed to live. | | `post` | Resources from a commit are deployed to the live Release. | | `rollback` | Rollback is done a previous Release is now live. | | `delete` | Release is deleted. | Tinybird logs all operations with additional information in this Data Source. | Field | Type | Description | | `timestamp` | `DateTime64` | Date and time when the operation took place. | | `event_type` | `String` | Name of your Data Source. | | `semver` | `String` | Semantic version identifies a release. | | `commit` | `String` | Git sha commit related to the operation. | | `token` | `String` | API call token identifier used. | | `token_name` | `String` | API call token name used. | | `result` | `String` | `ok` | `error` | | `error` | `String` | Detailed error message | ### tinybird.sinks\_ops\_log¶ Contains all operations performed to your Sink Pipes. | Field | Type | Description | | `timestamp` | `DateTime64` | Date and time when the operation took place. | | `service` | `LowCardinality(String)` | Type of Sink (GCS, S3, and so on) | | `pipe_id` | `String` | The ID of the Sink Pipe. | | `pipe_name` | `String` | the name of the Sink Pipe. | | `token_name` | `String` | Token name used. | | `result` | `LowCardinality(String)` | `ok` | `error` | | `error` | `Nullable(String)` | Detailed error message | | `elapsed_time` | `Float64` | The duration of the operation in seconds. | | `job_id` | `Nullable(String)` | ID of the job that performed the operation, if any. | | `read_rows` | `UInt64` | Read rows in the Sink operation. | | `written_rows` | `UInt64` | Written rows in the Sink operation. | | `read_bytes` | `UInt64` | Read bytes in the operation. | | `written_bytes` | `UInt64` | Written bytes in the operation. | | `output` | `Array(String)` | The outputs of the operation. In the case of writing to a bucket, the name of the written files. | | `parameters` | `Map(String, String)` | The parameters used. Useful to debug the parameter query values. | | `options` | `Map(String, String)` | Extra information. You can access the values with `options['key']` where key is one of: file_template, file_format, file_compression, bucket_path, execution_type. | ### tinybird.data\_transfer¶ Stats of data transferred per hour by a Workspace. | Field | Type | Description | | `timestamp` | `DateTime` | Date and time data transferred is tracked. By hour. | | `event` | `LowCardinality(String)` | Type of operation generated the data (ie. `sink` ) | | `origin_provider` | `LowCardinality(String)` | Provider data was transferred from. 
| | `origin_region` | `LowCardinality(String)` | Region data was transferred from. | | `destination_provider` | `LowCardinality(String)` | Provider data was transferred to. | | `destination_region` | `LowCardinality(String)` | Region data was transferred to. | | `kind` | `LowCardinality(String)` | `intra` | `inter` depending if the data moves within or outside the region. | ### tinybird.jobs\_log¶ Contains all job executions performed in your Workspace. Tinybird logs all jobs with extra information in this Data Source: | Field | Type | Description | | --- | --- | --- | | `job_id` | `String` | Unique identifier for the job. | | `job_type` | `LowCardinality(String)` | Type of job execution. `delete_data` , `import` , `populateview` , `query` , `copy` , `copy_from_main` , `copy_from_branch` , `data_branch` , `deploy_branch` , `regression_tests` , `sink` , `sink_from_branch` . | | `workspace_id` | `String` | Unique identifier for the Workspace. | | `pipe_id` | `String` | Unique identifier for the Pipe. | | `pipe_name` | `String` | Name of the Pipe. | | `created_at` | `DateTime` | Timestamp when the job was created. | | `updated_at` | `DateTime` | Timestamp when the job was last updated. | | `started_at` | `DateTime` | Timestamp when the job execution started. | | `status` | `LowCardinality(String)` | Current status of the job. `waiting` , `working` , `done` , `error` , `cancelled` . | | `error` | `Nullable(String)` | Detailed error message if the result was error. | | `job_metadata` | `JSON String` | Additional metadata related to the job execution. | Learn more about how to track background jobs execution in the [Jobs monitoring guide](https://www.tinybird.co/docs/docs/monitoring/jobs). ## Use resource\_tags to better track usage¶ You can use tags that you've added to your resources, like Pipes or Data Sources, to analyze usage and cost attribution across your organization. For example, you can add tags for projects, environments, or versions and compare usage in later queries to Service Data Sources such as [tinybird.datasources_ops_stats](https://www.tinybird.co/docs/about:blank#tinybird-datasources-ops-stats) , which aggregates operations data by day. The following Service Data Sources support `resource_tags`: - `pipe_stats_rt` - `pipe_stats` - `endpoint_errors` - `organization.pipe_stats_rt` - `organization.pipe_stats` - `datasources_ops_log` - `datasources_ops_stats` - `organization.datasources_ops_log` - `organization.datasources_ops_stats` To add tags to resources, see [Organizing resources in Workspaces](https://www.tinybird.co/docs/docs/production/organizing-resources). ## Query \_state columns¶ Several of the Service Data Sources include columns suffixed with `_state` . This suffix identifies columns with values that are in an intermediate aggregated state. When reading these columns, merge the intermediate states to get the final value. To merge intermediate states, wrap the column in the original aggregation function and apply the `-Merge` combinator. For example, to finalize the value of the `avg_duration_state` column, you use the `avgMerge` function: ##### finalize the value for the avg\_duration\_state column SELECT date, avgMerge(avg_duration_state) avg_time, quantilesTimingMerge(0.9, 0.95, 0.99)(quantile_timing_state) quantiles_timing_in_ms_array FROM tinybird.pipe_stats where pipe_id = 'PIPE_ID' group by date Learn more about the `-Merge` combinator in the [ClickHouse® documentation](https://clickhouse.com/docs/en/sql-reference/aggregate-functions/combinators#-merge). 
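As a complementary example, the following is a minimal sketch of how you could combine `resource_tags` with one of the Service Data Sources above to attribute issues to tagged resources. It only uses fields documented for `tinybird.endpoint_errors`; the tag values depend on what you have defined in your Workspace, and untagged resources won't match any tag.

```bash
# Sketch: count endpoint errors per tag and Pipe over the last day.
# Assumes you have already added tags to your Pipes.
tb sql "
  SELECT arrayJoin(resource_tags) AS tag, pipe_name, count() AS errors
  FROM tinybird.endpoint_errors
  WHERE start_datetime >= now() - INTERVAL 1 DAY
  GROUP BY tag, pipe_name
  ORDER BY errors DESC
"
```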
Tinybird is not affiliated with, associated with, or sponsored by ClickHouse, Inc. ClickHouse® is a registered trademark of ClickHouse, Inc. --- URL: https://www.tinybird.co/docs/plans Last update: 2024-11-13T14:37:53.000Z Content: --- title: "Plans · Tinybird Docs" theme-color: "#171612" description: "The three Tinybird plans explained in one place. Get building today!" --- # Tinybird plans¶ Tinybird has the following plan options: Build, Professional, and Enterprise. You can [upgrade your plan](https://www.tinybird.co/docs/about:blank#upgrade-your-plan) at any time. ## Build¶ The Build plan is free. It provides you with a full-featured, production-grade instance of the Tinybird platform, including all managed ingest connectors, real time querying, and managed API Endpoints. There is no time limit to the Build plan, meaning you can develop using this plan for as long as you want. There are no limits on the number of team seats, Data Sources, or API Endpoints. Support is available through the [Community Slack](https://www.tinybird.co/docs/docs/community) , which is monitored by the Tinybird team. Build plan usage limits: - ** Up to 10 GB of compressed data storage.** This is the total amount of compressed data you're storing, including Data Sources and Materialized Views. - ** Up to 1,000 requests per day to your API Endpoints.** This limit applies to the[ API Endpoints](https://www.tinybird.co/docs/docs/publish/api-endpoints/overview) that you publish from your SQL queries, and queries executed using the[ Query API](https://www.tinybird.co/docs/docs/api-reference/query-api) . The limit doesn't apply to the[ Tinybird REST API](https://www.tinybird.co/docs/docs/api-reference/overview) or[ Events API](https://www.tinybird.co/docs/docs/api-reference/events-api) . The Build plan is suited for development and experimentation. Many Professional and Enterprise customers use Build plan Workspaces to develop and test new use cases before deploying to their production billed Workspaces. See the [billing docs](https://www.tinybird.co/docs/docs/support/billing) for more information. ## Professional¶ The Professional plan is a usage-based plan that scales with you as you grow. When your application is ready for production, you can upgrade any Workspace on the Build plan to the Professional plan. The Professional plan includes all the Tinybird product features as the Build plan, and removes the usage limits for data storage, processed bytes, and API Endpoint requests. This means that you can store as much data, handle as many API requests, and process as much data as you need with no artificial limits. In addition to the [Community Slack](https://www.tinybird.co/docs/docs/community) , Professional customers can also contact the Tinybird support team through email at [support@tinybird.co](https://www.tinybird.co/docs/mailto:support@tinybird.co). The Professional plan requires a valid payment method, such as a credit card. Billing is as follows: - ** Data storage is billed at US$0.34 per GB** , with no limit on the amount of data storage. - ** Processed data is billed at US$0.07 per GB** , with no limit on the amount of processed data. - ** Transferred data is billed at US$0.01 - $0.10 per GB** , depending on cloud provider or region, with no limit on the amount of transferred data. See the [billing docs](https://www.tinybird.co/docs/docs/support/billing) for more information. 
### Upgrade your plan¶ As you approach the usage limits of the Build plan, you might receive emails and see dashboard banners about upgrading. As a Workspace admin: 1. View your usage indicators, like monthly processed and stored data, by selecting the cog icon in the navigation pane and selecting the** Usage** tab. 2. Select the** Upgrade to pro** button to enter your card details and upgrade your plan to Professional. The following screenshot shows to access the **Usage** tab and the location of the **Upgrade to pro** button: <-figure-> ![image](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fplans-usage-and-upgrade.png&w=3840&q=75) ## Enterprise¶ As the scale of your Tinybird storage and processing grows, you can customize an Enterprise plan to meet your needs. Enterprise plans can include volume discounts, service-level agreements (SLA), dedicated infrastructure, and a direct Slack-connect support channel. If you're interested in discussing the Enterprise plan, [contact the Tinybird Sales team](https://www.tinybird.co/contact-us) for more information. ## Dedicated¶ The Dedicated plan provides you with a Tinybird cluster with at least two database servers. On a Dedicated plan, you're the only customer on your cluster. Your queries and outputs are more performant as a result. ### Understand billing¶ Dedicated plans are billed every month according to the amount of credits you've used. Credits are a way of tracking your usage of Tinybird's infrastructure and features. The following table shows how Tinybird calculates credits usage for each resource: | Resource | Explanation | | --- | --- | | Clusters | Cluster size, tracked every 15 minutes. Cluster size changes are detected automatically and billed accordingly. | | Storage | Compressed disk storage of all your data. Calculated daily, in terabytes, using the maximum value of the day. | | Data transfer | When using[ Sinks](https://www.tinybird.co/docs/docs/api-reference/sink-pipes-api) , usage is billed depending on the destination, which can be the same cloud provider and region as your Tinybird cluster, or a different one. | | Support | Premier or Enterprise monthly support fee. | | Private Link | Billed monthly. | ### Rate limiter¶ In Dedicated plans, the rate limiter monitors the status of the cluster and limits the number of concurrent requests to prevent the cluster from crashing due to insufficient memory. This allows the cluster to continue working, albeit with a rate limit. The rate limiter activates when the following situation occurs: - When total memory usage in the cluster is over 60%. - Percentage of 408 Timeout Exceeded and 500 Internal Server Error due to memory limits for a Pipe endpoint exceeds 10% of the total requests. If both conditions are met, the maximum number of concurrent requests to the Pipe endpoint is limited proportionally to the percentage of errors. Workspace administrators receive an email indicating the affected Pipe endpoints and the concurrency limit. For example, if a Pipe endpoint is receiving 10 requests per second and 5 failed during a high memory usage scenario due to a timeout or memory error, the number of concurrent queries is limited to a half, that is, 5 concurrent requests for that specific Pipe endpoint. ### Track invoices¶ In Dedicated plans, invoices are issued upon credits purchase, which can happen when signing the contract or when purchasing additional credits. You can check your invoices from the customer portal. 
### Monitor usage¶ You can monitor credits usage, including remaining credits, cluster usage, and current commitment through your organization's dashboard. See [Dedicated infrastructure monitoring](https://www.tinybird.co/docs/docs/monitoring/organizations#dedicated-infrastructure-monitoring) . You can also check usage using the monthly usage receipts. ## Next steps¶ - Explore[ Tinybird's Customer Stories](https://www.tinybird.co/customer-stories) and see what people have built on Tinybird. - Start building now using the[ quick start](https://www.tinybird.co/docs/docs/quick-start) . - Read the[ billing docs](https://www.tinybird.co/docs/docs/support/billing) to understand which data operations count towards your bill, and how to optimize your usage. --- URL: https://www.tinybird.co/docs/production/backfill-strategies Last update: 2024-11-15T11:01:39.000Z Content: --- title: "Backfill strategies · Tinybird Docs" theme-color: "#171612" description: "When iterating Data Sources or Materialized Views, you will often need to backfill data from a Tinybird Data Source to another. This guide will help you understand the different strategies to do it in a safe way." --- # Backfill strategies¶ Backfilling data is the process of filling a new or changed resource with the historical data it's missing. Whether you're changing data types, changing the sorting key, or redefining whole views, at some point you may need to run a backfill from the previous version of your Data Source or Materialized View to the new one. This page introduces the key challenges of backfilling real-time data, and covers the different strategies to run a backfill when you are iterating a Data Source or Materialized View. Before you start iterating and making critical changes to your Data Sources, Materialized Views, and Pipes, it's crucial to read the [Deployment Strategies](https://www.tinybird.co/docs/docs/production/deployment-strategies) docs. ## The challenge of backfilling real-time data¶ Iterating Data Sources or Materialized Views often requires a careful approach to backfilling data. This process becomes critical, especially when you create a new version of a Data Source or Materialized View, which results in a new, empty Data Source or Materialized View. The main challenge lies in migrating historical data while continuously ingesting new real-time data. See the detailed explanation in [Best practices for Materialized Views](https://www.tinybird.co/docs/docs/publish/materialized-views/best-practices). ## Use case¶ Imagine you have the following Data Source deployed in your main Workspace: ##### analytics\_events.datasource SCHEMA > `timestamp` DateTime `json:$.timestamp`, `session_id` String `json:$.session_id`, `action` LowCardinality(String) `json:$.action`, `version` LowCardinality(String) `json:$.version`, `payload` String `json:$.payload` ENGINE "MergeTree" ENGINE_PARTITION_KEY "toYYYYMM(timestamp)" ENGINE_SORTING_KEY "timestamp" You want to modify the sorting key from `timestamp` to `action, timestamp` . This change requires creating a new Data Source (for example, `analytics_events_1.datasource` ). When merging the Pull Request, you will have `analytics_events` and `analytics_events_1` in your main Workspace (and also in the branch while in CI). This means you will have two Data Sources, and the one representing the new version you want to deploy is empty, since it's newly created. So, how can you sync the data between the two Data Sources? This is when you use a backfill.
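For illustration, a minimal sketch of what the new Data Source file could look like: the schema is identical to `analytics_events.datasource` and only the sorting key changes. The `datasources/` path assumes the default data project layout.

```bash
# Sketch: write the new Data Source file with the updated sorting key.
# Everything except ENGINE_SORTING_KEY is copied from analytics_events.datasource.
cat > datasources/analytics_events_1.datasource << 'EOF'
SCHEMA >
    `timestamp` DateTime `json:$.timestamp`,
    `session_id` String `json:$.session_id`,
    `action` LowCardinality(String) `json:$.action`,
    `version` LowCardinality(String) `json:$.version`,
    `payload` String `json:$.payload`

ENGINE "MergeTree"
ENGINE_PARTITION_KEY "toYYYYMM(timestamp)"
ENGINE_SORTING_KEY "action, timestamp"
EOF
```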
## How to move data in Tinybird¶ Reminder: "Running a backfill" just means copying all the data from one Data Source to another Data Source. There are different ways to move data in Tinybird: ### Using Copy Pipes¶ A Copy Pipe is a Pipe used to copy data from one Data Source to another Data Source. This method is useful for one-time moves of data or scheduled executions (for example, every day at 00:00), but it's not recommended if you want to keep the data in sync between two Data Sources. In the context of a backfill, you could use the following Pipe to copy the data from one Data Source to another. (Later, we will explain why you need the `timestamp BETWEEN {{DateTime(start_backfill_timestamp)}} AND {{DateTime(end_backfill_timestamp)}}` condition). ##### backfill\_data.pipe file NODE node SQL > % SELECT * FROM analytics_events WHERE timestamp BETWEEN {{DateTime(start_backfill_timestamp)}} AND {{DateTime(end_backfill_timestamp)}} TYPE COPY TARGET_DATASOURCE analytics_events_1 Once deployed, you would need to run the following command to execute the copy: ##### Command to run the Copy Pipe with the backfill\_timestamp parameter tb pipe copy run backfill_data --param start_backfill_timestamp='1970-01-01 00:00:00' --param end_backfill_timestamp='2024-01-31 00:00:00' --wait --yes You can read more about it in our [Copy Pipes](https://www.tinybird.co/docs/docs/publish/copy-pipes) docs. ### Using Materialized Views¶ A Materialized View is a Pipe that materializes the data from one Data Source to another Data Source. This method is useful to keep the data in sync between two Data Sources. ##### sync\_data.pipe file NODE node SQL > % SELECT * FROM analytics_events WHERE timestamp > '2024-01-31 00:00:00' TYPE materialized DATASOURCE analytics_events_1 By default, a Materialized View only materializes the new incoming data; it won't process the old data. You can force it with the `tb pipe populate` CLI command, but be careful, as this can lead to duplicated or lost data, as explained in the previous section. By combining both methods, you can keep both Data Sources in sync while backfilling the historical data. ## Scenarios for backfill strategies¶ Depending on your use case and ingestion pattern, there are different recommended strategies for backfilling data in Tinybird. The complexity of this migration depends on several factors, notably the presence of streaming ingestion. The most common scenarios are: We are actively improving this workflow in Tinybird. Reach out to Tinybird support ( [support@tinybird.co](https://www.tinybird.co/docs/mailto:support@tinybird.co) ) if you have any questions. - ** Scenario 1: I'm not in Production** . - ** Scenario 2: Full replacement every few hours** . - ** Scenario 3: Streaming ingestion WITH incremental timestamp column** . - ** Scenario 4: Streaming ingestion WITH NOT incremental timestamp column** . ### Scenario 1: I'm not in Production¶ If you are not in production, or the data from that Data Source is not being used and **you can accept losing data** , you can opt to create a new Data Source and start using it right away. Alternatively, you can remove and re-create the original Data Source using a custom deployment. Once you start appending data to the Data Source, you will start seeing data in the new Data Source. ### Scenario 2: Full replacement every few hours¶ If you are running a full replacement every few hours, you can create a Materialized View between the two Data Sources.
To sync the data between the two Data Sources, you will use a Materialized Pipe (MV) that will materialize the data from the old Data Source to the new one. Something like this: ##### Materialize data from old to new Data Source NODE migration_node SQL > SELECT * FROM analytics_events TYPE materialized DATASOURCE analytics_events_1 You would deploy this new Pipe along with the modified Data Source. Once you deploy the Pull Request, you will have this Materialized Pipe along with the new Data Source and the Pipe will materialize the data from the old Data Source. At this point, you would just need to wait until the full replacement is executed, the new Data Source will have all the data, after that you can create a new Pull Request to connect the new Data Source to the rest of your Pipe Endpoints. ### Scenario 3: Streaming ingestion WITH incremental timestamp column¶ If you have streaming ingestion using the [events API](https://www.tinybird.co/docs/docs/ingest/events-api) with a huge ingest rate, you can use the following strategy to not be impacted by [the backfilling challenge with real-time data](https://www.tinybird.co/docs/about:blank#the-challenge-of-backfilling-real-time-data). To use this strategy successfully, you must have an incremental timestamp column with the same time zone in your Data Source. In our example, you have the `timestamp` column. First, create a new Pipe that will materialize the data from old Data Source to the new one, **but filter by a future timestamp** . For example, if you are deploying the Pull Request on `2024-02-02 13:00:00` , you can use `timestamp > '2024-02-02 13:30:00'`. ##### sync\_data.pipe file NODE node SQL > SELECT * FROM analytics_events WHERE timestamp > '2024-02-02 13:30:00' TYPE materialized DATASOURCE analytics_events_1 We are using the `timestamp > '2024-02-02 13:30:00'` condition to only materialize data that is newer than the `2024-02-02 13:30:00` timestamp. Then, create a Copy Pipe with the same SQL statement, but instead of filtering by a specific future timestamp, you will use two parameters to filter by a timestamp range. This allows us to have better control of the backfilling process. For example, if you are moving very large amounts of data, split the backfilling process into different batches to avoid overloading the system. ##### backfill\_data.pipe file NODE node SQL > % SELECT * FROM analytics_events WHERE timestamp BETWEEN {{DateTime(start_backfill_timestamp)}} AND {{DateTime(end_backfill_timestamp)}} TYPE COPY TARGET_DATASOURCE analytics_events_1 Once you have these changes in code, create a Pull Request and the CI Workflow will generate a new Branch. #### CI workflow¶ Once the CI Workflow has finished successfully, a new Branch will be created. For the following steps, use the CLI. If you don't have it installed, you can follow the [CLI installation docs](https://www.tinybird.co/docs/docs/cli/install). First, you should be able to authenticate in the Branch by copying the Token from the Branch or using these commands: ##### Authenticate in the Branch # You can use `tb auth -i` to authenticate in the branch tb auth -i # Or you can switch to the branch if you are already authenticated tb branch ls # By default, the CI Workflow will create a branch following the pattern `tmp_ci-`. tb branch use **We recommend you to run the Copy Pipe outside of the CI Workflow** . You can automate it using a custom deployment, but most of the times it's just not worth it. 
As you do not have continuous ingestion in the Branch, don't wait for the future filter timestamp. Instead, run the Copy Pipe directly to backfill the data with the following command: ##### Run the Copy Pipe tb pipe copy run backfill_data --param start_backfill_timestamp='1970-01-01 00:00:00' --param end_backfill_timestamp='2024-02-02 13:30:00' --wait --yes Once the Copy Pipe has finished, you will have all the data in the new Data Source. You can compare the number of rows from both Data Sources with the following commands: ##### Compare the number of rows tb sql "SELECT count() FROM analytics_events" tb sql "SELECT count() FROM analytics_events_1" #### CD workflow¶ Now that you have tested the backfilling process in the Branch, you can merge the Pull Request and run the CD Workflow. The operation is exactly the same as in the Branch: first deploy the resources, then run the data operations, either manually (recommended) or automated with a custom deployment. **You should verify that you have deployed the new Data Source before the timestamp you have used in the Materialized Pipe. Otherwise, you will be missing data in the new Data Source**. For example, if you have used `timestamp > '2024-02-02 13:30:00'` in the Materialized Pipe, you should verify that you have deployed before `2024-02-02 13:30:00`. If you have deployed after `2024-02-02 13:30:00` , you will need to remove the Data Source and start the process again using a different timestamp. At `2024-02-02 13:30:00` , you will start seeing data in the new Data Source; that's when you execute the same process you followed in the CI Workflow to backfill the data. First, authenticate in the Workspace by running the following command: ##### Authenticate in the Workspace tb auth -i Then, run the Copy Pipe to backfill the data by running the following command: ##### Run the Copy Pipe tb pipe copy run backfill_data --param start_backfill_timestamp='1970-01-01 00:00:00' --param end_backfill_timestamp='2024-02-02 13:30:00' --wait --yes If the Copy Pipe fails, you can re-run the same command without duplicating data. **The Copy Pipe will only copy the data if the process is successful.** If you get an error like `MEMORY LIMIT` , you can also run the Copy Pipe in batches. For example, you could run the Copy Pipe with a timestamp range of 1 hour, 1 day, or 1 week, depending on the amount of data you are moving (see the sketch at the end of this section). Once the Copy Pipe has finished, you will have all the data in the new Data Source. You can compare the number of rows by running the following commands: ##### Compare the number of rows tb sql "SELECT count() FROM analytics_events" tb sql "SELECT count() FROM analytics_events_1" At this point, you should have the same number of rows in both places and you can connect the new Data Source with the rest of the Dataflow. Finally, you have the Data Source with the new schema and all the data migrated from the previous one. The Data Source is receiving real-time data directly, and the next step is to remove the Materialized Pipe and Copy Pipe you used to backfill the data. To do that, create a new Pull Request and remove ( `git rm` ) the Materialized Pipe and Copy Pipe you used to backfill the data. Once the Pull Request is merged, the resources are automatically removed; you can double-check this operation while in CI.
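Here is a rough sketch of the batching idea mentioned above. The window boundaries are illustrative; the last window should end at the timestamp used in the Materialized Pipe filter, and, as noted above, a failed run can be repeated without duplicating data.

```bash
# Sketch: split the backfill into smaller windows to avoid memory errors.
# Adjust the window boundaries to the volume of data you are moving.
for range in \
  "1970-01-01 00:00:00|2024-01-01 00:00:00" \
  "2024-01-01 00:00:00|2024-02-01 00:00:00" \
  "2024-02-01 00:00:00|2024-02-02 13:30:00"; do
  tb pipe copy run backfill_data \
    --param start_backfill_timestamp="${range%%|*}" \
    --param end_backfill_timestamp="${range##*|}" \
    --wait --yes
done
```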
### Scenario 4: Streaming ingestion WITH NOT incremental timestamp column¶ If you have streaming ingestion, but you do not have an incremental timestamp column, you can use one of the following strategies to backfill data in Tinybird. Reach out to Tinybird support ( [support@tinybird.co](https://www.tinybird.co/docs/mailto:support@tinybird.co) ) if you have any questions or you aren't sure how to proceed. - ** Strategy 1** : Run a populate, but be aware that you may be impacted by[ the previously-mentioned challenges of backfilling real-time data](https://www.tinybird.co/docs/about:blank#the-challenge-of-backfilling-real-time-data) . - ** Strategy 2** : Move ingestion to the new Data Source until you finish backfilling data,** but the data in your old Data Source will be outdated until the new Data Source is fully in sync** . #### Strategy 1: Run a populate¶ Before following this strategy, you should be aware of the [backfilling challenge with real-time data](https://www.tinybird.co/docs/about:blank#the-challenge-of-backfilling-real-time-data). Let's consider that the use case is the same as the previous one, but you do not have an incremental timestamp column. We can not rely on the `timestamp` column to filter the data as it's not incremental. First, create a Materialized Pipe that will materialize the data from the old Data Source to the new one. ##### backfill\_data.pipe file NODE migrating_node SQL > SELECT * FROM analytics_events TYPE materialized DATASOURCE analytics_events_1 To run the backfill, you will use the `tb pipe populate` command. This command will materialize the data from the old Data Source to the new one and as you don't need to wait until a future timestamp, you can run it inside the CI/CD Workflow. You can create a custom deployment using `VERSION=1.0.0` in the `.tinyenv` file and placing the custom deployment script in the `deploy/1.0.0` folder: ##### Scripts generated inside the deploy folder deploy/1.0.0 ├── deploy.sh ## This is the script that will be executed during the deployment You will need to modify the `deploy.sh` script to run the `tb pipe populate` command: ##### deploy.sh script #!/bin/bash # This script will be executed after the deployment # You can use it to run any command after the deployment # Run the populate Pipe tb pipe populate backfill_data --node migrating_node --wait Once you have these changes in the code, you will create a Pull Request and the CI Workflow will generate a new Branch with the new Data Source. Now, you should verify that everything is working as expected as you did in the previous section. # You can use `tb auth -i` to authenticate in the branch tb auth -i # Or you can switch to the branch if you are already authenticated tb branch ls # By default, the CI Workflow will create a branch following the pattern `tmp_ci-`. tb branch use # Also, you could compare the number of rowsby running the following command: tb sql "SELECT count() FROM analytics_events" tb sql "SELECT count() FROM analytics_events_1" Once you have verified that everything is working as expected, merge the Pull Request and the CD Workflow will generate a new Data Source in the Main Branch. Once the CD Workflow has finished successfully, you verify the same way as you did in the Branch. #### Strategy 2: Move ingestion to the new Data Source¶ Let's consider that the use case is the same as the previous one. We do not have an incremental timestamp column. 
You can't rely on the `timestamp` column to filter the data, as it's not incremental, and you don't want to run a populate because you might be impacted by [the backfilling challenge with real-time data](https://www.tinybird.co/docs/about:blank#the-challenge-of-backfilling-real-time-data). In this case, you can move the ingestion to the new Data Source until you finish backfilling data. First, create a Copy Pipe that will copy the data from the old Data Source to the new one. ##### backfill\_data.pipe file NODE migrate_data SQL > SELECT * FROM analytics_events TYPE COPY TARGET_DATASOURCE analytics_events_1 You could also parameterize the Copy Pipe to filter by a parameter. This gives you better control of the backfilling process. Once you have these changes in your code, create a Pull Request and the CI Workflow will generate a new Branch with the new Data Source and the Copy Pipe. Now, run the Copy Pipe to backfill the data. To do that, authenticate in the Branch by copying the Token from the Branch or using these commands: ##### Authenticate in the branch # You can use `tb auth -i` to authenticate in the branch tb auth -i # Or you can switch to the branch if you are already authenticated tb branch ls # By default, the CI Workflow will create a branch following the pattern `tmp_ci-`. tb branch use # Once you have authenticated in the branch, you can run the Copy Pipe by running the following command: tb pipe copy run backfill_data --node migrate_data --wait --yes Once the Copy Pipe has finished, you will have all the data in the new Data Source. As you are likely not ingesting data into your Branch, both numbers should match. ##### Compare the number of rows tb sql "SELECT count() FROM analytics_events" tb sql "SELECT count() FROM analytics_events_1" Now, merge the Pull Request and the CD Workflow will generate the new resources in the main Workspace. At this point, you should modify the ingestion to start ingesting data into the new Data Source. **Keep in mind that once you are ingesting data into the new Data Source, you will stop ingesting data into the old Data Source**. Once all the ingestion is pointing to the new Data Source, you should verify that new data is being ingested into the new Data Source and nothing is being ingested into the old one. To do that, you could query the Data Source directly or use the Service Data Source [tinybird.datasources_ops_log](https://www.tinybird.co/docs/docs/monitoring/service-datasources). At this point, you can start backfilling data by running the Copy Pipe with the following command: ##### Run the Copy Pipe tb pipe copy run backfill_data --node migrate_data --wait --yes Sometimes the Data Source you are modifying has downstream dependencies. In that case, when creating the new version of the Data Source, make sure you also create new versions of all downstream dependencies. Otherwise, you would connect two different Data Sources receiving data to the same part of the Dataflow, and hence duplicate data. ## Next steps¶ If you're familiar with backfilling strategies, check out the [Deployment Strategies](https://www.tinybird.co/docs/docs/production/deployment-strategies) docs. --- URL: https://www.tinybird.co/docs/production/continuous-integration Last update: 2024-11-13T08:04:24.000Z Content: --- title: "Continuous Integration and Deployment (CI/CD) · Tinybird Docs" theme-color: "#171612" description: "How to implement Continuous Integration and Deployment workflows for your Tinybird data project."
--- # Continuous integration and continuous deployment (CI/CD)¶ Once you connect your data project and Workspace [through Git](https://www.tinybird.co/docs/docs/production/working-with-version-control) you can implement a Continuous Integration (CI) and Continuous Deployment (CD) workflow to automate interaction with Tinybird. This page covers how CI and CD work using a walkthrough example. CI/CD pipelines require the use of: - [ Datafiles](https://www.tinybird.co/docs/docs/cli/datafiles/overview) - [ CLI commands](https://www.tinybird.co/docs/docs/cli/command-ref) - [ Tinybird Branches](https://www.tinybird.co/docs/docs/concepts/branches) ## How continuous integration works¶ As you expand and iterate on your data projects, you can continuously validate your API Endpoints. In the same way that you write integration and acceptance tests for source code in a software project, you can write automated tests for your API Endpoints to run on each Pull or Merge request. Continuous Integration can help with: - Linting: Syntax and formatting on[ datafiles](https://www.tinybird.co/docs/docs/cli/datafiles/overview) . - Correctness: Making sure you can push your changes to a Tinybird Workspace. - Quality: Running fixture tests or data quality tests or both to validate the changes in the Pull Request. - Regression: Running automatic regression tests to validate endpoint performance and data quality. The following section uses the CI template, GitHub Actions, and the Tinybird CLI to demonstrate how to test your API Endpoints on any new commit to a Pull Request. Set these optional environment variables to adapt your CI/CD workflow: - `TB_VERSION_WARNING=0` : Don't print CLI version warning message if there's a new version available. - `TB_SKIP_REGRESSION=0` : Skip regression tests. ### Building the CI/CD pipeline¶ This section demonstrates automating CI/CD pipelines using GitHub as the provider with a GitHub Action, but you can use any suitable platform. The examples on this page use the Tinybird's CI and CD templates in [this repository](https://github.com/tinybirdco/ci) . You can find examples for Gitlab in that repository as well. You can those example templates or build your own pipelines inspired by them. That way you can adapt them to suit your data project needs and integrate them better with the CI/CD workflow you use for other parts of your toolset. These steps use the [Tinybird CLI commands](https://www.tinybird.co/docs/docs/cli/command-ref) so you can fully reproduce the pipeline locally. Remember to add a new secret with the Workspace administrator [Token](https://www.tinybird.co/docs/docs/concepts/auth-tokens) to the repository's settings to be able to run the needed commands from the CLI. #### 1. Trigger the CI workflow¶ ##### Run the CI workflow on each commit to a Pull Request, when labelling, or with other kinds of triggers name: Tinybird - CI Workflow on: workflow_dispatch: pull_request: branches: - main types: [opened, reopened, labeled, unlabeled, synchronize, closed] Key points: The CI workflow triggers when a new Pull Request opens, reopens, synchronizs or updates labels and the base branch has to be `main` . On closed, it deletes the Tinybird branch created for CI. #### 2. Configure the CI job¶ ##### Use the workflow configuration defined in the uses reference jobs: ci: # ci using Branches from Workspace 'web_analytics_starter_kit' uses: tinybirdco/ci/.github/workflows/ci.yml@main with: data_project_dir: . 
secrets: tb_admin_token: ${{ secrets.TB_ADMIN_TOKEN }} # set Workspace admin Token in GitHub secrets tb_host: https://api.tinybird.co You can combine the CI trigger and the CI job configuration into a single YAML workflow file and store it in `/my_repo/.github/workflows` . If you are not already familiar with GitHub actions you can checkout the [quickstart guide](https://docs.github.com/en/actions/writing-workflows/quickstart). If your data project directory isn't in the root of the Git repository, change the `data_project_dir` variable. About secrets: - `tb_host` : The URL of the region you want to use. - `tb_admin_token` : The Workspace admin Token. This grants all the permissions for a specific Workspace. You can find more information in the[ Tokens docs](https://www.tinybird.co/docs/docs/concepts/auth-tokens) . ### The CI workflow¶ A potential CI workflow could run the following steps: 1. Configuration: set up dependencies and installs the Tinybird CLI to run the required commands. 2. Check the data project syntax and the authentication. 3. Create a new ephemeral CI Tinybird Branch. 4. Push the changes to the Branch. 5. Run tests in the Branch. 6. Delete the Branch. #### 0. Workflow configuration¶ defaults: run: working-directory: ${{ inputs.data_project_dir }} if: ${{ github.event.action != 'closed' }} steps: - uses: actions/checkout@master with: fetch-depth: 300 ref: ${{ github.event.pull_request.head.sha }} - uses: actions/setup-python@v5 with: python-version: "3.11" architecture: "x64" cache: 'pip' - name: Validate input run: | [[ "${{ secrets.tb_admin_token }}" ]] || { echo "Go to the tokens section in your Workspace, copy the 'admin token' and set TB_ADMIN_TOKEN as a Secret in your Git repository"; exit 1; } - name: Set environment variables run: | _ENV_FLAGS="${ENV_FLAGS:=--last-partition --wait}" _NORMALIZED_BRANCH_NAME=$(echo $DATA_PROJECT_DIR | rev | cut -d "/" -f 1 | rev | tr '.-' '_') GIT_BRANCH=${GITHUB_HEAD_REF} echo "GIT_BRANCH=$GIT_BRANCH" >> $GITHUB_ENV echo "_ENV_FLAGS=$_ENV_FLAGS" >> $GITHUB_ENV echo "_NORMALIZED_BRANCH_NAME=$_NORMALIZED_BRANCH_NAME" >> $GITHUB_ENV Key points: This sets the default `working-directory` to the `data_project_dir` variable, check outs the `main` branch to get the head commit, checks the `TB_ADMIN_TOKEN` , and installs Python 3.11. #### 1. Install the Tinybird CLI¶ - name: Install Tinybird CLI run: | if [ -f "requirements.txt" ]; then pip install -r requirements.txt else pip install tinybird-cli fi - name: Tinybird version run: tb --version Workflow actions use the Tinybird CLI to interact with your Workspace, create a test Branch, and run the tests. You can use a `requirements.txt` file to pin a tinybird-cli version to avoid automatically install the latest version. You can run this workflow locally by having a local data project and the CLI authenticated to your Tinybird Workspace. #### 2. Check the data project syntax and the authentication¶ - name: Check all the datafiles syntax run: tb check - name: Check auth run: tb --host ${{ secrets.tb_host }} --token ${{ secrets.tb_admin_token }} auth info #### 3. Create a new Tinybird Branch to deploy changes and run the tests¶ A Branch is an isolated copy of the resources in your Workspace at a specific point in time. It's designed to be temporary and disposable so that you can develop and test changes before deploying them to your Workspace. Each CI job creates a Branch. 
In this example, the Tinybird Brand name uses `github.event.pull_request.number` as a unique identifier so multiple tests can run in parallel. If a Branch with the same name exist, it's removed and recreated again. The `tb branch create` command creates new Branches. Once you merge your changes with the Pull Request, the workflow deletes your Tinybird Branch. - name: Try to delete previous Branch run: | output=$(tb --host ${{ secrets.tb_host }} --token ${{ secrets.tb_admin_token }} branch ls) BRANCH_NAME="tmp_ci_${_NORMALIZED_BRANCH_NAME}_${{ github.event.pull_request.number }}" # Check if the branch name exists in the output if echo "$output" | grep -q "\b$BRANCH_NAME\b"; then tb \ --host ${{ secrets.tb_host }} \ --token ${{ secrets.tb_admin_token }} \ branch rm $BRANCH_NAME \ --yes else echo "Skipping clean up: The Branch '$BRANCH_NAME' does not exist." fi - name: Create new test Branch with data run: | tb \ --host ${{ secrets.tb_host }} \ --token ${{ secrets.tb_admin_token }} \ branch create tmp_ci_${_NORMALIZED_BRANCH_NAME}_${{ github.event.pull_request.number }} \ ${_ENV_FLAGS} Set the `_ENV_FLAGS` variable to `--last-partition --wait` to attach the most recently ingested data in the Workspace. This way, you can run the tests using the same data as in production. Alternatively, leave it empty and use fixtures. #### 4. Deploy changes to the Tinybird Branch¶ You can push the changes in your current Pull Request to the test Branch previously created in two ways: ##### Standard deployment Use `tb deploy` if you connected your data project and Workspace [through Git](https://www.tinybird.co/docs/docs/production/working-with-version-control) . This command pushes the file changes based on the result of `git diff` between the latest commit deployed to the Workspace and the current git branch HEAD commit. If your data project and Workspace aren't connected through Git, you can use `tb push --only-changes --force --yes` . This command pushes the file changes based on the result of `tb diff` between the local changes in the git branch and the remote changes in the Tinybird branch. Common `tb push` options: - `--only-changes` : Deploys the changed datafiles and its dependencies. - `--force` : Overrides any existing Pipe. - `--yes` : Confirms any alter to a Data Source. - `--no-check` : Avoid running regression tests when overwriting a Pipe Endpoint. ##### Custom deploy command Alternatively, for more complex changes, you can decide how to deploy the changes to the test Branch. This is convenient, for instance, if additionally to deploy the datafiles you want to automate some other data operation, such as a running a copy Pipe, truncate a Data Source, etc. For this to work, you have to place an executable shell script file in `deploy/$VERSION/deploy.sh` with the CLI commands to push the changes. `$VERSION` should be a global variable and unique to the current active Pull Request. You can find it in the `.tinyenv` file in the data project. - name: Deploy changes to the test Branch run: | DEPLOY_FILE=./deploy/${VERSION}/deploy.sh if [ ! -f "$DEPLOY_FILE" ]; then echo "$DEPLOY_FILE not found, running default tb deploy command" tb deploy fi - name: Custom deployment to the test Branch run: | DEPLOY_FILE=./deploy/${VERSION}/deploy.sh if [ -f "$DEPLOY_FILE" ]; then echo "$DEPLOY_FILE found" if ! [ -x "$DEPLOY_FILE" ]; then echo "Error: You do not have permission to execute '$DEPLOY_FILE'. Run:" echo "> chmod +x $DEPLOY_FILE" echo "and commit your changes" exit 1 else $DEPLOY_FILE fi fi #### 5. 
Run the tests¶ You can now run your test suite. This is an optional step but recommended if you want to make sure everything works as expected. Tinybird provides three type of tests by default, but you can include any test needed for your deployment pipeline: - Data fixture tests: These test specific business logic based on fixture data, see `datasources/fixtures` . - Data quality tests: These test precise data scenarios. - Regression tests: These test that requests to your API Endpoints are still working as expected. For these tests to work, you must attach production data using the `--last-partition` flag when creating the test Branch. To learn more about testing Tinybird data projects, refer to the [Implementing test strategies](https://www.tinybird.co/docs/docs/production/implementing-test-strategies) docs. - name: Get regression labels id: regression_labels uses: SamirMarin/get-labels-action@v0 with: github_token: ${{ secrets.GITHUB_TOKEN }} label_key: regression - name: Run Pipe regression tests run: | source .tinyenv echo ${{ steps.regression_labels.outputs.labels }} REGRESSION_LABELS=$(echo "${{ steps.regression_labels.outputs.labels }}" | awk -F, '{for (i=1; i<=NF; i++) if ($i ~ /^--/) print $i}' ORS=',' | sed 's/,$//') echo ${REGRESSION_LABELS} CONFIG_FILE=./tests/regression.yaml BASE_CMD="tb branch regression-tests" LABELS_CMD="$(echo ${REGRESSION_LABELS} | tr , ' ')" if [ -f ${CONFIG_FILE} ]; then echo "Config file found: ${CONFIG_FILE}" ${BASE_CMD} -f ${CONFIG_FILE} --wait ${LABELS_CMD} else echo "Config file not found at '${CONFIG_FILE}', running with default values" ${BASE_CMD} coverage --wait ${LABELS_CMD} fi - name: Append fixtures run: | if [ -f ./scripts/append_fixtures.sh ]; then echo "append_fixtures script found" ./scripts/append_fixtures.sh fi - name: Run fixture tests run: | if [ -f ./scripts/exec_test.sh ]; then ./scripts/exec_test.sh fi - name: Run data quality tests run: | tb test run -v -c 4 You can find the reference `append_fixtures` and `exec_test` scripts in [this repository](https://github.com/tinybirdco/ci/tree/main/scripts). #### 6. Delete the Branch¶ By default, the workflow doesn't delete Branches until it's merged into the main Workspace. The following step runs after the tests: - name: Try to delete previous Branch run: | output=$(tb --host ${{ secrets.tb_host }} --token ${{ secrets.tb_admin_token }} branch ls) BRANCH_NAME="tmp_ci_${_NORMALIZED_BRANCH_NAME}_${{ github.event.pull_request.number }}" # Check if the branch name exists in the output if echo "$output" | grep -q "\b$BRANCH_NAME\b"; then tb \ --host ${{ secrets.tb_host }} \ --token ${{ secrets.tb_admin_token }} \ branch rm $BRANCH_NAME \ --yes else echo "Skipping clean up: The Branch '$BRANCH_NAME' does not exist." fi You can have up to simultaneous 3 Branches per Workspace at any time. Contact Tinybird at [support@tinybird.co](https://www.tinybird.co/docs/mailto:support@tinybird.co) if you need to increase this limit. ## How continuous deployment works¶ Once a Pull Request passes CI and a peer reviews and approves it, it's time to merge it to your main Git branch. Continuous Deployment or sometimes Continuous Delivery automatically deploys changes to the Workspace. While efficient, this workflow comes with several challenges, most of them related to handling the current state of your Tinybird Workspace. For instance: - As opposed to when you deploy a stateless app, deployments to a Workspace are incremental, based on the previous resources in the Workspace. 
- Resources or operations that run programatically and deal with handling state: populating operations or permission handling. - Performing deployments in the same Workspace; you need to be aware of this and implement a policy to avoid collisions from different Pull Requests deploying at the same time, or regressions. As deployments rely on Git commits to push resources, your Branches **must not** be out-of-date when merging. Use your Git provider to control branch freshness. The CD workflow explained here is a guide relevant to many of the most common use cases. However, some complex deployments may require additional knowledge and expertise from the team deploying the change. Continuous Deployment helps with: - Correctness: Ensuring you can push your changes to a Tinybird Workspace. - Deployment: Deploying the changes to the Workspace automatically. - Data Operations: Centralizing data operations required after pushing resources to the Workspace. The following section uses the generated CD template, GitHub Actions, and the Tinybird CLI to explain how to deploy Pull Request changes after merging. ### Configure the CD job¶ ##### CD workflow name: Tinybird - CD Workflow on: workflow_dispatch: push: branches: - main jobs: cd: # deploy changes to Workspace 'web analytics starter kit' uses: tinybirdco/ci/.github/workflows/cd.yml@main with: data_project_dir: . secrets: tb_admin_token: ${{ secrets.TB_ADMIN_TOKEN }} # set Workspace admin Token in GitHub secrets tb_host: https://api.tinybird.co You can use this YAML workflow file and store it in `/my_repo/.github/workflows` . This workflow deploys on merge to `main` to the Workspace defined by the `TB_ADMIN_TOKEN` set as secret in the GitHub repository's Settings. If your data project directory isn't in the root of the Git repository, change the `data_project_dir` variable. About secrets: - `tb_host` : The URL of the region you want to use. - `tb_admin_token` : The Workspace admin Token. This grants all the permissions for a specific Workspace. You can find more information in the[ Tokens docs](https://www.tinybird.co/docs/docs/concepts/auth-tokens) . ### The CD workflow¶ The CD pipeline deploys the changes to the main Workspace the same way the CI pipeline deploys them to the Tinybird Branch. Run CD workflow on merging a PR to keep your Workspace in sync with the git repository main branch HEAD commit. The CD workflow performs the following steps: 1. Configuration 2. Install the Tinybird CLI 3. Checks authentication 4. Pushes changes 5. Post-deployment #### 0. Workflow configuration¶ Same as the CI workflow. #### 1. Install the Tinybird CLI and check authentication¶ Worflow actions use the Tinybird CLI to interact with your Workspace. You can run this workflow locally by having a local data project and the CLI authenticated to your Tinybird Workspace. This step is equivalent to, but not identical to, the CI workflow step 1. - name: Install Tinybird CLI run: | if [ -f "requirements.txt" ]; then pip install -r requirements.txt else pip install tinybird-cli fi - name: Tinybird version run: tb --version - name: Check auth run: tb --host ${{ secrets.tb_host }} --token ${{ secrets.tb_admin_token }} auth info #### 2. Deploy changes¶ Use the same exact strategy that you used in CI. If you did automatic deployment through git, then use `tb deploy` , otherwise `tb push --only-changes --force`. If you did a custom deployment for this specific PR make sure the same exact script runs in CD. 
- name: Deploy changes to the main Workspace run: | DEPLOY_FILE=./deploy/${VERSION}/deploy.sh if [ ! -f "$DEPLOY_FILE" ]; then echo "$DEPLOY_FILE not found, running default tb deploy command" tb deploy fi - name: Custom deployment to the main Workspace run: | DEPLOY_FILE=./deploy/${VERSION}/deploy.sh if [ -f "$DEPLOY_FILE" ]; then echo "$DEPLOY_FILE found" if ! [ -x "$DEPLOY_FILE" ]; then echo "Error: You do not have permission to execute '$DEPLOY_FILE'. Run:" echo "> chmod +x $DEPLOY_FILE" echo "and commit your changes" exit 1 else $DEPLOY_FILE fi fi ## Other git providers¶ Most git vendors provide a way to run CI/CD pipelines. This example is a guide on how you can use the Tinybird CLI and git to build CI/CD pipelines with GitHub Actions, but you can create similar pipelines with other providers or adapt your own pipelines to support CI/CD to a Tinybird Workspace. You can find a similar workflow for GitLab [here](https://github.com/tinybirdco/ci/blob/main/.gitlab/README.md). --- URL: https://www.tinybird.co/docs/production/deployment-strategies Last update: 2024-11-08T11:17:52.000Z Content: --- title: "Deployment strategies · Tinybird Docs" theme-color: "#171612" description: "Things to take into account when building your Continuous Deployment pipeline." --- # Deployment strategies¶ This page explains deployment strategies when you are doing one of the following: - Adding, updating or deleting resources - Maintaining streaming ingestion - Handling user API requests It covers the default method for implementing Continuous Deployment (CD), how to bypass the default deployment strategy to create custom deployments, and finally, strategies to take into account when migrating data. ## How deployment works¶ With the Git integration, your project in Git is the real source of truth and you should expect your Workspace(s) to be a working version of the resources in Git. In the default [CI/CD workflow templates](https://github.com/tinybirdco/ci) the `tb deploy` command is used to deploy changes to the Workspace. This does the following: - Checks the current commit in the Workspace and validates that it is an ancestor of the commit in the Pull Request being deployed. If not, you usually have to `git rebase` your branch. - Performs a `git diff` from the current branch to the main branch so it can get a list of the datafiles that changed. - Deploys them, both in CI and CD. ## Alter strategy¶ Updates existing resources that have been changed. This is the default deployment strategy when using `tb deploy`. Use it to add a new column to a Data Source or change the TTL. Not all operations can be performed with this strategy. For instance, you cannot change the Sorting Key of a Data Source with this strategy. An example use case can be found here: [Add column to a Data Source](https://github.com/tinybirdco/use-case-examples/tree/main/add_nullable_column_to_landing_data_source). ## Versioning strategy¶ When you want to make a breaking change to some resource, we recommend creating a new version of that resource. This strategy is usually performed in several steps, each one corresponding to a Pull Request: - Create a new Branch with the new resource(s) (Pipe or Data Source) with a different name and deploy it. - Make sure any backfill operation is run over the new resources so data in the main Workspace is in sync. - Create a new Branch to connect the corresponding dependencies to the new resource(s) and deploy it. - Create a new Branch to `git rm` the old resources and deploy it.
This strategy allows for a staged and controlled deployment of breaking changes. You keep old and new versions of the resource(s) in the main Workspace until your end user application is rolled out. ## Custom deployments¶ The `tb deploy` command allows you to deploy a project with a single command. Under the hood, this command performs a series of actions that are common to most deployments. However, your project might have specific requirements that mean you need to customize these actions, or run additional actions after the deployment process. Use a custom deployment when you need to perform some operation that's not directly supported by `tb deploy`, or that requires several steps, like automating a backfill operation. ### Custom deployment actions¶ The `deploy.sh` file allows you to override the default deployment process, so instead of running the default `tb deploy` command you can run a custom shell script to deploy your changes. Use custom deployment actions if the default deployment process is not suitable for your project. For example, you might want to deploy resources in a specific order or handle errors differently. To create custom deployment actions: - If you are using the Tinybird CI/CD templates, export an environment variable in your CI/CD templates with the name `VERSION`. You can find it in the `.tinyenv` file in the data project. - Create a `deploy/$VERSION/deploy.sh` file and ensure it has execution permissions. For example, `chmod +x -R deploy/0.0.1/`. - In the `deploy.sh` file, write the Tinybird CLI commands you want to execute during the deployment process. - The CI/CD pipelines will find the `deploy.sh` file and execute it when deploying the matching version of the project. Custom deployments also run in CI, so use the CI run to validate that the custom deployment will work when merging the Pull Request. Once you merge a Pull Request with a custom deployment, make sure you update the `VERSION` environment variable, so this custom deployment does not run on the next Pull Request. After a custom deployment, if you did not run `tb deploy`, make sure your Workspace is synchronized with the Git main branch head commit by running `tb init --override-commit HEAD_COMMIT_ID`. ## Deploying common use cases¶ Check out [the Use Case repository](https://github.com/tinybirdco/use-case-examples) for common use cases and scenarios. ## Next steps¶ - Practice iterating on one of Tinybird's examples in the [Use Case repository](https://github.com/tinybirdco/use-case-examples). - Learn about [backfill strategies](https://www.tinybird.co/docs/docs/production/backfill-strategies). --- URL: https://www.tinybird.co/docs/production/implementing-test-strategies Last update: 2024-11-08T11:17:52.000Z Content: --- title: "Implementing test strategies · Tinybird Docs" theme-color: "#171612" description: "Learn about different strategies for testing your data project. You'll learn how to implement regression tests, data quality tests, and fixture tests." --- # Implementing test strategies¶ Tinybird provides you with a suite of tools to test your project. This means you can make changes and be confident that they won't break the API Endpoints you've deployed. ## Overview¶ This walkthrough is based on the [Web Analytics Starter Kit](https://github.com/tinybirdco/web-analytics-starter-kit). You can follow along using your existing Tinybird project, or by cloning the Web Analytics Starter Kit.
If you need to, create a new Workspace by clicking on the following button: <-figure-> ![image](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fcreate-workspace-with-web-analytics.png&w=3840&q=75) ### Generate mock data¶ If you want to send fake data to your project, use Tinybird's [mockingbird](https://mockingbird.tinybird.co/docs/cli) CLI tool. You'll need to run the following command to start receiving dummy events: ##### Command to populate the Web Analytics Starter Kit with fake events mockingbird-cli tinybird \ --template "Web Analytics Starter Kit" \ --token \ --datasource "analytics_events" \ # The region should be "eu_gcp", "us_gcp" --endpoint "" \ --eps 100 \ --limit 1000 ## Testing strategies¶ You can implement three different testing strategies in a Tinybird data project: Regression tests, data quality tests, and fixture tests. All three of them are included as part of the [Continuous Integration workflow](https://www.tinybird.co/docs/docs/production/continuous-integration). **Regression tests** prevent you from breaking working APIs. They run automatically on each commit to a Pull Request, or when trying to overwrite a Pipe with an API Endpoint in a Workspace. They compare both the output and performance of your API Endpoints using the previous and current versions of the Pipe endpoints. **Data quality tests** warn you about anomalies in the data. As opposed to regression tests, you do have to write data quality tests to cover one or more criteria over your data: the presence of null values, duplicates, out-of-range values, etc. Data quality tests are usually scheduled by users to run over production data as well. **Fixture tests** are like "manual tests" for your API Endpoints. They check that a given call to a given Pipe endpoint with a set of parameters and a known set of data (fixtures) returns a deterministic response. They're useful for coverage testing and for when you are developing or debugging new business logic that requires very specific data scenarios. When creating fixture tests, you must provide both the test and any data fixtures. ## Regression tests¶ When one of your API Endpoints is integrated into a production environment (such as a web/mobile application or a dashboard), you want to make sure that any change in the Pipe doesn't change the output of the API endpoint. In other words, you want the same version of an API Endpoint to return the same data for the same requests. Tinybird provides you with automatic regression tests that run any time you push a new change to an API Endpoint. Here's an example you can follow along by reading - no need to clone anything. We'll go through the example, and then explain how regression tests work under the hood. Imagine you have a `top_browsers` Pipe: ##### Definition of the top\_browsers.pipe file DESCRIPTION > Top Browsers ordered by most visits. Accepts `date_from` and `date_to` date filter. Defaults to last 7 days. Also `skip` and `limit` parameters for pagination. 
TOKEN "dashboard" READ NODE endpoint DESCRIPTION > Group by browser and calculate hits and visits SQL > % SELECT browser, uniqMerge(visits) as visits, countMerge(hits) as hits FROM analytics_sources_mv WHERE {% if defined(date_from) %} date >= {{Date(date_from, description="Starting day for filtering a date range", required=False)}} {% else %} date >= timestampAdd(today(), interval -7 day) {% end %} {% if defined(date_to) %} and date <= {{Date(date_to, description="Finishing day for filtering a date range", required=False)}} {% else %} and date <= today() {% end %} GROUP BY browser ORDER BY visits desc LIMIT {{Int32(skip, 0)}},{{Int32(limit, 50)}} These requests are sent to the API Endpoint: ##### Let's run some requests to the API Endpoint to see the output curl https://api.tinybird.co/v0/pipes/top_browsers.json?token={TOKEN} curl https://api.tinybird.co/v0/pipes/top_browsers.json?token={TOKEN}&date_from=2020-04-23&date_to=2030-04-23 Now, it's possible to filter by `device` and modify the API Endpoint: ##### Definition of the top\_browsers.pipe file DESCRIPTION > Top Browsers ordered by most visits. Accepts `date_from` and `date_to` date filter. Defaults to last 7 days. Also `skip` and `limit` parameters for pagination. TOKEN "dashboard" READ NODE endpoint DESCRIPTION > Group by browser and calculate hits and visits SQL > % SELECT browser, uniqMerge(visits) as visits, countMerge(hits) as hits FROM analytics_sources_mv WHERE {% if defined(date_from) %} date >= {{Date(date_from, description="Starting day for filtering a date range", required=False)}} {% else %} date >= timestampAdd(today(), interval -7 day) {% end %} {% if defined(date_to) %} and date <= {{Date(date_to, description="Finishing day for filtering a date range", required=False)}} {% else %} and date <= today() {% end %} {% if defined(device) %} and device = {{String(device, description="Device to filter", required=False)}} {% end %} GROUP BY browser ORDER BY visits desc LIMIT {{Int32(skip, 0)}},{{Int32(limit, 50)}} At this point, you'd create a new Pull Request like [this example](https://github.com/tinybirdco/use-case-examples/pull/213) with the changes above, so the API Endpoint is tested for regressions. On the standard Tinybird [Continuous Integration pipeline](https://github.com/tinybirdco/use-case-examples/actions/runs/7616008955/job/20741856936?pr=213) , changes are deployed to a branch, and there's a `Run Pipe regression tests` step that runs the following command on that branch: ##### Run regression tests tb branch regression-tests coverage --wait The next step creates a Job that runs the coverage of the API Endpoints. The Job tests all combinations of parameters by running at least one request for each combination, and comparing the results of the new and old versions of the Pipe: ##### Run coverage regression tests ## In case the endpoints don't have any requests, we will show a warning so you can delete the endpoint if it's not needed or it's expected 🚨 No requests found for the endpoint analytics_hits - coverage (Skipping validation). 💡 See this guide for more info about the regression tests => https://www.tinybird.co/docs/production/implementing-test-strategies#testing-strategies 🚨 No requests found for the endpoint trend - coverage (Skipping validation). 💡 See this guide for more info about the regression tests => https://www.tinybird.co/docs/production/implementing-test-strategies#testing-strategies 🚨 No requests found for the endpoint top_locations - coverage (Skipping validation). 
💡 See this guide for more info about the regression tests => https://www.tinybird.co/docs/production/implementing-test-strategies#testing-strategies 🚨 No requests found for the endpoint kpis - coverage (Skipping validation). 💡 See this guide for more info about the regression tests => https://www.tinybird.co/docs/production/implementing-test-strategies#testing-strategies 🚨 No requests found for the endpoint top_sources - coverage (Skipping validation). 💡 See this guide for more info about the regression tests => https://www.tinybird.co/docs/production/implementing-test-strategies#testing-strategies 🚨 No requests found for the endpoint top_pages - coverage (Skipping validation). 💡 See this guide for more info about the regression tests => https://www.tinybird.co/docs/production/implementing-test-strategies#testing-strategies ## If the endpoint has been requested, we will run the regression tests and show the results ## This validation is running for each combination of parameters and comparing the results from the branch against the resources that we have copied from the production OK - top_browsers(coverage) - https://api.tinybird.co/v0/pipes/top_browsers.json?&pipe_checker=true - 0.318s (9.0%) 8.59 KB (0.0%) OK - top_browsers(coverage) - https://api.tinybird.co/v0/pipes/top_browsers.json?date_start=2020-04-23&date_end=2030-04-23&pipe_checker=true - 0.267s (-58.0%) 8.59 KB (0.0%) OK - top_browsers(coverage) - https://api.tinybird.co/v0/pipes/top_browsers.json?date_from=2020-04-23&date_to=2030-04-23&pipe_checker=true - 0.297s (-26.0%) 0 bytes (0%) 🚨 No requests found for the endpoint top_devices - coverage (Skipping validation). 💡 See this guide for more info about the regression tests => https://www.tinybird.co/docs/production/implementing-test-strategies#testing-strategies ==== Performance metrics ==== --------------------------------------------------------------------- | top_browsers(coverage) | Origin | Branch | Delta | --------------------------------------------------------------------- | min response time | 0.293 seconds | 0.267 seconds | -8.83 % | | max response time | 0.639 seconds | 0.318 seconds | -50.16 % | | mean response time | 0.445 seconds | 0.294 seconds | -33.87 % | | median response time | 0.402 seconds | 0.297 seconds | -26.23 % | | p90 response time | 0.639 seconds | 0.318 seconds | -50.16 % | | min read bytes | 0 bytes | 0 bytes | +0.0 % | | max read bytes | 8.59 KB | 8.59 KB | +0.0 % | | mean read bytes | 5.73 KB | 5.73 KB | +0.0 % | | median read bytes | 8.59 KB | 8.59 KB | +0.0 % | | p90 read bytes | 8.59 KB | 8.59 KB | +0.0 % | --------------------------------------------------------------------- ==== Results Summary ==== -------------------------------------------------------------------------------------------- | Endpoint | Test | Run | Passed | Failed | Mean response time | Mean read bytes | -------------------------------------------------------------------------------------------- | analytics_hits | coverage | 0 | 0 | 0 | +0.0 % | +0.0 % | | trend | coverage | 0 | 0 | 0 | +0.0 % | +0.0 % | | top_locations | coverage | 0 | 0 | 0 | +0.0 % | +0.0 % | | kpis | coverage | 0 | 0 | 0 | +0.0 % | +0.0 % | | top_sources | coverage | 0 | 0 | 0 | +0.0 % | +0.0 % | | top_pages | coverage | 0 | 0 | 0 | +0.0 % | +0.0 % | | top_browsers | coverage | 3 | 3 | 0 | -33.87 % | +0.0 % | | top_devices | coverage | 0 | 0 | 0 | +0.0 % | +0.0 % | -------------------------------------------------------------------------------------------- As you can see, regression tests are run for 
each combination of parameters and the results are compared against the Workspace. In this case, there are no regression issues since adding a new filter. Let's see what happens if you introduce a breaking change in the Pipe definition. First, you'd run some requests using the device filter: ##### Let's run some requests using the device filter curl https://api.tinybird.co/v0/pipes/top_browsers.json?token={TOKEN}&device=mobile-android curl https://api.tinybird.co/v0/pipes/top_browsers.json?token={TOKEN}&date_from=2020-04-23&date_to=2030-04-23&device=desktop Then, introduce a breaking change in the Pipe definition by removing the `device` filter that was added in the previous step: ##### Definition of the top\_browsers.pipe file DESCRIPTION > Top Browsers ordered by most visits. Accepts `date_from` and `date_to` date filter. Defaults to last 7 days. Also `skip` and `limit` parameters for pagination. TOKEN "dashboard" READ NODE endpoint DESCRIPTION > Group by browser and calculate hits and visits SQL > % SELECT browser, uniqMerge(visits) as visits, countMerge(hits) as hits FROM analytics_sources_mv WHERE {% if defined(date_from) %} date >= {{Date(date_from, description="Starting day for filtering a date range", required=False)}} {% else %} date >= timestampAdd(today(), interval -7 day) {% end %} {% if defined(date_to) %} and date <= {{Date(date_to, description="Finishing day for filtering a date range", required=False)}} {% else %} and date <= today() {% end %} GROUP BY browser ORDER BY visits desc LIMIT {{Int32(skip, 0)}},{{Int32(limit, 50)}} At this point, you'd create a Pull Request [like this example](https://github.com/tinybirdco/use-case-examples/pull/215) with the change above so the API Endpoint is tested for regressions. This time, the output is slightly different than before: ##### Run coverage regression tests 🚨 No requests found for the endpoint analytics_hits - coverage (Skipping validation). 💡 See this guide for more info about the regression tests => https://www.tinybird.co/docs/production/implementing-test-strategies#testing-strategies 🚨 No requests found for the endpoint trend - coverage (Skipping validation). 💡 See this guide for more info about the regression tests => https://www.tinybird.co/docs/production/implementing-test-strategies#testing-strategies 🚨 No requests found for the endpoint top_locations - coverage (Skipping validation). 💡 See this guide for more info about the regression tests => https://www.tinybird.co/docs/production/implementing-test-strategies#testing-strategies 🚨 No requests found for the endpoint kpis - coverage (Skipping validation). 💡 See this guide for more info about the regression tests => https://www.tinybird.co/docs/production/implementing-test-strategies#testing-strategies 🚨 No requests found for the endpoint top_sources - coverage (Skipping validation). 💡 See this guide for more info about the regression tests => https://www.tinybird.co/docs/production/implementing-test-strategies#testing-strategies 🚨 No requests found for the endpoint top_pages - coverage (Skipping validation). 
💡 See this guide for more info about the regression tests => https://www.tinybird.co/docs/production/implementing-test-strategies#testing-strategies ## The requests without using the device filter are still working, but the other will return a different number of rows, value OK - top_browsers(coverage) - https://api.tinybird.co/v0/pipes/top_browsers.json?&pipe_checker=true - 0.3s (-45.0%) 8.59 KB (0.0%) FAIL - top_browsers(coverage) - https://api.tinybird.co/v0/pipes/top_browsers.json?device=mobile-android&pipe_checker=true - 0.274s (-43.0%) 8.59 KB (-2.0%) OK - top_browsers(coverage) - https://api.tinybird.co/v0/pipes/top_browsers.json?date_start=2020-04-23&date_end=2020-04-23&pipe_checker=true - 0.341s (18.0%) 8.59 KB (0.0%) OK - top_browsers(coverage) - https://api.tinybird.co/v0/pipes/top_browsers.json?date_from=2020-04-23&date_to=2020-04-23&pipe_checker=true - 0.314s (-49.0%) 0 bytes (0%) FAIL - top_browsers(coverage) - https://api.tinybird.co/v0/pipes/top_browsers.json?date_from=2020-01-23&date_to=2030-04-23&device=desktop&pipe_checker=true - 0.218s (-58.0% skipped < 0.3) 8.59 KB (-2.0%) 🚨 No requests found for the endpoint top_devices - coverage (Skipping validation). 💡 See this guide for more info about the regression tests => https://www.tinybird.co/docs/production/implementing-test-strategies#testing-strategies ==== Failures Detail ==== ❌ top_browsers(coverage) - https://api.tinybird.co/v0/pipes/top_browsers.json?device=mobile-android&pipe_checker=true ** 1 != 4 : Unexpected number of result rows count, this might indicate regression. 💡 Hint: Use `--no-assert-result-rows-count` if it's expected and want to skip the assert. Origin Workspace: https://api.tinybird.co/v0/pipes/top_browsers.json?device=mobile-android&pipe_checker=true&__tb__semver=0.0.0 Test Branch: https://api.tinybird.co/v0/pipes/top_browsers.json?device=mobile-android&pipe_checker=true ❌ top_browsers(coverage) - https://api.tinybird.co/v0/pipes/top_browsers.json?date_from=2020-01-23&date_to=2030-04-23&device=desktop&pipe_checker=true ** 3 != 4 : Unexpected number of result rows count, this might indicate regression. 💡 Hint: Use `--no-assert-result-rows-count` if it's expected and want to skip the assert. Origin Workspace: https://api.tinybird.co/v0/pipes/top_browsers.json?date_from=2020-01-23&date_to=2030-04-23&device=desktop&pipe_checker=true&__tb__semver=0.0.0 Test Branch: https://api.tinybird.co/v0/pipes/top_browsers.json?date_from=2020-01-23&date_to=2030-04-23&device=desktop&pipe_checker=true ==== Performance metrics ==== Error: ** Check Failures Detail above for more information. If the results are expected, skip asserts or increase thresholds, see 💡 Hints above (note skip asserts flags are applied to all regression tests, so use them when it makes sense). If you are using the CI template for GitHub or GitLab you can add skip asserts flags as labels to the MR and they are automatically applied. 
Find available flags to skip asserts and thresholds here => https://www.tinybird.co/docs/production/implementing-test-strategies#testing-strategies --------------------------------------------------------------------- | top_browsers(coverage) | Origin | Branch | Delta | --------------------------------------------------------------------- | min response time | 0.29 seconds | 0.218 seconds | -24.61 % | | max response time | 0.612 seconds | 0.341 seconds | -44.28 % | | mean response time | 0.491 seconds | 0.289 seconds | -41.06 % | | median response time | 0.523 seconds | 0.3 seconds | -42.72 % | | p90 response time | 0.612 seconds | 0.341 seconds | -44.28 % | | min read bytes | 0 bytes | 0 bytes | +0.0 % | | max read bytes | 8.8 KB | 8.59 KB | -2.33 % | | mean read bytes | 6.95 KB | 6.87 KB | -1.18 % | | median read bytes | 8.59 KB | 8.59 KB | +0.0 % | | p90 read bytes | 8.8 KB | 8.59 KB | -2.33 % | --------------------------------------------------------------------- ==== Results Summary ==== -------------------------------------------------------------------------------------------- | Endpoint | Test | Run | Passed | Failed | Mean response time | Mean read bytes | -------------------------------------------------------------------------------------------- | analytics_hits | coverage | 0 | 0 | 0 | +0.0 % | +0.0 % | | trend | coverage | 0 | 0 | 0 | +0.0 % | +0.0 % | | top_locations | coverage | 0 | 0 | 0 | +0.0 % | +0.0 % | | kpis | coverage | 0 | 0 | 0 | +0.0 % | +0.0 % | | top_sources | coverage | 0 | 0 | 0 | +0.0 % | +0.0 % | | top_pages | coverage | 0 | 0 | 0 | +0.0 % | +0.0 % | | top_browsers | coverage | 5 | 3 | 2 | -41.06 % | -1.18 % | | top_devices | coverage | 0 | 0 | 0 | +0.0 % | +0.0 % | -------------------------------------------------------------------------------------------- ❌ FAILED top_browsers(coverage) - https://api.tinybird.co/v0/pipes/top_browsers.json?device=mobile-android&pipe_checker=true ❌ FAILED top_browsers(coverage) - https://api.tinybird.co/v0/pipes/top_browsers.json?date_from=2020-01-23&date_to=2030-04-23&device=desktop&pipe_checker=true Because the API Endpoint filter changed, the request response changed, and the regression testing warns you. If the change is expected, you can skip the assertion by adding the following labels to the Pull Request: - `--no-assert-result` : If you expect the API Endpoint output to be different from the current version - `--no-assert-result-no-error` : If you expect errors from the API Endpoint - `--no-assert-result-rows-count` : If you expect the number of elements in the API Endpoint output to be different than the current version - `--assert-result-ignore-order` : If you expect the API Endpoint output is returning the same elements but in a different order - `--assert-time-increase-percentage -1` : If you expect the API Endpoint execution time to increase more than 25% from the current version - `--assert-bytes-read-increase-percentage -1` : If you expect the API Endpoint bytes read to increase more than 25% from the current version - `--assert-max-time` : If you expect the API Endpoint execution time to vary but you don't want to assert the increase in time up to a certain threshold. For instance, if you want to allow your API Endpoints to respond in up to 1s and you don't want to assert any increase in time percentage use `--assert-max-time 1` . By default is 0.3s. These flags will be applied to ALL the regression tests and are advised to be used for one-time breaking changes. 
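In the CI workflow shown earlier, these labels are read from the Pull Request and appended to the `tb branch regression-tests` command. As a sketch, adding the `--no-assert-result-rows-count` label is roughly equivalent to running the following against the test Branch (this assumes the default `coverage` strategy used by the template):
##### Run coverage regression tests skipping the row-count assert
tb branch regression-tests coverage --wait --no-assert-result-rows-count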
Define a file in `tests/regression.yaml` to configure the behavior of the regression tests for each API Endpoint. Follow the docs in [Configure regression tests](https://www.tinybird.co/docs/about:blank#configure-regression-tests). For this example, you would add the label `--no-assert-result-rows-count` to the PR, as you'd expect the number of rows to be different, since removing the filter is intentional. <-figure-> ![image](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fskip-assert-result-rows-count.png&w=3840&q=75) At this point, the regression tests would pass and the PR could be merged. ### How regression tests work¶ The regression test functionality is powered by `tinybird.pipe_stats_rt`, one of the [service Data Sources](https://www.tinybird.co/docs/docs/monitoring/service-datasources) that is available to all users by default. There are three options to run regression tests: `coverage`, `last` and `manual`. - The `coverage` option will run at least 1 request for each combination of parameters. - The `last` option will run the last N requests for each combination of parameters. - The `manual` option will run the requests you define in the configuration file `tests/regression.yaml`. When you run `tb branch regression-tests coverage --wait`, Tinybird generates a job that queries `tinybird.pipe_stats_rt` to gather all the possible combinations of queries done in the last 7 days for each API Endpoint. ##### Simplification of the query used to gather all the possible combinations SELECT ## Using this function we extract all the parameters used in each request extractURLParameterNames(assumeNotNull(url)) as params, ## According to the option `--sample-by-params`, we run one query for each combination of parameters or more groupArraySample({sample_by_params if sample_by_params > 0 else 1})(url) as endpoint_url FROM tinybird.pipe_stats_rt WHERE pipe_name = '{pipe_name}' ## According to the option `--match`, we will filter only the requests that contain that parameter ## This is especially useful when you want to validate a new parameter you want to introduce or you have optimized the endpoint in that specific case { " AND " + " AND ".join([f"has(params, '{match}')" for match in matches]) if matches and len(matches) > 0 else ''} GROUP BY params FORMAT JSON When you run `tb branch regression-tests last --wait`, Tinybird generates a job that queries `tinybird.pipe_stats_rt` to gather the last N requests for each API Endpoint. ##### Query to get the last N requests SELECT url FROM tinybird.pipe_stats_rt WHERE pipe_name = '{pipe_name}' ## According to the option `--match`, we will filter only the requests that contain that parameter ## This is especially useful when you want to validate a new parameter you want to introduce or you have optimized the endpoint in that specific case { " AND " + " AND ".join([f"has(params, '{match}')" for match in matches]) if matches and len(matches) > 0 else ''} ## According to the option `--limit`, by default 100 LIMIT {limit} FORMAT JSON ### Configure regression tests¶ By default, the CI Workflow uses the `coverage` option when running the regression tests: ##### Default command to run regression tests in the CI Workflow tb branch regression-tests coverage --wait But if it finds a `tests/regression.yaml` file in your project, it uses the configuration defined in that file.
##### Run regression tests for all endpoints in a test branch using a configuration file tb branch regression-tests -f tests/regression.yaml --wait If the file `tests/regression.yaml` is present, only the `--skip-regression-tests` and `--no-skip-regression-tests` labels in the Pull Request will take effect. The configuration file is a YAML file with the following structure: - pipe: '.*' # regular expression that selects all API Endpoints in the Workspace tests: # list of tests to run for this Pipe - [coverage|last|manual]: # testing strategy to use (coverage, last, or manual) config: # configuration options for this strategy assert_result: bool = True # whether to perform an assertion on the results returned by the endpoint assert_result_no_error: bool = True # whether to verify that the endpoint does not return errors assert_result_rows_count: bool = True # whether to verify that the correct number of elements are returned in the results assert_result_ignore_order: bool = False # whether to ignore the order of the elements in the results assert_time_increase_percentage: int = 25 # allowed percentage increase in endpoint response time. use -1 to disable assert assert_bytes_read_increase_percentage: int = 25 # allowed percentage increase in the amount of bytes read by the endpoint. use -1 to disable assert assert_max_time: float = 1 # Max time allowed for the endpoint response time. If the response time is lower than this value then the assert_time_increase_percentage is not taken into account. Default is 0.3 skip: bool = False # whether the test should be skipped, use it to skip individual Pipe tests failfast: bool = False # whether the test should stop at the first error encountered Note that the order of preference for the configuration options is from bottom to top, so the configuration options specified for a particular Pipe take precedence over the options specified earlier (higher up) in the configuration file. Here's an example YAML file with two entries matching Pipes by regular expression, where the second entry overrides the configuration of the first for the `top_pages` Pipe: - pipe: 'top_.*' tests: - coverage: config: # Default config but reducing thresholds from default 25 and expecting different order in the response payload assert_time_increase_percentage: 15 assert_bytes_read_increase_percentage: 15 assert_result_ignore_order: true - last: config: # default config but not asserting performance and failing at first occurrence assert_time_increase_percentage: -1 assert_bytes_read_increase_percentage: -1 failfast: true limit: 5 # Default value will be 10 - pipe: 'top_pages' tests: - coverage: - manual: params: - {param1: value1, param2: value2} - {param1: value3, param2: value4} config: failfast: true # Override config for top_pages executing coverage with default config and two manual requests ## Data quality tests¶ Data quality tests are meant to cover scenarios that shouldn't happen in your production data. For example, you can check that the data is not duplicated or that there are no out-of-range values. Data quality tests are run with the `tb test` command. You can use them in two different ways: - Run them periodically over your production data. - Use them as part of your test suite in Continuous Integration with a Branch or fixtures. Here's an example you can follow along with just by reading - no need to clone anything. In the Web Analytics example, you can use data quality tests to validate that there are no duplicate entries with the same session_id in the `analytics_events` Data Source.
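As a preview of what that duplicate check could look like, here is a minimal sketch. It assumes the `analytics_events` landing Data Source has `timestamp`, `session_id`, and `action` columns, and it appends a test entry to the `tests/default.yaml` file that `tb test init` (described next) generates:
##### Sketch: a duplicate-entries data quality test
cat >> tests/default.yaml << 'EOF'
# One way to express the "no duplicate entries" check: the same event
# (timestamp, session_id, action) should not appear more than once
- no_duplicate_events:
    max_bytes_read: null
    max_time: null
    pipe: null
    sql: |
      SELECT timestamp, session_id, action, count() AS c
      FROM analytics_events
      GROUP BY timestamp, session_id, action
      HAVING c > 1
EOF
The test passes only when the query returns no rows, following the same convention as the generated examples below.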
Run `tb test init` to generate a dummy test file in `tests/default.yaml`: ##### Example of tests generated by running tb test init # This test should always pass as numbers(5) returns values from 0 to 4 - this_test_should_pass: max_bytes_read: null max_time: null pipe: null sql: SELECT * FROM numbers(5) WHERE 0 # This test should always fail as numbers(5) returns values from 0 to 4. Therefore the SQL will return a value - this_test_should_fail: max_bytes_read: null max_time: null pipe: null sql: SELECT * FROM numbers(5) WHERE 1 # If max_time is specified, the test will show a warning if the query takes more than the threshold - this_test_should_pass_over_time: max_bytes_read: null max_time: 1.0e-07 pipe: null sql: SELECT * FROM numbers(5) WHERE 0 # if max_bytes_read is specified, the test will show a warning if the query reads more bytes than the threshold - this_test_should_pass_over_bytes: max_bytes_read: 5 max_time: null pipe: null sql: SELECT sum(number) AS total FROM numbers(5) HAVING total>1000 # The combination of both - this_test_should_pass_over_time_and_bytes: max_bytes_read: 5 max_time: 1.0e-07 pipe: null sql: SELECT sum(number) AS total FROM numbers(5) HAVING total>1000 These tests check that the SQL query returns an empty result. If the result is not empty, the test fails. Following the same pattern, the example Pull Request below adds a `no_duplicate_entries` test; this particular check verifies that no date has a negative aggregated `total_sales` in `top_products_view`: - no_duplicate_entries: max_bytes_read: null max_time: null sql: | SELECT date, sumMerge(total_sales) total_sales FROM top_products_view GROUP BY date HAVING total_sales < 0 You can follow along with the [example Pull Request](https://github.com/tinybirdco/use-case-examples/pull/216) that would be made, and see the [Run tests](https://github.com/tinybirdco/use-case-examples/actions/runs/7626991493/job/20774698416?pr=216) workflow output: ##### Output of data quality tests -------------------------- semver: 0.0.2 file: ./tests/default.yaml test: no_duplicate_entries status: Pass elapsed: 0.001710404 ms -------------------------- Totals: Total Pass: 1 If the test fails, the CI workflow fails, and the output shows the rows returned by the SQL query. You can also test the output of a Pipe. For instance, in the Web Analytics example, you could add a validation: 1. First, query the API Endpoint `top_products` with the parameters `date_start` and `date_end` specified in the test. 2. Then, run the SQL query over the result of the previous step. ##### Data quality test for a Pipe - products_by_date: max_bytes_read: null max_time: null sql: | SELECT 1 FROM top_products HAVING count() = 0 pipe: name: top_products params: date_start: '2020-01-01' date_end: '2020-12-31' ## Fixture tests¶ Regression tests confirm the backward compatibility of your API Endpoints when overwriting them, and data quality tests cover scenarios that shouldn't happen in test or production data. Sometimes, you need to cover a very specific scenario, or a use case that is being developed and you don't have production data for it. This is when to use fixture tests. To configure fixture testing you need: - A script to run fixture tests, like [this example](https://github.com/tinybirdco/ci/blob/main/scripts/exec_test.sh). The script is automatically created when you [connect your Workspace to Git](https://www.tinybird.co/docs/docs/production/working-with-version-control). - Data fixtures: These are datafiles placed in the `datasources/fixtures` folder of your project.
Their name must match the name of a Data Source. - Fixture tests in the `tests` folder. In the Continuous Integration job, a Branch is created. If fixture data exists in the `datasources/fixtures` folder, it is appended to the Data Sources in the Branch, and the fixture tests are run. To effectively use data fixtures you should: - Use data that does not collide with production data, to avoid unexpected results in regression testing. - Only use data fixtures for landing Data Sources, since Materialized Views are populated automatically. - Use future dates in the data fixtures to avoid problems with the TTL of the Data Sources. Fixture tests are placed inside the `tests` folder of your project. If you have a lot of tests, create subfolders to organize the tests by module or API Endpoint. Each fixture test requires two files: - One to indicate a request to an API Endpoint with the naming convention `.test` - One to indicate the exact response to the API Endpoint with the naming convention `.test.result` For instance, to test the output of the `top_browsers` endpoint, create a `simple_top_browsers.test` fixture test with this content: ##### simple\_top\_browsers.test tb pipe data top_browsers --date_from 2100-01-01 --date_to 2100-02-01 --format CSV The test makes a request to the `top_browsers` API Endpoint, passing the `date_from` and `date_to` parameters and requesting the response in `CSV` format. Next, create a `simple_top_browsers.test.result` file with the expected result given your data fixtures: ##### simple\_top\_browsers.test.result "browser","visits","hits" "chrome",1,1 With this approach, your tests should run like this [example Pull Request](https://github.com/tinybirdco/use-case-examples/pull/217/). Fixture tests are run as part of the Continuous Integration pipeline and the Job fails if the tests fail. <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Ffixture-tests-CI-step.png&w=3840&q=75) ## Next steps¶ - Learn how to use [staging and production Workspaces](https://www.tinybird.co/docs/docs/production/staging-and-production-workspaces). - Check out an [example test Pull Request](https://github.com/tinybirdco/use-case-examples/pull/217/). --- URL: https://www.tinybird.co/docs/production/organizing-resources Last update: 2024-11-13T15:42:17.000Z Content: --- title: "Organizing resources in Workspaces · Tinybird Docs" theme-color: "#171612" description: "Tinybird projects can have many resources. Here's how to organize them." --- # Organizing resources in Workspaces¶ When working with Tinybird, projects can quickly accumulate a multitude of resources. Effective organization is essential for maintaining efficiency and clarity. While there’s no one-size-fits-all approach, we recommend structuring each Workspace around a specific use case. This practice keeps related resources together and simplifies collaboration and management. In the Tinybird UI, your Data Project is automatically divided into two main folders: Pipes and Data Sources. These folders are fundamental and cannot be deleted or removed. However, you have the flexibility to further organize your resources by assigning specific attributes using Tags. ## Organizing resources in the UI¶ ### Assigning Tags to Resources¶ Tags are a powerful way to categorize and manage your resources. You can add tags to both Pipes and Data Sources: - For Pipes: You’ll find the option to add tags directly under the title of the Pipe or in the Pipe list screen.
<-figure-> ![pipe tags](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Forganizing-resources-1.png&w=3840&q=75) - For Data Sources: Tags can be added in the metadata section of each Data Source or in the Data Source list screen. <-figure-> ![datasource tags](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Forganizing-resources-2.png&w=3840&q=75) When you create a tag, it becomes available to assign to multiple resources, allowing for consistent categorization across your project. <-figure-> ![creating tag](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Forganizing-resources-3.png&w=3840&q=75) If you didn’t assign tags when creating your resources, you can always add or modify them later from the general lists of Pipes and Data Sources. <-figure-> ![bulk tags](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Forganizing-resources-4.png&w=3840&q=75) ### Filtering Resources¶ One of the main advantages of using tags is the ability to filter your resources in the sidebar. <-figure-> ![tag filters](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Forganizing-resources-5.png&w=3840&q=75) For instance, if you have a tag called “latest” to denote the most recent versions of your resources, you can easily filter the sidebar to display only those tagged resources. This is especially helpful when your Workspace contains numerous resources, and you need to focus on specific versions or categories to streamline your workflow. By effectively using tags, you can enhance your productivity and maintain a well-organized Workspace, making it easier to navigate and manage your Tinybird projects. ### Renaming Tags¶ If you need to change the name of a tag, go to any resource that is tagged with the tag you want to rename. Click on the tag and choose **"Rename tag"**. <-figure-> ![renaming tags](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Frenaming-tags.gif&w=3840&q=75) ### Deleting Tags permanently¶ If you need to delete a tag from the whole Workspace, go to any resource that is tagged with the tag you want to delete. Click on the tag and choose **"Delete permanently"**. You'll see a list with all resources attached to the tag. Once you confirm, the tag will be deleted and the resources detached from the tag. <-figure-> ![deleting tags](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fdeleting-tags.gif&w=3840&q=75) ## Organizing resources using the CLI¶ Tags are supported in CLI versions 5.8.0 and above. ### Tags in datafiles¶ Tags are not restricted to working in the UI. If you use the CLI to manage your projects, you can also add tags to Data Sources and Pipes using their datafiles. The tags syntax is `TAGS` **followed by comma-separated values**, at the top section of the files. TAGS "stock_case, stats" To tag a Data Source, write the `TAGS` directive right before `SCHEMA`: ##### tinybird/datasources/example.datasource TOKEN tracker APPEND DESCRIPTION > Analytics events **landing data source** TAGS "stock_case, stats" SCHEMA > `timestamp` DateTime `json:$.timestamp`, `session_id` String `json:$.session_id`, `action` LowCardinality(String) `json:$.action`, ...
To tag a Pipe, write the `TAGS` directive right before the `NODE` s settings ##### tinybird/pipes/sales\_by\_hour\_mv.pipe DESCRIPTION materialized Pipe to aggregate sales per hour in the sales_by_hour Data Source TAGS "stock_case, stats" NODE daily_sales SQL > SELECT toStartOfDay(starting_date) day, country, sum(sales) as total_sales FROM teams GROUP BY day, country TYPE MATERIALIZED DATASOURCE sales_by_hour When you deploy your changes, the resource will get tagged. You can use the tags to filter the data project later in the UI. Tags get created automatically. You don't have to worry about creating a new one before tagging a resource. If the tag you've written doesn't exist, pushing that file will create it. To know more about publishing changes to your data project, read the [Deployment Strategies](https://www.tinybird.co/docs/docs/production/deployment-strategies) docs. To know more about the datafiles syntax, read the [datafiles](https://www.tinybird.co/docs/docs/cli/datafiles/overview) doc in the CLI section. ### tb tag¶ `tb tag` is a CLI command to manage your Workspace tags. You can create, list and delete tags in your workspace. Please refer to the [tb tag command reference](https://www.tinybird.co/docs/docs/cli/command-ref#tb-tag) for more details. Do you miss any functionality in our tags system? Would you like any other command in the CLI regarding tags? Do you have in mind any other feature that would benefit from our tagging system? We would love to hear your feedback. Please, contact Tinybird at [support@tinybird.co](https://www.tinybird.co/docs/mailto:support@tinybird.co) or in the [Community Slack](https://www.tinybird.co/docs/docs/community). --- URL: https://www.tinybird.co/docs/production/overview Last update: 2024-07-31T16:54:46.000Z Content: --- title: "Continuous Integration Introduction · Tinybird Docs" theme-color: "#171612" description: "Learn about the importance of CI/CD in Tinybird, how it integrates with Git, and the advantages it offers for managing real-time data projects." --- # Prepare your data project for production¶ Tinybird's Git integration transforms data pipeline management, aligning it with established software development practices. This ensures each Tinybird Workspace action relates to a specific Git commit, offering a robust, version-controlled environment for your real-time data deployments. In a nutshell, it brings your workflow closer to industry best practices. ## Why use version control?¶ Version control, a standard in software development, is now integral to building real-time data products with Tinybird. If your Workspace uses Tinybird's integration with Git, it means you can build real-time data products like you build any software - not just benefiting from version control, but also isolated environments, CI/CD, and testing too. This approach allows you to treat and manage your real-time data in the **same way you manage your code** . You can take your existing software engineering knowledge and apply the same principles to real-time data products. When managing your Tinybird projects with version control, you can also: - Sync Tinybird actions with your Git commits. - Deploy semantically-versioned Data Sources, Pipes, and API Endpoints as code. - Use Branches to attach production data to non-production Branches, and test your data pipelines safely with real data. If you're familiar with version control already, it should make testing, merging, and deploying your Tinybird data projects a familiar process. 
Data teams can use Tinybird in the same way that software engineering teams work: enforcing standardized agreements around version control, code reviews, quality assurance, testing strategies, and continuous deployment. ## Next steps¶ - Get familiar with the core concepts: [Branches](https://www.tinybird.co/docs/docs/concepts/branches) and [Workspaces](https://www.tinybird.co/docs/docs/concepts/workspaces). - Follow the version control tutorial to connect your Tinybird Workspace to version control: [Working with version control](https://www.tinybird.co/docs/docs/production/working-with-version-control). - Explore our [repository of common use cases for iterating using version control](https://github.com/tinybirdco/use-case-examples) to try out how to iterate Tinybird projects with version control. --- URL: https://www.tinybird.co/docs/production/populate-data Last update: 2024-11-06T17:38:37.000Z Content: --- title: "Populate and copy data between Data Sources · Tinybird Docs" theme-color: "#171612" description: "Learn how populating data within Tinybird works, including the details of reingesting data and important considerations for maintaining data integrity." --- # Populate and copy data between Data Sources¶ You can use Tinybird to populate Data Sources using existing data through the Populate and Copy operations. Both are often used in similar scenarios, with the main distinction being that you can schedule Copy jobs, and that they have more restrictions. Read on to learn how populating data within Tinybird works, including the details of reingesting data and important considerations for maintaining data integrity. ## Populate by partition¶ Populating data by partition requires as many steps as there are partitions in the origin Data Source. You can use this approach for progress tracking and dynamic resource recalculation while the job is in progress. You can also retry steps in case of a memory limit error, which is safer and more reliable. The following diagram shows a populate scenario that involves two Data Sources. Given that Data Source A has 3 partitions: - The job processes the data partition by partition from the source, `Data Source A`. - After the data has been processed, it updates the data on the destination, `Data Source B`. ## Understanding the Data Flow¶ As a use case expands, it might develop a complex Data Flow. Here are three key points to consider: - Data is processed by partition from the origin: each step handles data from a single partition of the origin Data Source. - When more than one Materialized Pipe exists for the same Data Source, the execution order is not deterministic. - Destination Data Sources in the Data Flow only use the data from the specific partition being processed. The following examples illustrate the behavior of Populate jobs in different scenarios. ### Case 1: Joining data from a Data Source that isn't a destination in the Data Flow¶ When using a Data Source (Data Source C) in a Materialized Pipe query, if Data Source C isn't a destination Data Source for any other Materialized Pipe in the Data Flow, it uses all available data in Data Source C at that moment. ### Case 2: Joining data from a Data Source that is a destination in the same Materialized View¶ When using the destination Data Source `Data Source B` in the Materialized Pipe query, `Data Source B` doesn't join any data. This occurs because the data is processed by partition, and the required partition isn't available in the destination at that time.
### Case 3: Joining data from a Data Source that is a destination in another Materialized View¶ When using a Data Source (Data Source C) in a Materialized Pipe (Materialized Pipe 3) query that is the destination of another Materialized Pipe (Materialized Pipe 2) in the Data Flow, it retrieves the data ingested during the process. Whether Data Source C contains data before the view on Materialized Pipe 3 runs isn't deterministic. Because the order depends on the internal ID, you can't determine which Data Source is updated first. To control the order of the Data Flow, run each populate operation separately: 1. Run a populate over Materialized Pipe 1 to populate the data from Data Source A to Data Source B. To prevent automatic data propagation through the rest of the Materialized Views, either unlink the views or truncate the dependent Data Sources if they are repopulated. 2. Perform separate populate operations on Materialized Pipe 2 and Materialized Pipe 3, instead of a single operation on Materialized Pipe 1. ## Learn more¶ Before you use Populate and Copy operations for different [backfill strategies](https://www.tinybird.co/docs/docs/production/backfill-strategies), understand how they work within the Data Flow and its limitations. Read also the [Materialized Views guide](https://www.tinybird.co/docs/docs/publish/materialized-views/best-practices): populating data while continuing ingestion into the origin Data Source might lead to duplicated data. --- URL: https://www.tinybird.co/docs/production/staging-and-production-workspaces Last update: 2024-11-08T11:23:54.000Z Content: --- title: "Staging and production Workspaces · Tinybird Docs" theme-color: "#171612" description: "Tinybird projects can be deployed to multiple Workspaces. You can use this to create production and staging environments for your project." --- # Staging and production Workspaces¶ Tinybird projects can be deployed to multiple Workspaces that contain different data or different connection settings. You can use this to create production and staging environments for your project. Keep your Workspaces clear and distinct. While you can deploy a project to **multiple** Workspaces, you should avoid deploying multiple projects to the **same** Workspace. This page covers how to set up a CI/CD workflow with staging and production Workspaces. This setup includes: - A staging Workspace with staging data and connection settings, used to validate changes to your project before deploying to production. - A production Workspace with production data and connection settings, used to serve your data to real users. - A CI/CD workflow that will run CI over the staging Workspace, deploy manually to the staging Workspace, and finally automatically promote to the production Workspace upon merge. <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fstaging-prod.png&w=3840&q=75) ## Example project setup¶ Create staging and production Workspaces and authenticate using your Workspace admin Token.
##### Create Workspaces tb workspace create staging_acme --user_token tb workspace create pro_acme --user_token Push the project to the production Workspace: ##### Recreating the project tb workspace use pro_acme tb push --push-deps --fixtures Finally, push the project to the staging Workspace: ##### Recreating the project tb workspace use staging_acme tb push --push-deps --fixtures Once you have the project deployed to both Workspaces, make sure you [connect them to Git](https://www.tinybird.co/docs/docs/production/working-with-version-control) by: 1. Running `tb auth` using the admin token of the Workspace. 2. Running `tb init --git` to connect the Workspace to your git main branch. You'll need to run both steps twice (once for each Workspace). When running `tb init --git` you need to make sure you end up with a setup like this one: - `tinybird_ci_stg.yml`: Runs CI over the staging Workspace. - `tinybird_cd_stg.yml`: Runs CD over the staging Workspace. You can run this job manually or follow any other strategy. - `tinybird_cd.yml`: Runs CD over the production Workspace when merging a branch into the main git branch. ## Configuring CI/CD¶ The following CI/CD jobs are based on the examples in the [Continuous Integration](https://www.tinybird.co/docs/docs/production/continuous-integration) section. A common pattern is to run CD in the staging Workspace before moving to production. Below is an example of a GitHub Actions workflow that will perform CI/CD on the staging Workspace. ##### Staging CI pipeline name: Tinybird - Staging CI Workflow on: workflow_dispatch: pull_request: paths: - './**' branches: - main types: [opened, reopened, labeled, unlabeled, synchronize, closed] concurrency: ${{ github.workflow }}-${{ github.event.pull_request.number }} jobs: staging_ci: # run CI to staging Workspace uses: tinybirdco/ci/.github/workflows/ci.yml@main with: data_project_dir: . tb_env: stg secrets: tb_admin_token: ${{ secrets.TB_ADMIN_TOKEN_STG }} # set the Workspace admin Token from the staging Workspace in a new secret tb_host: https://app.tinybird.co ##### Staging CD pipeline name: Tinybird - Staging CD Workflow on: workflow_dispatch: jobs: staging_cd: # deploy changes to staging Workspace uses: tinybirdco/ci/.github/workflows/cd.yml@main with: data_project_dir: . tb_env: stg secrets: tb_admin_token: ${{ secrets.TB_ADMIN_TOKEN_STG }} # set the Workspace admin Token from the staging Workspace in a new secret tb_host: https://app.tinybird.co The important part here is that we are using the admin token from the staging Workspace, stored in a secret named `TB_ADMIN_TOKEN_STG`. Additionally, we are setting the `tb_env` variable to `stg`; usage of the `tb_env` input variable is described in the next section. The CD GitHub Action for the production Workspace would look like this: ##### Production CD pipeline name: Tinybird - Production CD Workflow on: workflow_dispatch: push: paths: - './**' branches: - main jobs: production_cd: # deploy changes to production Workspace uses: tinybirdco/ci/.github/workflows/cd.yml@main with: data_project_dir: .
tb_env: prod secrets: tb_admin_token: ${{ secrets.TB_ADMIN_TOKEN }} # set the Workspace admin Token from the production Workspace in a new secret tb_host: https://app.tinybird.co In this case, the job will run on merge to the git `main` branch, using the admin token from the production Workspace and setting the `tb_env` variable to `prod`. ## Workspace connection credentials and environment variables¶ You'll likely use different credentials or connection parameters for each Workspace, for instance if you have a Kafka topic, S3 bucket, or similar for staging, and a different one for production. Use environment variables to manage these credentials. If you are using your own CI/CD pipelines, see the [docs on Include and environment variables](https://www.tinybird.co/docs/docs/cli/datafiles/include-files#include-with-environment-variables). If you are using the ones provided in this [GitHub repository](https://github.com/tinybirdco/ci), you can follow this strategy: - Set the `tb_env` input variable as shown in the previous section to a value depending on the environment to be deployed. - You then have a `$TB_ENV` environment variable available to configure connection credentials via `include` files, as described in [this section](https://www.tinybird.co/docs/docs/cli/datafiles/include-files#include-with-environment-variables). - Now you can create `connection_details_stg.incl` and `connection_details_prod.incl` and use them inside datafiles like this: `INCLUDE connection_details_${TB_ENV}.incl`. Since `$TB_ENV` has a different value depending on the pipeline for the staging or production environments, resources will be deployed using their environment-specific credentials. You can see a working example [here](https://github.com/tinybirdco/use-case-examples/pull/351/files). ## Staging Workspace vs Branch¶ Branches are intended to be ephemeral or short-lived. They are useful for testing changes while you are developing them. Typically, when you develop a new feature, you'll begin by creating a new Branch. Your development work takes place on the Branch, and you can test your changes as you go. On the other hand, Workspaces are intended to be permanent or long-lived. You'll use a Workspace to deploy your project into production, as a testing environment, or to experiment with new projects. Staging Workspaces are optional, and different teams might use them differently, for example: - You don't want to test with your production data, so you have a separate, well-known subset of data in staging. - You want to perform integration testing with the development version of your project before moving it to the production Workspace. - You want to test out a complex deployment or data migration before deploying to the production Workspace. ## Next steps¶ - Learn more about [working with version control](https://www.tinybird.co/docs/docs/production/working-with-version-control). - Learn how to integrate Workspaces with Git in a [Continuous Integration and Deployment (CI/CD)](https://www.tinybird.co/docs/docs/production/continuous-integration) pipeline. --- URL: https://www.tinybird.co/docs/production/working-with-version-control Last update: 2024-11-12T11:45:41.000Z Content: --- title: "Working with version control · Tinybird Docs" theme-color: "#171612" description: "With Tinybird, you can take your existing version control knowledge from software engineering, and apply it immediately to your real-time data products."
--- # Working with version control¶ With Tinybird, you can take your existing version control knowledge from software engineering, and apply it immediately to your real-time data products, using Git. ## How the Git integration works¶ Tinybird's Git integration creates a bi-directional link between your Tinybird Workspace and a remote Git repository. This means you can work on Tinybird resources locally, push to Git and then sync to Tinybird. In this way, Git becomes the ultimate source of truth for your project. You can then follow standard development patterns, such as working in feature branches, using pull requests, and running CI/CD pipelines to validate changes and deploy to production. When you connect your Workspace to a Git repository, we suggest following [CI/CD](https://www.tinybird.co/docs/docs/production/continuous-integration) to deploy to the Workspace. The CI/CD pipelines need to be executed outside of Tinybird in a runner such as GitHub Actions or GitLab Runner. You can connect your Workspace to Git using the [CLI](https://www.tinybird.co/docs/about:blank#connect-your-workspace-to-git-from-the-cli). You can connect both new and existing Tinybird Workspaces with Git at any time. If you're connecting an existing Workspace for the first time, your project syncs with the remote Git-based repository the moment you connect it. If you make changes in Tinybird after connecting with Git, you can create a Branch and merge the changes with a Pull Request. ## Project structure¶ A Tinybird project is represented by a collection of text files (called [datafiles](https://www.tinybird.co/docs/docs/cli/datafiles/overview)) that are organized in folders. You can initialize a new Tinybird project with automatic scaffolding using the `tb init` CLI command. This creates the following files and folders: - /datasources - /datasources/fixtures - /endpoints - /pipes - /tests - /scripts - /scripts/exec_test.sh - /scripts/append_fixtures.sh - /deploy If you have an existing project created in the UI, use the `tb pull --force` command to download all resources from your Workspace, creating the same structure as above. The purpose of these files and folders is as follows: - `datasources` : Where you put your .datasource files. - `datasources/fixtures` : Place any CSV or NDJSON files that should be pushed when using the default `./scripts/append_fixtures.sh` script. They need to share a name with a .datasource file. - `endpoints` : You can use this folder to create a logical separation between non-Endpoint Pipes and Endpoint Pipes, though it is not necessary. By default, all .pipe files will be placed in the `pipes/` directory when pulled from Tinybird. - `pipes` : Where you put your .pipe files. - `tests` : Where you put [data quality and fixture tests](https://www.tinybird.co/docs/docs/production/implementing-test-strategies). - `scripts` : Useful scripts for common operations like data migrations, fixture tests, etc. - `deploy` : Custom deployment shell scripts. ## Connect your Workspace to Git from the CLI¶ To connect your Workspace to Git, you will need a Tinybird [Workspace](https://www.tinybird.co/docs/docs/concepts/workspaces) and a Git repository. You can either connect a pre-existing repository, or create a new one as part of this process. If you do not already have the CLI installed, follow the instructions [here](https://www.tinybird.co/docs/docs/cli/install). To initialize your Workspace with Git, run the `tb init --git` command.
It performs the following actions: - Checks there are no differences between your local files and your Tinybird Workspace. - Saves a reference to the current Git repository commit in the Workspace. This commit reference is used later on to diff Workspace resources and resources in a branch, to ease deployment. ##### Initialize Tinybird with a Git repository tb init --git ** - /datasources already exists, skipping ** - /datasources/fixtures already exists, skipping ** - /endpoints already exists, skipping ** - /pipes already exists, skipping ** - /tests already exists, skipping ** - /scripts already exists, skipping ** - /deploy already exists, skipping ** - '.tinyenv' already exists, skipping ** Checking diffs between remote Workspace and local. Hint: use 'tb diff' to check if your data project and Workspace synced Pulling datasources [####################################] 100% Pulling pipes [####################################] 100% Pulling tokens [####################################] 100% ** No diffs detected for 'workspace' Once complete, create CI/CD actions to integrate Tinybird with your Git provider. [These actions](https://www.tinybird.co/docs/docs/production/continuous-integration) should be based on your development pipeline, and the CLI commands Tinybird provides offer an excellent basis upon which to validate changes and deploy to Tinybird safely from Git. Add the `.tinyb` file to your `.gitignore` to avoid pushing the Tinybird configuration files to your Git provider. ##### Pushing the CI/CD actions to your Git provider echo ".tinyb" >> .gitignore git add . git commit -m "Add Tinybird CI/CD actions" git push You must save your Workspace admin [Token](https://www.tinybird.co/docs/docs/concepts/auth-tokens) as a secret in your repository. For example, with GitHub, go to your repository settings, and under Secrets and Variables / Actions, add the Token value in a secret called `TB_ADMIN_TOKEN`. You can make your shell PROMPT print your current Workspace by following this [CLI guide](https://www.tinybird.co/docs/docs/cli/install#configure-your-shell-prompt). ## Protecting the main Workspace¶ Once you decide to use this version-controlled workflow, the Git repository becomes your single source of truth. You'll want to keep production protected so users can't modify resources and break the Git workflow. If you want to prevent users from making changes to the Workspace from the CLI or API, open the Tinybird UI and navigate to the Workspace settings menu, then the **Members** tab, and assign the **Viewer** role. Users with a Viewer role aren't able to create, edit, or delete resources or run data operations. They are allowed to create a new Branch and change resources there. ## Development workflow¶ This section explains how to safely develop new features using [Branches](https://www.tinybird.co/docs/docs/concepts/branches). ### Develop using the CLI¶ When prototyping a new API Endpoint or changing its logic, we recommend you use the UI. It is the easiest and fastest way to iterate and validate your changes, and you can see the results of your changes in real time. You can use a branch and then `tb pull` your changes to push them to Git (a minimal sketch of this flow follows below). But when making changes like data migrations or column type changes, you need to use the CLI and modify the datafiles. Make sure you're familiar with [the Tinybird CLI docs](https://www.tinybird.co/docs/docs/cli/quick-start).
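For instance, a rough sketch of that prototype-then-commit flow (the Git branch name and commit message here are hypothetical, and this assumes your CLI session is already authenticated with `tb auth` against the Workspace or Branch you prototyped in):

```bash
# Pull the datafiles for the changes you prototyped in the UI
tb pull --force

# Commit them to a Git feature branch and open a Pull Request to run CI
git checkout -b update-endpoint-logic
git add .
git commit -m "Update Endpoint logic prototyped in the UI"
git push -u origin update-endpoint-logic
```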
For changes like these, use the Git workflow: Create a new branch, make the changes in the datafiles, and create a Pull Request to validate the changes. For further guidance, read the [CI/CD docs](https://www.tinybird.co/docs/docs/production/continuous-integration). This image visualizes each process step when setting up and working with Git via the Tinybird CLI: <-figure-> ![image](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fconnect-to-git.png&w=3840&q=75) ## Exploration workflow¶ ### The Playground¶ Once you have protected production from user modifications, you might not be able to create Pipes directly. To explore the data, you need to either create a new Branch, or use [the Playground](https://www.tinybird.co/docs/docs/query/overview#use-the-playground). Once you've prototyped your new Pipe, download the Pipe from the UI and commit the change to a new Git branch to follow the CI workflow. By default, Playground content is private to your Workspace view. However, you have the option to share your Playground with other Workspace members. ## Troubleshooting¶ This section covers some of the problems you might face when working with version control. ### Connect to more than one Workspace¶ You can have one Git repo connected to two Tinybird Workspaces; a common scenario is using this for [staging and production Workspaces](https://www.tinybird.co/docs/production/staging-and-production-workspaces). Just use a different `ADMIN_TOKEN` in the GitHub Actions. ### Initialization¶ When your Git repository and Tinybird Workspace have the same resources, `tb init --git` allows you to easily start up a CI/CD pipeline. However, when these are not in sync, you might experience one of these three scenarios: **Problem: There are resources in the Workspace that are not present in your Git repository.** Remove them from your Workspace (they are probably unnecessary) or run `tb pull --match ""` to download the resource(s) locally and push them to the Git repository before continuing. **Problem: There are resources in your Git repository that are not present in your Workspace.** Either remove them from the project, or `tb push` them to the Workspace and re-run the `init` command. **Problem: There are differences between the resources in the Git repository and the ones in your Workspace.** In this instance, you must decide which version is the source of truth. If it is the resources in the Workspace, run `tb pull --force` to overwrite your local files. If it is your local files, run `tb push --force` file by file so the Workspace is synchronized with your project in Git. Use `tb diff` to check if your project is synced. It diffs local [datafiles](https://www.tinybird.co/docs/docs/cli/datafiles/overview) to the corresponding resources in the Workspace. ### Git and Workspace no longer synced¶ When you deploy, the Git commit ID of that deployment is stored in the Workspace. This means you can use the `git diff` command to compare your Branch against the main one, and know which resources have been modified and need to be deployed. If you introduce a manual change in your Workspace or Git outside of the CI/CD workflow, the two will no longer be synced. To get both components back in sync with each other, use the command `tb init --override-commit ` . This command will override the Git commit ID from the Workspace. ## Common use cases¶ Version control allows you to incrementally change or iterate your data project.
It's ideal for managing changes like adding a column, changing data types, redefining whole views, and lots more. If you're new to using version control or want to be sure how to do it on Tinybird, there's an entire repository of use cases available: [tinybirdco/use-case-examples](https://github.com/tinybirdco/use-case-examples). ## Next steps¶ - Read about[ datafiles](https://www.tinybird.co/docs/docs/cli/datafiles/overview) , the text files that describe your Tinybird resources (Data Sources, Pipes). - Understand[ CI/CD processes on Tinybird](https://www.tinybird.co/docs/docs/production/continuous-integration) . --- URL: https://www.tinybird.co/docs/publish/api-endpoints/list-of-errors Last update: 2024-11-05T10:29:52.000Z Content: --- title: "List of API Endpoint database errors · Tinybird Docs" theme-color: "#171612" description: "The following list contains all internal database errors that an API Endpoint might return, and their numbers." --- # List of internal database errors¶ API Endpoint responses have an additional HTTP header, `X-DB-Exception-Code` , where you can check the internal database error, reported as a stringified number. The following list contains all internal database errors and their numbers: - `UNSUPPORTED_METHOD = "1"` - `UNSUPPORTED_PARAMETER = "2"` - `UNEXPECTED_END_OF_FILE = "3"` - `EXPECTED_END_OF_FILE = "4"` - `CANNOT_PARSE_TEXT = "6"` - `INCORRECT_NUMBER_OF_COLUMNS = "7"` - `THERE_IS_NO_COLUMN = "8"` - `SIZES_OF_COLUMNS_DOESNT_MATCH = "9"` - `NOT_FOUND_COLUMN_IN_BLOCK = "10"` - `POSITION_OUT_OF_BOUND = "11"` - `PARAMETER_OUT_OF_BOUND = "12"` - `SIZES_OF_COLUMNS_IN_TUPLE_DOESNT_MATCH = "13"` - `DUPLICATE_COLUMN = "15"` - `NO_SUCH_COLUMN_IN_TABLE = "16"` - `DELIMITER_IN_STRING_LITERAL_DOESNT_MATCH = "17"` - `CANNOT_INSERT_ELEMENT_INTO_CONSTANT_COLUMN = "18"` - `SIZE_OF_FIXED_STRING_DOESNT_MATCH = "19"` - `NUMBER_OF_COLUMNS_DOESNT_MATCH = "20"` - `CANNOT_READ_ALL_DATA_FROM_TAB_SEPARATED_INPUT = "21"` - `CANNOT_PARSE_ALL_VALUE_FROM_TAB_SEPARATED_INPUT = "22"` - `CANNOT_READ_FROM_ISTREAM = "23"` - `CANNOT_WRITE_TO_OSTREAM = "24"` - `CANNOT_PARSE_ESCAPE_SEQUENCE = "25"` - `CANNOT_PARSE_QUOTED_STRING = "26"` - `CANNOT_PARSE_INPUT_ASSERTION_FAILED = "27"` - `CANNOT_PRINT_FLOAT_OR_DOUBLE_NUMBER = "28"` - `CANNOT_PRINT_INTEGER = "29"` - `CANNOT_READ_SIZE_OF_COMPRESSED_CHUNK = "30"` - `CANNOT_READ_COMPRESSED_CHUNK = "31"` - `ATTEMPT_TO_READ_AFTER_EOF = "32"` - `CANNOT_READ_ALL_DATA = "33"` - `TOO_MANY_ARGUMENTS_FOR_FUNCTION = "34"` - `TOO_FEW_ARGUMENTS_FOR_FUNCTION = "35"` - `BAD_ARGUMENTS = "36"` - `UNKNOWN_ELEMENT_IN_AST = "37"` - `CANNOT_PARSE_DATE = "38"` - `TOO_LARGE_SIZE_COMPRESSED = "39"` - `CHECKSUM_DOESNT_MATCH = "40"` - `CANNOT_PARSE_DATETIME = "41"` - `NUMBER_OF_ARGUMENTS_DOESNT_MATCH = "42"` - `ILLEGAL_TYPE_OF_ARGUMENT = "43"` - `ILLEGAL_COLUMN = "44"` - `ILLEGAL_NUMBER_OF_RESULT_COLUMNS = "45"` - `UNKNOWN_FUNCTION = "46"` - `UNKNOWN_IDENTIFIER = "47"` - `NOT_IMPLEMENTED = "48"` - `LOGICAL_ERROR = "49"` - `UNKNOWN_TYPE = "50"` - `EMPTY_LIST_OF_COLUMNS_QUERIED = "51"` - `COLUMN_QUERIED_MORE_THAN_ONCE = "52"` - `TYPE_MISMATCH = "53"` - `STORAGE_DOESNT_ALLOW_PARAMETERS = "54"` - `STORAGE_REQUIRES_PARAMETER = "55"` - `UNKNOWN_STORAGE = "56"` - `TABLE_ALREADY_EXISTS = "57"` - `TABLE_METADATA_ALREADY_EXISTS = "58"` - `ILLEGAL_TYPE_OF_COLUMN_FOR_FILTER = "59"` - `UNKNOWN_TABLE = "60"` - `ONLY_FILTER_COLUMN_IN_BLOCK = "61"` - `SYNTAX_ERROR = "62"` - `UNKNOWN_AGGREGATE_FUNCTION = "63"` - `CANNOT_READ_AGGREGATE_FUNCTION_FROM_TEXT = "64"` - 
`CANNOT_WRITE_AGGREGATE_FUNCTION_AS_TEXT = "65"` - `NOT_A_COLUMN = "66"` - `ILLEGAL_KEY_OF_AGGREGATION = "67"` - `CANNOT_GET_SIZE_OF_FIELD = "68"` - `ARGUMENT_OUT_OF_BOUND = "69"` - `CANNOT_CONVERT_TYPE = "70"` - `CANNOT_WRITE_AFTER_END_OF_BUFFER = "71"` - `CANNOT_PARSE_NUMBER = "72"` - `UNKNOWN_FORMAT = "73"` - `CANNOT_READ_FROM_FILE_DESCRIPTOR = "74"` - `CANNOT_WRITE_TO_FILE_DESCRIPTOR = "75"` - `CANNOT_OPEN_FILE = "76"` - `CANNOT_CLOSE_FILE = "77"` - `UNKNOWN_TYPE_OF_QUERY = "78"` - `INCORRECT_FILE_NAME = "79"` - `INCORRECT_QUERY = "80"` - `UNKNOWN_DATABASE = "81"` - `DATABASE_ALREADY_EXISTS = "82"` - `DIRECTORY_DOESNT_EXIST = "83"` - `DIRECTORY_ALREADY_EXISTS = "84"` - `FORMAT_IS_NOT_SUITABLE_FOR_INPUT = "85"` - `RECEIVED_ERROR_FROM_REMOTE_IO_SERVER = "86"` - `CANNOT_SEEK_THROUGH_FILE = "87"` - `CANNOT_TRUNCATE_FILE = "88"` - `UNKNOWN_COMPRESSION_METHOD = "89"` - `EMPTY_LIST_OF_COLUMNS_PASSED = "90"` - `SIZES_OF_MARKS_FILES_ARE_INCONSISTENT = "91"` - `EMPTY_DATA_PASSED = "92"` - `UNKNOWN_AGGREGATED_DATA_VARIANT = "93"` - `CANNOT_MERGE_DIFFERENT_AGGREGATED_DATA_VARIANTS = "94"` - `CANNOT_READ_FROM_SOCKET = "95"` - `CANNOT_WRITE_TO_SOCKET = "96"` - `CANNOT_READ_ALL_DATA_FROM_CHUNKED_INPUT = "97"` - `CANNOT_WRITE_TO_EMPTY_BLOCK_OUTPUT_STREAM = "98"` - `UNKNOWN_PACKET_FROM_CLIENT = "99"` - `UNKNOWN_PACKET_FROM_SERVER = "100"` - `UNEXPECTED_PACKET_FROM_CLIENT = "101"` - `UNEXPECTED_PACKET_FROM_SERVER = "102"` - `RECEIVED_DATA_FOR_WRONG_QUERY_ID = "103"` - `TOO_SMALL_BUFFER_SIZE = "104"` - `CANNOT_READ_HISTORY = "105"` - `CANNOT_APPEND_HISTORY = "106"` - `FILE_DOESNT_EXIST = "107"` - `NO_DATA_TO_INSERT = "108"` - `CANNOT_BLOCK_SIGNAL = "109"` - `CANNOT_UNBLOCK_SIGNAL = "110"` - `CANNOT_MANIPULATE_SIGSET = "111"` - `CANNOT_WAIT_FOR_SIGNAL = "112"` - `THERE_IS_NO_SESSION = "113"` - `CANNOT_CLOCK_GETTIME = "114"` - `UNKNOWN_SETTING = "115"` - `THERE_IS_NO_DEFAULT_VALUE = "116"` - `INCORRECT_DATA = "117"` - `ENGINE_REQUIRED = "119"` - `CANNOT_INSERT_VALUE_OF_DIFFERENT_SIZE_INTO_TUPLE = "120"` - `UNSUPPORTED_JOIN_KEYS = "121"` - `INCOMPATIBLE_COLUMNS = "122"` - `UNKNOWN_TYPE_OF_AST_NODE = "123"` - `INCORRECT_ELEMENT_OF_SET = "124"` - `INCORRECT_RESULT_OF_SCALAR_SUBQUERY = "125"` - `CANNOT_GET_RETURN_TYPE = "126"` - `ILLEGAL_INDEX = "127"` - `TOO_LARGE_ARRAY_SIZE = "128"` - `FUNCTION_IS_SPECIAL = "129"` - `CANNOT_READ_ARRAY_FROM_TEXT = "130"` - `TOO_LARGE_STRING_SIZE = "131"` - `AGGREGATE_FUNCTION_DOESNT_ALLOW_PARAMETERS = "133"` - `PARAMETERS_TO_AGGREGATE_FUNCTIONS_MUST_BE_LITERALS = "134"` - `ZERO_ARRAY_OR_TUPLE_INDEX = "135"` - `UNKNOWN_ELEMENT_IN_CONFIG = "137"` - `EXCESSIVE_ELEMENT_IN_CONFIG = "138"` - `NO_ELEMENTS_IN_CONFIG = "139"` - `ALL_REQUESTED_COLUMNS_ARE_MISSING = "140"` - `SAMPLING_NOT_SUPPORTED = "141"` - `NOT_FOUND_NODE = "142"` - `FOUND_MORE_THAN_ONE_NODE = "143"` - `FIRST_DATE_IS_BIGGER_THAN_LAST_DATE = "144"` - `UNKNOWN_OVERFLOW_MODE = "145"` - `QUERY_SECTION_DOESNT_MAKE_SENSE = "146"` - `NOT_FOUND_FUNCTION_ELEMENT_FOR_AGGREGATE = "147"` - `NOT_FOUND_RELATION_ELEMENT_FOR_CONDITION = "148"` - `NOT_FOUND_RHS_ELEMENT_FOR_CONDITION = "149"` - `EMPTY_LIST_OF_ATTRIBUTES_PASSED = "150"` - `INDEX_OF_COLUMN_IN_SORT_CLAUSE_IS_OUT_OF_RANGE = "151"` - `UNKNOWN_DIRECTION_OF_SORTING = "152"` - `ILLEGAL_DIVISION = "153"` - `AGGREGATE_FUNCTION_NOT_APPLICABLE = "154"` - `UNKNOWN_RELATION = "155"` - `DICTIONARIES_WAS_NOT_LOADED = "156"` - `ILLEGAL_OVERFLOW_MODE = "157"` - `TOO_MANY_ROWS = "158"` - `TIMEOUT_EXCEEDED = "159"` - `TOO_SLOW = "160"` - `TOO_MANY_COLUMNS = "161"` - `TOO_DEEP_SUBQUERIES 
= "162"` - `TOO_DEEP_PIPELINE = "163"` - `READONLY = "164"` - `TOO_MANY_TEMPORARY_COLUMNS = "165"` - `TOO_MANY_TEMPORARY_NON_CONST_COLUMNS = "166"` - `TOO_DEEP_AST = "167"` - `TOO_BIG_AST = "168"` - `BAD_TYPE_OF_FIELD = "169"` - `BAD_GET = "170"` - `CANNOT_CREATE_DIRECTORY = "172"` - `CANNOT_ALLOCATE_MEMORY = "173"` - `CYCLIC_ALIASES = "174"` - `CHUNK_NOT_FOUND = "176"` - `DUPLICATE_CHUNK_NAME = "177"` - `MULTIPLE_ALIASES_FOR_EXPRESSION = "178"` - `MULTIPLE_EXPRESSIONS_FOR_ALIAS = "179"` - `THERE_IS_NO_PROFILE = "180"` - `ILLEGAL_FINAL = "181"` - `ILLEGAL_PREWHERE = "182"` - `UNEXPECTED_EXPRESSION = "183"` - `ILLEGAL_AGGREGATION = "184"` - `UNSUPPORTED_MYISAM_BLOCK_TYPE = "185"` - `UNSUPPORTED_COLLATION_LOCALE = "186"` - `COLLATION_COMPARISON_FAILED = "187"` - `UNKNOWN_ACTION = "188"` - `TABLE_MUST_NOT_BE_CREATED_MANUALLY = "189"` - `SIZES_OF_ARRAYS_DOESNT_MATCH = "190"` - `SET_SIZE_LIMIT_EXCEEDED = "191"` - `UNKNOWN_USER = "192"` - `WRONG_PASSWORD = "193"` - `REQUIRED_PASSWORD = "194"` - `IP_ADDRESS_NOT_ALLOWED = "195"` - `UNKNOWN_ADDRESS_PATTERN_TYPE = "196"` - `SERVER_REVISION_IS_TOO_OLD = "197"` - `DNS_ERROR = "198"` - `UNKNOWN_QUOTA = "199"` - `QUOTA_DOESNT_ALLOW_KEYS = "200"` - `QUOTA_EXCEEDED = "201"` - `TOO_MANY_SIMULTANEOUS_QUERIES = "202"` - `NO_FREE_CONNECTION = "203"` - `CANNOT_FSYNC = "204"` - `NESTED_TYPE_TOO_DEEP = "205"` - `ALIAS_REQUIRED = "206"` - `AMBIGUOUS_IDENTIFIER = "207"` - `EMPTY_NESTED_TABLE = "208"` - `SOCKET_TIMEOUT = "209"` - `NETWORK_ERROR = "210"` - `EMPTY_QUERY = "211"` - `UNKNOWN_LOAD_BALANCING = "212"` - `UNKNOWN_TOTALS_MODE = "213"` - `CANNOT_STATVFS = "214"` - `NOT_AN_AGGREGATE = "215"` - `QUERY_WITH_SAME_ID_IS_ALREADY_RUNNING = "216"` - `CLIENT_HAS_CONNECTED_TO_WRONG_PORT = "217"` - `TABLE_IS_DROPPED = "218"` - `DATABASE_NOT_EMPTY = "219"` - `DUPLICATE_INTERSERVER_IO_ENDPOINT = "220"` - `NO_SUCH_INTERSERVER_IO_ENDPOINT = "221"` - `ADDING_REPLICA_TO_NON_EMPTY_TABLE = "222"` - `UNEXPECTED_AST_STRUCTURE = "223"` - `REPLICA_IS_ALREADY_ACTIVE = "224"` - `NO_ZOOKEEPER = "225"` - `NO_FILE_IN_DATA_PART = "226"` - `UNEXPECTED_FILE_IN_DATA_PART = "227"` - `BAD_SIZE_OF_FILE_IN_DATA_PART = "228"` - `QUERY_IS_TOO_LARGE = "229"` - `NOT_FOUND_EXPECTED_DATA_PART = "230"` - `TOO_MANY_UNEXPECTED_DATA_PARTS = "231"` - `NO_SUCH_DATA_PART = "232"` - `BAD_DATA_PART_NAME = "233"` - `NO_REPLICA_HAS_PART = "234"` - `DUPLICATE_DATA_PART = "235"` - `ABORTED = "236"` - `NO_REPLICA_NAME_GIVEN = "237"` - `FORMAT_VERSION_TOO_OLD = "238"` - `CANNOT_MUNMAP = "239"` - `CANNOT_MREMAP = "240"` - `MEMORY_LIMIT_EXCEEDED = "241"` - `TABLE_IS_READ_ONLY = "242"` - `NOT_ENOUGH_SPACE = "243"` - `UNEXPECTED_ZOOKEEPER_ERROR = "244"` - `CORRUPTED_DATA = "246"` - `INCORRECT_MARK = "247"` - `INVALID_PARTITION_VALUE = "248"` - `NOT_ENOUGH_BLOCK_NUMBERS = "250"` - `NO_SUCH_REPLICA = "251"` - `TOO_MANY_PARTS = "252"` - `REPLICA_IS_ALREADY_EXIST = "253"` - `NO_ACTIVE_REPLICAS = "254"` - `TOO_MANY_RETRIES_TO_FETCH_PARTS = "255"` - `PARTITION_ALREADY_EXISTS = "256"` - `PARTITION_DOESNT_EXIST = "257"` - `UNION_ALL_RESULT_STRUCTURES_MISMATCH = "258"` - `CLIENT_OUTPUT_FORMAT_SPECIFIED = "260"` - `UNKNOWN_BLOCK_INFO_FIELD = "261"` - `BAD_COLLATION = "262"` - `CANNOT_COMPILE_CODE = "263"` - `INCOMPATIBLE_TYPE_OF_JOIN = "264"` - `NO_AVAILABLE_REPLICA = "265"` - `MISMATCH_REPLICAS_DATA_SOURCES = "266"` - `STORAGE_DOESNT_SUPPORT_PARALLEL_REPLICAS = "267"` - `CPUID_ERROR = "268"` - `INFINITE_LOOP = "269"` - `CANNOT_COMPRESS = "270"` - `CANNOT_DECOMPRESS = "271"` - `CANNOT_IO_SUBMIT = "272"` - `CANNOT_IO_GETEVENTS = 
"273"` - `AIO_READ_ERROR = "274"` - `AIO_WRITE_ERROR = "275"` - `INDEX_NOT_USED = "277"` - `ALL_CONNECTION_TRIES_FAILED = "279"` - `NO_AVAILABLE_DATA = "280"` - `DICTIONARY_IS_EMPTY = "281"` - `INCORRECT_INDEX = "282"` - `UNKNOWN_DISTRIBUTED_PRODUCT_MODE = "283"` - `WRONG_GLOBAL_SUBQUERY = "284"` - `TOO_FEW_LIVE_REPLICAS = "285"` - `UNSATISFIED_QUORUM_FOR_PREVIOUS_WRITE = "286"` - `UNKNOWN_FORMAT_VERSION = "287"` - `DISTRIBUTED_IN_JOIN_SUBQUERY_DENIED = "288"` - `REPLICA_IS_NOT_IN_QUORUM = "289"` - `LIMIT_EXCEEDED = "290"` - `DATABASE_ACCESS_DENIED = "291"` - `MONGODB_CANNOT_AUTHENTICATE = "293"` - `INVALID_BLOCK_EXTRA_INFO = "294"` - `RECEIVED_EMPTY_DATA = "295"` - `NO_REMOTE_SHARD_FOUND = "296"` - `SHARD_HAS_NO_CONNECTIONS = "297"` - `CANNOT_PIPE = "298"` - `CANNOT_FORK = "299"` - `CANNOT_DLSYM = "300"` - `CANNOT_CREATE_CHILD_PROCESS = "301"` - `CHILD_WAS_NOT_EXITED_NORMALLY = "302"` - `CANNOT_SELECT = "303"` - `CANNOT_WAITPID = "304"` - `TABLE_WAS_NOT_DROPPED = "305"` - `TOO_DEEP_RECURSION = "306"` - `TOO_MANY_BYTES = "307"` - `UNEXPECTED_NODE_IN_ZOOKEEPER = "308"` - `FUNCTION_CANNOT_HAVE_PARAMETERS = "309"` - `INVALID_SHARD_WEIGHT = "317"` - `INVALID_CONFIG_PARAMETER = "318"` - `UNKNOWN_STATUS_OF_INSERT = "319"` - `VALUE_IS_OUT_OF_RANGE_OF_DATA_TYPE = "321"` - `BARRIER_TIMEOUT = "335"` - `UNKNOWN_DATABASE_ENGINE = "336"` - `DDL_GUARD_IS_ACTIVE = "337"` - `UNFINISHED = "341"` - `METADATA_MISMATCH = "342"` - `SUPPORT_IS_DISABLED = "344"` - `TABLE_DIFFERS_TOO_MUCH = "345"` - `CANNOT_CONVERT_CHARSET = "346"` - `CANNOT_LOAD_CONFIG = "347"` - `CANNOT_INSERT_NULL_IN_ORDINARY_COLUMN = "349"` - `INCOMPATIBLE_SOURCE_TABLES = "350"` - `AMBIGUOUS_TABLE_NAME = "351"` - `AMBIGUOUS_COLUMN_NAME = "352"` - `INDEX_OF_POSITIONAL_ARGUMENT_IS_OUT_OF_RANGE = "353"` - `ZLIB_INFLATE_FAILED = "354"` - `ZLIB_DEFLATE_FAILED = "355"` - `BAD_LAMBDA = "356"` - `RESERVED_IDENTIFIER_NAME = "357"` - `INTO_OUTFILE_NOT_ALLOWED = "358"` - `TABLE_SIZE_EXCEEDS_MAX_DROP_SIZE_LIMIT = "359"` - `CANNOT_CREATE_CHARSET_CONVERTER = "360"` - `SEEK_POSITION_OUT_OF_BOUND = "361"` - `CURRENT_WRITE_BUFFER_IS_EXHAUSTED = "362"` - `CANNOT_CREATE_IO_BUFFER = "363"` - `RECEIVED_ERROR_TOO_MANY_REQUESTS = "364"` - `SIZES_OF_NESTED_COLUMNS_ARE_INCONSISTENT = "366"` - `TOO_MANY_FETCHES = "367"` - `ALL_REPLICAS_ARE_STALE = "369"` - `DATA_TYPE_CANNOT_BE_USED_IN_TABLES = "370"` - `INCONSISTENT_CLUSTER_DEFINITION = "371"` - `SESSION_NOT_FOUND = "372"` - `SESSION_IS_LOCKED = "373"` - `INVALID_SESSION_TIMEOUT = "374"` - `CANNOT_DLOPEN = "375"` - `CANNOT_PARSE_UUID = "376"` - `ILLEGAL_SYNTAX_FOR_DATA_TYPE = "377"` - `DATA_TYPE_CANNOT_HAVE_ARGUMENTS = "378"` - `UNKNOWN_STATUS_OF_DISTRIBUTED_DDL_TASK = "379"` - `CANNOT_KILL = "380"` - `HTTP_LENGTH_REQUIRED = "381"` - `CANNOT_LOAD_CATBOOST_MODEL = "382"` - `CANNOT_APPLY_CATBOOST_MODEL = "383"` - `PART_IS_TEMPORARILY_LOCKED = "384"` - `MULTIPLE_STREAMS_REQUIRED = "385"` - `NO_COMMON_TYPE = "386"` - `DICTIONARY_ALREADY_EXISTS = "387"` - `CANNOT_ASSIGN_OPTIMIZE = "388"` - `INSERT_WAS_DEDUPLICATED = "389"` - `CANNOT_GET_CREATE_TABLE_QUERY = "390"` - `EXTERNAL_LIBRARY_ERROR = "391"` - `QUERY_IS_PROHIBITED = "392"` - `THERE_IS_NO_QUERY = "393"` - `QUERY_WAS_CANCELLED = "394"` - `FUNCTION_THROW_IF_VALUE_IS_NON_ZERO = "395"` - `TOO_MANY_ROWS_OR_BYTES = "396"` - `QUERY_IS_NOT_SUPPORTED_IN_MATERIALIZED_VIEW = "397"` - `UNKNOWN_MUTATION_COMMAND = "398"` - `FORMAT_IS_NOT_SUITABLE_FOR_OUTPUT = "399"` - `CANNOT_STAT = "400"` - `FEATURE_IS_NOT_ENABLED_AT_BUILD_TIME = "401"` - `CANNOT_IOSETUP = "402"` - 
`INVALID_JOIN_ON_EXPRESSION = "403"` - `BAD_ODBC_CONNECTION_STRING = "404"` - `PARTITION_SIZE_EXCEEDS_MAX_DROP_SIZE_LIMIT = "405"` - `TOP_AND_LIMIT_TOGETHER = "406"` - `DECIMAL_OVERFLOW = "407"` - `BAD_REQUEST_PARAMETER = "408"` - `EXTERNAL_EXECUTABLE_NOT_FOUND = "409"` - `EXTERNAL_SERVER_IS_NOT_RESPONDING = "410"` - `PTHREAD_ERROR = "411"` - `NETLINK_ERROR = "412"` - `CANNOT_SET_SIGNAL_HANDLER = "413"` - `ALL_REPLICAS_LOST = "415"` - `REPLICA_STATUS_CHANGED = "416"` - `EXPECTED_ALL_OR_ANY = "417"` - `UNKNOWN_JOIN = "418"` - `MULTIPLE_ASSIGNMENTS_TO_COLUMN = "419"` - `CANNOT_UPDATE_COLUMN = "420"` - `CANNOT_ADD_DIFFERENT_AGGREGATE_STATES = "421"` - `UNSUPPORTED_URI_SCHEME = "422"` - `CANNOT_GETTIMEOFDAY = "423"` - `CANNOT_LINK = "424"` - `SYSTEM_ERROR = "425"` - `CANNOT_COMPILE_REGEXP = "427"` - `UNKNOWN_LOG_LEVEL = "428"` - `FAILED_TO_GETPWUID = "429"` - `MISMATCHING_USERS_FOR_PROCESS_AND_DATA = "430"` - `ILLEGAL_SYNTAX_FOR_CODEC_TYPE = "431"` - `UNKNOWN_CODEC = "432"` - `ILLEGAL_CODEC_PARAMETER = "433"` - `CANNOT_PARSE_PROTOBUF_SCHEMA = "434"` - `NO_COLUMN_SERIALIZED_TO_REQUIRED_PROTOBUF_FIELD = "435"` - `PROTOBUF_BAD_CAST = "436"` - `PROTOBUF_FIELD_NOT_REPEATED = "437"` - `DATA_TYPE_CANNOT_BE_PROMOTED = "438"` - `CANNOT_SCHEDULE_TASK = "439"` - `INVALID_LIMIT_EXPRESSION = "440"` - `CANNOT_PARSE_DOMAIN_VALUE_FROM_STRING = "441"` - `BAD_DATABASE_FOR_TEMPORARY_TABLE = "442"` - `NO_COLUMNS_SERIALIZED_TO_PROTOBUF_FIELDS = "443"` - `UNKNOWN_PROTOBUF_FORMAT = "444"` - `CANNOT_MPROTECT = "445"` - `FUNCTION_NOT_ALLOWED = "446"` - `HYPERSCAN_CANNOT_SCAN_TEXT = "447"` - `BROTLI_READ_FAILED = "448"` - `BROTLI_WRITE_FAILED = "449"` - `BAD_TTL_EXPRESSION = "450"` - `BAD_TTL_FILE = "451"` - `SETTING_CONSTRAINT_VIOLATION = "452"` - `MYSQL_CLIENT_INSUFFICIENT_CAPABILITIES = "453"` - `OPENSSL_ERROR = "454"` - `SUSPICIOUS_TYPE_FOR_LOW_CARDINALITY = "455"` - `UNKNOWN_QUERY_PARAMETER = "456"` - `BAD_QUERY_PARAMETER = "457"` - `CANNOT_UNLINK = "458"` - `CANNOT_SET_THREAD_PRIORITY = "459"` - `CANNOT_CREATE_TIMER = "460"` - `CANNOT_SET_TIMER_PERIOD = "461"` - `CANNOT_DELETE_TIMER = "462"` - `CANNOT_FCNTL = "463"` - `CANNOT_PARSE_ELF = "464"` - `CANNOT_PARSE_DWARF = "465"` - `INSECURE_PATH = "466"` - `CANNOT_PARSE_BOOL = "467"` - `CANNOT_PTHREAD_ATTR = "468"` - `VIOLATED_CONSTRAINT = "469"` - `QUERY_IS_NOT_SUPPORTED_IN_LIVE_VIEW = "470"` - `INVALID_SETTING_VALUE = "471"` - `READONLY_SETTING = "472"` - `DEADLOCK_AVOIDED = "473"` - `INVALID_TEMPLATE_FORMAT = "474"` - `INVALID_WITH_FILL_EXPRESSION = "475"` - `WITH_TIES_WITHOUT_ORDER_BY = "476"` - `INVALID_USAGE_OF_INPUT = "477"` - `UNKNOWN_POLICY = "478"` - `UNKNOWN_DISK = "479"` - `UNKNOWN_PROTOCOL = "480"` - `PATH_ACCESS_DENIED = "481"` - `DICTIONARY_ACCESS_DENIED = "482"` - `TOO_MANY_REDIRECTS = "483"` - `INTERNAL_REDIS_ERROR = "484"` - `SCALAR_ALREADY_EXISTS = "485"` - `CANNOT_GET_CREATE_DICTIONARY_QUERY = "487"` - `UNKNOWN_DICTIONARY = "488"` - `INCORRECT_DICTIONARY_DEFINITION = "489"` - `CANNOT_FORMAT_DATETIME = "490"` - `UNACCEPTABLE_URL = "491"` - `ACCESS_ENTITY_NOT_FOUND = "492"` - `ACCESS_ENTITY_ALREADY_EXISTS = "493"` - `ACCESS_ENTITY_FOUND_DUPLICATES = "494"` - `ACCESS_STORAGE_READONLY = "495"` - `QUOTA_REQUIRES_CLIENT_KEY = "496"` - `ACCESS_DENIED = "497"` - `LIMIT_BY_WITH_TIES_IS_NOT_SUPPORTED = "498"` - S3_ERROR = "499" - `AZURE_BLOB_STORAGE_ERROR = "500"` - `CANNOT_CREATE_DATABASE = "501"` - `CANNOT_SIGQUEUE = "502"` - `AGGREGATE_FUNCTION_THROW = "503"` - `FILE_ALREADY_EXISTS = "504"` - `CANNOT_DELETE_DIRECTORY = "505"` - `UNEXPECTED_ERROR_CODE = 
"506"` - `UNABLE_TO_SKIP_UNUSED_SHARDS = "507"` - `UNKNOWN_ACCESS_TYPE = "508"` - `INVALID_GRANT = "509"` - `CACHE_DICTIONARY_UPDATE_FAIL = "510"` - `UNKNOWN_ROLE = "511"` - `SET_NON_GRANTED_ROLE = "512"` - `UNKNOWN_PART_TYPE = "513"` - `ACCESS_STORAGE_FOR_INSERTION_NOT_FOUND = "514"` - `INCORRECT_ACCESS_ENTITY_DEFINITION = "515"` - `AUTHENTICATION_FAILED = "516"` - `CANNOT_ASSIGN_ALTER = "517"` - `CANNOT_COMMIT_OFFSET = "518"` - `NO_REMOTE_SHARD_AVAILABLE = "519"` - `CANNOT_DETACH_DICTIONARY_AS_TABLE = "520"` - `ATOMIC_RENAME_FAIL = "521"` - `UNKNOWN_ROW_POLICY = "523"` - `ALTER_OF_COLUMN_IS_FORBIDDEN = "524"` - `INCORRECT_DISK_INDEX = "525"` - `NO_SUITABLE_FUNCTION_IMPLEMENTATION = "527"` - `CASSANDRA_INTERNAL_ERROR = "528"` - `NOT_A_LEADER = "529"` - `CANNOT_CONNECT_RABBITMQ = "530"` - `CANNOT_FSTAT = "531"` - `LDAP_ERROR = "532"` - `INCONSISTENT_RESERVATIONS = "533"` - `NO_RESERVATIONS_PROVIDED = "534"` - `UNKNOWN_RAID_TYPE = "535"` - `CANNOT_RESTORE_FROM_FIELD_DUMP = "536"` - `ILLEGAL_MYSQL_VARIABLE = "537"` - `MYSQL_SYNTAX_ERROR = "538"` - `CANNOT_BIND_RABBITMQ_EXCHANGE = "539"` - `CANNOT_DECLARE_RABBITMQ_EXCHANGE = "540"` - `CANNOT_CREATE_RABBITMQ_QUEUE_BINDING = "541"` - `CANNOT_REMOVE_RABBITMQ_EXCHANGE = "542"` - `UNKNOWN_MYSQL_DATATYPES_SUPPORT_LEVEL = "543"` - `ROW_AND_ROWS_TOGETHER = "544"` - `FIRST_AND_NEXT_TOGETHER = "545"` - `NO_ROW_DELIMITER = "546"` - `INVALID_RAID_TYPE = "547"` - `UNKNOWN_VOLUME = "548"` - `DATA_TYPE_CANNOT_BE_USED_IN_KEY = "549"` - `CONDITIONAL_TREE_PARENT_NOT_FOUND = "550"` - `ILLEGAL_PROJECTION_MANIPULATOR = "551"` - `UNRECOGNIZED_ARGUMENTS = "552"` - `LZMA_STREAM_ENCODER_FAILED = "553"` - `LZMA_STREAM_DECODER_FAILED = "554"` - `ROCKSDB_ERROR = "555"` - `SYNC_MYSQL_USER_ACCESS_ERROR = "556"` - `UNKNOWN_UNION = "557"` - `EXPECTED_ALL_OR_DISTINCT = "558"` - `INVALID_GRPC_QUERY_INFO = "559"` - `ZSTD_ENCODER_FAILED = "560"` - `ZSTD_DECODER_FAILED = "561"` - `TLD_LIST_NOT_FOUND = "562"` - `CANNOT_READ_MAP_FROM_TEXT = "563"` - `INTERSERVER_SCHEME_DOESNT_MATCH = "564"` - `TOO_MANY_PARTITIONS = "565"` - `CANNOT_RMDIR = "566"` - `DUPLICATED_PART_UUIDS = "567"` - `RAFT_ERROR = "568"` - `MULTIPLE_COLUMNS_SERIALIZED_TO_SAME_PROTOBUF_FIELD = "569"` - `DATA_TYPE_INCOMPATIBLE_WITH_PROTOBUF_FIELD = "570"` - `DATABASE_REPLICATION_FAILED = "571"` - `TOO_MANY_QUERY_PLAN_OPTIMIZATIONS = "572"` - `EPOLL_ERROR = "573"` - `DISTRIBUTED_TOO_MANY_PENDING_BYTES = "574"` - `UNKNOWN_SNAPSHOT = "575"` - `KERBEROS_ERROR = "576"` - `INVALID_SHARD_ID = "577"` - `INVALID_FORMAT_INSERT_QUERY_WITH_DATA = "578"` - `INCORRECT_PART_TYPE = "579"` - `CANNOT_SET_ROUNDING_MODE = "580"` - `TOO_LARGE_DISTRIBUTED_DEPTH = "581"` - `NO_SUCH_PROJECTION_IN_TABLE = "582"` - `ILLEGAL_PROJECTION = "583"` - `PROJECTION_NOT_USED = "584"` - `CANNOT_PARSE_YAML = "585"` - `CANNOT_CREATE_FILE = "586"` - `CONCURRENT_ACCESS_NOT_SUPPORTED = "587"` - `DISTRIBUTED_BROKEN_BATCH_INFO = "588"` - `DISTRIBUTED_BROKEN_BATCH_FILES = "589"` - `CANNOT_SYSCONF = "590"` - `SQLITE_ENGINE_ERROR = "591"` - `DATA_ENCRYPTION_ERROR = "592"` - `ZERO_COPY_REPLICATION_ERROR = "593"` - BZIP2_STREAM_DECODER_FAILED = "594" - BZIP2_STREAM_ENCODER_FAILED = "595" - `INTERSECT_OR_EXCEPT_RESULT_STRUCTURES_MISMATCH = "596"` - `NO_SUCH_ERROR_CODE = "597"` - `BACKUP_ALREADY_EXISTS = "598"` - `BACKUP_NOT_FOUND = "599"` - `BACKUP_VERSION_NOT_SUPPORTED = "600"` - `BACKUP_DAMAGED = "601"` - `NO_BASE_BACKUP = "602"` - `WRONG_BASE_BACKUP = "603"` - `BACKUP_ENTRY_ALREADY_EXISTS = "604"` - `BACKUP_ENTRY_NOT_FOUND = "605"` - `BACKUP_IS_EMPTY = "606"` - 
`CANNOT_RESTORE_DATABASE = "607"` - `CANNOT_RESTORE_TABLE = "608"` - `FUNCTION_ALREADY_EXISTS = "609"` - `CANNOT_DROP_FUNCTION = "610"` - `CANNOT_CREATE_RECURSIVE_FUNCTION = "611"` - `OBJECT_ALREADY_STORED_ON_DISK = "612"` - `OBJECT_WAS_NOT_STORED_ON_DISK = "613"` - `POSTGRESQL_CONNECTION_FAILURE = "614"` - `CANNOT_ADVISE = "615"` - `UNKNOWN_READ_METHOD = "616"` - LZ4_ENCODER_FAILED = "617" - LZ4_DECODER_FAILED = "618" - `POSTGRESQL_REPLICATION_INTERNAL_ERROR = "619"` - `QUERY_NOT_ALLOWED = "620"` - `CANNOT_NORMALIZE_STRING = "621"` - `CANNOT_PARSE_CAPN_PROTO_SCHEMA = "622"` - `CAPN_PROTO_BAD_CAST = "623"` - `BAD_FILE_TYPE = "624"` - `IO_SETUP_ERROR = "625"` - `CANNOT_SKIP_UNKNOWN_FIELD = "626"` - `BACKUP_ENGINE_NOT_FOUND = "627"` - `OFFSET_FETCH_WITHOUT_ORDER_BY = "628"` - `HTTP_RANGE_NOT_SATISFIABLE = "629"` - `HAVE_DEPENDENT_OBJECTS = "630"` - `UNKNOWN_FILE_SIZE = "631"` - `UNEXPECTED_DATA_AFTER_PARSED_VALUE = "632"` - `QUERY_IS_NOT_SUPPORTED_IN_WINDOW_VIEW = "633"` - `MONGODB_ERROR = "634"` - `CANNOT_POLL = "635"` - `CANNOT_EXTRACT_TABLE_STRUCTURE = "636"` - `INVALID_TABLE_OVERRIDE = "637"` - `SNAPPY_UNCOMPRESS_FAILED = "638"` - `SNAPPY_COMPRESS_FAILED = "639"` - `NO_HIVEMETASTORE = "640"` - `CANNOT_APPEND_TO_FILE = "641"` - `CANNOT_PACK_ARCHIVE = "642"` - `CANNOT_UNPACK_ARCHIVE = "643"` - `REMOTE_FS_OBJECT_CACHE_ERROR = "644"` - `NUMBER_OF_DIMENSIONS_MISMATCHED = "645"` - `CANNOT_BACKUP_DATABASE = "646"` - `CANNOT_BACKUP_TABLE = "647"` - `WRONG_DDL_RENAMING_SETTINGS = "648"` - `INVALID_TRANSACTION = "649"` - `SERIALIZATION_ERROR = "650"` - `CAPN_PROTO_BAD_TYPE = "651"` - `ONLY_NULLS_WHILE_READING_SCHEMA = "652"` - `CANNOT_PARSE_BACKUP_SETTINGS = "653"` - `WRONG_BACKUP_SETTINGS = "654"` - `FAILED_TO_SYNC_BACKUP_OR_RESTORE = "655"` - `MEILISEARCH_EXCEPTION = "656"` - `UNSUPPORTED_MEILISEARCH_TYPE = "657"` - `MEILISEARCH_MISSING_SOME_COLUMNS = "658"` - `UNKNOWN_STATUS_OF_TRANSACTION = "659"` - `HDFS_ERROR = "660"` - `CANNOT_SEND_SIGNAL = "661"` - `FS_METADATA_ERROR = "662"` - `INCONSISTENT_METADATA_FOR_BACKUP = "663"` - `ACCESS_STORAGE_DOESNT_ALLOW_BACKUP = "664"` - `CANNOT_CONNECT_NATS = "665"` - `NOT_INITIALIZED = "667"` - `INVALID_STATE = "668"` - `NAMED_COLLECTION_DOESNT_EXIST = "669"` - `NAMED_COLLECTION_ALREADY_EXISTS = "670"` - `NAMED_COLLECTION_IS_IMMUTABLE = "671"` - `INVALID_SCHEDULER_NODE = "672"` - `RESOURCE_ACCESS_DENIED = "673"` - `RESOURCE_NOT_FOUND = "674"` - CANNOT_PARSE_IPV4 = "675" - CANNOT_PARSE_IPV6 = "676" - `THREAD_WAS_CANCELED = "677"` - `IO_URING_INIT_FAILED = "678"` - `IO_URING_SUBMIT_ERROR = "679"` - `MIXED_ACCESS_PARAMETER_TYPES = "690"` - `UNKNOWN_ELEMENT_OF_ENUM = "691"` - `TOO_MANY_MUTATIONS = "692"` - `AWS_ERROR = "693"` - `ASYNC_LOAD_CYCLE = "694"` - `ASYNC_LOAD_FAILED = "695"` - `ASYNC_LOAD_CANCELED = "696"` - `CANNOT_RESTORE_TO_NONENCRYPTED_DISK = "697"` - `INVALID_REDIS_STORAGE_TYPE = "698"` - `INVALID_REDIS_TABLE_STRUCTURE = "699"` - `USER_SESSION_LIMIT_EXCEEDED = "700"` - `CLUSTER_DOESNT_EXIST = "701"` - `CLIENT_INFO_DOES_NOT_MATCH = "702"` - `INVALID_IDENTIFIER = "703"` - `QUERY_CACHE_USED_WITH_NONDETERMINISTIC_FUNCTIONS = "704"` - `TABLE_NOT_EMPTY = "705"` - `LIBSSH_ERROR = "706"` - `GCP_ERROR = "707"` - `ILLEGAL_STATISTICS = "708"` - `CANNOT_GET_REPLICATED_DATABASE_SNAPSHOT = "709"` - `FAULT_INJECTED = "710"` - `FILECACHE_ACCESS_DENIED = "711"` - `TOO_MANY_MATERIALIZED_VIEWS = "712"` - `BROKEN_PROJECTION = "713"` - `UNEXPECTED_CLUSTER = "714"` - `CANNOT_DETECT_FORMAT = "715"` - `CANNOT_FORGET_PARTITION = "716"` - `EXPERIMENTAL_FEATURE_ERROR = 
"717"` - `TOO_SLOW_PARSING = "718"` - `QUERY_CACHE_USED_WITH_SYSTEM_TABLE = "719"` - `USER_EXPIRED = "720"` - `DEPRECATED_FUNCTION = "721"` - `ASYNC_LOAD_WAIT_FAILED = "722"` - `PARQUET_EXCEPTION = "723"` - `TOO_MANY_TABLES = "724"` - `TOO_MANY_DATABASES = "725"` - `DISTRIBUTED_CACHE_ERROR = "900"` - `CANNOT_USE_DISTRIBUTED_CACHE = "901"` - `KEEPER_EXCEPTION = "999"` - `POCO_EXCEPTION = "1000"` - `STD_EXCEPTION = "1001"` - `UNKNOWN_EXCEPTION = "1002"` --- URL: https://www.tinybird.co/docs/publish/api-endpoints/overview Last update: 2024-11-13T15:42:17.000Z Content: --- title: "API Endpoints · Tinybird Docs" theme-color: "#171612" description: "API Endpoints make it easy to use the results of your queries in applications." --- # API Endpoints¶ Tinybird can turn any Pipe into an API Endpoint that you can query. For example, you can ingest your data, build SQL logic inside a Pipe, and then publish the result of your query as an HTTP API Endpoint. You can then create interactive [Charts](https://www.tinybird.co/docs/docs/publish/charts) of your data. API Endpoints make it easy to use the results of your queries in applications. Any app that can run an HTTP GET can use Tinybird API Endpoints. Tinybird represents API Endpoints using the icon. ## Create an API Endpoint¶ To create an API Endpoint, you first need a [Pipe](https://www.tinybird.co/docs/docs/concepts/pipes) . You can publish any of the queries in your Pipes as an API Endpoint. ### Using the UI¶ First, [create a Pipe](https://www.tinybird.co/docs/docs/concepts/pipes#creating-pipes-in-the-ui) in the UI. In the Pipe, select **Create API Endpoint** , then select the Node that you want to publish. You can export a CSV file with the extracted data by selecting **Export CSV**. ### Using the CLI¶ First, [create a Pipe](https://www.tinybird.co/docs/docs/concepts/pipes#creating-pipes-in-the-cli) using the Tinybird CLI. Use the following command to publish an API Endpoint from the CLI. This automatically selects the final Node in the Pipe. tb pipe publish PIPE_NAME_OR_ID If you want to manually select a different Node to publish, supply the Node name as the final command argument: tb pipe publish PIPE_NAME_OR_ID NODE_NAME ## Secure your API Endpoints¶ Access to the APIs you publish in Tinybird are also protected with Tokens. You can limit which operations a specific Token can do through scopes. For example, you can create Tokens that are only able to do admin operations on Tinybird resources, or only have `READ` permission for a specific Data Source. See [Tokens](https://www.tinybird.co/docs/docs/concepts/auth-tokens) to understand how they work and see what types are available. ## API gateways¶ API gateways allow you to cloak or rebrand Tinybird API Endpoints while meeting additional security and compliance requirements. When you publish an [API Endpoint](https://www.tinybird.co/docs/docs/publish/api-endpoints/overview) in Tinybird, it's available through `api.tinybird.co` or the API Gateway URL that corresponds to your [Workspace](https://www.tinybird.co/docs/docs/concepts/workspaces) region. See [API Endpoint URLs](https://www.tinybird.co/docs/docs/api-reference/overview#regions-and-endpoints) . API Endpoints are secured using [Tokens](https://www.tinybird.co/docs/docs/concepts/auth-tokens) that are managed inside your Tinybird Workspace. Sometimes you might want to put the Tinybird API Endpoints behind an API Gateway. For example: - To present a unified brand experience to your users. - To avoid exposing Tokens and the underlying technology. 
- To comply with regulations around data privacy and security. - To add Tinybird to an existing API architecture. ### Alternative approaches¶ You can meet the requirements satisfied by an API gateway through other methods: - Use [JSON Web Tokens (JWTs)](https://www.tinybird.co/docs/docs/concepts/auth-tokens#json-web-tokens-jwts) to have your application call Tinybird API Endpoints from the frontend without proxying through your backend. - Appropriately scope the Token used inside your application. Exposing a read-only Token has limited security concerns as it can't be used to modify data. You can invalidate the Token at any time. - Use row-level security to ensure that a Token only provides access to the appropriate data. ### Amazon API Gateway¶ The steps to create a reverse proxy using Amazon API Gateway are as follows: 1. Access the API Gateway console. 2. Select **Create API**, then **HTTP API**. 3. Select **Add Integration** and then select **HTTP**. 4. Configure the integration with the method **GET** and the full URL to your Tinybird API with its Token. For example, `https://api.tinybird.co/v0/pipes/top-10-products.json?token=p.eyJ1Ijog...` 5. Set a name for the API and select **Next**. 6. On the **Routes** page, set the method to **GET** and configure the desired **Resource path**. For example, **/top-10-products**. 7. Go through the rest of the steps to create the API. You can find more information about applying a custom domain name in the [Amazon API Gateway documentation](https://docs.aws.amazon.com/apigateway/latest/developerguide/http-api-custom-domain-names.html). ### Google Cloud Apigee¶ The steps to create a reverse proxy using Apigee are as follows: 1. Access the Apigee console. 2. Add a new **Reverse Proxy**. 3. Add your **Base path**. For example, **/top-10-products**. 4. Add the **Target**. For example, `https://api.tinybird.co/v0/pipes/top-10-products.json?token=p.eyJ1Ijog...` 5. Select **Pass through** for security. 6. Select an environment to deploy the API to. 7. Deploy, and test the API. You can find more information about applying a custom domain name in the [Apigee documentation](https://cloud.google.com/apigee/docs/api-platform/publish/portal/custom-domain). ### Grafbase Edge Gateway¶ To create a new Grafbase Edge Gateway using the Grafbase CLI, follow these steps. Inside a new directory, run: npx grafbase init --template openapi-tinybird In Tinybird, open your API Endpoint page. Select **Create Chart**, **Share this API Endpoint**, then select **OpenAPI 3.0**. Copy the link that appears, including the full Token. Create an `.env` file using the following template and enter the required details. # TINYBIRD_API_URL is the URL for your published API Endpoint TINYBIRD_API_URL= # TINYBIRD_API_TOKEN is the Token with READ access to the API Endpoint TINYBIRD_API_TOKEN= # TINYBIRD_API_SCHEMA is the OpenAPI 3.0 spec URL copied from the API Endpoint docs page TINYBIRD_API_SCHEMA= You can now run the Grafbase Edge Gateway locally: npx grafbase dev Open the local Pathfinder at `http://127.0.0.1:4000` to test your Edge Gateway. Here is an example GraphQL query: query Tinybird { tinybird { topPages { data { action payload } rows } } } Make sure to replace `topPages` with the name of your API Endpoint. ### NGINX¶ The following is an example NGINX configuration file that handles a `GET` request and makes the request to Tinybird on your behalf. The Token is only accessed server-side and never exposed to the user.
worker_processes 1; events { worker_connections 1024; } http { server { listen 8080; server_name localhost; location /top-10-products { proxy_pass https://api.tinybird.co/v0/pipes/top-10-products.json?token=p.eyJ1Ijog...; } } } ## Query API and API Endpoints¶ Using the [Query API](https://www.tinybird.co/docs/docs/api-reference/query-api) is similar to running SQL statements against a normal database, rather than using Tinybird as your backend, and is useful for ad-hoc queries. Publish API Endpoints instead of using the Query API in the following situations: - You want to build and maintain all the logic in Tinybird and call the API Endpoint to fetch the result. - You want to use incremental Nodes in your Pipes to simplify the development and maintenance of your queries. - You need support for query parameters and more complex logic using the Tinybird [templating language](https://www.tinybird.co/docs/docs/cli/advanced-templates). - You need to incorporate changes to your query logic with little downstream impact. You can monitor performance of individual API Endpoints using [pipe_stats_rt](https://www.tinybird.co/docs/docs/monitoring/service-datasources#tinybird-pipe-stats-rt) and [pipe_stats](https://www.tinybird.co/docs/docs/monitoring/service-datasources#tinybird-pipe-stats), uncovering optimization opportunities. All requests to the Query API are grouped together, making it more difficult to monitor performance of a specific query. ## Errors and retries¶ API Endpoints return standard HTTP success or error codes. For errors, the response also includes extra information about what went wrong, encoded in the response as JSON. ### Error codes¶ API Endpoints might return the following HTTP error codes: | Code | Description | | --- | --- | | 400 | Bad request. An `HTTP 400` can be returned in several scenarios and typically represents a malformed request such as errors in your SQL queries or missing query parameters. | | 403 | Forbidden. The auth Token doesn't have the correct scopes. | | 404 | Not found. This usually occurs when the name of the API Endpoint is wrong or hasn't been published. | | 405 | HTTP Method not allowed. Requests to API Endpoints must use the `GET` method. | | 408 | Request timeout. This occurs when the query takes too long to complete; by default, this is 10 seconds. | | 414 | Request-URI Too Large. Not all APIs have the same limit but it's usually 2KB for GET requests. Reduce the URI length or use a POST request to avoid the limit. | | 429 | Too many requests. Usually occurs when an API Endpoint is hitting rate limits. | | 499 | Connection closed. This occurs if the client closes the connection after 1 second. If this is unexpected, increase the connection timeout on your end. | | 500 | Internal Server Error. Usually an unexpected transient service error. | Errors when running a query are usually reported as 400 Bad request or 500 Internal Server Error, depending on whether the error can be fixed by the caller or not. In those cases, the API response has an additional HTTP header, `X-DB-Exception-Code`, where you can check the internal database error, reported as a stringified number. For a full list of internal database errors, see [List of API Endpoint database errors](https://www.tinybird.co/docs/list-of-errors). ### Retries¶ When implementing an API Gateway, make sure to handle potential errors and implement retry strategies where appropriate.
Implement automatic retries for the following errors: - HTTP 429: Too many requests - HTTP 500: Internal Server Error Use exponential backoff when retrying requests that produce these errors. See [Exponential backoff](https://en.wikipedia.org/wiki/Exponential_backoff) in Wikipedia. ### Token limitations with API Gateways¶ When using an API Gateway or proxy between your application and Tinybird, your proxy uses a Token to authenticate requests to Tinybird. Treat the Token as a secret and don't expose it to the client. Use a service such as Unkey to add multi-tenant API keys, rate limiting, and token usage analytics to your app at scale. ## Build plan limits¶ The Build plan is the free tier of Tinybird. See ["Tinybird plans"](https://www.tinybird.co/docs/docs/plans). The Build plan has the following limits on the number of API requests per day that you can make against published API Endpoints, and on total storage. | Description | Limit and time window | | --- | --- | | API Endpoint | 1,000 requests per day | | Data Sources storage | 10 gigabytes in total | These limits don't apply to paid plans. To learn more about how Tinybird bills for different data operations, see [Billing](https://www.tinybird.co/docs/docs/support/billing). ## Next steps¶ - Read more about how to use Tokens in the [Tokens docs](https://www.tinybird.co/docs/docs/concepts/auth-tokens). - Read the guide: ["Consume APIs in a Next.js frontend with JWTs"](https://www.tinybird.co/docs/docs/guides/integrations/consume-apis-nextjs). - Understand [Tokens](https://www.tinybird.co/docs/docs/concepts/auth-tokens). --- URL: https://www.tinybird.co/docs/publish/charts Last update: 2024-11-05T10:32:54.000Z Content: --- title: "Charts · Tinybird Docs" theme-color: "#171612" description: "Create beautiful, fast charts of your Tinybird data." --- # Charts¶ Charts are a great way to visualize your data. You can create and publish easy, fast Charts in Tinybird from any of your published API Endpoints. <-figure-> ![Example Tinybird dashboard showing multiple chart types](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fcharts-dashboard.png&w=3840&q=75) <-figcaption-> Example Tinybird Charts dashboard Check out the [live demo](https://guide-tinybird-charts.vercel.app/) to see an example of Charts in action. ## Overview¶ When you publish an API Endpoint, you often want to visualize the data in a more user-friendly way. Charts are a great way to do this. Tinybird provides three options: - **No-code:** A fast, UI-based flow for creating Charts that live in your Tinybird Workspace UI (great for internal reference use, smaller projects, and getting started). - **Low code:** Using the Tinybird UI to create Charts and generate an iframe, which you can then embed in your own application. - **Code-strong**: Using the `@tinybirdco/charts` npm React library to build out exactly what you need, in your own application, using React components. Fully customizable and secured with JWTs. You can either generate initial Chart data in the Tinybird UI, or start using the library directly. Instead of coding your own charts and dashboards from scratch, use Tinybird's pre-built Chart components. You won't have to implement the frontend and backend architecture, or any security middleware. Use the library components and JWTs, manage the token exchange flow, and interact directly with any of your published Tinybird API Endpoints. To create a Chart, you need to have a published API Endpoint.
Learn how to [publish an API Endpoint here](https://www.tinybird.co/docs/docs/publish/api-endpoints/overview). ## The Tinybird Charts library BETA¶ All options are built on the Tinybird Charts library ( `@tinybirdco/charts` ) , a modular way to build fast visualizations of your data, which leverages [Apache ECharts](https://echarts.apache.org/en/index.html) . You can use Tinybird Charts with any level of customization, and also use your Tinybird data with any third party library. ### Components¶ The library provides the following components: - `AreaChart` - `BarChart` - `BarList` - `DonutChart` - `LineChart` - `PieChart` - `Table` All components share the same API, making it easy to switch between different types of charts. The Tinybird Charts library is currently in public beta. If you have any feedback or suggestions, contact Tinybird at [support@tinybird.co](https://www.tinybird.co/docs/mailto:support@tinybird.co) or in the [Community Slack](https://www.tinybird.co/docs/docs/community). ## Create a Chart in the UI (no-code)¶ 1. In your Workspace, navigate to the Overview page of one of your API Endpoints. 2. Select the "Create Chart" button (top right). 3. Configure your Chart by selecting the name, the type of Chart, and the fields you want to visualize. Under "Data", select the index and category. 4. Once you're happy with your Chart, select "Save". <-figure-> ![Example Tinybird pie chart, showing configuration options](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fcharts-create-chart.png&w=3840&q=75) <-figcaption-> Example Tinybird pie Chart, showing configuration options Your Chart now lives in the API Endpoint Overview page. ## Create a Chart using an iframe (low code)¶ Tinybird users frequently want to take the data from their Tinybird API Endpoints, create charts, and embed them in their own dashboard application. A low-overhead option is to take the generated iframe and drop it into your application: 1. Create a Chart using the process described above. 2. In the API Endpoint Overview page, scroll down to the "Charts" tab and select your Chart. 3. Select the `<>` tab to access the code snippets. 4. Copy and paste the ready-to-use iframe code into your application. <-figure-> ![GIF showing a user creating a Pie Chart in the Tinybird UI and generating the iframe code](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fcharts-create-pie-chart.gif&w=3840&q=75) <-figcaption-> Creating a Pie Chart in the Tinybird UI and generating the iframe code ## Create a Chart using the React library (code-strong)¶ This option gives you the most customization flexibility. You'll need to be familiar with frontend development and styling. To create a Chart component and use the library, you can either create a Chart in the UI first, or use the library directly. Using the library directly means there will not be a Chart created in your Workspace, and no generated snippet, so skip to #2. ### 1. View your Chart code¶ 1. In your Workspace, navigate to the API Endpoint Overview page. 2. Scroll down to the "Charts" tab and select one of your Charts. 3. Select the `<>` tab to access the code snippets. 4. You now have the code for a ready-to-use React component. ### 2. Install the library¶ Install the `@tinybirdco/charts` library locally in your project: npm install @tinybirdco/charts ### 3. Create a JWT¶ Calls need to be secured with a token. 
To learn more about the token exchange, see [Understanding the token exchange](https://www.tinybird.co/docs/docs/guides/integrations/charts-using-iframes-and-jwt-tokens#understanding-the-token-exchange). In React code snippets, the Chart components are authenticated using the `token` prop, where you paste your [Tinybird Token](https://www.tinybird.co/docs/docs/concepts/auth-tokens). You can limit how often you or your users can fetch Tinybird APIs on a per-endpoint or per-user basis. See [Rate limits for JWTs](https://www.tinybird.co/docs/docs/concepts/auth-tokens#rate-limits-for-jwts). #### Create a JWT¶ There is wide support for creating JWTs in many programming languages and frameworks. See the [Tinybird JWT docs for popular options](https://www.tinybird.co/docs/concepts/auth-tokens#create-a-jwt-in-production). ### 4. Embed the Chart into your application¶ Copy and paste the Chart snippet (either generated by Tinybird, or constructed by you [using the same configuration](https://www.npmjs.com/package/@tinybirdco/charts#usage)) into your application. For example: ##### Example Line Chart component code snippet import React from 'react' import { LineChart } from '@tinybirdco/charts' function MyLineChart() { return ( <LineChart /> ) } ### 5. Fetch data¶ The most common approach for fetching and rendering data is to directly use a single Chart component (or group of individual components) by passing the required props. These are included by default within each generated code snippet, so for most use cases, you should be able to simply copy, paste, and have the Chart you want. The library offers [many additional props](https://www.npmjs.com/package/@tinybirdco/charts?activeTab=readme#api) for further customization, including many that focus specifically on fetching data. See [6. Customization](https://www.tinybird.co/docs/docs/publish/charts#6-customization) for more. #### Alternative approaches and integrations¶ Depending on your needs, you have additional options: - Wrapping components within `<ChartProvider>` to share styles, query configuration, and custom loading and error states [among several Chart components](https://www.npmjs.com/package/@tinybirdco/charts?activeTab=readme#reusing-styles-and-query-config-using-the-chartprovider). - Adding your own fetcher to the `ChartProvider` (or to a specific Chart component) using the `fetcher` prop. This can be useful for adding custom headers or dealing with JWT tokens. - Using the `useQuery` hook to [fetch data and pass it directly to the component](https://www.npmjs.com/package/@tinybirdco/charts?activeTab=readme#using-the-hook). It works with any custom component, or with any third-party library like [Tremor](https://www.tremor.so/) or [shadcn](https://ui.shadcn.com/). <-figure-> ![GIF showing how to use the library with Tinybird Charts & shadcn in the UI](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fcharts-tb-sd.gif&w=3840&q=75) <-figcaption-> Using the library with Tinybird Charts & shadcn ### 6. Customization¶ Tinybird supports customization: - **Standard customization**: Use the properties provided by the [@tinybirdco/charts library](https://www.npmjs.com/package/@tinybirdco/charts?activeTab=readme#api). - **Advanced customization**: Send a specific parameter to [customize anything within a Chart](https://www.npmjs.com/package/@tinybirdco/charts?activeTab=readme#extra-personalization-using-echarts-options), aligning with the [ECharts specification](https://echarts.apache.org/handbook/en/get-started/).
### 7. Filtering¶ Filtering is possible by using the endpoint parameters. Use the `params` data-fetching property to pass your parameters to a chart component, `<ChartProvider>`, or the `useQuery` hook. See the [example snippet](https://www.tinybird.co/docs/about:blank#4-embed-the-chart-into-your-application) for how `params` and filters are used. ### 8. Advanced configuration (optional)¶ #### Polling¶ Control how frequently your Chart polls for new data by setting the `refreshInterval` prop (interval in milliseconds). The npm library offers [a range of additional component props](https://www.npmjs.com/package/@tinybirdco/charts) specifically for data fetching, so be sure to review them and use a combination to build the perfect chart. Use of this feature may significantly increase your billing costs. The lower the `refreshInterval` prop (so the more frequently you're polling for fresh data), the more requests you're making to your Tinybird API Endpoints. Read [the billing docs](https://www.tinybird.co/docs/docs/support/billing) and understand the pricing of different operations. #### Global vs local settings¶ Each chart can have its own settings, or settings can be shared across a group of Chart components by wrapping them within `<ChartProvider>`. #### States¶ Chart components can be in one of a range of states: - Success - Error - Loading ## Example use cases¶ Two examples showing different ways to generate JWTs, set up a local project, and implement a Chart: 1. Guide: [Consume APIs in a Next.js frontend with JWTs](https://www.tinybird.co/docs/docs/guides/integrations/consume-apis-nextjs). 2. Guide: [Build charts with iframes and JWTs](https://www.tinybird.co/docs/docs/guides/integrations/charts-using-iframes-and-jwt-tokens). Interested in Charts but don't have any data to use? Run the demo in the [Add Tinybird Charts to a Next.js frontend](https://www.tinybird.co/docs/docs/guides/integrations/add-charts-to-nextjs) guide, which uses a bootstrapped Next.js app and fake data. ## Troubleshooting¶ ### Handle token errors¶ See the information on [JWT error handling](https://www.tinybird.co/docs/docs/concepts/auth-tokens#error-handling). ### Refresh token¶ See the information on [JWT refreshing and limitations](https://www.tinybird.co/docs/docs/concepts/auth-tokens#jwt-limitations). ## Next steps¶ - Check out the [Tinybird Charts library](https://www.npmjs.com/package/@tinybirdco/charts). - Understand how Tinybird [uses Tokens](https://www.tinybird.co/docs/docs/concepts/auth-tokens). --- URL: https://www.tinybird.co/docs/publish/copy-pipes Last update: 2024-11-13T15:42:17.000Z Content: --- title: "Copy Pipes · Tinybird Docs" theme-color: "#171612" description: "Copy Pipes allow you to capture the result of a Pipe at a moment in time, and write the result into a target Data Source. They can be run on a schedule, or executed on demand." --- # Copy Pipes¶ Copy Pipes are an extension of Tinybird's [Pipes](https://www.tinybird.co/docs/docs/concepts/pipes). Copy Pipes allow you to capture the result of a Pipe at a moment in time, and write the result into a target [Data Source](https://www.tinybird.co/docs/docs/concepts/data-sources). They can be run on a schedule, or executed on demand. Copy Pipes are great for use cases like: - Event-sourced snapshots, such as change data capture (CDC). - Copying data from Tinybird to another location in Tinybird to experiment. - De-duplicating with snapshots. Copy Pipes should not be confused with [Materialized Views](https://www.tinybird.co/docs/docs/publish/materialized-views/overview).
Materialized Views continuously re-evaluate a query as new events are inserted, while Copy Pipes create a single snapshot at a given point in time. Tinybird represents Copy Pipes using the icon. ## Best practices¶ A Copy Pipe executes the Pipe's query on each run to export the result. This means that the size of the copy operation is tied to the size of your data and the complexity of the Pipe's query. As Copy Pipes can run frequently, it's strongly recommended that you [follow the best practices for faster SQL](https://www.tinybird.co/docs/docs/query/sql-best-practices) to optimize your queries, and understand the following best practices for optimizing your Copy Pipes. ### 1. Round datetime filters to your schedule¶ Queries in Copy Pipes should always have a time window filter that aligns with the execution schedule. For example, a Copy Pipe that runs once a day typically has a filter for yesterday's data, and an hourly schedule usually means a filter to get results from the previous hour. Remember that a Copy Pipe job is not guaranteed to run exactly at the scheduled time. If a Copy Pipe is scheduled to run at 16:00:00, the job could run at 16:00:01 or even 16:00:05. To account for a potential delay, **round your time window filter to align with the schedule window**. For example, if your Copy Pipe is scheduled hourly, instead of writing: SELECT * FROM datasource WHERE datetime >= now() - interval 1 hour AND datetime < now() You should use `toStartOfHour()` to round the time filter to the hour: SELECT * FROM datasource WHERE datetime >= toStartOfHour(now()) - interval 1 hour AND datetime < toStartOfHour(now()) Doing this means that, even if the Copy Pipe's execution is delayed (perhaps being triggered at 16:00:01, 17:00:30, and 18:00:02), you still maintain consistent copies of data regardless of the delay, with no gaps or overlaps. ### 2. Account for late data in your schedule¶ There are many reasons why you might have late-arriving data: system downtime in your message queues, network outages, or something else. These are largely unavoidable and will occur at some point in your streaming journey. You should **account for potential delays ahead of time**. When using Copy Pipes, include some headroom in your schedule to allow for late data. How much headroom you give to your schedule is up to you, but some useful guidance is that you should consider the Service Level Agreements (SLAs) both up and downstream of Tinybird. For instance, if your streaming pipeline has 5 minute downtime SLAs, then most of your late data should be less than 5 minutes. Similarly, consider if you have any SLAs from data consumers who are expecting timely data in Tinybird. If you schedule a Copy Pipe every 5 minutes (16:00, 16:05, 16:10...), there could be events with a timestamp of 15:59:59 that do not arrive in Tinybird until 16:00:01 (2 seconds late!). If the Copy Pipe executes at exactly 16:00:00, these events could be lost. There are two ways to add headroom to your schedule: #### Option 1: Delay the execution¶ The first option is to simply delay the schedule. For example, if you want to create a Copy Pipe that creates 5 minute snapshots, you could delay the schedule by 1 minute, so that instead of running at 17:00, 17:05, 17:10, etc. it would instead run at 17:01, 17:06, 17:11. To achieve this, you could use the cron expression `1-59/5 * * * *`.
If you use this method, you must combine it with the advice from the first tip in this best practices guide to [Round datetime filters to your schedule](https://www.tinybird.co/docs/about:blank#1-round-datetime-filters-to-your-schedule) . For example: SELECT * FROM datasource WHERE datetime >= toStartOfFiveMinutes(now()) - interval 5 minute AND datetime < toStartOfFiveMinutes(now()) #### Option 2: Filter the result with headroom¶ Another strategy is to keep your schedule as desired (17:00, 17:05, 17:10, etc.) but apply a filter in the query to add some headroom. For example, you can move the copy window by 15 seconds: WITH (SELECT toStartOfFiveMinutes(now()) - interval 15 second) AS snapshot SELECT snapshot, * FROM datasource WHERE timestamp >= snapshot - interval 5 minute AND timestamp < snapshot With this method, a Copy Pipe that executes at 17:00 will copy data from 16:54:45 to 16:59:45. At 17:05, it would copy data from 16:59:45 to 17:04:45, and so on. It is worth noting that this can be confusing to data consumers who might notice that the data timestamps don't perfectly align with the schedule, so consider whether you'll need extra documentation. ### 3. Write a snapshot timestamp¶ There are many reasons why you might want to capture a snapshot timestamp, as it documents when a particular row was written. This helps you identify which execution of the Copy Pipe is responsible for which row, which is useful for debugging or auditing. For example: SELECT toStartOfHour(now()) as snapshot_id, * FROM datasource WHERE timestamp >= snapshot_id - interval 1 hour AND timestamp < snapshot_id In this example, you're adding a new column at the start of the result which contains the rounded timestamp of the execution time. By applying an alias to this column, you can re-use it in the query as your [rounded datetime filter](https://www.tinybird.co/docs/about:blank#1-round-datetime-filters-to-your-schedule) , saving you a bit of typing. ### 4. Use parameters in your Copy Pipe¶ Copy Pipes can be executed following a schedule or on-demand. All the previous best practices on this page are focused on scheduled executions. But what happens if you want to use the same Copy Pipe to do a backfill? For example, you want to execute the Copy Pipe only on data from last year, to fill in a gap behind your fresh data. To do this, you can parameterize the filters. When you run an on-demand Copy Pipe with parameters, you can modify the values of the parameters before execution. Scheduled Copy Pipes with parameters use the default value for any parameters. This means you can simply re-use the same Copy Pipe for your fresh, scheduled runs as well as any ad-hoc backfills. The following example creates a Pipe with two [Nodes](https://www.tinybird.co/docs/docs/concepts/pipes#nodes) that break up the Copy Pipe logic to be more readable. The first Node is called `date_params`: % {% if defined(snapshot_id) %} SELECT parseDateTimeBestEffort({{String(snapshot_id)}}) as snapshot_id, snapshot_id - interval 5 minute as start {% else %} SELECT toStartOfFiveMinutes(now()) as snapshot_id, snapshot_id - interval 5 minute as start {% end %} The `date_params` Node looks for a parameter called `snapshot_id` . If it encounters this parameter, it knows that this is an on-demand execution, because a scheduled Copy Pipe will not be passed any parameters by the scheduler. The scheduled execution of the Copy Pipe will create a time filter based on `now()` . 
An on-demand execution of the Copy Pipe will use this `snapshot_id` parameter to create a dynamic time filter. In both cases, the final result of this Node is a time filter called `snapshot_id`. In the second Node: SELECT (SELECT snapshot_id FROM date_params) as snapshot_id, * FROM datasource WHERE timestamp >= (SELECT start FROM date_params) AND timestamp < snapshot_id First, you select the result of the previous `date_params` Node, which is the `snapshot_id` time filter. You do not need to worry about whether this is a scheduled or on-demand execution at this point, as it has already been handled by the previous Node. This also retrieves the other side of the time filter from the first Node with `(SELECT start FROM date_params)` . This is not strictly needed but it's convenient so you don't have to write the `- interval 5 minute` in multiple places, making it easier to update in the future. With this, the same Copy Pipe can perform both functions: being executed on a regular schedule to keep up with fresh data, and being executed on-demand when needed. ## Configure a Copy Pipe (CLI)¶ To create a Copy Pipe from the CLI, you need to create a .pipe file. This file follows the same format as [any other .pipe file](https://www.tinybird.co/docs/docs/cli/datafiles/overview) , including defining Nodes that contain your SQL queries. In this file, define the queries that will filter and transform the data as needed. The final result of all queries should be the result that you want to write into a Data Source. You must define which Node contains the final result. To do this, include the following parameters at the end of the Node: TYPE COPY TARGET_DATASOURCE datasource_name COPY_SCHEDULE --(optional) a cron expression or @on-demand. If not defined, it defaults to @on-demand. COPY_MODE append --(Optional) The strategy to ingest data for copy jobs. One of `append` or `replace`; if empty, the default strategy is `append`. There can be only one copy Node per Pipe, and no other outputs, such as [Materialized Views](https://www.tinybird.co/docs/docs/publish/materialized-views/overview) or [API Endpoints](https://www.tinybird.co/docs/docs/publish/api-endpoints/overview). Copy Pipes can either be scheduled, or executed on-demand. This is configured using the `COPY_SCHEDULE` setting. To schedule a Copy Pipe, configure `COPY_SCHEDULE` with a cron expression. On-demand Copy Pipes are defined by configuring `COPY_SCHEDULE` with the value `@on-demand`. Note that all schedules are executed in the UTC time zone. If you are configuring a schedule that runs at a specific time, remember to convert the desired time from your local time zone into UTC. Here is an example of a Copy Pipe that is scheduled every hour and writes the results of a query into the `sales_hour_copy` Data Source: NODE daily_sales SQL > % SELECT toStartOfDay(starting_date) day, country, sum(sales) as total_sales FROM teams WHERE day BETWEEN toStartOfDay(now()) - interval 1 day AND toStartOfDay(now()) and country = {{ String(country, 'US') }} GROUP BY day, country TYPE COPY TARGET_DATASOURCE sales_hour_copy COPY_SCHEDULE 0 * * * * Before pushing the Copy Pipe to your Workspace, make sure that the target Data Source already exists and has a schema that matches the output of the query result. Data Sources will not be created automatically when a Copy Pipe runs. If you push the target Data Source and the Copy Pipe at the same time, be sure to use the `--push-deps` option in the CLI.
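For reference, the following is a minimal sketch of what a matching `sales_hour_copy` .datasource file could look like for the example above. The column types are assumptions (they treat `sales` as a Float64), so adjust them to whatever the query against your `teams` Data Source actually returns; as noted above, the schema has to match the Copy Pipe's output.

DESCRIPTION >
    Hourly snapshots written by the scheduled Copy Pipe above

SCHEMA >
    `day` DateTime,
    `country` String,
    `total_sales` Float64

ENGINE "MergeTree"
ENGINE_SORTING_KEY "day, country"

`toStartOfDay()` returns a DateTime, which is why `day` is declared as DateTime in this sketch.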
### Change Copy Pipe Token reference¶ Tinybird automatically creates a [Token](https://www.tinybird.co/docs/docs/concepts/auth-tokens) each time a scheduled copy is created, to read from the Pipe and copy the results. To change the Token strategy (for instance, to share the same one across each copy, rather than individual Tokens for each), update the [Token reference in the .pipe datafile](https://www.tinybird.co/docs/docs/cli/datafiles/pipe-files). ## Execute Copy Pipes (CLI)¶ Copy Pipes can either be scheduled, or executed on-demand. When a Copy Pipe is pushed with a schedule, it will automatically be executed as per the schedule you defined. If you need to pause the scheduler, you can run `tb pipe copy pause [pipe_name]` , and use `tb pipe copy resume [pipe_name]` to resume. Note that you cannot customize the values of dynamic parameters on a scheduled Copy Pipe. Any parameters will use their default values. When a Copy Pipe is pushed without a schedule, using the `@on-demand` directive, you can run `tb pipe copy run [pipe_name]` to trigger the Copy Pipe as needed. You can pass parameter values to the Copy Pipe by using the `--param` flag, e.g., `--param key=value`. You can run `tb job ls` to see any running jobs, as well as any jobs that have finished during the last 48 hours. If you remove a Copy Pipe from your Workspace, the schedule will automatically stop and no more copies will be executed. ## Configure a Copy Pipe (UI)¶ To create a Copy Pipe from the UI, [follow the process to create a standard Pipe](https://www.tinybird.co/docs/docs/concepts/pipes#creating-pipes-in-the-ui). After writing your queries: 1. Select the Node that contains the final result. 2. Select the actions button next to the Node. 3. Select** Create Copy Job** . To configure the frequency: 1. Select whether the Copy Pipe should be scheduled using a cron expression, or run on-demand, using the** Frequency** menu. 2. If you select a cron expression, configure the expression. 3. Select** Next** to continue. All schedules run in the UTC time zone. If you are configuring a schedule that runs at a specific time, you might need to convert the desired time from your local time zone into UTC. ### On-demand¶ If you selected **on-demand** as the frequency for the Copy Pipe, you can customize the values for any parameters of the Pipe. You can also configure whether the Copy Pipe should write results into a new, or existing, Data Source, using the radio buttons. If you use an existing Data Source, you can select which one to use from the menu of your Data Sources. Only Data Sources with a compatible schema are in the menu. If you create a new Data Source, Tinybird guides you through creating the new Data Source. Select **Next** to continue and go through the standard [Create Data Source wizard](https://www.tinybird.co/docs/docs/concepts/data-sources#creating-data-sources-in-the-ui). ### Scheduled¶ If you selected **cron expression** as the frequency for the Copy Pipe, a preview of the result appears. You can't configure parameter values for a scheduled Copy Pipe. Review the results and select **Next** to continue. Finally, you can configure whether the Copy Pipe should write results into a new, or existing, Data Source, using the radio buttons. If you use an existing Data Source, you can select which one to use from the menu of your Data Sources. Only Data Sources with a compatible schema are in the menu. If you create a new Data Source, Tinybird guides you through creating the new Data Source.
Select **Next** to continue and go through the standard [Create Data Source wizard](https://www.tinybird.co/docs/docs/concepts/data-sources#creating-data-sources-in-the-ui). ## Run Copy Pipes (UI)¶ To run a Copy Pipe in the UI, navigate to the Pipe, and select the **Copying** button. From the options, select **Run copy now**. You can't customize the values of dynamic parameters on a scheduled Copy Pipe. Any parameters use their default values. ## Iterating a Copy Pipe¶ Copy Pipes can be iterated using [version control](https://www.tinybird.co/docs/docs/production/overview) like any other resource in your Data Project. However, you need to understand how connections work in Branches and deployments to select the appropriate strategy for your desired changes. By default, Branches don't execute recurrent jobs on creation, such as the scheduled runs of Copy Pipes. To iterate a Copy Pipe, create a new one, or recreate the existing one, with the desired configuration. The new Copy Pipe starts executing from the Branch, without affecting the unchanged production resource. This means you can test the changes without mixing the test resource with your production exports. [This example](https://github.com/tinybirdco/use-case-examples/tree/main/change_copy_pipe_time_granularity) shows how to change the Copy Pipe time granularity, adding an extra step to backfill the old data. ## Monitoring¶ Tinybird provides a high-level metrics page for each Copy Pipe in the UI, as well as exposing low-level observability data through the [Service Data Sources](https://www.tinybird.co/docs/docs/monitoring/service-datasources). You can view high-level status and statistics about your Copy Pipes in the Tinybird UI from the Copy Pipe's details page. To access the details page, navigate to the Pipe, and select **View Copy Job**. The details page shows summaries of the Copy Pipe's current status and configuration, as well as Charts showing the performance of previous executions. You can also monitor your Copy Pipes using the [datasources_ops_log Service Data Source](https://www.tinybird.co/docs/docs/monitoring/service-datasources) . This Data Source contains data about all your operations in Tinybird. Logs that relate to Copy Pipes can be identified by a value of `copy` in the `event_type` column. For example, the following query aggregates the Processed Data from Copy Pipes, for the current month, for a given Data Source name. SELECT toStartOfMonth(timestamp) month, sum(read_bytes + written_bytes) processed_data FROM tinybird.datasources_ops_log WHERE datasource_name = '{YOUR_DATASOURCE_NAME}' AND event_type = 'copy' AND timestamp >= toStartOfMonth(now()) GROUP BY month Using this Data Source, you can also write queries to determine average job duration, number of errors, error messages, and more. ## Billing¶ Processed Data and Storage are the two metrics that Tinybird uses for [billing](https://www.tinybird.co/docs/docs/support/billing) Copy Pipes. A Copy Pipe executes the Pipe's query (Processed Data) and writes the result into a Data Source (Storage). Any processed data and storage incurred by a Copy Pipe is charged at the standard rate for your billing plan (see ["Tinybird plans"](https://www.tinybird.co/docs/docs/plans) ). See the [Monitoring section](https://www.tinybird.co/docs/about:blank#monitoring) for guidance on monitoring your usage of Copy Pipes. ## Limits¶ Check the [limits page](https://www.tinybird.co/docs/docs/support/limits) for limits on ingestion, queries, API Endpoints, and more.
### Build and Professional¶ The schedule applied to a Copy Pipe does not guarantee that the underlying job executes immediately at the configured time. The job is placed into a job queue when the configured time elapses. It is possible that, if the queue is busy, the job could be delayed and executed some time after the scheduled time. ### Enterprise¶ A maximum execution time of 50% of the scheduling period, 30 minutes max, means that if the Copy Pipe is scheduled to run every minute, the operation can take up to 30 seconds. If it is scheduled to run every 5 minutes, the job can last up to 2m30s, and so forth. This is to prevent overlapping jobs, which can impact results. The schedule applied to a Copy Pipe does not guarantee that the job executes immediately at the configured time. When the configured time elapses, the job is placed into a job queue. It is possible that, if the queue is busy, the job could be delayed and executed some time after the scheduled time. To reduce the chances of a busy queue affecting your Copy Pipe execution schedule, we recommend distributing the jobs over a wider period of time rather than grouped close together. For Enterprise customers, these settings can be customized. Reach out to your Customer Success team directly, or email us at [support@tinybird.co](https://www.tinybird.co/docs/mailto:support@tinybird.co). ## Next steps¶ - Understand how to use Tinybird's[ dynamic query parameters](https://www.tinybird.co/docs/docs/query/query-parameters) . - Read up on configuring, executing, and iterating[ Materialized Views](https://www.tinybird.co/docs/docs/publish/materialized-views/overview) . --- URL: https://www.tinybird.co/docs/publish/kafka-sink Last update: 2024-11-08T11:23:54.000Z Content: --- title: "Kafka Sink · Tinybird Docs" theme-color: "#171612" description: "Push events to Kafka on a batch-based schedule using Tinybird's fully managed Kafka Sink Connector." --- # Kafka Sink¶ Kafka Sinks are currently in private beta. If you have any feedback or suggestions, contact Tinybird at [support@tinybird.co](https://www.tinybird.co/docs/mailto:support@tinybird.co) or in the [Community Slack](https://www.tinybird.co/docs/docs/community). Tinybird's Kafka Sink allows you to push the results of a query to a Kafka topic. Queries can be executed on a defined schedule or on-demand. Common uses for the Kafka Sink include: - Push events to Kafka as part of an event-driven architecture. - Exporting data to other systems that consume data from Kafka. - Hydrating a data lake or data warehouse with real-time data. Tinybird represents Sinks using the icon. ## Prerequisites¶ To use the Kafka Sink, you need to have a Kafka cluster that Tinybird can reach via the internet, or via private networking for Enterprise customers. ## Configure using the UI¶ ### 1. Create a Pipe and promote it to Sink Pipe¶ In the Tinybird UI, create a Pipe and write the query that produces the result you want to export. In the top right "Create API Endpoint" menu, select "Create Sink". In the modal, choose the destination (Kafka). ### 2. Choose the scheduling options¶ You can configure your Sink to run using a cron expression, so it runs automatically when needed. ### 3. Configure destination topic¶ Enter the Kafka topic where events are going to be pushed. ### 4. Preview and create¶ The final step is to check and confirm that the preview matches what you expect. Congratulations! You've created your first Sink. ## Configure using the CLI¶ ### 1. 
Create the Kafka Connection¶ Run the `tb connection create kafka` command, and follow the instructions. ### 2. Create Kafka Sink Pipe¶ To create a Sink Pipe, create a regular .pipe and filter the data you want to export to your Kafka topic in the SQL section as in any other Pipe. Then, specify the Pipe as a sink type and add the needed configuration. Your Pipe should have the following structure: NODE node_0 SQL > SELECT * FROM events WHERE time >= toStartOfMinute(now()) - interval 30 minute TYPE sink EXPORT_SERVICE kafka EXPORT_CONNECTION_NAME "test_kafka" EXPORT_KAFKA_TOPIC "test_kafka_topic" EXPORT_SCHEDULE "*/5 * * * *" **Pipe parameters** For this step, you will need to configure the following [Pipe parameters](https://www.tinybird.co/docs/docs/cli/datafiles/pipe-files#sink-pipe): | Key | Type | Description | | --- | --- | --- | | EXPORT_CONNECTION_NAME | string | Required. The connection name to the destination service. This is the connection created in Step 1. | | EXPORT_KAFKA_TOPIC | string | Required. The desired topic for the export data. | | EXPORT_SCHEDULE | string | A crontab expression that sets the frequency of the Sink operation or the `@on-demand` string. | Once ready, push the datafile to your Workspace using `tb push` (or `tb deploy` if you are using [version control](https://www.tinybird.co/docs/docs/production/overview) ) to create the Sink Pipe. ## Scheduling¶ The schedule applied doesn't guarantee that the underlying job executes immediately at the configured time. The job is placed into a job queue when the configured time elapses. It is possible that, if the queue is busy, the job could be delayed and executed after the scheduled time. To reduce the chances of a busy queue affecting your Sink Pipe execution schedule, we recommend distributing the jobs over a wider period of time rather than grouped close together. For Enterprise customers, these settings can be customized. Reach out to your Customer Success team or email us at [support@tinybird.co](https://www.tinybird.co/docs/mailto:support@tinybird.co). ## Query parameters¶ You can add [query parameters](https://www.tinybird.co/docs/docs/query/query-parameters) to your Sink, the same way you do in API Endpoints or Copy Pipes. For scheduled executions, the default values for the parameters will be used when the Sink runs. ## Iterating a Kafka Sink (Coming soon)¶ Iterating features for Kafka Sinks are not yet supported in the beta. They are documented here for future reference. Sinks can be iterated using [version control](https://www.tinybird.co/docs/docs/production/overview) , similar to other resources in your project. When you create a Branch, resources are cloned from the main Branch. However, there are two considerations for Kafka Sinks to understand: **1. Schedules** When you create a Branch with an existing Kafka Sink, the resource will be cloned into the new Branch. However, **it will not be scheduled** . This prevents Branches from running exports unintentionally and consuming resources, as it is common that development Branches do not need to export to external systems. If you want these queries to run in a Branch, you must recreate the Kafka Sink in the new Branch. **2. Connections** Connections are not cloned when you create a Branch. You need to create a new Kafka connection in the new Branch for the Kafka Sink. ## Observability¶ Kafka Sink operations are logged in the [tinybird.sinks_ops_log](https://www.tinybird.co/docs/docs/monitoring/service-datasources#tinybird-sinks-ops-log) Service Data Source.
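As a starting point, here is a minimal, hedged sketch for inspecting recent Sink executions; the `timestamp` column name is an assumption, so verify it against the `sinks_ops_log` schema in the Service Data Sources docs before relying on it:

-- List the most recent Kafka Sink executions from the last day
-- (the timestamp column is assumed; check the sinks_ops_log schema)
SELECT *
FROM tinybird.sinks_ops_log
WHERE timestamp >= now() - interval 1 day
ORDER BY timestamp DESC
LIMIT 20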
## Limits & quotas¶ Check the [limits page](https://www.tinybird.co/docs/docs/support/limits) for limits on ingestion, queries, API Endpoints, and more. ## Billing¶ Any Processed Data incurred by a Kafka Sink is charged at the standard rate for your account. The Processed Data is already included in your plan, and counts towards your commitment. If you're on an Enterprise plan, view your plan and commitment on the [Organizations](https://www.tinybird.co/docs/docs/monitoring/organizations) tab in the UI. ## Next steps¶ - Get familiar with the[ Service Data Source](https://www.tinybird.co/docs/docs/monitoring/service-datasources) and see what's going on in your account - Deep dive on Tinybird's[ Pipes concept](https://www.tinybird.co/docs/docs/concepts/pipes) --- URL: https://www.tinybird.co/docs/publish/materialized-views/best-practices Last update: 2024-11-15T10:10:55.000Z Content: --- title: "Best practices for Materialized Views · Tinybird Docs" theme-color: "#171612" description: "Learn how Materialized Views work and how to best use them." --- # Best practices for Materialized Views¶ Read on to learn how [Materialized Views](https://www.tinybird.co/docs/overview) work and how to best use them in your data projects. ## How data gets into a Materialized View¶ Tinybird ingests data into Materialized Views in blocks. This process is presented in the following diagram. Every time new data is ingested into the origin Data Source, the materialization process is triggered, which applies the transformation Pipe over the data ingested and saves the output of that Pipe, which is a partial result, in the Materialized View. <-figure-> ![Materialization process](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fmaster-materialized-views-1.png&w=3840&q=75) Data that was present in the origin Data Source prior to the Materialized View creation is inserted into the destination Materialized View through a populate operation. ### Regular materialization¶ Materialized Views in Tinybird are incremental and triggered upon ingestion. From the moment it's created, new rows are inserted into the Materialized View. If an insert is too big, it's processed in blocks. Because materializations are performed only over the new data being ingested, and not over the whole Data Source, avoid using the following operations: Window functions, `ORDER BY`, `Neighbor` , and `DISTINCT`. ### Populates¶ Populates move historical data from the origin Data Source into the Materialized View. There are two types: complete and partial. If you're populating from a Data Source with hundreds of millions of rows and doing complex transformations, you might face memory errors. In this type of situation, use partial populates. If you’re using the CLI, populates are triggered using `tb pipe populate` . You can add conditions using the `--sql-condition` flag, for example, `--sql-condition='date == toYYYYMM(now())'` . If your `sql_condition` includes any column present in the Data Source `engine_sorting_key` , the populate job should process less data. If you have constant ingest in your origin Data Source, see [Populates and streaming ingest](https://www.tinybird.co/docs/about:blank#populates-and-streaming-ingest). ## Aggregated Materialized Views¶ Sometimes a background process in Tinybird merges partial results saved in intermediate states during the ingestion process, compacting the results and reducing the number of rows. The following diagram illustrates this process in more detail through a simple example.
Let’s say an eCommerce store wants to materialize the count of units sold by product. It's ingesting a JSON object every minute, with a product represented by a capital letter and the quantity sold during the last minute. <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fmaster-materialized-views-2.png&w=3840&q=75) The store could define in their Pipe some simple SQL to sum the count of units sold per minute as data is ingested. The Pipe is applied over each new block of appended data, and the output is immediately saved in intermediate states into the Materialized View. Every 8 or 10 minutes, the background process merges the intermediate states, completing the aggregation across the entire Data Source. <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fmaster-materialized-views-3.png&w=3840&q=75) Because the store is working in real time, it can’t always wait for this background process to take place. When querying the Materialized View, they should use the proper merge combinator and `GROUP BY` clause in the query itself. ### Understanding State and Merge combinators for Aggregates¶ Tracking when the background process that merges aggregate results in a Materialized View has occurred isn't always practical. Because of this, you need to store intermediate states using the `-State` suffix. If you’re creating a Materialized View using the UI, this is done automatically. Here’s an example of using `-State` when defining the transformation Pipe to calculate these intermediate states: ##### USING the -State SUFFIX NODE Avg calculation SQL > SELECT day, city, avgState() avg FROM table GROUP BY day, city You also need to specifically define the appropriate schema for the Materialized View: ##### MV SCHEMA SCHEMA > day Date, city String, avg AggregateFunction(avg, Float64) ENGINE_SORTING_KEY day, city Finally, you need to retrieve the data using the `-Merge` suffix in your API Endpoint Node to make sure the merge process is completed for all data in the Materialized View: ##### USE MERGE SUFFIX IN ENDPOINT NODE NODE endpoint SQL > % SELECT day, city, avgMerge(avg) as avg FROM avg_table WHERE day > {{Date(start_date)}} GROUP BY day, city ## Understanding the Materialized View parameters¶ When you create a Materialized View in the UI, Tinybird automatically recommends the best parameters for most use cases. Still, it's useful to understand these parameters for more complex use cases. ### Sorting Key¶ The Sorting Key defines how data is sorted and is critical for great performance when filtering. Choose the order of your sorting keys depending on how you're going to query them. Here are a few examples for a simple Materialized View containing `day`, `city` , and `avg` columns: - You want to query the average for all cities on a particular day: the `day` column should be the first sorting key. - You want the average over the last month for a particular city: the `city` column should be the first sorting key. For Materialized Views containing aggregations, every column in the `GROUP BY` statement has to be in the sorting keys, and only those columns can be sorting keys. For non-aggregated Materialized Views, you can select other columns if they fit better for your use case, but we don't recommend adding too many. You get only a negligible performance boost after the fourth sorting key column. ### Partition by¶ A partition is a logical combination of records by a given criterion. Usually you don't need a partition key, or a partition by month is enough.
Tinybird guesses the best partition key if your materialization query has a Date or DateTime column. If there aren't any Date columns, Tinybird doesn't set a partition key. Having no partition is better than having the wrong partition. If you're comfortable with partitions and you want to group records by another criterion, you can switch to the advanced tab and add your custom partition code. ### Time To Live (TTL)¶ If you have certain lifetime requirements on the data in your Materialized Views, you can specify a Time To Live (TTL) parameter when creating a Materialized View. An example of a TTL requirement is satisfying GDPR regulations. TTLs can also be useful if you only intend to query a brief history of the data. For example, if you always and only query for data from within the last week, you can set a TTL of 7 days. When a TTL is set, all rows older than the TTL are removed from the Materialized View. ### Advanced tab (UI)¶ Most of the time, the defaults recommended by Tinybird are the best parameters to use. Occasionally, however, you might need to tweak these parameters. For this, you can use the **Advanced** tab in Tinybird, where you can write code that's passed to the **View** creation. You need ClickHouse® expertise to modify these parameters. If you aren't sure, stick with the defaults. ## Populates and streaming ingest¶ A populate is the operation of moving data that was present in the origin Data Source before the creation of the Materialized View. The following diagram illustrates the process. At `t1` , the Materialized View is created, so new rows arriving in the origin Data Source are processed into the Materialized View. To move the data from `t0` to `t1` , launch a populate, either manually or when defining the Materialized View, at time `t2`. All the data that arrives between `t1` and `t2` might be materialized twice: once due to the regular materialization process, at ingest time, and the other one due to the populate process. <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fpopulate-duplicates-data.png&w=3840&q=75) When you don’t have streaming ingest in the origin Data Source, it's usually a safe operation, as long as no new data arrives while the populate is running. ### Backfill strategies¶ Consider one of the following strategies for backfilling data. #### Two Materialized View Pipes¶ Use a timestamp in the near future to split real-time ingest and backfill. Create the regular MV with a `WHERE` clause specifying that materialized data is newer than a certain timestamp in the future. For example, `WHERE timestamp >= '2024-01-31 00:00:00'`: ##### realtime materialized.pipe NODE node SQL > % SELECT (...) FROM origin_ds WHERE timestamp >= '2024-01-31 00:00:00' TYPE materialized DATASOURCE mv Wait until the desired timestamp has passed, and create the backfill Materialized View Pipe with a `WHERE` clause for data before the specified timestamp. No new data is processed, as the condition can't be met. ##### populate.pipe NODE node SQL > % SELECT (...) FROM origin_ds WHERE timestamp < '2024-01-31 00:00:00' TYPE materialized DATASOURCE mv Finally, because it's now safe, push the backfill Pipe with the `--populate` flag. #### Use Copy Pipes¶ Depending on the transformations, Copy Pipes can substitute the populate for historical data. See [Backfill strategies](https://www.tinybird.co/docs/docs/production/backfill-strategies). These tips only apply for streaming ingest. With batch ingest, or being able to pause ingest, populates are totally safe.
## Use the same alias in SELECT and GROUP BY¶ If you use an alias in the `SELECT` clause, you must reuse the same alias in the `GROUP BY`. Take the following query as an example: ##### Different alias in SELECT and GROUP BY SELECT key, multiIf(value = 0, 'isZero', 'isNotZero') as zero, sum(amount) as amount FROM ds GROUP BY key, value The previous query results in the following error: `Column 'value' is present in the GROUP BY but not in the SELECT clause` To fix this, use the same alias in the `GROUP BY`: ##### GOOD: same alias in SELECT and GROUP BY SELECT key, multiIf(value = 0, 'isZero', 'isNotZero') as zero, sum(amount) as amount FROM ds GROUP BY key, zero ## Don't use nested GROUP BYs¶ Don't use nested `GROUP BY` clauses in the Pipe that creates a Materialized View. While nested aggregations are possible, Materialized Views are processed in independent blocks, and this might yield unexpected results. Tinybird restricts these behaviors and throws an error when they're detected to avoid inaccurate results. Consider the following query with nested `GROUP BY` clauses: ##### Nested GROUP BY in Pipe SELECT product, count() as c FROM ( SELECT key, product, count() as orders FROM ds GROUP BY key, product ) GROUP BY product The previous query throws the following error: Columns 'key, product' are present in the GROUP BY but not in the SELECT clause To fix this, make sure you don't nest `GROUP BY` clauses: ##### Single GROUP BY SELECT key, product, count() as orders FROM ds GROUP BY key, product ## Avoid big scans¶ Avoid big scans in Materialized Views. When using JOINs, do them with a subquery to the Data Source you join to, not the whole Data Source. Tinybird is not affiliated with, associated with, or sponsored by ClickHouse, Inc. ClickHouse® is a registered trademark of ClickHouse, Inc. --- URL: https://www.tinybird.co/docs/publish/materialized-views/example-mv-cli Last update: 2024-11-13T15:42:17.000Z Content: --- title: "Example of Materialized View (CLI) · Tinybird Docs" theme-color: "#171612" description: "The following example shows how to create a Materialized View using Tinybird CLI." --- # Example of Materialized View (CLI)¶ Consider an `events` Data Source which, for each action performed in an ecommerce website, stores a timestamp, the user that performed the action, the product, which type of action - `buy`, `add to cart`, `view` , and so on - and a JSON column containing some metadata, such as the price. The `events` Data Source is expected to store billions of rows per month. Its data schema is as follows: ##### DEFINITION OF THE EVENTS.DATASOURCE FILE SCHEMA > `date` DateTime, `product_id` String, `user_id` Int64, `event` String, `extra_data` String ENGINE "MergeTree" ENGINE_PARTITION_KEY "toYear(date)" ENGINE_SORTING_KEY "date, cityHash64(extra_data)" ENGINE_SAMPLING_KEY "cityHash64(extra_data)" You want to publish an API Endpoint calculating the top 10 products in terms of sales for a date range ranked by total amount sold. Here's where Materialized Views can help you. ## Materialize the results¶ After doing the desired transformations, set the `TYPE` parameter to `materialized` and add the name of the Data Source, which materializes the results. 
##### DEFINITION OF THE TOP PRODUCT PER\_DAY.PIPE NODE only_buy_events DESCRIPTION > filters all the buy events SQL > SELECT toDate(date) AS date, product_id, JSONExtractFloat(extra_data, 'price') AS price FROM events WHERE event = 'buy' NODE top_per_day SQL > SELECT date, topKState(10)(product_id) AS top_10, sumState(price) AS total_sales FROM only_buy_events GROUP BY date TYPE materialized DATASOURCE top_products_view Do the rest in the Data Source schema definition for the Materialized View, named `top_products_view`: ##### DEFINITION OF THE TOP PRODUCTS VIEW.DATASOURCE FILE SCHEMA > `date` Date, `top_10` AggregateFunction(topK(10), String), `total_sales` AggregateFunction(sum, Float64) ENGINE "AggregatingMergeTree" ENGINE_SORTING_KEY "date" The destination Data Source uses an [AggregatingMergeTree](https://www.tinybird.co/docs/docs/) engine, which for each `date` stores the corresponding `AggregateFunction` for the top 10 products and the total sales. Having the data precalculated as it gets ingested makes the API Endpoint run in real time, no matter the number of rows in the `events` Data Source. As for the Pipe used to build the API Endpoint, `top_products_agg` , it's as follows: ##### DEFINITION OF THE TOP PRODUCTS PER DAY PIPE NODE top_products_day SQL > SELECT date, topKMerge(10)(top_10) AS top_10, sumMerge(total_sales) AS total_sales FROM dev__top_products_view GROUP BY date When preaggregating, the Aggregate Function uses the mode `State` , while when getting the calculation it makes use of `Merge`. ## Push to Tinybird¶ Once it's done, push everything to your Tinybird account: ##### PUSH YOUR PIPES AND DATA SOURCES USING THE CLI tb push datasources/top_products_view.datasource tb push pipes/top_product_per_day.pipe --populate tb push endpoints/top_products_endpoint.pipe When pushing the `top_product_per_day.pipe` , use the `--populate` flag. This causes the transformation to run in a job, and the Materialized View `top_products_view` to be populated. You can repopulate Materialized Views at any moment: ##### Command to force populate the materialized view tb push pipes/top_product_per_day.pipe --populate --force --- URL: https://www.tinybird.co/docs/publish/materialized-views/example-mv-ui Last update: 2024-11-06T17:38:37.000Z Content: --- title: "Example of Materialized View · Tinybird Docs" theme-color: "#171612" description: "The following example shows how to create a Materialized View in Tinybird." --- # Example of Materialized View¶ The following example shows how to create a Materialized View in Tinybird. For this example, the question to answer is: "What was the average purchase price of all products in each city on a particular day?" ## Get baseline performance¶ First, you need to check the baseline performance before creating an Endpoint on top of the original ingested Data Source. ### Filtering and extracting¶ Create a first node to filter the data to include only `buy` events. Simultaneously, normalize event timestamps to rounded days and extract `city` and `price` data from the `extra_data` column containing JSON. 
##### NODE 1: FILTERING & EXTRACTING SELECT toDate(date) day, JSONExtractString(extra_data, 'city') as city, JSONExtractFloat(extra_data, 'price') as price FROM events WHERE event = 'buy' ### Averaging¶ Next, create a node to aggregate data and get the average purchase price by city per day: ##### Node 2: Averaging SELECT day, city, avg(price) as avg FROM only_buy_events_per_city GROUP BY day, city ### Creating a parameterized API Endpoint¶ Finally, create a node that you can publish as an API Endpoint, adding a parameter to filter results to a particular day: ##### Node 3: Creating a parameterized API Endpoint % SELECT * FROM avg_buy_per_day_and_city WHERE day = {{Date(day, '2020-09-09')}} ## Create the Materialized View¶ A Materialized View is formed by a Pipe that ends in the creation of a new Data Source instead of an Endpoint. Start by duplicating the existing Pipe: 1. Duplicate the `events_pipe` so that you get a Pipe with the same Nodes as the baseline. 2. Rename the Pipe to `events_pipe_mv` . In this case you're going to materialize the second node in the Pipe, because it's the one performing the aggregation. The third node simply provides you with a filter to create a parameterized Endpoint. You can't use [query parameters](https://www.tinybird.co/docs/docs/query/query-parameters) in nodes that are published as Materialized Views. To create the Materialized View: 1. Select the node options. 2. Select** Create a Materialized View from this Node** . 3. Update the** View** settings as required. 4. Select** Create Materialized View** . <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fmaterialized-views-2.gif&w=3840&q=75) Your Materialized View has been created as a new Data Source. By default, the name that Tinybird gives the Data Source is the name of the materialized Pipe Node appended with `_mv` , in this case `avg_buy_per_day_and_city_mv`. Suffix the names of all your transformation Pipes and Materialized View Data Sources with `_mv` , or another common identifier. This example uses a new Data Source. You can also select an existing Data Source as the destination for the Materialized View, but it must have the same schema as the Materialized View output. If both schemas match, Tinybird offers that Data Source as an option that you can select when you're creating a Materialized View. ### Populating Existing Data¶ When Tinybird creates a Materialized View, it initially only populates a partial set of data from the original Data Source. This allows you to quickly validate the results of the Materialized View. Once you have validated with the partial dataset, you can populate the Materialized View with all the existing data in the original Data Source. To do so, select **Populate With All Data**. You now have a Materialized View Data Source that you can use to query against in your Pipes. ### Testing performance improvements¶ To test how the Materialized View has improved the performance of the API Endpoint, return to the original Pipe. In the original Pipe, do the following: 1. Copy the SQL from the original API Endpoint Node, `avg_buy` . 2. Create a new transformation Node, called `avg_buy_mv` . 3. Paste the query from the original API Endpoint Node into your new Node. 4. Update the query to select from your new Materialized View, `avg_buy_per_day_and_city_mv` , as shown in the sketch below.
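Assuming the Materialized View stores the aggregation in a column called `avg` (the `-State` intermediate state described in the best practices guide), the updated Node could look like the following sketch; the `avgMerge` rewrite it uses is explained below.

%
SELECT
    day,
    city,
    avgMerge(avg) AS avg -- reaggregates the intermediate states stored in the Materialized View
FROM avg_buy_per_day_and_city_mv
WHERE day = {{Date(day, '2020-09-09')}}
GROUP BY day, city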
<-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fmaterialized-views-3.gif&w=3840&q=75) Because this query is an aggregation, you need to rewrite it: data in Materialized Views in Tinybird exists in intermediate states. As new data is ingested, the data in the Materialized View gets appended in blocks of partial results. A background process periodically merges the appended partial results and saves them in the Materialized View. Because you are processing data in real time, you might not be able to wait for the background process to complete. To account for this, reaggregate in the API Endpoint query using the `-Merge` combinator. This example uses an `avg` aggregation, so you need to use `avgMerge` to compact the results in the Materialized View. <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fmaterialized-views-4.gif&w=3840&q=75) When you run the modified query, you get the same results as you got when you ran the final node against the original Data Source. This time, however, the performance has improved significantly. <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fmaterialized-views-5.png&w=3840&q=75) With the Materialized View, you get the same results, but you process less data, twice as fast, at a fraction of the cost. <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fmaterialized-views-6.png&w=3840&q=75) ### Pointing the API Endpoint at the new node¶ Now that you've seen how much the performance of the API Endpoint query has improved by using a Materialized View, you can easily change which node the API Endpoint uses. Select the node dropdown, and then select the new node you created by querying the Materialized View. <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fmaterialized-views-7.gif&w=3840&q=75) This way, you improve the API Endpoint performance while retaining the original URL, so applications which call that API Endpoint see an immediate performance boost. --- URL: https://www.tinybird.co/docs/publish/materialized-views/overview Last update: 2024-11-13T15:42:17.000Z Content: --- title: "Materialized Views · Tinybird Docs" theme-color: "#171612" description: "Materialized Views process your data at ingest time to increase the performance of your queries and endpoints." --- # Materialized Views¶ A Materialized View is the continuous, streaming result of a Pipe saved as a new Data Source. As new data is ingested into the origin Data Source, the transformed results from the Pipe are continually inserted in the new Materialized View, which you can query as any other Data Source. Tinybird represents Materialized Views using the icon. Preprocessing data at ingest time reduces latency and cost-per-query, and can significantly improve the performance of your API Endpoints. For example, you can transform the data through SQL queries, using calculations such as counts, sums, averages, or arrays, or transformations like string manipulations or joins. The resulting Materialized View acts as a Data Source you can query or publish. Typical use cases of Materialized Views include: - Aggregating, sorting, or filtering data at ingest time. - Improving the speed of a query that's taking too much time to run. - Simplifying query development by automating common filters and aggregations. - Reducing the amount of data processed by a single query. - Changing an existing schema for a different use case. You can create a new Materialized View and populate it with all existing data without any cost.
On-going incremental writes to Materialized Views count towards your Tinybird usage. ## Create Materialized Views¶ To create a Materialized View using the Tinybird UI, follow these steps: 1. Write your Pipe using as many Nodes as needed. 2. Select the downward arrow (▽) next to** Create API Endpoint** and select** Create Materialized View** . 3. Select the node you want to use as output. 4. Edit the name of the destination Data Source. 5. Adjust the Engine Type, Sorting Keys, and so on. For a detailed example, see [Example of Materialized View](https://www.tinybird.co/docs/example-mv-ui). ### Error messages¶ With any Materialized View, Tinybird runs a speed simulation to ensure that the Materialized View won't produce any lag. If you're getting an error in Tinybird that your query is not compatible with real-time ingestion, review your Materialized View query setup. Review [the 5 rules for faster queries](https://www.tinybird.co/docs/docs/query/sql-best-practices) and keep the following principles in mind: 1. Avoid doing huge joins without filtering. 2. Use `GROUP BY` before filtering. 3. Remember that array `JOIN`s are slow. 4. Filter the right side of JOINs to speed up Materialized Views. If you're batching, especially when ingesting from Kinesis, consider decreasing the amount of data you batch. ## Create Materialized Views (CLI)¶ Consider an `origin` data source, for example `my_origin.datasource` , like the following: ##### Origin data source SCHEMA > `id` Int16, `local_date` Date, `name` String, `count` Int64 You might want to create an optimized version of the Data Source that preaggregates `count` for each ID. To do this, create a new Data Source that uses a `SimpleAggregateFunction` as a Materialized View. First, define the `destination` data source, for example `my_destination.datasource`: ##### Destination data source SCHEMA > `id` Int16, `local_date` Date, `name` String, `total_count` SimpleAggregateFunction(sum, UInt64) ENGINE "AggregatingMergeTree" ENGINE_PARTITION_KEY "toYYYYMM(local_date)" ENGINE_SORTING_KEY "local_date,id" Write a transformation Pipe, for example `my_transformation.pipe`: ##### Transformation Pipe NODE transformation_node SQL > SELECT id, local_date, name, sum(count) as total_count FROM my_origin GROUP BY id, local_date, name TYPE materialized DATASOURCE my_destination Once you have the origin and destination Data Sources defined and the transformation Pipe, you can push them: ##### Push the Materialized Views tb push my_origin.datasource tb push my_destination.datasource tb push my_transformation.pipe --populate Any time you ingest data into `my_origin` , the data in `my_destination` is automatically updated. For a detailed example, see [Example of Materialized View (CLI)](https://www.tinybird.co/docs/example-mv-cli). ### Guided process using tb materialize¶ Alternatively, you can use the `tb materialize` command to generate the target .datasource file needed to push a new Materialized View. The goal of the command is to guide you through all the needed steps to create a Materialized View. Given a Pipe, `tb materialize`: 1. Asks you which Node of the Pipe you want to materialize. By default, it selects the last one in the Pipe. If there's only one, it's automatically selected without asking you. From the selected query, the command guesses the best parameters for the following steps. 2. It warns you of any errors in the query that prevent it from materializing. If everything is correct, it continues. 3.
It creates the target Data Source file that receives the results of the materialization, setting default engine parameters. If you are materializing an aggregation you should make sure the `ENGINE_SORTING_KEY` columns in the .datasource file are in the right order you are going to filter the table. 4. It modifies the query to set up the materialization settings and pushes the Pipe to create the materialization. You can skip the Pipe checks if needed as well. 5. It asks you if you want to populate the Materialized View with existing data. If you select to populate, it asks you if you want to use a subset of the data or fully populate with all existing data. 6. It creates a backup file of the Pipe adding the `_bak` suffix to the file extension. It completes the aggregate functions with `-State` combinators where needed and adds the target Data Source name. The backup file is preserved in case you want to recover the original query. The command generates and modifies the files involved in the materialization. If you run into an error or you need to modify something in the materialization, you can reuse the files as a better starting point. ### Force populate Materialized Views¶ Sometimes you might want to force populate a Materialized View, most likely because you changed the transformation in the Pipe and you want the data from the origin Data Source to be reingested. You can do this using `tb push` and the `--force` and `--populate` flags: ##### Populate a Materialized View tb push my_may_view_pipe.pipe --force --populate The response contains a [Jobs API](https://www.tinybird.co/docs/docs/api-reference/jobs-api) `job_url` that you can use to check progress and status of the job. ## Query Materialized Views¶ If the type of engine in your Data Source is `MergeTree` and you're not doing aggregations, you can query a Materialized View as a standard Data Source. If the engine is `AggregatingMergeTree`, `SummingMergeTree` or other special engine, and you have functions in your Pipe with the `-State` modifier, use the `-Merge` modifier, or `max()` and group by the Sorting Key columns. See [Best practices for Materialized Views](https://www.tinybird.co/docs/docs/publish/materialized-views/best-practices). For Deduplication use cases with `ReplacingMergeTree` , see [Deduplication strategies](https://www.tinybird.co/docs/docs/guides/querying-data/deduplication-strategies#use-the-replacingmergetree-engine). ## Limitations¶ Materialized Views work as insert triggers, which means a delete or truncate operation on your original Data Source doesn't affect the related Materialized Views. As transformation and ingestion in the Materialized View is done on each block of inserted data in the original Data Source, some operations such as `GROUP BY`, `ORDER BY`, `DISTINCT` and `LIMIT` might need a specific `engine` , such as `AggregatingMergeTree` or `SummingMergeTree` , which can handle data aggregations. The Data Source resulting from a Materialized View generated using `JOIN` is automatically updated only if and when a new operation is performed over the Data Source in the `FROM`. You can't create Materialized Views that depend on the `UNION` of several Data Sources. ## Considerations on populates¶ As described in [Best practices for Materialized Views](https://www.tinybird.co/docs/docs/publish/materialized-views/best-practices) , populates are the process to move the data that was already in the original Data Source through to the new Materialized View. 
If you have a continuous or streaming ingestion into the original Data Source, the populate might produce duplicated rows in the Materialized View. The populate runs as a separate job that goes partition by partition moving the existing data into the Materialized View. At the same time, the Materialized View is already automatically receiving new rows. There may be an overlap where the populate job moves a partition that includes rows that were already ingested into the Materialized View. To handle this scenario, see [Backfill strategies](https://www.tinybird.co/docs/docs/production/backfill-strategies). ## Next steps¶ - Learn how to make the most of Materialized Views. See[ Best Practices](https://www.tinybird.co/docs/best-practices) . - Review[ the 5 rules for faster queries](https://www.tinybird.co/docs/docs/query/sql-best-practices) . --- URL: https://www.tinybird.co/docs/publish/overview Last update: 2024-11-07T09:59:13.000Z Content: --- title: "Publish data · Tinybird Docs" theme-color: "#171612" description: "Overview of publishing data using Tinybird" --- # Publish your data¶ Whatever you need from your data, you can achieve it using Tinybird. Publish it as a queryable [API Endpoint](https://www.tinybird.co/docs/docs/publish/api-endpoints/overview) , a [Materialized View](https://www.tinybird.co/docs/docs/publish/materialized-views/overview) , or an advanced type of Tinybird Pipe (either a [Copy Pipe](https://www.tinybird.co/docs/docs/publish/copy-pipes) or a [Sink Pipe](https://www.tinybird.co/docs/docs/publish/s3-sink#sink-pipes) ). If you're new to Tinybird and looking to learn a simple flow of ingest data > query it > publish an API Endpoint, check out our [quick start](https://www.tinybird.co/docs/docs/quick-start)! --- URL: https://www.tinybird.co/docs/publish/s3-sink Last update: 2024-11-08T11:23:54.000Z Content: --- title: "S3 Sink · Tinybird Docs" theme-color: "#171612" description: "Offload data to S3 on a batch-based schedule using Tinybird's fully managed S3 Sink Connector." --- # S3 Sink¶ Tinybird's S3 Sink allows you to offload data to Amazon S3, either on a pre-defined schedule or on demand. It's good for a variety of different scenarios where Amazon S3 is the common ground, for example: - You're building a platform on top of Tinybird, and need to send data extracts to your clients on a regular basis. - You want to export new records to Amazon S3 every day, so you can load them into Snowflake to run ML recommendation jobs. - You need to share the data you have in Tinybird with other systems in your organization, in bulk. The Tinybird S3 Sink feature is available for Professional and Enterprise plans (see ["Tinybird plans"](https://www.tinybird.co/docs/docs/plans) ). If you are on a Build plan but want to access this feature, you can upgrade to Professional directly from your account Settings, or contact us at [support@tinybird.co](https://www.tinybird.co/docs/mailto:support@tinybird.co). Tinybird represents Sinks using the icon. ## About Tinybird's S3 Sink¶ ### How it works¶ Tinybird's S3 Sink is fully managed and requires no additional tooling. You create a new connection to an Amazon S3 bucket, then choose a Pipe whose result gets written to Amazon S3. Tinybird provides you with complete observability and control over the executions, resulting files, their size, data transfer, and more. ### Why S3?¶ Amazon S3 is very commonly used in data. Almost every service offers a way to import data from Amazon S3 (or S3-compatible storage). 
This Sink Connector enables you to use Tinybird's analytical capabilities to transform your data and provide it for onward use via Amazon S3 files. ### Sink Pipes¶ The Sink connector is built on Tinybird's Sink Pipes, an extension of the [Pipes](https://www.tinybird.co/docs/docs/concepts/pipes) concept, similar to [Copy Pipes](https://www.tinybird.co/docs/docs/publish/copy-pipes) . Sink Pipes allow you to capture the result of a Pipe at a moment in time, and store the output. Currently, Amazon S3 is the only service Tinybird's Sink Pipes support. Sink Pipes can be run on a schedule, or executed on demand. ### Supported regions¶ The Tinybird S3 Sink feature only supports exporting data to the following AWS regions: - `us-east-*` - `us-west-*` - `eu-central-*` - `eu-west-*` - `eu-south-*` - `eu-north-*` ### Prerequisites¶ To use the Tinybird S3 Sink feature, you should be familiar with Amazon S3 buckets and have the necessary permissions to set up a new policy and role in AWS. ### Scheduling considerations¶ The schedule applied to a [Sink Pipe](https://www.tinybird.co/docs/about:blank#sink-pipes) doesn't guarantee that the underlying job executes immediately at the configured time. The job is placed into a job queue when the configured time elapses. It is possible that, if the queue is busy, the job could be delayed and executed after the scheduled time. To reduce the chances of a busy queue affecting your Sink Pipe execution schedule, we recommend distributing the jobs over a wider period of time rather than grouped close together. For Enterprise customers, these settings can be customized. Reach out to your Customer Success team or email us at [support@tinybird.co](https://www.tinybird.co/docs/mailto:support@tinybird.co). ### Query parameters¶ You can add [query parameters](https://www.tinybird.co/docs/docs/query/query-parameters) to your Sink Pipes, the same way you do in API Endpoints or Copy Pipes. - For on-demand executions, you can set parameters when you trigger the Sink Pipe to whatever values you wish. - For scheduled executions, the default values for the parameters will be used when the Sink Pipe runs. ## Set up¶ The setup process involves configuring both Tinybird and AWS: 1. Create your Pipe and promote it to Sink Pipe 2. Create the AWS S3 connection 3. Choose the scheduling options 4. Configure destination path and file names 5. Preview and trigger your new Sink Pipe ### Using the UI¶ #### 1. Create a Pipe and promote it to Sink Pipe¶ In the Tinybird UI, create a Pipe and write the query that produces the result you want to export. In the top right "Create API Endpoint" menu, select "Create Sink". In the modal, choose the destination (Amazon S3), and enter the bucket name and region. Follow the step-by-step process on the modal to guide you through the AWS setup steps, or use the docs below. #### 2. Create the AWS S3 Connection¶ ##### 2.1. Create the S3 access policy First, create an IAM policy that grants the IAM role permissions to write to S3. Open the AWS console and navigate to IAM > Policies, then select “Create Policy”: <-figure-> ![image](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fs3-sink-step-1-access-policy.png&w=3840&q=75) On the “Specify Permissions” screen, select “JSON” and paste the policy generated in the UI by clicking on the Copy icon. 
It'll look something like this: { "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "s3:GetObject", "s3:PutObject", "s3:PutObjectAcl" ], "Resource": "arn:aws:s3::://*" }, { "Effect": "Allow", "Action": [ "s3:GetBucketLocation", "s3:ListBucket" ], "Resource": "arn:aws:s3:::" } ] } Select “Next”, add a memorable name in the following dialog box (you’ll need it later!), and select “Create Policy”. ##### 2.2. Create the IAM role In the AWS console, navigate to IAM > Roles and select “Create Role”: <-figure-> ![image](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fs3-sink-step-3-roles.png&w=3840&q=75) On the “Select trusted entity” dialog box, select the "Custom trust policy” option. Copy the generated JSON and paste it into the Tinybird UI modal. Select "Next". On the “Add permissions” screen, find the policy for S3 access you just created and tick the checkbox to the left of it. Select "Next" and give it a meaningful name and description. Confirm that the trusted entities and permissions granted are the expected ones, and select "Create Role". You’ll need the role’s ARN (Amazon Resource Name) in order to create the connection in the next step. To save you having to come back and look for it, go to IAM > Roles and use the search box to find the role you just created. Select it to open more role details, including the role's ARN. Copy it down somewhere you can find it easily again. It'll look something like `arn:aws:iam::111111111111:role/my-awesome-role`. Return to Tinybird's UI and enter the role ARN and Connection name in the modal. The Connection to AWS S3 is now created in Tinybird, and can be reused in multiple Sinks. #### 3. Choose the scheduling options¶ You can configure your Sink to run "on demand" (meaning you'll need to manually trigger it) or using a cron expression, so it runs automatically when needed. #### 4. Configure destination path and file names¶ Enter the bucket URI where files will be generated (you can use subfolders), and the file name template. When generating multiple files, the Sink creates them using this template. You have multiple ways to configure this - see the [File template](https://www.tinybird.co/docs/about:blank#file-template) section below. #### 5. Preview and create¶ The final step is to check and confirm that the preview matches what you expect. Congratulations! You've created your first Sink. Trigger it manually using the "Run Sink now" option in the top right menu, or wait for the next scheduled execution. When triggering a Sink Pipe you have the option of overriding several of its settings, like format or compression. Refer to the [Sink Pipes API spec](https://www.tinybird.co/docs/docs/api-reference/sink-pipes-api) for the full list of parameters. Once the Sink Pipe is triggered, it creates a standard Tinybird job that can be followed via the `v0/jobs` API. ### Using the CLI¶ #### 1. Create the AWS S3 Connection¶ To create a connection for an S3 Sink Pipe you need to use a CLI version equal to or higher than 3.5.0. To start: 1. Run the `tb connection create s3_iamrole` command. 2. Copy the suggested policy and replace the two bucket placeholders with your bucket name. 3. Log into your AWS Console. 4. Create a new policy in AWS IAM > Policies using the copied text. In the next step, you'll need the role's ARN (Amazon Resource Name) to create the connection. Go to IAM > Roles and use the search box to find the role you just created. Select it to open more role details, including the role's ARN.
Copy it and paste it into the CLI when requested. It'll look something like `arn:aws:iam::111111111111:role/my-awesome-role`. Then, you will need to type the region where the bucket is located and choose a name to identify your connection within Tinybird. Once you have completed all these inputs, Tinybird will check access to the bucket and create the connection with the connection name you selected. #### 2. Create S3 Sink Pipe¶ To create a Sink Pipe, create a regular .pipe and filter the data you want to export to your bucket in the SQL section as in any other Pipe. Then, specify the Pipe as a sink type and add the needed configuration. Your Pipe should have the following structure: NODE node_0 SQL > SELECT * FROM events WHERE time >= toStartOfMinute(now()) - interval 30 minute TYPE sink EXPORT_SERVICE s3_iamrole EXPORT_CONNECTION_NAME "test_s3" EXPORT_BUCKET_URI "s3://tinybird-sinks" EXPORT_FILE_TEMPLATE "daily_prices" # Supports partitioning EXPORT_SCHEDULE "*/5 * * * *" # Optional EXPORT_FORMAT "csv" EXPORT_COMPRESSION "gz" # Optional **Sink Pipe parameters** See the [Sink Pipe parameter docs](https://www.tinybird.co/docs/docs/cli/datafiles/pipe-files#sink-pipe) for more information. For this step, your details will be: | Key | Type | Description | | --- | --- | --- | | EXPORT_CONNECTION_NAME | string | Required. The connection name to the destination service. This is the connection created in Step 1. | | EXPORT_BUCKET_URI | string | Required. The path to the destination bucket. Example: `s3://tinybird-export` | | EXPORT_FILE_TEMPLATE | string | Required. The target file name. Can use parameters to dynamically name and partition the files. See the File template section below. Example: `daily_prices_{customer_id}` | | EXPORT_FORMAT | string | Optional. The output format of the file. Values: CSV, NDJSON, Parquet. Default value: CSV | | EXPORT_COMPRESSION | string | Optional. Accepted values: `none` , `gz` for gzip, `br` for brotli, `xz` for LZMA, `zst` for zstd. Default: `none` | | EXPORT_SCHEDULE | string | A crontab expression that sets the frequency of the Sink operation, or the @on-demand string. | Once ready, push the datafile to your Workspace using `tb push` (or `tb deploy` if you are using [version control](https://www.tinybird.co/docs/docs/production/overview) ) to create the Sink Pipe. ## File template¶ The export process lets you partition the result into different files, so you can organize your data and get smaller files. The partitioning is defined in the file template and based on the values of columns in the result set. ### Partition by column¶ Add a template variable like `{COLUMN_NAME}` to the file name. For instance, let’s set the file template as `invoice_summary_{customer_id}.csv`.
Imagine your query schema and result for an export is like this: | customer_id | invoice_id | amount | | --- | --- | --- | | ACME | INV20230608 | 23.45 | | ACME | 12345INV | 12.3 | | GLOBEX | INV-ABC-789 | 35.34 | | OSCORP | INVOICE2023-06-08 | 57 | | ACME | INV-XYZ-98765 | 23.16 | | OSCORP | INV210608-001 | 62.23 | | GLOBEX | 987INV654 | 36.23 | With the given file template `invoice_summary_{customer_id}.csv` you’d get 3 files: `invoice_summary_ACME.csv` | customer_id | invoice_id | amount | | --- | --- | --- | | ACME | INV20230608 | 23.45 | | ACME | 12345INV | 12.3 | | ACME | INV-XYZ-98765 | 23.16 | `invoice_summary_OSCORP.csv` | customer_id | invoice_id | amount | | --- | --- | --- | | OSCORP | INVOICE2023-06-08 | 57 | | OSCORP | INV210608-001 | 62.23 | `invoice_summary_GLOBEX.csv` | customer_id | invoice_id | amount | | --- | --- | --- | | GLOBEX | INV-ABC-789 | 35.34 | | GLOBEX | 987INV654 | 36.23 | ### Values format¶ In the case of DateTime columns, it can be dangerous to partition just by the column. Why? Because you could end up with as many files as seconds, as they’re the different values for a DateTime column. In an hour, that’s potentially 3600 files. To help partition in a sensible way, you can add a format string to the column name using the following placeholders: | Placeholder | Description | Example | | --- | --- | --- | | %Y | Year | 2023 | | %m | Month as an integer number (01-12) | 06 | | %d | Day of the month, zero-padded (01-31) | 07 | | %H | Hour in 24h format (00-23) | 14 | | %i | Minute (00-59) | 45 | For instance, for a result like this: | timestamp | invoice_id | amount | | --- | --- | --- | | 2023-07-07 09:07:05 | INV20230608 | 23.45 | | 2023-07-07 09:07:01 | 12345INV | 12.3 | | 2023-07-07 09:06:45 | INV-ABC-789 | 35.34 | | 2023-07-07 09:05:35 | INVOICE2023-06-08 | 57 | | 2023-07-06 23:14:05 | INV-XYZ-98765 | 23.16 | | 2023-07-06 23:14:02 | INV210608-001 | 62.23 | | 2023-07-06 23:10:55 | 987INV654 | 36.23 | Note that all 7 events have different times in the column timestamp. Using a file template like `invoices_{timestamp}` would create 7 different files. If you were interested in writing one file per hour, you could use a file template like `invoices_{timestamp, ‘%Y%m%d-%H’}` . You'd then get only two files for that dataset: `invoices_20230707-09.csv` | timestamp | invoice_id | amount | | --- | --- | --- | | 2023-07-07 09:07:05 | INV20230608 | 23.45 | | 2023-07-07 09:07:01 | 12345INV | 12.3 | | 2023-07-07 09:06:45 | INV-ABC-789 | 35.34 | | 2023-07-07 09:05:35 | INVOICE2023-06-08 | 57 | `invoices_20230706-23.csv` | timestamp | invoice_id | amount | | --- | --- | --- | | 2023-07-06 23:14:05 | INV-XYZ-98765 | 23.16 | | 2023-07-06 23:14:02 | INV210608-001 | 62.23 | | 2023-07-06 23:10:55 | 987INV654 | 36.23 | ### By number of files¶ You also have the option to write the result into X files. Instead of using a column name, use an integer between brackets. Example: `invoice_summary.{8}.csv` This is convenient to reduce the file size of the result, especially when the files are meant to be consumed by other services, like Snowflake where uploading big files is discouraged. The results are written in random order. This means that the final result rows would be written in X files, but you can’t count the specific order of the result. There are a maximum of 16 files. ### Combining different partitions¶ It’s possible to add more than one partitioning parameter in the file template. 
This is useful, for instance, when you do a daily dump of data, but want to export one file per hour. Setting the file template as `invoices/dt={timestamp, ‘%Y-%m-%d’}/H{timestamp, ‘%H’}.csv` would create the following file structure in different days and executions: Invoices ├── dt=2023-07-07 │ └── H23.csv │ └── H22.csv │ └── H21.csv │ └── ... ├── dt=2023-07-06 │ └── H23.csv │ └── H22.csv You can also mix column names and number of files. For instance, setting the file template as `invoices/{customer_id}/dump_{4}.csv` would create the following file structure in different days and executions: Invoices ├── ACME │ └── dump_0.csv │ └── dump_1.csv │ └── dump_2.csv │ └── dump_3.csv ├── OSCORP │ └── dump_0.csv │ └── dump_1.csv │ └── dump_2.csv │ └── dump_3.csv Be careful with excessive partitioning. Take into consideration that the write process will create as many files as there are combinations of values of the partitioning columns for a given result set. ## Iterating a Sink Pipe¶ Sink Pipes can be iterated using [version control](https://www.tinybird.co/docs/docs/production/overview) just like any other resource in your Data Project. However, you need to understand how connections work in Branches and deployments in order to select the appropriate strategy for your desired changes. **By default, Branches don't execute recurrent jobs on creation** , like the scheduled ones for Sink Pipes (they continue executing in Production as usual). To iterate a Sink Pipe, create a new one (or recreate the existing one) with the desired configuration. The new Sink Pipe will start executing from the Branch too (without affecting the unchanged production resource). It will use the new configuration and export the new files into a **branch-specific folder** `my_bucket//branch_/` (the `` is optional). This means you can test the changes without mixing the test resource with your production exports. When you deploy that Branch, the specific folder in the path is automatically ignored and production continues to point to `my_bucket/prefix/` or the new path you changed. Take into account that, for now, while you can change the Sink Pipe configuration using version control, new connections to S3 must be created directly in the Workspace. There is an example of how to create a Sink Pipe to S3 with version control [here](https://github.com/tinybirdco/use-case-examples/tree/main/create_pipe_sink). ## Observability¶ Sink Pipe operations are logged in the [tinybird.sinks_ops_log](https://www.tinybird.co/docs/docs/monitoring/service-datasources#tinybird-sinks-ops-log) Service Data Source. Data Transfer incurred by Sink Pipes is tracked in the [tinybird.data_transfer](https://www.tinybird.co/docs/docs/monitoring/service-datasources#tinybird-data-transfer) Service Data Source. ## Limits & quotas¶ Check the [limits page](https://www.tinybird.co/docs/docs/support/limits) for limits on ingestion, queries, API Endpoints, and more. ## Billing¶ Tinybird uses two metrics for billing Sink Pipes: Processed Data and Data Transfer. A Sink Pipe executes the Pipe’s query (Processed Data), and writes the result into a Bucket (Data Transfer). If the resulting files are compressed, Tinybird accounts for the compressed size. ### Processed Data¶ Any Processed Data incurred by a Sink Pipe is charged at the standard rate for your account. The Processed Data is already included in your plan, and counts towards your commitment.
If you're on an Enterprise plan, view your plan and commitment on the [Organizations](https://www.tinybird.co/docs/docs/monitoring/organizations) tab in the UI. ### Data Transfer¶ Data Transfer depends on your environment. There are two scenarios: - The destination bucket is in the** same** cloud provider and region as your Tinybird Workspace: $0.01 / GB - The destination bucket is in a** different** cloud provider or region as your Tinybird Workspace: $0.10 / GB ### Enterprise customers¶ We're including 50 GB for free for every Enterprise customer, so you can test the feature and validate your use case. After that, we're happy to set up a meeting to understand your use case and adjust your contract accordingly, to accommodate the necessary Data Transfer. ## Next steps¶ - Get familiar with the[ Service Data Source](https://www.tinybird.co/docs/docs/monitoring/service-datasources) and see what's going on in your account - Deep dive on Tinybird's[ Pipes concept](https://www.tinybird.co/docs/docs/concepts/pipes) --- URL: https://www.tinybird.co/docs/query/bi-connector Last update: 2024-11-06T17:38:37.000Z Content: --- title: "BI Connector · Tinybird Docs" theme-color: "#171612" description: "Learn about the BI Connector in Tinybird, including how to connect and best practices." --- # BI Connector¶ The BI Connector is a PostgreSQL-compatible interface to data in Tinybird. You can connect many of your favorite tools to Tinybird, such as Tableau, Apache SuperSet, or Grafana. All Data Sources and published Pipes created in Tinybird are available as standard PostgreSQL tables. You can query using standard PostgreSQL syntax and use any tool that implements the PostgreSQL protocol. Tinybird uses TLS 1.2 and higher for BI Connector connections. The BI Connector isn't active by default. Contact Tinybird support ( [support@tinybird.co](https://www.tinybird.co/docs/mailto:support@tinybird.co) ) to discuss if your use case is supported. After we activate the connector, Tinybird sends you all the connection details. ## Compatibility¶ The BI Connector has been successfully tested on the following solutions: - DBeaver - PowerBI - Tableau - Grafana - Metabase - Superset - Klipfolio - Anodot ## Connect to Tinybird¶ You can connect and access your database as you would with regular PostgreSQL. You can use your command-line tool, any graphical interface such as DBbeaver, or any other SQL client such as, for example, your favorite BI tool. For example: psql -U --host bi.us-east.tinybird.co --port 5432 -d After connecting to your database, in the public schema, if you list all the available views you get all the available Data Sources and endpoints. If you're using the PostgreSQL command line you can type `\dv` to get the list of views. If you list all the tables, a table per Data Source appears. The name of the table is the identifier of the Data Source. Don't use those IDs. Instead, find your tables with their user-friendly names in the views space. ## Best practices¶ ### 1. Avoid queries on exposed Pipes¶ Both your Pipes and Data Sources are exposed in the BI Connector as tables. Query against Data Sources tables only, and avoid using any Pipes tables. In general, queries on Data Sources tables are much faster & easier to optimize. ### 2. Use Materialized Views¶ With Tinybird you can build Pipes that materialize the result as a [Materialized View](https://www.tinybird.co/docs/docs/publish/materialized-views/overview) . Materialized Views appear in the BI Connector as tables. 
Rather than querying the raw data via the BI Connector, build your transformations, filters, aggregations, or joins in standard Tinybird Pipes to create a Materialized View. You can then use the BI Connector to read the pre-prepared Materialized View. For example, if your dashboard shows events aggregated by day, create a Materialized View with pre-aggregated data by day, and have your BI chart read this. ### 3. Use dedicated Materialized Views for different charts¶ If you have multiple charts or widgets consuming data, try to use Materialized Views that are specifically built for that chart or widget. This helps to avoid additional transformation operations. ### 4. Use efficient sorting keys¶ Data stored in Tinybird is sorted using a sorting key. This helps to efficiently find data that is relevant to your query. Use the appropriate sorting keys for your data according to the needs of your dashboard. For example, if your dashboard is filtering by time, you should have time in the table's sorting key. ### 5. Avoid column mappings¶ Avoid using column mappings. Filtering by a mapped column can mean that the index (sorting key) is not being used. ### 6. Avoid JOINs¶ Don't do JOINs over the BI Connector where possible. Instead, do any JOINs inside a Pipe and materialize the result as a Materialized View, then query the already-JOINed data. ### 7. Monitor usage with bi\_stats\_rt¶ The [bi_stats_rt](https://www.tinybird.co/docs/docs/monitoring/service-datasources#tinybird-bi-stats-rt) table contains observability data about your usage of the BI Connector. ## Limits¶ The following limits apply: - Query timeout is 10 seconds. - Resources are limited to 1 thread per query. - Concurrent queries are limited to 100 per cluster. - Queries over the BI Connector are only allocated 1 CPU core per query. This only applies to the BI Connector, not to standard Tinybird Pipes. - Single queries mustn't return more than 1 GB of data at a time. - The total amount of data returned by all queries executed within a 5-minute period must not exceed 6 GB. This is to prevent overloading the system with too much data in a short period of time. Some settings can be adjusted on request. Contact Tinybird support ( [support@tinybird.co](https://www.tinybird.co/docs/mailto:support@tinybird.co) ) to discuss if we can support your use case. ## Limitations¶ The following limitations apply: - You can't use parameters in your endpoints. However, you can still filter your data using SQL `WHERE` clauses. - Some widgets of BI tools create SQL queries that aren't compatible with the BI connector. - Some JOIN operations run in Postgres and not ClickHouse®, which means they might run much, much more slowly. Avoid JOINs over the BI Connector. - Sometimes, PostgreSQL might rewrite the query so that filters might end up not using sorting keys. Sorting keys allow you to run faster queries, and therefore those resulting queries might be slower and more expensive since they would require full table scans. - Querying large datasets can result in a timeout. Large generally means many millions of rows, but this is also affected by how many columns and how large the values are. ## Next steps¶ - Read up on[ Materialized View](https://www.tinybird.co/docs/docs/publish/materialized-views/overview) and getting the most from your data. - Understand how to use and query Tinybird's[ Service Data Sources](https://www.tinybird.co/docs/docs/monitoring/service-datasources) . Tinybird is not affiliated with, associated with, or sponsored by ClickHouse, Inc. 
ClickHouse® is a registered trademark of ClickHouse, Inc. --- URL: https://www.tinybird.co/docs/query/overview Last update: 2024-10-16T16:43:40.000Z Content: --- title: "Query overview · Tinybird Docs" theme-color: "#171612" description: "Explore and manipulate your data in Tinybird to make it more useful and relevant." --- # Query your ingested data¶ After you've brought your data to Tinybird, you can explore and manipulate it to make it more useful and relevant. ## Data Flow¶ Data Flow visualizes how your Data Sources, API Endpoints, Materialized Views, and Pipes connect and relate to each other. You can filter the view by keyword, resource type, or tag. Select an item to see its details in the side panel. <-figure-> ![Data Flow demonstration](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fdataflow.gif&w=3840&q=75) ## Playground¶ Playgrounds are sandbox environments where you can test your queries using ingested data. For example, you can use playgrounds to quickly query real-time production data, debug existing queries, or prototype new Pipes. You can download any playground by selecting **Download** . You can then add the .pipe file to your project. To share your Playground contents with other users of the Workspace, select **Share**. For more information on the statements, functions, and settings you can use in queries, see [Supported syntax](https://www.tinybird.co/docs/docs/support/syntax). <-figure-> ![Playground demonstration](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fplayground.png&w=3840&q=75) ## Time Series¶ Time series help analyze a sequence of data points collected over an interval of time. Use the Time Series feature to visualize any time series Data Source in your Workspace, including [Service Data Sources](https://www.tinybird.co/docs/docs/monitoring/service-datasources). When you create a new Time Series in your Workspace, your team can view it. You can also generate a public URL to share the visualization outside of your team. Viewers using the public URL can explore the chart but can't change the original data source, filters, or groupings. <-figure-> ![Time Series example](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Ftimeseries.png&w=3840&q=75) ## Next steps¶ - Learn how to[ use query parameters](https://www.tinybird.co/docs/docs/query/query-parameters) . - Read the[ SQL best practices](https://www.tinybird.co/docs/docs/query/sql-best-practices) . --- URL: https://www.tinybird.co/docs/query/query-parameters Last update: 2024-11-14T10:11:48.000Z Content: --- title: "Using query parameters · Tinybird Docs" theme-color: "#171612" description: "Query parameters are great for any value of the query that you might want control dynamically from your applications." --- # Using query parameters¶ Query parameters are great for any value of the query that you might want control dynamically from your applications. For example, you can get your API Endpoint to answer different questions by passing a different value as query parameter. Using dynamic parameters means you can do things like: - Filtering as part of a `WHERE` clause. - Changing the number of results as part of a `LIMIT` clause. - Sorting order as part of an `ORDER BY` clause. - Selecting specific columns for `ORDER BY` or `GROUP BY` clauses. ## Define dynamic parameters¶ To make a query dynamic, start the query with a `%` character. That signals the engine that it needs to parse potential parameters. 
Tinybird automatically inserts the `%` character in the first line when you add a parameter to a Node. After you have created a dynamic query, you can define parameters using the following pattern: `{{<data_type>(<name>, <default_value>, description=<"This is a description">, required=<True|False>)}}` . For example: ##### Simple select clause using dynamic parameters % SELECT * FROM TR LIMIT {{Int32(lim, 10, description="Limit the number of rows in the response", required=False)}} The previous query returns 10 results by default, or however many are specified in the `lim` parameter when requesting data from that API Endpoint. ## Use Pipes API Endpoints with dynamic parameters¶ When using a Pipe's API Endpoint that uses parameters, pass in the desired parameters. Using the previous example, where `lim` sets the maximum number of rows you want to get, the request would look like this: ##### Using a Pipes API Endpoint containing dynamic parameters curl "https://api.tinybird.co/v0/pipes/tr_pipe?lim=20&token=...." You can specify parameters in more than one Node in a Data Pipe. When invoking the API Endpoint through its URL, the passed parameters are included in the request. You can't use query parameters in Nodes that are published as [Materialized Views](https://www.tinybird.co/docs/docs/publish/materialized-views/overview). ## Leverage dynamic parameters¶ As well as using dynamic parameters in your API Endpoints, you can then leverage them further downstream for monitoring purposes. When you pass a parameter to your queries, you can build Pipes that reference the parameters and query the Service Data Sources with them, even if you don't use them in the API Endpoints themselves. Review the [Service Data Sources docs](https://www.tinybird.co/docs/docs/monitoring/service-datasources) to see the available options. For example, using the `user_agent` column on `pipe_stats_rt` shows which user agent made the request. Pass any additional values you need as parameters to improve visibility and avoid, or get insights into, incidents and Workspace performance. This process helps you forward details like the user agent from your app's requests all the way through to Tinybird, and track whether the request was made from the app and which device was used. ##### Example query to the pipe\_stats\_rt Service Data Source leveraging a passed 'referrer' parameter SELECT toStartOfMinute(start_datetime) as date, count(), parameters['referrer'] FROM tinybird.pipe_stats_rt WHERE ((pipe_id = '' and status_code != 429) or (pipe_name = '' and status_code != 429)) and start_datetime > now() - interval 1 hour GROUP BY date, parameters['referrer'] ORDER BY count() DESC, date DESC ## Available data types for dynamic parameters¶ You can use the following data types for dynamic parameters (a combined sketch follows the list): - `Boolean` : Accepts `True` and `False` as values, as well as strings like `'TRUE'` , `'FALSE'` , `'true'` , `'false'` , `'1'` , or `'0'` , or the integers `1` and `0` . - `String` : For any string values. - `DateTime64` , `DateTime` and `Date` : Accepts values like `YYYY-MM-DD HH:MM:SS.MMM` , `YYYY-MM-DD HH:MM:SS` and `YYYYMMDD` respectively. - `Float32` and `Float64` : Accepts floating point numbers of either 32 or 64 bit precision. - `Int` or `Integer` : Accepts integer numbers of any precision. - `Int8` , `Int16` , `Int32` , `Int64` , `Int128` , `Int256` and `UInt8` , `UInt16` , `UInt32` , `UInt64` , `UInt128` , `UInt256` : Accepts signed or unsigned integer numbers of the specified precision.
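As a rough sketch of how several of these types can be combined in a single Node, the following example filters a hypothetical `events` Data Source by a date range, a boolean flag, and a row limit (the Data Source, column, and parameter names are illustrative only):

##### Sketch: combining several parameter types in one Node
%
SELECT *
FROM events
-- Date range with defaults, so the Endpoint also works when no parameters are passed
WHERE event_date >= {{Date(start_date, '2024-01-01', description="Start of the date range", required=False)}}
  AND event_date <= {{Date(end_date, '2024-12-31')}}
  -- Boolean flag: accepts true/false, 'true'/'false', 1/0
  AND is_important = {{Boolean(only_important, False)}}
LIMIT {{Int32(lim, 100)}}

Calling the Endpoint without query parameters uses the defaults shown above; passing, for example, `?only_important=true&lim=10` overrides them.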
### Use column parameters¶ You can use `column` to pass along column names of a defined type as parameters, like: ##### Using column dynamic parameters % SELECT * FROM TR ORDER BY {{column(order_by, 'timestamp')}} LIMIT {{Int32(lim, 10)}} Always define the `column` function's second argument, the one for the default value. The alternative to not defining a default is to validate that the first argument is defined, but this only takes effect when the API Endpoint is executed; a placeholder is used during development of the Pipe. ##### Validate the column parameter when not defining a default value % SELECT * FROM TR {% if defined(order_by) %} ORDER BY {{column(order_by)}} {% end %} ### Pass arrays¶ You can pass along a list of values with the `Array` function for parameters, like so: ##### Passing arrays as dynamic parameters % SELECT * FROM TR WHERE access_type IN {{Array(access_numbers, 'Int32', default='101,102,110')}} ## Send stringified JSON as parameter¶ Consider the following stringified JSON: "filters": [ { "operand": "date", "operator": "equals", "value": "2018-01-02" }, { "operand": "high", "operator": "greater_than", "value": "100" }, { "operand": "symbol", "operator": "in_list", "value": "AAPL,AMZN" } ] You can use the `JSON()` function to use `filters` as a query parameter. The following example shows how to use the `filters` field from the JSON snippet with the stock_prices_1m sample dataset. % SELECT symbol, date, high FROM stock_prices_1m WHERE 1 {% if defined(filters) %} {% for item in JSON(filters, '[]') %} {% if item.get('operator', '') == 'equals' %} AND {{ column(item.get('operand', '')) }} == {{ item.get('value', '') }} {% elif item.get('operator') == 'greater_than' %} AND {{ column(item.get('operand', '')) }} > {{ item.get('value', '') }} {% elif item.get('operator') == 'in_list' %} AND {{ column(item.get('operand', '')) }} IN splitByChar(',',{{ item.get('value', '') }}) {% end %} {% end %} {% end %} When accessing the fields in a JSON object, use the following syntax: `item.get('Field', 'Default value to avoid SQL errors')` . ### Pagination¶ You paginate results by adding `LIMIT` and `OFFSET` clauses to your query. You can parameterize the values of these clauses, allowing you to pass pagination values as query parameters to your API Endpoint. Use the `LIMIT` clause to select only the first `n` rows of a query result. Use the `OFFSET` clause to skip `n` rows from the beginning of a query result. Together, you can dynamically chunk the results of a query up into pages. For example, the following query introduces two dynamic parameters, `page_size` and `page` , which let you control the pagination of a query result using query parameters on the URL of an API Endpoint. ##### Paging results using dynamic parameters % SELECT * FROM TR LIMIT {{Int32(page_size, 100)}} OFFSET {{Int32(page, 0) * Int32(page_size, 100)}} You can also use pages to perform calculations such as `count()` . The following example counts the total number of pages: ##### Operation on a paginated endpoint % SELECT count() as total_rows, ceil(total_rows/{{Int32(page_size, 100)}}) pages FROM endpoint_to_paginate To get consistent pagination results, add an `ORDER BY` clause to your paginated queries. ## Advanced templating using dynamic parameters¶ To build more complex queries, use flow control operators like `if`, `else` and `elif` in combination with the `defined()` function, which helps you check whether a parameter has been received and act accordingly.
Tinybird's templating system is based on the [Tornado Python framework](https://github.com/tornadoweb/tornado) , and uses Python syntax. You must enclose control statements in curly brackets with percentages `{%..%}` as in the following example: ##### Advanced templating using dynamic parameters % SELECT toDate(start_datetime) as day, countIf(status_code < 400) requests, countIf(status_code >= 400) errors, avg(duration) avg_duration FROM log_events WHERE endsWith(user_email, {{String(email, 'gmail.com')}}) AND start_datetime >= {{DateTime(start_date, '2019-09-20 00:00:00')}} AND start_datetime <= {{DateTime(end_date, '2019-10-10 00:00:00')}} {% if method != 'All' %} AND method = {{String(method,'POST')}} {% end %} GROUP BY day ORDER BY day DESC ### Validate presence of a parameter¶ ##### Validate if a param is in the query % select * from table {% if defined(my_filter) %} where attr > {{Int32(my_filter)}} {% end %} When you call the API Endpoint with `/v0/pipes/:PIPE.json?my_filter=20` it applies the filter. ### Default parameter values and placeholders¶ Following best practices, you should set default parameter values as follows: ##### Default parameter values % SELECT * FROM table WHERE attr > {{Int32(my_filter, 10)}} When you call the API Endpoint with `/v0/pipes/:PIPE.json` without setting any value to `my_filter` , it automatically applies the default value of 10. If you don't set a default value for a parameter, you should validate that the parameter is defined before using it in the query as explained previously. If you don't validate the parameter and it's not defined, the query might fail. Tinybird populates the parameter with a placeholder value based on the data type. For instance, numerical data types are populated with 0, strings with `__placeholder__` , and date and timestamps with `2019-01-01` and `2019-01-01 00:00:00` respectively. You could try yourself with a query like this: ##### Get placeholder values % SELECT {{String(param)}} as placeholder_string, {{Int32(param)}} as placeholder_num, {{Boolean(param)}} as placeholder_bool, {{Float32(param)}} as placeholder_float, {{Date(param)}} as placeholder_date, {{DateTime(param)}} as placeholder_ts, {{Array(param)}} as placeholder_array This returns the following values: { "placeholder_string": "__placeholder__", "placeholder_num": 0, "placeholder_bool": 0, "placeholder_float": 0, "placeholder_date": "2019-01-01", "placeholder_ts": "2019-01-01 00:00:00", "placeholder_array": ["__placeholder__0","__placeholder__1"] } ### Test dynamic parameters¶ Any dynamic parameters you create appears in the UI along with a "Test new values" button. Select this button to open a test dialog populated with the default value of your parameters. The test dialog is a convenient way to quickly test different Pipe values than the default ones without impacting production. Use the View API page to see API Endpoint metrics resulting from that specific combination of parameters. Close the dialog to bring the Pipe back to its default production state. ### Cascade parameters¶ Parameters with the same name in different Pipes are cascaded down the dependency chain. For example, if you publish Pipe A with the parameter `foo` , and then Pipe B which uses Pipe A as a Data Source also with the parameter `foo` , then when you call the API Endpoint of Pipe B with `foo=bar` , the value of `foo` will be `bar` in both Pipes. 
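As a minimal sketch of this cascading behavior (the Pipe names `pipe_a` and `pipe_b` and the `events` Data Source are hypothetical), both Pipes declare the same `foo` parameter:

##### Sketch: the same parameter cascading through two Pipes
-- Node in pipe_a, published and used as a source by pipe_b
%
SELECT * FROM events WHERE category = {{String(foo, 'bar')}}

-- Node in pipe_b, which reads from pipe_a and is published as the API Endpoint you call
%
SELECT count() AS total FROM pipe_a WHERE category = {{String(foo, 'bar')}}

Calling Pipe B's API Endpoint with `?foo=clicks` applies the value `clicks` in both queries, as described above.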
### Throw errors¶ The following example stops the API Endpoint processing and returns a 400 error: ##### Validate if a param is defined and throw an error if it's not defined % {% if not defined(my_filter) %} {{ error('my_filter (int32) query param is required') }} {% end %} select * from table where attr > {{Int32(my_filter)}} The `custom_error` function is an advanced version of `error` where you can customize the response and other aspects. The function gets an object as the first argument, which is sent as JSON, and the status_code as a second argument, which defaults to 400. ##### Validate if a param is defined and throw an error if it's not defined % {% if not defined(my_filter) %} {{ custom_error({'error_id': 10001, 'error': 'my_filter (int32) query param is required'}) }} {% end %} select * from table where attr > {{Int32(my_filter)}} ## Limits¶ You can't use query parameters in Nodes that are published as [Materialized Views](https://www.tinybird.co/docs/docs/publish/materialized-views/overview) , only as API Endpoints or in on-demand Copies or Sinks. You can use query parameters in scheduled Sinks and Copies, but must have a default. That default is used in the scheduled execution. The preview step fails if the default doesn't exist. ## Next steps¶ Thanks to the magic of dynamic parameters, you can create flexible API Endpoints with ease, so you don't need to manage or test dozens of Pipes. Be sure you're familiar with the [5 rules for faster SQL queries](https://www.tinybird.co/docs/docs/query/sql-best-practices). --- URL: https://www.tinybird.co/docs/query/sql-best-practices Last update: 2024-11-15T09:15:26.000Z Content: --- title: "SQL best practices · Tinybird Docs" theme-color: "#171612" description: "Learn the best practices when working with a huge amount of data." --- # Best practices for SQL queries¶ When you're trying to process significant amounts of data, following best practices help you create faster and more robust queries. Follow these principles when writing queries meant for Tinybird: 1. The best data is the data you don't write. 2. The second best data is the one you don't read. The less data you read, the better. 3. Sequential reads are much faster. 4. The less data you process after read, the better. 5. Perform complex operations later in the processing pipeline. The following sections analyze how performance improves after implementing each principle. To follow the examples, download the [NYC Taxi Trip](https://storage.googleapis.com/tinybird-demo/yellow_trip_data_2018/yellow_tripdata_2018-01.csv) and import it using a Data Source. See [Data Sources](https://www.tinybird.co/docs/docs/concepts/data-sources). ## The best data is the one you don't write¶ Don't save data that you don't need, as it impacts memory usage, causing queries to take more time. ## The second best data is the one you don't read¶ To avoid reading data that you don't need, apply filters as soon as possible. For example, consider a list of the trips whose distance is greater than 10 miles and that took place between `2017-01-31 14:00:00` and `2017-01-31 15:00:00` . Additionally, you want to retrieve the trips ordered by date. The following examples show the difference between applying the filters at the end or at the beginning of the Pipe. The first approach orders all the data by date: ##### node rule2\_data\_read\_NOT\_OK SELECT * FROM nyc_taxi ORDER BY tpep_pickup_datetime ASC 10.31MB, 139.26k x 17 ( 6.95ms ) After the data is sorted, you can filter. 
This approach takes around 30 to 60 ms after adding the time of both steps. <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fbest-practices-faster-sql-queries-1.png&w=3840&q=75) Compare the number of scanned rows (139.26k) and the size of data (10.31MB) to the number of scanned rows (24.58k) and the size of data (1.82MB): you only need to scan 24.58k rows. Both values directly impact the query execution time and also affect other queries you might be running at the same time. Bandwidth is also a factor you need to keep in mind. The following example shows what happens if the filter is applied before the sorting: ##### node rule2\_data\_read\_OK SELECT * FROM nyc_taxi WHERE (trip_distance > 10) AND ((tpep_pickup_datetime >= '2017-01-31 14:00:00') AND (tpep_pickup_datetime <= '2017-01-31 15:00:00')) ORDER BY tpep_pickup_datetime ASC 1.50MB, 24.58k x 17 ( 32.28ms ) If the filter is applied before the sorting, it takes only 1 to 10 ms. The size of the data read is 1.82 MB, while the number of rows read is 24.58k: they're much smaller figures than the ones in the first example. This significant difference happens because in the first approach you are sorting all the data available, even the data that you don't need for your query, while in the second approach you are sorting only the rows you need. As filtering is the fastest operation, always filter first. ## Sequential reads are much faster¶ To carry out sequential reads, define indexes correctly. Indexes should be defined based on the queries that are going to be run. The following example simulates a case by ordering the data based on the columns. For example, if you want to query the data by date, compare what happens when the data is sorted by date to when it's sorted by any other column. The first approach sorts the data by another column, `passenger_count`: ##### node rule3\_sequential\_read\_NOT\_OK SELECT * FROM nyc_taxi ORDER BY passenger_count ASC 718.55MB, 9.71m x 17 ( 132.17ms ) After you've sorted the data by `passenger_count` , filter it by date: <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fbest-practices-faster-sql-queries-2.png&w=3840&q=75) This approach takes around 5-10 ms, the number of scanned rows is 26.73k, and the size of data is 1.98 MB. For the second approach, sort the data by date: ##### node rule3\_ordered\_by\_date\_OK SELECT * FROM nyc_taxi ORDER BY tpep_pickup_datetime ASC 10.31MB, 139.26k x 17 ( 27.80ms ) After it's sorted by date, filter it: <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fbest-practices-faster-sql-queries-3.png&w=3840&q=75) When the data is sorted by date and the query uses date for filtering, it takes 1-2 ms, the number of scanned rows is 10.35k and the size of data is 765.53KB. The more data you have, the greater the difference between both approaches. When dealing with significant amounts of data, sequential reads can be much faster. Therefore, define the indexes taking into account the queries that you want to make. ## The less data you process after read, the better¶ If you only need two columns, only retrieve those. Consider a case where you only need the following: `vendorid`, `tpep_pickup_datetime` , and `trip_distance`. 
When retrieving all the columns instead of the previous three, you need around 140-180 ms and the size of data is 718.55MB: ##### NODE RULE4\_LESS\_DATA\_NOT\_OK SELECT * FROM ( SELECT * FROM nyc_taxi order by tpep_dropoff_datetime ) 718.55MB, 9.71m x 17 ( 137.93ms ) When retrieving only the columns you need, it takes around 35-60 ms: ##### node rule4\_less\_data\_OK SELECT * FROM ( SELECT vendorid, tpep_pickup_datetime, trip_distance FROM nyc_taxi order by tpep_dropoff_datetime ) 155.36MB, 9.71m x 3 ( 22.14ms ) With analytical databases, not retrieving unnecessary columns make queries much more performant and efficient. Process only the data you need. ## Perform complex operations later in the processing pipeline¶ Perform complex operations, such as joins or aggregations, as late as possible in the processing pipeline. As you filter all the data at the beginning, the number of rows at the end of the pipeline is lower and, therefore, the cost of executing complex operations is also lower. Using the example dataset, aggregate the data: ##### node rule5\_complex\_operation\_NOT\_OK SELECT vendorid, pulocationid, count(*) FROM nyc_taxi GROUP BY vendorid, pulocationid 77.68MB, 9.71m x 3 ( 19.35ms ) Apply the filter: <-figure-> ![](/docs/_next/image?url=%2Fdocs%2Fimg%2Fbest-practices-faster-sql-queries-4.png&w=3840&q=75) If you apply the filter after aggregating the data, it takes around 50-70 ms to retrieve the data, the number of scanned rows is 9.71m, and the size of data is 77.68 MB. If you filter before aggregating the data: ##### node rule5\_complex\_operation\_OK SELECT vendorid, pulocationid, count(*) FROM nyc_taxi WHERE vendorid < 10 GROUP BY vendorid, pulocationid 77.68MB, 9.71m x 3 ( 73.26ms ) The query takes only 20-40 ms, although the number of scanned rows and the size of data is the same as in the previous approach. ## Additional guidance¶ Follow these additional recommendations when creating SQL queries. ### Avoid full scans¶ The less data you read in your queries, the faster they are. There are different strategies you can follow to avoid reading all the data in a Data Source, or doing a full scan, in your queries: - Always filter first. - Use indices by setting a proper `ENGINE_SORTING_KEY` in the Data Source. - The column names present in the `ENGINE_SORTING_KEY` should be the ones you use for filtering in the `WHERE` clause. You don't need to sort by all the columns you use for filtering, only the ones to filter first. - The order of the columns in the `ENGINE_SORTING_KEY` is important: from left to right ordered by relevance. The columns that matter the most for filtering and have less cardinality should go first. 
Consider the following Data Source, which is sorted by `id` and `date`: ##### Data Source: data\_source\_sorted\_by\_date SCHEMA > `id` Int64, `amount` Int64, `date` DateTime ENGINE "MergeTree" ENGINE_SORTING_KEY "id, date" The following query is slower because it filters data using a column other than the ones defined in the `ENGINE_SORTING_KEY` instruction: ##### Not filtering by any column present in the ENGINE\_SORTING\_KEY SELECT * FROM data_source_sorted_by_date WHERE amount > 30 The following query is faster because it filters data using a column defined in the `ENGINE_SORTING_KEY` instruction: ##### Filtering first by columns present in the ENGINE\_SORTING\_KEY SELECT * FROM data_source_sorted_by_date WHERE id = 135246 AND date > now() - INTERVAL 3 DAY AND amount > 30 ## Avoid big joins¶ When doing a `JOIN` , the data in the Data Source on the right side loads into memory to perform the operation. `JOIN`s over tables of more than 1 million rows might lead to `MEMORY_LIMIT` errors when used in Materialized Views, affecting ingestion. Avoid joining big Data Sources by filtering the data in the Data Source on the right side. For example, the following pattern is less efficient because it's joining a Data Source with too many rows: ##### Doing a JOIN with a Data Source with too many rows SELECT left.id AS id, left.date AS day, right.response_id AS response_id FROM left_data_source AS left INNER JOIN big_right_data_source AS right ON left.id = right.id The following query is faster and more efficient because it prefilters the Data Source before the `JOIN`: ##### Prefilter the joined Data Source for better performance SELECT left.id AS id, left.date AS day, right.response_id AS response_id FROM left_data_source AS left INNER JOIN ( SELECT id, response_id FROM big_right_data_source WHERE id IN (SELECT id FROM left_data_source) ) AS right ON left.id = right.id ## Memory issues¶ Sometimes, you might reach the memory limit when running a query. This is usually because of one of the following reasons: - A lot of columns are used: try to reduce the number of columns used in the query. As this isn't always possible, try to change data types or merge some columns. - A cross `JOIN` or some other operation that generates a lot of rows: it might happen if the cross `JOIN` is done with two Data Sources with a large number of rows. Try to rewrite the query to avoid the cross `JOIN` . - A massive `GROUP BY` : try to filter out rows before executing the `GROUP BY` . If you are getting a memory error while populating a Materialized View, the solutions are the same. Consider that the populate process runs in chunks of 1 million rows, so if you hit memory limits, the cause might be one of the following: 1. There is a `JOIN` and the right table is large. 2. There is an `ARRAY JOIN` with a huge array that makes the number of rows increase significantly. To check if a populate process could break, create a Pipe with the same query as the Materialized View and replace the source table with a node that gets 1 million rows from the source table.
The following example shows an unoptimized Materialized View Pipe: ##### original Materialized View Pipe SQL NODE materialized SQL > select date, count() c from source_table group by date The following query shows the transformed Pipe: ##### Transformed Pipe to check how the Materialized View would process the data NODE limited SQL > select * from source_table limit 1000000 NODE materialized SQL > select date, count() c from limited group by date ## Nested aggregate functions¶ You can't nest aggregate functions or use an alias of an aggregate function that's used in another aggregate function. For example, the following query causes an error due to a nested aggregate function: ##### Error on using nested aggregate function SELECT max(avg(number)) as max_avg_number FROM my_datasource The following query causes an error due to a nested aggregate with alias: ##### Error on using nested aggregate function with alias SELECT avg(number) avg_number, max(avg_number) max_avg_number FROM my_datasource Instead of using nested aggregate functions, use a subquery. The following example shows how to use aggregate functions in a subquery: ##### Using aggregate functions in a subquery SELECT avg_number as number, max_number FROM ( SELECT avg(number) as avg_number, max(number) as max_number FROM numbers(10) ) The following example shows how to nest aggregate functions using a subquery: ##### Nesting aggregate functions using a subquery SELECT max(avg_number) as number FROM ( SELECT avg(number) as avg_number, max(number) as max_number FROM numbers(10) ) ## Merge aggregate functions¶ Columns with `AggregateFunction` types such as `count`, `avg` , and others precalculate their aggregated values using intermediate states. When you query those columns you have to add the `-Merge` combinator to the aggregate function to get the final aggregated results. Use `-Merge` aggregated states as late in the pipeline as possible. Intermediate states are stored in binary format, which explains why you might see special characters when selecting columns with the `AggregateFunction` type. For example, consider the following query: ##### Getting 'result' as aggregate function SELECT result FROM my_datasource The result contains special characters: | AggregateFunction(count) | | --- | | @33M@ | | �o�@ | When selecting columns with the `AggregateFunction` type use `-Merge` the intermediate states to get the aggregated result for that column. This operation might compute several rows, so use `-Merge` as late in the pipeline as possible. Consider the following query: ##### Getting 'result' as UInt64 -- Getting the 'result' column aggregated using countMerge. Values are UInt64 SELECT countMerge(result) as result FROM my_datasource The result is the following: | UInt64 | | --- | | 1646597 | --- URL: https://www.tinybird.co/docs/quick-start Last update: 2024-11-12T11:45:41.000Z Content: --- title: "Quick start · Tinybird Docs" theme-color: "#171612" description: "Get started with Tinybird as quickly as possible. Ingest, query, and publish data in minutes." --- # Quick start¶ With Tinybird, you can ingest data from anywhere, query and transform it using SQL, and publish it as high-concurrency, low-latency REST API endpoints. Read on to learn how to create a Workspace, ingest data, create a query, publish an API, and confirm your setup works properly using the Tinybird user interface. ## Step 1: Create your Tinybird account¶ [Create a Tinybird account](https://www.tinybird.co/signup) . It's free and no credit card is required. 
See [Tinybird pricing plans](https://www.tinybird.co/docs/docs/support/billing) for more information. [Sign up for Tinybird](https://www.tinybird.co/signup) ## Step 2: Select your cloud provider and region¶ When logging in to Tinybird, select the cloud provider and region you want to work in. ![Select your region](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fquickstartui-region-select.png&w=3840&q=75) ## Step 3: Create your Workspace¶ A [Workspace](https://www.tinybird.co/docs/docs/concepts/workspaces) is an area that contains a set of Tinybird resources, including Data Sources, Pipes, Nodes, API Endpoints, and Tokens. Create a Workspace named `customer_rewards` . The name must be unique. ![Create a Workspace](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fquickstartui-create-workspace.png&w=3840&q=75) ## Step 4: Download and ingest sample data¶ Download the following sample data from a fictitious online coffee shop: [Download data file](https://www.tinybird.co/docs/docs/assets/sample-data-files/orders.ndjson) Select **File Upload** and follow the instructions to load the file. ![Upload a file with data](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fquickstartui-file-upload.png&w=3840&q=75) Select **Create Data Source** to automatically create the `orders` [Data Source](https://www.tinybird.co/docs/docs/concepts/data-sources). ![Create a Data Source](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fquickstartui-create-data-source.png&w=3840&q=75) ## Step 5: Query data using a Pipe¶ You can create [Pipes](https://www.tinybird.co/docs/docs/concepts/pipes) to query your data using SQL. To create a Pipe, select **Pipes** and then **Create Pipe**. Name your Pipe `rewards` and add the following SQL: select count() from orders Select the Node name and change it to `rewards_count`. ![Create a Pipe](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fquickstartui-create-pipe.png&w=3840&q=75) Select **Run** to preview the result of your Pipe. ## Step 6: Publish your query as an API¶ You can turn any Pipe into a high-concurrency, low-latency API Endpoint. Select **Create API Endpoint** and then select the `rewards_count` Node in the menu. ![Create an API Endpoint](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fquickstartui-create-api.png&w=3840&q=75) ## Step 7: Call your API¶ You can test your API endpoint using a curl command. Go to the **Output** section of the API page and select the **cURL** tab. Copy the curl command into a Terminal window and run it. ![Test your API Endpoint](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fquickstartui-test-api.png&w=3840&q=75) Congratulations! You have created your first API Endpoint in Tinybird. ## Next steps¶ - Check the[ Tinybird CLI Quick start](https://www.tinybird.co/docs/docs/cli/quick-start) . - Learn more about[ User-Facing Analytics](https://www.tinybird.co/docs/docs/use-cases) in the Use Case Hub. - Learn about[ Tinybird Charts](https://www.tinybird.co/docs/docs/publish/charts) and build beautiful visualizations for your API endpoints. --- URL: https://www.tinybird.co/docs/starter-kits/log-analytics Content: --- title: "Log Analytics・Tinybird Starter Kit" theme-color: "#171612" description: "Analyze software logs, warnings, and errors in minutes with this language-agnostic Log Analytics Starter Kit." --- Click the button below to deploy the data project to Tinybird. 
[Deploy project](https://app.tinybird.co/workspaces/new?name=log_analytics_starter_kit&starter_kit=log-analytics-starter-kit)![Deploy the Tinybird data project](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fassets%2Fstarter-kits%2Flog-analytics-step-1.jpeg&w=3840&q=75) This will automatically set up all of the resources you need to collect, analyze, and publish your logs. --- URL: https://www.tinybird.co/docs/starter-kits/web-analytics Content: --- title: "Open Source Google Analytics Alternative・Tinybird Starter Kit" theme-color: "#171612" description: "Deploy a Google Analytics alternative with this open source Starter Kit. Everything you need to start tracking web traffic analytics in just a few minutes." --- Click the button below to instantly deploy the project to a Tinybird workspace. [Deploy project](https://app.tinybird.co/workspaces/new?name=web_analytics&starter_kit=web-analytics-starter-kit)![Deploy the project](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fassets%2Fstarter-kits%2Fweb-analytics-step-1.jpeg&w=3840&q=75) These resources will be created in a new workspace in your Tinybird account. --- URL: https://www.tinybird.co/docs/support/billing Last update: 2024-11-07T09:52:34.000Z Content: --- title: "Billing, plans, and pricing · Tinybird Docs" theme-color: "#171612" description: "Information about billing, what it is based on, as well as Tinybird pricing plans." --- # Billing¶ Tinybird billing is based on the pricing of different data operations, such as storage, processing, and transfer. If you are on a Professional plan, read on to learn how billing works. If you're an Enterprise customer, contact us at [support@tinybird.co](https://www.tinybird.co/docs/mailto:support@tinybird.co) to reduce unit prices as part of volume discounts and Enterprise plan commitments. See [Tinybird plans](https://www.tinybird.co/docs/docs/plans). ## At a glance¶ - Data storage: US$0.34 per GB - Data processing (read or write): US$0.07 per GB - Data transfer (outbound): US$0.01 to US$0.10 per GB To see the full breakdown for each individual operation, skip to [billing breakdown](https://www.tinybird.co/docs/about:blank#billing-breakdown). ## Data storage¶ Data storage refers to the disk storage of all the data you keep in Tinybird. Data storage is priced at **US$0.34 per GB** , regardless of the region. Data storage is usually the smallest part of your Tinybird bill. Your [Data Sources](https://www.tinybird.co/docs/docs/concepts/data-sources) use the largest percentage of storage. ### Compression¶ Data storage pricing is based on the volume of storage used after compression, calculated on the last day of every month. The exact rate of compression varies depending on your data. You can expect a compression factor of between 3x and 10x. For example, with a compression factor of 3.5x, if you import 100 GB of uncompressed data, that translates to approximately 28.6 GB compressed. In that case, your bill would be based on the final 28.6 GB of stored data. ### Version control¶ If your Workspace uses the Tinybird Git integration, only data storage associated with the production Workspace, and not Branches, is included when determining the storage bill. Remove historical data to lower your storage bill. You can configure a [time-to-live (TTL)](https://www.tinybird.co/docs/docs/concepts/data-sources#setting-data-source-ttl) on any Data Source, which deletes data older than a given time. This gives you control over how much data is retained in a Data Source.
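To make the effect of a TTL concrete, here is a minimal sketch (the `orders` Data Source and its `timestamp` column are assumptions for illustration, not part of the billing docs). A TTL is an SQL expression over a column of the Data Source; rows are removed once the expression falls in the past:

```sql
-- Sketch only: preview how many rows a 90-day TTL would retain today.
-- With a TTL expression of `timestamp + toIntervalDay(90)`, a row is deleted
-- once that expression is earlier than the current time.
SELECT count() AS rows_retained
FROM orders
WHERE timestamp + toIntervalDay(90) > now()
```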
A common pattern is to ingest raw data, materialize it, and clear out the raw data with a TTL to reduce storage. ## Data processing¶ Data processing is split into write and read activities. All processed data is priced at **US$0.07 per GB**. ### Write activities¶ You write data whenever you ingest into Tinybird. When you create, append, delete, or replace data in a [Data Source](https://www.tinybird.co/docs/docs/concepts/data-sources) , or write data to a Materialized View, you are writing data. ### Read activities¶ You read data when you run queries against your Data Sources to generate responses to API Endpoint requests. You also read data when you make requests to the [Query API](https://www.tinybird.co/docs/docs/api-reference/query-api) . The only exception is when you're manually running a query. See [Exceptions](https://www.tinybird.co/docs/about:blank#exceptions) for more information. Read activities also include the amount of data fetched to generate API Endpoint responses. For example, if 10 MB of data is processed to generate one API Endpoint response, you would be billed for 10 MB. If the same API Endpoint is called 10 times, that would be 10 x 10 MB, and you would be billed for 100 MB of processed data in total. Even if there are no rows in a response, you could be billed for it, so create your queries with care. For example, if you read 1 billion rows but the query returns no rows because of the endpoint filters, you have still read 1 billion rows. Additionally, [ClickHouse® "sparse" indexing](https://clickhouse.com/docs/en/optimize/sparse-primary-indexes#an-index-design-for-massive-data-scales) means that even if a row isn't in the table, it still takes read activity to confirm it's not there. Failed (4xx) Copy Pipe, API Endpoint, and Query API requests, such as timeouts or memory usage errors, are also billed. You can check these errors using the `pipe_stats_rt`, `pipe_stats` and `datasources_ops_log` [Service Data Sources](https://www.tinybird.co/docs/docs/monitoring/service-datasources#tinybird-pipe-stats-rt). ### Materialized Views¶ [Materialized Views](https://www.tinybird.co/docs/docs/publish/materialized-views/overview) involve both read and write operations, plus data storage operations. Whenever you add new data to a Materialized View, you are writing to it. However, there is no charge when you first create and populate a Materialized View. Only incremental updates are billed. Because Materialized Views typically process and store only a fraction of the data that you ingest into Tinybird, the cost of Materialized Views is usually minimal. ### Compression¶ Your data processing bill might be impacted by compression. Depending on the operation being performed, data is handled in different ways and it isn't always possible to predict exact levels of read or written bytes in advance for all customers. The best option is to query the [Service Data Sources](https://www.tinybird.co/docs/docs/monitoring/service-datasources) and analyze your results. ### Version control¶ If your Workspace uses the Tinybird Git integration feature, only data processing associated with the production Workspace, and not Branches, is included when determining the amount of processed data. Typically, data processing is the largest percentage of your Tinybird bill.
This is why, as you scale, you should [optimize your queries](https://www.tinybird.co/docs/docs/query/sql-best-practices) , understand [Materialized Views](https://www.tinybird.co/docs/docs/publish/materialized-views/overview) , and [analyze the performance of your API Endpoints](https://www.tinybird.co/docs/docs/guides/monitoring/analyze-endpoints-performance). Tinybird works with customers on a daily basis to help optimize their queries and reduce data processing, sometimes reducing their processed data by over 10x. If you need support to optimize your use case, contact Tinybird at [support@tinybird.co](https://www.tinybird.co/docs/mailto:support@tinybird.co) or through the [Community Slack](https://www.tinybird.co/docs/docs/community). ## Data transfer¶ Currently, the only service to incur data transfer costs is the Tinybird [AWS S3 Sink](https://www.tinybird.co/docs/docs/publish/s3-sink) . If you're not using this Sink, you aren't charged any data transfer costs. The Tinybird S3 Sink incurs both data transfer and data processing (read) costs. See [AWS S3 Sink Billing](https://www.tinybird.co/docs/docs/publish/s3-sink#billing). Data transfer depends on your environment. There are two possible scenarios: - Destination bucket is in the same cloud provider and region as your Tinybird Workspace: US$0.01 per GB - Destination bucket is in a different cloud provider or region than your Tinybird Workspace: US$0.10 per GB ## Exceptions¶ The following operations are free and don't count towards billing: - Anything on a Build plan. - Any operation that doesn't involve processing, storing, or transferring data: - API calls to the Tokens, Jobs, or Analyze Endpoints. - Management operations over resources like Sinks or Pipes (create, update, delete, get details), or Data Sources (create, get details; update & delete incur cost). - Populating a Materialized View with historical data (only inserting new data into an existing MV is billed). - Manual query executions made inside the UI (Pipes, Time Series, Playground). Anywhere you can press the "Run" button, that's free. - Queries to Service Data Sources. - Any time data is deleted as a result of TTL operations. ## Monitor your usage¶ Users on any plan can monitor their usage. To see an at-a-glance overview, select the cog icon in the navigation and select the **Usage** tab: ![image](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fimg%2Fbilling-plans-usage.png&w=3840&q=75) You can also check your usage by querying the data available in the [Service Data Sources](https://www.tinybird.co/docs/docs/monitoring/service-datasources) . These Data Sources contain all the internal data about your Tinybird usage, and you can query them using Pipes like any other Data Source. This means you can publish the results as an API Endpoint, and [build charts in Grafana](https://www.tinybird.co/docs/docs/guides/integrations/consume-api-endpoints-in-grafana), [export to DataDog](https://www.tinybird.co/blog-posts/how-to-monitor-tinybird-using-datadog-with-vector-dev) , and more. Queries made to Service Data Sources are free of charge and don't count towards your usage. However, calls to API Endpoints that use Service Data Sources do count towards API rate limits. Users on any plan can use the strategies outlined in the ["Monitor your ingestion"](https://www.tinybird.co/docs/docs/guides/monitoring/monitor-your-ingestion) guide.
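For example, here is a minimal sketch of a usage query against the `pipe_stats` Service Data Source. The column names (`date`, `pipe_name`, `view_count`, `read_bytes_sum`) are assumptions; check the Service Data Sources docs for the exact schema of each Data Source:

```sql
-- Sketch: approximate requests and data processed per Pipe so far this month.
-- Queries like this against Service Data Sources are free of charge.
SELECT
    pipe_name,
    sum(view_count) AS requests,
    sum(read_bytes_sum) AS bytes_processed,
    formatReadableSize(bytes_processed) AS data_processed
FROM tinybird.pipe_stats
WHERE date >= toStartOfMonth(today())
GROUP BY pipe_name
ORDER BY bytes_processed DESC
```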
If you're an Enterprise customer, check your [Consumption overview in the Organizations UI](https://www.tinybird.co/docs/docs/monitoring/organizations#consumption-overview). ## Reduce your bill¶ You reduce your overall Tinybird bill by reducing your stored, processed, and transferred data. | Type of data | How to reduce | | --- | --- | | Stored data | To reduce stored data, pick the right sorting keys based on your queries, and use [Materialized Views](https://www.tinybird.co/docs/docs/publish/materialized-views/overview) to process data on ingestion. | | Processed data | To reduce processed data, use Materialized Views and implement a[ TTL on raw data](https://www.tinybird.co/docs/docs/concepts/data-sources#setting-data-source-ttl) . | | Transferred data | To reduce **transferred data** costs, make sure you're transferring data in the same cloud region. | See the [Optimization guide](https://www.tinybird.co/docs/docs/guides/optimizations/overview) to learn how to optimize your projects and queries and reduce your bill. ## Billing breakdown¶ The following tables provide details on each operation, grouped by main user action. ### Data ingestion¶ | Service | Operation | Processing fee | Description | | --- | --- | --- | --- | | Data Sources API | Write | US$0.07 per GB | Low frequency: Append data to an existing Data Source (imports, backfilling, and so on). | | Events API | Write | US$0.07 per GB | High frequency: Insert events in real-time (individual or batched). | | Connectors | Write | US$0.07 per GB | Any connector that ingests data into Tinybird (Kafka, S3, GCS, BigQuery, and so on). | ### Data manipulation¶ | Service | Operation | Processing fee | Description | | --- | --- | --- | --- | | Pipes API | Read | US$0.07 per GB | Interactions with Pipes to retrieve data from Tinybird generate read operations. | | Query API | Read | US$0.07 per GB | Interactions with the Query API to retrieve data from Tinybird. | | Materialized Views (Populate) | Read/Write | Free | Executed as soon as you create the MV to populate it. Tinybird doesn't charge any processing fee. Data is written into a new or existing Data Source. | | Materialized Views (Append) | Read/Write | US$0.07 per GB | New data is read from an origin Data Source, filtered, and written to a destination Data Source. | | Copy Pipes | Read/Write | US$0.07 per GB | On-demand or scheduled operations. Data is read from the Data Source, filtered, and written to a destination Data Source. | | Replace | Read/Write | US$0.07 per GB | Replacing data entirely or selectively. | | Delete data | Read/Write | US$0.07 per GB | Selective data delete from a Data Source. | | Delete an entire Data Source | Read/Write | US$0.07 per GB | Delete all the data inside a Data Source. | | Truncate | Write | US$0.07 per GB | Delete all the data from a Data Source. | | Time-to-live (TTL) operations | Write | Free | Anytime Tinybird deletes data as a result of a TTL. | | BI Connector | Read | US$0.07 per GB | Data read from Tinybird using the BI connector. | ### Data transfer¶ | Service | Operation | Processing fee | Data transfer fee | Description | | --- | --- | --- | --- | --- | | S3 Sink | Read/Transfer (no write fees) | US$0.07 per GB | Same region: US$0.01 per GB. Different region: US$0.10 per GB | Data is read, filtered, and then transferred to the destination bucket. This is an on-demand or scheduled operation. Data transfer fees apply.
| ## Next steps¶ - [ Sign up for a free Tinybird account and follow the quick start](https://www.tinybird.co/docs/docs/quick-start) . - Get the most from your Workspace, for free: Learn more about[ using the Playground and Time Series](https://www.tinybird.co/docs/docs/query/overview#use-the-playground) . - Explore different[ Tinybird plans](https://www.tinybird.co/docs/docs/plans) and find the right one for you. Tinybird is not affiliated with, associated with, or sponsored by ClickHouse, Inc. ClickHouse® is a registered trademark of ClickHouse, Inc. --- URL: https://www.tinybird.co/docs/support/limits Last update: 2024-10-23T12:59:27.000Z Content: --- title: "Limits · Tinybird Docs" theme-color: "#171612" description: "Tinybird has limits on certain operations and processes to ensure the highest performance." --- # Limits¶ Tinybird has limits on certain operations and processes to ensure the highest performance. ## Workspace limits¶ | Description | Limit | | --- | --- | | Number of Workspaces | Default 90 (soft limit; ask to increase) | | Number of seats | Default 90 (soft limit; ask to increase) | | Number of Data Sources | Default 100 (soft limit; ask to increase) | | Number of Tokens | 100,000 (If you need more you should take a look at[ JWT tokens](https://www.tinybird.co/docs/docs/concepts/auth-tokens#json-web-tokens-jwts) ) | | Number of secrets | 100 | | Queries per second | Default 20 (Soft limit. Contact Tinybird Support to increase it.) | See [Rate limits for JWTs](https://www.tinybird.co/docs/docs/concepts/auth-tokens#json-web-tokens-jwts) for more detail specifically on JWT limits. ## Ingestion limits¶ | Description | Limit | | --- | --- | | Data Source max columns | 500 | | Full body upload | 8MB | | Multipart upload - CSV and NDJSON | 500MB | | Multipart upload - Parquet | 50MB | | Max file size - Parquet - Build plan | 1GB | | Max file size - Parquet - Pro and Enterprise plan | 5GB | | Max file size (uncompressed) - Build plan | 10GB | | Max file size (uncompressed) - Pro and Enterprise plan | 32GB | | Kafka topics | Default 5 (soft limit; ask to increase) | | Max parts created at once - NDJSON/Parquet jobs and Events API | 12 | ### Ingestion limits (API)¶ Tinybird throttles requests based on capacity. If your queries are using 100% of the available resources, you might not be able to run more queries until the running ones finish. | Description | Limit and time window | | --- | --- | | Request size - Events API | 10MB | | Response size | 100MB | | Create Data Source from schema | 25 times per minute | | Create Data Source from file or URL* | 5 times per minute | | Append data to Data Source* | 5 times per minute | | Append data to Data Source using v0/events | 1,000 times per second | | Replace data in a Data Source* | 5 times per minute | - The quota is shared at the Workspace level when creating, appending, or replacing data. For example, you can't do 5 requests of each type per minute, for a total of 15 requests. You can do at most a grand total of 5 requests of those types combined. The number of rows in append requests does not impact the ingestion limit; each request counts as a single ingestion. If you exceed your rate limit, your request will be throttled and you will receive *HTTP 429 Too Many Requests* response codes from the API. Each response contains a set of HTTP headers with your current rate limit status. | Header Name | Description | | --- | --- | | `X-RateLimit-Limit` | The maximum number of requests you're permitted to make in the current limit window.
| | `X-RateLimit-Remaining` | The number of requests remaining in the current rate limit window. | | `X-RateLimit-Reset` | The time in seconds until the current rate limit window resets. | | `Retry-After` | The time to wait before making another request. Only present on 429 responses. | ### BigQuery Connector limits¶ The import jobs run in a pool, with capacity for up to 2 concurrent jobs. If more scheduled jobs overlap, they're queued. | Description | Limit and time window | | --- | --- | | Maximum frequency for the scheduled jobs | 5 minutes | | Maximum rows per append or replace | 50 million rows. Exports that exceed this number of rows are truncated to this amount | You can't pause a Data Source with an ongoing import. You must wait for the import to finish before pausing the Data Source. ### DynamoDB Connector limits¶ | Description | Limit and time window | | --- | --- | | Storage | 500 GB | | Throughput | 250 Write Capacity Units (WCU), equivalent to 250 writes of at most 1 KB per second | ### Snowflake Connector limits¶ The import jobs run in a pool, with capacity for up to 2 concurrent jobs. If more scheduled jobs overlap, they're queued. | Description | Limit and time window | | --- | --- | | Maximum frequency for the scheduled jobs | 5 minutes | | Maximum rows per append or replace | 50 million rows. Exports that exceed this number of rows are truncated to this amount | You can't pause a Data Source with an ongoing import. You must wait for the import to finish before pausing the Data Source. ## Query limits¶ | Description | Limit | | --- | --- | | SQL length | 8KB | | Result length | 100 MB | | Query execution time | 10 seconds | If you exceed your rate limit, your request will be throttled and you will receive *HTTP 429 Too Many Requests* response codes from the API. Each response contains a set of HTTP headers with your current rate limit status. | Header Name | Description | | --- | --- | | `X-RateLimit-Limit` | The maximum number of requests you're permitted to make in the current limit window. | | `X-RateLimit-Remaining` | The number of requests remaining in the current rate limit window. | | `X-RateLimit-Reset` | The time in seconds until the current rate limit window resets. | | `Retry-After` | The time to wait before making another request. Only present on 429 responses. | ### Query timeouts¶ If query execution time exceeds the default limit of 10 seconds, an error message appears. Long execution times hint at issues that need to be fixed in the query or the Data Source schema. To avoid query timeouts, optimize your queries to remove inefficiencies and common mistakes. See [Optimizations](https://www.tinybird.co/docs/docs/guides/optimizations/overview) for advice on how to detect and solve issues in your queries that might cause timeouts. If you still need to increase the timeout limit, contact support. See [Get help](https://www.tinybird.co/docs/docs/support/overview#get-help). Only paid accounts can raise the timeout limit. ## Publishing limits¶ ### Materialized Views limits¶ There are no numerical limits, but certain operations are [inadvisable when using Materialized Views](https://www.tinybird.co/docs/docs/publish/materialized-views#limitations).
### Sink limits¶ Sink Pipes have the following limits, depending on your billing plan: | Plan | Sink Pipes per Workspace | Execution time | Frequency | Memory usage per query | Active jobs (running or queued) | | --- | --- | --- | --- | --- | --- | | Pro | 3 | 30s | Up to every 10 min | 10 GB | 3 | | Enterprise | 10 | 300s | Up to every minute | 10 GB | 6 | ### Copy Pipe limits¶ Copy Pipes have the following limits, depending on your billing plan: | Plan | Copy Pipes per Workspace | Execution time | Frequency | Active jobs (running or queued) | | --- | --- | --- | --- | --- | | Build | 1 | 20s | Once an hour | 1 | | Pro | 3 | 30s | Up to every 10 minutes | 3 | | Enterprise | 10 | 50% of the scheduling period, 30 minutes max | Up to every minute | 6 | ## Delete limits¶ Delete jobs have the following limits, depending on your billing plan: | Plan | Active delete jobs per Workspace | | --- | --- | | Build | 1 | | Pro | 3 | | Enterprise | 6 | ## Next steps¶ - Understand how Tinybird[ plans and billing work](https://www.tinybird.co/docs/docs/support/billing) . - Explore popular use cases for user-facing analytics (like dashboards) in Tinybird's[ Use Case Hub](https://www.tinybird.co/docs/docs/use-cases) . --- URL: https://www.tinybird.co/docs/support/overview Last update: 2024-11-07T09:52:34.000Z Content: --- title: "Support · Tinybird Docs" theme-color: "#171612" description: "Tinybird is here to help." --- # Support¶ Read on to learn how to get help. ## Troubleshooting¶ Tinybird tries to give you direct feedback and notifications if it spots anything going wrong. Use Tinybird's [Service Data Sources](https://www.tinybird.co/docs/docs/monitoring/service-datasources) to get more details on what's going on under the hood in your data and queries. ## Recover deleted items¶ Tinybird creates and backs up daily snapshots and retains them for 7 days. If you deleted something by mistake and need to recover it, open a ticket and our support team will assist you. ## FAQs¶ There are a handful of common questions that get asked. In no particular order, these are: ### How do I... iterate a Data Source? (Change or update my schema)¶ If you're using version control to manage your Workspace, you iterate the Data Source using the version control process. We recommend you read the docs on [version control](https://www.tinybird.co/docs/docs/production/working-with-version-control) and then explore the GitHub repo with [examples for iterating different things](https://github.com/tinybirdco/use-case-examples) to understand the process. Be careful: Iterating Data Sources often needs a careful approach to [backfilling data](https://www.tinybird.co/docs/docs/production/backfill-strategies#the-challenge-of-backfilling-real-time-data). If you're **not** using version control to manage your Workspace, follow the [Iterating a Data Source](https://www.tinybird.co/docs/docs/guides/ingesting-data/iterate-a-data-source) Guide. ### How do I... change a sorting key or a type?¶ Both of these changes require you to iterate the Data Source (see above). ### How do I... copy/move data?¶ You have two options when it comes to moving or copying data. 1. [ Copy Pipes](https://www.tinybird.co/docs/docs/publish/copy-pipes) : the recommended option. Tinybird provides Copy Pipes purely to make copying data between Data Sources easier. 2. [ Materialized Views](https://www.tinybird.co/docs/docs/publish/materialized-views/overview) : the legacy way to copy data.
You can either create a materialization over a Data Source with a compatible schema, or you can create a new materialization. Once created and linked, a populate operation copies all the existing data from the origin to the destination, and each new ingest into the origin is also written to the destination. ### How do I... recover data from quarantine?¶ The quickest way to recover rows from quarantine is to fix the cause of the errors and then re-ingest the data. However, that is not always possible. Read the [docs on recovering data from quarantine](https://www.tinybird.co/docs/docs/guides/ingesting-data/recover-from-quarantine#recovering-rows-from-quarantine). You can also use the [Service Data Sources](https://www.tinybird.co/docs/docs/monitoring/service-datasources) , especially `datasources_ops_log` . There is also a `Log` tab on each of your Data Sources in the UI, which displays the information from `datasources_ops_log`. ### How do I... know if there was an error on an API Endpoint?¶ Your Tinybird API Endpoints return standard HTTP success or error codes. For errors, the response also includes extra information about what went wrong, encoded in the response as JSON. See the docs on [API errors](https://www.tinybird.co/docs/docs/publish/api-endpoints/overview#errors-and-retries) for more information. ### How do I... know how much I'm consuming and how much I'm going to pay?¶ Monitoring your usage is really helpful. Use the [Service Data Sources](https://www.tinybird.co/docs/docs/monitoring/service-datasources) and, if you're part of an Enterprise account, check your [Consumption overview in the Organizations UI](https://www.tinybird.co/docs/docs/monitoring/organizations#consumption-overview). ## Get help¶ If you haven't been able to solve the issue, or it looks like there is a problem on Tinybird's side, get in touch. You can always contact Tinybird at [support@tinybird.co](https://www.tinybird.co/docs/mailto:support@tinybird.co) or in the [Community Slack](https://www.tinybird.co/docs/docs/community). If you have an Enterprise account with Tinybird, contact us using your shared Slack channel. --- URL: https://www.tinybird.co/docs/support/syntax Last update: 2024-10-17T14:29:53.000Z Content: --- title: "Supported syntax · Tinybird Docs" theme-color: "#171612" description: "Tinybird supports the following statements and functions in queries." --- # Supported syntax¶ Tinybird supports the following statements and functions in queries. ## SQL statements¶ The only statement you can use in Tinybird's queries is `SELECT` . The SQL clauses for `SELECT` are fully supported. All other SQL statements are handled by Tinybird's features. ## ClickHouse® functions¶ You can use most functions from the latest version of ClickHouse. See [ClickHouse](https://www.tinybird.co/docs/docs/core-concepts#clickhouse).
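For instance, the following is a minimal sketch that mixes standard SQL with ClickHouse date and aggregate functions (the `orders` Data Source and its `timestamp` column are assumptions borrowed from the quick start sample, not a documented schema):

```sql
-- Sketch: ClickHouse functions such as toStartOfDay, toHour, and uniq can be
-- used directly in a Tinybird SELECT node.
SELECT
    toStartOfDay(timestamp) AS day,
    count() AS orders_count,
    uniq(toHour(timestamp)) AS active_hours
FROM orders
GROUP BY day
ORDER BY day DESC
LIMIT 7
```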
### Unsupported functions¶ The following functions aren't supported and don't work in Tinybird: - `FQDN` - `addressToLine` - `addressToLineWithInlines` - `addressToSymbol` - `azureBlobStorage` - `azureBlobStorageCluster` - `buildId` - `catboostEvaluate` - `cosn` - `currentDatabase` - `currentProfiles` - `currentRoles` - `currentSchemas` - `currentUser` - `current_database` - `current_schemas` - `current_user` - `database` - `defaultProfiles` - `defaultRoles` - `deltaLake` - `demangle` - `dictionary` - `displayName` - `enabledProfiles` - `enabledRoles` - `executable` - `file` - `fileCluster` - `filesystemAvailable` - `filesystemCapacity` - `filesystemFree` - `filesystemUnreserved` - `fullHostName` - `gcs` - `generateRandomStructure` - `getClientHTTPHeader` - `getMacro` - `getOSKernelVersion` - `getServerPort` - `getSetting` - `globalVariable` - `hasColumnInTable` - `hasThreadFuzzer` - `hdfs` - `hdfsCluster` - `hive` - `hostName` - `hostname` - `hudi` - `iceberg` - `indexHint` - `initialQueryID` - `initial_query_id` - `input` - `jdbc` - `JSONRemoveDynamoDBAnnotations` - `logTrace` - `loop` - `meiliMatch` - `meilisearch` - `merge` - `mergeTreeIndex` - `mongodb` - `odbc` - `oss` - `redis` - `remote` - `remoteSecure` - `reverseDNSQuery` - `revision` - `s3` - `s3Cluster` - `SCHEMA` - `serverUUID` - `shardCount` - `shardNum` - `showCertificate` - `sleep` - `sleepEachRow` - `sqlite` - `tcpPort` - `tid` - `unnestDynamoDBStructure` - `uptime` - `urlCluster` - `user` - `version` - `view` - `viewExplain` - `viewIfPermitted` - `zookeeperSessionUptime` ### Private beta¶ Tinybird supports the following ClickHouse table functions upon request: - `mysql` - `url` ## ClickHouse settings¶ Tinybird supports the following ClickHouse settings: - `aggregate_functions_null_for_empty` - `join_use_nulls` - `group_by_use_nulls` - `join_algorithm` - `date_time_output_format` Tinybird is not affiliated with, associated with, or sponsored by ClickHouse, Inc. ClickHouse® is a registered trademark of ClickHouse, Inc. --- URL: https://www.tinybird.co/docs/use-cases Last update: 2024-10-03T12:47:27.000Z Content: --- title: "Use Cases · Tinybird Docs" theme-color: "#171612" description: "Build a user-facing analytics project with these end-to-end tutorials, videos, and blog posts." --- # Use Case Hub¶ Tinybird is the data platform for building real-time, user-facing analytics. This Use Case Hub provides centralized resources to help you understand and build specific solutions for your project. ## Examples¶ The top 5 most popular ways people are using Tinybird to build user-facing analytics features: - Want to provide a real-time, beautiful dashboard for your users? -->[ User-facing dashboards](https://www.tinybird.co/docs/docs/use-cases/user-facing-dashboards) . - Need to build a change data capture (CDC) system for your database? -->[ Real-time Change Data Capture (CDC)](https://www.tinybird.co/docs/docs/use-cases/realtime-cdc) . - Looking to level up your game with leaderboards, match-making, and personalized in-game ads? -->[ Gaming analytics](https://www.tinybird.co/docs/docs/use-cases/gaming-analytics) . - Want to build a way to monitor and analyze your web traffic? -->[ Web analytics](https://www.tinybird.co/docs/docs/use-cases/web-analytics) . - Already know that the secret to amazing UX in your app is personalization? -->[ Real-time personalization](https://www.tinybird.co/docs/docs/use-cases/realtime-personalization) . - Need a way for your content creators to analyze their content performance? 
-->[ User-generated content (UGC) analytics](https://www.tinybird.co/docs/docs/use-cases/ugc-analytics) . - Need to build a content recommendation system? -->[ Content recommendation systems](https://www.tinybird.co/docs/docs/use-cases/content-recommendation) . - Want to learn how to build a vector search system using real-time data? -->[ Vector search](https://www.tinybird.co/docs/docs/use-cases/vector-search) . Explore the nav for more. ## Not sure where to start?¶ Read the [Tinybird "Definitive Guide To Real-Time User-Facing Analytics" blog post](https://www.tinybird.co/blog-posts/user-facing-analytics). ## Customer stories¶ Learn how Tinybird customers have built user-facing analytics. - [ ](https://www.tinybird.co/case-studies/audiense) Audiense evolves its social media timeline with Tinybird - [ ](https://www.tinybird.co/case-studies/canva) Canva designs brilliant user experiences with Tinybird - [ ](https://www.tinybird.co/case-studies/dub) Dub shortens time to market for user-facing analytics with Tinybird - [ ](https://www.tinybird.co/case-studies/factorial-builds-real-time-data-products-with-tinybird) Factorial builds real-time data products with Tinybird - [ ](https://www.tinybird.co/case-studies/fanduel) FanDuel harnesses the power of real-time data using Tinybird - [ ](https://www.tinybird.co/case-studies/genially-uses-tinybird-to-display-realtime-metrics) Genially uses Tinybird to display real-time content interaction metrics for their customers - [ ](https://www.tinybird.co/case-studies/smartme-analytics) Smartme Analytics builds complete consumer insights with Tinybird - [ ](https://www.tinybird.co/case-studies/vercel-relies-on-tinybird-to-power-their-realtime-user-facing-analytics) Vercel uses Tinybird to help developers ship code faster --- URL: https://www.tinybird.co/docs/use-cases/content-recommendation Last update: 2024-09-30T12:19:50.000Z Content: --- title: "Content recommendation · Tinybird Docs" theme-color: "#171612" description: "Recommend useful or related content in real-time using SQL." --- # Content recommendation¶ Content recommendation systems can use real-time analytics or vector search approaches to show a web visitor or platform user content they're likely to enjoy based on similar content. Learn more about content recommendation systems and how to build them with Tinybird. ## Tutorials¶ - [ ](https://www.tinybird.co/docs/docs/guides/tutorials/vector-search-recommendation) Build a content recommendation API using vector search ## Blog posts¶ Learn more about content recommendation on the Tinybird Blog. --- URL: https://www.tinybird.co/docs/use-cases/gaming-analytics Last update: 2024-09-18T17:42:03.000Z Content: --- title: "Gaming analytics · Tinybird Docs" theme-color: "#171612" description: "Embed analytics in games." --- # Gaming analytics¶ From leaderboards, to match-making, to serving personalized in-game ads, data is at the core of modern gaming. Tinybird can help you ship user-facing analytics features in your games, faster. ## Tutorials¶ - [ ](https://www.tinybird.co/docs/docs/guides/tutorials/leaderboard) Build a real-time game leaderboard ## Demo¶ Want to see it in action? Check out our Tiny Flappybird demo: - [ Tiny Flappybird Demo](https://flappy.tinybird.co/) - [ Demo code](https://github.com/tinybirdco/flappy-tinybird) ## Videos¶ Watch screencasts and free training workshops that build user-facing analytics.
[](/docs/live/build-real-time-leaderboards-june-2024)![Free Workshop: Build Real-Time Leaderboards](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fassets%2Flive%2Fflappy-bird.jpeg&w=750&q=75) Free Workshop: Build Real-Time Leaderboards [](/docs/live/user-segmentation-at-1m-events-per-second)![User segmentation at 1M events per second](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fassets%2Flive%2Fuser-segmentation-at-1m-events-per-second.png&w=750&q=75) User segmentation at 1M events per second --- URL: https://www.tinybird.co/docs/use-cases/realtime-cdc Last update: 2024-10-03T12:50:26.000Z Content: --- title: "Real-time Change Data Capture · Tinybird Docs" theme-color: "#171612" description: "Build real-time CDC systems using fresh, accurate data from your favorite databases." --- # Change data capture¶ Change Data Capture (CDC) is a process used in databases to track and capture any changes made to data, such as inserts, updates, or deletes. It enables the identification and recording of modifications in real-time or near real-time, ensuring that downstream systems can be kept in sync with the primary database without having to constantly query the entire dataset. This is achieved by monitoring the transaction logs of the database, allowing CDC to capture only the changes rather than reprocessing all data. Tinybird helps you quickly build applications on top of your CDC data. When you send your change data to Tinybird, you are able to unify your CDC data with other data sources and transform it using SQL, and with one click or command line instruction, convert your SQL queries into high-concurrency, low-latency REST APIs. ## Tutorials¶ - [ ](https://www.tinybird.co/docs/docs/guides/querying-data/lambda-example-cdc) Change Data Capture project with lambda architecture - [ ](https://www.tinybird.co/docs/docs/guides/querying-data/deduplication-strategies) Deduplicate data in your Data Source ## Videos¶ Watch screencasts and free training workshops for building CDC systems. [](/docs/live/mongodb-cdc-confluent-tinybird-sep-2024)![Real-time MongoDB CDC with Confluent and Tinybird](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fassets%2Flive%2Fmongodb-cdc-confluent-tinybird-sep-2024.png&w=750&q=75) Real-time MongoDB CDC with Confluent and Tinybird [](/docs/live/dynamodb-cdc-july-2024)![Capturing DynamoDB Change Streams for Real-Time Analytics](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fassets%2Flive%2Fdynamodb-cdc-july-2024.png&w=750&q=75) Capturing DynamoDB Change Streams for Real-Time Analytics ## Blog posts¶ Learn more about CDC on the Tinybird Blog. - [ From CDC to real-time analytics with Tinybird and Estuary](https://www.tinybird.co/blog-posts/cdc-real-time-analytics-estuary-dekaf) - [ A practical guide to real-time CDC with MySQL](https://www.tinybird.co/blog-posts/mysql-cdc) - [ A practical guide to real-time CDC with Postgres](https://www.tinybird.co/blog-posts/postgres-cdc) - [ A practical guide to real-time CDC with MongoDB](https://www.tinybird.co/blog-posts/mongodb-cdc) --- URL: https://www.tinybird.co/docs/use-cases/realtime-personalization Last update: 2024-05-22T09:05:57.000Z Content: --- title: "Real-time personalization · Tinybird Docs" theme-color: "#171612" description: "Personalize your user experience in real time." --- # Real-time personalization¶ Real-time personalization is the pathway to better user experiences. 
This user-centered approach powers a vast number of modern customer experiences, and is fast becoming the standard of digital engagement. Not only can you provide meaningfully tailored recommendations, ads, or products, but you can also leverage real-time personalization to analyze transaction streams for digital fraud, and more. ## Blog posts¶ Learn more about user-facing analytics on the Tinybird Blog. - [ Using Tinybird for real-time marketing at Tinybird](https://www.tinybird.co/blog-posts/tinybird-for-real-time-marketing) - [ Real-time Personalization: Choosing the right tools](https://www.tinybird.co/blog-posts/real-time-personalization) - [ How to build a real-time fraud detection system](https://www.tinybird.co/blog-posts/how-to-build-a-real-time-fraud-detection-system) - [ Building real-time solutions with Snowflake at a fraction of the cost](https://www.tinybird.co/blog-posts/real-time-solutions-with-snowflake) - [ Designing a faster data model to personalize browsing in real time](https://www.tinybird.co/blog-posts/clickhouse-query-optimization) --- URL: https://www.tinybird.co/docs/use-cases/ugc-analytics Last update: 2024-05-22T09:34:43.000Z Content: --- title: "User-generated content · Tinybird Docs" theme-color: "#171612" description: "Provide your content creators with insights into their content performance." --- # User-generated content (UGC) analytics¶ For user-generated content platforms, creators are everything. Creators need to know how their content is performing, so they can make better decisions about what to create next. Tinybird can help you ship creator insights, faster. ## Demo¶ Want to see it in action? Check out our UGC analytics demo built with Mux: - [ Multi-tenant user-facing analytics with Tinybird and Mux](https://github.com/tinybirdco/demo-user-generated-content-analytics) ## Videos¶ Watch screencasts and free training workshops that build user-facing content analytics. [](/docs/live/ugc-analytics-mux-tinybird)![Building UGC Analytics with Mux and Tinybird](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fassets%2Flive%2Fugc-analytics-mux-tinybird.png&w=750&q=75) Building UGC Analytics with Mux and Tinybird ## Blog posts¶ - [ Upgrading short link analytics by 100x with Steven Tey](https://www.tinybird.co/blog-posts/upgrading-short-link-analytics-by-100x-with-steven-tey) - [ Low-code analytics with James Devonport of UserLoop](https://www.tinybird.co/blog-posts/low-code-analytics-with-james-devonport-of-userloop) --- URL: https://www.tinybird.co/docs/use-cases/user-facing-dashboards Last update: 2024-05-22T09:05:57.000Z Content: --- title: "User-facing dashboards · Tinybird Docs" theme-color: "#171612" description: "Build user-facing dashboards." --- # User-facing dashboards¶ Users love insights. Give your users a real-time data analytics dashboard so they can monitor what's happening (plus how, when, and where), as it happens. Tinybird can help you implement user-facing dashboards, faster. ## Tutorials¶ - [ ](https://www.tinybird.co/docs/docs/guides/tutorials/real-time-dashboard) Build a real-time dashboard with Tremor & Next.js - [ ](https://www.tinybird.co/docs/docs/guides/tutorials/bigquery-dashboard) Build user-facing dashboard with BigQuery - [ ](https://www.tinybird.co/docs/docs/guides/tutorials/analytics-with-confluent) Build user-facing analytics apps with Confluent ## Videos¶ Watch screencasts and free training workshops that build user-facing dashboards.
[](/docs/live/build-a-real-time-dashboard-apr-16)![Build User-Facing Analytics with Kafka, Tinybird, and React](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fassets%2Flive%2Fbuild-a-real-time-dashboard-apr-16.png&w=750&q=75) Build User-Facing Analytics with Kafka, Tinybird, and React [](/docs/live/kafka-real-time-dashboard)![Build a Real-Time Dashboard over Kafka](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fassets%2Flive%2Fkafka-real-time-dashboard.png&w=750&q=75) Build a Real-Time Dashboard over Kafka [](/docs/live/ugc-analytics-mux-tinybird)![Building UGC Analytics with Mux and Tinybird](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fassets%2Flive%2Fugc-analytics-mux-tinybird.png&w=750&q=75) Building UGC Analytics with Mux and Tinybird ## Blog posts¶ Learn more about user-facing analytics on the Tinybird Blog. - [ Building real-time leaderboards with Tinybird](https://www.tinybird.co/blog-posts/building-real-time-leaderboards-with-tinybird) - [ JWTs for API Endpoints now in public beta!](https://www.tinybird.co/blog-posts/jwt-api-endpoints-public-beta) - [ 7 tips to make your dashboards faster](https://www.tinybird.co/blog-posts/7-tips-to-make-your-dashboards-faster) - [ Build a real-time dashboard over BigQuery](https://www.tinybird.co/blog-posts/bigquery-real-time-dashboard) - [ Real-time dashboards: Are they worth it?](https://www.tinybird.co/blog-posts/real-time-dashboards-are-they-worth-it) - [ Real-time Data Visualization: How to build faster dashboards](https://www.tinybird.co/blog-posts/real-time-data-visualization) - [ How Typeform Built a Fully Functional User Dash With Tinybird](https://www.tinybird.co/blog-posts/typeform-utm-realtime-analytics) --- URL: https://www.tinybird.co/docs/use-cases/vector-search Last update: 2024-09-30T12:19:50.000Z Content: --- title: "Vector search · Tinybird Docs" theme-color: "#171612" description: "Use SQL functions to calculate vector distances and search large content tables." --- # Vector search¶ Vector search allows you to search through multi-modal content based on calculated embeddings. It is a popular approach to search when keyword matching is insufficient. Tinybird can help you search largescale vector embeddings in real-time and apply real-time analytics principles to make vector search more performant. ## Tutorials¶ - [ ](https://www.tinybird.co/docs/docs/guides/tutorials/vector-search-recommendation) Build a content recommendation API using vector search --- URL: https://www.tinybird.co/docs/use-cases/web-analytics Last update: 2024-05-13T16:10:11.000Z Content: --- title: "Web analytics · Tinybird Docs" theme-color: "#171612" description: "Monitor and analyze web traffic." --- # Web analytics¶ Get the data you need about your users, while satisfying data privacy laws, and without compromising on speed or scale. Track, analyze, and report on website visits, marketing conversions, ad-generated revenue, and more. Tinybird can help you monitor and analyze your web traffic, faster. 
## Tutorials¶ - [ ](https://www.tinybird.co/docs/docs/guides/tutorials/user-facing-web-analytics) Build a user-facing web analytics dashboard - [ ](https://www.tinybird.co/docs/docs/guides/tutorials/bigquery-dashboard) Build user-facing dashboard with BigQuery - [ ](https://www.tinybird.co/docs/docs/guides/tutorials/analytics-with-confluent) Build user-facing analytics apps with Confluent ## Videos¶ [](/docs/live/google-analytics-free)![Build a GDPR-compliant alternative to Google Analytics](https://www.tinybird.co/docs/docs/_next/image?url=%2Fdocs%2Fassets%2Flive%2Fgoogle-analytics-free.png&w=750&q=75) Build a GDPR-compliant alternative to Google Analytics ## Blog posts¶ - [ Multi-tenant analytics for SaaS applications](https://www.tinybird.co/blog-posts/multi-tenant-saas-options) - [ Building privacy-first native app telemetry with Guilherme Oenning](https://www.tinybird.co/blog-posts/dev-qa-guilherme-oenning) - [ Developer Q&A: Monitoring global API latency with chronark](https://www.tinybird.co/blog-posts/dev-qa-global-api-latency-chronark) - [ Developer Q&A with JR the Builder, co-creator of Beam Analytics](https://www.tinybird.co/blog-posts/qa-with-jr-the-builder-beam-analytics) - [ Looking for an open source Google Analytics alternative? Set one up in 3 minutes.](https://www.tinybird.co/blog-posts/google-analytics-alternative-in-3-minutes) - [ How I replaced Google Analytics with Retool and Tinybird, Part 2](https://www.tinybird.co/blog-posts/how-i-replaced-google-analytics-with-retool-and-tinybird-part-2) - [ How an eCommerce giant replaced Google Analytics for privacy and speed](https://www.tinybird.co/blog-posts/ecommerce-google-analytics-alternative) - [ How I replaced Google Analytics with Retool and Tinybird, Part 1](https://www.tinybird.co/blog-posts/how-i-replaced-google-analytics-with-retool-and-tinybird-part-1) - [ A privacy-first approach to building a Google Analytics alternative](https://www.tinybird.co/blog-posts/privacy-first-google-analytics-alternative) ---