Kafka Sink

Kafka Sinks are currently in private beta. If you have any feedback or suggestions, contact Tinybird at support@tinybird.co or in the Community Slack.

Tinybird's Kafka Sink allows you to push the results of a query to a Kafka topic. Queries can be executed on a defined schedule or on-demand.

Common uses for the Kafka Sink include:

  • Pushing events to Kafka as part of an event-driven architecture.
  • Exporting data to other systems that consume data from Kafka.
  • Hydrating a data lake or data warehouse with real-time data.

Prerequisites

To use the Kafka Sink, you need a Kafka cluster that Tinybird can reach over the public internet, or through private networking if you're an Enterprise customer.

Configure using the UI

1. Create a Pipe and promote it to Sink Pipe

In the Tinybird UI, create a Pipe and write the query that produces the result you want to export. In the top right "Create API Endpoint" menu, select "Create Sink". In the modal, choose the destination (Kafka).

2. Choose the scheduling options

You can configure your Sink to run on a schedule by providing a cron expression, or set it to on-demand to trigger runs manually.

3. Configure destination topic

Enter the Kafka topic where events are going to be pushed.

4. Preview and create

The final step is to check and confirm that the preview matches what you expect.

🎉 Congratulations! You've created your first Sink.

Configure using the CLI

1. Create the Kafka Connection

Run the tb connection create kafka command, and follow the instructions.
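If you prefer to skip the interactive prompts, the connection details can be passed as flags. The values below are placeholders, and the exact flags may vary by CLI version; run tb connection create kafka --help to confirm them:

    tb connection create kafka \
        --bootstrap-servers my-cluster.example.com:9092 \
        --key my-kafka-key \
        --secret my-kafka-secret \
        --connection-name test_kafka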

2. Create Kafka Sink Pipe

To create a Sink Pipe, create a regular .pipe file and, in the SQL section, select the data you want to export to your Kafka topic, as in any other Pipe. Then set the Pipe type to sink and add the required configuration. Your Pipe should have the following structure:

NODE node_0

SQL >
    SELECT *
    FROM events
    WHERE time >= toStartOfMinute(now()) - interval 30 minute

TYPE sink
EXPORT_SERVICE kafka
EXPORT_CONNECTION_NAME "test_kafka"
EXPORT_KAFKA_TOPIC "test_kafka_topic"
EXPORT_SCHEDULE "*/5 * * * *"

Pipe parameters

For this step, you will need to configure the following Pipe parameters:

  • EXPORT_CONNECTION_NAME (string): Required. The name of the connection to the destination service. This is the connection created in Step 1.
  • EXPORT_KAFKA_TOPIC (string): Required. The destination topic for the exported data.
  • EXPORT_SCHEDULE (string): A crontab expression that sets the frequency of the Sink operation, or the @on-demand string.

Once ready, push the datafile to your Workspace using tb push (or tb deploy if you are using version control) to create the Sink Pipe.
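For example, if the datafile above is saved as pipes/kafka_sink.pipe (a hypothetical path used here for illustration):

    tb push pipes/kafka_sink.pipe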

Scheduling

The configured schedule doesn't guarantee that the underlying job executes at exactly the scheduled time. When the scheduled time arrives, the job is placed into a queue; if the queue is busy, the job can be delayed and run after the scheduled time.

To reduce the chances of a busy queue affecting your Sink Pipe execution schedule, we recommend distributing jobs over a wider period of time rather than grouping them close together.
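For example, instead of scheduling three Sink Pipes at the same minute, stagger their EXPORT_SCHEDULE values across the hour (sink_a, sink_b, and sink_c are hypothetical Pipes):

    EXPORT_SCHEDULE "0 * * * *"   # sink_a: top of the hour
    EXPORT_SCHEDULE "20 * * * *"  # sink_b: 20 minutes past
    EXPORT_SCHEDULE "40 * * * *"  # sink_c: 40 minutes past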

For Enterprise customers, these settings can be customized. Reach out to your Customer Success team or email us at support@tinybird.co.

Query parameters

You can add query parameters to your Sink in the same way you do in API Endpoints or Copy Pipes.

For scheduled executions, the default values for the parameters will be used when the Sink runs.
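As a sketch, the node below uses Tinybird's templating syntax with a hypothetical hours parameter, assuming the events Data Source from the earlier example. A scheduled run exports the last hour (the default), while an on-demand run can override it:

    NODE node_0

    SQL >
        %
        SELECT *
        FROM events
        WHERE time >= now() - interval {{ Int32(hours, 1) }} hour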

Iterating a Kafka Sink (Coming soon)

Iterating features for Kafka Sinks are not yet supported in the beta. They are documented here for future reference.

Sinks can be iterated using version control, similar to other resources in your project. When you create a Branch, resources are cloned from the main Branch.

However, there are two considerations to understand for Kafka Sinks:

1. Schedules

When you create a Branch with an existing Kafka Sink, the resource is cloned into the new Branch, but it won't be scheduled. This prevents Branches from unintentionally running exports and consuming resources, as development Branches commonly don't need to export to external systems. If you want these queries to run in a Branch, you must recreate the Kafka Sink in the new Branch.

2. Connections

Connections are not cloned when you create a Branch. You need to create a new Kafka connection in the new Branch for the Kafka Sink.

Observability

Kafka Sink operations are logged in the tinybird.sinks_ops_log Service Data Source.
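For example, you can query the most recent Sink executions. This assumes the timestamp column that Service Data Sources share; check the Data Source schema in your Workspace for the full column list:

    SELECT *
    FROM tinybird.sinks_ops_log
    ORDER BY timestamp DESC
    LIMIT 10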

Limits & quotas

Check the limits page for limits on ingestion, queries, API Endpoints, and more.

Billing

Any Processed Data incurred by a Kafka Sink is charged at the standard rate for your account. This Processed Data is included in your plan's allowance and counts towards your commitment. If you're on an Enterprise plan, view your plan and commitment on the Organizations tab in the UI.

Next steps