Copy pipes

Copy pipes capture the result of a pipe at a moment in time and write the result into a target data source. They can be run on a schedule, or executed on demand.

Use copy pipes for:

  • Event-sourced snapshots, such as change data capture (CDC).
  • Copy data from Tinybird to another location in Tinybird to experiment.
  • De-duplicate with snapshots.

Copy pipes should not be confused with Materialized views. Materialized views continuously re-evaluate a query as new events are inserted, while copy pipes create a single snapshot at a given point in time.

This is an experimental version of Tinybird. Join #forward in our Slack community to share your feedback.

Create a copy pipe

Copy pipes are defined in a .pipe file, including defining nodes that contain your SQL queries. See .pipe files for more information.

In the .pipe file you define the queries that filter and transform the data as needed. The final result of all queries is the result that you want to write into a data source.

The file must contain a TYPE COPY node that defines which node contains the final result. To do this, include the following parameters at the end of a node:

TYPE COPY
TARGET_DATASOURCE datasource_name
COPY_SCHEDULE --(optional) a cron expression or @on-demand. If not defined, it would default to @on-demand.
COPY_MODE append --(Optional) The strategy to ingest data for copy jobs. One of `append` or `replace`, if empty the default strategy is `append`.

There can be only one copy node per pipe, and no other outputs, such as materialized views or API endpoints.

Schedule a copy pipe

You can schedule copy pipes to run at a specific time using a cron expression. To schedule a copy pipe, configure COPY_SCHEDULE with a cron expression. On-demand copy pipes are defined by configuring COPY_SCHEDULE with the value @on-demand.

Here is an example of a copy pipe scheduled to run every hour and that writes the results of a query into the sales_hour_copy data source:

NODE daily_sales
SQL >
    %
    SELECT toStartOfDay(starting_date) day, country, sum(sales) as total_sales
    FROM teams
    WHERE
    day BETWEEN toStartOfDay({{DateTime(job_timestamp)}}) - interval 1 day AND toStartOfDay({{DateTime(job_timestamp)}})
    and country = {{ String(country, 'US')}}
    GROUP BY day, country

TYPE COPY
TARGET_DATASOURCE sales_hour_copy
COPY_SCHEDULE 0 * * * *

Before pushing the copy pipe to your workspace, make sure that the target data source already exists and has a schema that matches the output of the query result.

All schedules are executed in the UTC time zone. If you are configuring a schedule that runs at a specific time, be careful to consider that you will need to convert the desired time from your local time zone to UTC.

List your copy pipes

Use the tb copy ls command to list all your copy pipes. See tb copy.

Run, pause, or resume a copy pipe

Use the tb copy command to run, pause, or resume a copy pipe. See tb copy.

You can run tb job ls to see any running jobs, as well as any jobs that have finished during the last 48 hours.

Next steps

Updated