Copy pipes¶
Copy pipes capture the result of a pipe at a moment in time and write the result into a target data source. They can be run on a schedule, or executed on demand.
Use copy pipes for:
- Event-sourced snapshots, such as change data capture (CDC).
- Copy data from Tinybird to another location in Tinybird to experiment.
- De-duplicate with snapshots.
Copy pipes should not be confused with Materialized views. Materialized views continuously re-evaluate a query as new events are inserted, while copy pipes create a single snapshot at a given point in time.
Create a copy pipe¶
Copy pipes are defined in a .pipe file, including defining nodes that contain your SQL queries. See .pipe files for more information.
In the .pipe file you define the queries that filter and transform the data as needed. The final result of all queries is the result that you want to write into a data source.
The file must contain a TYPE COPY
node that defines which node contains the final result. To do this, include the following parameters at the end of a node:
TYPE COPY TARGET_DATASOURCE datasource_name COPY_SCHEDULE --(optional) a cron expression or @on-demand. If not defined, it would default to @on-demand. COPY_MODE append --(Optional) The strategy to ingest data for copy jobs. One of `append` or `replace`, if empty the default strategy is `append`.
There can be only one copy node per pipe, and no other outputs, such as materialized views or API endpoints.
Schedule a copy pipe¶
You can schedule copy pipes to run at a specific time using a cron expression. To schedule a copy pipe, configure COPY_SCHEDULE
with a cron expression. On-demand copy pipes are defined by configuring COPY_SCHEDULE
with the value @on-demand
.
Here is an example of a copy pipe scheduled to run every hour and that writes the results of a query into the sales_hour_copy
data source:
NODE daily_sales SQL > % SELECT toStartOfDay(starting_date) day, country, sum(sales) as total_sales FROM teams WHERE day BETWEEN toStartOfDay({{DateTime(job_timestamp)}}) - interval 1 day AND toStartOfDay({{DateTime(job_timestamp)}}) and country = {{ String(country, 'US')}} GROUP BY day, country TYPE COPY TARGET_DATASOURCE sales_hour_copy COPY_SCHEDULE 0 * * * *
Before pushing the copy pipe to your workspace, make sure that the target data source already exists and has a schema that matches the output of the query result.
All schedules are executed in the UTC time zone. If you are configuring a schedule that runs at a specific time, be careful to consider that you will need to convert the desired time from your local time zone to UTC.
List your copy pipes¶
Use the tb copy ls
command to list all your copy pipes. See tb copy.
Run, pause, or resume a copy pipe¶
Use the tb copy
command to run, pause, or resume a copy pipe. See tb copy.
You can run tb job ls
to see any running jobs, as well as any jobs that have finished during the last 48 hours.
Next steps¶
- Learn about Materialized views.
- Publish your data with API endpoints.