Continuous integration and continuous deployment (CI/CD)

Once you connect your data project and Workspace through Git, you can implement a Continuous Integration (CI) and Continuous Deployment (CD) workflow to automate interaction with Tinybird.

This page covers how CI and CD work using a walkthrough example. CI/CD pipelines require the use of:

How continuous integration works

As you expand and iterate on your data projects, you can continuously validate your API Endpoints. In the same way that you write integration and acceptance tests for source code in a software project, you can write automated tests for your API Endpoints to run on each Pull or Merge request.

Continuous Integration can help with:

  • Linting: Syntax and formatting on datafiles.
  • Correctness: Making sure you can push your changes to a Tinybird Workspace.
  • Quality: Running fixture tests, data quality tests, or both to validate the changes in the Pull Request.
  • Regression: Running automatic regression tests to validate endpoint performance and data quality.

The following section uses the CI template, GitHub Actions, and the Tinybird CLI to demonstrate how to test your API Endpoints on any new commit to a Pull Request.

Set these optional environment variables to adapt your CI/CD workflow:

  • TB_VERSION_WARNING=0 : Don't print CLI version warning message if there's a new version available.
  • TB_SKIP_REGRESSION=0 : Skip regression tests.
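
As a minimal sketch, you could export both variables in the shell session (or in an equivalent environment block of your pipeline) before invoking the CLI. The values are the ones listed above; where you set them depends on your CI provider and is an assumption here.

```shell
# Silence the CLI "new version available" warning.
export TB_VERSION_WARNING=0
# Skip regression tests, using the value documented above.
export TB_SKIP_REGRESSION=0
```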

Building the CI/CD pipeline

This section demonstrates automating CI/CD pipelines using GitHub as the provider with a GitHub Action, but you can use any suitable platform. The examples on this page use Tinybird's CI and CD templates in this repository. You can find examples for GitLab in that repository as well.

You can use those example templates or build your own pipelines inspired by them. That way you can adapt them to suit your data project needs and integrate them better with the CI/CD workflow you use for other parts of your toolset.

These steps use the Tinybird CLI commands so you can fully reproduce the pipeline locally.

Remember to add a new secret with the Workspace administrator Token to the repository's settings to be able to run the needed commands from the CLI.

1. Trigger the CI workflow

Run the CI workflow on each commit to a Pull Request, when labelling, or with other kinds of triggers
name: Tinybird - CI Workflow

on:
   workflow_dispatch:
   pull_request:
     branches:
      - main
     types: [opened, reopened, labeled, unlabeled, synchronize, closed]

Key points: The CI workflow triggers when a Pull Request targeting main opens, reopens, synchronizes, or has its labels updated. On closed, it deletes the Tinybird Branch created for CI.
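
The trigger behavior can be sketched as a small dispatch function. This is only an illustration of the rule described above: the function name and the job labels `ci`, `cleanup`, and `ignored` are hypothetical and don't appear in the templates.

```shell
# Map a pull_request event action to the job the workflow would run.
job_for_action() {
  case "$1" in
    closed)
      echo "cleanup" ;;   # delete the Tinybird Branch created for CI
    opened|reopened|labeled|unlabeled|synchronize)
      echo "ci" ;;        # run the CI steps
    *)
      echo "ignored" ;;   # action isn't in the trigger list
  esac
}

job_for_action synchronize   # prints: ci
job_for_action closed        # prints: cleanup
```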

2. Configure the CI job

Use the workflow configuration defined in the uses reference
jobs:
   ci: # ci using Branches from Workspace 'web_analytics_starter_kit'
      uses: tinybirdco/ci/.github/workflows/ci.yml@main
      with:
         data_project_dir: .
      secrets:
         tb_admin_token: ${{ secrets.TB_ADMIN_TOKEN }}  # set Workspace admin Token in GitHub secrets
         tb_host: https://api.tinybird.co

You can combine the CI trigger and the CI job configuration into a single YAML workflow file and store it in /my_repo/.github/workflows. If you aren't already familiar with GitHub Actions, you can check out the quickstart guide.

If your data project directory isn't in the root of the Git repository, change the data_project_dir variable.

About secrets:

  • tb_host: The URL of the region you want to use.
  • tb_admin_token: The Workspace admin Token. This grants all the permissions for a specific Workspace. You can find more information in the Tokens docs.

The CI workflow

A potential CI workflow could run the following steps:

  1. Configuration: set up dependencies and install the Tinybird CLI to run the required commands.
  2. Check the data project syntax and the authentication.
  3. Create a new ephemeral CI Tinybird Branch.
  4. Push the changes to the Branch.
  5. Run tests in the Branch.
  6. Delete the Branch.

0. Workflow configuration

defaults:
   run:
     working-directory: ${{ inputs.data_project_dir }}
 if: ${{ github.event.action != 'closed' }}
 steps:
   - uses: actions/checkout@master
     with:
       fetch-depth: 300
       ref: ${{ github.event.pull_request.head.sha }}
   - uses: actions/setup-python@v5
     with:
       python-version: "3.11"
       architecture: "x64"
       cache: 'pip'

   - name: Validate input
     run: |
       [[ "${{ secrets.tb_admin_token }}" ]] || { echo "Go to the tokens section in your Workspace, copy the 'admin token' and set TB_ADMIN_TOKEN as a Secret in your Git repository"; exit 1; }

   - name: Set environment variables
     run: |
       _ENV_FLAGS="${ENV_FLAGS:=--last-partition --wait}"
       _NORMALIZED_BRANCH_NAME=$(echo $DATA_PROJECT_DIR | rev | cut -d "/" -f 1 | rev | tr '.-' '_')
       GIT_BRANCH=${GITHUB_HEAD_REF}
       echo "GIT_BRANCH=$GIT_BRANCH" >> $GITHUB_ENV
       echo "_ENV_FLAGS=$_ENV_FLAGS" >> $GITHUB_ENV
       echo "_NORMALIZED_BRANCH_NAME=$_NORMALIZED_BRANCH_NAME" >> $GITHUB_ENV

Key points: This sets the default working-directory to the data_project_dir variable, checks out the head commit of the Pull Request, installs Python 3.11, verifies that the admin Token secret is set, and exports the environment variables that later steps use.
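
To see what the "Set environment variables" step produces, the sketch below reuses its two expressions outside of GitHub Actions. The function name and the sample path `projects/web-analytics.v2` are hypothetical; the pipeline and the default flags are copied from the step above.

```shell
# Same pipeline as the workflow step: keep the last path segment,
# then replace '.' and '-' with '_'.
normalize_branch_name() {
  echo "$1" | rev | cut -d "/" -f 1 | rev | tr '.-' '_'
}

# ${VAR:=default} assigns the default only when VAR is unset or empty.
unset ENV_FLAGS
_ENV_FLAGS="${ENV_FLAGS:=--last-partition --wait}"

normalize_branch_name "projects/web-analytics.v2"   # prints: web_analytics_v2
normalize_branch_name "."                           # prints: _
echo "$_ENV_FLAGS"                                  # prints: --last-partition --wait
```

With data_project_dir set to `.`, as in the job configuration above, the normalized name is a single underscore, so the Branch name is effectively distinguished by the Pull Request number.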

1. Install the Tinybird CLI

- name: Install Tinybird CLI
  run: |
    if [ -f "requirements.txt" ]; then
      pip install -r requirements.txt
    else
      pip install tinybird-cli
    fi

- name: Tinybird version
  run: tb --version

Workflow actions use the Tinybird CLI to interact with your Workspace, create a test Branch, and run the tests. You can use a requirements.txt file to pin a tinybird-cli version and avoid automatically installing the latest version.

You can run this workflow locally by having a local data project and the CLI authenticated to your Tinybird Workspace.
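
Pinning the CLI, as mentioned above, could look like the following sketch. The helper name `pin_tinybird_cli` is hypothetical, and the version is left as an argument because the right value depends on your project; the `==` specifier is standard pip syntax.

```shell
# Write a requirements file that pins tinybird-cli to an exact version.
# Usage: pin_tinybird_cli <version> [path-to-requirements-file]
pin_tinybird_cli() {
  version="$1"
  target="${2:-requirements.txt}"
  if [ -z "$version" ]; then
    echo "usage: pin_tinybird_cli <version> [file]" >&2
    return 1
  fi
  printf 'tinybird-cli==%s\n' "$version" > "$target"
}
```

After committing the generated file to the data project directory, the install step shown earlier picks it up through `pip install -r requirements.txt`.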

2. Check the data project syntax and the authentication

- name: Check all the datafiles syntax
  run: tb check

- name: Check auth
  run: tb --host ${{ secrets.tb_host }} --token ${{ secrets.tb_admin_token }} auth info

3. Create a new Tinybird Branch to deploy changes and run the tests

A Branch is an isolated copy of the resources in your Workspace at a specific point in time. It's designed to be temporary and disposable so that you can develop and test changes before deploying them to your Workspace.

Each CI job creates a Branch. In this example, the Tinybird Branch name uses github.event.pull_request.number as a unique identifier so multiple tests can run in parallel. If a Branch with the same name exists, it's removed and recreated.

The tb branch create command creates new Branches. Once you merge your changes with the Pull Request, the workflow deletes your Tinybird Branch.

- name: Try to delete previous Branch
  run: |
    output=$(tb --host ${{ secrets.tb_host }} --token ${{ secrets.tb_admin_token }} branch ls)
    BRANCH_NAME="tmp_ci_${_NORMALIZED_BRANCH_NAME}_${{ github.event.pull_request.number }}"

    # Check if the branch name exists in the output
    if echo "$output" | grep -q "\b$BRANCH_NAME\b"; then
      tb \
      --host ${{ secrets.tb_host }} \
      --token ${{ secrets.tb_admin_token }} \
      branch rm $BRANCH_NAME \
      --yes
    else
      echo "Skipping clean up: The Branch '$BRANCH_NAME' doesn't exist."
    fi

- name: Create new test Branch with data
  run: |
    tb \
    --host ${{ secrets.tb_host }} \
    --token ${{ secrets.tb_admin_token }} \
    branch create tmp_ci_${_NORMALIZED_BRANCH_NAME}_${{ github.event.pull_request.number }} \
    ${_ENV_FLAGS}

Set the _ENV_FLAGS variable to --last-partition --wait to attach the most recently ingested data in the Workspace. This way, you can run the tests using the same data as in production. Alternatively, leave it empty and use fixtures.
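
The clean-up step above decides whether a Branch exists by searching the `tb branch ls` output with word boundaries. The sketch below isolates that check so you can see why a Branch for Pull Request 12 doesn't match a search for Pull Request 1. The listing is a hypothetical stand-in with one name per line; the real command output is formatted differently.

```shell
# Return success when the Branch name appears as a whole word in the listing.
branch_exists() {
  listing="$1"
  name="$2"
  echo "$listing" | grep -q "\b$name\b"
}

# Hypothetical listing used only for this illustration.
sample_listing="tmp_ci_web_analytics_12
tmp_ci_web_analytics_7"

branch_exists "$sample_listing" "tmp_ci_web_analytics_12" && echo "found 12"
branch_exists "$sample_listing" "tmp_ci_web_analytics_1" || echo "1 is not listed"
```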

4. Deploy changes to the Tinybird Branch

You can push the changes in your current Pull Request to the test Branch previously created in two ways:

Standard deployment

Use tb deploy if you connected your data project and Workspace through Git. This command pushes the file changes based on the result of git diff between the latest commit deployed to the Workspace and the current git branch HEAD commit.

If your data project and Workspace aren't connected through Git, you can use tb push --only-changes --force --yes. This command pushes the file changes based on the result of tb diff between the local changes in the git branch and the remote changes in the Tinybird branch.

Common tb push options:

  • --only-changes: Deploys the changed datafiles and their dependencies.
  • --force: Overrides any existing Pipe.
  • --yes: Confirms any alteration to a Data Source.
  • --no-check: Skips regression tests when overwriting a Pipe Endpoint.

Custom deploy command

Alternatively, for more complex changes, you can decide how to deploy the changes to the test Branch. This is convenient if, in addition to deploying the datafiles, you want to automate other data operations, such as running a copy Pipe or truncating a Data Source.

For this to work, you have to place an executable shell script file in deploy/$VERSION/deploy.sh with the CLI commands to push the changes. $VERSION should be a global variable and unique to the current active Pull Request. You can find it in the .tinyenv file in the data project.

- name: Deploy changes to the test Branch
  run: |
     DEPLOY_FILE=./deploy/${VERSION}/deploy.sh
     if [ ! -f "$DEPLOY_FILE" ]; then
        echo "$DEPLOY_FILE not found, running default tb deploy command"
        tb deploy
     fi

- name: Custom deployment to the test Branch
  run: |
     DEPLOY_FILE=./deploy/${VERSION}/deploy.sh
     if [ -f "$DEPLOY_FILE" ]; then
        echo "$DEPLOY_FILE found"
        if ! [ -x "$DEPLOY_FILE" ]; then
           echo "Error: You don't have permission to execute '$DEPLOY_FILE'. Run:"
           echo "> chmod +x $DEPLOY_FILE"
           echo "and commit your changes"
           exit 1
        else
           $DEPLOY_FILE
        fi
     fi
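
A custom deploy script could look like the sketch below. It's only an illustration: the resource names `events_snapshot` and `daily_copy` are placeholders, and it assumes that `tb datasource truncate` and `tb pipe copy run` accept the flags shown, so verify them with `tb --help` for your CLI version. The script prints the commands by default so you can review them before running anything.

```shell
#!/usr/bin/env bash
# Hypothetical deploy/$VERSION/deploy.sh: deploy the datafiles,
# then run extra data operations.
set -e

# Print commands by default; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

custom_deploy() {
  run tb deploy
  # 'events_snapshot' and 'daily_copy' are placeholder resource names.
  run tb datasource truncate events_snapshot --yes
  run tb pipe copy run daily_copy --wait --yes
}

custom_deploy
```

Remember to make the file executable with chmod +x and commit it, since the workflow step above refuses to run a script without execute permission.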

5. Run the tests

You can now run your test suite. This is an optional step but recommended if you want to make sure everything works as expected.

Tinybird provides three types of tests by default, but you can include any test needed for your deployment pipeline:

  • Data fixture tests: These test specific business logic based on fixture data, see datasources/fixtures.
  • Data quality tests: These test precise data scenarios.
  • Regression tests: These test that requests to your API Endpoints are still working as expected. For these tests to work, you must attach production data using the --last-partition flag when creating the test Branch.

To learn more about testing Tinybird data projects, refer to the Implementing test strategies docs.

- name: Get regression labels
  id: regression_labels
  uses: SamirMarin/get-labels-action@v0
  with:
     github_token: ${{ secrets.GITHUB_TOKEN }}
     label_key: regression

- name: Run Pipe regression tests
  run: |
     source .tinyenv
     echo ${{ steps.regression_labels.outputs.labels }}
     REGRESSION_LABELS=$(echo "${{ steps.regression_labels.outputs.labels }}" | awk -F, '{for (i=1; i<=NF; i++) if ($i ~ /^--/) print $i}' ORS=',' | sed 's/,$//')
     echo ${REGRESSION_LABELS}

     CONFIG_FILE=./tests/regression.yaml
     BASE_CMD="tb branch regression-tests"
     LABELS_CMD="$(echo ${REGRESSION_LABELS} | tr , ' ')"
     if [ -f ${CONFIG_FILE} ]; then
        echo "Config file found: ${CONFIG_FILE}"
        ${BASE_CMD} -f ${CONFIG_FILE} --wait ${LABELS_CMD}
     else
        echo "Config file not found at '${CONFIG_FILE}', running with default values"
        ${BASE_CMD} coverage --wait ${LABELS_CMD}
     fi

- name: Append fixtures
  run: |
     if [ -f ./scripts/append_fixtures.sh ]; then
        echo "append_fixtures script found"
        ./scripts/append_fixtures.sh
     fi

- name: Run fixture tests
  run: |
     if [ -f ./scripts/exec_test.sh ]; then
        ./scripts/exec_test.sh
     fi

- name: Run data quality tests
  run: |
     tb test run -v -c 4

You can find the reference append_fixtures and exec_test scripts in this repository.
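
The regression step above turns Pull Request labels into CLI flags: it keeps only the labels that start with `--` and joins them with spaces. The sketch below isolates that pipeline so you can try it locally. The function name and the sample labels are hypothetical; the awk, sed, and tr expressions are the ones used in the step.

```shell
# Keep only labels that start with "--" and join them with spaces,
# mirroring the awk/sed/tr pipeline in the regression step.
extract_regression_flags() {
  echo "$1" \
    | awk -F, '{for (i=1; i<=NF; i++) if ($i ~ /^--/) print $i}' ORS=',' \
    | sed 's/,$//' \
    | tr , ' '
}

# Hypothetical label list; only the "--" entries survive.
extract_regression_flags "bug,--flag-one,docs,--flag-two"   # prints: --flag-one --flag-two
```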

6. Delete the Branch

By default, the workflow doesn't delete a Branch until the Pull Request is merged into the main Workspace. The following step runs after the tests:

- name: Try to delete previous Branch
  run: |
     output=$(tb --host ${{ secrets.tb_host }} --token ${{ secrets.tb_admin_token }} branch ls)
     BRANCH_NAME="tmp_ci_${_NORMALIZED_BRANCH_NAME}_${{ github.event.pull_request.number }}"

     # Check if the branch name exists in the output
     if echo "$output" | grep -q "\b$BRANCH_NAME\b"; then
        tb \
           --host ${{ secrets.tb_host }} \
           --token ${{ secrets.tb_admin_token }} \
           branch rm $BRANCH_NAME \
           --yes
     else
        echo "Skipping clean up: The Branch '$BRANCH_NAME' doesn't exist."
     fi

You can have up to 3 simultaneous Branches per Workspace at any time. Contact Tinybird at support@tinybird.co if you need to increase this limit.

How continuous deployment works

Once a Pull Request passes CI and a peer reviews and approves it, it's time to merge it to your main Git branch. Continuous Deployment, sometimes called Continuous Delivery, automatically deploys the changes to the Workspace.

While efficient, this workflow comes with several challenges, most of them related to handling the current state of your Tinybird Workspace. For instance:

  • As opposed to when you deploy a stateless app, deployments to a Workspace are incremental, based on the previous resources in the Workspace.
  • Some resources or operations run programmatically and handle state, such as populate operations or permission handling.
  • All deployments target the same Workspace, so you need a policy to avoid regressions and collisions between different Pull Requests deploying at the same time.

As deployments rely on Git commits to push resources, your Branches must not be out-of-date when merging. Use your Git provider to control branch freshness.

The CD workflow explained here is a guide relevant to many of the most common use cases. However, some complex deployments may require additional knowledge and expertise from the team deploying the change.

Continuous Deployment helps with:

  • Correctness: Ensuring you can push your changes to a Tinybird Workspace.
  • Deployment: Deploying the changes to the Workspace automatically.
  • Data Operations: Centralizing data operations required after pushing resources to the Workspace.

The following section uses the generated CD template, GitHub Actions, and the Tinybird CLI to explain how to deploy Pull Request changes after merging.

Configure the CD job

CD workflow
name: Tinybird - CD Workflow

on:
   workflow_dispatch:
   push:
      branches:
         - main
jobs:
   cd:  # deploy changes to Workspace 'Web Analytics template'
      uses: tinybirdco/ci/.github/workflows/cd.yml@main
      with:
         data_project_dir: .
      secrets:
         tb_admin_token: ${{ secrets.TB_ADMIN_TOKEN }}  # set Workspace admin Token in GitHub secrets
         tb_host: https://api.tinybird.co

You can use this YAML workflow file and store it in /my_repo/.github/workflows. This workflow deploys on merge to main to the Workspace defined by the TB_ADMIN_TOKEN set as secret in the GitHub repository's Settings.

If your data project directory isn't in the root of the Git repository, change the data_project_dir variable.

About secrets:

  • tb_host: The URL of the region you want to use.
  • tb_admin_token: The Workspace admin Token. This grants all the permissions for a specific Workspace. You can find more information in the Tokens docs.

The CD workflow

The CD pipeline deploys the changes to the main Workspace the same way the CI pipeline deploys them to the Tinybird Branch. Run the CD workflow when merging a Pull Request to keep your Workspace in sync with the HEAD commit of the main branch in the Git repository.

The CD workflow performs the following steps:

  1. Configuration
  2. Install the Tinybird CLI
  3. Check authentication
  4. Deploy changes
  5. Post-deployment

0. Workflow configuration

Same as the CI workflow.

1. Install the Tinybird CLI and check authentication

Workflow actions use the Tinybird CLI to interact with your Workspace.

You can run this workflow locally by having a local data project and the CLI authenticated to your Tinybird Workspace.

This step is equivalent to, but not identical to, the CI workflow step 1.

- name: Install Tinybird CLI
  run: |
    if [ -f "requirements.txt" ]; then
      pip install -r requirements.txt
    else
      pip install tinybird-cli
    fi

- name: Tinybird version
  run: tb --version

- name: Check auth
  run: tb --host ${{ secrets.tb_host }} --token ${{ secrets.tb_admin_token }} auth info

2. Deploy changes

Use the exact same strategy that you used in CI.

If you used automatic deployment through Git, use tb deploy; otherwise, use tb push --only-changes --force. If you used a custom deployment for this specific Pull Request, make sure the exact same script runs in CD.

- name: Deploy changes to the main Workspace
  run: |
     DEPLOY_FILE=./deploy/${VERSION}/deploy.sh
     if [ ! -f "$DEPLOY_FILE" ]; then
        echo "$DEPLOY_FILE not found, running default tb deploy command"
        tb deploy
     fi

- name: Custom deployment to the main Workspace
  run: |
     DEPLOY_FILE=./deploy/${VERSION}/deploy.sh
     if [ -f "$DEPLOY_FILE" ]; then
        echo "$DEPLOY_FILE found"
        if ! [ -x "$DEPLOY_FILE" ]; then
           echo "Error: You don't have permission to execute '$DEPLOY_FILE'. Run:"
           echo "> chmod +x $DEPLOY_FILE"
           echo "and commit your changes"
           exit 1
        else
           $DEPLOY_FILE
        fi
     fi
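
Because CI and CD must apply the same strategy, it can help to keep the selection logic in one place. The sketch below is one possible way to do that, not part of the templates: the function name `resolve_deploy_command` is hypothetical, and it only prints the command that the two steps above would end up running for a given VERSION.

```shell
# Print the command that a CI or CD run would use for a given VERSION.
# Usage: resolve_deploy_command <data-project-dir> <version>
resolve_deploy_command() {
  project_dir="$1"
  version="$2"
  deploy_file="$project_dir/deploy/$version/deploy.sh"
  if [ ! -f "$deploy_file" ]; then
    # No custom script: fall back to the default deployment.
    echo "tb deploy"
  elif [ -x "$deploy_file" ]; then
    # Custom script present and executable: run it instead.
    echo "$deploy_file"
  else
    echo "error: run chmod +x $deploy_file and commit the change" >&2
    return 1
  fi
}
```

For example, `resolve_deploy_command . "$VERSION"` prints `tb deploy` when no custom script exists for that version, which matches the fallback in both workflows.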

Other git providers

Most Git vendors provide a way to run CI/CD pipelines. The example on this page shows how to use the Tinybird CLI and Git to build CI/CD pipelines with GitHub Actions, but you can create similar pipelines with other providers or adapt your own pipelines to support CI/CD to a Tinybird Workspace. You can find a similar workflow for GitLab here.
