GCS Connector¶
You can set up a GCS Connector to load your CSV, NDJSON, or Parquet files into Tinybird from any GCS bucket. Tinybird does not automatically detect new files; ingestion must be triggered manually.
Setting up the GCS Connector requires:
- Configuring a Service Account in GCP.
- Creating a connection file in Tinybird.
- Creating a data source that uses this connection.
Set Up the Connector¶
1. Create a GCS Connection¶
You can create a GCS Connection in Tinybird either by using the CLI or by manually creating a connection file.
Option 1: Use the CLI (Recommended)¶
Run the following command to create a connection:
tb connection create gcs
You will be prompted to enter:
- A name for your connection.
- The GCS bucket name.
- The service account credentials (JSON key file). See the Google Cloud docs for more details.
- Whether to create the connection for your Cloud environment.
Option 2: Manually Create a Connection File¶
Create a .connection file with the required credentials:
gcs_sample.connection
TYPE gcs
GCS_SERVICE_ACCOUNT_CREDENTIALS_JSON {{ tb_secret("GCS_KEY") }}
Ensure your GCP Service Account has the roles/storage.objectViewer role.
Use a different Service Account key for each environment by leveraging Tinybird Secrets.
2. Create a GCS Data Source¶
After setting up the connection, create a data source.
Create a .datasource file using tb create --prompt or manually:
gcs_sample.datasource
DESCRIPTION >
    Analytics events landing data source

SCHEMA >
    `timestamp` DateTime `json:$.timestamp`,
    `session_id` String `json:$.session_id`,
    `action` LowCardinality(String) `json:$.action`,
    `version` LowCardinality(String) `json:$.version`,
    `payload` String `json:$.payload`

ENGINE "MergeTree"
ENGINE_PARTITION_KEY "toYYYYMM(timestamp)"
ENGINE_SORTING_KEY "timestamp"
ENGINE_TTL "timestamp + toIntervalDay(60)"

IMPORT_CONNECTION_NAME gcs_sample
IMPORT_BUCKET_URI gs://my-bucket/*.csv
IMPORT_SCHEDULE '@on-demand'
The IMPORT_CONNECTION_NAME setting must match the name of your .connection file.
3. Sync Data (Trigger Ingestion)¶
Since automatic ingestion (@auto mode) is not supported, you must manually sync data when new files are available.
Using the API:¶
curl -X POST "https://api.tinybird.co/v0/datasources/sync" \
  -H "Authorization: Bearer <your-tinybird-token>" \
  -d '{"name": "<datasource_name>"}'
Using the CLI:¶
tb datasource sync <datasource_name>
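Both triggers call the same Data Sources API endpoint. As a minimal sketch (the endpoint path and payload are taken from the curl example above; the request is built but not sent, since sending requires a real token), you can prepare the sync call in Python:

```python
import json
import urllib.request


def build_sync_request(host: str, token: str, datasource: str) -> urllib.request.Request:
    """Build (without sending) the POST request that triggers an on-demand sync."""
    return urllib.request.Request(
        url=f"{host}/v0/datasources/sync",
        data=json.dumps({"name": datasource}).encode(),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_sync_request("https://api.tinybird.co", "<your-tinybird-token>", "gcs_sample")
# urllib.request.urlopen(req) would send it once a valid token is in place.
print(req.full_url, req.method)
```

The helper name and the gcs_sample data source are illustrative; substitute your own data source name and token.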
.connection settings¶
The GCS connector uses the following settings in .connection files:
Instruction | Required | Description |
---|---|---|
GCS_SERVICE_ACCOUNT_CREDENTIALS_JSON | Yes | Service Account Key in JSON format, inlined. We recommend using Tinybird Secrets. |
Once a connection is used in a data source, you can't change the Service Account Key. To modify it, you must:
- Remove the connection from the data source.
- Deploy the changes.
- Add the connection again with the new values.
.datasource settings¶
The GCS connector uses the following settings in .datasource files:
Instruction | Required | Description |
---|---|---|
IMPORT_CONNECTION_NAME | Yes | Name given to the connection inside Tinybird. For example, 'my_connection' . This is the name of the connection file you created in the previous step. |
IMPORT_BUCKET_URI | Yes | Full bucket path, including the gs:// protocol, bucket name, object path, and an optional pattern to match against object keys. For example, gs://my-bucket/my-path discovers all files in the bucket my-bucket under the prefix /my-path . You can use patterns in the path to filter objects, for example, ending the path with *.csv matches all objects that end with the .csv suffix. |
IMPORT_SCHEDULE | Yes | Use @on-demand to sync new files as needed; only files added to the bucket since the last execution are appended to the data source. You can also use @once , which behaves the same as @on-demand . However, @auto mode is not supported yet; if you use this option, only the initial sync will be executed. |
IMPORT_FROM_TIMESTAMP | No | Sets the date and time from which to start ingesting files in a GCS bucket. The format is YYYY-MM-DDTHH:MM:SSZ . |
We don't support changing these settings after the Data Source is created. If you need to do that, you must:
- Remove the connection from the data source.
- Deploy the changes.
- Add the connection again with the new values.
- Deploy again.
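If you set IMPORT_FROM_TIMESTAMP, the value must be a UTC timestamp in the YYYY-MM-DDTHH:MM:SSZ layout. A minimal Python sketch for producing it (the cutoff date is an arbitrary example):

```python
from datetime import datetime, timezone

# Format a UTC cutoff as the YYYY-MM-DDTHH:MM:SSZ string IMPORT_FROM_TIMESTAMP expects.
cutoff = datetime(2024, 1, 1, 0, 0, 0, tzinfo=timezone.utc)
stamp = cutoff.strftime("%Y-%m-%dT%H:%M:%SZ")
print(stamp)  # 2024-01-01T00:00:00Z
```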
GCS Wildcard Patterns¶
Use GCS wildcards to match multiple files:
- * (single asterisk): Matches files at one directory level.
  - Example: gs://bucket-name/*.ndjson matches all .ndjson files in the root directory, but not in subdirectories.
- ** (double asterisk): Recursively matches files across multiple directory levels.
  - Example: gs://bucket-name/**/*.ndjson matches all .ndjson files anywhere in the bucket.
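To make the two wildcard levels concrete, here is a small illustrative matcher (not Tinybird's actual implementation) in which * stops at / while ** crosses directory levels; it matches against object keys, with the gs://bucket-name/ prefix stripped:

```python
import re


def gcs_match(pattern: str, key: str) -> bool:
    """Toy GCS-style wildcard match: '*' stays within one level, '**' recurses."""
    regex = ""
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            regex += ".*"      # '**' may cross '/' boundaries
            i += 2
        elif pattern[i] == "*":
            regex += "[^/]*"   # '*' stops at the next '/'
            i += 1
        else:
            regex += re.escape(pattern[i])
            i += 1
    return re.fullmatch(regex, key) is not None


print(gcs_match("*.ndjson", "events.ndjson"))             # root-level file: matches
print(gcs_match("*.ndjson", "2024/events.ndjson"))        # subdirectory: no match
print(gcs_match("**/*.ndjson", "2024/01/events.ndjson"))  # '**' recurses: matches
```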
The GCS connector does not allow overlapping ingestion paths. For example, you cannot have:
gs://my_bucket/**/*.csv
gs://my_bucket/transactions/*.csv
Supported File Types¶
The GCS Connector supports the following formats:
File Type | Accepted Extensions | Supported Compression |
---|---|---|
CSV | .csv , .csv.gz | gzip |
NDJSON | .ndjson , .ndjson.gz , .jsonl , .jsonl.gz | gzip |
Parquet | .parquet , .parquet.gz | snappy , gzip , lzo , brotli , lz4 , zstd |
JSON files must follow the Newline Delimited JSON (NDJSON) format. Each line must be a valid JSON object and must end with a \n character.
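As an illustration of that shape, each record serializes to exactly one line terminated by \n (the field names here mirror the example schema above and are otherwise arbitrary):

```python
import json

events = [
    {"timestamp": "2024-01-01 00:00:00", "session_id": "abc", "action": "page_view"},
    {"timestamp": "2024-01-01 00:00:05", "session_id": "abc", "action": "click"},
]

# NDJSON: one JSON object per line, each line terminated by '\n'.
ndjson = "".join(json.dumps(event) + "\n" for event in events)
print(ndjson, end="")
```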
GCS Permissions¶
To authenticate Tinybird with GCS, you need a GCP service account key in JSON format with the Object Storage Viewer role.
- In the Google Cloud Console, create or use an existing service account.
- Assign the roles/storage.objectViewer role.
- Generate a JSON key file and download it.
- Store the key as a Tinybird secret:
tb secret set GCS_KEY '<your-json-key-content>'
Limitations¶
- No @auto mode: Ingestion must be triggered manually.
- File format support: Only CSV, NDJSON, and Parquet are supported.
- Permissions: Ensure your service account has the correct role assigned.