Service Data Sources

Tinybird provides Service Data Sources that you can use to inspect what's going on in your Tinybird account, diagnose issues, monitor usage, and so on.

For example, you can get real-time stats about API calls or a log of every operation over your Data Sources. This is similar to using system tables in a database, although Service Data Sources contain information about the usage of the service itself.

Queries made to Service Data Sources are free of charge and don't count towards your usage. However, calls to API Endpoints that use Service Data Sources do count towards API rate limits. See Billing.

Considerations

  • You can't use Service Data Sources in Materialized View queries.
  • You can pass dynamic query parameters to API Endpoints that query Service Data Sources.
  • You can only query Organization-level Service Data Sources if you're an administrator. See Consumption overview.

Service Data Sources

The following Service Data Sources are available.

tinybird.pipe_stats_rt

Contains information about all requests made to your API Endpoints in real time.

This Data Source has a TTL of 7 days. If you need to query data older than 7 days, use the data aggregated by day available in tinybird.pipe_stats.

| Field | Type | Description |
| --- | --- | --- |
| start_datetime | DateTime | API call start date and time. |
| pipe_id | String | Pipe id as returned in our Pipes API (query_api in the case of a Query API request). |
| pipe_name | String | Pipe name as returned in our Pipes API (query_api in the case of a Query API request). |
| duration | Float | API call duration in seconds. |
| read_bytes | UInt64 | API call read data in bytes. |
| read_rows | UInt64 | API call rows read. |
| result_rows | UInt64 | Rows returned by the API call. |
| url | String | URL (the token param is removed for security reasons). |
| error | UInt8 | 1 if the query returned an error, else 0. |
| request_id | String | API call identifier returned in the x-request-id header. Format is ULID string. |
| token | String | API call token identifier used. |
| token_name | String | API call token name used. |
| status_code | Int32 | API call returned status code. |
| method | String | API call method, POST or GET. |
| parameters | Map(String, String) | API call parameters used. |
| release | String | Semantic version of the release (deprecated). |
| user_agent | Nullable(String) | User-Agent HTTP header from the request. |
| resource_tags | Array(String) | Tags associated with the Pipe when the request was made. |
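For example, a query like the following (the one-hour window is arbitrary) surfaces the endpoints currently returning errors:

```sql
-- Error rate per endpoint over the last hour
SELECT
    pipe_name,
    countIf(error = 1) AS errors,
    count() AS requests,
    round(countIf(error = 1) / count(), 4) AS error_rate
FROM tinybird.pipe_stats_rt
WHERE start_datetime > now() - INTERVAL 1 HOUR
GROUP BY pipe_name
ORDER BY error_rate DESC
```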

tinybird.pipe_stats

Aggregates the request stats in tinybird.pipe_stats_rt by day.

| Field | Type | Description |
| --- | --- | --- |
| date | Date | Request date. |
| pipe_id | String | Pipe id as returned in our Pipes API. |
| pipe_name | String | Name of the Pipe. |
| view_count | UInt64 | Request count. |
| error_count | UInt64 | Number of requests with error. |
| avg_duration_state | AggregateFunction(avg, Float32) | Average duration state in seconds (see Querying _state columns). |
| quantile_timing_state | AggregateFunction(quantilesTiming(0.9, 0.95, 0.99), Float64) | 0.9, 0.95, and 0.99 quantiles state. Time in milliseconds (see Querying _state columns). |
| read_bytes_sum | UInt64 | Total bytes read. |
| read_rows_sum | UInt64 | Total rows read. |
| resource_tags | Array(String) | All the tags associated with the resource when the aggregated requests were made. |
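For example, to review daily traffic and processed bytes per Pipe (the 30-day window is arbitrary):

```sql
-- Daily requests and read bytes per Pipe over the last 30 days
SELECT
    date,
    pipe_name,
    sum(view_count) AS requests,
    sum(read_bytes_sum) AS read_bytes
FROM tinybird.pipe_stats
WHERE date > today() - 30
GROUP BY date, pipe_name
ORDER BY date DESC, requests DESC
```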

tinybird.bi_stats_rt

Contains information about all requests to your BI Connector interface in real time.

This Data Source has a TTL of 7 days. If you need to query data older than 7 days, use the data aggregated by day available in tinybird.bi_stats.

| Field | Type | Description |
| --- | --- | --- |
| start_datetime | DateTime | Query start timestamp. |
| query | String | Executed query. |
| query_normalized | String | Normalized executed query. This is the pattern of the query, without literals. Useful to analyze usage patterns. |
| error_code | Int32 | Error code, if any. 0 on normal execution. |
| error | String | Error description, if any. Empty otherwise. |
| duration | UInt64 | Query duration in milliseconds. |
| read_rows | UInt64 | Read rows. |
| read_bytes | UInt64 | Read bytes. |
| result_rows | UInt64 | Total rows returned. |
| result_bytes | UInt64 | Total bytes returned. |
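For example, to find the slowest query patterns on the BI Connector over the last day:

```sql
-- Slowest BI Connector query patterns in the last day
SELECT
    query_normalized,
    count() AS executions,
    avg(duration) AS avg_duration_ms
FROM tinybird.bi_stats_rt
WHERE start_datetime > now() - INTERVAL 1 DAY
GROUP BY query_normalized
ORDER BY avg_duration_ms DESC
LIMIT 10
```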

tinybird.bi_stats

Aggregates the stats in tinybird.bi_stats_rt by day.

| Field | Type | Description |
| --- | --- | --- |
| date | Date | Stats date. |
| database | String | Database identifier. |
| query_normalized | String | Normalized executed query. This is the pattern of the query, without literals. Useful to analyze usage patterns. |
| view_count | UInt64 | Requests count. |
| error_count | UInt64 | Error count. |
| avg_duration_state | AggregateFunction(avg, Float32) | Average duration state in milliseconds (see Querying _state columns). |
| quantile_timing_state | AggregateFunction(quantilesTiming(0.9, 0.95, 0.99), Float64) | 0.9, 0.95, and 0.99 quantiles state. Time in milliseconds (see Querying _state columns). |
| read_bytes_sum | UInt64 | Total bytes read. |
| read_rows_sum | UInt64 | Total rows read. |
| avg_result_rows_state | AggregateFunction(avg, Float32) | Average rows returned state (see Querying _state columns). |
| avg_result_bytes_state | AggregateFunction(avg, Float32) | Average bytes returned state (see Querying _state columns). |

tinybird.block_log

The Data Source contains details about how Tinybird ingests data into your Data Sources. You can use this Service Data Source to spot problematic parts of your data.

| Field | Type | Description |
| --- | --- | --- |
| timestamp | DateTime | Date and time of the block ingestion. |
| import_id | String | Id of the import operation. |
| job_id | Nullable(String) | Id of the job that ingested the block of data, if it was ingested by URL. In this case, import_id and job_id have the same value. |
| request_id | String | Id of the request that performed the operation. If data was ingested, import_id and request_id have the same value. Format is ULID string. |
| source | String | Either the URL, or the stream or body keywords. |
| block_id | String | Block identifier. You can cross it with the blocks_ids column from the tinybird.datasources_ops_log Service Data Source. |
| status | String | done \| error |
| datasource_id | String | Data Source consistent id. |
| datasource_name | String | Data Source name when the block was ingested. |
| start_offset | Nullable(Int64) | If the ingestion was split, the starting byte of the block. |
| end_offset | Nullable(Int64) | If split, the ending byte of the block. |
| rows | Nullable(Int32) | How many rows it ingested. |
| parser | Nullable(String) | Whether the native block parser was used or Tinybird fell back to row-by-row parsing. |
| quarantine_lines | Nullable(UInt32) | If any, how many rows went into the quarantine Data Source. |
| empty_lines | Nullable(UInt32) | If any, how many empty lines were skipped. |
| bytes | Nullable(UInt32) | How many bytes the block had. |
| processing_time | Nullable(Float32) | How long it took, in seconds. |
| processing_error | Nullable(String) | Detailed message in case of error. |
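For example, to find recent blocks that sent rows to quarantine, together with the error detail:

```sql
-- Recent blocks with quarantined rows and their processing error
SELECT
    timestamp,
    datasource_name,
    block_id,
    quarantine_lines,
    processing_error
FROM tinybird.block_log
WHERE quarantine_lines > 0
ORDER BY timestamp DESC
LIMIT 100
```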

When Tinybird ingests data from a URL, it splits the download into several requests, resulting in different ingestion blocks. The same happens when the data upload uses a multipart request.

tinybird.datasources_ops_log

Contains all operations performed to your Data Sources. Tinybird tracks the following operations:

| Event | Description |
| --- | --- |
| create | A Data Source is created. |
| sync-dynamodb | Initial synchronization from a DynamoDB table when using the DynamoDB Connector. |
| append | Append operation. |
| append-hfi | Append operation using the High-frequency Ingestion API. |
| append-kafka | Append operation using the Kafka Connector. |
| append-dynamodb | Append operation using the DynamoDB Connector. |
| replace | A replace operation took place in the Data Source. |
| delete | A delete operation took place in the Data Source. |
| truncate | A truncate operation took place in the Data Source. |
| rename | The Data Source was renamed. |
| populateview-queued | A populate operation was queued for execution. |
| populateview | A finished populate operation (up to 8 hours after it started). |
| copy | A copy operation took place in the Data Source. |
| alter | An alter operation took place in the Data Source. |

Materializations are logged with the same event_type and operation_id as the operation that triggers them. You can track the materialization Pipe with pipe_id and pipe_name.

Tinybird logs all operations with the following information in this Data Source:

| Field | Type | Description |
| --- | --- | --- |
| timestamp | DateTime | Date and time when the operation started. |
| event_type | String | Operation being logged. |
| operation_id | String | Groups rows affected by the same operation. Useful for checking materializations triggered by an append operation. |
| datasource_id | String | Id of your Data Source. The Data Source id is consistent after renaming operations. Use the id when you want to track name changes. |
| datasource_name | String | Name of your Data Source when the operation happened. |
| result | String | ok \| error |
| elapsed_time | Float32 | How much time the operation took, in seconds. |
| error | Nullable(String) | Detailed error message if the result was error. |
| import_id | Nullable(String) | Id of the import operation, if data has been ingested using one of the following operations: create, append, or replace. |
| job_id | Nullable(String) | Id of the job that performed the operation, if any. If data has been ingested, import_id and job_id have the same value. |
| request_id | String | Id of the request that performed the operation. If data has been ingested, import_id and request_id have the same value. Format is ULID string. |
| rows | Nullable(UInt64) | How many rows the operation affected. This depends on event_type: for the append event, how many rows got inserted; for delete or truncate events, how many rows the Data Source had; for replace, how many rows the Data Source has after the operation. |
| rows_quarantine | Nullable(UInt64) | How many rows went into the quarantine Data Source, if any. |
| blocks_ids | Array(String) | List of block ids used for the operation. See the tinybird.block_log Service Data Source for more details. |
| options | Nested(Names String, Values String) | Tinybird stores key-value pairs with extra information for some operations. For the replace event, the rows_before_replace key tracks how many rows the Data Source had before the replacement happened, and the replace_condition key shows what condition was used. For append and replace events, Tinybird stores the data source, for example the URL, or the body or stream keywords. For the rename event, old_name and new_name. For populateview, you can find the whole populate job metadata as a JSON string. For alter events, Tinybird stores operations, and dependent Pipes as dependencies if they exist. |
| read_bytes | UInt64 | Read bytes in the operation. |
| read_rows | UInt64 | Read rows in the operation. |
| written_rows | UInt64 | Written rows in the operation. |
| written_bytes | UInt64 | Written bytes in the operation. |
| written_rows_quarantine | UInt64 | Quarantined rows in the operation. |
| written_bytes_quarantine | UInt64 | Quarantined bytes in the operation. |
| pipe_id | String | If present, materialization Pipe id as returned in our Pipes API. |
| pipe_name | String | If present, materialization Pipe name as returned in our Pipes API. |
| release | String | Semantic version of the release (deprecated). |
| resource_tags | Array(String) | Tags associated with the Pipe when the request was made. |
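For example, to follow the operations over a given Data Source, including the materializations they trigger ('my_datasource' is a placeholder name):

```sql
-- Recent operations over one Data Source; rows sharing an
-- operation_id belong to the same append and its materializations
SELECT
    timestamp,
    operation_id,
    event_type,
    pipe_name,
    result,
    written_rows
FROM tinybird.datasources_ops_log
WHERE datasource_name = 'my_datasource'
ORDER BY timestamp DESC
LIMIT 50
```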

tinybird.datasource_ops_stats

Data from tinybird.datasources_ops_log, aggregated by day.

| Field | Type | Description |
| --- | --- | --- |
| event_date | Date | Date of the event. |
| workspace_id | String | Unique identifier for the Workspace. |
| event_type | String | Type of the operation, as logged in datasources_ops_log. |
| pipe_id | String | Identifier of the Pipe. |
| pipe_name | String | Name of the Pipe. |
| error_count | UInt64 | Number of requests with an error. |
| executions | UInt64 | Number of executions. |
| avg_elapsed_time_state | Float32 | Average elapsed time state (see Querying _state columns). |
| quantiles_state | Float32 | 0.9, 0.95, and 0.99 quantiles state. Time in milliseconds (see Querying _state columns). |
| read_bytes | UInt64 | Read bytes in the operation. |
| read_rows | UInt64 | Read rows in the operation. |
| written_rows | UInt64 | Written rows in the operation. |
| written_bytes | UInt64 | Written bytes in the operation. |
| written_rows_quarantine | UInt64 | Quarantined rows in the operation. |
| written_bytes_quarantine | UInt64 | Quarantined bytes in the operation. |
| resource_tags | Array(String) | Tags associated with the Pipe when the request was made. |
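For example, to break down daily written bytes by operation type:

```sql
-- Daily written bytes per operation type over the last 30 days
SELECT
    event_date,
    event_type,
    sum(written_bytes) AS written_bytes
FROM tinybird.datasource_ops_stats
WHERE event_date > today() - 30
GROUP BY event_date, event_type
ORDER BY event_date DESC
```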

tinybird.endpoint_errors

Provides the errors of your published API Endpoints during the last 30 days. Tinybird logs all errors with additional information in this Data Source.

| Field | Type | Description |
| --- | --- | --- |
| start_datetime | DateTime | Date and time when the API call started. |
| request_id | String | The id of the request that performed the operation. Format is ULID string. |
| pipe_id | String | If present, Pipe id as returned in our Pipes API. |
| pipe_name | String | If present, Pipe name as returned in our Pipes API. |
| params | Nullable(String) | URL query params included in the request. |
| url | Nullable(String) | URL pathname. |
| status_code | Nullable(Int32) | HTTP error code. |
| error | Nullable(String) | Error message. |
| resource_tags | Array(String) | Tags associated with the Pipe when the request was made. |
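For example, to rank the most frequent errors per endpoint over the last week:

```sql
-- Most frequent endpoint errors in the last 7 days
SELECT
    pipe_name,
    status_code,
    error,
    count() AS occurrences
FROM tinybird.endpoint_errors
WHERE start_datetime > now() - INTERVAL 7 DAY
GROUP BY pipe_name, status_code, error
ORDER BY occurrences DESC
```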

tinybird.kafka_ops_log

Contains all operations performed to your Kafka Data Sources during the last 30 days.

| Field | Type | Description |
| --- | --- | --- |
| timestamp | DateTime | Date and time when the operation took place. |
| datasource_id | String | Id of your Data Source. The Data Source id is consistent after renaming operations. Use the id when you want to track name changes. |
| topic | String | Kafka topic. |
| partition | Int16 | Partition number, or -1 for all partitions. |
| msg_type | String | 'info' for regular messages, 'warning' for issues related to the user's Kafka cluster, deserialization, or Materialized Views, and 'error' for other issues. |
| lag | Int64 | Number of messages behind for the partition. This is the difference between the high-water mark and the last committed offset. |
| processed_messages | Int32 | Messages processed for a topic and partition. |
| processed_bytes | Int32 | Amount of bytes processed. |
| committed_messages | Int32 | Messages ingested for a topic and partition. |
| msg | String | Information in the case of warnings or errors. Empty otherwise. |
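For example, to check the latest reported lag per topic and partition:

```sql
-- Latest lag per topic and partition over the last hour
SELECT
    topic,
    partition,
    argMax(lag, timestamp) AS current_lag
FROM tinybird.kafka_ops_log
WHERE timestamp > now() - INTERVAL 1 HOUR
GROUP BY topic, partition
ORDER BY current_lag DESC
```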

tinybird.datasources_storage

Contains stats about your Data Sources storage.

Tinybird logs maximum values per hour, the same as when it calculates storage consumption.

| Field | Type | Description |
| --- | --- | --- |
| datasource_id | String | Id of your Data Source. The Data Source id is consistent after renaming operations. Use the id when you want to track name changes. |
| datasource_name | String | Name of your Data Source. |
| timestamp | DateTime | When storage was tracked. By hour. |
| bytes | UInt64 | Max number of bytes the Data Source has, not including quarantine. |
| rows | UInt64 | Max number of rows the Data Source has, not including quarantine. |
| bytes_quarantine | UInt64 | Max number of bytes the Data Source has in quarantine. |
| rows_quarantine | UInt64 | Max number of rows the Data Source has in quarantine. |
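For example, to get the latest tracked storage per Data Source:

```sql
-- Latest tracked storage per Data Source, largest first
SELECT
    datasource_name,
    argMax(bytes, timestamp) AS bytes,
    argMax(rows, timestamp) AS rows
FROM tinybird.datasources_storage
GROUP BY datasource_name
ORDER BY bytes DESC
```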

tinybird.releases_log (deprecated)

Contains operations performed to your releases. Tinybird tracks the following operations:

| Event | Description |
| --- | --- |
| init | First Release is created on Git sync. |
| override | Release commit is overridden with tb init --override-commit {{commit}}. |
| deploy | Resources from a commit are deployed to a Release. |
| preview | Release status is changed to preview. |
| promote | Release status is changed to live. |
| post | Resources from a commit are deployed to the live Release. |
| rollback | A rollback was performed; a previous Release is now live. |
| delete | Release is deleted. |

Tinybird logs all operations with additional information in this Data Source.

| Field | Type | Description |
| --- | --- | --- |
| timestamp | DateTime64 | Date and time when the operation took place. |
| event_type | String | Operation being logged. |
| semver | String | Semantic version that identifies a Release. |
| commit | String | Git sha commit related to the operation. |
| token | String | API call token identifier used. |
| token_name | String | API call token name used. |
| result | String | ok \| error |
| error | String | Detailed error message. |

tinybird.sinks_ops_log

Contains all operations performed to your Sink Pipes.

| Field | Type | Description |
| --- | --- | --- |
| timestamp | DateTime64 | Date and time when the operation took place. |
| service | LowCardinality(String) | Type of Sink (GCS, S3, and so on). |
| pipe_id | String | The id of the Sink Pipe. |
| pipe_name | String | The name of the Sink Pipe. |
| token_name | String | Token name used. |
| result | LowCardinality(String) | ok \| error |
| error | Nullable(String) | Detailed error message. |
| elapsed_time | Float64 | The duration of the operation, in seconds. |
| job_id | Nullable(String) | Id of the job that performed the operation, if any. |
| read_rows | UInt64 | Read rows in the Sink operation. |
| written_rows | UInt64 | Written rows in the Sink operation. |
| read_bytes | UInt64 | Read bytes in the operation. |
| written_bytes | UInt64 | Written bytes in the operation. |
| output | Array(String) | The outputs of the operation. In the case of writing to a bucket, the names of the written files. |
| parameters | Map(String, String) | The parameters used. Useful to debug the parameter query values. |
| options | Map(String, String) | Extra information. You can access the values with options['key'], where key is one of: file_template, file_format, file_compression, bucket_path, execution_type. |
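For example, to list failed Sink executions in the last day with their error detail:

```sql
-- Failed Sink executions in the last day
SELECT
    timestamp,
    pipe_name,
    service,
    error,
    elapsed_time
FROM tinybird.sinks_ops_log
WHERE result = 'error'
  AND timestamp > now() - INTERVAL 1 DAY
ORDER BY timestamp DESC
```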

tinybird.data_transfer

Stats of data transferred per hour by a Workspace.

| Field | Type | Description |
| --- | --- | --- |
| timestamp | DateTime | Date and time data transfer is tracked. By hour. |
| event | LowCardinality(String) | Type of operation that generated the data (for example, sink). |
| origin_provider | LowCardinality(String) | Provider data was transferred from. |
| origin_region | LowCardinality(String) | Region data was transferred from. |
| destination_provider | LowCardinality(String) | Provider data was transferred to. |
| destination_region | LowCardinality(String) | Region data was transferred to. |
| kind | LowCardinality(String) | intra \| inter, depending on whether the data moves within or outside the region. |

tinybird.jobs_log

Contains all job executions performed in your Workspace. Tinybird logs all jobs with extra information in this Data Source:

| Field | Type | Description |
| --- | --- | --- |
| job_id | String | Unique identifier for the job. |
| job_type | LowCardinality(String) | Type of job execution: delete_data, import, populateview, query, copy, copy_from_main, copy_from_branch, data_branch, deploy_branch, regression_tests, sink, sink_from_branch. |
| workspace_id | String | Unique identifier for the Workspace. |
| pipe_id | String | Unique identifier for the Pipe. |
| pipe_name | String | Name of the Pipe. |
| created_at | DateTime | Timestamp when the job was created. |
| updated_at | DateTime | Timestamp when the job was last updated. |
| started_at | DateTime | Timestamp when the job execution started. |
| status | LowCardinality(String) | Current status of the job: waiting, working, done, error, cancelled. |
| error | Nullable(String) | Detailed error message if the result was error. |
| job_metadata | String | Additional metadata related to the job execution, as a JSON string. |
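For example, to list the jobs that finished in error during the last 24 hours:

```sql
-- Jobs that errored in the last 24 hours
SELECT
    job_id,
    job_type,
    pipe_name,
    error,
    created_at,
    updated_at
FROM tinybird.jobs_log
WHERE status = 'error'
  AND created_at > now() - INTERVAL 1 DAY
ORDER BY created_at DESC
```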

Learn more about how to track background jobs execution in the Jobs monitoring guide.

Use resource_tags to better track usage

You can use tags that you've added to your resources, like Pipes or Data Sources, to analyze usage and cost attribution across your organization.

For example, you can add tags for projects, environments, or versions and compare usage in later queries to Service Data Sources such as tinybird.datasources_ops_stats, which aggregates operations data by day.

The following Service Data Sources support resource_tags:

  • pipe_stats_rt
  • pipe_stats
  • endpoint_errors
  • organization.pipe_stats_rt
  • organization.pipe_stats
  • datasources_ops_log
  • datasources_ops_stats
  • organization.datasources_ops_log
  • organization.datasources_ops_stats

To add tags to resources, see Organizing resources in Workspaces.
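For example, assuming a hypothetical project:billing tag, you can attribute daily requests and processed bytes to it:

```sql
-- Daily usage attributed to the (hypothetical) project:billing tag
SELECT
    date,
    sum(view_count) AS requests,
    sum(read_bytes_sum) AS read_bytes
FROM tinybird.pipe_stats
WHERE has(resource_tags, 'project:billing')
GROUP BY date
ORDER BY date DESC
```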

Querying _state columns

Several of the Service Data Sources include columns suffixed with _state. This suffix identifies columns with values that are in an intermediate aggregated state. When reading these columns, merge the intermediate states to get the final value.

To merge intermediate states, wrap the column in the original aggregation function and apply the -Merge combinator.

For example, to finalize the value of the avg_duration_state column, you use the avgMerge function:

```sql
-- Finalize the value of the avg_duration_state column
SELECT
    date,
    avgMerge(avg_duration_state) AS avg_time,
    quantilesTimingMerge(0.9, 0.95, 0.99)(quantile_timing_state) AS quantiles_timing_in_ms_array
FROM tinybird.pipe_stats
WHERE pipe_id = 'PIPE_ID'
GROUP BY date
```

See Combinators to learn more about the -Merge combinator.

Organization Service Data Sources

The following is a complete list of available Organization Service Data Sources:

| Service Data Source | Description |
| --- | --- |
| organization.workspaces | Lists all Organization Workspaces and related information, including name, ids, databases, plan, when it was created, and whether it has been soft-deleted. |
| organization.processed_data | Information related to all processed data per day per Workspace. |
| organization.datasources_storage | Equivalent to tinybird.datasources_storage, but with data for all Organization Workspaces. |
| organization.pipe_stats | Equivalent to tinybird.pipe_stats, but with data for all Organization Workspaces. |
| organization.pipe_stats_rt | Equivalent to tinybird.pipe_stats_rt, but with data for all Organization Workspaces. |
| organization.datasources_ops_log | Equivalent to tinybird.datasources_ops_log, but with data for all Organization Workspaces. |
| organization.data_transfer | Equivalent to tinybird.data_transfer, but with data for all Organization Workspaces. |
| organization.jobs_log | Equivalent to tinybird.jobs_log, but with data for all Organization Workspaces. |
| organization.sinks_ops_log | Equivalent to tinybird.sinks_ops_log, but with data for all Organization Workspaces. |
| organization.bi_stats | Equivalent to tinybird.bi_stats, but with data for all Organization Workspaces. |
| organization.bi_stats_rt | Equivalent to tinybird.bi_stats_rt, but with data for all Organization Workspaces. |

To query Organization Service Data Sources, go to any Workspace that belongs to the Organization and use them as regular Service Data Sources from the Playground or within Pipes. Use the admin Token of an Organization Admin. You can also copy your admin Token and run queries using your preferred method, such as tb sql.

metrics_logs Service Data Source

The metrics_logs Service Data Source is available in all the Organization's Workspaces. As with the rest of the Organization Service Data Sources, it's only available to Organization administrators. New records for each monitored metric are added every minute with the following schema:

| Field | Type | Description |
| --- | --- | --- |
| timestamp | DateTime | Timestamp of the metric. |
| cluster | LowCardinality(String) | Name of the cluster. |
| host | LowCardinality(String) | Name of the host. |
| metric | LowCardinality(String) | Name of the metric. |
| value | String | Value of the metric. |
| description | LowCardinality(String) | Description of the metric. |
| organization_id | String | Id of your Organization. |

The available metrics are the following:

| Metric | Description |
| --- | --- |
| MemoryTracking | Total amount of memory, in bytes, allocated by the server. |
| OSMemoryTotal | The total amount of memory on the host system, in bytes. |
| InstanceType | Instance type of the host. |
| Query | Number of executing queries. |
| NumberCPU | Number of CPUs. |
| LoadAverage1 | The whole-system load, averaged with exponential smoothing over 1 minute. The load represents the number of threads across all processes (the scheduling entities of the OS kernel) that are currently running on a CPU, waiting for IO, or ready to run but not being scheduled at this point in time. This number includes all processes, not only the server, and can be greater than the number of CPU cores if the system is overloaded and many processes are ready to run but waiting for CPU or IO. |
| LoadAverage15 | Same as LoadAverage1, but averaged with exponential smoothing over 15 minutes. |
| CPUUsage | The ratio of time the CPU core was running OS kernel (system) code or userspace code. This is a system-wide metric that includes all processes on the host machine, not just the server. It also includes time when the CPU was under-utilized for reasons internal to the CPU (memory loads, pipeline stalls, branch mispredictions, running another SMT core). |
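For example, to read the latest MemoryTracking value per host (value is stored as a String, so it's cast before aggregating; the organization.metrics_logs name is assumed here):

```sql
-- Latest MemoryTracking value per host
SELECT
    host,
    argMax(toFloat64OrZero(value), timestamp) AS memory_bytes
FROM organization.metrics_logs
WHERE metric = 'MemoryTracking'
GROUP BY host
```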