Monitor jobs in your Workspace

What are jobs?

Many operations in your Tinybird Workspace, like Imports, Copy Jobs, Sinks, and Populates, are executed as background jobs within the platform.

When you trigger these operations via the Tinybird API, they are queued and processed asynchronously on Tinybird's infrastructure. This means the API request itself completes quickly, while the actual operation runs in the background and finishes slightly later. This approach ensures that the system can handle a large volume of requests efficiently without causing timeouts or delays in your workflow.

Monitoring and managing these jobs (for instance, querying job statuses, types, and execution details) is essential for maintaining a healthy Workspace. The two mechanisms for generic job monitoring are the Jobs API and the jobs_log Data Source.

The Jobs API and the jobs_log return identical information about job execution. However, the Jobs API has some limitations: It reports only on a single Workspace, returns only 100 records, from the last 48 hours. If you want to monitor jobs outside these parameters, use the jobs_log Data Source.

You can also track more specific things using dedicated Service Data Sources, such as datasources_ops_log for import, replaces, or copy, or sinks_ops_log for sink operations, or tracking jobs across Organizations with organization.jobs_log. Read the Service Data Sources docs for more.

Track a specific job

Jobs API

The Jobs API is a convenient way to programmatically check the status of a job. By sending an HTTP GET request, you can retrieve detailed information about a specific job. This method is particularly useful for integration into scripts or applications.

curl \
    -X GET "https://$TB_HOST/v0/jobs/{job_id}" \
    -H "Authorization: Bearer $TOKEN"

Replace {job_id} with the actual job ID.

Replace the Tinybird API hostname/region with the right API URL region that matches your Workspace. Your Token lives in the Workspace under "Tokens".

SQL

Alternatively, you can use SQL to query the jobs_log Data Source from directly within a Tinybird Pipe. This method is ideal for users who are comfortable with SQL and prefer to run queries directly against the data, and then expose them with an Endpoint or perform any other actions with it.

SELECT * FROM tinybird.jobs_log WHERE job_id='{job_id}'

Replace {job_id} with the actual job ID. This query retrieves all columns for the specified job, providing comprehensive details about its execution.

Track specific job types

Tracking jobs by type allows you to monitor and analyze all jobs of a certain category, such as all copy jobs. This can help you understand the performance and status of specific job types across your entire Workspace.

Jobs API

Using the Jobs API, fetch all jobs of a specific type by making an HTTP GET request:

curl \
    -X GET "https://$TB_HOST/v0/jobs?kind=copy" \
    -H "Authorization: Bearer $TOKEN"

Replace copy with the type of job you want to track. Ensure you have set your Tinybird host ($TB_HOST) and authorization token ($TOKEN) correctly.

SQL

Alternatively, run an SQL query to fetch all jobs of a specific type from the jobs_log Data Source:

SELECT * FROM tinybird.jobs_log WHERE job_type='copy'

Replace copy with the desired job type. This query retrieves all columns for jobs of the specified type.

Track ongoing jobs

To keep track of jobs that are currently running, you can query the status of jobs in progress. This helps in monitoring the real-time workload and managing system performance.

Jobs API

By making an HTTP GET request to the Jobs API, you can fetch all jobs that are currently in the working status:

curl \
    -X GET "https://$TB_HOST/v0/jobs?status=working" \
    -H "Authorization: Bearer $TOKEN"

This command retrieves jobs that are actively running. Ensure you have set your Tinybird host ($TB_HOST) and authorization token ($TOKEN) correctly.

SQL

You can also use an SQL query to fetch currently running jobs from the jobs_log Data Source:

SELECT * FROM tinybird.jobs_log WHERE status='working'

This query retrieves all columns for jobs with the status working, allowing you to monitor ongoing operations.

Track errored jobs

Tracking errored jobs is crucial for identifying and resolving issues that may arise during job execution. Jobs API and/or SQL queries to jobs_log will help you to efficiently monitor jobs that errored during the execution.

Jobs API

The Jobs API allows you to programmatically fetch details of jobs that have ended in error.

Use the following curl command to retrieve all jobs that have a status of error:

curl \
    -X GET "https://$TB_HOST/v0/jobs?status=error" \
    -H "Authorization: Bearer $TOKEN"

This command fetches a list of jobs that are currently in an errored state, providing details that can be used for further analysis or debugging. Ensure you've set your Tinybird host ($TB_HOST) and authorization token ($TOKEN) correctly.

SQL

Alternatively, you can use SQL to query the jobs_log Data Source directly.

Use the following SQL query to fetch job IDs, job types, and error messages for jobs that have encountered errors in the past day:

SELECT job_id, job_type, error
FROM tinybird.jobs_log
WHERE
    status='error' AND
    created_at > now() - INTERVAL 1 DAY

Track success rate

Extrapolating from errored jobs, you can also use jobs_log to calculate the success rate of your Workspace jobs:

SELECT
    job_type,
    pipe_id,
    countIf(status='done') AS job_success,
    countIf(status='error') AS job_error,
    job_success / (job_success + job_error) as success_rate
FROM tinybird.jobs_log
WHERE
    created_at > now() - INTERVAL 1 DAY
GROUP BY job_type, pipe_id

Get job execution metadata

In the jobs_log Data Source, there is a property called job_metadata that contains metadata related to job executions. This includes the execution type (manual or scheduled) for Copy and Sink jobs, or the count of quarantined rows for Append operations, along with many other properties.

You can extract and analyze this metadata using JSON functions within SQL queries. This allows you to gain valuable information about job executions directly from the jobs_log Data Source.

The following SQL query is an example of how to extract specific metadata fields from the job_metadata property, such as the import mode and counts of quarantined rows and invalid lines, and how to aggregate this data for analysis:

SELECT
    job_type,
    JSONExtractString(job_metadata, 'mode') AS import_mode, 
    sum(simpleJSONExtractUInt(job_metadata, 'quarantine_rows')) AS quarantine_rows,
    sum(simpleJSONExtractUInt(job_metadata, 'invalid_lines')) AS invalid_lines
FROM tinybird.jobs_log
WHERE
    job_type='import' AND
    created_at >= toStartOfDay(now())
GROUP BY job_type, import_mode

There are many other use cases you can put together with the properties in the job_metadata; see below.

Advanced use cases

Beyond basic tracking, you can leverage the jobs_log Data Source for more advanced use cases, such as gathering statistics and performance metrics. This can help you optimize job scheduling and resource allocation.

Get queue status

The following SQL query returns the number of jobs that are waiting to be executed, the number of jobs that are in progress, and how many of them are done already:

SELECT
    job_type,
    countIf(status='waiting') AS jobs_in_queue,
    countIf(status='working') AS jobs_in_progress,
    countIf(status='done') AS jobs_succeeded,
    countIf(status='error') AS jobs_errored
FROM tinybird.jobs_log
WHERE
    created_at > now() - INTERVAL 1 DAY
GROUP BY job_type

Get statistics on run time grouped by type of job

The following SQL query calculates the maximum, minimum, median, and p95 running time (in seconds) grouped by type of job (e.g. import, copy, sinks) over the past day. This helps in understanding the efficiency of different job types:

SELECT
    job_type,
    max(date_diff('s', started_at, updated_at)) as max_run_time_in_secs,
    min(date_diff('s', started_at, updated_at)) as min_run_time_in_secs,
    median(date_diff('s', started_at, updated_at)) as median_run_time_in_secs,
    quantile(0.95)(date_diff('s', started_at, updated_at)) as p95_run_time_in_secs
FROM tinybird.jobs_log
WHERE
    created_at > now() - INTERVAL 1 DAY
GROUP BY job_type

Get statistics on queue time by type of job

The following SQL query calculates the average queue time (in seconds) for a specific type of job (e.g., copy) over the past day. This can help in identifying bottlenecks in job scheduling:

SELECT
    job_type,
    max(date_diff('s', created_at, started_at)) as max_run_time_in_secs,
    min(date_diff('s', created_at, started_at)) as min_run_time_in_secs,
    median(date_diff('s', created_at, started_at)) as median_run_time_in_secs,
    quantile(0.95)(date_diff('s', created_at, started_at)) as p95_run_time_in_secs
FROM tinybird.jobs_log
WHERE
    created_at > now() - INTERVAL 1 DAY
GROUP BY job_type

Get statistics on job completion rate

The following SQL query calculates the success rate by type of job (e.g., copy) and Pipe over the past day. This can help you to assess the reliability and efficiency of your workflows by measuring the completion rate of the jobs, and find potential issues and areas for improvement:

SELECT
    job_type,
    pipe_id,
    countIf(status='done') AS job_success,
    countIf(status='error') AS job_error,
    job_success / (job_success + job_error) as success_rate
FROM tinybird.jobs_log
WHERE
    created_at > now() - INTERVAL 1 DAY
GROUP BY job_type, pipe_id

Get statistics on the amount of manual vs. scheduled run jobs

The following SQL query calculates the percentage rate between manual and scheduled jobs. Understanding the distribution of manually-executed jobs versus scheduled jobs can let you know about some on-demand jobs performed for some specific reasons:

SELECT
    job_type,
    countIf(JSONExtractString(job_metadata, 'execution_type')='manual') AS job_manual,
    countIf(JSONExtractString(job_metadata, 'execution_type')='scheduled') AS job_scheduled
FROM tinybird.jobs_log
WHERE
    job_type='copy' AND
    created_at > now() - INTERVAL 1 DAY
GROUP BY job_type

Next steps

Updated