How to safely cancel a database query

Cancelling a query from a UI client is more nuanced than it might seem. Here's how we implemented safe KILL QUERY operations in Tinybird.

ClickHouse

Ivan Malagon

Product Manager

When you’re starting a data project, you want to play around, test your assumptions, and explore. It’s not the time to optimize. In fact, you’re almost certain not to know yet what to optimize.

This means that your queries that should run in milliseconds could take minutes. You know the feeling of triggering the query and instantly realizing that you made a mistake. You want to stop the query and fix it.

Until recently, Tinybird didn’t allow you to cancel queries. Why not? It's such an obvious action. Well, we could argue that we wanted to encourage users to optimize their queries from the beginning. We’re all about speed, and that means queries that run in milliseconds.

But we recognize speed isn't just about queries. It's also about how fast you develop. And if you're worried about optimizing too soon, you go slower.

Still, the truth is that cancelling queries presents some technical challenges. Let’s learn about them.

How to kill a query on the server

If you were directly connected to your database and wanted to cancel a query, it would be easy.

Tinybird uses ClickHouse as its analytical database. The process to cancel a query would be:

Find the query ID. You need to query the system.processes table, using clusterAllReplicas if you’re operating a cluster. Here's how we might find the query based on a part of the SQL you just triggered, but you could use your own criteria to get the desired query ID.

Call the KILL QUERY command with the given query_id.

The important takeaway: you need the query_id to kill the query.

Start building with Tinybird

If you've read this far, you might want to use Tinybird as your analytics backend.

Sign up

<--(Start for free with no time limit.)

When the query comes from the client

The scenario is not that straightforward when the query comes from another client, for example, the Tinybird web app or CLI.

When an app triggers a query, it typically follows this path:

Client calls an API endpoint via an HTTP request
API server forwards the request to the database
Database processes the query
Database passes result to API server
API server sends response to the client

A diagram showing a typical database query route from a web app client. — How you typically make a database query from the UI client

This is, of course, oversimplified. In the real world, you’ll hit a load balancer in front of the API server and another one in front of the database cluster.

You may see the problem already. The UI doesn’t know anything about what’s happening under the hood.

You make a request, but the API and the database server deal with the query IDs. In Tinybird's case, it’s assigned at the API layer. The UI won’t know the assigned ID until the request is fulfilled. By that time, the query is already finished, so having that ID to kill an already finished query is useless.

The solution is simple: Instead of leaving it to the database to assign a query ID, delegate the task to the UI and pass it along to the API server to assign to the ClickHouse client operation. We use the HTTP interface to connect to ClickHouse, passing the query_id as a parameter.

A diagram showing a modification of the typical database query route to allow the client to assign a query ID to all queries made to the database — To make sure the client has context, the client sets the context.

Now, the UI already knows the query ID (it created it), so we can ask for the query cancellation using this ID.

A diagram showing how a client with knowledge of the query ID can implement a KILL QUERY request. — Since the client knows the query ID, it can easily invoke a `KILL QUERY` request.

Subscribe to our newsletter

Links to our blog and other great reads sent every other Saturday.

Loading…

Ensuring safe query cancellation

Of course, it's not so simple. Nothing ever is. Letting users kill queries is risky. You cannot let users kill arbitrary queries, because the database server is not only running the queries coming from the UI, but also materializations, serving the queries of other endpoints, writes, etc. To reduce the surface area of this operation, we put in place several measures to keep things safe:

Unique query ID strategy

To avoid collisions with other queries, the “killable” queries use a unique ID format. We add a specific prefix to a ULID identifier. This eliminates the risk of accidentally killing critical background processes.

Endpoint restriction

We limit the query cancellation to a specific endpoint: our Query API, and only for requests coming from the UI app. We enforce this using the unique query ID format via parameter validation. This prevents unauthorized query termination coming from other endpoints.

Targeted cancellation

A special cautionary lesson: direct the KILL QUERY to the correct server. In the past, we killed queries programmatically using the `ON CLUSTER` clause, and our approach caused some issues.

Why? ZooKeeper, the service we use to handle replication, has a single queue to handle all DDL operations. We overloaded that queue with KILL queries.

We were determined not to let this happen again. So, instead of just throwing the KILL to the cluster, we find the particular server running the query and direct the cancellation there. This way, we avoid having to deal with replication problems. Since we are always cancelling READ queries, this is ok.

To achieve this, we ask for the hostname where the query is being run:

With that information, we run the KILL query against that particular server.

A diagram showing how to apply a targeted query cancellation to the specific database server — By getting the hostname of the query, we can direct the kill operation to a targeted server, which is much safer.

As always, a simple feature in appearance has a lot more nuance than you’d expect. At Tinybird, we took this approach to safely allow cancelling queries from the UI client. I hope this is useful if you’re thinking of adding something similar to your product.

Do you like this post?