Jun 04, 2019

Tinybird Changelog: Faster CSV Imports

Jorge Sancha
Co-founder & CEO

During the last couple of weeks, we’ve made major improvements to our CSV import process, increasing its speed and reliability and achieving almost a 2x performance gain. Tinybird Analytics can now ingest around 680,000 rows per second, and that is in the smallest Tinybird paid account.

We’ve also been working on ways to ensure your data is only accessible in the way you want it to be, so we’ve added SQL filters to Auth Tokens.

Define who sees what via our Auth Tokens with SQL filters:

In order to read, append or import data into Tinybird, you’ll need an Auth Token with the right permissions. When building an application that queries any of your tables, you can either use the default token that is automatically created along with the table or programmatically generate an Auth Token with a specific scope for it. You would do it as follows:

But now, with the new SQL filter capabilities, in addition to creating Auth Tokens that have read access to a single table, you can also create Auth Tokens that have access to a subset of a table or, more specifically, to those rows that meet specific criteria.

Imagine you have a table containing real estate information for different cities around the world. Most likely, at some point, you will need to expose only the data for a particular city to one or more of your users. Instead of splitting the original table into lots of smaller ones or building a backend that adds the required filters, you can add a simple SQL filter scope to your Auth Token. That limits the access your application has to your data and helps prevent data leaks.

Using the Auth Token defined above, when running a query against your table, only rows with ‘Vancouver’ in the city column would be taken into account. It doesn’t matter whether the query is an aggregation or a filter: only data for Vancouver would be considered.
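To make that behavior concrete, here is a minimal local sketch in Python using SQLite. It is a model of the semantics only, not Tinybird's API or implementation: the `real_estate` table, its rows, and the `run_query` helper are all hypothetical, and the token's filter is simulated by wrapping the table in a filtered subquery.

```python
import sqlite3

# Hypothetical table and rows, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE real_estate (city TEXT, price INTEGER)")
conn.executemany(
    "INSERT INTO real_estate VALUES (?, ?)",
    [("Vancouver", 900), ("Vancouver", 1100), ("Madrid", 400), ("Tokyo", 1500)],
)


def run_query(sql, row_filter=None):
    """Run `sql`, where {source} stands for whatever the token is allowed to read.

    With no filter the query sees the whole table; with a filter it only
    ever sees the matching rows (a local stand-in for a filtered token).
    """
    source = "real_estate"
    if row_filter is not None:
        source = f"(SELECT * FROM real_estate WHERE {row_filter})"
    return conn.execute(sql.format(source=source)).fetchall()


token_filter = "city = 'Vancouver'"  # assumed filter carried by the token

# An aggregation only considers Vancouver rows.
vancouver_summary = run_query("SELECT COUNT(*), AVG(price) FROM {source}", token_filter)

# A query with its own WHERE clause is still restricted to Vancouver.
expensive_unfiltered = run_query("SELECT COUNT(*) FROM {source} WHERE price > 1000")
expensive_filtered = run_query("SELECT COUNT(*) FROM {source} WHERE price > 1000", token_filter)

print(vancouver_summary)     # [(2, 1000.0)]
print(expensive_unfiltered)  # [(2,)]
print(expensive_filtered)    # [(1,)]
```

The point of the sketch is that the query text never changes; only the set of rows it is allowed to see does.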

Let us show you a running example. For this, we will use the nyctaxi dataset we’ve used on many other occasions. As you can see below, we run exactly the same query, a simple row count, with different Auth Tokens (easy, huh?).

Let’s start by running the query with the default READ token.

Below, we are using a different Auth Token which contains the READ:payment_type==4 filtered scope. Only the rows with that payment_type id will be used in the queries.

As you can see, both queries are the same, but the results differ depending on the Auth Token (filtered or not) you use. And yeah, no backend code needed for it.
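If you want to play with the idea locally, the comparison can be mimicked with a few lines of Python. This is only a sketch under assumptions: the `trips` rows are made up (they are not the real nyctaxi data), `parse_scope` and `count_rows` are hypothetical helpers, a bare `READ` scope is assumed to mean "no filter", and only the `column==integer` form seen in `READ:payment_type==4` is handled.

```python
from typing import Callable, Dict, List, Optional

Row = Dict[str, int]


def parse_scope(scope: str) -> Optional[Callable[[Row], bool]]:
    """Turn 'READ' or 'READ:<column>==<int>' into a row predicate (None = no filter)."""
    permission, _, condition = scope.partition(":")
    if permission != "READ":
        raise ValueError(f"unsupported scope: {scope!r}")
    if not condition:
        return None
    column, separator, value = condition.partition("==")
    if not separator:
        raise ValueError(f"unsupported filter: {condition!r}")
    expected = int(value)
    return lambda row: row[column] == expected


def count_rows(rows: List[Row], scope: str) -> int:
    """The same 'count rows' query, evaluated under a given token scope."""
    predicate = parse_scope(scope)
    return sum(1 for row in rows if predicate is None or predicate(row))


# Made-up sample standing in for the nyctaxi table.
trips = [
    {"payment_type": 1},
    {"payment_type": 4},
    {"payment_type": 2},
    {"payment_type": 4},
    {"payment_type": 1},
]

default_count = count_rows(trips, "READ")                    # unfiltered token
filtered_count = count_rows(trips, "READ:payment_type==4")   # filtered token

print(default_count, filtered_count)  # 5 2
```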

This opens up lots of possibilities, especially when used dynamically within your applications (Auth Tokens can be created or modified at runtime), so you can seamlessly integrate your Tinybird Analytics APIs with any permissions system with just a few lines of code.
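As a rough illustration of what creating and modifying filtered tokens at runtime could look like inside an application, here is a small in-memory sketch. Everything in it is hypothetical: `TokenStore`, `AuthToken`, and `city_filter` are illustrative names, not Tinybird's API, and a real integration would call the token endpoints described in the documentation instead of a local dictionary.

```python
import secrets
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class AuthToken:
    permission: str
    sql_filter: Optional[str] = None  # None means the whole table is readable


class TokenStore:
    """In-memory stand-in for a token service (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self._tokens: Dict[str, AuthToken] = {}

    def create(self, permission: str, sql_filter: Optional[str] = None) -> str:
        token_id = secrets.token_hex(16)  # random 32-character identifier
        self._tokens[token_id] = AuthToken(permission, sql_filter)
        return token_id

    def update_filter(self, token_id: str, sql_filter: Optional[str]) -> None:
        self._tokens[token_id].sql_filter = sql_filter

    def get(self, token_id: str) -> AuthToken:
        return self._tokens[token_id]


def city_filter(city: str) -> str:
    """Build a filter from application data, escaping single quotes."""
    escaped = city.replace("'", "''")
    return f"city = '{escaped}'"


store = TokenStore()

# A user who may only see Vancouver listings gets a filtered token...
agent_token = store.create("READ", city_filter("Vancouver"))

# ...and when their permissions change, the filter is updated at runtime.
store.update_filter(agent_token, city_filter("Madrid"))
```

The design choice worth noting is that the permissions logic lives in one place (the mapping from a user to a filter), while the queries themselves stay untouched.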

Take a look at our documentation and let us know what you think, or sign up for early access to Tinybird (we're gradually opening it up to companies and developers).
