SummingMergeTree engine

The engine inherits from MergeTree. The difference is that when merging data parts for SummingMergeTree tables, the database replaces all the rows with the same primary key with one row which contains summarized values for the columns with the numeric data type. If the sorting key is composed in a way that a single key value corresponds to large number of rows, this significantly reduces storage volume and speeds up data selection.

Use the SummingMergeTree engine together with MergeTree. Store complete data in MergeTree table, and use SummingMergeTree for aggregated data storing, for example, when preparing reports. Such an approach prevents you from losing valuable data due to an incorrectly composed primary key.

Parameters of SummingMergeTree

columns

columns is a tuple with the names of columns where values are summarized. Optional parameter. The columns must be of a numeric type and must not be in the primary key.

If columns is not specified, Tinybird summarizes the values in all columns with a numeric data type that aren't in the primary key.

Query clauses

When creating a SummingMergeTree table, the same clauses as when creating a MergeTree table are required.

Common Rules for Summation

The values in the columns with the numeric data type are summarized. The set of columns is defined by the parameter columns. If the values were 0 in all of the columns for summation, the row is deleted. If the column is not in the primary key and is not summarized, an arbitrary value is selected from the existing ones.

The values are not summarized for columns in the primary key.

Summation in the Aggregatefunction Columns

For columns of the AggregateFunction type, Tinybird behaves as the AggregatingMergeTree engine, aggregating according to the function.

Nested structures

Table can have nested data structures that are processed in a special way. If the name of a nested table ends with Map and it contains at least two columns that meet the following criteria:

  • The first column is numeric (*Int*, Date, DateTime) or a string (String, FixedString), let’s call it key.
  • The other columns are arithmetic (*Int*, Float32/64), let’s call it (values...).

The nested table is then interpreted as a mapping of key => (values...). When merging its rows, the elements of two data sets are merged by key with a summation of the corresponding (values...).

Settings

For a list of supported settings, see Engine settings.

Was this page helpful?
Updated