AggregatingMergeTree engine¶
Use the AggregatingMergeTree
engine for incremental data aggregation, including for aggregated materialized views.
The engine processes all columns with the following types:
Use AggregatingMergeTree
if it reduces the number of rows by orders of magnitude.
Creating an AggregatingMergeTree Data Source¶
The AggregatingMergeTree
Data Source is usually created using a Materialized View from a MergeTree
Data Source.
Example:
user_activity.datasource
SCHEMA >
`timestamp` DateTime `json:$.timestamp`,
`user_id` UInt32 `json:$.user_id`,
`activity_type` String `json:$.activity_type`,
`payload` JSON `json:$.payload`
ENGINE "MergeTree"
ENGINE_PARTITION_KEY "toYYYYMM(timestamp)"
ENGINE_SORTING_KEY "activity_type, user_id, timestamp"
total_daily_activities_mat.pipe
NODE total_daily_activities_1
SQL >
SELECT
activity_type,
toDate(timestamp) AS date,
countState() AS total_activities
FROM user_activity
GROUP BY
activity_type,
date
TYPE materialized
DATASOURCE total_daily_activities_mv
total_daily_activities_mv.datasource
SCHEMA >
`activity_type` String,
`date` Date,
`total_activities` AggregateFunction(count)
ENGINE "AggregatingMergeTree"
ENGINE_PARTITION_KEY "toYYYYMM(date)"
ENGINE_SORTING_KEY "activity_type, date"
Querying an AggregatingMergeTree Data Source¶
As explained in the MergeTree engine, parts are merged in the background, so you are never sure if the data is merged or not. That's why you should always use the -Merge()
modifier after a -State()
when needed.
Querying an AggregatingMergeTree Data Source
SELECT
activity_type,
date,
countMerge(total_activities) AS daily_total_activities
FROM total_daily_activities_mv
GROUP BY activity_type, date
Settings¶
For a list of supported settings, see Engine settings.