AggregatingMergeTree engine¶
Use the AggregatingMergeTree
engine for incremental data aggregation, including for aggregated materialized views.
The engine processes all columns with the following types:
Use AggregatingMergeTree
if it reduces the number of rows by orders of magnitude.
Creating an AggregatingMergeTree Data Source¶
The AggregatingMergeTree
Data Source is usually created using a Materialized View from a MergeTree
Data Source.
Example:
user_activity.datasource
SCHEMA > `timestamp` DateTime `json:$.timestamp`, `user_id` UInt32 `json:$.user_id`, `activity_type` String `json:$.activity_type`, `payload` JSON `json:$.payload` ENGINE "MergeTree" ENGINE_PARTITION_KEY "toYYYYMM(timestamp)" ENGINE_SORTING_KEY "activity_type, user_id, timestamp"
total_daily_activities_mat.pipe
NODE total_daily_activities_1 SQL > SELECT activity_type, toDate(timestamp) AS date, countState() AS total_activities FROM user_activity GROUP BY activity_type, date TYPE materialized DATASOURCE total_daily_activities_mv
total_daily_activities_mv.datasource
SCHEMA > `activity_type` String, `date` Date, `total_activities` AggregateFunction(count) ENGINE "AggregatingMergeTree" ENGINE_PARTITION_KEY "toYYYYMM(date)" ENGINE_SORTING_KEY "activity_type, date"
Querying an AggregatingMergeTree Data Source¶
As explained in the MergeTree engine, parts are merged in the background, so you are never sure if the data is merged or not. That's why you should always use the -Merge()
modifier after a -State()
when needed.
Querying an AggregatingMergeTree Data Source
SELECT activity_type, date, countMerge(total_activities) AS daily_total_activities FROM total_daily_activities_mv GROUP BY activity_type, date
Settings¶
For a list of supported settings, see Engine settings.