ReplacingMergeTree engine

The engine differs from MergeTree in that it removes duplicate entries with the same sorting key value.

Data deduplication occurs only during a merge. Merging occurs in the background at an unknown time, so you can’t plan for it. Some data may remain unprocessed.

ReplacingMergeTree is suitable for clearing out duplicate data in the background to save space, but it doesn't guarantee the absence of duplicates. Check Deduplication strategies for more information about how to handle duplicates.

Creating a ReplacingMergeTree Data Source

post_views_rmt.datasource
DESCRIPTION >
    Data Source to save latest version of post info. ReplacingMergeTree Engine.

SCHEMA >
    `post_id` Int32 `json:$.post_id`,
    `views` Int32 `json:$.views`,
    `likes` Int32 `json:$.likes`,
    `tag` String `json:$.tag`,
    `timestamp` DateTime `json:$.timestamp`,
    `_is_deleted` UInt8 `json:$._is_deleted` DEFAULT 0

ENGINE "ReplacingMergeTree"
ENGINE_PARTITION_KEY ""
ENGINE_SORTING_KEY "post_id"
ENGINE_VER "timestamp"
ENGINE_IS_DELETED "_is_deleted"

ReplacingMergeTree parameters

ENGINE_VER

ENGINE_VER is the column with the version number. Type UInt*, Date, DateTime or DateTime64. Optional parameter.

When merging, ReplacingMergeTree from all the rows with the same ENGINE_SORTING_KEY leaves only one:

  • The last in the selection, if ENGINE_VER not set. A selection is a set of rows in a set of parts participating in the merge. The most recently created part, the last insert, is the last one in the selection. After deduplication, the last row from the most recent insert remains for each unique sorting key.
  • With the maximum version, if ENGINE_VER specified. If ENGINE_VER is the same for several rows, then it uses "if ENGINE_VER isn't specified" rule for them, i.e. the most recent inserted row remains.

ENGINE_IS_DELETED

ENGINE_IS_DELETED is the name of the column used during a merge to determine whether the data in this row represents the state or is to be deleted. 1 is a "deleted" row, 0 is a "state" row.

ENGINE_IS_DELETED can only be enabled when ENGINE_VER is used. Note it's a mask, not an actual delete, so queries to the Data Source with FINAL are equivalent to queries with WHERE <is_deleted_column> = 0.

No matter the operation on the data, the version must be increased. If two inserted rows have the same version number, the last inserted row is the one kept.

Query time de-duplication and FINAL

At merge time, the ReplacingMergeTree identifies duplicate rows, using the values of the ENGINE_SORTING_KEY columns as a unique identifier, and retains only the highest version. This, however, offers eventual correctness only - it doesn't guarantee rows are deduplicated. Queries can, therefore, produce incorrect answers due to update and delete rows being considered in queries.

To obtain correct answers, complement background merges with query time deduplication and deletion removal. You can do this by using the FINAL operator.

Example:


echo '{ "timestamp": "2024-07-02T02:22:17", "post_id": 956, "views": 856875, "likes": 2321, "tag": "Sports" }' > post_views.ndjson
echo '{ "timestamp": "2024-08-02T03:22:17", "post_id": 956, "views": 956875, "likes": 3321, "tag": "Sports" }' >> post_views.ndjson
echo '{ "timestamp": "2024-08-01T00:00:00", "post_id": 1, "views": 56875, "likes": 321, "tag": "Music" }' >> post_views.ndjson
echo '{ "timestamp": "2024-09-01T00:00:00", "post_id": 1, "views": 56875, "likes": 321, "tag": "Music", "_is_deleted": 1 }' >> post_views.ndjson

tb datasource append post_views_rmt post_views.ndjson

tb sql "select * from post_views_rmt final"
-------------------------------------------------------------------------
| post_id |  views | likes | tag    | timestamp           | _is_deleted |
-------------------------------------------------------------------------
|     956 | 956875 |  3321 | Sports | 2024-08-02 03:22:17 |           0 |
-------------------------------------------------------------------------

Settings

For a list of supported settings, see Engine settings.

Updated