ReplacingMergeTree

Data deduplication occurs only during a merge. Merges run in the background at an unknown time, so you cannot plan for them, and some of the data may remain unprocessed. Although you can run an unscheduled merge using the OPTIMIZE query, do not rely on it, because the OPTIMIZE query reads and writes a large amount of data.

Thus, ReplacingMergeTree is suitable for clearing out duplicate data in the background in order to save space, but it doesn’t guarantee the absence of duplicates.
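
As a minimal sketch of what this means in practice, the example below writes two rows with the same sorting key in separate inserts and reads them back. The table name dedup_demo and its columns are hypothetical and exist only for illustration. Until a background merge has run, a plain SELECT may still return both rows. The FINAL modifier deduplicates at read time, and OPTIMIZE ... FINAL forces the unscheduled merge mentioned above at the cost of rewriting the data.

```sql
-- Hypothetical table for illustration only.
CREATE TABLE dedup_demo
(
    id UInt64,
    value String
)
ENGINE = ReplacingMergeTree()
ORDER BY id;

-- Separate inserts typically produce separate data parts,
-- so both rows exist on disk until the parts are merged.
INSERT INTO dedup_demo VALUES (1, 'first');
INSERT INTO dedup_demo VALUES (1, 'second');

-- May return one or two rows, depending on whether a merge has run yet.
SELECT * FROM dedup_demo;

-- Deduplicates while reading; usually slower than a plain SELECT.
SELECT * FROM dedup_demo FINAL;

-- Forces a merge of the table's parts; reads and rewrites the data.
OPTIMIZE TABLE dedup_demo FINAL;
```

Queries that must never see duplicates should account for them explicitly (for example with FINAL or an aggregation) rather than assume a merge has already happened.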

For a description of the query parameters, see the CREATE TABLE statement description.

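ver - column with the version number. Type UInt*, Date or DateTime. Optional parameter.

When merging, ReplacingMergeTree keeps only one row out of all the rows with the same sorting key:
- The last one in the selection, if ver is not set.
- The one with the maximum version, if ver is specified.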
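
The effect of ver can be sketched with a small example. The table user_profiles, its columns, and the sample values are hypothetical. Here updated_at is passed as the ver column, so among rows with the same user_id the one with the largest updated_at is kept, whatever order the rows were inserted in.

```sql
-- Hypothetical table: updated_at plays the role of the ver column.
CREATE TABLE user_profiles
(
    user_id UInt64,
    email String,
    updated_at DateTime
)
ENGINE = ReplacingMergeTree(updated_at)
ORDER BY user_id;

-- Three versions of the same row (same sorting key), inserted out of order.
INSERT INTO user_profiles VALUES (42, 'old@example.com', '2024-01-01 00:00:00');
INSERT INTO user_profiles VALUES (42, 'new@example.com', '2024-02-01 00:00:00');
INSERT INTO user_profiles VALUES (42, 'stale@example.com', '2023-12-01 00:00:00');

-- With FINAL (or after a merge) only the row with the largest updated_at
-- remains for user_id 42: the one with 'new@example.com'.
SELECT * FROM user_profiles FINAL;
```

Without a ver column, the same three inserts would leave whichever row comes last in the selection, which ties the result to insertion order rather than to the data itself.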

Query clauses

Deprecated Method for Creating a Table

Attention

Do not use this method in new projects and, if possible, switch old projects to the method described above.

CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
(
    ...
) ENGINE [=] ReplacingMergeTree(date-column [, sampling_expression], (primary, key), index_granularity, [ver])

ver - column with the version. Optional parameter. For a description, see the text above.
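
As a rough illustration of moving away from the deprecated form, the sketch below declares the same hypothetical table both ways. The table and column names are assumptions, and the mapping assumes that the deprecated syntax partitions by month of the date column, as it does for MergeTree. Recent ClickHouse versions may reject the deprecated form unless it is explicitly allowed.

```sql
-- Deprecated form (hypothetical table and columns).
CREATE TABLE events_old
(
    EventDate Date,
    UserID UInt64,
    Payload String,
    Version UInt32
)
ENGINE = ReplacingMergeTree(EventDate, (UserID, EventDate), 8192, Version);

-- Assumed equivalent in the current syntax:
--   date-column       -> PARTITION BY toYYYYMM(date-column)
--   (primary, key)    -> ORDER BY (primary, key)
--   index_granularity -> SETTINGS index_granularity = ...
--   ver               -> the only engine parameter
CREATE TABLE events_new
(
    EventDate Date,
    UserID UInt64,
    Payload String,
    Version UInt32
)
ENGINE = ReplacingMergeTree(Version)
PARTITION BY toYYYYMM(EventDate)
ORDER BY (UserID, EventDate)
SETTINGS index_granularity = 8192;
```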