ReplacingMergeTree
Data deduplication occurs only during a merge. Merges run in the background at an unknown time, so you can’t plan for them, and some of the data may remain unprocessed. Although you can run an unscheduled merge using the OPTIMIZE query, don’t count on it, because the OPTIMIZE query reads and writes a large amount of data.
Thus, ReplacingMergeTree is suitable for clearing out duplicate data in the background in order to save space, but it doesn’t guarantee the absence of duplicates.
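The behavior described above can be sketched as follows. This is a minimal illustration, not a recommended workflow: the table name events_dedup_example and its columns are hypothetical, and the modern ENGINE ... ORDER BY form is assumed.

```sql
-- Hypothetical table: rows with the same sorting key (id) count as duplicates.
CREATE TABLE events_dedup_example
(
    id UInt64,
    value String
)
ENGINE = ReplacingMergeTree
ORDER BY id;

-- Two separate inserts create two data parts holding the same key.
INSERT INTO events_dedup_example VALUES (1, 'first');
INSERT INTO events_dedup_example VALUES (1, 'second');

-- Until a merge has happened, this query may still return both rows.
SELECT * FROM events_dedup_example;

-- Unscheduled merge. It rewrites the data, so it is expensive on large tables.
OPTIMIZE TABLE events_dedup_example FINAL;

-- After the forced merge, a single row for id = 1 is expected to remain.
SELECT * FROM events_dedup_example;
```

Because the forced merge reads and rewrites the data, this pattern suits small tables and experiments better than routine production queries.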
For a description of the request parameters, see the statement description.
ver - column with version. Type UInt*, Date or DateTime. Optional parameter.
When merging, ReplacingMergeTree leaves only one of all the rows with the same sorting key:
- The last in the selection, if ver is not set.
- The one with the maximum version, if ver is specified.
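As a hedged sketch of how the ver parameter affects a merge: when a version column is passed to the engine, the row with the largest version value for each sorting key is the one kept. The table name user_profiles_example and the column names below are hypothetical and chosen only for illustration.

```sql
-- Hypothetical table: version is passed to the engine as the ver parameter.
CREATE TABLE user_profiles_example
(
    user_id UInt64,
    name String,
    version UInt32
)
ENGINE = ReplacingMergeTree(version)
ORDER BY user_id;

-- Two rows share the sorting key user_id = 42 but differ in version.
INSERT INTO user_profiles_example VALUES (42, 'old name', 1);
INSERT INTO user_profiles_example VALUES (42, 'new name', 2);

-- Force a merge for the demonstration; in production, merges run in the background.
OPTIMIZE TABLE user_profiles_example FINAL;

-- The row with the maximum version (version = 2) is expected to remain.
SELECT * FROM user_profiles_example;
```

Without the version argument, the same merge would instead keep whichever row came last in the selection.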
Query clauses
Deprecated Method for Creating a Table
Attention
Do not use this method in new projects and, if possible, switch the old projects to the method described above.
CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
(
...
) ENGINE [=] ReplacingMergeTree(date-column [, sampling_expression], (primary, key), index_granularity, [ver])
ver - column with the version. Optional parameter. For a description, see the text above.