Compression

    IoTDB allows you to specify the compression method of the column when creating a time series, and supports the following compression methods:

    • UNCOMPRESSED

    • LZ4

    • GZIP

    The syntax for specifying the compression method is detailed in Create Timeseries Statement.
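
    To illustrate what a lossless column compressor such as GZIP does, the sketch below packs a hypothetical INT64 column into bytes and round-trips it through Python's standard gzip module. It is not IoTDB code: the sample values and the little-endian byte layout are assumptions made only for this illustration.

```python
import gzip
import struct

# Hypothetical column of 1,000 INT64 values with little variation.
values = [20 + (i % 3) for i in range(1000)]

# Assumed layout for the demo: little-endian signed 64-bit integers.
raw = struct.pack(f"<{len(values)}q", *values)

# GZIP is lossless: decompressing returns exactly the original bytes.
compressed = gzip.compress(raw)
restored = list(struct.unpack(f"<{len(values)}q", gzip.decompress(compressed)))

print(len(raw), len(compressed))  # uncompressed size vs. GZIP size, in bytes
print(restored == values)         # True, because no data is discarded
```

    UNCOMPRESSED corresponds to writing the raw bytes as they are; unlike SDT, none of these methods discards data points.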

    SDT (Swinging Door Trending) is a lossy compression algorithm. In IoTDB, SDT compresses and discards data when flushing to disk.

    IoTDB allows you to specify the properties of SDT when creating a time series, and supports three properties:

    • CompDev (Compression Deviation)

    CompDev is the most important parameter in SDT. It represents the maximum difference between the current sample and the current linear trend. CompDev must be greater than 0 for compression to take place.

    • CompMinTime (Compression Minimum Time Interval)

    CompMinTime is a parameter that measures the time distance between two stored data points and is used for noise reduction. If the time interval between the current point and the last stored point is less than or equal to its value, the current point will NOT be stored, regardless of the compression deviation. The default value is 0, in milliseconds.

    • CompMaxTime (Compression Maximum Time Interval)

    CompMaxTime is a parameter that measures the time distance between two stored data points. If the time interval between the current point and the last stored point is greater than or equal to its value, the current point will be stored, regardless of the compression deviation. The default value is 9,223,372,036,854,775,807, in milliseconds.
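
    The two time-interval rules can be sketched as a small decision function. This is an illustration only, not IoTDB's implementation: the function name time_rule and the order in which the two checks run are assumptions. A return value of None means that neither rule applies and the decision falls back to the compression deviation.

```python
from typing import Optional

# Default CompMaxTime from the text (the largest signed 64-bit integer).
LONG_MAX = 9_223_372_036_854_775_807


def time_rule(now_ms: int, last_stored_ms: int,
              comp_min_time: int = 0,
              comp_max_time: int = LONG_MAX) -> Optional[bool]:
    """Hypothetical helper: apply the CompMinTime / CompMaxTime overrides."""
    interval = now_ms - last_stored_ms
    if interval <= comp_min_time:
        return False   # too close to the last stored point: never store
    if interval >= comp_max_time:
        return True    # too far from the last stored point: always store
    return None        # no override: decide by compression deviation


print(time_rule(5, 0, comp_min_time=10))     # False
print(time_rule(100, 0, comp_max_time=50))   # True
print(time_rule(30, 0, 10, 50))              # None
```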

    The specified syntax for SDT is detailed in .

    SDT supports the following data types:

    • INT32 (Integer)
    • INT64 (Long Integer)
    • FLOAT (Single Precision Floating Point)
    • DOUBLE (Double Precision Floating Point)

    The following is an example of using SDT compression.

    Prior to flushing and SDT compression, the results are shown below:

    After flushing and SDT compression, the results are shown below:

    SDT takes effect when flushing to the disk. The SDT algorithm always stores the first point and does not store the last point.

    The data in [2017-11-01T00:06:00.001, 2017-11-01T00:06:00.007] is within the compression deviation and is therefore discarded. The data point at time 2017-11-01T00:06:00.007 is nevertheless stored, because the next data point, at time 2017-11-01T00:06:00.015, exceeds the compression deviation. When a data point exceeds the compression deviation, SDT stores the last read point and updates the upper and lower boundaries. The last point, at time 2017-11-01T00:06:00.018, is not stored.
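
    The behaviour described above can be reproduced with a minimal sketch of the textbook swinging door algorithm. It is not IoTDB's actual implementation: the function name sdt_compress and the sample points are hypothetical, CompMinTime and CompMaxTime are left out, and timestamps are assumed to be strictly increasing.

```python
from typing import List, Tuple

Point = Tuple[int, float]  # (timestamp in ms, value)


def sdt_compress(points: List[Point], comp_dev: float) -> List[Point]:
    """Textbook swinging-door sketch; assumes strictly increasing timestamps."""
    if not points:
        return []
    stored = [points[0]]          # the first point is always stored
    anchor = prev = points[0]     # anchor = last stored point, prev = last read point
    up, low = float("-inf"), float("inf")   # slopes of the upper and lower doors
    for t, v in points[1:]:
        dt = t - anchor[0]
        # Swing the doors: the upper door can only open upward, the lower only downward.
        up_new = max(up, (v - anchor[1] - comp_dev) / dt)
        low_new = min(low, (v - anchor[1] + comp_dev) / dt)
        if up_new > low_new:
            # The current point exceeds the deviation: store the last read point
            # and rebuild both boundaries from it.
            stored.append(prev)
            anchor = prev
            dt = t - anchor[0]
            up = (v - anchor[1] - comp_dev) / dt
            low = (v - anchor[1] + comp_dev) / dt
        else:
            up, low = up_new, low_new
        prev = (t, v)
    return stored                 # the last read point is not stored


# Hypothetical samples: a flat run followed by a jump.
samples = [(1, 1.0), (2, 1.0), (3, 1.0), (4, 1.0), (5, 10.0), (6, 10.0), (7, 10.0)]
print(sdt_compress(samples, comp_dev=0.5))
# [(1, 1.0), (4, 1.0), (5, 10.0)]
```

    In this run the points at times 2 and 3 fall within the deviation and are dropped, the point at time 4 is stored because the point at time 5 exceeds the deviation, and the final point at time 7 is not stored.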

    Compression ratio statistics file: data/system/storage_groups/compression_ratio/Ratio-{ratio_sum}-{memtable_flush_time}

    • ratio_sum: sum of the memtable compression ratios

    • memtable_flush_time: number of memtable flushes
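
    Given the file name pattern above, an average compression ratio can be derived from the name alone. The sketch below assumes that ratio_sum is the running sum of per-flush compression ratios, so that the average is ratio_sum divided by memtable_flush_time. The helper name and the sample file name are hypothetical.

```python
import re


def average_compression_ratio(file_name: str) -> float:
    """Hypothetical helper: parse Ratio-{ratio_sum}-{memtable_flush_time}."""
    match = re.fullmatch(r"Ratio-(\d+(?:\.\d+)?)-(\d+)", file_name)
    if match is None:
        raise ValueError(f"unexpected file name: {file_name!r}")
    ratio_sum = float(match.group(1))
    flush_count = int(match.group(2))
    if flush_count == 0:
        raise ValueError("no memtable flush has been recorded yet")
    # Assumption: average ratio = sum of ratios / number of flushes.
    return ratio_sum / flush_count


print(average_compression_ratio("Ratio-25.0-10"))  # 2.5
```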