sst_dump 可以导出sst中的数据和属性信息。

mysql_ldb 工具功能非常强大,可以解析rocksdb各类文件,sst, manifest, wal等; 可以写入和查询数据;还提供了一些维护功能,详见如下help信息

  1. ldb - RocksDB Tool
  2. commands MUST specify --db=<full_path_to_db_directory> when necessary
  3. The following optional parameters control if keys/values are input/output as hex or as plain strings:
  4. --key_hex : Keys are input/output as hex
  5. --value_hex : Values are input/output as hex
  6. --hex : Both keys and values are input/output as hex
  7. The following optional parameters control the database internals:
  8. --column_family=<string> : name of the column family to operate on. default: default column family
  9. --ttl with 'put','get','scan','dump','query','batchput' : DB supports ttl and value is internally timestamp-suffixed
  10. --try_load_options : Try to load option file from DB.
  11. --ignore_unknown_options : Ignore unknown options when loading option file.
  12. --bloom_bits=<int,e.g.:14>
  13. --fix_prefix_len=<int,e.g.:14>
  14. --compression_type=<no|snappy|zlib|bzip2|lz4|lz4hc|xpress|zstd>
  15. --compression_max_dict_bytes=<int,e.g.:16384>
  16. --block_size=<block_size_in_bytes>
  17. --auto_compaction=<true|false>
  18. --db_write_buffer_size=<int,e.g.:16777216>
  19. --write_buffer_size=<int,e.g.:4194304>
  20. --file_size=<int,e.g.:2097152>
  21. Data Access Commands:
  22. put <key> <value> [--ttl]
  23. get <key> [--ttl]
  24. batchput <key> <value> [<key> <value>] [..] [--ttl]
  25. scan [--from] [--to] [--ttl] [--timestamp] [--max_keys=<N>q] [--start_time=<N>:- is inclusive] [--end_time=<N>:- is exclusive] [--no_value]
  26. delete <key>
  27. deleterange <begin key> <end key>
  28. query [--ttl]
  29. Starts a REPL shell. Type help for list of available commands.
  30. approxsize [--from] [--to]
  31. checkconsistency
  32. Admin Commands:
  33. dump_wal --walfile=<write_ahead_log_file_path> [--header] [--print_value]
  34. compact [--from] [--to]
  35. reduce_levels --new_levels=<New number of levels> [--print_old_levels]
  36. change_compaction_style --old_compaction_style=<Old compaction style: 0 for level compaction, 1 for universal compaction> --new_compaction_style=<New compaction style: 0 for level compaction, 1 for universal compaction>
  37. dump [--from] [--to] [--ttl] [--max_keys=<N>] [--timestamp] [--count_only] [--count_delim=<char>] [--stats] [--bucket=<N>] [--start_time=<N>:- is inclusive] [--end_time=<N>:- is exclusive] [--path=<path_to_a_file>]
  38. load [--create_if_missing] [--disable_wal] [--bulk_load] [--compact]
  39. manifest_dump [--verbose] [--json] [--path=<path_to_manifest_file>]
  40. list_column_families full_path_to_db_directory
  41. dump_live_files
  42. idump [--from] [--to] [--input_key_hex] [--max_keys=<N>] [--count_only] [--count_delim=<char>] [--stats]
  43. repair
  44. backup [--backup_env_uri] [--backup_dir] [--num_threads] [--stderr_log_level=<int (InfoLogLevel)>]
  45. restore [--backup_env_uri] [--backup_dir] [--num_threads] [--stderr_log_level=<int (InfoLogLevel)>]
  46. checkpoint [--checkpoint_dir]
  1. create table t1( c1 int , c2 int, c3 varchar(10), primary key (c1), key
  2. idx1(c2)) engine=rocksdb;
  3. insert t1 values (1,101,'A');
  4. insert t1 values (2,102,'B');
  5. insert t1 values (3,103,'C');
  6. insert t1 values (4,104,'D');
  7. insert t1 values (5,105,'E');
  8. select * from t1;
  9. +----+------+------+
  10. | c1 | c2 | c3 |
  11. +----+------+------+
  12. | 1 | 101 | A |
  13. | 2 | 102 | B |
  14. | 3 | 103 | C |
  15. | 4 | 104 | D |
  16. | 5 | 105 | E |
  17. +----+------+------+
  18. SET GLOBAL rocksdb_force_flush_memtable_now=1;
  19. select d.table_schema, d.table_name, d.index_name, d.index_number,f.sst_name, f.NUM_ROWS, f.ENTRY_DELETES, f.ENTRY_SINGLEDELETES,f.ENTRY_MERGES, f.ENTRY_OTHERS from information_schema.ROCKSDB_DDL d, information_schema.rocksdb_index_file_map f where d.index_number=f.index_number and d.table_name='t1';
  20. +--------------+------------+------------+--------------+------------+----------+---------------+---------------------+--------------+--------------+
  21. | table_schema | table_name | index_name | index_number | sst_name | NUM_ROWS | ENTRY_DELETES | ENTRY_SINGLEDELETES | ENTRY_MERGES | ENTRY_OTHERS |
  22. +--------------+------------+------------+--------------+------------+----------+---------------+---------------------+--------------+--------------+
  23. | test | t1 | PRIMARY | 256 | 000030.sst | 5 | 0 | 0 | 0 | 0 |

查看test.t1数据分布情况

  1. select d.table_schema, d.table_name, d.index_name, d.index_number,f.sst_name, f.NUM_ROWS, f.ENTRY_DELETES, f.ENTRY_SINGLEDELETES,f.ENTRY_MERGES, f.ENTRY_OTHERS from information_schema.ROCKSDB_DDL d, information_schema.rocksdb_index_file_map f where d.index_number=f.index_number and d.table_name='t1';
  2. +--------------+------------+------------+--------------+------------+----------+---------------+---------------------+--------------+--------------+
  3. | table_schema | table_name | index_name | index_number | sst_name | NUM_ROWS | ENTRY_DELETES | ENTRY_SINGLEDELETES | ENTRY_MERGES | ENTRY_OTHERS |
  4. +--------------+------------+------------+--------------+------------+----------+---------------+---------------------+--------------+--------------+
  5. | test | t1 | PRIMARY | 256 | 000030.sst | 5 | 0 | 0 | 0 | 0 |
  6. | test | t1 | idx1 | 257 | 000030.sst | 5 | 0 | 0 | 0 | 0 |
  7. +--------------+------------+------------+--------------+------------+----------+---------------+---------------------+--------------+--------------+
  • 导出t1的数据
  1. bin/sst_dump --command=scan --file=data/.rocksdb/000030.sst --output_hex
  2. from [] to []
  3. Process data/.rocksdb/000030.sst
  4. Sst file format: block-based
  5. '0000010080000001' seq:13, type:1 => 00650000000141
  6. '0000010080000002' seq:16, type:1 => 00660000000142
  7. '0000010080000003' seq:19, type:1 => 00670000000143
  8. '0000010080000004' seq:22, type:1 => 00680000000144
  9. '0000010080000005' seq:25, type:1 => 00690000000145
  10. '00000101018000006580000001' seq:14, type:1 =>
  11. '00000101018000006680000002' seq:17, type:1 =>
  12. '00000101018000006780000003' seq:20, type:1 =>
  13. '00000101018000006880000004' seq:23, type:1 =>
  14. '00000101018000006980000005' seq:26, type:1 =>

前五行为主键primary数据

后五行为索引idx1的数据

MyRocks记录格式参考

以此行(1,101,’A’)数据为例子

主键为

  1. '0000010080000001' seq:13, type:1 => 00650000000141
  1. Primary index id: 256 ==> 100
  2. c1: 1 ==> 80000001 (符号反转)
  3. type:1 PUT

value 由c2+c3组成

  1. c2: 101 ==> 65
  2. c3: A ==> 141

二级索引idx1数据为

key由indexid+c2(主键)+c1(二级索引键)组成

  1. second index id: 257 ==> 101
  2. c2 : 101 ==> 80000065 主键
  3. c1: 1 ==> 80000001 (符号反转) 二级索引键

value为null

Note type值参考如下定义

  1. enum ValueType : unsigned char {
  2. kTypeDeletion = 0x0,
  3. kTypeValue = 0x1,
  4. kTypeMerge = 0x2,
  5. kTypeLogData = 0x3, // WAL only.
  6. kTypeColumnFamilyDeletion = 0x4, // WAL only.
  7. kTypeColumnFamilyValue = 0x5, // WAL only.
  8. kTypeColumnFamilyMerge = 0x6, // WAL only.
  9. kTypeSingleDeletion = 0x7,
  10. kTypeColumnFamilySingleDeletion = 0x8, // WAL only.
  11. kTypeBeginPrepareXID = 0x9, // WAL only.
  12. kTypeEndPrepareXID = 0xA, // WAL only.
  13. kTypeCommitXID = 0xB, // WAL only.
  14. kTypeRollbackXID = 0xC, // WAL only.
  15. kTypeNoop = 0xD, // WAL only.
  16. kTypeColumnFamilyRangeDeletion = 0xE, // WAL only.
  17. kTypeRangeDeletion = 0xF, // meta block
  18. kTypeColumnFamilyBlobIndex = 0x10, // Blob DB only
  19. kTypeBlobIndex = 0x11, // Blob DB only
  20. // When the prepared record is also persisted in db, we use a different
  21. // record. This is to ensure that the WAL that is generated by a WritePolicy
  22. // is not mistakenly read by another, which would result into data
  23. // inconsistency.
  24. kTypeBeginPersistedPrepareXID = 0x12, // WAL only.
  25. kMaxValue = 0x7F // Not used for storing records.
  26. };
  • 按指定范围导出数据
  1. bin/sst_dump --command=scan --file=data/.rocksdb/000030.sst --from='0x0000010080000002' --to='0x0000010080000005' --input_key_hex --output_hex
  2. from [0000010080000002] to [0000010080000005]
  3. Process data/.rocksdb/000030.sst
  4. Sst file format: block-based
  5. '0000010080000002' seq:16, type:1 => 00660000000142
  6. '0000010080000003' seq:19, type:1 => 00670000000143
  7. '0000010080000004' seq:22, type:1 => 00680000000144
  • 查看sst属性信息

command=raw可以将数据和属性信息都写到*_dump.txt文件中

  1. bin/sst_dump --command=raw --file=data/.rocksdb/000030.sst --output_hex
  2. from [] to []
  3. Process data/.rocksdb/000030.sst
  4. Sst file format: block-based
  5. raw dump written to file data/.rocksdb/000030_dump.txt

另外选项–show_properties也可以展示属性信息

  • 查看sst数据
  1. bin/mysql_ldb --db=data/.rocksdb scan --hex
  2. 0x0000010080000001 : 0x00650000000141
  3. 0x0000010080000002 : 0x00660000000142
  4. 0x0000010080000003 : 0x00670000000143
  5. 0x0000010080000004 : 0x00680000000144
  6. 0x0000010080000005 : 0x00690000000145
  7. 0x00000101018000006580000001 : 0x
  8. 0x00000101018000006680000002 : 0x
  9. 0x00000101018000006780000003 : 0x
  10. 0x00000101018000006880000004 : 0x
  11. 0x00000101018000006980000005 : 0x
  • 写入数据

MyRocks在运行过程中,mysql_ldb是不能写入数据的

  1. Failed: IO error: While lock file: data/.rocksdb/LOCK: Resource temporarily unavailable

shutdown myrocks实例后,再写入成功

  1. ok

同时可以看到新的数据(6,106,’F’);

  1. select * from t1;
  2. +----+------+------+
  3. | c1 | c2 | c3 |
  4. +----+------+------+
  5. | 1 | 101 | A |
  6. | 2 | 102 | B |
  7. | 3 | 103 | C |
  8. | 4 | 104 | D |
  9. | 5 | 105 | E |
  10. | 6 | 106 | F |
  11. +----+------+------+

二级索引没有写数据,所以查不到

  1. select * from t1 where c2=106;
  2. Empty set (0.00 sec)
  3. explain select * from t1 where c2=106;
  4. +----+-------------+-------+------+---------------+------+---------+-------+------+-------+
  5. | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
  6. +----+-------------+-------+------+---------------+------+---------+-------+------+-------+
  7. | 1 | SIMPLE | t1 | ref | idx1 | idx1 | 5 | const | 4 | NULL |
  8. +----+-------------+-------+------+---------------+------+---------+-------+------+-------+

接着二级索引插入数据

  1. bin/mysql_ldb --db=data/.rocksdb put --hex 0x00000101018000006A80000006 0x
  2. OK

此时可以从二级索引查出数据了

  1. select * from t1 where c2=106;
  2. +----+------+------+
  3. | c1 | c2 | c3 |
  4. +----+------+------+
  5. | 6 | 106 | F |
  6. +----+------+------+
  7. select d.table_schema, d.table_name, d.index_name, d.index_number,f.sst_name, f.NUM_ROWS, f.ENTRY_DELETES, f.ENTRY_SINGLEDELETES,f.ENTRY_MERGES, f.ENTRY_OTHERS from information_schema.ROCKSDB_DDL d, information_schema.rocksdb_index_file_map f where d.index_number=f.index_number and d.table_name='t1';
  8. +--------------+------------+------------+--------------+------------+----------+---------------+---------------------+--------------+--------------+
  9. | table_schema | table_name | index_name | index_number | sst_name | NUM_ROWS | ENTRY_DELETES | ENTRY_SINGLEDELETES | ENTRY_MERGES | ENTRY_OTHERS |
  10. +--------------+------------+------------+--------------+------------+----------+---------------+---------------------+--------------+--------------+
  11. | test | t1 | PRIMARY | 256 | 000030.sst | 5 | 0 | 0 | 0 | 0 |
  12. | test | t1 | idx1 | 257 | 000030.sst | 5 | 0 | 0 | 0 | 0 |
  13. | test | t1 | PRIMARY | 256 | 000041.sst | 1 | 0 | 0 | 0 | 0 |
  14. | test | t1 | idx1 | 257 | 000058.sst | 1 | 0 | 0 | 0 | 0 |
  15. +--------------+------------+------------+--------------+------------+----------+---------------+---------------------+--------------+--------------+
  • mysql_ldb 查看MANIFEST文件

MANIFEST文件格式参考这里

  1. bin/mysql_ldb manifest_dump --path=data/.rocksdb/MANIFEST-000059
  2. --------------- Column family "default" (ID 0) --------------
  3. log number: 58
  4. comparator: RocksDB_SE_v3.10
  5. --- level 0 --- version# 2 ---
  6. 58:740['
  7. --------------- Column family "__system__" (ID 1) --------------
  8. log number: 63
  9. comparator: RocksDB_SE_v3.10
  10. --- level 0 --- version# 3 ---
  11. 64:773['
  12. next_file_number 66 last_sequence 36 prev_log_number 0 max_column_family 1
  • mysql_ldb repair

人为将MANIFEST-000059删除,rocksdb数据将无法查看

  1. bin/mysql_ldb --db=data/.rocksdb scan --hex
  2. Failed: IO error: While opening a file for sequentially reading: data/.rocksdb/MANIFEST-000059

mysql_ldb 通过 repair 选项可以恢复出MANIFEST

  1. bin/mysql_ldb repair --db=data/.rocksdb scan
  2. RocksDB version: 5.9.0
  3. Git sha rocksdb_build_git_sha:7c8c83458365f8b359cae13785d15b0bdc9df380
  4. Compile date Dec 16 2017
  5. DB SUMMARY
  6. CURRENT file: CURRENT
  7. IDENTITY file: IDENTITY
  8. SST files in data/.rocksdb dir, Total Num: 7, files: 000030.sst 000039.sst 000041.sst 000047.sst 000054.sst 000058.sst 000064.sst
  9. Write Ahead Log file in data/.rocksdb: 000063.log size: 19 ;
  10. .......... //省略部分配置信息
  11. [WARN] [/home/zhangyuan.zy/git/rds6/rocksdb/db/repair.cc:209] **** Repaired rocksdb data/.rocksdb; recovered 7 files; 5847bytes. Some data may have been lost. ****
  12. OK

repair 完成后数据可以查看

同时生成了新的MANIFEST文件

  1. bin/mysql_ldb manifest_dump --path=data/.rocksdb/MANIFEST-000003
  2. --------------- Column family "default" (ID 0) --------------
  3. log number: 0
  4. comparator: RocksDB_SE_v3.10
  5. --- level 0 --- version# 2 ---
  6. 58:740['
  7. --------------- Column family "__system__" (ID 1) --------------
  8. log number: 0
  9. comparator: RocksDB_SE_v3.10
  10. --- level 0 --- version# 3 ---
  11. next_file_number 66 last_sequence 36 prev_log_number 0 max_column_family 1