TiKV Control User Guide

TiKV Control (tikv-ctl) is a command line tool of TiKV, used to manage the cluster. Its installation directory is as follows:

  • If the cluster is deployed using TiDB Ansible, tikv-ctl directory is in the resources/bin subdirectory under the ansible directory.
  • If the cluster is deployed using TiUP, tikv-ctl directory is in the in ~/.tiup/components/ctl/{VERSION}/ directory.

TiUP is a deployment tool introduced later than TiDB Ansible, and its usage is simpler. tikv-ctl is also integrated in the tiup command. Execute the following command to call the tikv-ctl tool:

tiup ctl tikv
Starting component `ctl`: ~/.tiup/components/ctl/v4.0.0-rc.2/ctl tikv TiKV Control (tikv-ctl) Release Version: 4.0.0-rc.2 Edition: Community Git Commit Hash: 2fdb2804bf8ffaab4b18c4996970e19906296497 Git Commit Branch: heads/refs/tags/v4.0.0-rc.2 UTC Build Time: 2020-05-15 11:58:49 Rust Version: rustc 1.42.0-nightly (0de96d37f 2019-12-19) Enable Features: jemalloc portable sse protobuf-codec Profile: dist_release

You can add corresponding parameters and subcommands after tiup ctl tikv.

General options

tikv-ctl provides two operation modes:

  • Remote mode: use the --host option to accept the service address of TiKV as the argument

    For this mode, if SSL is enabled in TiKV, tikv-ctl also needs to specify the related certificate file. For example:

    $ tikv-ctl --ca-path ca.pem --cert-path client.pem --key-path client-key.pem --host 127.0.0.1:20160 <subcommands>

    However, sometimes tikv-ctl communicates with PD instead of TiKV. In this case, you need to use the --pd option instead of --host. Here is an example:

    $ tikv-ctl --pd 127.0.0.1:2379 compact-cluster store:"127.0.0.1:20160" compact db:KV cf:default range:([], []) success!
  • Local mode: Use the --db option to specify the local TiKV data directory path. In this mode, you need to stop the running TiKV instance.

Unless otherwise noted, all commands support both the remote mode and the local mode.

Additionally, tikv-ctl has two simple commands --to-hex and --to-escaped, which are used to make simple changes to the form of the key.

Generally, use the escaped form of the key. For example:

$ tikv-ctl --to-escaped 0xaaff \252\377 $ tikv-ctl --to-hex "\252\377" AAFF

Subcommands, some options and flags

This section describes the subcommands that tikv-ctl supports in detail. Some subcommands support a lot of options. For all details, run tikv-ctl --help <subcommand>.

View information of the Raft state machine

Use the raft subcommand to view the status of the Raft state machine at a specific moment. The status information includes two parts: three structs (RegionLocalState, RaftLocalState, and RegionApplyState) and the corresponding Entries of a certain piece of log.

Use the region and log subcommands to obtain the above information respectively. The two subcommands both support the remote mode and the local mode at the same time. Their usage and output are as follows:

$ tikv-ctl --host 127.0.0.1:20160 raft region -r 2 region id: 2 region state key: \001\003\000\000\000\000\000\000\000\002\001 region state: Some(region {id: 2 region_epoch {conf_ver: 3 version: 1} peers {id: 3 store_id: 1} peers {id: 5 store_id: 4} peers {id: 7 store_id: 6}}) raft state key: \001\002\000\000\000\000\000\000\000\002\002 raft state: Some(hard_state {term: 307 vote: 5 commit: 314617} last_index: 314617) apply state key: \001\002\000\000\000\000\000\000\000\002\003 apply state: Some(applied_index: 314617 truncated_state {index: 313474 term: 151})

View the Region size

Use the size command to view the Region size:

$ tikv-ctl --db /path/to/tikv/db size -r 2 region id: 2 cf default region size: 799.703 MB cf write region size: 41.250 MB cf lock region size: 27616

Scan to view MVCC of a specific range

The --from and --to options of the scan command accept two escaped forms of raw key, and use the --show-cf flag to specify the column families that you need to view.

$ tikv-ctl --db /path/to/tikv/db scan --from 'zm' --limit 2 --show-cf lock,default,write key: zmBootstr\377a\377pKey\000\000\377\000\000\373\000\000\000\000\000\377\000\000s\000\000\000\000\000\372 write cf value: start_ts: 399650102814441473 commit_ts: 399650102814441475 short_value: "20" key: zmDB:29\000\000\377\000\374\000\000\000\000\000\000\377\000H\000\000\000\000\000\000\371 write cf value: start_ts: 399650105239273474 commit_ts: 399650105239273475 short_value: "\000\000\000\000\000\000\000\002" write cf value: start_ts: 399650105199951882 commit_ts: 399650105213059076 short_value: "\000\000\000\000\000\000\000\001"

View MVCC of a given key

Similar to the scan command, the mvcc command can be used to view MVCC of a given key.

$ tikv-ctl --db /path/to/tikv/db mvcc -k "zmDB:29\000\000\377\000\374\000\000\000\000\000\000\377\000H\000\000\000\000\000\000\371" --show-cf=lock,write,default key: zmDB:29\000\000\377\000\374\000\000\000\000\000\000\377\000H\000\000\000\000\000\000\371 write cf value: start_ts: 399650105239273474 commit_ts: 399650105239273475 short_value: "\000\000\000\000\000\000\000\002" write cf value: start_ts: 399650105199951882 commit_ts: 399650105213059076 short_value: "\000\000\000\000\000\000\000\001"

In this command, the key is also the escaped form of raw key.

Scan raw keys

The raw-scan command scans directly from the RocksDB. Note that to scan data keys you need to add a 'z' prefix to keys.

Use --from and --to options to specify the range to scan (unbounded by default). Use --limit to limit at most how many keys to print out (30 by default). Use --cf to specify which cf to scan (can be default, write or lock).

$ ./tikv-ctl --db /var/lib/tikv/db/ raw-scan --from 'zt' --limit 2 --cf default key: "zt\200\000\000\000\000\000\000\377\005_r\200\000\000\000\000\377\000\000\001\000\000\000\000\000\372\372b2,^\033\377\364", value: "\010\002\002\002%\010\004\002\010root\010\006\002\000\010\010\t\002\010\n\t\002\010\014\t\002\010\016\t\002\010\020\t\002\010\022\t\002\010\024\t\002\010\026\t\002\010\030\t\002\010\032\t\002\010\034\t\002\010\036\t\002\010 \t\002\010\"\t\002\010s\t\002\010&\t\002\010(\t\002\010*\t\002\010,\t\002\010.\t\002\0100\t\002\0102\t\002\0104\t\002" key: "zt\200\000\000\000\000\000\000\377\025_r\200\000\000\000\000\377\000\000\023\000\000\000\000\000\372\372b2,^\033\377\364", value: "\010\002\002&slow_query_log_file\010\004\002P/usr/local/mysql/data/localhost-slow.log" Total scanned keys: 2

To print the value of a key, use the print command.

In order to record Region state details, TiKV writes some statistics into the SST files of Regions. To view these properties, run tikv-ctl with the region-properties sub-command:

$ tikv-ctl --host localhost:20160 region-properties -r 2 num_files: 0 num_entries: 0 num_deletes: 0 mvcc.min_ts: 18446744073709551615 mvcc.max_ts: 0 mvcc.num_rows: 0 mvcc.num_puts: 0 mvcc.num_versions: 0 mvcc.max_row_versions: 0 middle_key_by_approximate_size:

The properties can be used to check whether the Region is healthy or not. If not, you can use them to fix the Region. For example, splitting the Region manually by middle_key_approximate_size.

Compact data of each TiKV manually

Use the compact command to manually compact data of each TiKV. If you specify the --from and --to options, then their flags are also in the form of escaped raw key.

  • Use the --host option to specify the TiKV that you need to compact.
  • Use the -d option to specify the RocksDB that you need to compact. The optional values are kv and raft.
  • Use the --threads option allows you to specify the concurrency that you compact and its default value is 8. Generally, a higher concurrency comes with a faster compact speed, which might yet affect the service. You need to choose an appropriate concurrency based on the scenario.
$ tikv-ctl --db /path/to/tikv/db compact -d kv success!

Compact data of the whole TiKV cluster manually

Use the compact-cluster command to manually compact data of the whole TiKV cluster. The flags of this command have the same meanings and usage as those of the compact command.

Set a Region to tombstone

The tombstone command is usually used in circumstances where the sync-log is not enabled, and some data written in the Raft state machine is lost caused by power down.

In a TiKV instance, you can use this command to set the status of some Regions to tombstone. Then when you restart the instance, those Regions are skipped to avoid the restart failure caused by damaged Raft state machines of those Regions. Those Regions need to have enough healthy replicas in other TiKV instances to be able to continue the reads and writes through the Raft mechanism.

In general cases, you can remove the corresponding Peer of this Region using the remove-peer command:

pd-ctl operator add remove-peer <region_id> <store_id>

Then use the tikv-ctl tool to set a Region to tombstone on the corresponding TiKV instance to skip the health check for this Region at startup:

tikv-ctl --db /path/to/tikv/db tombstone -p 127.0.0.1:2379 -r <region_id>
success!

However, in some cases, you cannot easily remove this Peer of this Region from PD, so you can specify the --force option in tikv-ctl to forcibly set the Peer to tombstone:

tikv-ctl --db /path/to/tikv/db tombstone -p 127.0.0.1:2379 -r <region_id>,<region_id> --force
success!

Send a consistency-check request to TiKV

Use the consistency-check command to execute a consistency check among replicas in the corresponding Raft of a specific Region. If the check fails, TiKV itself panics. If the TiKV instance specified by --host is not the Region leader, an error is reported.

$ tikv-ctl --host 127.0.0.1:20160 consistency-check -r 2 success! $ tikv-ctl --host 127.0.0.1:20161 consistency-check -r 2 DebugClient::check_region_consistency: RpcFailure(RpcStatus { status: Unknown, details: Some("StringError(\"Leader is on store 1\")") })

Dump snapshot meta

This sub-command is used to parse a snapshot meta file at given path and print the result.

To avoid checking the Regions while TiKV is started, you can use the tombstone command to set the Regions where the Raft state machine reports an error to Tombstone. Before running this command, use the bad-regions command to find out the Regions with errors, so as to combine multiple tools for automated processing.

$ tikv-ctl --db /path/to/tikv/db bad-regions all regions are healthy

If the command is successfully executed, it prints the above information. If the command fails, it prints the list of bad Regions. Currently, the errors that can be detected include the mismatches between last index, commit index and apply index, and the loss of Raft log. Other conditions like the damage of snapshot files still need further support.

View Region properties

  • To view in local the properties of Region 2 on the TiKV instance that is deployed in /path/to/tikv:

    $ tikv-ctl --db /path/to/tikv/data/db region-properties -r 2
  • To view online the properties of Region 2 on the TiKV instance that is running on 127.0.0.1:20160:

    $ tikv-ctl --host 127.0.0.1:20160 region-properties -r 2

Modify the RocksDB configuration of TiKV dynamically

You can use the modify-tikv-config command to dynamically modify the configuration arguments. Currently, it only supports dynamically modifying RocksDB related arguments.

  • -m is used to specify the target RocksDB. You can set it to kvdb or raftdb.
  • -n is used to specify the configuration name. You can refer to the arguments of [rocksdb] and [raftdb] (corresponding to kvdb and raftdb) in the TiKV configuration template. You can use default|write|lock + . + argument name to specify the configuration of different CFs. For kvdb, you can set it to default, write, or lock; for raftdb, you can only set it to default.
  • -v is used to specify the configuration value.
$ tikv-ctl modify-tikv-config -m kvdb -n max_background_jobs -v 8 success! $ tikv-ctl modify-tikv-config -m kvdb -n write.block-cache-size -v 256MB success! $ tikv-ctl modify-tikv-config -m raftdb -n default.disable_auto_compactions -v true success!

Force Region to recover the service from failure of multiple replicas

Use the unsafe-recover remove-fail-stores command to remove the failed machines from the peer list of Regions. Then after you restart TiKV, these Regions can continue to provide services using the other healthy replicas. This command is usually used in circumstances where multiple TiKV stores are damaged or deleted.

The -s option accepts multiple store_id separated by comma and uses the -r flag to specify involved Regions. Otherwise, all Regions' peers located on these stores will be removed by default.

$ tikv-ctl --db /path/to/tikv/db unsafe-recover remove-fail-stores -s 3 -r 1001,1002 success!

Recover from MVCC data corruption

Use the recover-mvcc command in circumstances where TiKV cannot run normally caused by MVCC data corruption. It cross-checks 3 CFs ("default", "write", "lock") to recover from various kinds of inconsistency.

  • Use the -r option to specify involved Regions by region_id.
  • Use the -p option to specify PD endpoints.
$ tikv-ctl --db /path/to/tikv/db recover-mvcc -r 1001,1002 -p 127.0.0.1:2379 success!

Ldb Command

The ldb command line tool offers multiple data access and database administration commands. Some examples are listed below. For more information, refer to the help message displayed when running tikv-ctl ldb or check the documents from RocksDB.

Examples of data access sequence:

To dump an existing RocksDB in HEX:

$ tikv-ctl ldb --hex --db=/tmp/db dump

To dump the manifest of an existing RocksDB:

$ tikv-ctl ldb --hex manifest_dump --path=/tmp/db/MANIFEST-000001

You can specify the column family that your query is against using the --column_family=<string> command line.

--try_load_options loads the database options file to open the database. It is recommended to always keep this option on when the database is running. If you open the database with default options, the LSM-tree might be messed up, which cannot be recovered automatically.