# TiKV Control User Guide

TiKV Control (`tikv-ctl`) is a command line tool of TiKV, used to manage the cluster.

When you compile TiKV, the `tikv-ctl` command is compiled at the same time. If the cluster is deployed using TiDB Ansible, the `tikv-ctl` binary file is in the corresponding `tidb-ansible/resources/bin` directory. If the cluster is deployed using the binary, the `tikv-ctl` file is in the `bin` directory together with other files such as `tidb-server`, `pd-server`, and `tikv-server`.
## General options

`tikv-ctl` provides two operation modes:

- Remote mode: use the `--host` option to specify the service address of TiKV. In this mode, if SSL is enabled in TiKV, `tikv-ctl` also needs to specify the related certificate files. For example:

    ```shell
    $ tikv-ctl --ca-path ca.pem --cert-path client.pem --key-path client-key.pem --host 127.0.0.1:20160 <subcommands>
    ```

    However, sometimes `tikv-ctl` communicates with PD instead of TiKV. In this case, use the `--pd` option instead of `--host`. Here is an example:

    ```shell
    $ tikv-ctl --pd 127.0.0.1:2379 compact-cluster
    store:"127.0.0.1:20160" compact db:KV cf:default range:([], []) success!
    ```

- Local mode: use the `--db` option to specify the local TiKV data directory path. In this mode, you need to stop the running TiKV instance first.

Unless otherwise noted, all commands support both the remote mode and the local mode.
Additionally, `tikv-ctl` has two simple commands, `--to-hex` and `--to-escaped`, which convert a key between its two representations. Generally, use the `escaped` form of the key. For example:

```shell
$ tikv-ctl --to-escaped 0xaaff
\252\377
$ tikv-ctl --to-hex "\252\377"
AAFF
```
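To make the two forms concrete, here is a minimal Python sketch of the conversion the two commands perform. It assumes the escaped form prints printable ASCII bytes literally and every other byte as a three-digit octal escape (`\NNN`); the real `tikv-ctl` escaping also handles characters such as `\n` and `"` specially, which this sketch omits.

```python
def to_escaped(data: bytes) -> str:
    """Render raw key bytes in the escaped form used by tikv-ctl output."""
    out = []
    for b in data:
        if 0x20 <= b <= 0x7E and b != ord("\\"):
            out.append(chr(b))          # printable ASCII stays literal
        else:
            out.append("\\%03o" % b)    # everything else becomes \NNN (octal)
    return "".join(out)

def to_hex(escaped: str) -> str:
    """Parse an escaped key back into bytes and print it as uppercase hex."""
    out = bytearray()
    i = 0
    while i < len(escaped):
        if escaped[i] == "\\":
            out.append(int(escaped[i + 1:i + 4], 8))  # consume one \NNN escape
            i += 4
        else:
            out.append(ord(escaped[i]))
            i += 1
    return out.hex().upper()

print(to_escaped(bytes.fromhex("aaff")))  # \252\377
print(to_hex("\\252\\377"))               # AAFF
```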
## Subcommands, options, and flags

This section describes the subcommands that `tikv-ctl` supports in detail. Some subcommands support many options. For full details, run `tikv-ctl --help <subcommand>`.
### View information of the Raft state machine

Use the `raft` subcommand to view the status of the Raft state machine at a specific moment. The status information includes two parts: three structs (`RegionLocalState`, `RaftLocalState`, and `RegionApplyState`), and the corresponding Entries of a certain piece of log.

Use the `region` and `log` subcommands to obtain the above information respectively. Both subcommands support the remote mode and the local mode. Their usage and output are as follows:

```shell
$ tikv-ctl --host 127.0.0.1:20160 raft region -r 2
region id: 2
region state key: \001\003\000\000\000\000\000\000\000\002\001
region state: Some(region {id: 2 region_epoch {conf_ver: 3 version: 1} peers {id: 3 store_id: 1} peers {id: 5 store_id: 4} peers {id: 7 store_id: 6}})
raft state key: \001\002\000\000\000\000\000\000\000\002\002
raft state: Some(hard_state {term: 307 vote: 5 commit: 314617} last_index: 314617)
apply state key: \001\002\000\000\000\000\000\000\000\002\003
apply state: Some(applied_index: 314617 truncated_state {index: 313474 term: 151})
```
### View the Region size

Use the `size` command to view the Region size:

```shell
$ tikv-ctl --db /path/to/tikv/db size -r 2
region id: 2
cf default region size: 799.703 MB
cf write region size: 41.250 MB
cf lock region size: 27616
```
### Scan to view MVCC of a specific range

The `--from` and `--to` options of the `scan` command accept two escaped forms of the raw key, and the `--show-cf` flag specifies the column families that you need to view.

```shell
$ tikv-ctl --db /path/to/tikv/db scan --from 'zm' --limit 2 --show-cf lock,default,write
key: zmBootstr\377a\377pKey\000\000\377\000\000\373\000\000\000\000\000\377\000\000s\000\000\000\000\000\372
write cf value: start_ts: 399650102814441473 commit_ts: 399650102814441475 short_value: "20"
key: zmDB:29\000\000\377\000\374\000\000\000\000\000\000\377\000H\000\000\000\000\000\000\371
write cf value: start_ts: 399650105239273474 commit_ts: 399650105239273475 short_value: "\000\000\000\000\000\000\000\002"
write cf value: start_ts: 399650105199951882 commit_ts: 399650105213059076 short_value: "\000\000\000\000\000\000\000\001"
```
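The `start_ts` and `commit_ts` values in the output above are TSO timestamps allocated by PD. Assuming the standard TiDB layout (high bits: physical wall-clock time in milliseconds; low 18 bits: a logical counter), they can be decoded into readable times with a short sketch like this:

```python
from datetime import datetime, timezone

LOGICAL_BITS = 18  # low 18 bits of a TSO timestamp are the logical counter

def decode_tso(ts: int):
    """Split a TSO timestamp into (physical_ms, logical, UTC datetime)."""
    physical_ms = ts >> LOGICAL_BITS
    logical = ts & ((1 << LOGICAL_BITS) - 1)
    wall = datetime.fromtimestamp(physical_ms / 1000, tz=timezone.utc)
    return physical_ms, logical, wall

start_ts = 399650102814441473   # from the scan output above
commit_ts = 399650102814441475

for label, ts in (("start", start_ts), ("commit", commit_ts)):
    physical_ms, logical, wall = decode_tso(ts)
    print(f"{label}_ts: physical={physical_ms}ms logical={logical} "
          f"({wall:%Y-%m-%d %H:%M:%S} UTC)")
```

Because the two timestamps here differ only in the logical counter, the transaction started and committed within the same millisecond.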
### View MVCC of a given key

Similar to the `scan` command, the `mvcc` command can be used to view the MVCC of a given key.

```shell
$ tikv-ctl --db /path/to/tikv/db mvcc -k "zmDB:29\000\000\377\000\374\000\000\000\000\000\000\377\000H\000\000\000\000\000\000\371" --show-cf=lock,write,default
key: zmDB:29\000\000\377\000\374\000\000\000\000\000\000\377\000H\000\000\000\000\000\000\371
write cf value: start_ts: 399650105239273474 commit_ts: 399650105239273475 short_value: "\000\000\000\000\000\000\000\002"
write cf value: start_ts: 399650105199951882 commit_ts: 399650105213059076 short_value: "\000\000\000\000\000\000\000\001"
```

In this command, the key is also the escaped form of the raw key.
### Print a specific key value

To print the value of a key, use the `print` command.
### Print some properties of a Region

To record Region state details, TiKV writes some statistics into the SST files of Regions. To view these properties, run `tikv-ctl` with the `region-properties` subcommand:

```shell
$ tikv-ctl --host localhost:20160 region-properties -r 2
num_files: 0
num_entries: 0
num_deletes: 0
mvcc.min_ts: 18446744073709551615
mvcc.max_ts: 0
mvcc.num_rows: 0
mvcc.num_puts: 0
mvcc.num_versions: 0
mvcc.max_row_versions: 0
middle_key_by_approximate_size:
```

These properties can be used to check whether a Region is healthy. If it is not, you can use them to fix the Region, for example, by splitting the Region manually at `middle_key_by_approximate_size`.
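For automated processing, the property output can be turned into structured data. Below is a hypothetical helper (not part of `tikv-ctl`) that parses the `name: value` lines above into a dict; a script could then flag suspicious values such as an unusually high `mvcc.max_row_versions`.

```python
# Sample output copied from the region-properties example above.
raw = """\
num_files: 0
num_entries: 0
num_deletes: 0
mvcc.min_ts: 18446744073709551615
mvcc.max_ts: 0
mvcc.num_rows: 0
mvcc.num_puts: 0
mvcc.num_versions: 0
mvcc.max_row_versions: 0
middle_key_by_approximate_size:
"""

def parse_properties(text: str) -> dict:
    """Parse 'name: value' lines; numeric values become ints."""
    props = {}
    for line in text.splitlines():
        name, _, value = line.partition(":")
        value = value.strip()
        props[name.strip()] = int(value) if value.isdigit() else value
    return props

props = parse_properties(raw)
print(props["mvcc.num_rows"])             # 0
# mvcc.min_ts of u64::MAX together with mvcc.max_ts of 0 indicates that no
# MVCC records were observed in this Region's SST files.
print(props["mvcc.min_ts"] == 2**64 - 1)  # True
```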
### Compact data of each TiKV manually

Use the `compact` command to manually compact data of each TiKV. If you specify the `--from` and `--to` options, their values are also in the escaped form of the raw key. Use the `--host` option to specify the TiKV instance to compact. The `-d` option specifies the RocksDB instance to compact; the optional values are `kv` and `raft`. The `--threads` option specifies the compaction concurrency; its default value is 8. Generally, a higher concurrency means a faster compaction, but it might also affect the service. You need to choose an appropriate concurrency based on your scenario.

```shell
$ tikv-ctl --host 127.0.0.1:20160 compact -d kv
success!
```
### Compact data of the whole TiKV cluster manually

Use the `compact-cluster` command to manually compact data of the whole TiKV cluster. The flags of this command have the same meanings and usage as those of the `compact` command.
### Set a Region to Tombstone

The `tombstone` command is usually used in circumstances where sync-log is not enabled and some data written in the Raft state machine is lost because of a power failure.

In a TiKV instance, you can use this command to set the status of some Regions to Tombstone. Then when you restart the instance, those Regions are skipped. Those Regions need to have enough healthy replicas in other TiKV instances to be able to continue reading and writing through the Raft mechanism.

Follow these two steps to set a Region to Tombstone:

1. Remove the corresponding Peer of this Region on the machine using `pd-ctl`:

    ```shell
    pd-ctl operator add remove-peer <region_id> <store_id>
    ```

2. Use the `tombstone` command to set the Region to Tombstone:

    ```shell
    tikv-ctl --db /path/to/tikv/db tombstone -p 127.0.0.1:2379 -r <region_id>
    success!
    ```
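The "enough healthy replicas" condition above is Raft's majority rule: a Region can keep serving reads and writes only if more than half of its peers survive. A small sketch of that arithmetic:

```python
def quorum(total_replicas: int) -> int:
    """Minimum number of peers a Raft group needs to make progress."""
    return total_replicas // 2 + 1

def can_serve(total_replicas: int, healthy_replicas: int) -> bool:
    return healthy_replicas >= quorum(total_replicas)

print(can_serve(3, 2))  # True: tombstoning 1 replica of 3 is safe
print(can_serve(3, 1))  # False: a single survivor cannot reach quorum
```

This is why tombstoning one Region replica is safe only while the other replicas still form a majority; otherwise you are in the multi-replica-failure scenario handled by `unsafe-recover` below.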
### Send a `consistency-check` request to TiKV

Use the `consistency-check` command to execute a consistency check among the replicas in the Raft group of a specific Region. If the check fails, TiKV itself panics. If the TiKV instance specified by `--host` is not the Region leader, an error is reported.

```shell
$ tikv-ctl --host 127.0.0.1:20160 consistency-check -r 2
success!
$ tikv-ctl --host 127.0.0.1:20161 consistency-check -r 2
DebugClient::check_region_consistency: RpcFailure(RpcStatus { status: Unknown, details: Some("StringError(\"Leader is on store 1\")") })
```
### Dump snapshot meta

This subcommand is used to parse a snapshot meta file at a given path and print the result.
### Print the Regions where the Raft state machine corrupts

To avoid checking these Regions while TiKV is started, you can use the `tombstone` command to set the Regions where the Raft state machine reports an error to Tombstone. Before running that command, use the `bad-regions` command to find out the Regions with errors, so as to combine multiple tools for automated processing.

```shell
$ tikv-ctl --db /path/to/tikv/db bad-regions
all regions are healthy
```

If the command is executed successfully, it prints the above information. Otherwise, it prints the list of bad Regions. Currently, the errors that can be detected include mismatches between `last index`, `commit index`, and `apply index`, and the loss of Raft log. Other conditions, such as damaged snapshot files, still need further support.
### View Region properties

- To view the properties of Region 2 locally on the TiKV instance deployed in `/path/to/tikv`:

    ```shell
    $ tikv-ctl --db /path/to/tikv/data/db region-properties -r 2
    ```

- To view the properties of Region 2 online on the TiKV instance running on `127.0.0.1:20160`:

    ```shell
    $ tikv-ctl --host 127.0.0.1:20160 region-properties -r 2
    ```
### Modify the RocksDB configuration of TiKV dynamically

You can use the `modify-tikv-config` command to dynamically modify configuration arguments. Currently, it only supports dynamically modifying RocksDB related arguments.

- `-m` is used to specify the target RocksDB. You can set it to `kvdb` or `raftdb`.
- `-n` is used to specify the configuration name. You can refer to the arguments of `[rocksdb]` and `[raftdb]` (corresponding to `kvdb` and `raftdb`) in the TiKV configuration template. You can use `default|write|lock + . + argument name` to specify the configuration of a specific CF. For `kvdb`, you can set it to `default`, `write`, or `lock`; for `raftdb`, you can only set it to `default`.
- `-v` is used to specify the configuration value.

```shell
$ tikv-ctl modify-tikv-config -m kvdb -n max_background_jobs -v 8
success!
$ tikv-ctl modify-tikv-config -m kvdb -n write.block-cache-size -v 256MB
success!
$ tikv-ctl modify-tikv-config -m raftdb -n default.disable_auto_compactions -v true
success!
```
### Force Regions to recover services from failure of multiple replicas

Use the `unsafe-recover remove-fail-stores` command to remove the failed machines from the peer list of the specified Regions. This command only supports the local mode. Before running it, you need to stop the target TiKV process to release the file lock.

The `-s` option accepts multiple `store_id`s separated by commas, and the `-r` flag specifies the involved Regions. To recover services from the failure of multiple replicas for all the Regions in a store, specify `--all-regions` instead of `-r`.

```shell
$ tikv-ctl --db /path/to/tikv/db unsafe-recover remove-fail-stores -s 3 -r 1001,1002
success!
$ tikv-ctl --db /path/to/tikv/db unsafe-recover remove-fail-stores -s 4,5 --all-regions
```

After you restart TiKV, these Regions can continue to provide services using the other healthy replicas. This command is usually used in circumstances where multiple TiKV stores are damaged or deleted.
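Conceptually, `unsafe-recover remove-fail-stores` rewrites each Region's peer list so that peers on the failed stores disappear, and the remaining peers then form the entire Raft group (which is why a majority check against the old membership no longer blocks progress). The real tool edits `RegionLocalState` in RocksDB; the following is only a simplified model with invented data:

```python
def remove_fail_stores(peers, failed_store_ids):
    """peers: list of (peer_id, store_id); returns the surviving peer list."""
    return [(pid, sid) for pid, sid in peers if sid not in failed_store_ids]

# Peer list in the style of the earlier `raft region` output: three peers.
peers = [(3, 1), (5, 4), (7, 6)]
survivors = remove_fail_stores(peers, failed_store_ids={4, 6})
print(survivors)  # [(3, 1)] - the lone survivor is now a 1-replica Raft group
```

This also makes clear why the command is "unsafe": any writes that only reached the removed replicas are silently lost.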
### Recover from MVCC data corruption

Use the `recover-mvcc` command in circumstances where TiKV cannot run normally because of MVCC data corruption. It cross-checks the 3 CFs ("default", "write", and "lock") to recover from various kinds of inconsistency.

Use the `-r` option to specify the involved Regions by `region_id`, and the `-p` option to specify the PD endpoints.

```shell
$ tikv-ctl --db /path/to/tikv/db recover-mvcc -r 1001,1002 -p 127.0.0.1:2379
success!
```
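To illustrate what "cross-checking the 3 CFs" means, here is a heavily simplified model of one such check: a committed write record whose value is not inlined (no `short_value`) must have a matching default-CF entry at its `start_ts`, otherwise the key is inconsistent. All names and data structures below are invented for illustration and model only one of the checks `recover-mvcc` performs.

```python
# default CF: {(key, start_ts): value}
default_cf = {("k1", 100): b"v1"}
# write CF: {(key, commit_ts): start_ts}  (assume none of these are short values)
write_cf = {("k1", 105): 100, ("k2", 110): 108}

def find_inconsistencies(write_cf, default_cf):
    """Return (key, start_ts, commit_ts) for write records missing their value."""
    bad = []
    for (key, commit_ts), start_ts in write_cf.items():
        if (key, start_ts) not in default_cf:
            bad.append((key, start_ts, commit_ts))
    return bad

print(find_inconsistencies(write_cf, default_cf))
# [('k2', 108, 110)] - k2's committed version lost its value in the default CF
```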