Modify Configuration Online

This document describes how to modify the cluster configuration online.

You can update the configuration of components (including TiDB, TiKV, and PD) online using SQL statements, without restarting the cluster components. Currently, the method of changing TiDB instance configuration differs from that of changing the configuration of other components (such as TiKV and PD).

Common Operations

This section describes the common operations of modifying configuration online.

View instance configuration

To view the configuration of all instances in the cluster, use the show config statement. The result is as follows:

show config;
+------+----------------+-------------------------+-----------+
| Type | Instance       | Name                    | Value     |
+------+----------------+-------------------------+-----------+
| tidb | 127.0.0.1:4001 | advertise-address       | 127.0.0.1 |
| tidb | 127.0.0.1:4001 | binlog.binlog-socket    |           |
| tidb | 127.0.0.1:4001 | binlog.enable           | false     |
| tidb | 127.0.0.1:4001 | binlog.ignore-error     | false     |
| tidb | 127.0.0.1:4001 | binlog.strategy         | range     |
| tidb | 127.0.0.1:4001 | binlog.write-timeout    | 15s       |
| tidb | 127.0.0.1:4001 | check-mb4-value-in-utf8 | true      |
...

You can filter the result by fields. For example:

show config where type='tidb'
show config where instance in (...)
show config where name like '%log%'
show config where type='tikv' and name='log.level'

Modify TiKV configuration online

When using the set config statement, you can modify the configuration of a single instance or of all instances according to the instance address or the component type.

  • Modify the configuration of all TiKV instances:

    set config tikv `split.qps-threshold`=1000;

  • Modify the configuration of a single TiKV instance:

    set config "127.0.0.1:20180" `split.qps-threshold`=1000;

If the modification is successful, Query OK is returned:

Query OK, 0 rows affected (0.01 sec)

If an error occurs during the batch modification, a warning is returned:

set config tikv `log-level`='warn';
Query OK, 0 rows affected, 1 warning (0.04 sec)
show warnings;
+---------+------+------------------------------------------------------------------------------------------------------------+
| Level   | Code | Message                                                                                                    |
+---------+------+------------------------------------------------------------------------------------------------------------+
| Warning | 1105 | bad request to http://127.0.0.1:20180/config: fail to update, error: "config log-level can not be changed" |
+---------+------+------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

Batch modification does not guarantee atomicity: the modification might succeed on some instances and fail on others. If you modify the configuration of the entire TiKV cluster using set config tikv key=val, the modification might fail on some instances. Use show warnings to check the result.

If some modifications fail, you need to re-execute the corresponding statement or modify each failed instance. If some TiKV instances cannot be accessed due to network issues or machine failure, modify these instances after they are recovered.
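
When many instances are involved, the failed ones can be collected from the show warnings output with a script. The following Python sketch is an illustration only, not an official tool. The function name failed_instances is hypothetical, and the sketch assumes that every relevant warning message has the same shape as the example above (bad request to http://<address>/config: ...). Messages in any other shape are ignored.

```python
import re
from typing import Iterable, List

# Assumption: failed updates are reported as
# "bad request to http://<address>/config: ...", as in the example warning.
_ADDRESS = re.compile(r"bad request to https?://([^/\s]+)/config")


def failed_instances(messages: Iterable[str]) -> List[str]:
    """Return the distinct instance addresses named in warning messages."""
    failed: List[str] = []
    for message in messages:
        match = _ADDRESS.search(message)
        # Keep the first occurrence of each address, in the order seen.
        if match and match.group(1) not in failed:
            failed.append(match.group(1))
    return failed
```

Each returned address can then be used as the target of a single-instance set config statement.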

If a configuration item is successfully modified, the result is persisted in the configuration file, and the persisted value prevails in subsequent operations. The names of some configuration items might conflict with TiDB reserved words, such as limit and key. Enclose these configuration items in backticks (`). For example, `raftstore.raft-log-gc-size-limit`.
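
If statements are generated by a script, the quoting rule can be applied mechanically by always enclosing the item name in backticks. The following Python sketch illustrates this under stated assumptions: the helper name build_set_config is hypothetical, the target is either a component type (tikv or pd) or an instance address as in the examples above, and only numeric and string values are covered.

```python
def build_set_config(target: str, name: str, value) -> str:
    """Build a `set config` statement (hypothetical helper, not part of TiDB)."""
    if "`" in name:
        raise ValueError("configuration item names must not contain backticks")
    # Component types are written bare; instance addresses are double-quoted,
    # following the examples in this document.
    if target in ("tikv", "pd"):
        target_sql = target
    else:
        if '"' in target:
            raise ValueError("instance addresses must not contain double quotes")
        target_sql = '"' + target + '"'
    # Assumption: only numbers and strings are handled in this sketch.
    if isinstance(value, bool):
        raise TypeError("boolean values are not covered by this sketch")
    if isinstance(value, (int, float)):
        value_sql = str(value)
    else:
        # Strings are single-quoted; embedded single quotes are doubled.
        value_sql = "'" + str(value).replace("'", "''") + "'"
    return "set config " + target_sql + " `" + name + "`=" + value_sql + ";"
```

For example, build_set_config("tikv", "raftstore.raft-log-gc-size-limit", "72MB") yields a statement in which the item name is already enclosed in backticks. The value 72MB is only a placeholder for illustration.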

The following TiKV configuration items can be modified online:

| Configuration item | Description |
| --- | --- |
| raftstore.raft-max-inflight-msgs | The number of Raft logs to be confirmed. If this number is exceeded, the Raft state machine slows down log sending. |
| raftstore.raft-log-gc-tick-interval | The time interval at which the polling task of deleting Raft logs is scheduled |
| raftstore.raft-log-gc-threshold | The soft limit on the maximum allowable number of residual Raft logs |
| raftstore.raft-log-gc-count-limit | The hard limit on the allowable number of residual Raft logs |
| raftstore.raft-log-gc-size-limit | The hard limit on the allowable size of residual Raft logs |
| raftstore.raft-max-size-per-msg | The soft limit on the size of a single message packet that is allowed to be generated |
| raftstore.raft-entry-cache-life-time | The maximum remaining time allowed for the log cache in memory |
| raftstore.split-region-check-tick-interval | The time interval at which to check whether a Region split is needed |
| raftstore.region-split-check-diff | The maximum value by which the Region data is allowed to exceed before Region split |
| raftstore.region-compact-check-interval | The time interval at which to check whether it is necessary to manually trigger RocksDB compaction |
| raftstore.region-compact-check-step | The number of Regions checked at one time for each round of manual compaction |
| raftstore.region-compact-min-tombstones | The number of tombstones required to trigger RocksDB compaction |
| raftstore.region-compact-tombstones-percent | The proportion of tombstones required to trigger RocksDB compaction |
| raftstore.pd-heartbeat-tick-interval | The time interval at which a Region's heartbeat to PD is triggered |
| raftstore.pd-store-heartbeat-tick-interval | The time interval at which a store's heartbeat to PD is triggered |
| raftstore.snap-mgr-gc-tick-interval | The time interval at which the recycle of expired snapshot files is triggered |
| raftstore.snap-gc-timeout | The longest time for which a snapshot file is saved |
| raftstore.lock-cf-compact-interval | The time interval at which TiKV triggers a manual compaction for the Lock Column Family |
| raftstore.lock-cf-compact-bytes-threshold | The size at which TiKV triggers a manual compaction for the Lock Column Family |
| raftstore.messages-per-tick | The maximum number of messages processed per batch |
| raftstore.max-peer-down-duration | The longest inactive duration allowed for a peer |
| raftstore.max-leader-missing-duration | The longest duration allowed for a peer to be without a leader. If this value is exceeded, the peer verifies with PD whether it has been deleted. |
| raftstore.abnormal-leader-missing-duration | The normal duration allowed for a peer to be without a leader. If this value is exceeded, the peer is seen as abnormal and marked in metrics and logs. |
| raftstore.peer-stale-state-check-interval | The time interval to check whether a peer is without a leader |
| raftstore.consistency-check-interval | The time interval to check consistency (NOT recommended because it is not compatible with the garbage collection in TiDB) |
| raftstore.raft-store-max-leader-lease | The longest trusted period of a Raft leader |
| raftstore.merge-check-tick-interval | The time interval for merge check |
| raftstore.cleanup-import-sst-interval | The time interval to check expired SST files |
| raftstore.local-read-batch-size | The maximum number of read requests processed in one batch |
| raftstore.hibernate-timeout | The shortest wait duration before entering hibernation upon start. Within this duration, TiKV does not hibernate (not released). |
| raftstore.apply-pool-size | The number of threads in the pool that flushes data to the disk, which is the size of the Apply thread pool |
| raftstore.store-pool-size | The number of threads in the pool that processes Raft, which is the size of the Raftstore thread pool |
| raftstore.apply-max-batch-size | Raft state machines process data write requests in batches by the BatchSystem. This configuration item specifies the maximum number of Raft state machines that can execute the requests in one batch. |
| raftstore.store-max-batch-size | Raft state machines process requests for flushing logs into the disk in batches by the BatchSystem. This configuration item specifies the maximum number of Raft state machines that can process the requests in one batch. |
| readpool.unified.max-thread-count | The maximum number of threads in the thread pool that uniformly processes read requests, which is the size of the UnifyReadPool thread pool |
| coprocessor.split-region-on-table | Determines whether to split Regions by table |
| coprocessor.batch-split-limit | The threshold of Region split in batches |
| coprocessor.region-max-size | The maximum size of a Region |
| coprocessor.region-split-size | The size of the newly split Region |
| coprocessor.region-max-keys | The maximum number of keys allowed in a Region |
| coprocessor.region-split-keys | The number of keys in the newly split Region |
| pessimistic-txn.wait-for-lock-timeout | The longest duration that a pessimistic transaction waits for the lock |
| pessimistic-txn.wake-up-delay-duration | The duration after which a pessimistic transaction is woken up |
| pessimistic-txn.pipelined | Determines whether to enable the pipelined pessimistic locking process |
| pessimistic-txn.in-memory | Determines whether to enable the in-memory pessimistic lock |
| gc.ratio-threshold | The threshold at which Region GC is skipped (the number of GC versions/the number of keys) |
| gc.batch-keys | The number of keys processed in one batch |
| gc.max-write-bytes-per-sec | The maximum bytes that can be written into RocksDB per second |
| gc.enable-compaction-filter | Whether to enable compaction filter |
| gc.compaction-filter-skip-version-check | Whether to skip the cluster version check of compaction filter (not released) |
| {db-name}.max-total-wal-size | The maximum size of total WAL |
| {db-name}.max-background-jobs | The number of background threads in RocksDB |
| {db-name}.max-background-flushes | The maximum number of flush threads in RocksDB |
| {db-name}.max-open-files | The total number of files that RocksDB can open |
| {db-name}.compaction-readahead-size | The size of readahead during compaction |
| {db-name}.bytes-per-sync | The rate at which OS incrementally synchronizes files to disk while these files are being written asynchronously |
| {db-name}.wal-bytes-per-sync | The rate at which OS incrementally synchronizes WAL files to disk while the WAL files are being written |
| {db-name}.writable-file-max-buffer-size | The maximum buffer size used in WritableFileWrite |
| {db-name}.{cf-name}.block-cache-size | The cache size of a block |
| {db-name}.{cf-name}.write-buffer-size | The size of a memtable |
| {db-name}.{cf-name}.max-write-buffer-number | The maximum number of memtables |
| {db-name}.{cf-name}.max-bytes-for-level-base | The maximum number of bytes at base level (L1) |
| {db-name}.{cf-name}.target-file-size-base | The size of the target file at base level |
| {db-name}.{cf-name}.level0-file-num-compaction-trigger | The maximum number of files at L0 that trigger compaction |
| {db-name}.{cf-name}.level0-slowdown-writes-trigger | The maximum number of files at L0 that trigger write stall |
| {db-name}.{cf-name}.level0-stop-writes-trigger | The maximum number of files at L0 that completely block write |
| {db-name}.{cf-name}.max-compaction-bytes | The maximum number of bytes written into disk per compaction |
| {db-name}.{cf-name}.max-bytes-for-level-multiplier | The default amplification multiple for each layer |
| {db-name}.{cf-name}.disable-auto-compactions | Enables or disables automatic compaction |
| {db-name}.{cf-name}.soft-pending-compaction-bytes-limit | The soft limit on the pending compaction bytes |
| {db-name}.{cf-name}.hard-pending-compaction-bytes-limit | The hard limit on the pending compaction bytes |
| {db-name}.{cf-name}.titan.blob-run-mode | The mode of processing blob files |
| storage.block-cache.capacity | The size of shared block cache (supported since v4.0.3) |
| storage.scheduler-worker-pool-size | The number of threads in the Scheduler thread pool |
| backup.num-threads | The number of backup threads (supported since v4.0.3) |
| split.qps-threshold | The threshold to execute load-base-split on a Region. If the QPS of read requests for a Region exceeds qps-threshold for a consecutive period of time, this Region should be split. |
| split.byte-threshold | The threshold to execute load-base-split on a Region. If the traffic of read requests for a Region exceeds the byte-threshold for a consecutive period of time, this Region should be split. |
| split.split-balance-score | The parameter of load-base-split, which ensures the load of the two split Regions is as balanced as possible. The smaller the value is, the more balanced the load is. But setting it too small might cause split failure. |
| split.split-contained-score | The parameter of load-base-split. The smaller the value, the fewer cross-Region visits after Region split. |
| cdc.min-ts-interval | The time interval at which Resolved TS is forwarded |
| cdc.old-value-cache-memory-quota | The upper limit of memory occupied by the TiCDC Old Value entries |
| cdc.sink-memory-quota | The upper limit of memory occupied by TiCDC data change events |
| cdc.incremental-scan-speed-limit | The upper limit on the speed of incremental scanning for historical data |
| cdc.incremental-scan-concurrency | The maximum number of concurrent incremental scanning tasks for historical data |

In the table above, parameters with the {db-name} or {db-name}.{cf-name} prefix are configurations related to RocksDB. The optional values of db-name are rocksdb and raftdb.

  • When db-name is rocksdb, the optional values of cf-name are defaultcf, writecf, lockcf, and raftcf.
  • When db-name is raftdb, the value of cf-name can be defaultcf.
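
The placeholder rules above can be expanded into concrete configuration item names. The following Python sketch shows one way to do this. The names CF_NAMES and expand_item are hypothetical, and the db-name and cf-name values are taken only from the two rules listed above.

```python
from typing import List

# Allowed cf-name values per db-name, as listed in this section.
CF_NAMES = {
    "rocksdb": ["defaultcf", "writecf", "lockcf", "raftcf"],
    "raftdb": ["defaultcf"],
}


def expand_item(template: str) -> List[str]:
    """Expand {db-name} and {cf-name} placeholders into concrete item names."""
    if "{db-name}" not in template:
        # Items without placeholders are returned unchanged.
        return [template]
    names: List[str] = []
    for db_name, cf_names in CF_NAMES.items():
        with_db = template.replace("{db-name}", db_name)
        if "{cf-name}" in with_db:
            names.extend(with_db.replace("{cf-name}", cf) for cf in cf_names)
        else:
            names.append(with_db)
    return names
```

For example, expand_item("{db-name}.max-open-files") returns rocksdb.max-open-files and raftdb.max-open-files.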

For detailed parameter description, refer to TiKV Configuration File.

Modify PD configuration online

Currently, PD does not support the separate configuration for each instance. All PD instances share the same configuration.

You can modify the PD configurations using the following statement:

set config pd `log.level`='info';

If the modification is successful, Query OK is returned:

Query OK, 0 rows affected (0.01 sec)

If a configuration item is successfully modified, the result is persisted in etcd instead of in the configuration file, and the configuration in etcd prevails in subsequent operations. The names of some configuration items might conflict with TiDB reserved words. Enclose these configuration items in backticks (`). For example, `schedule.leader-schedule-limit`.

The following PD configuration items can be modified online:

| Configuration item | Description |
| --- | --- |
| log.level | The log level |
| cluster-version | The cluster version |
| schedule.max-merge-region-size | Controls the size limit of Region Merge (in MB) |
| schedule.max-merge-region-keys | Specifies the maximum number of Region Merge keys |
| schedule.patrol-region-interval | Determines the frequency at which replicaChecker checks the health state of a Region |
| schedule.split-merge-interval | Determines the time interval of performing split and merge operations on the same Region |
| schedule.max-snapshot-count | Determines the maximum number of snapshots that a single store can send or receive at the same time |
| schedule.max-pending-peer-count | Determines the maximum number of pending peers in a single store |
| schedule.max-store-down-time | The downtime after which PD judges that the disconnected store cannot be recovered |
| schedule.leader-schedule-policy | Determines the policy of Leader scheduling |
| schedule.leader-schedule-limit | The number of Leader scheduling tasks performed at the same time |
| schedule.region-schedule-limit | The number of Region scheduling tasks performed at the same time |
| schedule.replica-schedule-limit | The number of Replica scheduling tasks performed at the same time |
| schedule.merge-schedule-limit | The number of the Region Merge scheduling tasks performed at the same time |
| schedule.hot-region-schedule-limit | The number of hot Region scheduling tasks performed at the same time |
| schedule.hot-region-cache-hits-threshold | Determines the threshold at which a Region is considered a hot spot |
| schedule.high-space-ratio | The threshold ratio below which the capacity of the store is sufficient |
| schedule.low-space-ratio | The threshold ratio above which the capacity of the store is insufficient |
| schedule.tolerant-size-ratio | Controls the balance buffer size |
| schedule.enable-remove-down-replica | Determines whether to enable the feature that automatically removes DownReplica |
| schedule.enable-replace-offline-replica | Determines whether to enable the feature that migrates OfflineReplica |
| schedule.enable-make-up-replica | Determines whether to enable the feature that automatically supplements replicas |
| schedule.enable-remove-extra-replica | Determines whether to enable the feature that removes extra replicas |
| schedule.enable-location-replacement | Determines whether to enable isolation level check |
| schedule.enable-cross-table-merge | Determines whether to enable cross-table merge |
| schedule.enable-one-way-merge | Enables one-way merge, which only allows merging with the next adjacent Region |
| replication.max-replicas | Sets the maximum number of replicas |
| replication.location-labels | The topology information of a TiKV cluster |
| replication.enable-placement-rules | Enables Placement Rules |
| replication.strictly-match-label | Enables the label check |
| pd-server.use-region-storage | Enables independent Region storage |
| pd-server.max-gap-reset-ts | Sets the maximum interval of resetting timestamp (BR) |
| pd-server.key-type | Sets the cluster key type |
| pd-server.metric-storage | Sets the storage address of the cluster metrics |
| pd-server.dashboard-address | Sets the dashboard address |
| replication-mode.replication-mode | Sets the backup mode |

For detailed parameter description, refer to PD Configuration File.

Modify TiDB configuration online

Currently, the method of changing TiDB configuration is different from that of changing TiKV and PD configurations. You can modify TiDB configuration by using system variables.

The following example shows how to modify slow-threshold online by using the tidb_slow_log_threshold variable.

The default value of slow-threshold is 300 ms. You can set it to 200 ms by using tidb_slow_log_threshold.

set tidb_slow_log_threshold = 200;
Query OK, 0 rows affected (0.00 sec)
select @@tidb_slow_log_threshold;
+---------------------------+
| @@tidb_slow_log_threshold |
+---------------------------+
| 200                       |
+---------------------------+
1 row in set (0.00 sec)

The following TiDB configuration items can be modified online:

| Configuration item | SQL variable |
| --- | --- |
| mem-quota-query | tidb_mem_quota_query |
| log.enable-slow-log | tidb_enable_slow_log |
| log.slow-threshold | tidb_slow_log_threshold |
| log.expensive-threshold | tidb_expensive_query_time_threshold |
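
The mapping between configuration items and system variables can also be kept in a small lookup table when statements are generated by a script. The following Python sketch is an illustration only. The names TIDB_VARIABLES and build_set_variable are hypothetical, and the sketch handles only integer values, as in the tidb_slow_log_threshold example above.

```python
# Configuration item -> system variable, copied from the list above.
TIDB_VARIABLES = {
    "mem-quota-query": "tidb_mem_quota_query",
    "log.enable-slow-log": "tidb_enable_slow_log",
    "log.slow-threshold": "tidb_slow_log_threshold",
    "log.expensive-threshold": "tidb_expensive_query_time_threshold",
}


def build_set_variable(config_item: str, value: int) -> str:
    """Build a `set` statement for a TiDB configuration item (integer values only)."""
    if config_item not in TIDB_VARIABLES:
        raise KeyError(config_item + " is not in the list of online-modifiable items")
    # int() rejects values that are not integer-like before they reach the SQL text.
    return "set " + TIDB_VARIABLES[config_item] + " = " + str(int(value)) + ";"
```

For example, build_set_variable("log.slow-threshold", 200) produces the same statement as the one used earlier in this section.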