TiDB Cluster Configurations in Kubernetes

This document introduces the following items of a TiDB cluster in Kubernetes:

  • The configuration parameters
  • The configuration of resources
  • The configuration of disaster recovery

Configuration parameters

TiDB Operator uses Helm to deploy and manage TiDB clusters. The configuration file obtained through Helm provides basic configuration by default, with which you can quickly start a TiDB cluster. However, if you need special configurations or are deploying in a production environment, you need to manually configure the corresponding parameters according to the list below.
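
To override any of these parameters, put only the keys you want to change into the custom values file that you pass to Helm with -f. The following is a minimal sketch, assuming the tidb-cluster chart; the parameter names are taken from the list below and the values are placeholders, not recommendations:

    # values.yaml fragment: override only what you need
    timezone: UTC
    pvReclaimPolicy: Retain

    pd:
      replicas: 3
      storageClassName: local-storage

    tidb:
      replicas: 2
      service:
        type: NodePort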

Each item below gives the parameter, its description, and its default value.

  • rbac.create: Whether to enable the RBAC mode of Kubernetes. Default value: true
  • clusterName: The TiDB cluster name. This variable is unset by default. In this case, tidb-cluster directly replaces it with ReleaseName when the cluster is being installed. Default value: nil
  • extraLabels: Adds extra labels to the TidbCluster object (CRD). See labels. Default value: {}
  • schedulerName: The scheduler used by the TiDB cluster. Default value: tidb-scheduler
  • timezone: The default timezone used by the TiDB cluster. Default value: UTC
  • pvReclaimPolicy: The reclaim policy for the PVs (Persistent Volumes) used by the TiDB cluster. Default value: Retain
  • services[0].name: The name of the service that the TiDB cluster exposes. Default value: nil
  • services[0].type: The type of the service that the TiDB cluster exposes (selected from ClusterIP, NodePort and LoadBalancer). Default value: nil
  • discovery.image: The image of PD's service discovery component in the TiDB cluster. This component provides service discovery for each PD instance to coordinate the starting sequence when the PD cluster is started for the first time. Default value: pingcap/tidb-operator:v1.0.0
  • discovery.imagePullPolicy: The pulling policy for the image of PD's service discovery component. Default value: IfNotPresent
  • discovery.resources.limits.cpu: The CPU resource limit of PD's service discovery component. Default value: 250m
  • discovery.resources.limits.memory: The memory resource limit of PD's service discovery component. Default value: 150Mi
  • discovery.resources.requests.cpu: The CPU resource request of PD's service discovery component. Default value: 80m
  • discovery.resources.requests.memory: The memory resource request of PD's service discovery component. Default value: 50Mi
  • enableConfigMapRollout: Whether to enable the automatic rolling update of the TiDB cluster. If enabled, the TiDB cluster automatically updates the corresponding components when the ConfigMap of this cluster changes. This configuration is only supported in TiDB Operator v1.0 and later versions. Default value: false
  • pd.config: The configuration of PD. Check the config.toml file for the default PD configuration file (by choosing the tag of the corresponding PD version). See PD Configuration Flags for the detailed description of the configuration parameters (by choosing the corresponding document version). Here you must modify the configuration based on the format of the configuration file.
    If the version of TiDB Operator is v1.0.0 or earlier, the default value is nil. If the version of TiDB Operator is later than v1.0.0, the default value is:

        [log]
        level = "info"
        [replication]
        location-labels = ["region", "zone", "rack", "host"]

    Sample configuration:

        config: |
          [log]
          level = "info"
          [replication]
          location-labels = ["region", "zone", "rack", "host"]
  • pd.replicas: The number of Pods in PD. Default value: 3
  • pd.image: The PD image. Default value: pingcap/pd:v3.0.0-rc.1
  • pd.imagePullPolicy: The pulling policy for the PD image. Default value: IfNotPresent
  • pd.logLevel: The log level of PD. If the version of TiDB Operator is later than v1.0.0, configure this parameter via pd.config:

        [log]
        level = "info"

    Default value: info
  • pd.storageClassName: The storageClass used by PD. storageClassName refers to a type of storage provided by the Kubernetes cluster, which might map to a level of service quality, a backup policy, or any policy determined by the cluster administrator. Detailed reference: storage-classes. Default value: local-storage
  • pd.maxStoreDownTime: How long a store node can be disconnected before it is marked as down. When the state changes to down, the store node starts migrating its data to other store nodes. If the version of TiDB Operator is later than v1.0.0, configure this parameter via pd.config:

        [schedule]
        max-store-down-time = "30m"

    Default value: 30m
  • pd.maxReplicas: The number of data replicas in the TiDB cluster. If the version of TiDB Operator is later than v1.0.0, configure this parameter via pd.config:

        [replication]
        max-replicas = 3

    Default value: 3
  • pd.resources.limits.cpu: The CPU resource limit per PD Pod. Default value: nil
  • pd.resources.limits.memory: The memory resource limit per PD Pod. Default value: nil
  • pd.resources.limits.storage: The storage limit per PD Pod. Default value: nil
  • pd.resources.requests.cpu: The CPU resource request of each PD Pod. Default value: nil
  • pd.resources.requests.memory: The memory resource request of each PD Pod. Default value: nil
  • pd.resources.requests.storage: The storage request of each PD Pod. Default value: 1Gi
  • pd.affinity: Defines PD's scheduling rules and preferences. Detailed reference: affinity-and-anti-affinity. Default value: {}
  • pd.nodeSelector: Ensures that PD Pods are only scheduled to nodes with the specified key-value pairs as labels. Detailed reference: nodeSelector. Default value: {}
  • pd.tolerations: Applies to PD Pods, allowing them to be scheduled to nodes with specified taints. Detailed reference: taint-and-toleration. Default value: {}
  • pd.annotations: Adds specific annotations to PD Pods. Default value: {}
  • tikv.config: The configuration of TiKV. Check the config-template.toml file for the default TiKV configuration file (by choosing the tag of the corresponding TiKV version). See TiKV Configuration Flags for the detailed description of the configuration parameters (by choosing the corresponding document version). Here you must modify the configuration based on the format of the configuration file.

    You need to explicitly configure the following two configuration items:

        [storage.block-cache]
          shared = true
          capacity = "1GB"

    It is recommended to set capacity to 50% of the value of tikv.resources.limits.memory.

        [readpool.coprocessor]
          high-concurrency = 8
          normal-concurrency = 8
          low-concurrency = 8

    It is recommended to set these concurrency values to 80% of the number of cores in tikv.resources.limits.cpu.

    If the version of TiDB Operator is v1.0.0-beta.3 or earlier, the default value is nil. If the version of TiDB Operator is later than v1.0.0-beta.3, the default value is:

        log-level = "info"

    Sample configuration:

        config: |
          log-level = "info"
  • tikv.replicas: The number of Pods in TiKV. Default value: 3
  • tikv.image: The TiKV image. Default value: pingcap/tikv:v3.0.0-rc.1
  • tikv.imagePullPolicy: The pulling policy for the TiKV image. Default value: IfNotPresent
  • tikv.logLevel: The level of TiKV logs. If the version of TiDB Operator is later than v1.0.0, configure this parameter via tikv.config:

        log-level = "info"

    Default value: info
  • tikv.storageClassName: The storageClass used by TiKV. storageClassName refers to a type of storage provided by the Kubernetes cluster, which might map to a level of service quality, a backup policy, or any policy determined by the cluster administrator. Detailed reference: storage-classes. Default value: local-storage
  • tikv.syncLog: Whether to enable raft log replication. Enabling this feature ensures that data is not lost when the power goes off. If the version of TiDB Operator is later than v1.0.0, configure this parameter via tikv.config:

        [raftstore]
        sync-log = true

    Default value: true
  • tikv.grpcConcurrency: Configures the thread pool size of the gRPC server. If the version of TiDB Operator is later than v1.0.0, configure this parameter via tikv.config:

        [server]
        grpc-concurrency = 4

    Default value: 4
  • tikv.resources.limits.cpu: The CPU resource limit per TiKV Pod. Default value: nil
  • tikv.resources.limits.memory: The memory resource limit per TiKV Pod. Default value: nil
  • tikv.resources.limits.storage: The storage limit per TiKV Pod. Default value: nil
  • tikv.resources.requests.cpu: The CPU resource request of each TiKV Pod. Default value: nil
  • tikv.resources.requests.memory: The memory resource request of each TiKV Pod. Default value: nil
  • tikv.resources.requests.storage: The storage request of each TiKV Pod. Default value: 10Gi
  • tikv.affinity: Defines TiKV's scheduling rules and preferences. Detailed reference: affinity-and-anti-affinity. Default value: {}
  • tikv.nodeSelector: Ensures that TiKV Pods are only scheduled to nodes with the specified key-value pairs as labels. Detailed reference: nodeSelector. Default value: {}
  • tikv.tolerations: Applies to TiKV Pods, allowing them to be scheduled to nodes with specified taints. Detailed reference: taint-and-toleration. Default value: {}
  • tikv.annotations: Adds specific annotations to TiKV Pods. Default value: {}
  • tikv.defaultcfBlockCacheSize: Specifies the size of the block cache, which is used to cache uncompressed blocks. A larger block cache speeds up reads. It is recommended to set this parameter to 30%-50% of the value of tikv.resources.limits.memory. If the version of TiDB Operator is later than v1.0.0, configure this parameter via tikv.config:

        [rocksdb.defaultcf]
        block-cache-size = "1GB"

    From TiKV v3.0.0 on, you do not need to configure [rocksdb.defaultcf].block-cache-size and [rocksdb.writecf].block-cache-size. Instead, configure [storage.block-cache].capacity.
    Default value: 1GB
  • tikv.writecfBlockCacheSize: Specifies the size of the writecf block cache. It is recommended to set this parameter to 10%-30% of the value of tikv.resources.limits.memory. If the version of TiDB Operator is later than v1.0.0, configure this parameter via tikv.config:

        [rocksdb.writecf]
        block-cache-size = "256MB"

    From TiKV v3.0.0 on, you do not need to configure [rocksdb.defaultcf].block-cache-size and [rocksdb.writecf].block-cache-size. Instead, configure [storage.block-cache].capacity.
    Default value: 256MB
  • tikv.readpoolStorageConcurrency: The sizes of the thread pools for high-priority, normal-priority, and low-priority operations in TiKV storage. If the version of TiDB Operator is later than v1.0.0, configure this parameter via tikv.config:

        [readpool.storage]
        high-concurrency = 4
        normal-concurrency = 4
        low-concurrency = 4

    Default value: 4
  • tikv.readpoolCoprocessorConcurrency: If tikv.resources.limits.cpu is greater than 8, set the value of tikv.readpoolCoprocessorConcurrency to tikv.resources.limits.cpu * 0.8. If the version of TiDB Operator is later than v1.0.0, configure this parameter via tikv.config:

        [readpool.coprocessor]
        high-concurrency = 8
        normal-concurrency = 8
        low-concurrency = 8

    Default value: 8
  • tikv.storageSchedulerWorkerPoolSize: The worker pool size of the TiKV scheduler. Increase it for write-heavy workloads, but keep it smaller than the total number of CPU cores. If the version of TiDB Operator is later than v1.0.0, configure this parameter via tikv.config:

        [storage]
        scheduler-worker-pool-size = 4

    Default value: 4
  • tidb.config: The configuration of TiDB. Check the config.toml.example file for the default TiDB configuration file (by choosing the tag of the corresponding TiDB version). See TiDB Configuration File Description for the detailed description of the configuration parameters (by choosing the corresponding document version). Here you must modify the configuration based on the format of the configuration file.

    You need to explicitly configure the following configuration item:

        [performance]
          max-procs = 0

    It is recommended to set max-procs to the number of cores corresponding to tidb.resources.limits.cpu.

    If the version of TiDB Operator is v1.0.0-beta.3 or earlier, the default value is nil. If the version of TiDB Operator is later than v1.0.0-beta.3, the default value is:

        [log]
        level = "info"

    Sample configuration:

        config: |
          [log]
          level = "info"
  • tidb.replicas: The number of Pods in TiDB. Default value: 2
  • tidb.image: The TiDB image. Default value: pingcap/tidb:v3.0.0-rc.1
  • tidb.imagePullPolicy: The pulling policy for the TiDB image. Default value: IfNotPresent
  • tidb.logLevel: The level of TiDB logs. If the version of TiDB Operator is later than v1.0.0, configure this parameter via tidb.config:

        [log]
        level = "info"

    Default value: info
  • tidb.resources.limits.cpu: The CPU resource limit per TiDB Pod. Default value: nil
  • tidb.resources.limits.memory: The memory resource limit per TiDB Pod. Default value: nil
  • tidb.resources.requests.cpu: The CPU resource request of each TiDB Pod. Default value: nil
  • tidb.resources.requests.memory: The memory resource request of each TiDB Pod. Default value: nil
  • tidb.passwordSecretName: The name of the Secret that stores the TiDB username and password. You can create this Secret with the command: kubectl create secret generic tidb-secret --from-literal=root=<root password> --namespace=<namespace>. If the parameter is unset, the TiDB root password is empty. Default value: nil
  • tidb.initSql: The initialization script that is executed after a TiDB cluster is successfully started. Default value: nil
  • tidb.affinity: Defines TiDB's scheduling rules and preferences. Detailed reference: affinity-and-anti-affinity. Default value: {}
  • tidb.nodeSelector: Ensures that TiDB Pods are only scheduled to nodes with the specified key-value pairs as labels. Detailed reference: nodeSelector. Default value: {}
  • tidb.tolerations: Applies to TiDB Pods, allowing them to be scheduled to nodes with specified taints. Detailed reference: taint-and-toleration. Default value: {}
  • tidb.annotations: Adds specific annotations to TiDB Pods. Default value: {}
  • tidb.maxFailoverCount: The maximum number of failovers for TiDB. If it is set to 3, for example, up to 3 failover TiDB instances are supported at the same time. Default value: 3
  • tidb.service.type: The type of service that the TiDB cluster exposes. Default value: NodePort
  • tidb.service.externalTrafficPolicy: Whether this Service routes external traffic to a node-local or cluster-wide endpoint. There are two options available: Cluster (the default) and Local. Cluster obscures the client source IP, and some traffic needs a second hop among nodes to reach the intended node, but it gives a good overall load distribution. Local preserves the client source IP and avoids a second hop for LoadBalancer and NodePort type services, but it risks potentially imbalanced traffic distribution. Detailed reference: External LoadBalancer. Default value: nil
  • tidb.service.loadBalancerIP: Specifies the IP of the LoadBalancer. Some cloud providers allow you to specify loadBalancerIP. In these cases, the LoadBalancer is created with the user-specified loadBalancerIP. If the loadBalancerIP field is not specified, the LoadBalancer is set up with an ephemeral IP address. If loadBalancerIP is specified but the cloud provider does not support this feature, the loadBalancerIP field you set is ignored. Default value: nil
  • tidb.service.mysqlNodePort: The NodePort on which the TiDB Service exposes the MySQL port.
  • tidb.service.exposeStatus: Whether the TiDB Service exposes the status port. Default value: true
  • tidb.service.statusNodePort: The NodePort on which the status port of the TiDB Service is exposed.
  • tidb.separateSlowLog: Whether to export the slow query log of TiDB in sidecar mode with an independent container. If the version of TiDB Operator is v1.0.0 or earlier, the default value is false. If the version of TiDB Operator is later than v1.0.0, the default value is true.
  • tidb.slowLogTailer.image: The image of TiDB's slowLogTailer. slowLogTailer is a sidecar container used to export the slow query log of TiDB. This configuration only takes effect when tidb.separateSlowLog=true. Default value: busybox:1.26.2
  • tidb.slowLogTailer.resources.limits.cpu: The CPU resource limit of each TiDB Pod's slowLogTailer. Default value: 100m
  • tidb.slowLogTailer.resources.limits.memory: The memory resource limit of each TiDB Pod's slowLogTailer. Default value: 50Mi
  • tidb.slowLogTailer.resources.requests.cpu: The CPU resource request of each TiDB Pod's slowLogTailer. Default value: 20m
  • tidb.slowLogTailer.resources.requests.memory: The memory resource request of each TiDB Pod's slowLogTailer. Default value: 5Mi
  • tidb.plugin.enable: Whether to enable the TiDB plugin. Default value: false
  • tidb.plugin.directory: Specifies the directory where the TiDB plugins are located. Default value: /plugins
  • tidb.plugin.list: Specifies the list of plugins loaded by TiDB. A Plugin ID is named in the format [plugin name]-[version], for example: 'conn_limit-1'. Default value: []
  • tidb.preparedPlanCacheEnabled: Whether to enable TiDB's prepared plan cache. If the version of TiDB Operator is later than v1.0.0, configure this parameter via tidb.config:

        [prepared-plan-cache]
        enabled = false

    Default value: false
  • tidb.preparedPlanCacheCapacity: The capacity of TiDB's prepared plan cache. If the version of TiDB Operator is later than v1.0.0, configure this parameter via tidb.config:

        [prepared-plan-cache]
        capacity = 100

    Default value: 100
  • tidb.txnLocalLatchesEnabled: Whether to enable the in-memory lock for transactions. It is recommended to enable the lock when there are many local transaction conflicts. If the version of TiDB Operator is later than v1.0.0, configure this parameter via tidb.config:

        [txn-local-latches]
        enabled = false

    Default value: false
  • tidb.txnLocalLatchesCapacity: The capacity of the transaction memory lock. The number of hash slots is automatically rounded up to a power of 2. Each slot occupies 32 bytes of memory. When the written data covers a wide range (such as when importing data), setting this parameter too small results in lower performance. If the version of TiDB Operator is later than v1.0.0, configure this parameter via tidb.config:

        [txn-local-latches]
        capacity = 10240000

    Default value: 10240000
  • tidb.tokenLimit: The limit on the number of sessions that TiDB can execute concurrently. If the version of TiDB Operator is later than v1.0.0, configure this parameter via tidb.config:

        token-limit = 1000

    Default value: 1000
  • tidb.memQuotaQuery: The memory quota for TiDB queries, which is 32GB by default. If the version of TiDB Operator is later than v1.0.0, configure this parameter via tidb.config:

        mem-quota-query = 34359738368

    Default value: 34359738368
  • tidb.checkMb4ValueInUtf8: Controls whether to check the mb4 characters when the character set is utf8. If the version of TiDB Operator is later than v1.0.0, configure this parameter via tidb.config:

        check-mb4-value-in-utf8 = true

    Default value: true
  • tidb.treatOldVersionUtf8AsUtf8mb4: Used for upgrade compatibility. When it is set to true, the utf8 character set in old tables/columns is treated as utf8mb4. If the version of TiDB Operator is later than v1.0.0, configure this parameter via tidb.config:

        treat-old-version-utf8-as-utf8mb4 = true

    Default value: true
  • tidb.lease: The lease time of the TiDB schema lease. Changing this parameter is highly risky, so it is not recommended unless you know exactly what you are doing. If the version of TiDB Operator is later than v1.0.0, configure this parameter via tidb.config:

        lease = "45s"

    Default value: 45s
  • tidb.maxProcs: The maximum number of usable CPU cores. 0 means using all CPU cores on the machine or in the Pod. If the version of TiDB Operator is later than v1.0.0, configure this parameter via tidb.config:

        [performance]
        max-procs = 0

    Default value: 0
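
As a combined reference, the pd.config, tikv.config, and tidb.config fields above are multi-line TOML strings embedded in the Helm values. The following sketch simply stitches together the sample configurations from the list; treat the values as placeholders to be tuned for your own workload:

    pd:
      config: |
        [log]
        level = "info"
        [replication]
        location-labels = ["region", "zone", "rack", "host"]

    tikv:
      config: |
        log-level = "info"
        [storage.block-cache]
        shared = true
        capacity = "1GB"

    tidb:
      config: |
        [log]
        level = "info"
        [performance]
        max-procs = 0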

Resource configuration

Before deploying a TiDB cluster, you need to configure the resources for each component of the cluster according to your needs. PD, TiKV, and TiDB are the core service components of a TiDB cluster; in a production environment, their resources must be configured according to the component requirements. Detailed reference: Hardware Recommendations.

To ensure the proper scheduling and stable operation of the components of the TiDB cluster in Kubernetes, it is recommended to set Guaranteed-level QoS by making limits equal to requests when configuring resources. Detailed reference: Configure Quality of Service for Pods.
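
For instance, the following sketch uses the pd.resources fields from the list above, with placeholder sizes; it requests exactly what it limits for CPU and memory, which is what gives the Pod Guaranteed-level QoS:

    pd:
      resources:
        requests:
          cpu: "2"
          memory: 4Gi
          storage: 1Gi
        limits:
          cpu: "2"
          memory: 4Gi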

If you are using a NUMA-based CPU, you need to enable the static CPU management policy on the node for better performance. To allow TiDB cluster components to monopolize the corresponding CPU resources, the CPU quota must be an integer greater than or equal to 1, in addition to the Guaranteed-level QoS mentioned above. Detailed reference: Control CPU Management Policies on the Node.
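
With the static CPU management policy enabled on the node, a sketch like the following gives TiKV Pods exclusive CPU cores: the CPU quota is an integer and the limits equal the requests, as required above (the figures are placeholders, not sizing advice):

    tikv:
      resources:
        requests:
          cpu: "8"
          memory: 16Gi
          storage: 100Gi
        limits:
          cpu: "8"
          memory: 16Gi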

Disaster recovery configuration

TiDB is a distributed database, and its disaster recovery must ensure that when any physical topology node fails, not only does the service remain unaffected, but the data also stays complete and available. The two kinds of disaster recovery configuration are described separately below.

Disaster recovery of TiDB service

The disaster recovery of the TiDB service is essentially based on Kubernetes' scheduling capabilities. To optimize scheduling, TiDB Operator provides a custom scheduler that guarantees the disaster recovery of the TiDB service at the host level through a specified scheduling algorithm. Currently, the TiDB cluster uses this scheduler as the default scheduler, which is configured through the schedulerName item in the list above.
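
In the Helm values, this is just the schedulerName item; the following one-line sketch makes the default explicit:

    schedulerName: tidb-scheduler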

Disaster recovery at other levels (such as rack, zone, and region) is guaranteed by the PodAntiAffinity feature of affinity scheduling. PodAntiAffinity prevents different instances of the same component from being deployed on the same physical topology node, which achieves disaster recovery. Detailed user guide for Affinity: Affinity & AntiAffinity.

The following is an example of a typical disaster recovery setup:

  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      # this term works when the nodes have the label named region
      - weight: 10
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app.kubernetes.io/instance: <release name>
              app.kubernetes.io/component: "pd"
          topologyKey: "region"
          namespaces:
          - <helm namespace>
      # this term works when the nodes have the label named zone
      - weight: 20
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app.kubernetes.io/instance: <release name>
              app.kubernetes.io/component: "pd"
          topologyKey: "zone"
          namespaces:
          - <helm namespace>
      # this term works when the nodes have the label named rack
      - weight: 40
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app.kubernetes.io/instance: <release name>
              app.kubernetes.io/component: "pd"
          topologyKey: "rack"
          namespaces:
          - <helm namespace>
      # this term works when the nodes have the label named kubernetes.io/hostname
      - weight: 80
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app.kubernetes.io/instance: <release name>
              app.kubernetes.io/component: "pd"
          topologyKey: "kubernetes.io/hostname"
          namespaces:
          - <helm namespace>

Disaster recovery of data

Before configuring the data disaster recovery, read Information Configuration of the Cluster Topology, which describes how the disaster recovery of the TiDB cluster is implemented.

To add the data disaster recovery feature in Kubernetes:

  1. Set the topology location label collection for PD

    Configure location-labels in pd.config, using the labels that describe the topology of the Nodes in the Kubernetes cluster (see the sketch after this procedure).

  2. Set the topological information of the Node where the TiKV node is located.

    TiDB Operator automatically obtains the topological information of the Node for TiKV and calls the PD interface to set this information as the information of TiKV's store labels. Based on this topological information, the TiDB cluster schedules the replicas of the data.

    If the Nodes of the current Kubernetes cluster do not have labels indicating the topological location, or if the existing topology label names contain /, you can manually add labels to the Nodes by running the following command:

    kubectl label node <nodeName> region=<regionName> zone=<zoneName> rack=<rackName> kubernetes.io/hostname=<hostName>

    In the command above, region, zone, rack, and kubernetes.io/hostname are just examples. The names and number of the labels to be added can be defined arbitrarily, as long as they conform to the Kubernetes label specification and are consistent with the location-labels set in pd.config.
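
Putting the two steps together: if the Nodes carry topology labels such as those added by the command above, a pd.config sketch like the following tells PD which labels describe the topology (the label names here are examples and must match the labels actually present on your Nodes):

    pd:
      config: |
        [replication]
        location-labels = ["region", "zone", "rack", "host"]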