Scale TiDB in Kubernetes

This document introduces how to horizontally and vertically scale a TiDB cluster in Kubernetes.

Horizontal scaling

Horizontally scaling TiDB means that you scale TiDB out or in by adding or removing nodes in your pool of resources. When you scale a TiDB cluster, PD, TiKV, and TiDB are scaled out or in sequentially according to the values of their replicas. Scale-out operations add nodes in ascending order of node ID, while scale-in operations remove nodes in descending order of node ID.

Currently, the TiDB cluster supports management by TidbCluster Custom Resource (CR).

Scale PD, TiDB, and TiKV

Use kubectl to set spec.pd.replicas, spec.tidb.replicas, and spec.tikv.replicas in the TidbCluster object of the cluster to the desired values. You can modify the values either in a local file or online.

  • To modify the TidbCluster definition online in the Kubernetes cluster, run the following command:

    kubectl edit tidbcluster ${cluster_name} -n ${namespace}
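For scripted changes, the same fields can also be set without opening an editor. The following is a minimal sketch, assuming that applying a JSON merge patch with kubectl patch suits your workflow. The helper names replicas_patch and scale_component are hypothetical and are not part of TiDB Operator.

```shell
#!/usr/bin/env bash
# Build a JSON merge patch that sets spec.<component>.replicas.
# replicas_patch and scale_component are hypothetical helper names.
replicas_patch() {
  local component="$1" replicas="$2"
  printf '{"spec":{"%s":{"replicas":%d}}}' "$component" "$replicas"
}

# Apply the patch to the TidbCluster object.
# Arguments: cluster name, namespace, component (pd, tidb, or tikv), replicas.
scale_component() {
  local cluster_name="$1" namespace="$2" component="$3" replicas="$4"
  kubectl patch tidbcluster "$cluster_name" -n "$namespace" \
    --type merge -p "$(replicas_patch "$component" "$replicas")"
}

# Example (requires access to the cluster):
#   scale_component "${cluster_name}" "${namespace}" tikv 5
```

Keeping the patch construction separate from the kubectl call makes it possible to print and review the patch before applying it.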

Check whether the TiDB cluster in Kubernetes has been updated to your desired definition by running the following command:

kubectl get tidbcluster ${cluster_name} -n ${namespace} -oyaml

In the TidbCluster definition returned by the command above, if the values of spec.pd.replicas, spec.tidb.replicas, and spec.tikv.replicas match the values you set, check whether the number of Pods in the TiDB cluster has increased or decreased by running the following command:

watch kubectl -n ${namespace} get pod -o wide

For the PD and TiDB components, it might take 10-30 seconds to scale in or out.

For the TiKV component, it might take 3-5 minutes to scale in or out because the process involves data migration.

Scale out TiFlash

If TiFlash is deployed in the cluster, you can scale out TiFlash by modifying spec.tiflash.replicas.

Scale TiCDC

If TiCDC is deployed in the cluster, you can scale TiCDC out or in by modifying spec.ticdc.replicas.

Scale in TiFlash

  1. Expose the PD service by using port-forward:

    kubectl port-forward -n ${namespace} svc/${cluster_name}-pd 2379:2379

  2. Open a new terminal tab or window. Find the maximum number of replicas (N) among all data tables for which TiFlash is enabled by running the following command:

    curl 127.0.0.1:2379/pd/api/v1/config/rules/group/tiflash | grep count

    In the printed result, the largest value of count is the maximum number (N) of replicas of all data tables.

  3. Go back to the terminal window in Step 1, where port-forward is running. Press Ctrl+C to stop port-forward.

  4. If the number of TiFlash Pods that will remain after the scale-in operation is greater than or equal to N, skip to Step 6. Otherwise, take the following steps:

    1. Refer to Access TiDB and connect to the TiDB service.

    2. For every table that has more TiFlash replicas than the number of remaining TiFlash Pods, run the following command:

      ALTER TABLE <db_name>.<table_name> SET TIFLASH REPLICA 0;

  5. Wait for the TiFlash replicas of the related tables to be deleted.

    Connect to the TiDB service, and run the following command:

    SELECT * FROM information_schema.tiflash_replica WHERE TABLE_SCHEMA = '<db_name>' AND TABLE_NAME = '<table_name>';

    If the query returns no replication information for the related tables, the TiFlash replicas have been deleted.

  6. Modify spec.tiflash.replicas to scale in TiFlash.

    Check whether TiFlash in the TiDB cluster in Kubernetes has been updated to your desired definition. Run the following command and check whether the returned value of spec.tiflash.replicas matches the value you set:

    kubectl get tidbcluster ${cluster_name} -n ${namespace} -oyaml
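
The check in Steps 2 and 4 can be scripted. The sketch below assumes that the PD response contains "count": <number> fields, as the grep count command in Step 2 implies. The function names max_tiflash_count and needs_replica_cleanup are hypothetical.

```shell
#!/usr/bin/env bash
# Read the PD placement-rule response on stdin and print the largest
# "count" value (N). Prints 0 if no count field is found.
max_tiflash_count() {
  grep -o '"count": *[0-9][0-9]*' | grep -o '[0-9][0-9]*$' \
    | sort -n | tail -n 1 | grep . || echo 0
}

# Succeed (exit status 0) when TiFlash replicas must be removed from tables
# first, that is, when fewer Pods will remain than the maximum count N.
# Arguments: number of remaining TiFlash Pods, N.
needs_replica_cleanup() {
  local remaining_pods="$1" max_count="$2"
  [ "$remaining_pods" -lt "$max_count" ]
}

# Example (requires the port-forward from Step 1 to be running):
#   N=$(curl -s 127.0.0.1:2379/pd/api/v1/config/rules/group/tiflash | max_tiflash_count)
#   if needs_replica_cleanup 1 "$N"; then echo "remove table replicas first"; fi
```

The script only reports whether cleanup is needed. Which tables to alter still has to be decided as described in Step 4.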

View the horizontal scaling status

To view the scaling status of the cluster, run the following command:

watch kubectl -n ${namespace} get pod -o wide

When the number of Pods for all components reaches the preset value and all components go to the Running state, the horizontal scaling is completed.
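
The completion check can be approximated in a script. This sketch assumes the default column layout of kubectl get pod --no-headers (NAME, READY, STATUS, and so on) and that Pod names follow the ${cluster_name}-<component>-<ordinal> pattern. count_running is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Count the Pods of one component that are in the Running state.
# Reads `kubectl get pod --no-headers` output on stdin.
# Arguments: cluster name, component (for example pd, tidb, or tikv).
count_running() {
  local prefix="$1-$2-"
  awk -v prefix="$prefix" \
    'index($1, prefix) == 1 && $3 == "Running" { n++ } END { print n + 0 }'
}

# Example (requires access to the cluster):
#   kubectl -n "${namespace}" get pod --no-headers | count_running "${cluster_name}" tikv
```

Compare the printed number with the replicas value you set for that component. Note that the Running state alone does not show whether all containers in a Pod are ready, so the READY column is still worth checking.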

Horizontal scaling failure

During the horizontal scaling operation, Pods might go to the Pending state because of insufficient resources. See Troubleshoot the Pod in Pending state.
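
To find the affected Pods quickly, you can filter by phase. The sketch below relies on the status.phase field selector of kubectl get pod. list_pending_pods is a hypothetical wrapper.

```shell
#!/usr/bin/env bash
# Print the names of all Pods in the namespace that are in the Pending phase.
# Argument: namespace.
list_pending_pods() {
  local namespace="$1"
  kubectl get pod -n "$namespace" --field-selector=status.phase=Pending -o name
}

# Example (requires access to the cluster):
#   list_pending_pods "${namespace}"
#   kubectl describe pod <pod_name> -n "${namespace}"   # shows scheduling events
```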

Vertical scaling

Vertically scaling TiDB means that you scale TiDB up or down by increasing or decreasing the limit of resources on the node. Vertical scaling is essentially a rolling update of the nodes.

Currently, the TiDB cluster supports management by TidbCluster Custom Resource (CR).

Vertical scaling operations

Modify spec.pd.resources, spec.tikv.resources, and spec.tidb.resources in the TidbCluster object that corresponds to the cluster to the desired values using kubectl.

If TiFlash is deployed in the cluster, you can scale TiFlash up or down by modifying spec.tiflash.resources.

If TiCDC is deployed in the cluster, you can scale TiCDC up or down by modifying spec.ticdc.resources.
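
As an illustration, a resource change can be expressed as a JSON merge patch. This is a sketch under a stated assumption: that the resource settings appear as requests and limits fields directly under the component in your TidbCluster definition. Verify the field layout with kubectl get tidbcluster ${cluster_name} -n ${namespace} -oyaml before using it. resources_patch is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Build a JSON merge patch that sets CPU and memory requests and limits.
# ASSUMPTION: requests and limits sit directly under spec.<component>;
# check this against your own TidbCluster definition before use.
# Arguments: component, CPU quantity, memory quantity.
resources_patch() {
  local component="$1" cpu="$2" memory="$3"
  printf '{"spec":{"%s":{"requests":{"cpu":"%s","memory":"%s"},"limits":{"cpu":"%s","memory":"%s"}}}}' \
    "$component" "$cpu" "$memory" "$cpu" "$memory"
}

# Example (requires access to the cluster):
#   kubectl patch tidbcluster "${cluster_name}" -n "${namespace}" \
#     --type merge -p "$(resources_patch tikv 4 8Gi)"
```

The sketch sets requests equal to limits for simplicity. Use different values if your cluster relies on a gap between the two.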

View the vertical scaling progress

To view the vertical scaling progress of the cluster, run the following command:

watch kubectl -n ${namespace} get pod -o wide

When all Pods are rebuilt and in the Running state, the vertical scaling is completed.

Vertical scaling failure

During the vertical scaling operation, Pods might go to the Pending state because of insufficient resources. See Troubleshoot the Pod in Pending state for details.