Upgrade TiDB Using TiUP

This document applies to the following upgrade paths:

  • Upgrade from TiDB 4.0 versions to TiDB 8.0.
  • Upgrade from TiDB 5.0-5.4 versions to TiDB 8.0.
  • Upgrade from TiDB 6.0-6.6 versions to TiDB 8.0.
  • Upgrade from TiDB 7.0-7.6 versions to TiDB 8.0.

Upgrade caveat

  • TiDB currently does not support version downgrade or rolling back to an earlier version after the upgrade.
  • For a v4.0 cluster managed using TiDB Ansible, you need to import the cluster to TiUP (tiup cluster) so that TiUP manages it, following Upgrade TiDB Using TiUP (v4.0). Then you can upgrade the cluster to v8.0.0 according to this document.
  • To upgrade a cluster earlier than v3.0 to v8.0.0:
    1. Upgrade the cluster to v3.0 using TiDB Ansible.
    2. Use TiUP (tiup cluster) to import the TiDB Ansible configuration.
    3. Upgrade the cluster from v3.0 to v4.0 according to Upgrade TiDB Using TiUP (v4.0).
    4. Upgrade the cluster to v8.0.0 according to this document.
  • Upgrading the versions of TiDB Binlog, TiCDC, TiFlash, and other components is supported.
  • When upgrading TiFlash from versions earlier than v6.3.0 to v6.3.0 and later versions, note that the CPU must support the AVX2 instruction set under the Linux AMD64 architecture and the ARMv8 instruction set architecture under the Linux ARM64 architecture. For details, see the description in v6.3.0 Release Notes.
  • For detailed compatibility changes of different versions, see the Release Notes of each version. Modify your cluster configuration according to the "Compatibility Changes" section of the corresponding release notes.
  • When upgrading clusters from versions earlier than v5.3 to v5.3 or later versions, note that the time format of the alerts generated by the default deployed Prometheus changes. This format change is introduced starting from Prometheus v2.27.1. For more information, see Prometheus commit.

Preparations

This section describes the preparations required before upgrading your TiDB cluster, including upgrading TiUP and the TiUP Cluster component.

Step 1: Review compatibility changes

Review the compatibility changes in TiDB v8.0.0 release notes. If any changes affect your upgrade, take actions accordingly.

Step 2: Upgrade TiUP or TiUP offline mirror

Before upgrading your TiDB cluster, you first need to upgrade TiUP or the TiUP offline mirror.

Upgrade TiUP and TiUP Cluster

  1. Upgrade the TiUP version. It is recommended that the TiUP version is 1.11.3 or later.

    tiup update --self
    tiup --version
  2. Upgrade the TiUP Cluster version. It is recommended that the TiUP Cluster version is 1.11.3 or later.

    tiup update cluster
    tiup cluster --version
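The recommended minimum version above can be checked with a small helper. This is a minimal sketch, not part of TiUP: the function name version_ge and the variable MIN_TIUP_VERSION are made up for illustration, and it assumes a `sort` that supports the -V (version sort) option, as GNU coreutils does. Read the version number from the output of the --version commands yourself and pass it in.

```shell
#!/bin/sh
# Minimum version recommended in the steps above.
MIN_TIUP_VERSION="1.11.3"

# version_ge A B: succeed when version A is greater than or equal to version B.
# Hypothetical helper; relies on `sort -V` (version sort) being available.
version_ge() {
    a="${1#v}"   # tolerate an optional leading "v"
    b="${2#v}"
    # After a version sort, the first line is the lower of the two versions.
    lowest=$(printf '%s\n%s\n' "$a" "$b" | sort -V | head -n 1)
    [ "$lowest" = "$b" ]
}
```

For example, `version_ge 1.14.1 "$MIN_TIUP_VERSION" && echo "TiUP is recent enough"` prints the message, while an installed version such as 1.9.0 would fail the check.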

Upgrade TiUP offline mirror

Refer to Deploy a TiDB Cluster Using TiUP - Deploy TiUP offline to download the TiUP mirror of the new version and upload it to the control machine. After executing local_install.sh, TiUP will complete the overwrite upgrade.

tar xzvf tidb-community-server-${version}-linux-amd64.tar.gz
sh tidb-community-server-${version}-linux-amd64/local_install.sh
source /home/tidb/.bash_profile

After the overwrite upgrade, run the following command to merge the server and toolkit offline mirrors to the server directory:

tar xf tidb-community-toolkit-${version}-linux-amd64.tar.gz
ls -ld tidb-community-server-${version}-linux-amd64 tidb-community-toolkit-${version}-linux-amd64
cd tidb-community-server-${version}-linux-amd64/
cp -rp keys ~/.tiup/
tiup mirror merge ../tidb-community-toolkit-${version}-linux-amd64

After merging the mirrors, run the following command to upgrade the TiUP Cluster component:

tiup update cluster

The offline mirror is now upgraded. If an error occurs when you run TiUP after the overwrite upgrade, the manifest might not have been updated. In that case, try running rm -rf ~/.tiup/manifests/* before running TiUP again.

Step 3: Edit TiUP topology configuration file

  1. Enter the vi editing mode to edit the topology file:

    tiup cluster edit-config <cluster-name>
  2. Refer to the format of the topology configuration template and fill in the parameters you want to modify in the server_configs section of the topology file.

  3. After the modification, type :wq to save the change and exit the editing mode. Enter Y to confirm the change.

Step 4: Check the DDL and backup status of the cluster

To avoid undefined behaviors or other unexpected problems during the upgrade, it is recommended to check the following items before the upgrade.

  • Cluster DDLs:

    • If you use smooth upgrade, you do not need to check the DDL operations of your TiDB cluster. You do not need to wait for the completion of DDL jobs or cancel ongoing DDL jobs.
    • If you do not use smooth upgrade, it is recommended to use the ADMIN SHOW DDL statement to check whether ongoing DDL jobs exist. If an ongoing DDL job exists, wait for the completion of its execution or cancel it using the ADMIN CANCEL DDL statement before performing an upgrade.
  • Cluster backup: It is recommended to execute the SHOW [BACKUPS|RESTORES] statement to check whether there is an ongoing backup or restore task in the cluster. If yes, wait for its completion before performing an upgrade.
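The statements named in the checks above can be collected and sent to TiDB in one go. The following is a minimal sketch: the function name pre_upgrade_sql is made up for illustration, and the host, port, and user in the usage comment are placeholders to replace with your own connection details. It only prints the statements; interpreting the results is still up to you.

```shell
#!/bin/sh
# pre_upgrade_sql: print the pre-upgrade check statements mentioned above,
# one per line, so they can be piped into a MySQL-compatible client.
# Hypothetical helper, not part of TiUP or TiDB.
pre_upgrade_sql() {
    cat <<'EOF'
ADMIN SHOW DDL;
SHOW BACKUPS;
SHOW RESTORES;
EOF
}

# Example usage (placeholder host, port, and user):
#   pre_upgrade_sql | mysql -h 127.0.0.1 -P 4000 -u root -p
```

If you use smooth upgrade, the ADMIN SHOW DDL check can be skipped, as described above.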

Step 5: Check the health status of the current cluster

To avoid undefined behaviors or other issues during the upgrade, it is recommended to check the health status of the Regions in the current cluster before the upgrade. To do that, use the check sub-command:

tiup cluster check <cluster-name> --cluster

After the command is executed, the "Region status" check result will be output.

  • If the result is "All Regions are healthy", all Regions in the current cluster are healthy and you can continue the upgrade.
  • If the result is "Regions are not fully healthy: m miss-peer, n pending-peer" with the "Please fix unhealthy regions before other operations." prompt, some Regions in the current cluster are abnormal. You need to troubleshoot the anomalies until the check result becomes "All Regions are healthy". Then you can continue the upgrade.
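In a script, the two outcomes above can be told apart by looking for the quoted messages in the check output. This is a minimal sketch under the assumption that the output contains those messages verbatim; the function name region_status and its three return words are made up for illustration.

```shell
#!/bin/sh
# region_status: read the output of the check sub-command on stdin and print
# "healthy", "unhealthy", or "unknown" based on the messages quoted above.
# Hypothetical helper, not part of TiUP.
region_status() {
    output=$(cat)
    case "$output" in
        *"All Regions are healthy"*)       echo "healthy" ;;
        *"Regions are not fully healthy"*) echo "unhealthy" ;;
        *)                                 echo "unknown" ;;
    esac
}

# Example usage (stderr is merged in case the result is printed there):
#   tiup cluster check <cluster-name> --cluster 2>&1 | region_status
```

Treating anything other than "healthy" as a reason to stop keeps the script on the safe side when the output is not what it expects.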

Upgrade the TiDB cluster

This section describes how to upgrade the TiDB cluster and verify the version after the upgrade.

Upgrade the TiDB cluster to a specified version

You can upgrade your cluster in one of the two ways: online upgrade and offline upgrade.

By default, TiUP Cluster upgrades the TiDB cluster using the online method, which means that the TiDB cluster can still provide services during the upgrade process. With the online method, the leaders are migrated one by one on each node before the upgrade and restart. Therefore, for a large-scale cluster, it takes a long time to complete the entire upgrade operation.

If your application has a maintenance window during which the database can be stopped, you can use the offline upgrade method to perform the upgrade more quickly.

Online upgrade

tiup cluster upgrade <cluster-name> <version>

For example, if you want to upgrade the cluster to v8.0.0:

tiup cluster upgrade <cluster-name> v8.0.0

Specify the component version during upgrade

Starting from tiup-cluster v1.14.0, you can specify certain components to a specific version during cluster upgrade. These components will remain at their fixed version in the subsequent upgrade unless you specify a different version.

tiup cluster upgrade -h | grep "version"
      --alertmanager-version string        Fix the version of alertmanager and no longer follows the cluster version.
      --blackbox-exporter-version string   Fix the version of blackbox-exporter and no longer follows the cluster version.
      --cdc-version string                 Fix the version of cdc and no longer follows the cluster version.
      --ignore-version-check               Ignore checking if target version is bigger than current version.
      --node-exporter-version string       Fix the version of node-exporter and no longer follows the cluster version.
      --pd-version string                  Fix the version of pd and no longer follows the cluster version.
      --tidb-dashboard-version string      Fix the version of tidb-dashboard and no longer follows the cluster version.
      --tiflash-version string             Fix the version of tiflash and no longer follows the cluster version.
      --tikv-cdc-version string            Fix the version of tikv-cdc and no longer follows the cluster version.
      --tikv-version string                Fix the version of tikv and no longer follows the cluster version.
      --tiproxy-version string             Fix the version of tiproxy and no longer follows the cluster version.

Offline upgrade

  1. Before the offline upgrade, you first need to stop the entire cluster.

    tiup cluster stop <cluster-name>
  2. Use the upgrade command with the --offline option to perform the offline upgrade. Fill in the name of your cluster for <cluster-name> and the version to upgrade to for <version>, such as v8.0.0.

    tiup cluster upgrade <cluster-name> <version> --offline
  3. After the upgrade, the cluster will not be automatically restarted. You need to use the start command to restart it.

    tiup cluster start <cluster-name>
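The order of the three offline steps matters, so it can help to print the plan and review it before running anything. The following is a minimal sketch: the function name offline_upgrade_plan and the cluster name tidb-test in the usage comment are made up for illustration, and the function only prints the commands from the steps above without executing them.

```shell
#!/bin/sh
# offline_upgrade_plan CLUSTER VERSION: print the stop, upgrade, and start
# commands from the steps above, in order. This is a dry run: nothing is
# executed. Hypothetical helper, not part of TiUP.
offline_upgrade_plan() {
    cluster="$1"
    version="$2"
    if [ -z "$cluster" ] || [ -z "$version" ]; then
        echo "usage: offline_upgrade_plan <cluster-name> <version>" >&2
        return 1
    fi
    printf 'tiup cluster stop %s\n' "$cluster"
    printf 'tiup cluster upgrade %s %s --offline\n' "$cluster" "$version"
    printf 'tiup cluster start %s\n' "$cluster"
}

# Example usage (placeholder cluster name):
#   offline_upgrade_plan tidb-test v8.0.0
```

Running the printed commands one at a time, rather than piping them to a shell, leaves room to stop if any step fails.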

Verify the cluster version

Execute the display command and check the Cluster version field to view the latest cluster version:

tiup cluster display <cluster-name>
Cluster type: tidb
Cluster name: <cluster-name>
Cluster version: v8.0.0
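To compare the reported version against the target in a script, the Cluster version field can be extracted from the display output. This is a minimal sketch that assumes the field appears on its own line starting with "Cluster version:", as in the output above; the function name cluster_version is made up for illustration.

```shell
#!/bin/sh
# cluster_version: read the output of the display command on stdin and print
# the value of the "Cluster version:" field.
# Hypothetical helper, not part of TiUP.
cluster_version() {
    awk -F': *' '/^Cluster version:/ {
        sub(/[ \t\r]+$/, "", $2)   # drop trailing whitespace, if any
        print $2
        exit
    }'
}

# Example usage:
#   [ "$(tiup cluster display <cluster-name> | cluster_version)" = "v8.0.0" ] \
#       && echo "upgrade verified"
```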

FAQ

This section describes common problems encountered when upgrading the TiDB cluster using TiUP.

If an error occurs and the upgrade is interrupted, how to resume the upgrade after fixing this error?

Re-execute the tiup cluster upgrade command to resume the upgrade. The upgrade operation restarts the nodes that have been previously upgraded. If you do not want the upgraded nodes to be restarted, use the replay sub-command to retry the operation:

  1. Execute tiup cluster audit to see the operation records:

    tiup cluster audit

    Find the failed upgrade operation record and keep the ID of this operation record. The ID is the <audit-id> value in the next step.

  2. Execute tiup cluster replay <audit-id> to retry the corresponding operation:

    tiup cluster replay <audit-id>

How to fix the issue that the upgrade gets stuck when upgrading to v6.2.0 or later versions?

Starting from v6.2.0, TiDB enables the concurrent DDL framework by default to execute concurrent DDLs. This framework changes the DDL job storage from a KV queue to a table queue. This change might cause the upgrade to get stuck in some scenarios. The following are some scenarios that might trigger this issue and the corresponding solutions:

  • Upgrade gets stuck due to plugin loading

    During the upgrade, loading certain plugins that require executing DDL statements might cause the upgrade to get stuck.

    Solution: avoid loading plugins during the upgrade. Instead, load plugins only after the upgrade is completed.

  • Upgrade gets stuck due to using the kill -9 command for offline upgrade

    • Precautions: avoid using the kill -9 command to perform the offline upgrade. If it is necessary, restart the new version TiDB node after 2 minutes.
    • If the upgrade is already stuck, restart the affected TiDB node. If the issue has just occurred, it is recommended to restart the node after 2 minutes.
  • Upgrade gets stuck due to DDL Owner change

    In multi-instance scenarios, network or hardware failures might cause DDL Owner change. If there are unfinished DDL statements in the upgrade phase, the upgrade might get stuck.

    Solution:

    1. Terminate the stuck TiDB node (avoid using kill -9).
    2. Restart the new version TiDB node.

Evicting leaders takes too long during the upgrade. How to skip this step for a quick upgrade?

You can specify --force. The upgrade then skips transferring the PD leader and evicting TiKV leaders, and restarts the cluster directly to update the version. This significantly impacts a cluster that is serving online traffic. In the following command, <version> is the version to upgrade to, such as v8.0.0.

tiup cluster upgrade <cluster-name> <version> --force

How to update the version of tools such as pd-ctl after upgrading the TiDB cluster?

You can upgrade the tool version by using TiUP to install the ctl component of the corresponding version:

tiup install ctl:v8.0.0