Use PD Recover to Recover the PD Cluster
PD Recover is a disaster recovery tool for PD. Use it to recover a PD cluster that cannot start or provide services normally.
Download PD Recover
Download the official TiDB package:
wget https://download.pingcap.org/tidb-${version}-linux-amd64.tar.gz
In the command above, ${version} is the version of the TiDB cluster, such as v4.0.0-rc.
Unpack the TiDB package for installation:
tar -xzf tidb-${version}-linux-amd64.tar.gz
pd-recover is in the tidb-${version}-linux-amd64/bin directory.
Recover the PD cluster
This section introduces how to recover the PD cluster using PD Recover.
Get Cluster ID
kubectl get tc ${cluster_name} -n ${namespace} -o='go-template={{.status.clusterID}}{{"\n"}}'
Example:
kubectl get tc test -n test -o='go-template={{.status.clusterID}}{{"\n"}}'
6821434242797747735
Get Alloc ID
When you use pd-recover to recover the PD cluster, you need to specify alloc-id. The value of alloc-id must be larger than the largest allocated ID (Alloc ID) of the original cluster.
Access the Prometheus monitoring data of the TiDB cluster by following the steps in Access the monitoring data.
Enter pd_cluster_id in the input box and click the Execute button to make a query. Get the largest value in the query result.
Multiply the largest value in the query result by 100. Use the multiplied value as the alloc-id value specified when using pd-recover.
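The arithmetic in the last step can be sketched in shell. This is a minimal illustration, not part of PD Recover: compute_alloc_id is a hypothetical helper, and the three input values are made-up stand-ins for the values you read from the query result.

```shell
# compute_alloc_id: print 100 times the largest of the given integer values.
# Hypothetical helper for illustration; pass the values from the query result.
compute_alloc_id() {
  max=0
  for v in "$@"; do
    if [ "$v" -gt "$max" ]; then
      max=$v
    fi
  done
  echo $((max * 100))
}

# Made-up sample values standing in for the query result.
alloc_id=$(compute_alloc_id 1000 3000 2000)
echo "alloc-id: ${alloc_id}"
```

With these sample values the largest one is 3000, so the script prints alloc-id: 300000.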
Recover the PD Pod
Delete the Pod of the PD cluster.
Execute the following command to set the value of spec.pd.replicas to 0:
kubectl edit tc ${cluster_name} -n ${namespace}
Because the PD cluster is in an abnormal state, TiDB Operator cannot synchronize the change above to the PD StatefulSet. You need to execute the following command to set the spec.replicas of the PD StatefulSet to 0.
kubectl edit sts ${cluster_name}-pd -n ${namespace}
Execute the following command to confirm that the PD Pod is deleted:
kubectl get pod -n ${namespace}
After confirming that all PD Pods are deleted, execute the following command to delete the PVCs bound to the PD Pods:
kubectl delete pvc -l app.kubernetes.io/component=pd,app.kubernetes.io/instance=${cluster_name} -n ${namespace}
After the PVCs are deleted, scale out the PD cluster to one Pod:
Execute the following command to set the value of spec.pd.replicas to 1:
kubectl edit tc ${cluster_name} -n ${namespace}
Because the PD cluster is in an abnormal state, TiDB Operator cannot synchronize the change above to the PD StatefulSet. You need to execute the following command to set the spec.replicas of the PD StatefulSet to 1.
kubectl edit sts ${cluster_name}-pd -n ${namespace}
Execute the following command to confirm that the PD Pod is started:
kubectl get pod -n ${namespace}
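If you prefer not to change the replica counts in an interactive editor, the two kubectl edit steps above could be scripted. The sketch below assumes that kubectl patch with a merge patch and kubectl scale are acceptable substitutes for the interactive edits. pd_replicas_patch and set_pd_replicas are hypothetical helper names, not part of TiDB Operator.

```shell
# Build the merge-patch body that sets spec.pd.replicas on the TidbCluster.
pd_replicas_patch() {
  printf '{"spec":{"pd":{"replicas":%d}}}\n' "$1"
}

# Apply the replica count to the TidbCluster, then to the PD StatefulSet.
# Arguments: cluster name, namespace, replica count.
# Assumption: patch and scale are used here in place of "kubectl edit".
set_pd_replicas() {
  kubectl patch tc "$1" -n "$2" --type merge -p "$(pd_replicas_patch "$3")" &&
  kubectl scale sts "$1-pd" -n "$2" --replicas="$3"
}

# Example calls (they need access to the Kubernetes cluster):
#   set_pd_replicas "${cluster_name}" "${namespace}" 0
#   set_pd_replicas "${cluster_name}" "${namespace}" 1
```

The block only defines functions, so nothing is changed until you call set_pd_replicas yourself. Deleting the PVCs between the two calls remains a separate, manual step.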
Recover the cluster
Execute the port-forward command to expose the PD service:
kubectl port-forward -n ${namespace} svc/${cluster_name}-pd 2379:2379
Open a new terminal tab or window, enter the directory where pd-recover is located, and execute the pd-recover command to recover the PD cluster:
./pd-recover -endpoints http://127.0.0.1:2379 -cluster-id ${cluster_id} -alloc-id ${alloc_id}
In the command above, ${cluster_id} is the cluster ID obtained in Get Cluster ID. ${alloc_id} is the largest value of pd_cluster_id (obtained in Get Alloc ID) multiplied by 100.
After the pd-recover command is successfully executed, the following result is printed:
recover success! please restart the PD cluster
Go back to the window where the port-forward command is executed, and press Ctrl+C to stop and exit.
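Because a mistyped ID is hard to undo after recovery, it may help to check both values before running pd-recover. The sketch below uses a hypothetical helper, pd_recover_cmd, that only prints the command line shown in this section after confirming that both arguments consist of digits. The alloc ID 300000 is a made-up sample value.

```shell
# pd_recover_cmd: print the pd-recover command line for the given cluster ID
# and alloc ID, or fail if either value is not an unsigned integer.
# Hypothetical helper; it does not run pd-recover itself.
pd_recover_cmd() {
  for v in "$1" "$2"; do
    case "$v" in
      ''|*[!0-9]*)
        echo "not an unsigned integer: '$v'" >&2
        return 1
        ;;
    esac
  done
  echo "./pd-recover -endpoints http://127.0.0.1:2379 -cluster-id $1 -alloc-id $2"
}

# Cluster ID from the example in Get Cluster ID; the alloc ID is a sample value.
pd_recover_cmd 6821434242797747735 300000
```

Printing the command instead of executing it lets you review the final values before you paste the line into the terminal where pd-recover is located.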
Restart the PD Pod
Delete the PD Pod:
kubectl delete pod ${cluster_name}-pd-0 -n ${namespace}
After the Pod is started successfully, execute the port-forward command to expose the PD service:
kubectl port-forward -n ${namespace} svc/${cluster_name}-pd 2379:2379
Open a new terminal tab or window, and execute the following command to confirm that the Cluster ID is the one obtained in Get Cluster ID:
curl 127.0.0.1:2379/pd/api/v1/cluster
Go back to the window where the port-forward command is executed, and press Ctrl+C to stop and exit.
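The comparison in the confirmation step can also be scripted. The sketch below assumes that the response of the curl command is a JSON object with a numeric "id" field. That response shape and the sample text are assumptions for illustration, and extract_cluster_id and same_cluster_id are hypothetical helpers. The ID is handled as text, so a 64-bit value is never converted to a floating-point number.

```shell
# extract_cluster_id: read a JSON document on stdin and print the digits that
# follow the first "id" key. Assumes a numeric "id" field in the response.
extract_cluster_id() {
  sed -n 's/.*"id"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# same_cluster_id: succeed if the ID found on stdin equals the expected ID.
same_cluster_id() {
  [ "$(extract_cluster_id)" = "$1" ]
}

# Assumed sample response; in practice, pipe the curl output instead:
#   curl -s 127.0.0.1:2379/pd/api/v1/cluster | same_cluster_id "${cluster_id}"
sample='{"id": 6821434242797747735, "max_peer_count": 3}'
if printf '%s\n' "$sample" | same_cluster_id 6821434242797747735; then
  echo "cluster ID matches"
else
  echo "cluster ID differs"
fi
```

If the real response has a different layout, adjust the sed pattern or compare the ID in the curl output by eye.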
Increase the capacity of the PD cluster
Execute the following command to set the value of spec.pd.replicas to the desired number of Pods:
kubectl edit tc ${cluster_name} -n ${namespace}
Restart TiDB and TiKV
kubectl delete pod -l app.kubernetes.io/component=tidb,app.kubernetes.io/instance=${cluster_name} -n ${namespace} &&
kubectl delete pod -l app.kubernetes.io/component=tikv,app.kubernetes.io/instance=${cluster_name} -n ${namespace}