Restore Data into TiDB in Kubernetes
This document describes how to restore data into a TiDB cluster in Kubernetes using TiDB Lightning.
TiDB Lightning contains two components: tidb-lightning and tikv-importer. In Kubernetes, tikv-importer is part of the Helm chart of the TiDB cluster and is deployed as a StatefulSet with replicas=1, while tidb-lightning is in a separate Helm chart and is deployed as a Job.
Therefore, both tikv-importer and tidb-lightning need to be deployed to restore data with TiDB Lightning.
Deploy tikv-importer
The tikv-importer can be enabled for an existing TiDB cluster or for a newly created one.
Create a new TiDB cluster with tikv-importer enabled
1. Set importer.create to true in the values.yaml of the tidb-cluster chart (see the sketch after these steps).

2. Deploy the cluster:

    helm install pingcap/tidb-cluster --name=<tidb-cluster-release-name> --namespace=<namespace> -f values.yaml --version=<chart-version>
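The following is a minimal sketch of the relevant section of the tidb-cluster chart's values.yaml. Only importer.create is taken from this document; keep the chart's other importer defaults unless you need to tune them.

    # Sketch: enable tikv-importer in the tidb-cluster chart's values.yaml.
    # Other importer fields (image, storage size, and so on) keep the chart's defaults.
    importer:
      create: true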
Configure an existing TiDB cluster to enable tikv-importer
1. Set importer.create to true in the values.yaml file of the TiDB cluster.

2. Upgrade the existing TiDB cluster:

    helm upgrade <tidb-cluster-release-name> pingcap/tidb-cluster -f values.yaml --version=<chart-version>
Deploy tidb-lightning
Configure TiDB Lightning
Use the following command to get the default configuration of TiDB Lightning.
helm inspect values pingcap/tidb-lightning --version=<chart-version> > tidb-lightning-values.yaml

The TiDB Lightning Helm chart supports both local and remote data sources.
- Local

    The local mode requires the Mydumper backup data to be on one of the Kubernetes nodes. This mode can be enabled by setting dataSource.local.nodeName to the name of that node and dataSource.local.hostPath to the path of the Mydumper backup data directory, which contains a file named metadata. See the sketch after this list.

- Remote

    Unlike the local mode, the remote mode uses rclone to download the Mydumper backup tarball file from network storage to a PV. Any cloud storage supported by rclone should work, but currently only the following have been tested: Google Cloud Storage (GCS), AWS S3, and Ceph Object Storage.

    1. Make sure that dataSource.local.nodeName and dataSource.local.hostPath are commented out.

    2. Create a Secret containing the rclone configuration. A sample configuration is listed below. Only one cloud storage configuration is required. For other cloud storages, refer to the rclone documentation.

        apiVersion: v1
        kind: Secret
        metadata:
          name: cloud-storage-secret
        type: Opaque
        stringData:
          rclone.conf: |
            [s3]
            type = s3
            provider = AWS
            env_auth = false
            access_key_id = <my-access-key>
            secret_access_key = <my-secret-key>
            region = us-east-1

            [ceph]
            type = s3
            provider = Ceph
            env_auth = false
            access_key_id = <my-access-key>
            secret_access_key = <my-secret-key>
            endpoint = <ceph-object-store-endpoint>
            region = :default-placement

            [gcs]
            type = google cloud storage
            # The service account must include the Storage Object Viewer role.
            # The content can be retrieved by `cat <service-account-file.json> | jq -c .`
            service_account_credentials = <service-account-json-file-content>

        Fill in the placeholders with your configuration, save the file as secret.yaml, and then create the secret via kubectl apply -f secret.yaml -n <namespace>.

    3. Configure dataSource.remote.storageClassName to an existing storage class in the Kubernetes cluster.
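As referenced above, the following is a minimal sketch of how the data source settings might look in tidb-lightning-values.yaml. All values are placeholders, and only the keys named in this section are shown; everything else should stay as generated by helm inspect values.

    # Local mode (sketch): set both values to match your environment.
    dataSource:
      local:
        nodeName: <node-name>                     # node that holds the Mydumper backup
        hostPath: <mydumper-backup-directory>     # directory containing the metadata file

    # Remote mode (sketch): comment out dataSource.local above and set instead:
    # dataSource:
    #   remote:
    #     storageClassName: <storage-class-name>  # existing storage class for the PV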
Deploy TiDB Lightning
helm install pingcap/tidb-lightning --name=<tidb-lightning-release-name> --namespace=<namespace> --set failFast=true -f tidb-lightning-values.yaml --version=<chart-version>
When TiDB Lightning fails to restore data, it cannot simply be restarted; manual intervention is required. Therefore, the restart policy of the tidb-lightning Job is set to Never.
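For illustration only, the relevant part of such a Job manifest looks roughly like the following. The actual manifest is rendered by the tidb-lightning chart, so the container name and image shown here are placeholders.

    apiVersion: batch/v1
    kind: Job
    metadata:
      name: <tidb-lightning-release-name>-tidb-lightning
    spec:
      template:
        spec:
          # A failed Pod is not restarted automatically; manual intervention is expected.
          restartPolicy: Never
          containers:
          - name: tidb-lightning            # placeholder container name
            image: <tidb-lightning-image>   # placeholder image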
If TiDB Lightning fails to restore data, follow the steps below to perform manual intervention:
1. Delete the lightning job by running kubectl delete job -n <namespace> <tidb-lightning-release-name>-tidb-lightning.

2. Create the lightning job again with failFast disabled:

    helm template pingcap/tidb-lightning --name <tidb-lightning-release-name> --set failFast=false -f tidb-lightning-values.yaml | kubectl apply -n <namespace> -f -

3. When the lightning pod is running again, use kubectl exec -it -n <namespace> <tidb-lightning-pod-name> sh to exec into the lightning container.

4. Get the startup script by running cat /proc/1/cmdline.

5. Diagnose the lightning issue following the troubleshooting guide.
Destroy TiDB Lightning
Currently, TiDB Lightning can only restore data offline. When the restoration finishes and the TiDB cluster needs to provide service for applications, TiDB Lightning should be deleted to save cost.
To delete tikv-importer:
- In values.yaml of the TiDB cluster chart, set importer.create to false.
- Run helm upgrade <tidb-cluster-release-name> pingcap/tidb-cluster -f values.yaml.
To delete tidb-lightning, run helm delete <tidb-lightning-release-name> --purge.