TiDB Backup and Restore Use Cases

TiDB Snapshot Backup and Restore Guide and TiDB Log Backup and PITR Guide introduce the backup and restore solutions provided by TiDB, namely, snapshot (full) backup and restore, and log backup and point-in-time recovery (PITR). This document helps you quickly get started with these solutions in a specific use case.

Assume that you have deployed a TiDB production cluster on AWS and the business team has the following requirements:

  • Back up the data changes in a timely manner. When the database encounters a disaster, you can quickly recover the application with minimal data loss (only a few minutes of data loss is tolerable).
  • Support business audits, which take place every month at no fixed time. When an audit request is received, you must provide a database that can be queried for the data as of a requested time point within the past month.

With PITR, you can satisfy the preceding requirements.

Deploy the TiDB cluster and BR

To use PITR, you need to deploy a TiDB cluster of v6.2.0 or later and update BR to the same version as the TiDB cluster. This document uses v7.0.0 as an example.

The following table shows the recommended hardware resources for using PITR in a TiDB cluster.

Component | CPU     | Memory | Disk | AWS instance | Number of instances
TiDB      | 8 core+ | 16 GB+ | SAS  | c5.2xlarge   | 2
PD        | 8 core+ | 16 GB+ | SSD  | c5.2xlarge   | 3
TiKV      | 8 core+ | 32 GB+ | SSD  | m5.2xlarge   | 3
BR        | 8 core+ | 16 GB+ | SAS  | c5.2xlarge   | 1
Monitor   | 8 core+ | 16 GB+ | SAS  | c5.2xlarge   | 1

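Deploy or upgrade a TiDB cluster using TiUP:

  • To deploy a new TiDB cluster, refer to Deploy a TiDB cluster.
  • To upgrade the TiDB cluster, refer to Upgrade a TiDB cluster.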

Install or upgrade BR using TiUP:

  • Install:

    tiup install br:v7.0.0
  • Upgrade:

    tiup update br:v7.0.0

Configure backup storage (Amazon S3)

Before you start a backup task, prepare the backup storage, including the following aspects:

  1. Prepare the S3 bucket and directory that stores the backup data.
  2. Configure the permissions to access the S3 bucket.
  3. Plan the subdirectories that store each type of backup data.

The detailed steps are as follows:

  1. Create a directory in S3 to store the backup data. The directory in this example is s3://tidb-pitr-bucket/backup-data.

    1. Create a bucket. You can use an existing S3 bucket to store the backup data. If there is none, create one by referring to AWS documentation: Creating a bucket. In this example, the bucket name is tidb-pitr-bucket.
    2. Create a directory for your backup data. In the bucket (tidb-pitr-bucket), create a directory named backup-data. For detailed steps, refer to AWS documentation: Organizing objects in the Amazon S3 console using folders.
  2. Configure permissions for BR and TiKV to access the S3 directory. It is recommended to grant permissions using the IAM method, which is the most secure way to access the S3 bucket. For detailed steps, refer to AWS documentation: Controlling access to a bucket with user policies. The required permissions are as follows:

    • TiKV and BR in the backup cluster need the s3:ListBucket, s3:PutObject, and s3:AbortMultipartUpload permissions on the s3://tidb-pitr-bucket/backup-data directory.
    • TiKV and BR in the restore cluster need the s3:ListBucket and s3:GetObject permissions on the s3://tidb-pitr-bucket/backup-data directory.
  3. Plan the directory structure that stores the backup data, including the snapshot (full) backup and the log backup.

    • All snapshot backup data are stored in the s3://tidb-pitr-bucket/backup-data/snapshot-${date} directory. ${date} is the start time of the snapshot backup. For example, a snapshot backup starting at 2022/05/12 00:01:30 is stored in s3://tidb-pitr-bucket/backup-data/snapshot-20220512000130.
    • Log backup data are stored in the s3://tidb-pitr-bucket/backup-data/log-backup/ directory.
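
As an illustration of the backup-cluster permissions listed in step 2, the following sketch writes an IAM policy document to a local file. The file name br-backup-policy.json is hypothetical. The policy layout is an assumption based on standard IAM conventions: s3:ListBucket is granted on the bucket itself, while the object-level actions are granted on objects under the backup-data prefix. Adjust it to your own security requirements.

```shell
#!/bin/sh
# Sketch only: write an IAM policy for TiKV and BR in the backup cluster.
# The output file name is hypothetical; the bucket and prefix come from this document.
cat > br-backup-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": "arn:aws:s3:::tidb-pitr-bucket"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:PutObject", "s3:AbortMultipartUpload"],
      "Resource": "arn:aws:s3:::tidb-pitr-bucket/backup-data/*"
    }
  ]
}
EOF

# Hypothetical follow-up with the AWS CLI (role and policy names are placeholders):
#   aws iam put-role-policy --role-name <your-role> --policy-name <your-policy-name> \
#     --policy-document file://br-backup-policy.json
```

For the restore cluster, the second statement would list s3:GetObject instead of the two write actions.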

Determine the backup policy

To meet the requirements of minimum data loss, quick recovery, and business audits within a month, you can set the backup policy as follows:

  • Run the log backup to continuously back up the data changes in the database.
  • Run a snapshot backup at 00:00 every two days.
  • Retain the snapshot backup data and log backup data for 30 days, and clean up backup data older than 30 days.

Run log backup

After the log backup task is started, the log backup process runs in the TiKV cluster and continuously sends the data changes in the database to the S3 storage. To start a log backup task, run the following command:

tiup br log start --task-name=pitr --pd="${PD_IP}:2379" \
    --storage='s3://tidb-pitr-bucket/backup-data/log-backup'

When the log backup task is running, you can query the backup status:

tiup br log status --task-name=pitr --pd="${PD_IP}:2379"

● Total 1 Tasks.
> #1 <
    name: pitr
    status: ● NORMAL
    start: 2022-05-13 11:09:40.7 +0800
    end: 2035-01-01 00:00:00 +0800
    storage: s3://tidb-pitr-bucket/backup-data/log-backup
    speed(est.): 0.00 ops/s
checkpoint[global]: 2022-05-13 11:31:47.2 +0800; gap=4m53s
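
Because only a few minutes of data loss is tolerable, it can help to watch the gap value in the status output. The following is a minimal sketch that extracts the gap and checks it against a threshold. It assumes the output keeps the gap=4m53s form shown above. The function names and the 300-second threshold are hypothetical.

```shell
#!/bin/sh
# Sketch only: check the checkpoint gap reported by "tiup br log status".

# Print the value after "gap=" (for example "4m53s") from the status text in $1.
extract_gap() {
  printf '%s\n' "$1" | sed -n 's/.*gap=\([0-9a-z.]*\).*/\1/p'
}

# Convert a duration such as "1h2m3s" to whole seconds.
gap_to_seconds() {
  printf '%s\n' "$1" | awk '{
    rest = $0; total = 0
    # Consume one "<number><unit>" pair at a time, e.g. "4m" and then "53s".
    while (match(rest, /^[0-9.]+(h|ms|m|s)/)) {
      token = substr(rest, 1, RLENGTH)
      rest = substr(rest, RLENGTH + 1)
      value = token; sub(/[a-z]+$/, "", value)
      unit = token; sub(/^[0-9.]+/, "", unit)
      if (unit == "h") total += value * 3600
      if (unit == "m") total += value * 60
      if (unit == "s") total += value
      # "ms" is less than a second and is ignored here.
    }
    printf "%d\n", total
  }'
}

# Succeed only when a gap is present and is at most $2 seconds.
gap_within() {
  gap=$(extract_gap "$1")
  [ -n "$gap" ] && [ "$(gap_to_seconds "$gap")" -le "$2" ]
}

# Hypothetical usage:
#   status=$(tiup br log status --task-name=pitr --pd="${PD_IP}:2379")
#   gap_within "$status" 300 || echo "log backup checkpoint is lagging"
```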

Run snapshot backup

You can run snapshot backup tasks on a regular basis using an automatic tool such as crontab. For example, run a snapshot backup at 00:00 every two days.

The following are two snapshot backup examples:

  • Run a snapshot backup at 2022/05/14 00:00:00

    tiup br backup full --pd="${PD_IP}:2379" \
        --storage='s3://tidb-pitr-bucket/backup-data/snapshot-20220514000000' \
        --backupts='2022/05/14 00:00:00'
  • Run a snapshot backup at 2022/05/16 00:00:00

    tiup br backup full --pd="${PD_IP}:2379" \
        --storage='s3://tidb-pitr-bucket/backup-data/snapshot-20220516000000' \
        --backupts='2022/05/16 00:00:00'
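
To schedule such backups, a small wrapper script can derive the storage directory from the backup timestamp so that the two always match. The following is a sketch under assumptions: the function names and the script path in the crontab comment are hypothetical, and PD_IP is expected to be set in the environment as in the commands above.

```shell
#!/bin/sh
# Sketch only: snapshot backup wrapper that cron could call.

BACKUP_ROOT='s3://tidb-pitr-bucket/backup-data'

# Turn "2022/05/14 00:00:00" into ".../snapshot-20220514000000" by keeping only the digits.
snapshot_dir() {
  stamp=$(printf '%s' "$1" | tr -cd '0-9')
  printf '%s/snapshot-%s\n' "$BACKUP_ROOT" "$stamp"
}

# Run one snapshot backup whose directory name matches its backup timestamp.
run_snapshot_backup() {
  backup_ts="$1"
  tiup br backup full --pd="${PD_IP}:2379" \
    --storage="$(snapshot_dir "$backup_ts")" \
    --backupts="$backup_ts"
}

# Hypothetical crontab entry (the script path is a placeholder):
#   0 0 */2 * * /path/to/snapshot-backup.sh
# Inside that script, after the definitions above:
#   run_snapshot_backup "$(date '+%Y/%m/%d 00:00:00')"
```

Note that a day-of-month step such as */2 restarts at the beginning of each month, so the interval around a month boundary can be shorter than two days.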

Run PITR

Assume that you need to query the data at 2022/05/15 18:00:00. You can use PITR to restore a cluster to that time point by restoring a snapshot backup taken at 2022/05/14 and the log backup data between the snapshot and 2022/05/15 18:00:00.

The command is as follows:

tiup br restore point --pd="${PD_IP}:2379" \
    --storage='s3://tidb-pitr-bucket/backup-data/log-backup' \
    --full-backup-storage='s3://tidb-pitr-bucket/backup-data/snapshot-20220514000000' \
    --restored-ts '2022-05-15 18:00:00+0800'

Full Restore <--------------------------------------------------------------------------------------------------------------------------------------------------------> 100.00%
[2022/05/29 18:15:39.132 +08:00] [INFO] [collector.go:69] ["Full Restore success summary"] [total-ranges=12] [ranges-succeed=xxx] [ranges-failed=0] [split-region=xxx.xxxµs] [restore-ranges=xxx] [total-take=xxx.xxxs] [restore-data-size(after-compressed)=xxx.xxx] [Size=xxxx] [BackupTS={TS}] [total-kv=xxx] [total-kv-size=xxx] [average-speed=xxx]
Restore Meta Files <--------------------------------------------------------------------------------------------------------------------------------------------------> 100.00%
Restore KV Files <----------------------------------------------------------------------------------------------------------------------------------------------------> 100.00%
[2022/05/29 18:15:39.325 +08:00] [INFO] [collector.go:69] ["restore log success summary"] [total-take=xxx.xx] [restore-from={TS}] [restore-to={TS}] [total-kv-count=xxx] [total-size=xxx]
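
The snapshot passed to --full-backup-storage is the most recent one taken at or before the target time. Given the directory naming scheme in this document, that choice can be sketched as follows. The function name pick_snapshot is hypothetical, and the target time is written as 14 digits so that it compares directly with the directory suffix.

```shell
#!/bin/sh
# Sketch only: choose the snapshot directory to use for a PITR target time.
# $1: target time as 14 digits (YYYYMMDDhhmmss); remaining arguments: snapshot directory names.
pick_snapshot() {
  target="$1"; shift
  printf '%s\n' "$@" | sort | awk -v target="$target" '
    {
      stamp = $0
      sub(/^.*snapshot-/, "", stamp)   # keep only the timestamp suffix
      # Force a string comparison; equal-length digit strings order chronologically.
      if ((stamp "") <= (target "")) best = $0   # input is sorted, so the last hit is the newest
    }
    END { if (best != "") print best }'
}

# Usage with the directory names from this document:
#   pick_snapshot 20220515180000 snapshot-20220514000000 snapshot-20220516000000
```

If no snapshot is old enough, the function prints nothing, which means the target time cannot be restored from the available backups.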

Clean up outdated data

You can clean up outdated data every two days using an automatic tool such as crontab.

For example, you can run the following commands to clean up outdated data:

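  • Delete snapshot data earlier than 2022/05/14 00:00:00, for example, the snapshot stored in snapshot-20220512000130. The following command uses the AWS CLI:

    aws s3 rm --recursive s3://tidb-pitr-bucket/backup-data/snapshot-20220512000130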
  • Delete log backup data earlier than 2022/05/14 00:00:00

    tiup br log truncate --until='2022-05-14 00:00:00 +0800' --storage='s3://tidb-pitr-bucket/backup-data/log-backup'
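
When this cleanup is automated, the list of expired snapshots can be derived from the directory names. The following is a minimal sketch. The function name expired_snapshots is hypothetical, and the usage comment assumes that the AWS CLI is installed. It prints the delete commands instead of running them so that you can review them first.

```shell
#!/bin/sh
# Sketch only: list snapshot directories that are strictly earlier than a cutoff.
# $1: cutoff as 14 digits (YYYYMMDDhhmmss); remaining arguments: snapshot directory names.
expired_snapshots() {
  cutoff="$1"; shift
  printf '%s\n' "$@" | awk -v cutoff="$cutoff" '
    {
      stamp = $0
      sub(/^.*snapshot-/, "", stamp)   # keep only the timestamp suffix
      # Force a string comparison; equal-length digit strings order chronologically.
      if ((stamp "") < (cutoff "")) print $0
    }'
}

# Hypothetical usage (dry run):
#   for dir in $(expired_snapshots 20220514000000 snapshot-20220512000130 snapshot-20220514000000); do
#     echo aws s3 rm --recursive "s3://tidb-pitr-bucket/backup-data/$dir"
#   done
```

To keep every retained time point restorable, the log truncation time should not be later than the oldest snapshot that you keep.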
