Use PITR
This document describes how to deploy the relevant tool and use point-in-time recovery (PITR). It aims to help you get started with this feature.
Assume that you have deployed a TiDB cluster in the production environment on AWS, and the business team puts forward the following requirements:
- Back up the data changes in time. When the database encounters an error, you can quickly recover the application data with minimal data loss (only a few minutes of data loss is tolerable).
- Perform business audits every month at no specific time. When an audit request is received, you must provide a database to query the data at a certain time point in the past month as requested.
With PITR, you can satisfy the above requirements.
Deploy the TiDB cluster and BR
To use PITR, you need to deploy a TiDB cluster of v6.2.0 or later and update BR to the same version as the TiDB cluster. This document uses v6.2.0 as an example.
The following table shows the recommended hardware resources for using PITR in a TiDB cluster.
Component | CPU | Memory | Local storage | AWS instance | Number of instances |
---|---|---|---|---|---|
TiDB | 8 core+ | 16 GB+ | SAS | c5.2xlarge | 2 |
PD | 8 core+ | 16 GB+ | SSD | c5.2xlarge | 3 |
TiKV | 8 core+ | 32 GB+ | SSD | m5.2xlarge | 3 |
BR | 8 core+ | 16 GB+ | SAS | c5.2xlarge | 1 |
Monitor | 8 core+ | 16 GB+ | SAS | c5.2xlarge | 1 |
Deploy or upgrade a TiDB cluster using TiUP:
- To deploy a new TiDB cluster, refer to Deploy a TiDB cluster.
- To upgrade an existing TiDB cluster, refer to Upgrade a TiDB cluster.
Install or upgrade BR using TiUP:
- Install: run the following command:
tiup install br:v6.2.0
- Upgrade: run the following command:
tiup update br:v6.2.0
Enable log backup
Before you use log backup, ensure that log-backup.enable in the TiKV configuration file is set to its default value, true. To modify the configuration, refer to Modify the configuration.
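One way to confirm the current value is to query the running cluster with the SHOW CONFIG statement. The following sketch only builds and prints the client command. The TiDB host, port, and user are hypothetical placeholders that do not come from this document.

```shell
# Hypothetical connection settings; replace them with your TiDB server address.
TIDB_HOST="${TIDB_HOST:-127.0.0.1}"
TIDB_PORT="${TIDB_PORT:-4000}"

# Print the mysql client command that lists log-backup.enable for every TiKV node.
show_log_backup_cmd() {
  printf '%s\n' "mysql -h $TIDB_HOST -P $TIDB_PORT -u root -e \"SHOW CONFIG WHERE type='tikv' AND name='log-backup.enable'\""
}

show_log_backup_cmd
```

Each TiKV instance should be listed with the value true before you start a log backup task.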
Configure backup storage (Amazon S3)
Before you start a backup task, prepare the backup storage, including the following aspects:
- Prepare the S3 bucket and directory that stores the backup data.
- Configure the permissions to access the S3 bucket.
- Plan the directory that stores the backup data.
The detailed steps are as follows:
Create a directory in S3 to store the backup data. The directory in this example is s3://tidb-pitr-bucket/backup-data.
- Create a bucket. You can choose an existing S3 bucket to store the backup data. If there is none, refer to AWS documentation - Creating a bucket and create an S3 bucket. In this example, the bucket name is tidb-pitr-bucket.
- Create a directory for your backup data. In the bucket (tidb-pitr-bucket), create a directory named backup-data. For detailed steps, refer to AWS documentation - Organizing objects in the Amazon S3 console using folders.
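If you prefer the command line to the S3 console, the bucket and directory can also be created with the AWS CLI. This is a minimal sketch, assuming the AWS CLI is installed and configured. The region us-west-2 and the run helper are assumptions made for illustration. By default the script only prints the commands.

```shell
BUCKET="tidb-pitr-bucket"
PREFIX="backup-data"
REGION="${AWS_REGION:-us-west-2}"   # assumed region, not from this document

# Print commands by default; set DRY_RUN=0 to execute them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

create_backup_location() {
  # For us-east-1, omit --create-bucket-configuration.
  run aws s3api create-bucket --bucket "$BUCKET" --region "$REGION" \
    --create-bucket-configuration "LocationConstraint=$REGION"
  # S3 has no real directories: a zero-byte object whose key ends in "/"
  # is what the console displays as a folder.
  run aws s3api put-object --bucket "$BUCKET" --key "$PREFIX/"
}

create_backup_location
```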
Configure permissions for BR and TiKV to access the S3 directory. It is recommended to grant permissions using the IAM method, which is the most secure way to access the S3 bucket. For detailed steps, refer to AWS documentation - Controlling access to a bucket with user policies. The required permissions are as follows:
- TiKV and BR in the backup cluster need s3:ListBucket, s3:PutObject, and s3:AbortMultipartUpload permissions on the s3://tidb-pitr-bucket/backup-data directory.
- TiKV and BR in the restoration cluster need s3:ListBucket and s3:GetObject permissions on the s3://tidb-pitr-bucket/backup-data directory.
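The permissions listed above could be expressed as IAM policy documents along the following lines. This is a sketch and not an official policy. The s3_policy helper is hypothetical. It reflects the fact that s3:ListBucket is a bucket-level action, so its resource is the bucket ARN, while the object-level actions are scoped to the backup-data prefix.

```shell
BUCKET="tidb-pitr-bucket"
PREFIX="backup-data"

# Print an IAM policy document.
# $1: comma-separated, JSON-quoted object-level actions for one cluster role.
s3_policy() {
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::$BUCKET"
    },
    {
      "Effect": "Allow",
      "Action": [$1],
      "Resource": "arn:aws:s3:::$BUCKET/$PREFIX/*"
    }
  ]
}
EOF
}

# Policy for the backup cluster
s3_policy '"s3:PutObject", "s3:AbortMultipartUpload"'
# Policy for the restoration cluster
s3_policy '"s3:GetObject"'
```

You can attach each printed document to the IAM role or user that the corresponding cluster uses.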
Plan the directory structure that stores the backup data, including the snapshot (full) backup and the log backup.
- All snapshot backup data is stored in the s3://tidb-pitr-bucket/backup-data/snapshot-${date} directory. ${date} is the start time of the snapshot backup. For example, a snapshot backup starting at 2022/05/12 00:01:30 is stored in s3://tidb-pitr-bucket/backup-data/snapshot-20220512000130.
- Log backup data is stored in the s3://tidb-pitr-bucket/backup-data/log-backup/ directory.
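The naming rule for snapshot directories is easy to script. The following sketch derives the directory from a start time by dropping every non-digit character. The snapshot_dir helper name is an assumption made for illustration.

```shell
BASE="s3://tidb-pitr-bucket/backup-data"

# Turn a start time such as "2022/05/12 00:01:30" into the snapshot directory URL.
snapshot_dir() {
  ts=$(printf '%s' "$1" | tr -cd '0-9')   # keep digits only: 20220512000130
  printf '%s/snapshot-%s\n' "$BASE" "$ts"
}

snapshot_dir "2022/05/12 00:01:30"
# prints s3://tidb-pitr-bucket/backup-data/snapshot-20220512000130
```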
Determine the backup policy
To meet the requirements of minimum data loss, quick recovery, and business audits within a month, you can set the backup policy as follows:
- Run the log backup to continuously back up the data changes in the database.
- Run a snapshot backup at 00:00 every two days.
- Retain the snapshot backup data and log backup data within 30 days and clean up backup data older than 30 days.
Run log backup
After the log backup task is started, the log backup process runs in the TiKV cluster and continuously sends the data changes in the database to the S3 storage. To start a log backup task, run the following command:
tiup br log start --task-name=pitr --pd=172.16.102.95:2379 --storage='s3://tidb-pitr-bucket/backup-data/log-backup'
When the log backup task is running, you can query the backup status:
tiup br log status --task-name=pitr --pd=172.16.102.95:2379
● Total 1 Tasks.
> #1 <
name: pitr
status: ● NORMAL
start: 2022-05-13 11:09:40.7 +0800
end: 2035-01-01 00:00:00 +0800
storage: s3://tidb-pitr-bucket/backup-data/log-backup
speed(est.): 0.00 ops/s
checkpoint[global]: 2022-05-13 11:31:47.2 +0800; gap=4m53s
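Because only a few minutes of data loss is tolerable, it can be useful to watch the checkpoint gap reported in the status output. The following sketch parses the gap value and converts it to seconds. It assumes the gap is always printed with h, m, and s units as in the sample above. The helper names are hypothetical.

```shell
# Read status output on stdin and print the checkpoint gap, for example "4m53s".
extract_gap() {
  sed -n 's/.*checkpoint\[global\].*gap=\([0-9hms.]*\).*/\1/p'
}

# Convert a duration such as 4m53s or 1h2m3s to whole seconds.
gap_seconds() {
  d=$1; h=0; m=0; s=0
  case $d in *h*) h=${d%%h*}; d=${d#*h} ;; esac
  case $d in *m*) m=${d%%m*}; d=${d#*m} ;; esac
  case $d in *s*) s=${d%%s*} ;; esac
  s=${s%%.*}                      # drop any fractional part of the seconds
  echo $(( ${h:-0} * 3600 + ${m:-0} * 60 + ${s:-0} ))
}

status_line='checkpoint[global]: 2022-05-13 11:31:47.2 +0800; gap=4m53s'
gap=$(printf '%s\n' "$status_line" | extract_gap)
echo "checkpoint gap: $(gap_seconds "$gap") seconds"
# prints "checkpoint gap: 293 seconds"
```

In practice, you would pipe the output of the tiup br log status command into extract_gap and raise an alert when the result exceeds your tolerance.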
Run snapshot backup
You can run snapshot backup tasks on a regular basis using an automatic tool such as crontab. For example, run a snapshot backup at 00:00 every two days.
The following are two snapshot backup examples:
Run a snapshot backup at 2022/05/14 00:00:00:
tiup br backup full --pd=172.16.102.95:2379 --storage='s3://tidb-pitr-bucket/backup-data/snapshot-20220514000000' --backupts='2022/05/14 00:00:00'
Run a snapshot backup at 2022/05/16 00:00:00:
tiup br backup full --pd=172.16.102.95:2379 --storage='s3://tidb-pitr-bucket/backup-data/snapshot-20220516000000' --backupts='2022/05/16 00:00:00'
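A scheduled job could generate these commands instead of hard-coding the dates. The following sketch assumes a crontab entry that starts the script daily at 00:00. The script itself decides whether the current day is a backup day. The helper names and the day-parity rule are assumptions made for illustration. The script only prints the command.

```shell
PD="172.16.102.95:2379"
BASE="s3://tidb-pitr-bucket/backup-data"

# Build the backup command for a start time given as "YYYY/MM/DD hh:mm:ss".
snapshot_backup_cmd() {
  ts=$(printf '%s' "$1" | tr -cd '0-9')
  printf "tiup br backup full --pd=%s --storage='%s/snapshot-%s' --backupts='%s'\n" \
    "$PD" "$BASE" "$ts" "$1"
}

# Succeed on every second day, counting days since the Unix epoch.
# $1: current time in epoch seconds.
is_backup_day() {
  [ $(( $1 / 86400 % 2 )) -eq 0 ]
}

now=$(date +%s)
if is_backup_day "$now"; then
  snapshot_backup_cmd "$(date '+%Y/%m/%d 00:00:00')"
fi
```

The parity check is used because a cron day-of-month step such as */2 restarts at the beginning of each month, so it does not always leave exactly two days between runs.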
Run PITR
Assume that you need to query the data at 2022/05/15 18:00:00. You can use PITR to restore a cluster to that timestamp by restoring a snapshot backup taken at 2022/05/14 and a log backup between the snapshot and 2022/05/15 18:00:00.
The command is as follows:
tiup br restore point --pd=172.16.102.95:2379 \
--storage='s3://tidb-pitr-bucket/backup-data/log-backup' \
--full-backup-storage='s3://tidb-pitr-bucket/backup-data/snapshot-20220514000000' \
--restored-ts '2022-05-15 18:00:00+0800'
Full Restore <--------------------------------------------------------------------------------------------------------------------------------------------------------> 100.00%
[2022/05/29 18:15:39.132 +08:00] [INFO] [collector.go:69] ["Full Restore success summary"] [total-ranges=12] [ranges-succeed=xxx] [ranges-failed=0] [split-region=xxx.xxxµs] [restore-ranges=xxx] [total-take=xxx.xxxs] [restore-data-size(after-compressed)=xxx.xxx] [Size=xxxx] [BackupTS={TS}] [total-kv=xxx] [total-kv-size=xxx] [average-speed=xxx]
Restore Meta Files <--------------------------------------------------------------------------------------------------------------------------------------------------> 100.00%
Restore KV Files <----------------------------------------------------------------------------------------------------------------------------------------------------> 100.00%
[2022/05/29 18:15:39.325 +08:00] [INFO] [collector.go:69] ["restore log success summary"] [total-take=xxx.xx] [restore-from={TS}] [restore-to={TS}] [total-kv-count=xxx] [total-size=xxx]
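The snapshot passed to --full-backup-storage is the most recent one taken at or before the restore target. The following sketch shows that selection, assuming snapshot timestamps follow the snapshot-${date} naming used in this document and that all values are in the same time zone. The pick_snapshot helper is hypothetical.

```shell
# Print the newest snapshot timestamp that is not later than the restore target.
# $1: target as YYYYMMDDhhmmss; remaining arguments: snapshot timestamps in the same form.
pick_snapshot() {
  target=$1; shift
  best=""
  for ts in "$@"; do
    if [ "$ts" -le "$target" ] && { [ -z "$best" ] || [ "$ts" -gt "$best" ]; }; then
      best=$ts
    fi
  done
  # Print nothing and return non-zero when no snapshot is old enough.
  [ -n "$best" ] && echo "$best"
}

pick_snapshot 20220515180000 20220514000000 20220516000000
# prints 20220514000000
```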
Clean up outdated data
You can clean up outdated data every two days using an automatic tool such as crontab.
For example, you can run the following commands to clean up outdated data:
Delete snapshot data earlier than 2022/05/14 00:00:00:
rm s3://tidb-pitr-bucket/backup-data/snapshot-20220514000000
Delete log backup data earlier than 2022/05/14 00:00:00:
tiup br log truncate --until='2022-05-14 00:00:00 +0800' --storage='s3://tidb-pitr-bucket/backup-data/log-backup'
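The cleanup can be sketched as a script that selects expired snapshot directories by their timestamp suffix. This sketch assumes the AWS CLI is used to delete objects in S3. It keeps the snapshot taken exactly at the cutoff, on the assumption that it is still needed as the base for restoring to later points. The helper names and the sample directory list are hypothetical. By default the script only prints the commands.

```shell
BASE="s3://tidb-pitr-bucket/backup-data"

# Print commands by default; set DRY_RUN=0 to execute them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Print the snapshot directories whose timestamp is strictly earlier than the cutoff.
# $1: cutoff as YYYYMMDDhhmmss; remaining arguments: names such as snapshot-20220512000000.
expired_snapshots() {
  cutoff=$1; shift
  for dir in "$@"; do
    ts=${dir#snapshot-}
    if [ "$ts" -lt "$cutoff" ]; then
      echo "$dir"
    fi
  done
}

for dir in $(expired_snapshots 20220514000000 snapshot-20220512000000 snapshot-20220514000000); do
  # The trailing slash limits the deletion to this one directory.
  run aws s3 rm "$BASE/$dir/" --recursive
done
run tiup br log truncate --until='2022-05-14 00:00:00 +0800' --storage="$BASE/log-backup"
```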