Migrate from a MySQL-compatible Database - Taking Amazon Aurora MySQL as an Example
This document describes how to migrate from Amazon Aurora MySQL to TiDB by using TiDB Data Migration (DM).
Step 1: Enable binlog in the Aurora cluster
Assuming that you want to migrate data from two Aurora clusters to TiDB, the information of the Aurora clusters is listed in the following table. The Aurora-1 cluster contains a separate reader endpoint.
Cluster | Endpoint | Port | Role |
---|---|---|---|
Aurora-1 | pingcap-1.h8emfqdptyc4.us-east-2.rds.amazonaws.com | 3306 | Writer |
Aurora-1 | pingcap-1-us-east-2a.h8emfqdptyc4.us-east-2.rds.amazonaws.com | 3306 | Reader |
Aurora-2 | pingcap-2.h8emfqdptyc4.us-east-2.rds.amazonaws.com | 3306 | Writer |
DM relies on the ROW
format of binlog during the incremental replication process, so you need to set the binlog format as ROW
. If binlog is not enabled or is incorrectly configured, DM cannot migrate data normally. For more details, see Checking items.
If you need to migrate data based on GTID (Global Transaction Identifier), enable GTID for the Aurora cluster.
Modify binlog related parameters in the Aurora cluster
In the Aurora cluster, binlog related parameters are cluster level parameters among cluster parameter groups. For more information about binlog in the Aurora cluster, see Enable Binary Logging on the Replication Master. You need to set the binlog_format
to ROW
when using DM for data migration.
To migrate data based on GTID, set both gtid-mode
and enforce_gtid_consistency
to ON
. See Configuring GTID-Based Replication for an Aurora MySQL Cluster for more information about enabling GTID-based migration for Aurora cluster.
Step 2: Deploy the DM cluster
It is recommended to use DM-Ansible to deploy a DM cluster. See Deploy Data Migration Using DM-Ansible.
Step 3: Check the cluster informtaion
After a DM cluster is deployed using DM-Ansible, the configuration information is as follows:
DM cluster components
Component Host Port dm_worker1 172.16.10.72 8262 dm_worker2 172.16.10.73 8262 dm_master 172.16.10.71 8261 Upstream and downstream database instances
Database instance Host Port Username Encrypted password Upstream Aurora-1 pingcap-1.h8emfqdptyc4.us-east-2.rds.amazonaws.com 3306 root VjX8cEeTX+qcvZ3bPaO4h0C80pe/1aU= Upstream Aurora-2 pingcap-2.h8emfqdptyc4.us-east-2.rds.amazonaws.com 3306 root VjX8cEeTX+qcvZ3bPaO4h0C80pe/1aU= Downstream TiDB 172.16.10.83 4000 root Configuration in the
{ansible deploy}/conf/dm-master.toml
DM-master process configuration file# DM-Master Configuration [[deploy]] source-id = "mysql-replica-01" dm-worker = "172.16.10.72:8262" [[deploy]] source-id = "mysql-replica-02" dm-worker = "172.16.10.73:8262"
Step 4: Configure the task
This section assumes that you need to migrate data of the test_table
table in the test_db
schema of Aurora-1 and Aurora-2 instances, in both full data migration and incremental replication modes, to the test_table
table of the test_db
schema in one downstream TiDB instance.
Copy and edit {ansible deploy}/conf/task.yaml.example
to generate the following task.yaml
configuration file:
# The task name. You need to use a different name for each of the multiple tasks that run simultaneously.
name: "test"
# The full data migration plus incremental replication task mode.
task-mode: "all"
# The downstream TiDB configuration information.
target-database:
host: "172.16.10.83"
port: 4000
user: "root"
password: ""
# Configuration of all the upstream MySQL instances required by the current data migration task.
mysql-instances:
-
# ID of the upstream instance or the migration group. Refer to the configuration of `source_id` in the `inventory.ini` file or configuration of `source-id` in the `dm-master.toml` file.
source-id: "mysql-replica-01"
# The configuration item name of the block and allow lists of the schema or table to be migrated, used to quote the global block and allow lists configuration. For global configuration, see the `block-allow-list` below.
block-allow-list: "global" # Use black-white-list if the DM's version <= v1.0.6.
# The configuration item name of the dump unit, used to quote the global dump unit configuration.
mydumper-config-name: "global"
-
source-id: "mysql-replica-02"
block-allow-list: "global" # Use black-white-list if the DM's version <= v1.0.6.
mydumper-config-name: "global"
# The global configuration of block and allow lists. Each instance can quote it by the configuration item name.
block-allow-list: # Use black-white-list if the DM's version <= v1.0.6.
global:
do-tables: # The allow list of the upstream table to be migrated
- db-name: "test_db" # The database name of the table to be migrated
tbl-name: "test_table" # The name of the table to be migrated
# The global configuration of dump unit. Each instance can quote it by the configuration item name.
mydumpers:
global:
extra-args: "-B test_db -T test_table" # Extra arguments of the dump unit. Since DM 1.0.2, DM automatically generates the "--tables-list" configuration. For versions earlier than 1.0.2, you need to configure this option manually.
Step 5: Start the task
Go to the dmctl directory:
/home/tidb/dm-ansible/resources/bin/
.Start dmctl using the following command:
./dmctl --master-addr 172.16.10.71:8261
Start data migration task using the following command:
# `task.yaml` is the previously edited configuration file. start-task ./task.yaml
If the returned results do not contain any error, it indicates the task is successfully started.
If the returned results contain the following error information, it indicates the upstream Aurora user might have privileges unsupported by TiDB:
{ "id": 4, "name": "source db dump privilege chcker", "desc": "check dump privileges of source DB", "state": "fail", "errorMsg": "line 1 column 285 near \"LOAD FROM S3, SELECT INTO S3 ON *.* TO 'root'@'%' WITH GRANT OPTION\" ...", "instruction": "", "extra": "address of db instance - pingcap-1.h8emfqdptyc4.us-east-2.rds.amazonaws.com" }, { "id": 5, "name": "source db replication privilege chcker", "desc": "check replication privileges of source DB", "state": "fail", "errorMsg": "line 1 column 285 near \"LOAD FROM S3, SELECT INTO S3 ON *.* TO 'root'@'%' WITH GRANT OPTION\" ...", "instruction": "", "extra": "address of db instance - pingcap-1.h8emfqdptyc4.us-east-2.rds.amazonaws.com" }
To resolve this issue, use either of the following two solutions to handle it and then use the
start-task
command to restart the task:Remove the unnecessary privileges unsupported by TiDB for the Aurora user that is used to migrate data.
If you can make sure that the Aurora user has the privileges required by DM, add the following configuration item to the
task.yaml
configuration file to skip the privileges precheck when starting the task.ignore-checking-items: ["dump_privilege", "replication_privilege"]
Step 6: Query the task
To view the on-going data migration task(s) in the DM cluster or the task status, run the following command in dmctl to query:
query-status