Data Migration Overview

TiDB Data Migration (DM) is an integrated data migration task management platform that supports full data migration and incremental data replication from MySQL-compatible databases (such as MySQL, MariaDB, and Aurora MySQL) into TiDB. It helps reduce the operational cost of data migration and simplifies troubleshooting. To migrate data with DM, you perform the following operations:

  • Deploy a DM cluster
  • Create an upstream data source and save its access information
  • Create data migration tasks to migrate data from the data sources to TiDB

A data migration task includes two stages, full data migration and incremental data replication:

  • Full data migration: DM migrates the table schemas of the corresponding tables from the data source to TiDB, then reads the data stored in the data source and writes it to the TiDB cluster.
  • Incremental data replication: After the full data migration is complete, DM reads the subsequent changes to the corresponding tables from the data source and writes them to the TiDB cluster.
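
As a rough illustration, the stages a task runs are typically selected in the task configuration file. The fragment below is a minimal sketch, assuming the task-mode setting of the DM task configuration file; the task name is hypothetical and the rest of the task definition is omitted.

```yaml
# Fragment of a DM task configuration file (sketch, not a complete task).
name: "example-task"   # hypothetical task name
# "full":        run only the full data migration stage
# "incremental": run only the incremental data replication stage
# "all":         run the full stage, then continue with incremental replication
task-mode: all
```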

The following describes the features of DM.

Basic features

This section describes the basic data migration features provided by DM.

[Figure: DM Core Features]

Block and allow lists migration at the schema and table levels

The block and allow lists filtering rule is similar to the database-level and table-level replication rules of MySQL (replication-rules-db and replication-rules-table). You can use it to replicate, or to filter out, all operations on specific databases or specific tables.
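
A minimal sketch of such a rule in a task configuration file is shown below. It assumes the block-allow-list section of the DM task configuration; the rule name bw-rule-1 and the table test.log are hypothetical.

```yaml
# Sketch: migrate only the schema "test", but skip the table "test.log".
block-allow-list:
  bw-rule-1:              # hypothetical rule name
    do-dbs: ["test"]      # allow list at the schema level
    ignore-tables:        # block list at the table level
    - db-name: "test"
      tbl-name: "log"     # hypothetical table
```

In a full task file, a rule like this is defined once and then referenced by name from the data source entry that should use it.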

Binlog event filtering

The binlog event filtering feature lets DM filter out certain types of SQL statements on certain tables in the source database. For example, you can filter out all INSERT statements on the table test.sbtest, or all TRUNCATE TABLE statements in the schema test.
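
The two examples above might be expressed roughly as follows. This is a sketch that assumes the filters section of the DM task configuration; the rule names are hypothetical.

```yaml
# Sketch: binlog event filter rules for the two examples in the text.
filters:
  ignore-insert-sbtest:          # hypothetical rule name
    schema-pattern: "test"
    table-pattern: "sbtest"
    events: ["insert"]           # INSERT row changes on test.sbtest
    action: Ignore
  ignore-truncate-in-test:       # hypothetical rule name
    schema-pattern: "test"
    table-pattern: "*"           # every table in the schema test
    events: ["truncate table"]   # TRUNCATE TABLE statements
    action: Ignore
```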

Schema and table routing

The schema and table routing feature lets DM migrate a table in the source database to a specified table in the downstream. For example, you can migrate the table structure and data from the table test.sbtest1 in the source database to the table test.sbtest2 in TiDB. This is also a core feature for merging and migrating sharded databases and tables.
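
The example above could be written as a routing rule along these lines. This is a sketch that assumes the routes section of the DM task configuration; the rule name is hypothetical.

```yaml
# Sketch: route test.sbtest1 in the source to test.sbtest2 in TiDB.
routes:
  route-sbtest1:               # hypothetical rule name
    schema-pattern: "test"
    table-pattern: "sbtest1"
    target-schema: "test"
    target-table: "sbtest2"
```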

Advanced features

Shard merge and migration

DM supports merging and migrating the original sharded instances and tables from the source databases into TiDB, but with some restrictions. For details, see Sharding DDL usage restrictions in the pessimistic mode and Sharding DDL usage restrictions in the optimistic mode.
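
As an illustration, a shard merge is usually built from a sharding mode plus routing rules that map many upstream tables onto one downstream table. The fragment below is a sketch that assumes the shard-mode and routes settings of the DM task configuration; the schema and table names (test_*, t_*) and the rule names are hypothetical.

```yaml
# Sketch: merge tables t_* from schemas test_* into the single table test.t.
shard-mode: "pessimistic"      # or "optimistic"; each mode has its own DDL restrictions
routes:
  shard-route-table:           # hypothetical rule name
    schema-pattern: "test_*"
    table-pattern: "t_*"
    target-schema: "test"
    target-table: "t"
  shard-route-schema:          # hypothetical rule name; routes schema-level DDL
    schema-pattern: "test_*"
    target-schema: "test"
```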

Optimization for third-party online-schema-change tools in the migration process

In the MySQL ecosystem, tools such as gh-ost and pt-osc are widely used. DM provides support for these tools to avoid migrating unnecessary intermediate data. For details, see Online DDL Tools.
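
A minimal sketch of enabling this support in a task configuration follows. It assumes a DM version whose task configuration accepts the online-ddl switch; the exact setting can differ between releases, so check the Online DDL Tools document for the version in use.

```yaml
# Sketch: ask DM to recognize gh-ost / pt-osc changes so that only the
# final DDL reaches TiDB, not the tools' intermediate tables.
# Assumption: the task configuration of the DM version in use supports this key.
online-ddl: true
```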

Filter certain row changes using SQL expressions

In the incremental replication phase, DM supports configuring SQL expressions to filter out certain row changes, which lets you control the replicated data at a finer granularity. For more information, refer to Filter Certain Row Changes Using SQL Expressions.
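
For instance, a rule of this kind might look as follows. This is a sketch that assumes the expression-filter section of the DM task configuration; the rule name, the table test.tbl, and its integer column c are hypothetical.

```yaml
# Sketch: filter out INSERT row changes whose column c is even.
expression-filter:
  even-c:                            # hypothetical rule name
    schema: "test"
    table: "tbl"                     # hypothetical table with an integer column c
    insert-value-expr: "c % 2 = 0"   # matching inserted rows are not replicated
```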

Usage restrictions

Before using the DM tool, note the following restrictions:

  • Database version requirements

    • MySQL version > 5.5

    • MariaDB version >= 10.1.2

  • DDL syntax compatibility

    • Currently, TiDB is not compatible with all the DDL statements that MySQL supports. Because DM uses the TiDB parser to process DDL statements, it only supports the DDL syntax supported by the TiDB parser. For details, see MySQL Compatibility.

    • DM reports an error when it encounters an incompatible DDL statement. To resolve the error, handle it manually using dmctl, either by skipping the DDL statement or by replacing it with one or more specified DDL statements. For details, see Skip or replace abnormal SQL statements.

  • Sharding merge with conflicts

  • Switch of MySQL instances for data sources

    When DM-worker connects to the upstream MySQL instance via a virtual IP (VIP) and you switch the VIP to another MySQL instance, DM might be connected to the new and the old MySQL instances at the same time over different connections. In this situation, the binlog replicated to DM is inconsistent with the other upstream state that DM receives, which can cause unpredictable anomalies and even data corruption. To make the necessary changes to DM manually, see Switch DM-worker connection via virtual IP.