Migration Task Precheck

Before using DM to migrate data from upstream to downstream, a precheck helps detect errors in the upstream database configurations and ensures that the migration goes smoothly. This document introduces the DM precheck feature, including its usage scenario, check items, and arguments.

Usage scenario

To run a data migration task smoothly, DM triggers a precheck automatically at the start of the task and returns the check results. DM starts the migration only after the precheck passes.

To trigger a precheck manually, run the check-task command.

For example:

tiup dmctl check-task ./task.yaml

Descriptions of check items

After a precheck is triggered for a task, DM checks the corresponding items according to your migration mode configuration.

This section lists all the precheck items.

  • If a mandatory check item does not pass, DM returns an error after the check and does not proceed with the migration task. In this case, modify the configurations according to the error message and retry the task after meeting the precheck requirements.

  • If a non-mandatory check item does not pass, DM returns a warning after the check. DM automatically starts a migration task if the check result contains only warnings but no errors.
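
The two rules above amount to a simple gate: only failed mandatory items block the task. The following is a minimal sketch of that logic, not DM's implementation. The `CheckResult` type, the function names, and the `foreign_key` label are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    name: str        # label of the check item (illustrative only)
    mandatory: bool  # True if a failure must block the migration task
    passed: bool     # outcome of the check


def summarize(results):
    """Split failed items into errors (mandatory) and warnings (non-mandatory)."""
    errors = [r.name for r in results if not r.passed and r.mandatory]
    warnings = [r.name for r in results if not r.passed and not r.mandatory]
    return errors, warnings


def can_start_migration(results):
    """The task may start when the result contains no errors, even with warnings."""
    errors, _ = summarize(results)
    return not errors
```

With this sketch, a result set that contains only a failed non-mandatory item still allows the task to start, while a single failed mandatory item blocks it.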

Common check items

Regardless of the migration mode you choose, the precheck always includes the following common check items:

  • Database version

    • MySQL version > 5.5

    • MariaDB version >= 10.1.2

  • Compatibility of the upstream MySQL table schema

    • Check whether the upstream tables have foreign keys, which are not supported by TiDB. A warning is returned if a foreign key is found in the precheck.

    • Check whether the upstream tables use character sets that are incompatible with TiDB. For more information, see TiDB Supported Character Sets.

    • Check whether the upstream tables have primary key constraints or unique key constraints (introduced from v1.0.7).
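
The database version rule in the list above can be sketched as a small comparison. This is an illustration under stated assumptions, not DM's code: it assumes the version string has the form returned by `SELECT VERSION()` (for example `5.7.26-log` or `10.1.2-MariaDB`), and it interprets "MySQL version > 5.5" as "any release after the 5.5 series". The function names are hypothetical.

```python
import re


def parse_version(version_string):
    """Extract (major, minor, patch) from strings such as '5.7.26-log'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError(f"unrecognized version string: {version_string!r}")
    return tuple(int(part) for part in match.groups())


def version_supported(version_string):
    """Apply the documented minimums: MySQL > 5.5, MariaDB >= 10.1.2."""
    version = parse_version(version_string)
    if "mariadb" in version_string.lower():
        return version >= (10, 1, 2)
    # Assumption: "> 5.5" means a (major, minor) pair later than (5, 5),
    # so every 5.5.x release is treated as unsupported.
    return version[:2] > (5, 5)
```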

Check items for full data migration

For the full data migration mode (task-mode: full), in addition to the common check items, the precheck also includes the following check items:

  • (Mandatory) dump permission of the upstream database

    • SELECT permission on INFORMATION_SCHEMA and dump tables
    • RELOAD permission if consistency=flush
    • LOCK TABLES permission on the dump tables if consistency=flush/lock
  • (Mandatory) Consistency of upstream MySQL multi-instance sharded tables

    • In the pessimistic mode, check whether the table schemas of all sharded tables are consistent in the following items:

      • Number of columns
      • Column name
      • Column order
      • Column type
      • Primary key
      • Unique index
    • In the optimistic mode, check whether the schemas of all sharded tables meet the optimistic compatibility.

    • If a migration task was started successfully by the start-task command, the precheck of this task skips the consistency check.

  • Auto-increment primary key in sharded tables
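
The pessimistic-mode consistency items listed above (number, name, order, and type of columns, plus primary key and unique indexes) can be illustrated by comparing every shard against a reference shard. This is a hedged sketch, not DM's implementation. The `TableSchema` type and `inconsistent_shards` function are hypothetical, and using the first shard as the reference is an assumption made for simplicity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableSchema:
    columns: tuple             # ordered (name, type) pairs: covers count, name, order, type
    primary_key: tuple         # column names that form the primary key
    unique_indexes: frozenset  # each unique index as a tuple of column names


def inconsistent_shards(schemas):
    """Return the names of shards whose schema differs from the first shard.

    `schemas` maps a shard name (for example 'instance1.db.tbl') to its TableSchema.
    """
    names = list(schemas)
    if not names:
        return []
    reference = schemas[names[0]]
    # Dataclass equality compares every field, so any difference is reported.
    return [name for name in names[1:] if schemas[name] != reference]
```

Because the columns are stored as an ordered tuple, two shards with the same columns in a different order are reported as inconsistent, which matches the "Column order" item.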

Check items for incremental data migration

For the incremental data migration mode (task-mode: incremental), in addition to the common check items, the precheck also includes the following check items:

  • (Mandatory) Upstream database REPLICATION permission

    • REPLICATION CLIENT permission
    • REPLICATION SLAVE permission
  • Database primary-secondary configuration

    • To avoid primary-secondary replication failures, it is recommended that you specify server_id for the upstream database. For non-AWS Aurora environments, using GTID is also recommended.
  • (Mandatory) MySQL binlog configuration

    • Check whether binlog is enabled (required by DM).
    • Check whether binlog_format=ROW is configured (DM only supports the migration of binlog in the ROW format).
    • Check whether binlog_row_image=FULL is configured (DM only supports binlog_row_image=FULL).
    • If binlog_do_db or binlog_ignore_db is configured, check whether the database tables to be migrated meet the conditions of binlog_do_db and binlog_ignore_db.
  • (Mandatory) Check whether the upstream database is in an online DDL process (in which the ghost table is created but the rename phase is not executed yet). If the upstream is in an online DDL process, the precheck returns an error. In this case, wait until the DDL completes and then retry.
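
The binlog requirements in the list above can be expressed as a check over the upstream system variables. The sketch below is illustrative only and is not DM's code. It assumes the input is a dictionary built from the output of `SHOW VARIABLES` on the upstream database, and it treats a missing variable as a failure. The function name and the message texts are hypothetical.

```python
def check_binlog_settings(variables):
    """Return a list of problems found in the upstream binlog configuration.

    `variables` maps system variable names to their values, for example
    {"log_bin": "ON", "binlog_format": "ROW", "binlog_row_image": "FULL"}.
    """
    problems = []
    # Assumption: a missing variable counts the same as a wrong value.
    if variables.get("log_bin", "").upper() != "ON":
        problems.append("binlog is not enabled")
    if variables.get("binlog_format", "").upper() != "ROW":
        problems.append("binlog_format is not ROW")
    if variables.get("binlog_row_image", "").upper() != "FULL":
        problems.append("binlog_row_image is not FULL")
    return problems
```

An empty list means the three settings satisfy the requirements. The sketch does not cover the binlog_do_db and binlog_ignore_db condition, which depends on the tables selected for migration.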

Check items for full and incremental data migration

For the full and incremental data migration mode (task-mode: all), in addition to the common check items, the precheck also includes the full data migration check items and the incremental data migration check items.

Ignorable check items

Prechecks can find potential risks in your environments. It is not recommended to ignore check items. If your data migration task has special needs, you can use the ignore-checking-items configuration item to skip some check items.

Check item                Description
dump_privilege            Checks the dump privilege of the user in the upstream MySQL instance.
replication_privilege     Checks the replication privilege of the user in the upstream MySQL instance.
version                   Checks the version of the upstream database.
server_id                 Checks whether server_id is configured in the upstream database.
binlog_enable             Checks whether binlog is enabled in the upstream database.
table_schema              Checks the compatibility of the table schemas in the upstream MySQL tables.
schema_of_shard_tables    Checks the consistency of the table schemas in the upstream MySQL multi-instance shards.
auto_increment_ID         Checks whether the auto-increment primary key conflicts in the upstream MySQL multi-instance shards.
online_ddl                Checks whether the upstream is in the process of online-DDL.

Configure precheck arguments

The migration task precheck supports parallel processing. Even if the sharded tables contain millions of rows, the precheck can be completed in minutes.

To specify the number of threads for the precheck, you can configure the threads argument of the mydumpers field in the migration task configuration file.

mydumpers:                           # Configuration arguments of the dump processing unit
  global:                            # Configuration name
    threads: 4                       # The number of threads that access the upstream when the dump processing unit performs the precheck and exports data from the upstream database (4 by default)
    chunk-filesize: 64               # The size of the files generated by the dump processing unit (64 MB by default)
    extra-args: "--consistency none" # Other arguments of the dump processing unit. You do not need to manually configure table-list in `extra-args`, because it is automatically generated by DM.