Optimize Configuration of DM

This document introduces how to optimize the configuration of the data migration task to improve the performance of data migration.

Full data export

mydumpers is the configuration item related to full data export. This section describes how to configure performance-related options.

rows

Setting the rows option enables concurrently exporting data from a single table using multi-thread. The value of rows is the maximum number of rows contained in each exported chunk. After this option is enabled, DM selects a column as the split benchmark when the data of a MySQL single table is concurrently exported. This column can be one of the following columns: the primary key column, the unique index column, and the normal index column (ordered from highest priority to lowest). Make sure this column is of integer type (for example, INT, MEDIUMINT, BIGINT).

The value of rows can be set to 10000. You can change this value according to the total number of rows in the table and the performance of the database. In addition, you need to set threads to control the number of concurrent threads. By default, the value of threads is 4. You can adjust this value as needed.

chunk-filesize

During full backup, DM splits the data of each table into multiple chunks according to the value of the chunk-filesize option. Each chunk is saved in a file with a size of about chunk-filesize. In this way, data is split into multiple files and you can use the parallel processing of the DM Load unit to improve the import speed. The default value of this option is 64 (in MB). Normally, you do not need to set this option. If you set it, adjust the value of this option according to the size of the full data.

Full data import

loaders is the configuration item related to full data import. This section describes how to configure performance-related options.

pool-size

The pool-size option determines the number of threads in the DM Load unit. The default value is 16. Normally, you do not need to set this option. If you set it, adjust the value of this option according to the size of the full data and the performance of the database.

Incremental data replication

syncers is the configuration item related to incremental data replication. This section describes how to configure performance-related options.

worker-count

worker-count determines the number of threads for concurrent replication of DMLs in the DM Sync unit. The default value is 16. To speed up data replication, increase the value of this option appropriately.

batch

batch determines the number of DMLs included in each transaction when the data is replicated to the downstream database during the DM Sync unit. The default value is 100. Normally, you do not need to change the value of this option.