Software and Hardware Requirements for TiDB Data Migration

TiDB Data Migration (DM) supports mainstream Linux operating systems. See the following table for specific version requirements:

| Linux OS | Version |
| --- | --- |
| Red Hat Enterprise Linux | 7.3 or later |
| CentOS | 7.3 or later |
| Oracle Enterprise Linux | 7.3 or later |
| Ubuntu LTS | 16.04 or later |

DM can be deployed and run on Intel architecture servers and mainstream virtualization environments.

DM can be deployed and run on a 64-bit generic hardware server platform (Intel x86-64 architecture). The following tables list the recommended hardware configurations for servers used in development, testing, and production environments. These figures do not include the resources used by the operating system.

Development and test environments

| Component | CPU | Memory | Local Storage | Network | Number of Instances (Minimum Requirement) |
| --- | --- | --- | --- | --- | --- |
| DM-master | 4 core+ | 8 GB+ | SAS, 200 GB+ | Gigabit network card | 1 |
| DM-worker | 8 core+ | 16 GB+ | SAS, 200 GB+ (Greater than the size of the migrated data) | Gigabit network card | The number of upstream MySQL instances |

Production environment

| Component | CPU | Memory | Hard Disk Type | Network | Number of Instances (Minimum Requirement) |
| --- | --- | --- | --- | --- | --- |
| DM-master | 4 core+ | 8 GB+ | SAS, 200 GB+ | Gigabit network card | 3 |
| DM-worker | 16 core+ | 32 GB+ | SSD, 200 GB+ (Greater than the size of the migrated data) | 10 Gigabit network card | Greater than the number of upstream MySQL instances |
| Monitor | 8 core+ | 16 GB+ | SAS, 200 GB+ | Gigabit network card | 1 |

Downstream storage space requirements

The target TiKV cluster must have enough disk space to store the imported data. In addition to the standard hardware requirements, the storage space of the target TiKV cluster must be larger than the size of the data source × the number of replicas × 2. For example, if the cluster uses 3 replicas (the default), the target TiKV cluster must have storage space larger than 6 times the size of the data source. The formula includes the factor of 2 for the following reasons:

  • Indexes might take extra space.
  • RocksDB has a space amplification effect.
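
The sizing rule above can be worked through with a short calculation. The following Python sketch is only an illustration of the arithmetic: the function name `min_tikv_storage_bytes` and its parameters are hypothetical and are not part of DM or TiDB.

```python
def min_tikv_storage_bytes(source_bytes: int, replicas: int = 3, overhead_factor: int = 2) -> int:
    """Return the lower bound from the rule above: source size x replicas x 2.

    The target TiKV cluster needs MORE storage than the returned value.
    The default of 3 replicas and the factor of 2 come from the text above;
    the function itself is a hypothetical helper.
    """
    if source_bytes < 0:
        raise ValueError("source_bytes must not be negative")
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return source_bytes * replicas * overhead_factor


GIB = 1024 ** 3

# Example: a 500 GiB data source migrated into a cluster with 3 replicas.
required = min_tikv_storage_bytes(500 * GIB)
print(f"Target TiKV storage must exceed {required / GIB:.0f} GiB")  # 3000 GiB
```

With 3 replicas the result is 6 times the source size, which matches the example in the text. Pass a different `replicas` value if the cluster does not use the default replica count.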

You can estimate the data volume by using the following SQL statements to summarize the DATA_LENGTH field:

-- Calculate the size of all schemas
SELECT
  TABLE_SCHEMA,
  FORMAT_BYTES(SUM(DATA_LENGTH)) AS 'Data Size',
  FORMAT_BYTES(SUM(INDEX_LENGTH)) AS 'Index Size'
FROM
  information_schema.tables
GROUP BY
  TABLE_SCHEMA;

-- Calculate the 5 largest tables
SELECT
  TABLE_NAME,
  TABLE_SCHEMA,
  FORMAT_BYTES(SUM(DATA_LENGTH)) AS 'Data Size',
  FORMAT_BYTES(SUM(INDEX_LENGTH)) AS 'Index Size',
  FORMAT_BYTES(SUM(DATA_LENGTH + INDEX_LENGTH)) AS 'Total Size'
FROM
  information_schema.tables
GROUP BY
  TABLE_NAME,
  TABLE_SCHEMA
ORDER BY
  SUM(DATA_LENGTH + INDEX_LENGTH) DESC
LIMIT
  5;