# Software and Hardware Requirements for TiDB Data Migration
TiDB Data Migration (DM) supports mainstream Linux operating systems. See the following table for specific version requirements:
| Linux OS | Version |
| --- | --- |
| Red Hat Enterprise Linux | 7.3 or later |
| CentOS | 7.3 or later |
| Oracle Enterprise Linux | 7.3 or later |
| Ubuntu LTS | 16.04 or later |
DM can be deployed and run on Intel architecture servers and mainstream virtualization environments.
## Recommended server requirements
DM can be deployed and run on a 64-bit generic hardware server platform (Intel x86-64 architecture). For servers used in development, testing, and production environments, the following sections describe the recommended hardware configurations (these do not include the resources used by the operating system).
### Development and test environments
| Component | CPU | Memory | Local Storage | Network | Number of Instances (Minimum Requirement) |
| --- | --- | --- | --- | --- | --- |
| DM-master | 4 core+ | 8 GB+ | SAS, 200 GB+ | Gigabit network card | 1 |
| DM-worker | 8 core+ | 16 GB+ | SAS, 200 GB+ (greater than the size of the migrated data) | Gigabit network card | The number of upstream MySQL instances |
### Production environment
| Component | CPU | Memory | Local Storage | Network | Number of Instances (Minimum Requirement) |
| --- | --- | --- | --- | --- | --- |
| DM-master | 4 core+ | 8 GB+ | SAS, 200 GB+ | Gigabit network card | 3 |
| DM-worker | 16 core+ | 32 GB+ | SSD, 200 GB+ (greater than the size of the migrated data) | 10 Gigabit network card | Greater than the number of upstream MySQL instances |
| Monitor | 8 core+ | 16 GB+ | SAS, 200 GB+ | Gigabit network card | 1 |
## Downstream storage space requirements
The target TiKV cluster must have enough disk space to store the imported data. In addition to the standard hardware requirements, the storage space of the target TiKV cluster must be larger than the size of the data source x the number of replicas x 2 (the rule is restated as a formula after the list below). For example, if the cluster uses 3 replicas by default, the target TiKV cluster must have storage space larger than 6 times the size of the data source. The formula includes the factor of 2 because:
- Indexes might take extra space.
- RocksDB has a space amplification effect.
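With $S$ denoting the size of the data source and $R$ the number of replicas (symbols introduced here only for illustration), the rule above can be written as:

$$
\text{required TiKV storage} > S \times R \times 2
$$

For example, with $S = 500$ GiB and the default $R = 3$, the target cluster needs more than $500 \times 3 \times 2 = 3000$ GiB of storage.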
You can estimate the data volume by using the following SQL statements to summarize the `DATA_LENGTH` and `INDEX_LENGTH` fields in `information_schema.tables`:
```sql
-- Calculate the size of all schemas
SELECT
    TABLE_SCHEMA,
    FORMAT_BYTES(SUM(DATA_LENGTH)) AS 'Data Size',
    FORMAT_BYTES(SUM(INDEX_LENGTH)) AS 'Index Size'
FROM
    information_schema.tables
GROUP BY
    TABLE_SCHEMA;
```
```sql
-- Calculate the 5 largest tables
SELECT
    TABLE_NAME,
    TABLE_SCHEMA,
    FORMAT_BYTES(SUM(DATA_LENGTH)) AS 'Data Size',
    FORMAT_BYTES(SUM(INDEX_LENGTH)) AS 'Index Size',
    FORMAT_BYTES(SUM(DATA_LENGTH + INDEX_LENGTH)) AS 'Total Size'
FROM
    information_schema.tables
GROUP BY
    TABLE_NAME,
    TABLE_SCHEMA
ORDER BY
    SUM(DATA_LENGTH + INDEX_LENGTH) DESC
LIMIT
    5;
```
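To combine the two steps, the following statement (a sketch, not part of the standard procedure: it assumes the default of 3 replicas and requires `FORMAT_BYTES()`, available in MySQL 8.0.16 or later) applies the sizing rule directly to the upstream instance. Adjust the replica multiplier to match your cluster:

```sql
-- Estimate the minimum target TiKV storage as
-- data size x number of replicas (3 assumed) x 2,
-- excluding the MySQL system schemas.
SELECT
    FORMAT_BYTES(SUM(DATA_LENGTH) * 3 * 2) AS 'Estimated Minimum TiKV Storage'
FROM
    information_schema.tables
WHERE
    TABLE_SCHEMA NOT IN ('mysql', 'information_schema', 'performance_schema', 'sys');
```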