This document introduces how to use the logical import mode in TiDB Lightning, including writing the configuration file and tuning performance.
You can use the logical import mode via the following configuration file to import data:

    [lightning]
    # log
    level = "info"
    file = "tidb-lightning.log"
    max-size = 128 # MB
    max-days = 28
    max-backups = 14

    # Checks the cluster minimum requirements before start.
    check-requirements = true

    [mydumper]
    # The local data source directory or the external storage URL.
    data-source-dir = "/data/my_database"

    [tikv-importer]
    # Import mode. "tidb" means using the logical import mode.
    backend = "tidb"

    # The operation of inserting duplicate data in the logical import mode.
    # - replace: replace existing data with new data
    # - ignore: keep existing data and ignore new data
    # - error: pause the import and report an error
    on-duplicate = "replace"

    [tidb]
    # The information of the target cluster. The address of any tidb-server from the cluster.
    host = "172.16.31.1"
    port = 4000
    user = "root"
    # Configure the password to connect to TiDB. Either plaintext or Base64 encoded.
    password = ""
    # tidb-lightning imports the TiDB library, and generates some logs.
    # Set the log level of the TiDB library.
    log-level = "error"

For the complete configuration file, refer to TiDB Lightning Configuration.
Conflicting data refers to two or more records that have the same value in a primary key (PK) or unique key (UK) column. When the data source contains conflicting data, the actual number of rows in the table differs from the total number of rows returned by a query that uses the unique index.
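To make the definition concrete, the following is a minimal sketch of how conflicting records could be spotted in a data source before an import. The helper `find_conflicts` and the sample rows are hypothetical and are not part of TiDB Lightning; the `id` column simply plays the role of a PK column.

```python
from collections import Counter


def find_conflicts(rows, key_columns):
    """Return the key values that occur in more than one row, with their counts."""
    counts = Counter(tuple(row[c] for c in key_columns) for row in rows)
    return {key: n for key, n in counts.items() if n > 1}


# Hypothetical sample rows; `id` stands in for the PK column.
rows = [
    {"id": 1, "name": "alice"},
    {"id": 2, "name": "bob"},
    {"id": 1, "name": "carol"},  # same id as the first row, so it conflicts
]

print(find_conflicts(rows, ["id"]))  # {(1,): 2}
```

Three source rows carry only two distinct `id` values, so a table with a unique constraint on `id` cannot hold all of them.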
In the logical import mode, you can configure the strategy for resolving conflicting data by setting the `on-duplicate` configuration item. Based on the strategy, TiDB Lightning imports data with different SQL statements.
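| Strategy | Default behavior of conflicting data | The corresponding SQL statement |
|:--|:--|:--|
| `replace` | Replacing existing data with new data. | `REPLACE INTO ...` |
| `ignore` | Keeping existing data and ignoring new data. | `INSERT IGNORE INTO ...` |
| `error` | Pausing the import and reporting an error. | `INSERT INTO ...` |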
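The effect of the three strategies can be tried locally with a small simulation. This is only a sketch under stated assumptions: it uses Python's built-in `sqlite3` module as a stand-in for TiDB, the table `t` and the helper `import_rows` are hypothetical, and SQLite spells the ignore form `INSERT OR IGNORE` where MySQL-compatible syntax uses `INSERT IGNORE`. It does not reproduce how TiDB Lightning batches or issues its statements.

```python
import sqlite3

# SQLite stands in for TiDB here; its ignore form is spelled INSERT OR IGNORE.
STATEMENTS = {
    "replace": "REPLACE INTO t (id, v) VALUES (?, ?)",
    "ignore": "INSERT OR IGNORE INTO t (id, v) VALUES (?, ?)",
    "error": "INSERT INTO t (id, v) VALUES (?, ?)",
}


def import_rows(strategy, rows):
    """Load rows into a fresh in-memory table and return its final contents."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT)")
    try:
        for row in rows:
            conn.execute(STATEMENTS[strategy], row)
        return conn.execute("SELECT id, v FROM t ORDER BY id").fetchall()
    finally:
        conn.close()


# Two records that share the same primary key value.
rows = [(1, "old"), (1, "new")]

print(import_rows("replace", rows))  # [(1, 'new')]
print(import_rows("ignore", rows))   # [(1, 'old')]
try:
    import_rows("error", rows)
except sqlite3.IntegrityError as exc:
    print("import stopped:", exc)
```

With `replace` the later record wins, with `ignore` the earlier record is kept, and with `error` the second insert fails on the duplicate key.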
In the logical import mode, the performance of TiDB Lightning largely depends on the write performance of the target TiDB cluster. If the cluster hits a performance bottleneck, refer to Highly Concurrent Write Best Practices.
If the target TiDB cluster does not hit a write bottleneck, consider increasing the value of `region-concurrency` in TiDB Lightning configuration. The default value of `region-concurrency` is the number of CPU cores. The meaning of `region-concurrency` is different between the physical import mode and the logical import mode. In the logical import mode, `region-concurrency` is the write concurrency.
Example configuration:

    [lightning]
    region-concurrency = 32

In addition, adjusting configuration items such as `raftstore.store-pool-size` in the target TiDB cluster might improve the import speed.