TiDB Lightning FAQ

What is the minimum TiDB/TiKV/PD cluster version supported by Lightning?

The version of TiDB Lightning should be the same as the cluster. The earliest available version is 2.0.9, but we recommend using the latest stable version (3.0).

Does TiDB Lightning support importing multiple schemas (databases)?

Yes.

What are the privilege requirements for the target database?

TiDB Lightning requires the following privileges:

  • SELECT
  • UPDATE
  • ALTER
  • CREATE
  • DROP

If the TiDB-backend is chosen, or the target database is used to store checkpoints, it additionally requires these privileges:

  • INSERT
  • DELETE

The Importer-backend does not require these two privileges because data is ingested into TiKV directly, bypassing the entire TiDB privilege system. This is secure as long as the ports of TiKV, TiKV Importer, and TiDB Lightning are not reachable from outside the cluster.

If the checksum configuration of TiDB Lightning is set to true, the admin user privileges in the downstream TiDB cluster must also be granted to TiDB Lightning.
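
For example, a minimal sketch of granting these privileges to a dedicated import user (the user name and host pattern are placeholders for illustration):

GRANT SELECT, UPDATE, ALTER, CREATE, DROP ON *.* TO 'lightning'@'192.168.%';
-- Additionally required for the TiDB-backend, or when checkpoints are stored in the target database:
GRANT INSERT, DELETE ON *.* TO 'lightning'@'192.168.%';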

TiDB Lightning encountered an error when importing one table. Will it affect other tables? Will the process be terminated?

If only one table encounters an error, the rest are still processed normally.

How to properly restart TiDB Lightning?

Depending on the status of tikv-importer, the basic sequence for restarting TiDB Lightning is as follows:

If tikv-importer is still running:

  1. Stop tidb-lightning.
  2. Perform the intended modifications, such as fixing the source data, changing settings, replacing hardware etc.
  3. If the modifications have changed any table, remove the corresponding checkpoints as well (see the sketch after this list).
  4. Start tidb-lightning.
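
In step 3, the checkpoint for a single table can be removed using tidb-lightning-ctl (the schema and table names below are placeholders):

tidb-lightning-ctl --config conf/tidb-lightning.toml --checkpoint-remove='`schema`.`table`'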

If tikv-importer needs to be restarted:

  1. Stop tidb-lightning.
  2. Stop tikv-importer.
  3. Perform the intended modifications, such as fixing the source data, changing settings, replacing hardware etc.
  4. Start tikv-importer.
  5. Start tidb-lightning and wait until the program fails with a CHECKSUM error, if any.
    • Restarting tikv-importer destroys all engine files still being written, but tidb-lightning does not know about this. As of v3.0, the simplest way is to let tidb-lightning continue and retry.
  6. Destroy the failed tables and checkpoints (see the sketch after this list).
  7. Start tidb-lightning again.
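
The failed tables and checkpoints in step 6 can be destroyed with tidb-lightning-ctl, the same command shown in the error sections below:

tidb-lightning-ctl --config conf/tidb-lightning.toml --checkpoint-error-destroy=all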

How to ensure the integrity of the imported data?

By default, TiDB Lightning performs a checksum on the local data source and the imported tables. If there is a checksum mismatch, the process is aborted. The checksum information can be read from the log.

You could also execute the ADMIN CHECKSUM TABLE SQL command on the target table to recompute the checksum of the imported data.

ADMIN CHECKSUM TABLE `schema`.`table`;

+---------+------------+---------------------+-----------+-------------+
| Db_name | Table_name | Checksum_crc64_xor  | Total_kvs | Total_bytes |
+---------+------------+---------------------+-----------+-------------+
| schema  | table      | 5505282386844578743 |         3 |          96 |
+---------+------------+---------------------+-----------+-------------+
1 row in set (0.01 sec)

What kind of data source format is supported by Lightning?

TiDB Lightning only supports the SQL dump generated by Mydumper or CSV files stored in the local file system.

Could TiDB Lightning skip creating schema and tables?

Yes. If you have already created the tables in the target database, you can set no-schema = true in the [mydumper] section of tidb-lightning.toml. This makes TiDB Lightning skip the CREATE TABLE invocations and fetch the metadata directly from the target database. TiDB Lightning exits with an error if a table is actually missing.
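
For example:

[mydumper]
no-schema = true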

Can the Strict SQL Mode be disabled to allow importing invalid data?

Yes. By default, the sql_mode used by TiDB Lightning is "STRICT_TRANS_TABLES,NO_ENGINE_SUBSTITUTION", which disallows invalid data such as the date 1970-00-00. The mode can be changed by modifying the sql-mode setting in the [tidb] section in tidb-lightning.toml.

...
[tidb]
sql-mode = ""
...

Can one tikv-importer serve multiple tidb-lightning instances?

Yes, as long as every tidb-lightning instance operates on different tables.

How to stop the tikv-importer process?

To stop the tikv-importer process, you can choose the corresponding operation according to your deployment method.

  • For deployment using TiDB Ansible: run scripts/stop_importer.sh on the Importer server.

  • For manual deployment: if tikv-importer is running in the foreground, press Ctrl+C to exit. Otherwise, obtain the process ID using the ps aux | grep tikv-importer command and then terminate the process using the kill «pid» command.

How to stop the tidb-lightning process?

To stop the tidb-lightning process, you can choose the corresponding operation according to your deployment method.

  • For deployment using TiDB Ansible: run scripts/stop_lightning.sh on the Lightning server.

  • For manual deployment: if tidb-lightning is running in the foreground, press Ctrl+C to exit. Otherwise, obtain the process ID using the ps aux | grep tidb-lightning command and then terminate the process using the kill -2 «pid» command.
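
For example, for a manual deployment:

ps aux | grep tidb-lightning   # obtain the process ID
kill -2 «pid»                  # send SIGINT so tidb-lightning can exit gracefully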

Why does the tidb-lightning process suddenly quit while running in the background?

It is potentially caused by starting tidb-lightning incorrectly, which causes the system to send a SIGHUP signal to stop the tidb-lightning process. In this situation, tidb-lightning.log usually outputs the following log:

[2018/08/10 07:29:08.310 +08:00] [INFO] [main.go:41] ["got signal to exit"] [signal=hangup]

It is not recommended to directly use nohup on the command line to start tidb-lightning. Instead, start tidb-lightning by executing a script.
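
A minimal sketch of such a script (the binary path and configuration file name are assumptions):

#!/bin/bash
nohup ./tidb-lightning -config tidb-lightning.toml > nohup.out &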

Why is my TiDB cluster using lots of CPU resources and running very slowly after using TiDB Lightning?

If tidb-lightning abnormally exited, the cluster might be stuck in the "import mode", which is not suitable for production. You can force the cluster back to "normal mode" using the following command:

tidb-lightning-ctl --switch-mode=normal

Can TiDB Lightning be used with 1-Gigabit network card?

The TiDB Lightning toolset is best used with a 10-Gigabit network card. 1-Gigabit network cards are not recommended, especially for tikv-importer.

1-Gigabit network cards can only provide a total bandwidth of 120 MB/s, which has to be shared among all target TiKV stores. TiDB Lightning can easily saturate all bandwidth of the 1-Gigabit network and bring down the cluster because PD can no longer be contacted. To avoid this, set an upload speed limit in Importer's configuration:

[import]
# Restricts the total upload speed to TiKV to 100 MB/s or less.
upload-speed-limit = "100MB"

Why TiDB Lightning requires so much free space in the target TiKV cluster?

With the default settings of 3 replicas, the space requirement of the target TiKV cluster is six times the size of the data source. The extra multiple of 2 is a conservative estimate because the following factors are not reflected in the data source:

  • The space occupied by indices
  • Space amplification in RocksDB
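
For example, importing a 500 GB data source would require roughly 500 GB × 2 × 3 = 3 TB of free space in the target cluster.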

Can TiKV Importer be restarted while TiDB Lightning is running?

No. Importer stores some information about engines in memory. If tikv-importer is restarted, tidb-lightning will be stopped due to the lost connection. At this point, you need to destroy the failed checkpoints because the Importer-specific information is lost. You can restart Lightning afterwards.

See also How to properly restart TiDB Lightning? for the correct sequence.

How to completely destroy all intermediate data associated with TiDB Lightning?

  1. Delete the checkpoint file.

    tidb-lightning-ctl --config conf/tidb-lightning.toml --checkpoint-remove=all

    If, for some reason, you cannot run this command, try manually deleting the file /tmp/tidb_lightning_checkpoint.pb.

  2. If you are using the Local-backend, delete the sorted-kv-dir directory in the configuration. If you are using the Importer-backend, delete the entire import directory on the machine hosting tikv-importer (see the sketch after this list).

  3. Delete all tables and databases created on the TiDB cluster, if needed.
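
A sketch of step 2, assuming default locations (both paths are placeholders to be adjusted to your configuration):

rm -r /path/to/sorted-kv-dir   # Local-backend
rm -r ./data.import/           # Importer-backend (the default import directory)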

Why does TiDB Lightning report the "could not find first pair, this shouldn't happen" error?

This error possibly occurs because the number of files opened by TiDB Lightning exceeds the system limit while it reads the sorted local files. On Linux, you can use the ulimit -n command to confirm whether the value of this system limit is too small. It is recommended that you adjust the value to 1000000 (ulimit -n 1000000) for the duration of the import.
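
For example:

ulimit -n            # check the current open-file limit
ulimit -n 1000000    # raise the limit for the current shell before starting tidb-lightning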

Import speed is too slow

Normally it takes TiDB Lightning 2 minutes per thread to import a 256 MB data file. If the speed is much slower than this, something is wrong. You can check the time taken for each data file from log lines mentioning restore chunk … takes. This can also be observed from the metrics on Grafana.

There are several reasons why TiDB Lightning becomes slow:

Cause 1: region-concurrency is set too high, which causes thread contention and reduces performance.

  1. The setting can be found at the start of the log by searching for region-concurrency.
  2. If TiDB Lightning shares the same machine with other services (for example, TiKV Importer), region-concurrency must be manually set to 75% of the total number of CPU cores (see the sketch after this list).
  3. If there is a quota on CPU (for example, limited by Kubernetes settings), TiDB Lightning may not be able to detect this. In this case, region-concurrency must also be manually reduced.
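
A sketch of the setting in tidb-lightning.toml, assuming a 16-core machine shared with tikv-importer:

[lightning]
# 75% of 16 CPU cores
region-concurrency = 12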

Cause 2: The table schema is too complex.

Every additional index introduces a new KV pair for each row. If there are N indices, the actual size to be imported would be approximately (N+1) times the size of the Mydumper output. If the indices are negligible, you may first remove them from the schema, and add them back using CREATE INDEX after the import is complete (see the sketch below).
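
For instance, a removed secondary index could be re-created after the import like this (the index, table, and column names are placeholders):

CREATE INDEX idx_col ON `schema`.`table` (col);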

Cause 3: Each file is too large.

TiDB Lightning works the best when the data source is broken down into multiple files of size around 256 MB so that the data can be processed in parallel. If each file is too large, TiDB Lightning might not respond.

If the data source is CSV, and all CSV files have no fields containing newline control characters (U+000A and U+000D), you can turn on "strict format" to let TiDB Lightning automatically split the large files.

[mydumper]
strict-format = true

Cause 4: TiDB Lightning is too old.

Try the latest version! Newer versions may include speed improvements.

checksum failed: checksum mismatched remote vs local

Cause: The checksums of a table in the local data source and in the remote imported database differ. This error has several deeper causes:

  1. The table might already contain data. The old data can affect the final checksum.

  2. If the remote checksum is 0, which means nothing is imported, it is possible that the cluster is too hot and fails to take in any data.

  3. If the data is mechanically generated, ensure it respects the constraints of the table:

    • AUTO_INCREMENT columns need to be positive, and do not contain the value "0".
    • The UNIQUE and PRIMARY KEYs must have no duplicated entries.
  4. If TiDB Lightning has failed before and was not properly restarted, a checksum mismatch may happen because the data is out of sync.

Solutions:

  1. Delete the corrupted data using tidb-lightning-ctl, and restart TiDB Lightning to import the affected tables again.

    tidb-lightning-ctl --config conf/tidb-lightning.toml --checkpoint-error-destroy=all
  2. Consider using an external database to store the checkpoints (change [checkpoint] dsn) to reduce the target database's load (see the sketch after this list).

  3. If TiDB Lightning was improperly restarted, see also the "How to properly restart TiDB Lightning" section in the FAQ.
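
A sketch of storing checkpoints in an external MySQL-compatible database (the host and credentials are placeholders):

[checkpoint]
driver = "mysql"
dsn = "user:password@tcp(192.168.1.100:3306)/"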

Checkpoint for … has invalid status: (error code)

Cause: Checkpoint is enabled, and TiDB Lightning or TiKV Importer has previously abnormally exited. To prevent accidental data corruption, Lightning will not start until the error is addressed.

The error code is an integer smaller than 25, with possible values of 0, 3, 6, 9, 12, 14, 15, 17, 18, 20, and 21. The integer indicates the step where the unexpected exit occurs in the import process. The larger the integer is, the later step the exit occurs at.

Solutions:

If the error was caused by invalid data source, delete the imported data using tidb-lightning-ctl and start Lightning again.

tidb-lightning-ctl --config conf/tidb-lightning.toml --checkpoint-error-destroy=all

See the Checkpoints control section for other options.

ResourceTemporarilyUnavailable("Too many open engines …: …")

Cause: The number of concurrent engine files exceeds the limit specified by tikv-importer. This could be caused by misconfiguration. Additionally, if tidb-lightning exited abnormally, an engine file might be left at a dangling open state, which could cause this error as well.

Solutions:

  1. Increase the value of the max-open-engines setting in tikv-importer.toml. This value is typically dictated by the available memory, which can be estimated using:

    Max Memory Usage ≈ max-open-engines × write-buffer-size × max-write-buffer-number

    For example, assuming write-buffer-size = "1GB" and max-write-buffer-number = 8 (illustrative values), max-open-engines = 8 implies a peak memory usage of roughly 64 GB.

  2. Decrease the values of table-concurrency and index-concurrency so that their sum is less than max-open-engines.

  3. Restart tikv-importer to forcefully remove all engine files (./data.import/ by default). This also removes all partially imported tables, which requires Lightning to clear the outdated checkpoints:

    tidb-lightning-ctl --config conf/tidb-lightning.toml --checkpoint-error-destroy=all

cannot guess encoding for input file, please convert to UTF-8 manually

Cause: TiDB Lightning only recognizes the UTF-8 and GB-18030 encodings for the table schemas. This error is emitted if the file isn't in any of these encodings. It is also possible that the file has mixed encoding, such as containing a string in UTF-8 and another string in GB-18030, due to historical ALTER TABLE executions.

Solutions:

  1. Fix the schema so that the file is entirely in either UTF-8 or GB-18030.

  2. Manually CREATE the affected tables in the target database, and then set [mydumper] no-schema = true to skip automatic table creation.

  3. Set [mydumper] character-set = "binary" to skip the check. Note that this might introduce mojibake into the target database.
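
For example, solution 3 as it would appear in tidb-lightning.toml:

[mydumper]
character-set = "binary"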

[sql2kv] sql encode error = [types:1292]invalid time format: '{1970 1 1 …}'

Cause: A table contains a column with the timestamp type, but the time value itself does not exist. This is either because of DST changes or because the time value exceeds the supported range (Jan 1, 1970 to Jan 19, 2038).

Solutions:

  1. Ensure Lightning and the source database are using the same time zone.

    When executing Lightning directly, the time zone can be forced using the $TZ environment variable.

    # Manual deployment, and force Asia/Shanghai.
    TZ='Asia/Shanghai' bin/tidb-lightning -config tidb-lightning.toml
  2. When exporting data using Mydumper, make sure to include the --skip-tz-utc flag.

  3. Ensure the entire cluster is using the same and latest version of tzdata (version 2018i or above).

    On CentOS, run yum info tzdata to check the installed version and whether there is an update. Run yum upgrade tzdata to upgrade the package.
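
    For example:

    yum info tzdata      # check the installed version and whether an update is available
    yum upgrade tzdata   # upgrade the package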

[Error 8025: entry too large, the max entry size is 6291456]

Cause: A single row of key-value pairs generated by TiDB Lightning exceeds the limit (6291456 bytes, that is, 6 MB) set by TiDB.

Solution:

Currently, this limitation of TiDB cannot be bypassed. You can only ignore this table to ensure the successful import of the other tables.

restore table test.district failed: unknown columns in header [...]

This error usually occurs because the CSV data file does not contain a header (the first row is data rather than column names). Therefore, you need to add the following configuration to the TiDB Lightning configuration file:

[mydumper.csv]
header = false