Interaction Test on Online Workloads and ADD INDEX Operations

Test purpose

This document tests the interaction effects between online workloads and ADD INDEX operations in the OLTP scenario.

Test version, time, and place

TiDB version: v3.0.1

Time: July, 2019

Place: Beijing

Test environment

This test runs in a Kubernetes cluster deployed with 3 TiDB instances, 3 TiKV instances and 3 PD instances.

Version information

ComponentGitHash
TiDB9e4e8da3c58c65123db5f26409759fe1847529f8
TiKV4151dc8878985df191b47851d67ca21365396133
PD811ce0b9a1335d1b2a049fd97ef9e186f1c9efc1

Sysbench version:1.0.17

TiDB parameter configuration

TiDB, TiKV and PD all use the default TiDB Operator configuration.

Cluster topology

Machine IPDeployment instance
172.31.8.8Sysbench
172.31.7.69, 172.31.5.152, 172.31.11.133PD
172.31.4.172, 172.31.1.155, 172.31.9.210TiKV
172.31.7.80, 172.31.5.163, 172.31.11.123TiDB

Online workloads simulation using Sysbench

Use Sysbench to import a table with 2,000,000 rows of data into the Kubernetes cluster.

Execute the following command to import data:

sysbench oltp_common \ --threads=16 \ --rand-type=uniform \ --db-driver=mysql \ --mysql-db=sbtest \ --mysql-host=$tidb_host \ --mysql-port=$tidb_port \ --mysql-user=root \ prepare --tables=1 --table-size=2000000

Execute the following command to run the test:

sysbench $testname \ --threads=$threads \ --time=300000 \ --report-interval=15 \ --rand-type=uniform \ --rand-seed=$RANDOM \ --db-driver=mysql \ --mysql-db=sbtest \ --mysql-host=$tidb_host \ --mysql-port=$tidb_port \ --mysql-user=root \ run --tables=1 --table-size=2000000

Test plan 1: Frequently perform write operations to the target column of the ADD INDEX statement

  1. Start the oltp_read_write test.
  2. Perform at the same time with step 1: use alter table sbtest1 add index c_idx(c) to add an index.
  3. Perform at the end of step 2: when the index is added successfully, stop the oltp_read_write test.
  4. Get the duration of alter table ... add index and the average TPS and QPS of Sysbench in this period.
  5. Gradually increase the value of two parameters tidb_ddl_reorg_worker_cnt and tidb_ddl_reorg_batch_size, and then repeat step 1-4.

Test results

Test result of oltp_read_write without ADD INDEX operations

sysbench TPSsysbench QPS
350.316806

tidb_ddl_reorg_batch_size = 32

tidb_ddl_reorg_worker_cntadd_index_durations(s)sysbench TPSsysbench QPS
1402338.46776
2266330.36001
4174288.55769
8129280.65612
1690263.55273
3254229.24583
4857230.14601

add-index-load-1-b32

tidb_ddl_reorg_batch_size = 64

tidb_ddl_reorg_worker_cntadd_index_durations(s)sysbench TPSsysbench QPS
1264269.45388
2163266.25324
4105272.55430
878262.55228
1657215.54308
3242185.23715
4845189.23794

add-index-load-1-b64

tidb_ddl_reorg_batch_size = 128

tidb_ddl_reorg_worker_cntadd_index_durations(s)sysbench TPSsysbench QPS
1171289.15779
2110274.25485
479250.65011
851246.14922
1639171.13431
3235130.82629
4835120.52425

add-index-load-1-b128

tidb_ddl_reorg_batch_size = 256

tidb_ddl_reorg_worker_cntadd_index_durations(s)sysbench TPSsysbench QPS
1145283.05659
296282.25593
456236.54735
845194.23882
1639149.32893
3236113.52268
483386.21715

add-index-load-1-b256

tidb_ddl_reorg_batch_size = 512

tidb_ddl_reorg_worker_cntadd_index_durations(s)sysbench TPSsysbench QPS
1135257.85147
278252.85053
449222.74478
836145.42904
16331092190
323372.51503
483354.21318

add-index-load-1-b512

tidb_ddl_reorg_batch_size = 1024

tidb_ddl_reorg_worker_cntadd_index_durations(s)sysbench TPSsysbench QPS
1111244.34885
278228.44573
454168.83320
839123.82475
163659.61213
324293.21835
4851115.72261

add-index-load-1-b1024

tidb_ddl_reorg_batch_size = 2048

tidb_ddl_reorg_worker_cntadd_index_durations(s)sysbench TPSsysbench QPS
1918243.34855
21160209.94194
4342185.43707
81316151.03027
1679530.5679
32113026.69547
4889327.5552

add-index-load-1-b2048

tidb_ddl_reorg_batch_size = 4096

tidb_ddl_reorg_worker_cntadd_index_durations(s)sysbench TPSsysbench QPS
13042200.04001
23022203.84076
4858195.53971
83015177.13522
16837143.82875
329421142267
4818754.21416

add-index-load-1-b4096

Test conclusion

When you perform frequent write operations (this test involves UPDATE, INSERT and DELETE operations) to the target column of the ADD INDEX statement, the default ADD INDEX configuration has a significant impact on the online workload of the system. It is mainly because of the write conflicts caused by the concurrent ADD INDEX operation and column update. The performance of the system is as follows:

  • As the value of tidb_ddl_reorg_worker_cnt and tidb_ddl_reorg_batch_size parameters increase, the value of TiKV_prewrite_latch_wait_duration increases significantly, slowing down the write speed.
  • When the value of tidb_ddl_reorg_worker_cnt and tidb_ddl_reorg_batch_size are very large, you can execute the admin show ddl command to see multiple retry attempts of the DDL job, such as Write conflict, txnStartTS 410327455965380624 is stale [try again later], ErrCount:38, SnapshotVersion: 410327228136030220. In this situation, the ADD INDEX operation takes a very long time to complete.

Test plan 2: Do not perform write operations to the target column of the ADD INDEX statement (query-only)

  1. Start the oltp_read_only test.
  2. Perform at the same time with step 1: use alter table sbtest1 add index c_idx(c) to add an index.
  3. Perform at the end of step 2: when the index is added successfully, stop the oltp_read_only test.
  4. Get the duration of alter table ... add index and the average TPS and QPS of Sysbench in this period.
  5. Gradually increase the value of two parameters tidb_ddl_reorg_worker_cnt and tidb_ddl_reorg_batch_size, and then repeat step 1-4.

Test results

Test result of oltp_read_only without ADD INDEX operations

sysbench TPSsysbench QPS
550.98812.8

tidb_ddl_reorg_batch_size = 32

tidb_ddl_reorg_worker_cntadd_index_durations(s)sysbench TPSsysbench QPS
1376548.98780
2212541.58523
4135538.68549
8114536.78393
1677533.98292
3246533.48103
4846532.28074

add-index-load-2-b32

tidb_ddl_reorg_batch_size = 1024

tidb_ddl_reorg_worker_cntadd_index_durations(s)sysbench TPSsysbench QPS
191536.88316
252533.98165
440522.47947
8365107860
1633485.57704
3231467.57516
4830562.17442

add-index-load-2-b1024

tidb_ddl_reorg_batch_size = 4096

tidb_ddl_reorg_worker_cntadd_index_durations(s)sysbench TPSsysbench QPS
1103502.27823
263486.57672
452467.47516
839452.57302
1635447.27206
3230441.97057
4830440.17004

add-index-load-2-b4096

Test conclusion

When you only perform query operations to the target column of the ADD INDEX statement, the effect of ADD INDEX operations on online workloads is not obvious.

Test plan 3: The target column of the ADD INDEX statement is irrelevant to online workloads

  1. Start the oltp_read_write test.
  2. Perform at the same time with step 1: use alter table test add index pad_idx(pad) to add an index.
  3. Perform at the end of step 2: when the index is added successfully, stop the oltp_read_only test.
  4. Get the duration of alter table ... add index and the average TPS and QPS of Sysbench in this period.
  5. Gradually increase the value of two parameters tidb_ddl_reorg_worker_cnt and tidb_ddl_reorg_batch_size, and then repeat step 1-4.

Test results

Test result of oltp_read_write without ADD INDEX operations

sysbench TPSsysbench QPS
350.316806

tidb_ddl_reorg_batch_size = 32

tidb_ddl_reorg_worker_cntadd_index_durations(s)sysbench TPSsysbench QPS
1372350.46892
2207344.26700
4140343.16672
8121339.16579
16763406607
3242343.16695
4842333.46454

add-index-load-3-b32

tidb_ddl_reorg_batch_size = 1024

tidb_ddl_reorg_worker_cntadd_index_durations(s)sysbench TPSsysbench QPS
194352.46794
2503326493
4453306456
836325.56324
1632312.56294
3232300.66017
4831279.55612

add-index-load-3-b1024

tidb_ddl_reorg_batch_size = 4096

tidb_ddl_reorg_worker_cntadd_index_durations(s)sysbench TPSsysbench QPS
1116325.56324
265312.56290
450300.66017
837279.55612
1634250.45365
3232220.24924
4833214.84544

add-index-load-3-b4096

Test conclusion

When the target column of the ADD INDEX statement is irrelevant to online workloads, the effect of ADD INDEX operations on the workload is not obvious.

Summary

  • When you perform frequent write operations (including INSERT, DELETE and UPDATE operations) to the target column of the ADD INDEX statement, the default ADD INDEX configuration causes relatively frequent write conflicts, which has a great impact on online workloads. At the same time, the ADD INDEX operation takes a long time to complete due to continuous retry attempts. In this test, you can modify the product of tidb_ddl_reorg_worker_cnt and tidb_ddl_reorg_batch_size to 1/32 of the default value. For example, you can set tidb_ddl_reorg_worker_cnt to 4 and tidb_ddl_reorg_batch_size to 256 for better performance.
  • When only performing query operations to the target column of the ADD INDEX statement or the target column is not directly related to online workloads, you can use the default ADD INDEX configuration.