Important

You are viewing the archived documentation of TiDB, which no longer receives updates. It is recommended that you use the latest LTS version of the TiDB database.

TiDB 7.3.0 Release Notes

Release date: August 14, 2023

TiDB version: 7.3.0

TiDB 7.3.0 introduces the following major features. It also includes a series of enhancements to query stability in TiDB server and TiFlash, described in the Feature details section. Because these enhancements are miscellaneous and not user-facing, they are not included in the following table.

Category	Feature	Description
Scalability and Performance	TiDB Lightning supports Partitioned Raft KV (experimental)	TiDB Lightning now supports the new Partitioned Raft KV architecture, as part of the near-term GA of the architecture.
Reliability and Availability	Add automatic conflict detection and resolution on data imports	The TiDB Lightning Physical Import Mode supports a new version of conflict detection, which implements the semantics of replacing (`replace`) or ignoring (`ignore`) conflict data when encountering conflicts. It automatically handles conflict data for you while improving the performance of conflict resolution.
Reliability and Availability	Manual management of runaway queries (experimental)	Queries might take longer than you expect. With the new watch list of resource groups, you can now manage queries more effectively and either deprioritize or kill them. Allowing operators to mark target queries by exact SQL text, SQL digest, or plan digest and deal with the queries at a resource group level, this feature gives you much more control over the potential impact of unexpected large queries on a cluster.
SQL	Enhance operator control over query stability by adding more optimizer hints to the query planner	Added hints: `NO_INDEX_JOIN()`, `NO_MERGE_JOIN()`, `NO_INDEX_MERGE_JOIN()`, `NO_HASH_JOIN()`, `NO_INDEX_HASH_JOIN()`
DB Operations and Observability	Show the progress of statistics collection tasks	Support viewing the progress of `ANALYZE` tasks using the `SHOW ANALYZE STATUS` statement or through the `mysql.analyze_jobs` system table.

Feature details

Performance

TiFlash supports the replica selection strategy #44106 @XuHuaiyu
Before v7.3.0, TiFlash used replicas from all its nodes for data scanning and MPP calculations to maximize performance. Starting from v7.3.0, TiFlash introduces a replica selection strategy that you can configure using the `tiflash_replica_read` system variable. This strategy supports selecting specific replicas based on the zone attributes of nodes and scheduling specific nodes for data scanning and MPP calculations.
For a cluster that is deployed across multiple data centers, each with complete TiFlash data replicas, you can configure this strategy to select TiFlash replicas only from the current data center. Data scanning and MPP calculations are then performed only on TiFlash nodes in the current data center, which avoids excessive network data transmission across data centers.
For more information, see documentation.
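The zone-based selection described above can be pictured with a small Python model. This is only a sketch of the idea, not TiFlash's scheduler: the function `select_tiflash_nodes`, the `addr` and `zone` fields, and the node addresses are all hypothetical.

```python
def select_tiflash_nodes(nodes, current_zone, same_zone_only=True):
    """Return the addresses of TiFlash nodes a query may read from.

    nodes: list of dicts with hypothetical "addr" and "zone" fields.
    """
    if not same_zone_only:
        # Behavior before v7.3.0 as described above: use replicas on all nodes.
        return [n["addr"] for n in nodes]
    # Zone-aware selection: keep only nodes in the current zone (data center).
    return [n["addr"] for n in nodes if n["zone"] == current_zone]


nodes = [
    {"addr": "tiflash-0:3930", "zone": "dc-east"},
    {"addr": "tiflash-1:3930", "zone": "dc-west"},
    {"addr": "tiflash-2:3930", "zone": "dc-east"},
]
local = select_tiflash_nodes(nodes, "dc-east")
everything = select_tiflash_nodes(nodes, "dc-east", same_zone_only=False)
```

With the zone-aware option, the query never touches `tiflash-1`, so no scan data crosses the data center boundary.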
TiFlash supports Runtime Filter within nodes #40220 @elsa0520
Runtime Filter is a dynamic predicate generated during the query planning phase. In the process of table joining, these dynamic predicates can effectively filter out rows that do not meet the join conditions, reducing scan time and network overhead, and improving the efficiency of table joining. Starting from v7.3.0, TiFlash supports Runtime Filter within nodes, improving the overall performance of analytical queries. In some TPC-DS workloads, the performance can be improved by 10% to 50%.
This feature is disabled by default in v7.3.0. To enable this feature, set the system variable tidb_runtime_filter_mode to LOCAL.
For more information, see documentation.
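The filtering principle behind Runtime Filter can be illustrated with a toy Python join. This is a minimal sketch under stated assumptions, not TiFlash code: the function `join_with_runtime_filter` and the in-memory row lists are hypothetical, and a real engine applies the filter inside the table scan rather than on a Python list.

```python
def join_with_runtime_filter(build_rows, probe_rows, key):
    """Toy inner join that pre-filters the probe side with a runtime filter."""
    # Index the (small) build side by join key.
    build_index = {}
    for row in build_rows:
        build_index.setdefault(row[key], []).append(row)

    # The "runtime filter": a predicate whose values come from the build side.
    # Probe rows whose key cannot match are dropped before the join runs.
    build_keys = set(build_index)
    filtered_probe = [row for row in probe_rows if row[key] in build_keys]

    joined = [
        {**b, **p}
        for p in filtered_probe
        for b in build_index[p[key]]
    ]
    skipped = len(probe_rows) - len(filtered_probe)
    return joined, skipped


dim = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
fact = [
    {"id": 1, "v": 10},
    {"id": 3, "v": 30},
    {"id": 2, "v": 20},
    {"id": 4, "v": 40},
]
joined, skipped = join_with_runtime_filter(dim, fact, "id")
```

In this example, half of the probe rows are discarded before they reach the join, which is where the savings in scan time and network overhead come from.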
TiFlash supports executing common table expressions (CTEs) (experimental) #43333 @winoros
Before v7.3.0, the MPP engine of TiFlash could not execute queries that contain CTEs by default. To achieve the best execution performance within the MPP framework, you needed to use the system variable `tidb_opt_force_inline_cte` to force CTEs to be inlined.
Starting from v7.3.0, TiFlash's MPP engine supports executing queries with CTEs without inlining them, allowing for optimal query execution within the MPP framework. In TPC-DS benchmark tests, compared with inlining CTEs, this feature has shown a 20% improvement in overall execution speed for queries containing CTEs.
This feature is experimental and is disabled by default. It is controlled by the system variable tidb_opt_enable_mpp_shared_cte_execution.

Reliability

Add new optimizer hints #45520 @qw4990
In v7.3.0, TiDB introduces several new optimizer hints to control the join methods between tables, including:
- NO_MERGE_JOIN() selects join methods other than merge join.
- NO_INDEX_JOIN() selects join methods other than index nested loop join.
- NO_INDEX_MERGE_JOIN() selects join methods other than index nested loop merge join.
- NO_HASH_JOIN() selects join methods other than hash join.
- NO_INDEX_HASH_JOIN() selects join methods other than index nested loop hash join.
For more information, see documentation.
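As a rough illustration of where such a hint goes in a statement, the following Python helper assembles a hinted query string. The helper `with_join_hint` and the table names `t1` and `t2` are hypothetical; the hint is placed in a `/*+ ... */` comment directly after `SELECT`, following the general optimizer hint syntax.

```python
# The five hints introduced in v7.3.0, as listed above.
NEW_HINTS = {
    "NO_MERGE_JOIN",
    "NO_INDEX_JOIN",
    "NO_INDEX_MERGE_JOIN",
    "NO_HASH_JOIN",
    "NO_INDEX_HASH_JOIN",
}


def with_join_hint(hint, tables, select_body):
    """Build a SELECT statement that carries one of the new join hints."""
    if hint not in NEW_HINTS:
        raise ValueError(f"unknown hint: {hint}")
    # The hint comment must directly follow the SELECT keyword.
    return f"SELECT /*+ {hint}({', '.join(tables)}) */ {select_body}"


sql = with_join_hint(
    "NO_HASH_JOIN", ["t1", "t2"], "* FROM t1, t2 WHERE t1.id = t2.id"
)
```

The resulting string asks the optimizer to pick any join method other than hash join for `t1` and `t2`.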
Manually mark queries that use more resources than expected (experimental) #43691 @Connor1996 @CabinfeverB
In v7.2.0, TiDB automatically manages queries that use more resources than expected (runaway queries) by downgrading or canceling them. In practice, rules alone cannot cover all cases. Therefore, TiDB v7.3.0 introduces the ability to manually mark runaway queries. With the new command `QUERY WATCH`, you can mark runaway queries based on SQL text, SQL digest, or execution plan, and the marked runaway queries can be downgraded or canceled.
This feature provides an effective way to intervene when sudden performance issues occur in the database. For performance issues caused by queries, it can quickly alleviate their impact on overall performance before the root cause is identified, thereby improving system service quality.
For more information, see documentation.

SQL

List and List COLUMNS partitioned tables support default partitions #20679 @mjonss @bb7133
Before v7.3.0, when you use the INSERT statement to insert data into a List or List COLUMNS partitioned table, the data needs to meet the specified partitioning conditions of the table. If the data to be inserted does not meet any of these conditions, either the execution of the statement will fail or the non-compliant data will be ignored.
Starting from v7.3.0, List and List COLUMNS partitioned tables support default partitions. After a default partition is created, if the data to be inserted does not meet any partitioning condition, it will be written to the default partition. This feature improves the usability of List and List COLUMNS partitioning, avoiding the execution failure of the INSERT statement or data being ignored due to data that does not meet partitioning conditions.
Note that this feature is a TiDB extension to MySQL syntax. For a partitioned table with a default partition, the data in the table cannot be directly replicated to MySQL.
For more information, see documentation.
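The routing rule for a default partition can be modeled in a few lines of Python. This is a simplified sketch, not TiDB's partition pruning code: the function `route_to_partition` and the partition names are hypothetical, and it models only the failure case of the pre-v7.3.0 behavior.

```python
def route_to_partition(value, partitions, default=None):
    """Return the name of the List partition that accepts `value`.

    partitions: mapping of partition name -> set of listed values.
    default: name of the default partition, or None if there is none.
    """
    for name, allowed in partitions.items():
        if value in allowed:
            return name
    if default is not None:
        # v7.3.0 behavior: unmatched values go to the default partition.
        return default
    # Without a default partition, the insert cannot be placed anywhere.
    raise ValueError(f"no partition found for value {value!r}")


partitions = {"p0": {1, 2, 3}, "p1": {4, 5, 6}}
matched = route_to_partition(5, partitions)
fallback = route_to_partition(42, partitions, default="p_default")
```

Here the value `42` matches neither `p0` nor `p1`, so it lands in the default partition instead of causing an error.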

Observability

Show the progress of collecting statistics #44033 @hawkingrei
Collecting statistics for large tables often takes a long time. In previous versions, you could not see the progress of collecting statistics, and therefore could not predict the completion time. TiDB v7.3.0 introduces a feature to show this progress. You can view the overall workload, current progress, and estimated completion time for each subtask using the system table `mysql.analyze_jobs` or the `SHOW ANALYZE STATUS` statement. In scenarios such as large-scale data import and SQL performance optimization, this feature helps you track overall task progress.
For more information, see documentation.
Plan Replayer supports exporting historical statistics #45038 @time-and-fate
Starting from v7.3.0, with the newly added `dump with stats as of timestamp` clause, you can use Plan Replayer to export the statistics of specified SQL-related objects at a specific point in time. During the diagnosis of execution plan issues, accurately capturing historical statistics can help analyze more precisely how the execution plan was generated at the time when the issue occurred. This helps identify the root cause of the issue and greatly improves efficiency in diagnosing execution plan issues.
For more information, see documentation.

Data migration

TiDB Lightning introduces a new version of conflict data detection and handling strategy #41629 @lance6716
In previous versions, TiDB Lightning used different conflict detection and handling methods for Logical Import Mode and Physical Import Mode, which were complex to configure and hard to understand. In addition, Physical Import Mode could not handle conflicts using the `replace` or `ignore` strategy. Starting from v7.3.0, TiDB Lightning introduces a unified conflict detection and handling strategy for both Logical Import Mode and Physical Import Mode. When conflicts occur, you can choose to report an error (`error`), replace the conflicting data (`replace`), or ignore it (`ignore`). You can also limit the number of conflicting records, so that the task is interrupted and terminated after it processes a specified number of them. Furthermore, the system can record conflicting data for troubleshooting.
For import data with many conflicts, it is recommended to use the new version of the conflict detection and handling strategy for better performance. In the lab environment, the new strategy can make conflict detection and handling up to three times faster than the old version. This performance value is for reference only. The actual performance might vary depending on your configuration, table structure, and the percentage of conflicting data. Note that the new version and the old version of the conflict strategy cannot be used at the same time. The old conflict detection and handling strategy will be deprecated in the future.
For more information, see documentation.
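The semantics of the three strategies can be sketched with a small Python model that imports key-value rows into a dictionary. This is an illustration only, not TiDB Lightning's implementation: the function `import_rows` and the exception `ConflictError` are hypothetical, and two details are assumptions made for the sketch (the row that loses a conflict is the one recorded, and the import stops once the number of recorded conflicts exceeds the threshold).

```python
class ConflictError(Exception):
    """Raised when the import must stop because of conflicting rows."""


def import_rows(rows, strategy, threshold=None):
    """Import (key, value) rows, resolving duplicate keys by `strategy`."""
    if strategy not in ("error", "replace", "ignore"):
        raise ValueError(f"unsupported strategy: {strategy!r}")
    table = {}
    conflicts = []  # rows that lost a conflict (assumption, see lead-in)
    for key, value in rows:
        if key not in table:
            table[key] = value
            continue
        if strategy == "error":
            # Terminate the import as soon as a conflict is detected.
            raise ConflictError(f"duplicate key {key!r}")
        if strategy == "replace":
            # New data is retained, old data is overwritten.
            conflicts.append((key, table[key]))
            table[key] = value
        else:
            # "ignore": old data is retained, new data is dropped.
            conflicts.append((key, value))
        if threshold is not None and len(conflicts) > threshold:
            raise ConflictError("conflict threshold exceeded")
    return table, conflicts


rows = [(1, "a"), (2, "b"), (1, "c")]
replaced, replaced_conflicts = import_rows(rows, "replace")
ignored, ignored_conflicts = import_rows(rows, "ignore")
```

The same input yields different final tables depending on the strategy, while the recorded conflicts remain available for troubleshooting.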
TiDB Lightning supports Partitioned Raft KV (experimental) #14916 @GMHDBJD
TiDB Lightning now supports Partitioned Raft KV. This feature helps improve the data import performance of TiDB Lightning.
TiDB Lightning introduces a new parameter `enable-diagnose-logs` to enhance troubleshooting by printing more diagnostic logs #45497 @D3Hunter
By default, this feature is disabled and TiDB Lightning only prints logs containing lightning/main. When enabled, TiDB Lightning prints logs for all packages (including client-go and tidb) to help diagnose issues related to client-go and tidb.
For more information, see documentation.

Compatibility changes

Note

This section provides compatibility changes you need to know when you upgrade from v7.2.0 to the current version (v7.3.0). If you are upgrading from v7.1.0 or earlier versions to the current version, you might also need to check the compatibility changes introduced in intermediate versions.

Behavior changes

Backup & Restore (BR)
- BR adds an empty cluster check before performing a full data restoration. By default, restoring data to a non-empty cluster is not allowed. If you want to force the restoration, you can use the --filter option to specify the corresponding table name to restore data to.
TiDB Lightning
- tikv-importer.on-duplicate is deprecated and replaced by conflict.strategy.
- The max-error parameter, which controls the maximum number of non-fatal errors that TiDB Lightning can tolerate before stopping the migration task, no longer limits import data conflicts. The conflict.threshold parameter now controls the maximum number of conflicting records that can be tolerated.
TiCDC
- When Kafka sink uses Avro protocol, if the force-replicate parameter is set to true, TiCDC reports an error when creating a changefeed.
- Because the delete-only-output-handle-key-columns and force-replicate parameters are incompatible, TiCDC reports an error when you create a changefeed with both parameters enabled.
- When the output protocol is Open Protocol, the UPDATE events only output the changed columns.

System variables

Variable name	Change type	Description
`tidb_opt_enable_mpp_shared_cte_execution`	Modified	This system variable takes effect starting from v7.3.0. It controls whether non-recursive Common Table Expressions (CTEs) can be executed in TiFlash MPP.
`tidb_allow_tiflash_cop`	Newly added	This system variable is used to select the protocol for generating execution plans when TiDB pushes computation tasks down to TiFlash.
`tidb_lock_unchanged_keys`	Newly added	This variable is used to control in certain scenarios whether to lock the keys that are involved but not modified in a transaction.
`tidb_opt_enable_non_eval_scalar_subquery`	Newly added	Controls whether the `EXPLAIN` statement disables the execution of constant subqueries that can be expanded at the optimization stage.
`tidb_skip_missing_partition_stats`	Newly added	This variable controls the generation of GlobalStats when partition statistics are missing.
`tiflash_replica_read`	Newly added	Controls the strategy for selecting TiFlash replicas when a query requires the TiFlash engine.

Configuration file parameters

Configuration file	Configuration parameter	Change type	Description
TiDB	`enable-32bits-connection-id`	Newly added	Controls whether to enable the 32-bit connection ID feature.
TiDB	`in-mem-slow-query-recent-num`	Newly added	Controls the number of recently used slow queries that are cached in memory.
TiDB	`in-mem-slow-query-topn-num`	Newly added	Controls the number of slowest queries that are cached in memory.
TiKV	`coprocessor.region-bucket-size`	Modified	Changes the default value from `96MiB` to `50MiB`.
TiKV	`raft-engine.format-version`	Modified	When using Partitioned Raft KV (`storage.engine="partitioned-raft-kv"`), Ribbon filter is used. Therefore, TiKV changes the default value from `2` to `5`.
TiKV	`raftdb.max-total-wal-size`	Modified	When using Partitioned Raft KV (`storage.engine="partitioned-raft-kv"`), TiKV skips writing WAL. Therefore, TiKV changes the default value from `"4GB"` to `1`, meaning that WAL is disabled.
TiKV	`rocksdb.[defaultcf\|writecf\|lockcf].compaction-guard-min-output-file-size`	Modified	Changes the default value from `"1MB"` to `"8MB"` to resolve the issue that compaction speed cannot keep up with the write speed during large data writes.
TiKV	`rocksdb.[defaultcf\|writecf\|lockcf].format-version`	Modified	When using Partitioned Raft KV (`storage.engine="partitioned-raft-kv"`), Ribbon filter is used. Therefore, TiKV changes the default value from `2` to `5`.
TiKV	`rocksdb.lockcf.write-buffer-size`	Modified	When using Partitioned Raft KV (`storage.engine="partitioned-raft-kv"`), to speed up compaction on lockcf, TiKV changes the default value from `"32MB"` to `"4MB"`.
TiKV	`rocksdb.max-total-wal-size`	Modified	When using Partitioned Raft KV (`storage.engine="partitioned-raft-kv"`), TiKV skips writing WAL. Therefore, TiKV changes the default value from `"4GB"` to `1`, meaning that WAL is disabled.
TiKV	`rocksdb.stats-dump-period`	Modified	When using Partitioned Raft KV (`storage.engine="partitioned-raft-kv"`), to disable redundant log printing, changes the default value from `"10m"` to `"0"`.
TiKV	`rocksdb.write-buffer-limit`	Modified	To reduce the memory overhead of memtables, when `storage.engine="raft-kv"`, TiKV changes the default value from 25% of the memory of the machine to `0`, which means no limit. When using Partitioned Raft KV (`storage.engine="partitioned-raft-kv"`), TiKV changes the default value from 25% to 20% of the memory of the machine.
TiKV	`storage.block-cache.capacity`	Modified	When using Partitioned Raft KV (`storage.engine="partitioned-raft-kv"`), to compensate for the memory overhead of memtables, TiKV changes the default value from 45% to 30% of the size of total system memory.
TiFlash	`storage.format_version`	Modified	Introduces a new DTFile format `format_version = 5` to reduce the number of physical files by merging smaller files. Note that this format is experimental and not enabled by default.
TiDB Lightning	`tikv-importer.incremental-import`	Deleted	TiDB Lightning parallel import parameter. Because it could easily be mistaken as an incremental import parameter, this parameter is now renamed to `tikv-importer.parallel-import`. If a user passes in the old parameter name, it will be automatically converted to the new one.
TiDB Lightning	`tikv-importer.on-duplicate`	Deprecated	Controls the action to take when trying to insert a conflicting record in Logical Import Mode. Starting from v7.3.0, this parameter is replaced by `conflict.strategy`.
TiDB Lightning	`conflict.max-record-rows`	Newly added	Part of the new strategy for handling conflicting data. It controls the maximum number of rows in the `conflict_records` table. The default value is 100.
TiDB Lightning	`conflict.strategy`	Newly added	Part of the new strategy for handling conflicting data. It includes the following options: `""` (TiDB Lightning does not detect or process conflicting data), `error` (the import terminates and reports an error if a primary or unique key conflict is detected in the imported data), `replace` (when data with conflicting primary or unique keys is encountered, the new data is retained and the old data is overwritten), and `ignore` (when data with conflicting primary or unique keys is encountered, the old data is retained and the new data is ignored). The default value is `""`, that is, TiDB Lightning does not detect or process conflicting data.
TiDB Lightning	`conflict.threshold`	Newly added	Controls the upper limit of the conflicting data. When `conflict.strategy="error"`, the default value is `0`. When `conflict.strategy="replace"` or `conflict.strategy="ignore"`, you can set it to maxint.
TiDB Lightning	`enable-diagnose-logs`	Newly added	Controls whether to enable the diagnostic logs. The default value is `false`, that is, only the logs related to the import are output, and the logs of other dependent components are not output. When you set it to `true`, logs from both the import process and other dependent components are output, and gRPC debugging is enabled, which can be used for diagnosis.
TiDB Lightning	`tikv-importer.parallel-import`	Newly added	TiDB Lightning parallel import parameter. It replaces the existing `tikv-importer.incremental-import` parameter, which could be mistaken as an incremental import parameter and misused.
BR	`azblob.encryption-scope`	Newly added	BR provides encryption scope support for Azure Blob Storage.
BR	`azblob.encryption-key`	Newly added	BR provides encryption key support for Azure Blob Storage.
TiCDC	`large-message-handle-option`	Newly added	Empty by default, which means that when the message size exceeds the limit of Kafka topic, the changefeed fails. When this configuration is set to `"handle-key-only"`, if the message exceeds the size limit, only the handle key will be sent to reduce the message size; if the reduced message still exceeds the limit, then the changefeed fails.
TiCDC	`sink.csv.binary-encoding-method`	Newly added	The encoding method of binary data, which can be `'base64'` or `'hex'`. The default value is `'base64'`.
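To show what the two values of `sink.csv.binary-encoding-method` in the table above mean for a binary column value, the following Python sketch encodes the same bytes both ways. The helper `encode_binary` is hypothetical, and the lowercase hex output is an artifact of Python's `bytes.hex()`, not a statement about TiCDC's exact output format.

```python
import base64


def encode_binary(data, method="base64"):
    """Encode raw bytes as text using 'base64' (the default) or 'hex'."""
    if method == "base64":
        return base64.b64encode(data).decode("ascii")
    if method == "hex":
        return data.hex()
    raise ValueError(f"unsupported encoding method: {method!r}")


raw = b"\x01\xab\xff"
as_base64 = encode_binary(raw)         # "Aav/"
as_hex = encode_binary(raw, "hex")     # "01abff"
```

Base64 is more compact (4 characters per 3 bytes), while hex uses 2 characters per byte but is easier to read and compare by eye.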

System tables

Add a new system table `mysql.tidb_timers` to store the metadata of internal timers.

Deprecated features

TiDB
- The Fast Analyze feature (experimental) for statistics will be deprecated in v7.5.0.
- The incremental collection feature for statistics will be deprecated in v7.5.0.

Improvements

TiDB
- Introduce a new system variable tidb_opt_enable_non_eval_scalar_subquery to control whether the EXPLAIN statement executes subqueries in advance during the optimization phase #22076 @winoros
- When Global Kill is enabled, you can terminate the current session by pressing Control+C #8854 @pingyu
- Support the IS_FREE_LOCK() and IS_USED_LOCK() locking functions #44493 @dveeden
- Optimize the performance of reading the dumped chunks from disk #45125 @YangKeao
- Optimize the overestimation issue of the inner table of Index Join by using Optimizer Fix Controls #44855 @time-and-fate
TiKV
- Add the Max gap of safe-ts and Min safe ts region metrics and introduce the tikv-ctl get-region-read-progress command to better observe and diagnose the status of resolved-ts and safe-ts #15082 @ekexium
PD
- Support blocking the Swagger API by default when the Swagger server is not enabled #6786 @bufferflies
- Improve the high availability of etcd #6554 #6442 @lhy1024
- Reduce the memory consumption of GetRegions requests #6835 @lhy1024
TiFlash
- Support a new DTFile format version storage.format_version = 5 to reduce the number of physical files (experimental) #7595 @hongyunyan
Tools
- Backup & Restore (BR)
  - When backing up data to Azure Blob Storage using BR, you can specify either an encryption scope or an encryption key for server-side encryption #45025 @Leavrth
- TiCDC
  - Optimize the message size of the Open Protocol output to make it include only the updated column values when sending UPDATE events #9336 @3AceShowHand
  - Storage Sink now supports hexadecimal encoding for HEX formatted data, making it compatible with AWS DMS format specifications #9373 @CharlesCheung96
  - Kafka Sink supports sending only handle key data when the message is too large, reducing the size of the message #9382 @3AceShowHand
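The Kafka Sink fallback for oversized messages, controlled by `large-message-handle-option`, can be sketched as follows. This is a simplified Python model, not TiCDC code: the function `encode_for_kafka`, the JSON encoding, and the column names are assumptions made for illustration.

```python
import json


def encode_for_kafka(row, handle_key_columns, max_message_bytes, option=""):
    """Encode a row change, falling back to handle key columns if too large."""
    full = json.dumps(row, sort_keys=True).encode("utf-8")
    if len(full) <= max_message_bytes:
        return full
    if option == "handle-key-only":
        # Send only the handle key columns to shrink the message.
        reduced_row = {col: row[col] for col in handle_key_columns}
        reduced = json.dumps(reduced_row, sort_keys=True).encode("utf-8")
        if len(reduced) <= max_message_bytes:
            return reduced
    # Empty option, or the reduced message is still too large: fail.
    raise ValueError("message exceeds the size limit")


row = {"id": 7, "payload": "x" * 100}
message = encode_for_kafka(row, ["id"], 50, option="handle-key-only")
```

In this model, a consumer that receives a handle-key-only message has just the key and must obtain the remaining columns by other means.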

Bug fixes

TiDB
- Fix the issue that when the MySQL Cursor Fetch protocol is used, the memory consumption of result sets might exceed the tidb_mem_quota_query limit and cause TiDB OOM. After the fix, TiDB automatically writes result sets to the disk to release memory #43233 @YangKeao
- Fix the TiDB panic issue caused by data race #45561 @genliqi
- Fix the hang-up issue that occurs when queries with indexMerge are killed #45279 @xzhangxian1008
- Fix the issue that query results in MPP mode are incorrect when tidb_enable_parallel_apply is enabled #45299 @windtalker
- Fix the issue that resolve lock might hang when there is a sudden change in PD time #44822 @zyguan
- Fix the issue that the GC Resolve Locks step might miss some pessimistic locks #45134 @MyonKeminta
- Fix the issue that the query with ORDER BY returns incorrect results in dynamic pruning mode #45007 @Defined2014
- Fix the issue that AUTO_INCREMENT can be specified on the same column with the DEFAULT column value #45136 @Defined2014
- Fix the issue that querying the system table INFORMATION_SCHEMA.TIKV_REGION_STATUS returns incorrect results in some cases #45531 @Defined2014
- Fix the issue of incorrect partition table pruning in some cases #42273 @jiyfhust
- Fix the issue that global indexes are not cleared when truncating partition of a partitioned table #42435 @L-maple
- Fix the issue that other TiDB nodes do not take over TTL tasks after failures in one TiDB node #45022 @lcwangchao
- Fix the memory leak issue when TTL is running #45510 @lcwangchao
- Fix the issue of inaccurate error messages when inserting data into partitioned tables #44966 @lilinghai
- Fix the read permission issue on the INFORMATION_SCHEMA.TIFLASH_REPLICA table #7795 @Lloyd-Pottiger
- Fix the issue that an error occurs when using a wrong partition table name #44967 @River2000i
- Fix the issue that creating indexes gets stuck when tidb_enable_dist_task is enabled in some cases #44440 @tangenta
- Fix the duplicate entry error that occurs when restoring a table with AUTO_ID_CACHE=1 using BR #44716 @tiancaiamao
- Fix the issue that the time consumed for executing TRUNCATE TABLE is inconsistent with the task execution time shown in ADMIN SHOW DDL JOBS #44785 @tangenta
- Fix the issue that upgrading TiDB gets stuck when reading metadata takes longer than one DDL lease #45176 @zimulala
- Fix the issue that the query result of the SELECT CAST(n AS CHAR) statement is incorrect when n in the statement is a negative number #44786 @xhebox
- Fix the issue that queries might return incorrect results when tidb_opt_agg_push_down is enabled #44795 @AilinKid
- Fix the issue of wrong results that occurs when a query with current_date() uses plan cache #45086 @qw4990
TiKV
- Fix the issue that reading data during GC might cause TiKV panic in some rare cases #15109 @MyonKeminta
PD
- Fix the issue that restarting PD might cause the default resource group to be reinitialized #6787 @glorv
- Fix the issue that when etcd is already started but the client has not yet connected to it, calling the client might cause PD to panic #6860 @HuSharp
- Fix the issue that the health-check output of a Region is inconsistent with the Region information returned by querying the Region ID #6560 @JmPotato
- Fix the issue that failed learner peers in unsafe recovery are ignored in auto-detect mode #6690 @v01dstar
- Fix the issue that Placement Rules select TiFlash learners that do not meet the rules #6662 @rleungx
- Fix the issue that unhealthy peers cannot be removed when rule checker selects peers #6559 @nolouch
TiFlash
- Fix the issue that TiFlash cannot replicate partitioned tables successfully due to deadlocks #7758 @hongyunyan
- Fix the issue that the INFORMATION_SCHEMA.TIFLASH_REPLICA system table contains tables that users do not have privileges to access #7795 @Lloyd-Pottiger
- Fix the issue that when there are multiple HashAgg operators within the same MPP task, the compilation of the MPP task might take an excessively long time, severely affecting query performance #7810 @SeaRise
Tools
- TiCDC
  - Fix the issue that changefeeds would fail due to the temporary unavailability of PD #9294 @asddongmen
  - Fix the data inconsistency issue that might occur when some TiCDC nodes are isolated from the network #9344 @CharlesCheung96
  - Fix the issue that when Kafka Sink encounters errors it might indefinitely block changefeed progress #9309 @hicqu
  - Fix the panic issue that might occur when the TiCDC node status changes #9354 @sdojjy
  - Fix the encoding error for the default ENUM values #9259 @3AceShowHand
- TiDB Lightning
  - Fix the issue that executing checksum after TiDB Lightning completes import might get SSL errors #45462 @D3Hunter
  - Fix the issue that in Logical Import Mode, deleting tables downstream during import might cause TiDB Lightning metadata not to be updated in time #44614 @dsdashun

Contributors

We would like to thank the following contributors from the TiDB community: