TiDB 3.0 GA Release Notes
Release date: June 28, 2019
TiDB version: 3.0.0
TiDB Ansible version: 3.0.0
Overview
On June 28, 2019, TiDB 3.0 GA is released. The corresponding TiDB Ansible version is 3.0.0. Compared with TiDB 2.1, this release has greatly improved in the following aspects:
- Stability. TiDB 3.0 has demonstrated long-term stability for large-scale clusters with up to 150+ nodes and 300+ TB of storage.
- Usability. TiDB 3.0 has multi-facet improvements in usability, including standardized slow query logs, well-developed log file specification, and new features such as EXPLAIN ANALYZE and SQL Trace to save operation costs for users.
- Performance. The performance of TiDB 3.0 is 4.5 times greater than TiDB 2.1 in TPC-C benchmarks, and over 1.5 times in Sysbench benchmarks. Thanks to the support for Views, TPC-H 50G Q15 can now run normally.
- New features including Window Functions, Views (Experimental), partitioned tables, the plugin framework, pessimistic locking (Experimental), and SQL Plan Management.
TiDB
- New Features
- Support Window Functions; compatible with all window functions in MySQL 8.0, including NTILE, LEAD, LAG, PERCENT_RANK, NTH_VALUE, CUME_DIST, FIRST_VALUE, LAST_VALUE, RANK, DENSE_RANK, and ROW_NUMBER
- Support Views (Experimental)
- Improve Table Partition
- Support Range Partition
- Support Hash Partition
- Add the plugin framework, supporting plugins such as IP Whitelist (Enterprise) and Audit Log (Enterprise)
- Support the SQL Plan Management function to create SQL execution plan binding to ensure query stability (Experimental)
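To make the ranking window functions listed above concrete, the following is a minimal Python sketch of the standard SQL semantics of ROW_NUMBER, RANK, and DENSE_RANK over a partition ordered in descending order. It illustrates the semantics only and is not TiDB's implementation; the function name `rank_rows` and the sample column names are hypothetical.

```python
def rank_rows(rows, partition_key, order_key):
    """Annotate rows with row_number, rank, and dense_rank.

    Mirrors the SQL semantics of
    OVER (PARTITION BY partition_key ORDER BY order_key DESC).
    """
    # Group rows by partition, keeping first-seen partition order.
    partitions = {}
    for row in rows:
        partitions.setdefault(row[partition_key], []).append(row)

    out = []
    for part in partitions.values():
        ordered = sorted(part, key=lambda r: r[order_key], reverse=True)
        rank = dense = 0
        prev = object()  # sentinel that never equals a real value
        for n, row in enumerate(ordered, start=1):
            if row[order_key] != prev:
                rank = n      # RANK jumps to the current position after ties
                dense += 1    # DENSE_RANK increases by one per distinct value
                prev = row[order_key]
            out.append({**row, "row_number": n, "rank": rank, "dense_rank": dense})
    return out


# Hypothetical sample data: two rows tie on salary in department "a".
employees = [
    {"dept": "a", "name": "x", "salary": 300},
    {"dept": "a", "name": "y", "salary": 300},
    {"dept": "a", "name": "z", "salary": 200},
    {"dept": "b", "name": "w", "salary": 100},
]
ranked = rank_rows(employees, "dept", "salary")
```

Tied rows share a RANK and a DENSE_RANK but still receive distinct ROW_NUMBER values; RANK leaves a gap after the tie while DENSE_RANK does not.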
- SQL Optimizer
- Optimize the NOT EXISTS subquery and convert it to Anti Semi Join to improve performance
- Optimize the constant propagation on the Outer Join, and add the optimization rule of Outer Join elimination to reduce non-effective computations and improve performance
- Optimize the IN subquery to execute Inner Join after aggregation to improve performance
- Optimize Index Join to adapt to more scenarios
- Improve the Partition Pruning optimization rule of Range Partition
- Optimize the query logic for _tidb_rowid to avoid full table scan and improve performance
- Match more prefix columns of the indexes when extracting access conditions of composite indexes if there are relevant columns in the filter to improve performance
- Improve the accuracy of cost estimates by using order correlation between columns
- Optimize Join Order based on the greedy strategy and the dynamic programming algorithm to speed up the join operation of multiple tables
- Support Skyline Pruning, with some rules to prevent the execution plan from relying too heavily on statistics to improve query stability
- Improve the accuracy of row count estimation for single-column indexes with NULL values
- Support FAST ANALYZE that randomly samples in each Region to avoid full table scan and improve the performance of statistics collection
- Support the incremental Analyze operation on monotonically increasing index columns to improve the performance of statistics collection
- Support using subqueries in the DO statement
- Support using Index Join in transactions
- Optimize prepare/execute to support DDL statements with no parameters
- Modify the system behavior to auto load statistics when the stats-lease variable value is 0
- Support exporting historical statistics
- Support the dump/load correlation of histograms
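The benefit of converting a NOT EXISTS subquery into an Anti Semi Join, listed among the optimizer changes above, can be sketched in Python: instead of re-evaluating the subquery for every outer row, the inner side is scanned once into a hash set. This is an illustration of the general technique, not TiDB's optimizer or executor code; the function names and the customers/orders sample data are hypothetical.

```python
def not_exists_nested_loop(outer, inner, outer_key, inner_key):
    """Evaluate NOT EXISTS row by row: O(len(outer) * len(inner))."""
    kept = []
    for o in outer:
        matched = any(
            # SQL equality involving NULL is never true, so None never matches.
            o[outer_key] is not None
            and i[inner_key] is not None
            and o[outer_key] == i[inner_key]
            for i in inner
        )
        if not matched:
            kept.append(o)
    return kept


def anti_semi_join(outer, inner, outer_key, inner_key):
    """Hash-based anti semi join: O(len(outer) + len(inner))."""
    build = {i[inner_key] for i in inner if i[inner_key] is not None}
    return [
        o for o in outer
        if o[outer_key] is None or o[outer_key] not in build
    ]


# Hypothetical data: customers that have no order.
customers = [{"id": 1}, {"id": 2}, {"id": 3}, {"id": None}]
orders = [{"customer_id": 1}, {"customer_id": 3}, {"customer_id": None}]

slow = not_exists_nested_loop(customers, orders, "id", "customer_id")
fast = anti_semi_join(customers, orders, "id", "customer_id")
```

Both functions keep the outer rows that have no matching inner row, including the row whose key is NULL, which is what distinguishes NOT EXISTS from NOT IN.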
- SQL Execution Engine
- Optimize log output: EXECUTE outputs user variables and COMMIT outputs slow query logs to facilitate troubleshooting
- Support the EXPLAIN ANALYZE function to improve SQL tuning usability
- Support the admin show next_row_id command to get the ID of the next row
- Add six built-in functions: JSON_QUOTE, JSON_ARRAY_APPEND, JSON_MERGE_PRESERVE, BENCHMARK, COALESCE, and NAME_CONST
- Optimize the control logic for the chunk size so that it adjusts dynamically based on the query context, reducing SQL execution time and resource consumption
- Support tracking and controlling memory usage in three operators: TableReader, IndexReader, and IndexLookupReader
- Optimize the Merge Join operator to support an empty ON condition
- Optimize write performance for single tables that contain too many columns
- Improve the performance of admin show ddl jobs by supporting scanning data in reverse order
- Add the split table region statement to manually split the table Region to alleviate hotspot issues
- Add the split index region statement to manually split the index Region to alleviate hotspot issues
- Add a blocklist to prohibit pushing down expressions to Coprocessor
- Optimize the Expensive Query log to print the SQL query in the log when it exceeds the configured limit of execution time or memory
- DDL
- Support migrating from character set utf8 to utf8mb4
- Change the default character set from utf8 to utf8mb4
- Add the alter schema statement to modify the character set and the collation of the database
- Support ALTER algorithm INPLACE/INSTANT
- Support SHOW CREATE VIEW
- Support SHOW CREATE USER
- Support fast recovery of mistakenly deleted tables
- Support dynamically adjusting the concurrency of ADD INDEX
- Add the pre_split_regions option that pre-allocates Regions when creating the table using the CREATE TABLE statement, to relieve write hot Regions caused by lots of writes after the table creation
- Support splitting Regions by the index and range of the table specified using SQL statements to relieve hotspot issues
- Add the ddl_error_count_limit global variable to limit the number of DDL task retries
- Add a feature to use SHARD_ROW_ID_BITS to scatter row IDs when the column contains an AUTO_INCREMENT attribute to relieve hotspot issues
- Optimize the lifetime of invalid DDL metadata to speed up recovering the normal execution of DDL operations after upgrading the TiDB cluster
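The idea behind scattering row IDs with SHARD_ROW_ID_BITS, mentioned in the DDL items above, can be sketched as follows. The bit layout (shard value in the highest bits below the sign bit), the CRC32-based shard choice, and the 15-bit cap are assumptions made for illustration only; they are not taken from these release notes, and the function name `sharded_row_id` is hypothetical.

```python
import zlib

MAX_SHARD_BITS = 15  # assumed upper bound, for illustration only


def sharded_row_id(seq, shard_bits, txn_start_ts):
    """Place a per-transaction shard value in the high bits of a 63-bit row ID.

    seq           monotonically increasing sequence number
    shard_bits    number of high bits reserved for the shard value
    txn_start_ts  value hashed to pick the shard (assumed input)
    """
    if not 0 <= shard_bits <= MAX_SHARD_BITS:
        raise ValueError("shard_bits out of range")
    if shard_bits == 0:
        return seq  # no scattering: IDs stay sequential
    low_bits = 63 - shard_bits  # bit 63 is left clear as the sign bit
    if seq >= 1 << low_bits:
        raise OverflowError("sequence number does not fit in the low bits")
    mask = (1 << shard_bits) - 1
    shard = zlib.crc32(str(txn_start_ts).encode()) & mask
    return (shard << low_bits) | seq


# Consecutive sequence numbers written by different transactions can land
# in different key ranges instead of all appending to the same one.
ids = [sharded_row_id(seq, 4, ts) for seq, ts in [(1, 1001), (2, 1002), (3, 1003)]]
```

Because sequential IDs map to adjacent keys, every insert would otherwise hit the Region holding the largest key; spreading the high bits distributes the writes across several key ranges.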
- Transactions
- Support the pessimistic transaction mode (Experimental)
- Optimize transaction processing logics to adapt to more scenarios:
- Change the default value of tidb_disable_txn_auto_retry to on, which means non-auto committed transactions will not be retried
- Add the tidb_batch_commit system variable to split a transaction into multiple ones to be executed concurrently
- Add the tidb_low_resolution_tso system variable to control the number of TSOs to obtain in batches and reduce the number of times that transactions request TSOs, to improve performance in scenarios with relatively low consistency requirements
- Add the tidb_skip_isolation_level_check variable to control whether to report errors when the isolation level is set to SERIALIZABLE
- Modify the tidb_disable_txn_auto_retry system variable to make it work on all retryable errors
- Permission Management
- Perform permission check on the ANALYZE, USE, SET GLOBAL, and SHOW PROCESSLIST statements
- Support Role Based Access Control (RBAC) (Experimental)
- Server
- Optimize slow query logs:
- Restructure the log format
- Optimize the log content
- Optimize the log query method to support querying slow query logs through the INFORMATION_SCHEMA.SLOW_QUERY memory table and the ADMIN SHOW SLOW statement
- Develop a unified log format specification with restructured log system to facilitate collection and analysis by tools
- Support using SQL statements to manage TiDB Binlog services, including querying status, enabling TiDB Binlog, maintaining and sending TiDB Binlog strategies.
- Support using unix_socket to connect to the database
- Support Trace for SQL statements
- Support getting information for a TiDB instance via the /debug/zip HTTP interface to facilitate troubleshooting.
- Optimize monitoring items to facilitate troubleshooting:
- Add the high_error_rate_feedback_total monitoring item to monitor the difference between the actual data volume and the estimated data volume based on statistics
- Add a QPS monitoring item in the database dimension
- Optimize the system initialization process to only allow the DDL owner to perform the initialization. This reduces the startup time for initialization or upgrading.
- Optimize the execution logic of kill query to improve performance and ensure resources are released properly
- Add a startup option config-check to check the validity of the configuration file
- Add the tidb_back_off_weight system variable to control the backoff time of internal error retries
- Add the wait_timeout and interactive_timeout system variables to control the maximum idle time allowed for connections
- Add the connection pool for TiKV to shorten the connection establishing time
- Compatibility
- Support the ALLOW_INVALID_DATES SQL mode
- Support the MySQL 320 Handshake protocol
- Support manifesting unsigned BIGINT columns as auto-increment columns
- Support the SHOW CREATE DATABASE IF NOT EXISTS syntax
- Optimize the fault tolerance of load data for CSV files
- Abandon the predicate pushdown operation when the filtering condition contains a user variable to improve the compatibility with MySQL's behavior of using user variables to simulate Window Functions
PD
- Support re-creating a cluster from a single node
- Migrate Region metadata from etcd to the go-leveldb storage engine to solve the storage bottleneck in etcd for large-scale clusters
- API
- Add the remove-tombstone API to clear Tombstone stores
- Add the ScanRegions API to batch query Region information
- Add the GetOperator API to query running operators
- Optimize the performance of the GetStores API
- Configurations
- Optimize configuration check logic to avoid configuration item errors
- Add enable-two-way-merge to control the direction of Region merge
- Add hot-region-schedule-limit to control the scheduling rate for hot Regions
- Add hot-region-cache-hits-threshold to identify hotspot when hitting multiple thresholds consecutively
- Add the store-balance-rate configuration item to control the maximum number of balance Region operators allowed per minute
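A per-minute cap such as the one store-balance-rate describes can be modelled with a token bucket. The sketch below is a generic rate limiter written under that assumption; the release notes do not state how PD enforces the limit, and the class name `PerMinuteLimiter` is hypothetical. Time is passed in explicitly so that the behaviour is deterministic.

```python
class PerMinuteLimiter:
    """Token bucket that admits at most `rate_per_minute` operations per minute."""

    def __init__(self, rate_per_minute, now=0.0):
        self.capacity = float(rate_per_minute)
        self.tokens = self.capacity              # start with a full bucket
        self.refill_per_second = rate_per_minute / 60.0
        self.last = now

    def allow(self, now):
        """Return True if one operation may be issued at time `now` (seconds)."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# With a rate of 15 per minute, a burst of 15 operators is admitted,
# after which one more becomes available every 4 seconds.
limiter = PerMinuteLimiter(rate_per_minute=15)
burst = [limiter.allow(now=0.0) for _ in range(16)]
```

A token bucket permits short bursts up to the configured capacity while keeping the long-run average at the configured rate; a fixed one-minute window would be a simpler alternative with different burst behaviour.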
- Scheduler Optimizations
- Add the store limit mechanism for separately controlling the speed of operators for each store
- Support the waitingOperator queue to optimize the resource race among different schedulers
- Support scheduling rate limit to actively send scheduling operations to TiKV. This improves the scheduling rate by limiting the number of concurrent scheduling tasks on a single node.
- Optimize the Region Scatter scheduling to be not restrained by the limit mechanism
- Add the shuffle-hot-region scheduler to facilitate TiKV stability test in scenarios of poor hotspot scheduling
- Simulator
- Add simulator for data import scenarios
- Support setting different heartbeat intervals for the Store
- Others
- Upgrade etcd to solve the issues of inconsistent log output formats, Leader selection failure in prevote, and lease deadlocking
- Develop a unified log format specification with restructured log system to facilitate collection and analysis by tools
- Add monitoring metrics including scheduling parameters, cluster label information, time consumed by PD to process TSO requests, Store ID, and address information
TiKV
- Support distributed GC and concurrent lock resolving for improved GC performance
- Support reversed raw_scan and raw_batch_scan
- Support Multi-thread Raftstore and Multi-thread Apply to improve scalability, concurrency capacity, and resource usage within a single node. Performance improves by 70% under the same level of pressure
- Support batch receiving and sending Raft messages, improving TPS by 7% for write intensive scenarios
- Support checking RocksDB Level 0 files before applying snapshots to avoid write stall
- Introduce Titan, a key-value plugin that improves write performance for scenarios with value sizes greater than 1KiB, and relieves write amplification to a certain degree
- Support the pessimistic transaction mode (Experimental)
- Support getting monitoring information via HTTP
- Modify the semantics of Insert to allow Prewrite to succeed only when there is no Key
- Develop a unified log format specification with restructured log system to facilitate collection and analysis by tools
- Add performance metrics related to configuration information and key bound crossing
- Support Local Reader in RawKV to improve performance
- Engine
- Optimize memory management to reduce memory allocation and copying for Iterator Key Bound Option
- Support block cache sharing among different column families
- Server
- Reduce context switch overhead from batch commands
- Remove txn scheduler
- Add monitoring items related to read index and GC worker
- RaftStore
- Support Hibernate Regions to optimize CPU consumption from RaftStore (Experimental)
- Remove the local reader thread
- Coprocessor
- Refactor the computation framework to implement vector operators, computation using vector expressions, and vector aggregations to improve performance
- Support providing operator execution status for the EXPLAIN ANALYZE statement in TiDB
- Switch to the work-stealing thread pool model to reduce context switch cost
Tools
- TiDB Lightning
- Support redirected replication of data tables
- Support importing CSV files
- Improve performance for conversion from SQL to KV pairs
- Support batch import of single tables to improve performance
- Support separately importing data and indexes for big tables to improve the performance of TiKV-importer
- Support filling the missing column using the row_id or the default column value when column data is missing in the new file
- Support setting a speed limit in TiKV-importer when uploading SST files to TiKV
- TiDB Binlog
- Add the advertise-addr configuration in Drainer to support the bridge mode in the container environment
- Add the GetMvccByEncodeKey function in Pump to speed up querying the transaction status
- Support compressing communication data among components to reduce network resource consumption
- Add the Arbiter tool that supports reading binlog from Kafka and replicating the data into MySQL
- Support filtering out files that don't require replication via Reparo
- Support replicating generated columns
- Add the syncer.sql-mode configuration item to support using different sql-modes to parse DDL queries
- Add the syncer.ignore-table configuration item to support filtering tables not to be replicated
- sync-diff-inspector
- Support checkpoint to record verification status and continue the verification from last saved point after restarting
- Add the only-use-checksum configuration item to check data consistency by calculating checksum
- Support using TiDB statistics and multiple columns to split chunks for comparison to adapt to more scenarios
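The chunk-and-checksum approach that the sync-diff-inspector items above refer to can be illustrated with a small Python sketch: rows are grouped into key ranges, and each range is compared through an order-independent checksum so that only mismatching ranges need a row-by-row check. The XOR-of-CRC32 checksum, the row encoding, and the function names are assumptions chosen for illustration, not the tool's actual implementation.

```python
import zlib


def chunk_checksum(rows, lo, hi, key="id"):
    """XOR the CRC32 of every row whose key falls in the range [lo, hi)."""
    acc = 0
    for row in rows:
        if lo <= row[key] < hi:
            # Canonical encoding: column order does not affect the checksum.
            encoded = repr(sorted(row.items())).encode()
            acc ^= zlib.crc32(encoded)
    return acc


def diff_chunks(source, target, bounds, key="id"):
    """Return the key ranges whose checksums differ between source and target."""
    mismatched = []
    for lo, hi in zip(bounds, bounds[1:]):
        if chunk_checksum(source, lo, hi, key) != chunk_checksum(target, lo, hi, key):
            mismatched.append((lo, hi))
    return mismatched


# Hypothetical tables: the downstream copy has a different value for id 5.
upstream = [{"id": i, "v": "abcdef"[i - 1]} for i in range(1, 7)]
downstream = [dict(r) for r in reversed(upstream)]
downstream[1]["v"] = "x"  # reversed order, so index 1 is the row with id 5

bad_ranges = diff_chunks(upstream, downstream, bounds=[1, 3, 5, 7])
```

XOR makes the checksum independent of row order, at the cost that two identical rows in the same chunk cancel each other out, so a sketch like this assumes the key is unique.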
TiDB Ansible
- Upgrade the following monitoring components to a stable version:
- Prometheus from V2.2.1 to V2.8.1
- Pushgateway from V0.4.0 to V0.7.0
- Node_exporter from V0.15.2 to V0.17.0
- Alertmanager from V0.14.0 to V0.17.0
- Grafana from V4.6.3 to V6.1.6
- Ansible from V2.5.14 to V2.7.11
- Add the TiKV summary monitoring dashboard to view cluster status conveniently
- Add the TiKV trouble_shooting monitoring dashboard to remove duplicate items and facilitate troubleshooting
- Add the TiKV details monitoring dashboard to facilitate debugging and troubleshooting
- Add concurrent check for version consistency during rolling updates to improve the update performance
- Support deployment and operations for TiDB Lightning
- Optimize the table-regions.py script to support displaying Leader distribution by tables
- Optimize TiDB monitoring and add latency related monitoring items by SQL categories
- Modify the operating system version limit to only support the CentOS 7.0+ and Red Hat 7.0+ operating systems
- Add the monitoring item to predict the maximum QPS of the cluster (hidden by default)