FAQs After Upgrade
This document lists some FAQs and their solutions after you upgrade TiDB.
The character set (charset) errors when executing DDL operations
In v2.1.0 and earlier versions (including all versions of v2.0), the default character set of TiDB is UTF-8. Starting from v2.1.1, the default character set is UTF8MB4.
If you explicitly specify the charset of a newly created table as UTF-8 in v2.1.0 or earlier versions, then you might fail to execute DDL operations after upgrading TiDB to v2.1.1.
To avoid this issue, note the following two points:
Point #1: before v2.1.3, TiDB does not support modifying the charset of the column. Therefore, when you execute DDL operations, you need to make sure that the charset of the new column is consistent with that of the original column.
Point #2: before v2.1.3, even if the charset of the column is different from that of the table, show create table does not show the charset of the column. But as shown in the following example, you can view it by obtaining the metadata of the table through the HTTP API.
Issue #1: unsupported modify column charset utf8mb4 not match origin utf8
Before upgrading, the following operations are executed in v2.1.0 and earlier versions.
create table t(a varchar(10)) charset=utf8;
Query OK, 0 rows affected
Time: 0.106s
show create table t;
+-------+-------------------------------------------------------+
| Table | Create Table                                          |
+-------+-------------------------------------------------------+
| t     | CREATE TABLE `t` (                                    |
|       |   `a` varchar(10) DEFAULT NULL                        |
|       | ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin |
+-------+-------------------------------------------------------+
1 row in set
Time: 0.006s
After upgrading, the following error is reported in v2.1.1 and v2.1.2, but there is no such error in v2.1.3 and the later versions.
alter table t change column a a varchar(20);
ERROR 1105 (HY000): unsupported modify column charset utf8mb4 not match origin utf8
Solution:
You can explicitly specify the column charset so that it is the same as the original charset.
alter table t change column a a varchar(22) character set utf8;
According to Point #1, if you do not specify the column charset, UTF8MB4 is used by default, so you need to specify the column charset to make it consistent with the original one.
According to Point #2, you can obtain the metadata of the table through the HTTP API, and find the column charset by searching the column name and the keyword "Charset".
curl "http://$IP:10080/schema/test/t" | python -m json.tool
# python -m json.tool is used here only to format the JSON. It is not required; the formatted output makes it easier to add the comments below.
{
    "ShardRowIDBits": 0,
    "auto_inc_id": 0,
    "charset": "utf8",   # The charset of the table.
    "collate": "",
    "cols": [            # The relevant information about the columns.
        {
            ...
            "id": 1,
            "name": {
                "L": "a",
                "O": "a"   # The column name.
            },
            "offset": 0,
            "origin_default": null,
            "state": 5,
            "type": {
                "Charset": "utf8",   # The charset of column a.
                "Collate": "utf8_bin",
                "Decimal": 0,
                "Elems": null,
                "Flag": 0,
                "Flen": 10,
                "Tp": 15
            }
        }
    ],
    ...
}
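The lookup described in Point #2 can also be done programmatically. The following is a minimal sketch in Python, assuming the response has the shape shown above (a "cols" list whose entries carry "name"."O" and "type"."Charset"). The function name column_charsets and the inline sample are illustrative only and are not part of TiDB.

```python
import json


def column_charsets(schema_json: str) -> dict:
    """Map each column name to the charset recorded in the table metadata.

    Assumes the JSON layout shown in the example above; other TiDB
    versions may return additional or differently named fields.
    """
    schema = json.loads(schema_json)
    result = {}
    for col in schema.get("cols", []):
        # "O" holds the column name as originally written.
        result[col["name"]["O"]] = col["type"]["Charset"]
    return result


# A trimmed copy of the metadata from the example above (valid JSON,
# so the explanatory comments are removed).
sample = """
{
  "charset": "utf8",
  "cols": [
    {"id": 1,
     "name": {"L": "a", "O": "a"},
     "type": {"Charset": "utf8", "Collate": "utf8_bin", "Flen": 10, "Tp": 15}}
  ]
}
"""

print(column_charsets(sample))
```

In practice you would pass the body returned by the curl command instead of the inline sample.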
Issue #2: unsupported modify charset from utf8mb4 to utf8
Before upgrading, the following operations are executed in v2.1.1 and v2.1.2.
create table t(a varchar(10)) charset=utf8;
Query OK, 0 rows affected
Time: 0.109s
show create table t;
+-------+-------------------------------------------------------+
| Table | Create Table                                          |
+-------+-------------------------------------------------------+
| t     | CREATE TABLE `t` (                                    |
|       |   `a` varchar(10) DEFAULT NULL                        |
|       | ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin |
+-------+-------------------------------------------------------+
In the above example, show create table only shows the charset of the table, but the charset of the column is actually UTF8MB4, which can be confirmed by obtaining the schema through the HTTP API. However, when a new table is created, the charset of the column should stay consistent with that of the table. This bug has been fixed in v2.1.3.
After upgrading, the following operations are executed in v2.1.3 and the later versions.
show create table t;
+-------+--------------------------------------------------------------------+
| Table | Create Table                                                       |
+-------+--------------------------------------------------------------------+
| t     | CREATE TABLE `t` (                                                 |
|       |   `a` varchar(10) CHARSET utf8mb4 COLLATE utf8mb4_bin DEFAULT NULL |
|       | ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin              |
+-------+--------------------------------------------------------------------+
1 row in set
Time: 0.007s
alter table t change column a a varchar(20);
ERROR 1105 (HY000): unsupported modify charset from utf8mb4 to utf8
Solution:
Starting from v2.1.3, TiDB supports modifying the charsets of the column and the table, so it is recommended to change the table charset to UTF8MB4.
alter table t convert to character set utf8mb4;
You can also specify the column charset as done in Issue #1, making it stay consistent with the original column charset (UTF8MB4).
alter table t change column a a varchar(20) character set utf8mb4;
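Because show create table hides the mismatch before v2.1.3, it can help to scan the table metadata for columns whose charset differs from the table charset before you upgrade. The following is a rough sketch in Python, assuming the metadata layout shown under Issue #1 (table-level "charset", column-level "type"."Charset"). The function name mismatched_columns and the ignore list are assumptions made for illustration; in particular, which values non-string columns report is not stated in this document.

```python
def mismatched_columns(schema: dict, ignore=("", "binary")) -> list:
    """Return (column name, column charset) pairs that differ from the table charset.

    `ignore` lists charset values to skip. It is an assumption meant to
    leave out columns that carry no text charset; adjust it as needed.
    """
    table_charset = schema["charset"]
    mismatches = []
    for col in schema.get("cols", []):
        col_charset = col["type"]["Charset"]
        if col_charset in ignore:
            continue
        if col_charset != table_charset:
            mismatches.append((col["name"]["O"], col_charset))
    return mismatches


# Metadata resembling the Issue #2 table: the table is utf8,
# but column a was silently created as utf8mb4.
schema = {
    "charset": "utf8",
    "cols": [
        {"name": {"L": "a", "O": "a"},
         "type": {"Charset": "utf8mb4", "Collate": "utf8mb4_bin"}},
    ],
}

print(mismatched_columns(schema))
```

Each reported column is a candidate for one of the two fixes above: convert the table to UTF8MB4, or keep specifying the column's existing charset in later DDL statements.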
Issue #3: ERROR 1366 (HY000): incorrect utf8 value f09f8c80(🌀) for column a
In TiDB v2.1.1 and earlier versions, if the charset is UTF-8, there is no UTF-8 Unicode encoding check on the inserted 4-byte data. But in v2.1.2 and the later versions, this check is added.
Before upgrading, the following operations are executed in v2.1.1 and earlier versions.
create table t(a varchar(100) charset utf8);
Query OK, 0 rows affected
insert t values (unhex('f09f8c80'));
Query OK, 1 row affected
After upgrading, the following error is reported in v2.1.2 and the later versions.
insert t values (unhex('f09f8c80'));
ERROR 1366 (HY000): incorrect utf8 value f09f8c80(🌀) for column a
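The value in this example is rejected because it is encoded in four bytes, which the UTF-8 charset in TiDB does not accept once the check is in place. The following is a small sketch in Python for spotting such characters in application data before inserting it. The function name four_byte_chars is illustrative only, and the sketch does not reproduce the exact check that TiDB performs.

```python
def four_byte_chars(text: str) -> list:
    """Return the characters in `text` whose UTF-8 encoding takes 4 bytes."""
    return [ch for ch in text if len(ch.encode("utf-8")) == 4]


# The same bytes as unhex('f09f8c80') in the example above.
value = bytes.fromhex("f09f8c80").decode("utf-8")

print(four_byte_chars(value))        # the single 4-byte character
print(four_byte_chars("plain text"))  # an empty list
```

Data for which the function returns a non-empty list needs a UTF8MB4 column, or one of the workarounds described in the solution.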
Solution:
In v2.1.2: this version does not support modifying the column charset, so you have to skip the UTF-8 check.
set @@session.tidb_skip_utf8_check=1;
Query OK, 0 rows affected
insert t values (unhex('f09f8c80'));
Query OK, 1 row affected
In v2.1.3 and the later versions: it is recommended to change the column charset to UTF8MB4. Alternatively, you can set tidb_skip_utf8_check to skip the UTF-8 check. But if you skip the check, you might fail to replicate data from TiDB to MySQL because MySQL executes the check.
alter table t change column a a varchar(100) character set utf8mb4;
Query OK, 0 rows affected
insert t values (unhex('f09f8c80'));
Query OK, 1 row affected
Specifically, you can use the variable tidb_skip_utf8_check to skip the validity check on both UTF-8 and UTF8MB4 data.
If you only want to skip the UTF-8 check, you can set tidb_check_mb4_value_in_utf8. This variable is added to the config.toml file in v2.1.3, and you can modify check-mb4-value-in-utf8 in the configuration file and then restart the cluster to enable it.
Starting from v2.1.5, you can set tidb_check_mb4_value_in_utf8 through the HTTP API and the session variable:
HTTP API (a setting made through the HTTP API takes effect only on a single server)
To enable the check through the HTTP API:
curl -X POST -d "check_mb4_value_in_utf8=1" http://{TiDBIP}:10080/settings
To disable the check through the HTTP API:
curl -X POST -d "check_mb4_value_in_utf8=0" http://{TiDBIP}:10080/settings
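If you prefer to script the same call, the request can be built with the Python standard library. This is a minimal sketch that mirrors the curl commands above. The function name build_settings_request and the address 127.0.0.1 are placeholders chosen for illustration, and the request is only constructed here, not sent.

```python
from urllib import parse, request


def build_settings_request(host: str, enable: bool, port: int = 10080) -> request.Request:
    """Build the POST request that toggles check_mb4_value_in_utf8 on one TiDB server."""
    body = parse.urlencode({"check_mb4_value_in_utf8": 1 if enable else 0})
    return request.Request(
        f"http://{host}:{port}/settings",
        data=body.encode("ascii"),
        method="POST",
    )


# 127.0.0.1 is a placeholder; use the address of the TiDB server you want to change.
req = build_settings_request("127.0.0.1", enable=False)
print(req.get_method(), req.full_url, req.data)

# To actually send the request, uncomment the next line:
# request.urlopen(req)
```

Because the setting applies to a single server, a script like this would need to be run once for each TiDB server.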
Session variable
To enable the check through the session variable:
set @@session.tidb_check_mb4_value_in_utf8 = 1;
To disable the check through the session variable:
set @@session.tidb_check_mb4_value_in_utf8 = 0;