Generated Columns
TiDB supports generated columns as part of MySQL 5.7 compatibility. One of the primary use cases for generated columns is to extract data out of a JSON data type and enable it to be indexed.
Index JSON using stored generated column
In both MySQL 5.7 and TiDB, columns of type JSON can not be indexed directly. i.e. The following table structure is not supported:
CREATE TABLE person (
id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(255) NOT NULL,
address_info JSON,
KEY (address_info)
);
To index a JSON column, you must first extract it as a generated stored column.
Using the city
stored generated column as an example, you are then able to add an index:
CREATE TABLE person (
id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(255) NOT NULL,
address_info JSON,
city VARCHAR(64) AS (JSON_UNQUOTE(JSON_EXTRACT(address_info, '$.city'))) STORED,
KEY (city)
);
In this table, the city
column is a generated column. As the name implies, the column is generated from other columns in the table, and cannot be assigned a value when inserted or updated. This column is generated based on a defined expression and is stored in the database. Thus this column can be read directly, not in a way that its dependent column address_info
is read first and then the data is calculated. The index on city
however is stored and uses the same structure as other indexes of the type varchar(64)
.
You can use the index on the stored generated column in order to speed up the following statement:
SELECT name, id FROM person WHERE city = 'Beijing';
If no data exists at path $.city
, JSON_EXTRACT
returns NULL
. If you want to enforce a constraint that city
must be NOT NULL
, you can define the virtual column as follows:
CREATE TABLE person (
id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(255) NOT NULL,
address_info JSON,
city VARCHAR(64) AS (JSON_UNQUOTE(JSON_EXTRACT(address_info, '$.city'))) STORED NOT NULL,
KEY (city)
);
Both INSERT
and UPDATE
statements check virtual column definitions. Rows that do not pass validation return errors:
mysql> INSERT INTO person (name, address_info) VALUES ('Morgan', JSON_OBJECT('Country', 'Canada'));
ERROR 1048 (23000): Column 'city' cannot be null
Use generated virtual columns
TiDB also supports generated virtual columns. Different from generated store columns, generated virtual columns are virtual in that they are generated as needed and are not stored in the database or cached in the memory. Although TiDB supports indexing generated virtual columns, the optimizer currently cannot use indexes in this case. This issue will be fixed in a later version of TiDB (Issue #5189).
CREATE TABLE person (
id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(255) NOT NULL,
address_info JSON,
city VARCHAR(64) AS (JSON_UNQUOTE(JSON_EXTRACT(address_info, '$.city'))) VIRTUAL
);
Limitations
The current limitations of JSON and generated columns are as follows:
- You cannot add the generated column in the storage type of
STORED
throughALTER TABLE
. - You cannot create an index on the generated column through
ALTER TABLE
. - You can neither convert a generated stored column to a normal column through the
ALTER TABLE
statement nor convert a normal column to a generated stored column. - You cannot modify the expression of a generated stored column through the
ALTER TABLE
statement. - You cannot assign the
DEFAULT
value to generated columns in DML statements. - Not all JSON functions are supported.