Deploy TiDB Using TiDB Ansible

This guide describes how to deploy a TiDB cluster using TiDB Ansible. For the production environment, it is recommended to deploy TiDB using TiDB Ansible.

Overview

Ansible is an IT automation tool that can configure systems, deploy software, and orchestrate more advanced IT tasks such as continuous deployments or zero downtime rolling updates.

TiDB Ansible is a TiDB cluster deployment tool developed by PingCAP, based on Ansible playbook. TiDB Ansible enables you to quickly deploy a new TiDB cluster which includes PD, TiDB, TiKV, and the cluster monitoring modules.

You can use the TiDB Ansible configuration file to set up the cluster topology and complete all the following operation tasks:

Prepare

Before you start, make sure you have:

Several target machines that meet the following requirements:
- 4 or more machines
  A standard TiDB cluster contains 6 machines. You can use 4 machines for testing. For more details, see Software and Hardware Recommendations.
- CentOS 7.3 (64 bit) or later, x86_64 architecture (AMD64)
- Network between machines
Note
When you deploy TiDB using TiDB Ansible, use SSD disks for the data directory of TiKV and PD nodes. Otherwise, it cannot pass the check. If you only want to try TiDB out and explore the features, it is recommended to deploy TiDB using Docker Compose on a single machine.
A Control Machine that meets the following requirements:
Note
The Control Machine can be one of the target machines.
- CentOS 7.3 (64 bit) or later with Python 2.7 installed
- Access to the Internet

Step 1: Install system dependencies on the Control Machine

Log in to the Control Machine using the root user account, and run the corresponding command according to your operating system.

If you use a Control Machine installed with CentOS 7, run the following command:
```
yum -y install epel-release git curl sshpass && \
yum -y install python2-pip
```
If you use a Control Machine installed with Ubuntu, run the following command:
```
apt-get -y install git curl sshpass python-pip
```

Step 2: Create the `tidb` user on the Control Machine and generate the SSH key

Make sure you have logged in to the Control Machine using the root user account, and then run the following command.

Create the tidb user.
```
useradd -m -d /home/tidb tidb
```
Set a password for the tidb user account.
```
passwd tidb
```
Configure sudo without password for the tidb user account by adding tidb ALL=(ALL) NOPASSWD: ALL to the end of the sudo file:
```
visudo
```
```
tidb ALL=(ALL) NOPASSWD: ALL
```

Generate the SSH key.

Execute the su command to switch the user from root to tidb.

su - tidb

Create the SSH key for the tidb user account and hit the Enter key when Enter passphrase is prompted. After successful execution, the SSH private key file is /home/tidb/.ssh/id_rsa, and the SSH public key file is /home/tidb/.ssh/id_rsa.pub.

ssh-keygen -t rsa

Generating public/private rsa key pair.
Enter file in which to save the key (/home/tidb/.ssh/id_rsa):
Created directory '/home/tidb/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/tidb/.ssh/id_rsa.
Your public key has been saved in /home/tidb/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:eIBykszR1KyECA/h0d7PRKz4fhAeli7IrVphhte7/So tidb@172.16.10.49
The key's randomart image is:
+---[RSA 2048]----+
|=+o+.o.          |
|o=o+o.oo         |
| .O.=.=          |
| . B.B +         |
|o B * B S        |
| * + * +         |
|  o + .          |
| o  E+ .         |
|o   ..+o.        |
+----[SHA256]-----+

Step 3: Download TiDB Ansible to the Control Machine

Log in to the Control Machine using the tidb user account and enter the /home/tidb directory. The relationship between the tidb-ansible version and the TiDB version is as follows:

TiDB version	tidb-ansible tag	Note
2.0 version	v2.0.10, v2.0.11	It is the stable version of 2.0. It is not recommended for new users to use it in the production environment.
2.1 version	v2.1.1 ~ v2.1.18	It is the stable version of 2.1. It can be used in the production environment.

Download the corresponding TiDB Ansible versions of TiDB 2.0 or 2.1 from the TiDB Ansible project. The default folder name is tidb-ansible.
```
git clone -b $tag https://github.com/pingcap/tidb-ansible.git
```

Note

Replace $tag with the value of the chosen TAG version. For example, v2.11.15.
To deploy and upgrade TiDB clusters, use the corresponding version of tidb-ansible. If you only modify the version in the inventory.ini file, errors might occur.
It is required to download tidb-ansible to the /home/tidb directory using the tidb user account. If you download it to the /root directory, a privilege issue occurs.

If you have questions regarding which version to use, email to info@pingcap.com for more information or file an issue.

Step 4: Install TiDB Ansible and its dependencies on the Control Machine

Make sure you have logged in to the Control Machine using the tidb user account.

It is required to use pip to install TiDB Ansible and its dependencies, otherwise a compatibility issue occurs. Currently, the release-2.0, release-2.1, and master branches of TiDB Ansible are compatible with Ansible 2.4 ~ 2.7.11 (2.4 ≤ Ansible ≤ 2.7.11).

Install TiDB Ansible and the dependencies on the Control Machine:
```
cd /home/tidb/tidb-ansible && \
sudo pip install -r ./requirements.txt
```
The version information of TiDB Ansible and dependencies is in the tidb-ansible/requirements.txt file.
View the version of TiDB Ansible:
```
ansible --version
```
```
ansible 2.5.0
```

Step 5: Configure the SSH mutual trust and sudo rules on the Control Machine

Make sure you have logged in to the Control Machine using the tidb user account.

Add the IPs of your target machines to the [servers] section of the hosts.ini file.

cd /home/tidb/tidb-ansible && \
vi hosts.ini

[servers]
172.16.10.1
172.16.10.2
172.16.10.3
172.16.10.4
172.16.10.5
172.16.10.6

[all:vars]
username = tidb
ntp_server = pool.ntp.org

Run the following command and input the root user account password of your target machines.
```
ansible-playbook -i hosts.ini create_users.yml -u root -k
```
This step creates the tidb user account on the target machines, configures the sudo rules and the SSH mutual trust between the Control Machine and the target machines.

To configure the SSH mutual trust and sudo without password manually, see How to manually configure the SSH mutual trust and sudo without password.

Step 6: Install the NTP service on the target machines

Note

If the time and time zone of all your target machines are same, the NTP service is on and is normally synchronizing time, you can ignore this step. See How to check whether the NTP service is normal.

Make sure you have logged in to the Control Machine using the tidb user account, run the following command:

cd /home/tidb/tidb-ansible && \
ansible-playbook -i hosts.ini deploy_ntp.yml -u tidb -b

The NTP service is installed and started using the software repository that comes with the system on the target machines. The default NTP server list in the installation package is used. The related server parameter is in the /etc/ntp.conf configuration file.

To make the NTP service start synchronizing as soon as possible, the system executes the ntpdate command to set the local date and time by polling ntp_server in the hosts.ini file. The default server is pool.ntp.org, and you can also replace it with your NTP server.

Step 7: Configure the CPUfreq governor mode on the target machine

For details about CPUfreq, see the CPUfreq Governor documentation.

Set the CPUfreq governor mode to performance to make full use of CPU performance.

Check the governor modes supported by the system

To check the governor modes supported by the system, run the following command:

cpupower frequency-info --governors

analyzing CPU 0:
  available cpufreq governors: performance powersave

Taking the above code for example, the system supports the performance and powersave modes.

Note

As the following shows, if it returns Not Available, it means that the current system does not support CPUfreq configuration and you can skip this step.

cpupower frequency-info --governors

analyzing CPU 0:
   available cpufreq governors: Not Available

Check the current governor mode

To check the current CPUfreq governor mode, run the following command:

cpupower frequency-info --policy

analyzing CPU 0:
  current policy: frequency should be within 1.20 GHz and 3.20 GHz.
                  The governor "powersave" may decide which speed to use
                  within this range.

As the above code shows, the current mode is powersave in this example.

Change the governor mode

You can use either of the following two methods to change the governor mode. In the above example, the current governor mode is powersave and the following commands change it to performance.

Use the cpupower frequency-set --governor command to change the current mode:
```
cpupower frequency-set --governor performance
```

Run the following command to set the mode on the target machine in batches:

ansible -i hosts.ini all -m shell -a "cpupower frequency-set --governor performance" -u tidb -b

Step 8: Mount the data disk ext4 filesystem with options on the target machines

Format your data disks to the ext4 filesystem and add the nodelalloc and noatime mount options to the filesystem. It is required to add the nodelalloc option, or else the TiDB Ansible deployment cannot pass the test. The noatime option is optional.

Note

If your data disks have been formatted to ext4 and have added the mount options, you can uninstall it by running the umount /dev/nvme0n1p1 command, follow the steps starting from editing the /etc/fstab file, and add the options again to the filesystem.

Take the /dev/nvme0n1 data disk as an example:

View the data disk.
```
fdisk -l
```
```
Disk /dev/nvme0n1: 1000 GB
```
Create the partitioned table.
```
parted -s -a optimal /dev/nvme0n1 mklabel gpt -- mkpart primary ext4 1 -1
```
Note
Use the lsblk command to view the device number of the partition: for a nvme disk, the generated device number is usually nvme0n1p1; for a regular disk (for example, /dev/sdb), the generated device number is usually sdb1.
Format the data disk to the ext4 filesystem.
```
mkfs.ext4 /dev/nvme0n1p1
```

View the partition UUID of the data disk.

In this example, the UUID of nvme0n1p1 is c51eb23b-195c-4061-92a9-3fad812cc12f.

lsblk -f

NAME    FSTYPE LABEL UUID                                 MOUNTPOINT
sda
├─sda1  ext4         237b634b-a565-477b-8371-6dff0c41f5ab /boot
├─sda2  swap         f414c5c0-f823-4bb1-8fdf-e531173a72ed
└─sda3  ext4         547909c1-398d-4696-94c6-03e43e317b60 /
sr0
nvme0n1
└─nvme0n1p1 ext4         c51eb23b-195c-4061-92a9-3fad812cc12f

Edit the /etc/fstab file and add the mount options.

vi /etc/fstab

UUID=c51eb23b-195c-4061-92a9-3fad812cc12f /data1 ext4 defaults,nodelalloc,noatime 0 2

Mount the data disk.
```
mkdir /data1 && \
mount -a
```
Check using the following command.
```
mount -t ext4
```
```
/dev/nvme0n1p1 on /data1 type ext4 (rw,noatime,nodelalloc,data=ordered)
```
If the filesystem is ext4 and nodelalloc is included in the mount options, you have successfully mount the data disk ext4 filesystem with options on the target machines.

Step 9: Edit the `inventory.ini` file to orchestrate the TiDB cluster

Log in to the Control Machine using the tidb user account, and edit the tidb-ansible/inventory.ini file to orchestrate the TiDB cluster. The standard TiDB cluster contains 6 machines: 2 TiDB instances, 3 PD instances, and 3 TiKV instances.

Deploy at least 3 instances for TiKV.
Do not deploy TiKV together with TiDB or PD on the same machine.
Use the first TiDB machine as the monitoring machine.

Note

It is required to use the internal IP address to deploy. If the SSH port of the target machines is not the default 22 port, you need to add the ansible_port variable. For example, TiDB1 ansible_host=172.16.10.1 ansible_port=5555.

You can choose one of the following two types of cluster topology according to your scenario:

The cluster topology of a single TiKV instance on each TiKV node
In most cases, it is recommended to deploy one TiKV instance on each TiKV node for better performance. However, if the CPU and memory of your TiKV machines are much better than the required in Hardware and Software Requirements, and you have more than two disks in one node or the capacity of one SSD is larger than 2 TB, you can deploy no more than 2 TiKV instances on a single TiKV node.
The cluster topology of multiple TiKV instances on each TiKV node

Option 1: Use the cluster topology of a single TiKV instance on each TiKV node

Name	Host IP	Services
node1	172.16.10.1	PD1, TiDB1
node2	172.16.10.2	PD2, TiDB2
node3	172.16.10.3	PD3
node4	172.16.10.4	TiKV1
node5	172.16.10.5	TiKV2
node6	172.16.10.6	TiKV3

[tidb_servers]
172.16.10.1
172.16.10.2

[pd_servers]
172.16.10.1
172.16.10.2
172.16.10.3

[tikv_servers]
172.16.10.4
172.16.10.5
172.16.10.6

[monitoring_servers]
172.16.10.1

[grafana_servers]
172.16.10.1

[monitored_servers]
172.16.10.1
172.16.10.2
172.16.10.3
172.16.10.4
172.16.10.5
172.16.10.6

Option 2: Use the cluster topology of multiple TiKV instances on each TiKV node

Take two TiKV instances on each TiKV node as an example:

Name	Host IP	Services
node1	172.16.10.1	PD1, TiDB1
node2	172.16.10.2	PD2, TiDB2
node3	172.16.10.3	PD3
node4	172.16.10.4	TiKV1-1, TiKV1-2
node5	172.16.10.5	TiKV2-1, TiKV2-2
node6	172.16.10.6	TiKV3-1, TiKV3-2

[tidb_servers]
172.16.10.1
172.16.10.2

[pd_servers]
172.16.10.1
172.16.10.2
172.16.10.3

# Note: To use labels in TiKV, you must also configure location_labels for PD at the same time.
[tikv_servers]
TiKV1-1 ansible_host=172.16.10.4 deploy_dir=/data1/deploy tikv_port=20171 labels="host=tikv1"
TiKV1-2 ansible_host=172.16.10.4 deploy_dir=/data2/deploy tikv_port=20172 labels="host=tikv1"
TiKV2-1 ansible_host=172.16.10.5 deploy_dir=/data1/deploy tikv_port=20171 labels="host=tikv2"
TiKV2-2 ansible_host=172.16.10.5 deploy_dir=/data2/deploy tikv_port=20172 labels="host=tikv2"
TiKV3-1 ansible_host=172.16.10.6 deploy_dir=/data1/deploy tikv_port=20171 labels="host=tikv3"
TiKV3-2 ansible_host=172.16.10.6 deploy_dir=/data2/deploy tikv_port=20172 labels="host=tikv3"

# When you deploy a TiDB cluster of the 3.0 version, you must configure the TiKV status ports in the topology of multiple TiKV instances, as shown in the following example.
# TiKV1-1 ansible_host=172.16.10.4 deploy_dir=/data1/deploy tikv_port=20171 tikv_status_port=20181 labels="host=tikv1"
# TiKV1-2 ansible_host=172.16.10.4 deploy_dir=/data2/deploy tikv_port=20172 tikv_status_port=20182 labels="host=tikv1"
# TiKV2-1 ansible_host=172.16.10.5 deploy_dir=/data1/deploy tikv_port=20171 tikv_status_port=20181 labels="host=tikv2"
# TiKV2-2 ansible_host=172.16.10.5 deploy_dir=/data2/deploy tikv_port=20172 tikv_status_port=20182 labels="host=tikv2"
# TiKV3-1 ansible_host=172.16.10.6 deploy_dir=/data1/deploy tikv_port=20171 tikv_status_port=20181 labels="host=tikv3"
# TiKV3-2 ansible_host=172.16.10.6 deploy_dir=/data2/deploy tikv_port=20172 tikv_status_port=20182 labels="host=tikv3"

[monitoring_servers]
172.16.10.1

[grafana_servers]
172.16.10.1

[monitored_servers]
172.16.10.1
172.16.10.2
172.16.10.3
172.16.10.4
172.16.10.5
172.16.10.6

......

# Note: For labels in TiKV to work, you must also configure location_labels for PD when deploying the cluster.
[pd_servers:vars]
location_labels = ["host"]

Edit the parameters in the service configuration file:

For the cluster topology of multiple TiKV instances on each TiKV node, you need to edit the block-cache-size parameter in tidb-ansible/conf/tikv.yml:
- rocksdb defaultcf block-cache-size(GB): MEM * 80% / TiKV instance number * 30%
- rocksdb writecf block-cache-size(GB): MEM * 80% / TiKV instance number * 45%
- rocksdb lockcf block-cache-size(GB): MEM * 80% / TiKV instance number * 2.5% (128 MB at a minimum)
- raftdb defaultcf block-cache-size(GB): MEM * 80% / TiKV instance number * 2.5% (128 MB at a minimum)
For the cluster topology of multiple TiKV instances on each TiKV node, you need to edit the high-concurrency, normal-concurrency and low-concurrency parameters in the tidb-ansible/conf/tikv.yml file:
```
readpool:
  coprocessor:
    # Notice: if CPU_NUM > 8, default thread pool size for coprocessors
    # will be set to CPU_NUM * 0.8.
    # high-concurrency: 8
    # normal-concurrency: 8
    # low-concurrency: 8
```
Note
Recommended configuration: the number of TiKV instances * the parameter value = CPU_Vcores * 0.8.
If multiple TiKV instances are deployed on a same physical disk, edit the capacity parameter in conf/tikv.yml:
```
raftstore:
  capacity: 0
```
Note
Recommended configuration: capacity = total disk capacity / the number of TiKV instances. For example, capacity: "100GB".

Step 10: Edit variables in the `inventory.ini` file

This step describes how to edit the variable of deployment directory and other variables in the inventory.ini file.

Configure the deployment directory

Edit the deploy_dir variable to configure the deployment directory.

The global variable is set to /home/tidb/deploy by default, and it applies to all services. If the data disk is mounted on the /data1 directory, you can set it to /data1/deploy. For example:

## Global variables
[all:vars]
deploy_dir = /data1/deploy

Note

To separately set the deployment directory for a service, you can configure the host variable while configuring the service host list in the inventory.ini file. It is required to add the first column alias, to avoid confusion in scenarios of mixed services deployment.

TiKV1-1 ansible_host=172.16.10.4 deploy_dir=/data1/deploy

Edit other variables (Optional)

To enable the following control variables, use the capitalized True. To disable the following control variables, use the capitalized False.

Variable Name	Description
cluster_name	the name of a cluster, adjustable
tidb_version	the version of TiDB, configured by default in TiDB Ansible branches
process_supervision	the supervision way of processes, systemd by default, supervise optional
timezone	the global default time zone configured when a new TiDB cluster bootstrap is initialized; you can edit it later using the global `time_zone` system variable and the session `time_zone` system variable as described in Time Zone Support; the default value is `Asia/Shanghai` and see the list of time zones for more optional values
enable_firewalld	to enable the firewall, closed by default; to enable it, add the ports in network requirements to the allowlist
enable_ntpd	to monitor the NTP service of the managed node, True by default; do not close it
set_hostname	to edit the hostname of the managed node based on the IP, False by default
enable_binlog	whether to deploy Pump and enable the binlog, False by default, dependent on the Kafka cluster; see the `zookeeper_addrs` variable
zookeeper_addrs	the zookeeper address of the binlog Kafka cluster
enable_slow_query_log	to record the slow query log of TiDB into a single file: ({{ deploy_dir }}/log/tidb_slow_query.log). False by default, to record it into the TiDB log
deploy_without_tidb	the Key-Value mode, deploy only PD, TiKV and the monitoring service, not TiDB; set the IP of the tidb_servers host group to null in the `inventory.ini` file
alertmanager_target	optional: If you have deployed `alertmanager` separately, you can configure this variable using the `alertmanager_host:alertmanager_port` format
grafana_admin_user	the username of Grafana administrator; default `admin`
grafana_admin_password	the password of Grafana administrator account; default `admin`; used to import Dashboard and create the API key using TiDB Ansible; update this variable if you have modified it through Grafana web
collect_log_recent_hours	to collect the log of recent hours; default the recent 2 hours
enable_bandwidth_limit	to set a bandwidth limit when pulling the diagnostic data from the target machines to the Control Machine; used together with the `collect_bandwidth_limit` variable
collect_bandwidth_limit	the limited bandwidth when pulling the diagnostic data from the target machines to the Control Machine; unit: Kbit/s; default 10000, indicating 10Mb/s; for the cluster topology of multiple TiKV instances on each TiKV node, you need to divide the number of the TiKV instances on each TiKV node
prometheus_storage_retention	the retention time of the monitoring data of Prometheus (30 days by default); this is a new configuration in the `group_vars/monitoring_servers.yml` file in 2.1.7, 3.0 and the later tidb-ansible versions

Step 11: Deploy the TiDB cluster

When ansible-playbook runs Playbook, the default concurrent number is 5. If many deployment target machines are deployed, you can add the -f parameter to specify the concurrency, such as ansible-playbook deploy.yml -f 10.

The following example uses tidb as the user who runs the service.

Edit the tidb-ansible/inventory.ini file to make sure ansible_user = tidb.
```
## Connection
# ssh via normal user
ansible_user = tidb
```
Note
Do not configure ansible_user to root, because tidb-ansible limits the user that runs the service to the normal user.
Run the following command and if all servers return tidb, then the SSH mutual trust is successfully configured:
```
ansible -i inventory.ini all -m shell -a 'whoami'
```
Run the following command and if all servers return root, then sudo without password of the tidb user is successfully configured:
```
ansible -i inventory.ini all -m shell -a 'whoami' -b
```
Run the local_prepare.yml playbook and download TiDB binary to the Control Machine.
```
ansible-playbook local_prepare.yml
```
Initialize the system environment and modify the kernel parameters.
```
ansible-playbook bootstrap.yml
```
Deploy the TiDB cluster software.
```
ansible-playbook deploy.yml
```
Note
You can use the Report button on the Grafana Dashboard to generate the PDF file. This function depends on the fontconfig package and English fonts. To use this function, log in to the grafana_servers machine and install it using the following command:
```
>
```
```
sudo yum install fontconfig open-sans-fonts
```
Start the TiDB cluster.
```
ansible-playbook start.yml
```

Test the TiDB cluster

Because TiDB is compatible with MySQL, you must use the MySQL client to connect to TiDB directly. It is recommended to configure load balancing to provide uniform SQL interface.

Connect to the TiDB cluster using the MySQL client.
```
mysql -u root -h 172.16.10.1 -P 4000
```
Note
The default port of TiDB service is 4000.
Access the monitoring platform using a web browser.
- Address: http://172.16.10.1:3000
- Default account and password: admin; admin

Deployment FAQs

This section lists the common questions about deploying TiDB using TiDB Ansible.

How to customize the port?

Edit the inventory.ini file and add the following host variable after the IP of the corresponding service:

Component	Variable Port	Default Port	Description
TiDB	tidb_port	4000	the communication port for the application and DBA tools
TiDB	tidb_status_port	10080	the communication port to report TiDB status
TiKV	tikv_port	20160	the TiKV communication port
TiKV	tikv_status_port	20180	the communication port to report the TiKV status
PD	pd_client_port	2379	the communication port between TiDB and PD
PD	pd_peer_port	2380	the inter-node communication port within the PD cluster
Pump	pump_port	8250	the pump communication port
Prometheus	prometheus_port	9090	the communication port for the Prometheus service
Pushgateway	pushgateway_port	9091	the aggregation and report port for TiDB, TiKV, and PD monitor
Node_exporter	node_exporter_port	9100	the communication port to report the system information of every TiDB cluster node
Grafana	grafana_port	3000	the port for the external Web monitoring service and client (Browser) access
Grafana	grafana_collector_port	8686	the grafana_collector communication port, used to export Dashboard as the PDF format
Kafka_exporter	kafka_exporter_port	9308	the communication port for Kafka_exporter, used to monitor the binlog Kafka cluster

How to customize the deployment directory?

Edit the inventory.ini file and add the following host variable after the IP of the corresponding service:

Component	Variable Directory	Default Directory	Description
Global	deploy_dir	/home/tidb/deploy	the deployment directory
TiDB	tidb_log_dir	{{ deploy_dir }}/log	the TiDB log directory
TiKV	tikv_log_dir	{{ deploy_dir }}/log	the TiKV log directory
TiKV	tikv_data_dir	{{ deploy_dir }}/data	the data directory
TiKV	wal_dir	""	the rocksdb write-ahead log directory, consistent with the TiKV data directory when the value is null
TiKV	raftdb_path	""	the raftdb directory, being tikv_data_dir/raft when the value is null
PD	pd_log_dir	{{ deploy_dir }}/log	the PD log directory
PD	pd_data_dir	{{ deploy_dir }}/data.pd	the PD data directory
Pump	pump_log_dir	{{ deploy_dir }}/log	the Pump log directory
Pump	pump_data_dir	{{ deploy_dir }}/data.pump	the Pump data directory
Prometheus	prometheus_log_dir	{{ deploy_dir }}/log	the Prometheus log directory
Prometheus	prometheus_data_dir	{{ deploy_dir }}/data.metrics	the Prometheus data directory
Pushgateway	pushgateway_log_dir	{{ deploy_dir }}/log	the pushgateway log directory
Node_exporter	node_exporter_log_dir	{{ deploy_dir }}/log	the node_exporter log directory
Grafana	grafana_log_dir	{{ deploy_dir }}/log	the Grafana log directory
Grafana	grafana_data_dir	{{ deploy_dir }}/data.grafana	the Grafana data directory

How to check whether the NTP service is normal?

Run the following command. If it returns running, then the NTP service is running:

sudo systemctl status ntpd.service

ntpd.service - Network Time Service
Loaded: loaded (/usr/lib/systemd/system/ntpd.service; disabled; vendor preset: disabled)
Active: active (running) since 一 2017-12-18 13:13:19 CST; 3s ago

Run the ntpstat command. If it returns synchronised to NTP server (synchronizing with the NTP server), then the synchronization process is normal.
```
ntpstat
```
```
synchronised to NTP server (85.199.214.101) at stratum 2
time correct to within 91 ms
polling server every 1024 s
```

Note

For the Ubuntu system, you need to install the ntpstat package.

The following result indicates the NTP service is not synchronizing normally:
```
ntpstat
```
```
unsynchronised
```
The following result indicates the NTP service is not running normally:
```
ntpstat
```
```
Unable to talk to NTP daemon. Is it running?
```
To make the NTP service start synchronizing as soon as possible, run the following command. You can replace pool.ntp.org with other NTP servers.
```
sudo systemctl stop ntpd.service && \
sudo ntpdate pool.ntp.org && \
sudo systemctl start ntpd.service
```

To install the NTP service manually on the CentOS 7 system, run the following command:

sudo yum install ntp ntpdate && \
sudo systemctl start ntpd.service && \
sudo systemctl enable ntpd.service

How to modify the supervision method of a process from `supervise` to `systemd`?

Run the following command:

process supervision, [systemd, supervise]

process_supervision = systemd

For versions earlier than TiDB 1.0.4, the process supervision method of TiDB Ansible is supervise by default. The previously installed cluster can remain the same. If you need to change the supervision method to systemd, stop the cluster and run the following command:

ansible-playbook stop.yml && \
ansible-playbook deploy.yml -D && \
ansible-playbook start.yml

How to manually configure the SSH mutual trust and sudo without password?

Log in to the deployment target machine respectively using the root user account, create the tidb user and set the login password.
```
useradd tidb && \
passwd tidb
```
To configure sudo without password, run the following command, and add tidb ALL=(ALL) NOPASSWD: ALL to the end of the file:
```
visudo
```
```
tidb ALL=(ALL) NOPASSWD: ALL
```
Use the tidb user to log in to the Control Machine, and run the following command. Replace 172.16.10.61 with the IP of your deployment target machine, and enter the tidb user password of the deployment target machine as prompted. Successful execution indicates that SSH mutual trust is created. This applies to other machines as well.
```
ssh-copy-id -i ~/.ssh/id_rsa.pub 172.16.10.61
```
Log in to the Control Machine using the tidb user account, and log in to the IP of the target machine using SSH. If you do not need to enter the password and can successfully log in, then the SSH mutual trust is successfully configured.
```
ssh 172.16.10.61
```
```
[tidb@172.16.10.61 ~]$
```
After you log in to the deployment target machine using the tidb user, run the following command. If you do not need to enter the password and can switch to the root user, then sudo without password of the tidb user is successfully configured.
```
sudo -su root
```
```
[root@172.16.10.61 tidb]#
```

Error: You need to install jmespath prior to running json_query filter

See Install TiDB Ansible and its dependencies on the Control Machine and use pip to install Ansible and the corresponding dependencies in the Control Machine. The jmespath dependent package is installed by default.
Run the following command to check whether jmespath is successfully installed:
```
pip show jmespath
```
```
Name: jmespath
Version: 0.9.0
```
Enter import jmespath in the Python interactive window of the Control Machine.
- If no error displays, the dependency is successfully installed.
- If the ImportError: No module named jmespath error displays, the Python jmespath module is not successfully installed.
```
python
```
```
Python 2.7.5 (default, Nov  6 2016, 00:28:07)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-11)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
```
```
import jmespath
```

The `zk: node does not exist` error when starting Pump/Drainer

Check whether the zookeeper_addrs configuration in inventory.ini is the same with the configuration in the Kafka cluster, and whether the namespace is filled in. The description about namespace configuration is as follows:

# ZooKeeper connection string (see ZooKeeper docs for details).
# ZooKeeper address of the Kafka cluster. Example:
# zookeeper_addrs = "192.168.0.11:2181,192.168.0.12:2181,192.168.0.13:2181"
# You can also append an optional chroot string to the URLs to specify the root directory for all Kafka znodes. Example:
# zookeeper_addrs = "192.168.0.11:2181,192.168.0.12:2181,192.168.0.13:2181/kafka/123"

Deploy TiDB Using TiDB Ansible

Overview

Prepare

Step 1: Install system dependencies on the Control Machine

Step 2: Create the tidb user on the Control Machine and generate the SSH key

Step 3: Download TiDB Ansible to the Control Machine

Step 4: Install TiDB Ansible and its dependencies on the Control Machine

Step 5: Configure the SSH mutual trust and sudo rules on the Control Machine

Step 6: Install the NTP service on the target machines

Step 7: Configure the CPUfreq governor mode on the target machine

Check the governor modes supported by the system

Check the current governor mode

Change the governor mode

Step 8: Mount the data disk ext4 filesystem with options on the target machines

Step 9: Edit the inventory.ini file to orchestrate the TiDB cluster

Option 1: Use the cluster topology of a single TiKV instance on each TiKV node

Option 2: Use the cluster topology of multiple TiKV instances on each TiKV node

Step 10: Edit variables in the inventory.ini file

Configure the deployment directory

Edit other variables (Optional)

Step 11: Deploy the TiDB cluster

Test the TiDB cluster

Deployment FAQs

How to customize the port?

How to customize the deployment directory?

How to check whether the NTP service is normal?

How to modify the supervision method of a process from supervise to systemd?

How to manually configure the SSH mutual trust and sudo without password?

Error: You need to install jmespath prior to running json_query filter

The zk: node does not exist error when starting Pump/Drainer

Step 2: Create the `tidb` user on the Control Machine and generate the SSH key

Step 9: Edit the `inventory.ini` file to orchestrate the TiDB cluster

Step 10: Edit variables in the `inventory.ini` file

How to modify the supervision method of a process from `supervise` to `systemd`?

The `zk: node does not exist` error when starting Pump/Drainer