Geo-Distributed Deployment Topology

This document uses a typical architecture of three data centers (DC) in two cities as an example to introduce the geo-distributed deployment architecture and its key configuration. The two cities in this example are Shanghai, with one data center (referred to as sha), and Beijing, with two data centers (referred to as bja and bjb).

Topology information

Instance              Count  Physical machine configuration    BJ IP      SH IP      Configuration
TiDB                  5      16 VCore 32GB * 1                 10.0.1.1   10.0.1.5   Default port
                                                               10.0.1.2              Global directory configuration
                                                               10.0.1.3
                                                               10.0.1.4
PD                    5      4 VCore 8GB * 1                   10.0.1.6   10.0.1.10  Default port
                                                               10.0.1.7              Global directory configuration
                                                               10.0.1.8
                                                               10.0.1.9
TiKV                  5      16 VCore 32GB 2TB (nvme ssd) * 1  10.0.1.11  10.0.1.15  Default port
                                                               10.0.1.12             Global directory configuration
                                                               10.0.1.13
                                                               10.0.1.14
Monitoring & Grafana  1      4 VCore 8GB * 1 500GB (ssd)       10.0.1.16             Default port
                                                                                     Global directory configuration

Topology templates

For detailed descriptions of the configuration items in a TiDB cluster topology file, see Topology Configuration File for Deploying TiDB Using TiUP.
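
As a minimal sketch of how the hosts listed in the topology information could be laid out in a TiUP topology file: the section names follow the TiUP topology format, while the user name and the directory paths under global are assumed values that do not come from this document.

```yaml
# Sketch of a TiUP topology file for the hosts in this example.
# The values under `global` are assumptions; adjust them to your environment.
global:
  user: "tidb"
  ssh_port: 22
  deploy_dir: "/tidb-deploy"
  data_dir: "/tidb-data"

pd_servers:
  - host: 10.0.1.6
  - host: 10.0.1.7
  - host: 10.0.1.8
  - host: 10.0.1.9
  - host: 10.0.1.10

tidb_servers:
  - host: 10.0.1.1
  - host: 10.0.1.2
  - host: 10.0.1.3
  - host: 10.0.1.4
  - host: 10.0.1.5

tikv_servers:
  - host: 10.0.1.11
  - host: 10.0.1.12
  - host: 10.0.1.13
  - host: 10.0.1.14
  - host: 10.0.1.15

# Monitoring (Prometheus) and Grafana share one machine in this example.
monitoring_servers:
  - host: 10.0.1.16

grafana_servers:
  - host: 10.0.1.16
```

Ports are omitted so that each component uses its default port, as the table indicates.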

Key parameters

This section describes the key parameter configuration of the TiDB geo-distributed deployment.

TiKV parameters

  • The gRPC compression format (none by default):

    To speed up the transmission of gRPC messages between geo-distributed nodes, set this parameter to gzip.

    server.grpc-compression-type: gzip
  • The label configuration:

    Because TiKV is deployed across different data centers, a failure of physical machines might cause a Raft Group to lose three of its five replicas. The Raft Group then no longer has a majority, and the cluster becomes unavailable. To address this issue, configure labels so that PD can schedule replicas by topology and avoid placing three replicas of the same Raft Group on TiKV instances on the same machine, in the same cabinet, or in the same data center.

  • The TiKV configuration:

    TiKV instances on the same physical machine are configured with the same host-level label.

    config:
      server.labels:
        zone: bj
        dc: bja
        rack: rack1
        host: host2
  • To prevent the remote TiKV nodes from launching unnecessary Raft elections, increase the minimum and maximum number of ticks that must elapse before a remote TiKV node launches an election. Both parameters are set to 0 by default.

    raftstore.raft-min-election-timeout-ticks: 1000
    raftstore.raft-max-election-timeout-ticks: 1020
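
Taken together, the TiKV settings above might appear in a TiUP topology file as sketched below. Several details are assumptions made for illustration: the mapping of hosts to data centers (two instances each in bja and bjb, one in sha), every label value other than the host2 example, and the choice to apply gRPC compression to all TiKV instances through server_configs.

```yaml
server_configs:
  tikv:
    # Compress gRPC traffic; applied here to every TiKV instance.
    server.grpc-compression-type: gzip

tikv_servers:
  - host: 10.0.1.11
    config:
      server.labels:
        zone: bj
        dc: bja
        rack: rack1
        host: host1
  - host: 10.0.1.12
    config:
      server.labels:
        zone: bj
        dc: bja
        rack: rack1
        host: host2
  - host: 10.0.1.13
    config:
      server.labels:
        zone: bj
        dc: bjb
        rack: rack1
        host: host1
  - host: 10.0.1.14
    config:
      server.labels:
        zone: bj
        dc: bjb
        rack: rack1
        host: host2
  # Assumed to be the remote node in Shanghai.
  - host: 10.0.1.15
    config:
      server.labels:
        zone: sh
        dc: sha
        rack: rack1
        host: host1
      # Longer election timeouts apply only to the remote node.
      raftstore.raft-min-election-timeout-ticks: 1000
      raftstore.raft-max-election-timeout-ticks: 1020
```

With this layout, no data center holds more than two of the five replicas, so losing any single data center still leaves a majority of three.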

PD parameters

  • The PD metadata information records the topology of the TiKV cluster. PD schedules the Raft Group replicas on the following four dimensions:

    replication.location-labels: ["zone","dc","rack","host"]
  • To ensure high availability of the cluster, set the number of Raft Group replicas to 5:

    replication.max-replicas: 5
  • Prevent Raft replicas on the remote TiKV node from being elected as Leader:

    label-property:
      reject-leader:
        - key: "dc"
          value: "sha"
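
In a TiUP topology file, these PD settings would typically sit together under server_configs so that they apply to every PD instance. A minimal sketch, assuming that layout:

```yaml
server_configs:
  pd:
    # Label hierarchy used for replica isolation, from widest to narrowest.
    replication.location-labels: ["zone", "dc", "rack", "host"]
    # Five replicas per Raft Group instead of the PD default of three.
    replication.max-replicas: 5
    # Stores labeled dc=sha are not allowed to hold Raft leaders.
    label-property:
      reject-leader:
        - key: "dc"
          value: "sha"
```

The order of location-labels matters: it must match the label keys set in server.labels on each TiKV instance.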

For further information about labels and the number of Raft Group replicas, see Schedule Replicas by Topology Labels.