TiDB Binlog Cluster Operations

This document introduces the following TiDB Binlog cluster operations:

  • The states of Pump and Drainer nodes
  • Starting or exiting a Pump or Drainer process
  • Managing the TiDB Binlog cluster by using the binlogctl tool or by directly performing SQL operations in TiDB

Pump or Drainer state

A Pump or Drainer node can be in one of the following states:

  • online: running normally
  • pausing: in the process of pausing
  • paused: has been stopped
  • closing: in the process of going offline
  • offline: has gone offline
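
The five states can be written down as a small enumeration. This is only an illustrative sketch: the names NodeState, parse_state, and is_transitional are not part of TiDB Binlog. Case is ignored when parsing because this document shows the state both as "online" (binlogctl output) and "Online" (SQL output).

```python
from enum import Enum


class NodeState(Enum):
    """States a Pump or Drainer node can report (names are illustrative)."""

    ONLINE = "online"    # running normally
    PAUSING = "pausing"  # in the process of pausing
    PAUSED = "paused"    # has been stopped
    CLOSING = "closing"  # in the process of going offline
    OFFLINE = "offline"  # has gone offline


def parse_state(text: str) -> NodeState:
    """Map a reported state string to a NodeState, ignoring case."""
    return NodeState(text.strip().lower())


def is_transitional(state: NodeState) -> bool:
    """Return True while the node is between two stable states."""
    return state in (NodeState.PAUSING, NodeState.CLOSING)
```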

Starting and exiting a Pump or Drainer process

Pump

  • Starting: When started, the Pump node notifies all Drainer nodes in the online state. If the notification is successful, the Pump node sets its state to online. Otherwise, the Pump node reports an error, sets its state to paused and exits the process.
  • Exiting: The Pump node enters the paused or offline state before the process exits normally. If the process exits abnormally (for example, because of kill -9, a process panic, or a crash), the node remains in the online state.
    • Pause: You can pause a Pump process by using the kill command (not kill -9), pressing Ctrl+C, or using the pause-pump command in the binlogctl tool. After receiving the pause instruction, the Pump node sets its state to pausing, stops receiving binlog write requests, and stops providing binlog data to Drainer nodes. After all threads have exited safely, the Pump node updates its state to paused and exits the process.
    • Offline: You can close a Pump process only by using the offline-pump command in the binlogctl tool. After receiving the offline instruction, the Pump node sets its state to closing and stops receiving binlog write requests. The Pump node continues providing binlog data to Drainer nodes until they have consumed all of it. Then, the Pump node sets its state to offline and exits the process.
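
The Pump pause and offline procedures described above can be summarized as a small transition table. This is a simplified model for reasoning about the states, not Pump's actual implementation; the event names ("pause", "threads_exited", "offline", "binlog_consumed") are hypothetical labels chosen for this sketch.

```python
# Pump state transitions read off the description above.
# Keys are (current state, event); values are the next state.
# The event names are hypothetical labels, not TiDB Binlog identifiers.
PUMP_TRANSITIONS = {
    ("online", "pause"): "pausing",
    ("pausing", "threads_exited"): "paused",
    ("online", "offline"): "closing",
    ("closing", "binlog_consumed"): "offline",
}


def next_state(state: str, event: str) -> str:
    """Return the state that follows `state` when `event` happens."""
    try:
        return PUMP_TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"no transition from {state!r} on {event!r}") from None


def accepts_writes(state: str) -> bool:
    """Only an online Pump keeps receiving binlog write requests."""
    return state == "online"


def serves_drainers(state: str) -> bool:
    """A closing Pump still provides binlog data; a pausing one does not."""
    return state in ("online", "closing")
```

The model makes the difference between the two procedures visible: in the closing state the Pump still serves Drainer nodes, whereas in the pausing state it stops both accepting writes and serving data.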

Drainer

  • Starting: When started, the Drainer node sets its state to online and tries to pull binlogs from all Pump nodes that are not in the offline state. If it fails to get the binlogs, it keeps retrying.
  • Exiting: The Drainer node enters the paused or offline state before the process exits normally. If the process exits abnormally (for example, because of kill -9, a process panic, or a crash), the Drainer node remains in the online state.
    • Pause: You can pause a Drainer process by using the kill command (not kill -9), pressing Ctrl+C, or using the pause-drainer command in the binlogctl tool. After receiving the pause instruction, the Drainer node sets its state to pausing and stops pulling binlogs from Pump nodes. After all threads have exited safely, the Drainer node sets its state to paused and exits the process.
    • Offline: You can close a Drainer process only by using the offline-drainer command in the binlogctl tool. After receiving the offline instruction, the Drainer node sets its state to closing and stops pulling binlogs from Pump nodes. After all threads have exited safely, the Drainer node updates its state to offline and exits the process.
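
The Drainer startup behavior described above (pull from every Pump that is not offline, and keep retrying on failure) can be sketched as follows. This is a conceptual illustration only: the function names, the fixed retry delay, and the use of OSError to signal a failed pull are assumptions of this sketch, not Drainer's real code.

```python
import time
from typing import Callable, Dict, List


def pumps_to_pull_from(pump_states: Dict[str, str]) -> List[str]:
    """Return the IDs of all Pump nodes that are not in the offline state."""
    return sorted(
        node_id for node_id, state in pump_states.items() if state != "offline"
    )


def pull_with_retry(
    pull: Callable[[], bytes],
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bytes:
    """Call pull() until it succeeds, waiting between failed attempts.

    The delay and the OSError failure signal are assumptions of this sketch.
    """
    while True:
        try:
            return pull()
        except OSError:
            sleep(delay_seconds)
```

Note that a paused Pump is still included in the pull list, because only offline nodes are excluded.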

For how to pause, close, check, and modify the state of Pump or Drainer, see the binlogctl guide below.

binlogctl guide

binlogctl is an operations tool for TiDB Binlog with the following features:

  • Checking the state of Pump or Drainer
  • Pausing or closing Pump or Drainer
  • Handling the abnormal state of Pump or Drainer

Usage scenarios of binlogctl

  • An error occurs during data replication and you need to check the running status and state of Pump or Drainer.
  • While maintaining the cluster, you need to pause or close Pump or Drainer.
  • A Pump or Drainer process exits abnormally, and the node state is either not updated or not what you expect, which affects the application.

Download binlogctl

Your distribution of TiDB or TiDB Binlog might already include binlogctl. If not, download binlogctl:

wget https://download.pingcap.org/tidb-{version}-linux-amd64.tar.gz && \
wget https://download.pingcap.org/tidb-{version}-linux-amd64.sha256

The following command checks the file integrity. If the result is OK, the file is correct.

sha256sum -c tidb-{version}-linux-amd64.sha256
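
On a machine without sha256sum, a comparable check can be done with Python's standard library. This is a minimal sketch, assuming the .sha256 file uses the usual "<hex digest>  <file name>" line format that sha256sum -c reads; the function names are illustrative.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(checksum_file: Path) -> bool:
    """Check every '<hex digest>  <file name>' line in checksum_file.

    File names are resolved relative to the checksum file's directory.
    """
    ok = True
    for line in checksum_file.read_text().splitlines():
        if not line.strip():
            continue
        expected, name = line.split(maxsplit=1)
        name = name.strip().lstrip("*")  # "*" marks binary mode in sha256sum
        target = checksum_file.parent / name
        ok = ok and sha256_of(target) == expected.lower()
    return ok
```

A True result corresponds to sha256sum reporting OK for every listed file.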

For TiDB v2.1.0 GA or later versions, binlogctl is already included in the TiDB download package. For earlier versions, you need to download binlogctl separately.

wget https://download.pingcap.org/tidb-enterprise-tools-latest-linux-amd64.tar.gz && \
wget https://download.pingcap.org/tidb-enterprise-tools-latest-linux-amd64.sha256

The following command checks the file integrity. If the result is OK, the file is correct.

sha256sum -c tidb-enterprise-tools-latest-linux-amd64.sha256

binlogctl usage description

Command line parameters:

Usage of binlogctl:
  -V
        Outputs the binlogctl version information
  -cmd string
        the command mode, including "generate_meta" (deprecated), "pumps", "drainers", "update-pump", "update-drainer", "pause-pump", "pause-drainer", "offline-pump", and "offline-drainer"
  -data-dir string
        the file path where the checkpoint file of Drainer is stored ("binlog_position" by default) (deprecated)
  -node-id string
        ID of Pump or Drainer
  -pd-urls string
        the address of PD. If multiple addresses exist, use "," to separate each ("http://127.0.0.1:2379" by default)
  -ssl-ca string
        the file path of SSL CAs
  -ssl-cert string
        the file path of the X509 certificate file in the PEM format
  -ssl-key string
        the file path of X509 key file of the PEM format
  -time-zone string
        If a time zone is set, the corresponding time of the obtained `tso` is printed in the "generate_meta" mode. For example, "Asia/Shanghai" is the CST time zone and "Local" is the local time zone
  -show-offline-nodes
        used with the `-cmd pumps` or `-cmd drainers` command. The two commands do not show the offline node by default unless this parameter is explicitly specified
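
When binlogctl is called from a script, it can help to assemble and validate the argument list in one place. The helper below is a hypothetical sketch, not part of binlogctl. It uses only flags that appear in this document (-pd-urls, -cmd, -node-id, -show-offline-nodes, and the -state flag used in the update-pump example), leaves out the deprecated generate_meta mode, and assumes the binary lives at bin/binlogctl.

```python
from typing import List, Optional, Sequence

QUERY_COMMANDS = {"pumps", "drainers"}
NODE_COMMANDS = {"pause-pump", "pause-drainer", "offline-pump", "offline-drainer"}
UPDATE_COMMANDS = {"update-pump", "update-drainer"}


def binlogctl_args(
    cmd: str,
    pd_urls: Sequence[str] = ("http://127.0.0.1:2379",),
    node_id: Optional[str] = None,
    state: Optional[str] = None,
    show_offline_nodes: bool = False,
    binary: str = "bin/binlogctl",  # assumed install path
) -> List[str]:
    """Build an argument list for binlogctl (hypothetical helper)."""
    if cmd not in QUERY_COMMANDS | NODE_COMMANDS | UPDATE_COMMANDS:
        raise ValueError(f"unsupported -cmd value: {cmd}")
    # Multiple PD addresses are separated with ",".
    args = [binary, "-pd-urls=" + ",".join(pd_urls), "-cmd", cmd]
    if cmd in QUERY_COMMANDS:
        if show_offline_nodes:
            args.append("-show-offline-nodes")
        return args
    if not node_id:
        raise ValueError(f"{cmd} requires a node ID")
    args += ["-node-id", node_id]
    if cmd in UPDATE_COMMANDS:
        if state not in ("paused", "offline"):
            raise ValueError("state must be 'paused' or 'offline'")
        args += ["-state", state]
    return args
```

The resulting list can be passed to subprocess.run as-is, which avoids shell quoting issues with node IDs.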

Command example:

  • Check the state of all the Pump or Drainer nodes:

    Set cmd to pumps or drainers to check the state of all the Pump or Drainer nodes. For example:

    bin/binlogctl -pd-urls=http://127.0.0.1:2379 -cmd pumps
    [2019/04/28 09:29:59.016 +00:00] [INFO] [nodes.go:48] ["query node"] [type=pump] [node="{NodeID: 1.1.1.1:8250, Addr: pump:8250, State: online, MaxCommitTS: 408012403141509121, UpdateTime: 2019-04-28 09:29:57 +0000 UTC}"]
    bin/binlogctl -pd-urls=http://127.0.0.1:2379 -cmd drainers
    [2019/04/28 09:29:59.016 +00:00] [INFO] [nodes.go:48] ["query node"] [type=drainer] [node="{NodeID: 1.1.1.1:8249, Addr: 1.1.1.1:8249, State: online, MaxCommitTS: 408012403141509121, UpdateTime: 2019-04-28 09:29:57 +0000 UTC}"]
  • Pause or close Pump or Drainer:

    binlogctl provides the following commands to pause or close services:

    cmd             | Description   | Example
    ----------------|---------------|--------
    pause-pump      | Pause Pump    | bin/binlogctl -pd-urls=http://127.0.0.1:2379 -cmd pause-pump -node-id ip-127-0-0-1:8250
    pause-drainer   | Pause Drainer | bin/binlogctl -pd-urls=http://127.0.0.1:2379 -cmd pause-drainer -node-id ip-127-0-0-1:8249
    offline-pump    | Close Pump    | bin/binlogctl -pd-urls=http://127.0.0.1:2379 -cmd offline-pump -node-id ip-127-0-0-1:8250
    offline-drainer | Close Drainer | bin/binlogctl -pd-urls=http://127.0.0.1:2379 -cmd offline-drainer -node-id ip-127-0-0-1:8249

    binlogctl sends an HTTP request to the Pump or Drainer node. After receiving the request, the node runs the corresponding exit procedure.

  • Modify the state of a Pump or Drainer node in abnormal situations:

    When a Pump or Drainer node runs normally, or when it is paused or closed through the normal procedure, its recorded state is correct. In some abnormal situations, however, the node cannot maintain its state correctly, which can affect data replication tasks. In these situations, use the binlogctl tool to repair the state information.

    Set cmd to update-pump or update-drainer to update the state of a Pump or Drainer node. The state can be paused or offline. For example:

    bin/binlogctl -pd-urls=http://127.0.0.1:2379 -cmd update-pump -node-id ip-127-0-0-1:8250 -state paused
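
To act on the output of -cmd pumps or -cmd drainers in a script, for example to find nodes that still report online after their process has died, the "query node" log lines can be parsed into dictionaries. This is a rough sketch that assumes the output looks exactly like the sample lines shown above; other binlogctl versions may format it differently, and the function name is illustrative.

```python
import re
from typing import Dict, Optional

# Matches the tail of a "query node" log line, for example:
#   [type=pump] [node="{NodeID: ..., Addr: ..., State: online, ...}"]
NODE_LINE = re.compile(r'\[type=(?P<type>\w+)\] \[node="\{(?P<fields>[^}]*)\}"\]')


def parse_node_line(line: str) -> Optional[Dict[str, str]]:
    """Extract the node type and its fields from one binlogctl log line.

    Returns None if the line does not describe a node.
    """
    match = NODE_LINE.search(line)
    if match is None:
        return None
    node = {"type": match.group("type")}
    for field in match.group("fields").split(", "):
        # Split on the first ": " only, so values such as "1.1.1.1:8250"
        # or timestamps keep their own colons.
        key, _, value = field.partition(": ")
        node[key] = value
    return node
```

All values are kept as strings; convert MaxCommitTS to an integer only if you need to compare timestamps.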

Use SQL statements to manage Pump or Drainer

To view or modify binlog-related states, execute the corresponding SQL statements in TiDB.

  • Check whether binlog is enabled:

    show variables like "log_bin";
    +---------------+-------+
    | Variable_name | Value |
    +---------------+-------+
    | log_bin       | ON    |
    +---------------+-------+

    If Value is ON, binlog is enabled.

  • Check the status of all the Pump or Drainer nodes:

    show pump status;
    +--------|----------------|--------|--------------------|---------------------|
    | NodeID | Address        | State  | Max_Commit_Ts      | Update_Time         |
    +--------|----------------|--------|--------------------|---------------------|
    | pump1  | 127.0.0.1:8250 | Online | 408553768673342237 | 2019-05-01 00:00:01 |
    +--------|----------------|--------|--------------------|---------------------|
    | pump2  | 127.0.0.2:8250 | Online | 408553768673342335 | 2019-05-01 00:00:02 |
    +--------|----------------|--------|--------------------|---------------------|
    show drainer status;
    +----------|----------------|--------|--------------------|---------------------|
    | NodeID   | Address        | State  | Max_Commit_Ts      | Update_Time         |
    +----------|----------------|--------|--------------------|---------------------|
    | drainer1 | 127.0.0.3:8249 | Online | 408553768673342532 | 2019-05-01 00:00:03 |
    +----------|----------------|--------|--------------------|---------------------|
    | drainer2 | 127.0.0.4:8249 | Online | 408553768673345531 | 2019-05-01 00:00:04 |
    +----------|----------------|--------|--------------------|---------------------|
  • Modify the state of a Pump or Drainer node in abnormal situations:

    change pump to node_state ='paused' for node_id 'pump1';
    Query OK, 0 rows affected (0.01 sec)
    change drainer to node_state ='paused' for node_id 'drainer1';
    Query OK, 0 rows affected (0.01 sec)

    Executing the above SQL statements has the same effect as running the update-pump or update-drainer commands in binlogctl. Use these SQL statements only when the Pump or Drainer node is in an abnormal state.
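
If these statements are issued from a script, generating them through a small helper reduces the risk of typos in the node type or state. The function below is a hypothetical sketch. It assumes the SQL form accepts the same two target states that this document lists for update-pump and update-drainer (paused and offline), and it rejects node IDs containing quotes or backslashes rather than trying to escape them.

```python
def change_state_sql(node_type: str, node_id: str, state: str) -> str:
    """Build a 'change pump/drainer' statement (hypothetical helper).

    Assumes only 'paused' and 'offline' are valid target states, as stated
    for the binlogctl update commands.
    """
    if node_type not in ("pump", "drainer"):
        raise ValueError("node_type must be 'pump' or 'drainer'")
    if state not in ("paused", "offline"):
        raise ValueError("state must be 'paused' or 'offline'")
    if not node_id or "'" in node_id or "\\" in node_id:
        raise ValueError("node_id must be non-empty and free of quotes and backslashes")
    return f"change {node_type} to node_state ='{state}' for node_id '{node_id}';"
```

The returned string can be sent to TiDB through any MySQL-compatible client.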