Quick Start Guide on Integrating TiDB with Confluent Platform
This document describes how to integrate TiDB with Confluent Platform using TiCDC.
Confluent Platform is a data streaming platform with Apache Kafka at its core. With many official and third-party sink connectors, Confluent Platform enables you to easily connect stream sources to relational or non-relational databases.
To integrate TiDB with Confluent Platform, you can use the TiCDC component with the Avro protocol. TiCDC can stream data changes to Kafka in the format that Confluent Platform recognizes. For the detailed integration guide, see the following sections:
Prerequisites
Make sure that ZooKeeper, Kafka, and Schema Registry are properly installed. It is recommended that you follow the Confluent Platform Quick Start Guide to deploy a local test environment.
Make sure that the JDBC sink connector is installed by running the following command. The result should contain jdbc-sink.
    confluent local services connect connector list
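The check above can also be scripted. The following is a minimal sketch, assuming the command prints the installed connector names as plain text; `has_jdbc_sink` is a hypothetical helper name, not part of the Confluent CLI.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit status 0) when the text on stdin
# mentions the JDBC sink connector, and fails otherwise.
has_jdbc_sink() {
  grep -q 'jdbc-sink'
}
```

For example, `confluent local services connect connector list | has_jdbc_sink && echo "JDBC sink connector found"` prints the message only when the connector is listed.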
Integration procedures
Save the following configuration into jdbc-sink-connector.json:
    {
      "name": "jdbc-sink-connector",
      "config": {
        "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
        "tasks.max": "1",
        "topics": "testdb_test",
        "connection.url": "jdbc:sqlite:test.db",
        "connection.ds.pool.size": 5,
        "table.name.format": "test",
        "auto.create": true,
        "auto.evolve": true
      }
    }
Create an instance of the JDBC sink connector by running the following command (assuming Kafka Connect is listening on 127.0.0.1:8083):
    curl -X POST -H "Content-Type: application/json" -d @jdbc-sink-connector.json http://127.0.0.1:8083/connectors
Deploy TiCDC in one of the following ways. If TiCDC is already deployed, you can skip this step.
- Deploy a new TiDB cluster that includes TiCDC using TiUP
- Add TiCDC to an existing TiDB cluster using TiUP
- Add TiCDC to an existing TiDB cluster using binary (not recommended)
Make sure that your TiDB and TiCDC clusters are healthy before proceeding.
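The JDBC sink connector created earlier can be checked in the same spirit. Kafka Connect exposes a status endpoint at /connectors/<name>/status in its REST API. The sketch below extracts the connector state from such a response. `connector_state` is a hypothetical helper, and its text-based parsing assumes the connector-level state is the first "state" field in the response, so a proper JSON tool is preferable outside of a quick test.

```shell
#!/bin/sh
# Hypothetical helper: read a Kafka Connect status document on stdin and
# print the value of the first "state" field (assumed to be the connector's).
connector_state() {
  grep -o '"state" *: *"[A-Z_]*"' | head -n 1 | sed 's/.*"\([A-Z_]*\)"$/\1/'
}
```

With the addresses used in this guide, `curl -s http://127.0.0.1:8083/connectors/jdbc-sink-connector/status | connector_state` is expected to print RUNNING when the connector is healthy.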
Create a changefeed by running the cdc cli command:
    ./cdc cli changefeed create --pd="http://127.0.0.1:2379" --sink-uri="kafka://127.0.0.1:9092/testdb_test?protocol=avro" --opts "registry=http://127.0.0.1:8081"
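The sink URI in the command above has the form kafka://<broker host>:<port>/<topic>?protocol=avro, and its topic (testdb_test here) is the same one that the JDBC sink connector subscribes to through its topics setting. As a small sketch, assuming you prefer to derive the URI from variables rather than hard-code it, a hypothetical helper `build_sink_uri` could assemble it:

```shell
#!/bin/sh
# Hypothetical helper: print a TiCDC Kafka sink URI that uses the Avro protocol.
# Arguments: broker host, broker port, Kafka topic.
build_sink_uri() {
  printf 'kafka://%s:%s/%s?protocol=avro\n' "$1" "$2" "$3"
}
```

Passing `--sink-uri="$(build_sink_uri 127.0.0.1 9092 testdb_test)"` reproduces the URI used in this guide. Afterwards, `./cdc cli changefeed list --pd="http://127.0.0.1:2379"` should include the newly created changefeed.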
Test data replication
After TiDB is integrated with Confluent Platform, you can follow the example procedures below to test the data replication.
Create the testdb database in your TiDB cluster:
    CREATE DATABASE IF NOT EXISTS testdb;
Create the test table in testdb:
    USE testdb;
    CREATE TABLE test (
        id INT PRIMARY KEY,
        v TEXT
    );
Insert data into TiDB:
    INSERT INTO test (id, v) VALUES (1, 'a');
    INSERT INTO test (id, v) VALUES (2, 'b');
    INSERT INTO test (id, v) VALUES (3, 'c');
    INSERT INTO test (id, v) VALUES (4, 'd');
Wait a moment for data to be replicated to the downstream. Then check the downstream for data:
    sqlite3 test.db
    sqlite> SELECT * FROM test;
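The final check can also be run non-interactively. The sketch below compares query output with the four rows inserted above, assuming sqlite3's default output format (columns separated by |) and that the v values arrive in SQLite as plain text. `rows_match` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when stdin holds exactly the four test rows
# in sqlite3's default "id|v" list format, ordered by id.
rows_match() {
  expected='1|a
2|b
3|c
4|d'
  actual=$(cat)
  [ "$actual" = "$expected" ]
}
```

For example, `sqlite3 test.db 'SELECT * FROM test ORDER BY id;' | rows_match && echo "replicated"` prints the message once all four rows have reached the downstream table.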