Chapter 2. Installing Change Data Capture connectors


Install Change Data Capture connectors through AMQ Streams by extending Kafka Connect with connector plugins. Following a deployment of AMQ Streams, you can deploy Change Data Capture as a connector configuration through Kafka Connect.

2.1. Prerequisites

A Change Data Capture installation requires the following:

  • Red Hat Enterprise Linux version 7.x or 8.x with an x86_64 architecture.
  • Administrative privileges (sudo access).
  • AMQ Streams 1.3 on Red Hat Enterprise Linux is installed on the host machine.
  • Credentials for the kafka user that was created when AMQ Streams was installed.
  • An AMQ Streams cluster is running.

Note

If you have an earlier version of AMQ Streams, you need to upgrade to AMQ Streams 1.3. For upgrade instructions, see AMQ Streams and Kafka Upgrades.

2.2. Kafka topic creation recommendations

Change Data Capture uses multiple Kafka topics for storing data. The topics must be created either by an administrator or by Kafka itself, with topic auto-creation enabled through the auto.create.topics.enable broker configuration.
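To see which of the two approaches applies to a broker, you can inspect its configuration file. The following is a minimal sketch, not part of the product tooling. The helper name auto_create_setting is hypothetical, and the path /opt/kafka/config/server.properties is an assumption about where the broker configuration lives. If the property is not set, Kafka's default value of true applies.

```shell
# Hypothetical helper: print the effective auto.create.topics.enable value
# from a broker properties file. Kafka defaults to "true" when it is unset,
# so that value is also printed when the file is missing or has no entry.
auto_create_setting() (
  # Use the last uncommented assignment, since later entries override earlier ones.
  value=$(grep -E '^[[:space:]]*auto\.create\.topics\.enable[[:space:]]*=' "$1" 2>/dev/null \
            | tail -n 1 | cut -d= -f2 | tr -d '[:space:]') || true
  echo "${value:-true}"
)

# Assumed location of the broker configuration; adjust for your installation.
BROKER_CONFIG="${BROKER_CONFIG:-/opt/kafka/config/server.properties}"
echo "auto.create.topics.enable=$(auto_create_setting "$BROKER_CONFIG")"
```

If the reported value is false, an administrator must create the topics before the connectors start.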

The following list describes limitations and recommendations to consider when creating topics:

  • A replication factor of at least 3 in production
  • Single partition
  • Infinite (or very long) retention if topic compaction is disabled
  • Log compaction enabled, if you want to keep only the last change event for a given record

Warning

Do not enable topic compaction for the database history topics used by the MySQL and SQL Server connectors.
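As an illustration of the recommendations above, the following sketch builds a kafka-topics.sh command for a single-partition topic with a replication factor of 3 and unlimited time-based retention. It assumes that your Kafka version's kafka-topics.sh accepts --bootstrap-server, that a broker listens on localhost:9092, and that Kafka is installed in /opt/kafka. The helper name cdc_topic_command and the topic name are made up for the example. The command is printed rather than run, so you can review it first.

```shell
# Assumed values for illustration; replace them with your own.
KAFKA_HOME="${KAFKA_HOME:-/opt/kafka}"
BOOTSTRAP="${BOOTSTRAP:-localhost:9092}"

# Hypothetical helper: print the kafka-topics.sh command for one topic.
# retention.ms=-1 removes the time limit on retention (compaction stays off).
cdc_topic_command() {
  echo "$KAFKA_HOME/bin/kafka-topics.sh" \
    --bootstrap-server "$BOOTSTRAP" \
    --create --topic "$1" \
    --partitions 1 \
    --replication-factor 3 \
    --config retention.ms=-1
}

# The topic name is a made-up example.
cdc_topic_command "dbserver1.inventory.customers"
```

To create the topic, run the printed command as a user that has access to the cluster.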

If you relax the single partition rule, your application must be able to handle out-of-order events for different rows in the database (events for a single row are still fully ordered). If multiple partitions are used, Kafka determines the partition by hashing the key by default. Other partition strategies require using Single Message Transforms (SMTs) to set the key for each record.

For log compaction, configure the min.compaction.lag.ms and delete.retention.ms topic-level settings in Apache Kafka so that consumers have enough time to receive all events and delete markers. Specifically, these values should be larger than the maximum downtime you anticipate for the sink connectors, such as when you are updating them.
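Both settings are expressed in milliseconds, so it can help to derive them from the downtime you expect. The sketch below assumes, purely for illustration, that sink connectors are never down for more than 24 hours and doubles that figure as a safety margin. The helper name hours_to_ms is hypothetical. It prints --config options that could be passed to kafka-topics.sh when a compacted topic is created.

```shell
# Hypothetical helper: convert a number of hours to milliseconds.
hours_to_ms() {
  echo $(( $1 * 60 * 60 * 1000 ))
}

# Assumption: at most 24 hours of sink connector downtime, doubled to 48 hours.
LAG_MS=$(hours_to_ms 48)

# Topic-level settings for a compacted topic, as --config options.
printf '%s\n' "--config cleanup.policy=compact --config min.compaction.lag.ms=$LAG_MS --config delete.retention.ms=$LAG_MS"
```

Choose the number of hours from your own maintenance windows rather than from this example.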

2.3. Deploying Change Data Capture with AMQ Streams on RHEL

This procedure describes how to set up connectors for Change Data Capture on Red Hat Enterprise Linux. Connectors are deployed to an AMQ Streams cluster using Kafka Connect, a framework for streaming data between Apache Kafka and external systems. Kafka Connect must be run in distributed mode rather than standalone mode.

This procedure assumes that AMQ Streams is installed and ZooKeeper and Kafka are running.

Procedure

  1. Visit Software Downloads on the Red Hat Customer Portal and download the Change Data Capture connector or connectors that you want to use. For example, download the Debezium 1.0.0 (Technical Preview) MySQL Connector to use Change Data Capture with a MySQL database.
  2. In /opt/kafka, create the connector-plugins directory if it does not already exist for other Kafka Connect plugins:

    $ sudo mkdir -p /opt/kafka/connector-plugins
  3. Extract the contents of the Change Data Capture connector archive that you downloaded to the /opt/kafka/connector-plugins directory:

    $ sudo unzip <downloaded-connector-archive> -d /opt/kafka/connector-plugins
  4. Repeat the above step for each connector that you want to install.
  5. Switch to the kafka user:

    $ su - kafka
    Password:
  6. Check whether Kafka Connect is already running in distributed mode. If it is running, stop the associated process on all Kafka Connect worker nodes. For example:

    $ jcmd | grep ConnectDistributed
    18514 org.apache.kafka.connect.cli.ConnectDistributed /opt/kafka/config/connect-distributed.properties
    $ kill 18514
  7. Edit the connect-distributed.properties file in /opt/kafka/config/ and specify the location of the Change Data Capture connector:

    plugin.path=/opt/kafka/connector-plugins
  8. Run Kafka Connect in distributed mode:

    $ /opt/kafka/bin/connect-distributed.sh /opt/kafka/config/connect-distributed.properties

    Kafka Connect runs. During startup, Change Data Capture connectors are loaded from the connector-plugins directory.

  9. Repeat steps 6–8 for each Kafka Connect worker node.
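After the procedure, one way to confirm that a worker loaded the connectors is to query the Kafka Connect REST API, which lists installed plugins at /connector-plugins. The following is a minimal sketch. The helper name debezium_plugins is hypothetical, and the worker address is an assumption that you supply through CONNECT_URL (Kafka Connect listens on port 8083 by default). The helper filters the JSON response for connector classes in the io.debezium package.

```shell
# Hypothetical helper: read the JSON returned by GET /connector-plugins on
# standard input and print the Debezium connector classes it contains.
debezium_plugins() {
  grep -oE '"class"[[:space:]]*:[[:space:]]*"io\.debezium[^"]*"' | cut -d'"' -f4
}

# Query a worker only when CONNECT_URL is set, for example:
#   CONNECT_URL=http://localhost:8083
if [ -n "${CONNECT_URL:-}" ]; then
  curl -s "$CONNECT_URL/connector-plugins" | debezium_plugins
fi
```

An empty result suggests that plugin.path does not point to the directory that contains the extracted connectors, or that the worker was not restarted.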

Updating Kafka Connect

If you need to update your deployment, amend the Change Data Capture connector JAR files in the /opt/kafka/connector-plugins directory, and then restart Kafka Connect.
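The restart can be scripted along the lines of steps 6 and 8 of the procedure. The sketch below is one possible approach, not a supported tool. The helper names connect_pids and restart_connect are hypothetical, and the paths are the ones used in this chapter. The script only defines the functions. Nothing is stopped or started until you call restart_connect as the kafka user.

```shell
# Hypothetical helper: read jcmd output on standard input and print the
# process ID of every Kafka Connect distributed worker it lists.
connect_pids() {
  awk '/org\.apache\.kafka\.connect\.cli\.ConnectDistributed/ { print $1 }'
}

# Hypothetical helper: stop any running worker, wait for it to exit,
# then start Kafka Connect again in the foreground, as in step 8.
restart_connect() (
  for pid in $(jcmd | connect_pids); do
    kill "$pid"
    # Wait until the old process is gone so that the new one can bind its port.
    while kill -0 "$pid" 2>/dev/null; do sleep 1; done
  done
  /opt/kafka/bin/connect-distributed.sh /opt/kafka/config/connect-distributed.properties
)
```

Run restart_connect on each Kafka Connect worker node after the connector files have been updated there.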

Next Steps

The Change Data Capture User Guide describes how to configure each connector and its source database for change data capture. Once configured, a connector will connect to the source database and produce events for each inserted, updated, and deleted row or document.
