Chapter 1. Debezium 2.7.3 release notes


Debezium is a distributed change data capture platform that captures row-level changes that occur in database tables and then passes corresponding change event records to Apache Kafka topics. Applications can read these change event streams and access the change events in the order in which they occurred. Debezium is built on Apache Kafka and is deployed and integrated with Streams for Apache Kafka on OpenShift Container Platform or on Red Hat Enterprise Linux.

The following topics provide release details:

1.1. Debezium database connectors

Debezium provides source connectors and sink connectors that are based on Kafka Connect. Connectors are available for the following common databases:

Source connectors
  • Db2
  • MariaDB (Technology Preview)
  • MongoDB
  • MySQL
  • Oracle
  • PostgreSQL
  • SQL Server
Sink connectors
  • JDBC Sink connector

1.1.1. Connector usage notes

Db2
  • The Debezium Db2 connector does not include the Db2 JDBC driver (jcc-11.5.0.0.jar). See the Db2 connector deployment instructions for information about how to deploy the necessary JDBC driver.
  • The Db2 connector requires the use of the abstract syntax notation (ASN) libraries, which are available as a standard part of Db2 for Linux.
  • To use the ASN libraries, you must have a license for IBM InfoSphere Data Replication (IIDR). You do not have to install IIDR to use the libraries.
Oracle
  • The Debezium Oracle connector does not include the Oracle JDBC driver. See the Oracle connector deployment instructions for information about how to obtain and deploy the necessary driver.
PostgreSQL
  • To use the Debezium PostgreSQL connector, you must use the pgoutput logical decoding output plug-in, which is the default for PostgreSQL versions 10 and later, as shown in the example that follows this list.
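For reference, the following sketch shows the connector configuration property that selects the output plug-in explicitly:

plugin.name=pgoutput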

1.2. Debezium supported configurations

For information about Debezium supported configurations, including information about supported database versions, see the Debezium 2.7.3 Supported configurations page.

1.2.1. Streams for Apache Kafka API version

Debezium runs on Streams for Apache Kafka 2.7.

Streams for Apache Kafka supports the v1beta2 API version, which updates the schemas of the Streams for Apache Kafka custom resources. Older API versions are not supported.

1.3. Debezium installation options

You can install Debezium with Streams for Apache Kafka on OpenShift or on Red Hat Enterprise Linux:

1.4. Debezium 2.7.3 features and improvements

1.4.1. Breaking changes

Breaking changes represent significant differences in connector behavior and require configuration changes that are not compatible with earlier Debezium versions.

Debezium 2.7.3 introduces breaking changes that affect the following components:

For information about breaking changes in the previous Debezium release, see the Debezium 2.5.4 Release Notes.

The following breaking changes apply to all connectors:

Under certain circumstances, such as database communication errors, JDBC queries can become stuck in a persistent hung state. To mitigate this problem, Debezium now provides a query.timeout.ms property that enables you to configure how long the connector waits for query responses. To disable timeout handling, you can set the value to 0. The default value is 600000 ms (600 seconds).
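For example, to reduce the timeout to one minute, you might add the following setting to the connector configuration; the value shown is illustrative only:

query.timeout.ms=60000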

The Debezium TimezoneConverter SMT enables you to convert outgoing payload time values to a specified time zone. The original implementation was specifically restricted to allow conversion of values within the before or after parts of the payload. You can now use the converter to convert other time-based fields in the metadata, such as ts_ms in the source information block.
This change helps to improve the calculation of lag metrics in cases where the time zone setting of the JVM that runs the connector does not match the setting on the source database server, which can skew the difference between the envelope ts_ms and source ts_ms values. By configuring the connector to use the TimezoneConverter to convert metadata fields to a consistent time zone, you can more accurately calculate the lag time between those two fields.
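The following sketch shows one way that the SMT might be configured to normalize time values to UTC. The transform alias (tz) is arbitrary, and properties for selecting specific metadata fields are omitted:

transforms=tz
transforms.tz.type=io.debezium.transforms.TimezoneConverter
transforms.tz.converted.timezone=UTC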

1.4.1.1.3. Signal table watermark metadata

Except when you run the MySQL connector in read-only mode, running an incremental snapshot requires a signal table, to which the connector writes open and close markers that coordinate the change boundaries with the data recorded in the transaction logs. In some cases, having a record of when the snapshot window opened and closed can be helpful.
Debezium now records details about these signal markers in the data column of the signal table, as shown in the following examples:

Example 1.1. Window open marker

{"openWindowTimestamp": "<window-open-time>"}

Example 1.2. Window close marker

{"openWindowTimestamp": "<window-open-time>", "closeWindowTimestamp": "<window-close-time>"}

1.4.1.2. CloudEvents converter breaking changes

1.4.1.3. JDBC sink connector breaking changes

1.4.1.4. MongoDB connector breaking changes

1.4.1.5. Oracle connector breaking changes

1.4.1.5.1. Support for Oracle 12c is discontinued

Support for Oracle 12c was deprecated in a previous release. Beginning with Debezium 2.7.3, you can no longer use Debezium with an Oracle 12c database.

Oracle Database 12c Release 1 (12.1) reached its end of life on July 31, 2022. After this date, Oracle no longer provides bug fixes, security updates, or patches for this version. Debezium 2.1.4 was the last release that advertised support for Oracle 12c.

1.4.1.6. SQL Server connector breaking changes

The SQL Server connector was not capturing all schemas when the connector was first deployed, and instead, was only capturing the schemas based on the tables defined in the configuration’s include list. This was a bug that could prevent users from easily adding new tables to the connector when expecting that the new table’s schema would already exist in the schema history topic. The connector now correctly honors the store.only.captured.tables.ddl configuration option (DBZ-7593).

For existing connector deployments, if you do not specifically set the store.only.captured.tables.ddl property for the schema history topic, the connector will begin capturing schema changes for all relevant tables in your database. If you want to prevent this and retain the prior behavior, you will need to adjust your connector configuration by adding schema.history.internal.store.only.captured.tables.ddl with a value of true.
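For example, to retain the prior behavior, add the following setting to the connector configuration:

schema.history.internal.store.only.captured.tables.ddl=true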

1.4.2. General availability features

Debezium 2.7.3 provides new features for the following connectors:

1.4.2.1. Features promoted to General Availability

The following features are promoted from Technology Preview to General Availability in the Debezium 2.7.3 release:

In earlier releases, a Technology Preview feature provided the option to test the use of the Debezium connector for Oracle Database in an Oracle RAC environment. Use of the connector in an Oracle RAC environment is now fully supported.

When you use the Debezium connector with a PostgreSQL 16 database, you can now define replication slots on standby instances of the database. Based on this change in the database, Debezium can now connect to a standby PostgreSQL 16 server and stream changes from it. By performing change data capture from a replica, rather than from the production system, you can better distribute server load, particularly in a very active database.

1.4.2.2.1. Improved event timestamp precision

To improve the precision of timestamps in change events, change events now include four additional fields, two in the envelope, and two in the source information block, as shown in the following example:

Example 1.3. Timestamp fields added to provide greater precision

{
  "source": {
    ...,
    "ts_us": "1559033904863123",
    "ts_ns": "1559033904863123000"
  },
  "ts_us": "1580390884335451",
  "ts_ns": "1580390884335451325",
}

The envelope values provide microsecond (ts_us) and nanosecond (ts_ns) values. The source information block can include both microsecond and nanosecond precision values. If the source database does not provide a sufficient level of precision, the timestamp values are truncated to a lower precision.

In earlier versions of Debezium, the column.truncate.* functionality returned a sliced ByteBuffer based on the truncation configuration. Although this worked when you used Avro, the truncation was not honored if your connector configuration used the JsonConverter, because that converter operated on the entire underlying array rather than on the given slice.

The column truncation logic now explicitly creates a ByteBuffer based on a new array. This change allows the JsonConverter to respect the truncated column value during the serialization to Kafka.

The column.truncate.to.length.chars configuration property was improved to support array field types along with the string types that it previously supported. For more information, see the documentation for the SQL-based source connectors in the Debezium User Guide.
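For example, assuming a hypothetical inventory.customers table, the following setting truncates string or array values in its biography column to a maximum of 20 characters:

column.truncate.to.20.chars=inventory.customers.biography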

1.4.2.2.3. Unified snapshot modes (DBZ-7233)

The snapshot process is an integral part of each connector’s lifecycle, and enables you to collect and send all of the historical data that is present in a data store to a specified target system. In earlier releases, Debezium connectors did not support a consistent set of snapshot modes. This lack of consistency was particularly confusing for users who work with multiple connector types.

In this release, Debezium implements a consistent set of snapshot modes for all connectors. The same snapshot modes are available to all connectors, with the exception of the never mode, which remains specific to the MySQL connector. As a result, certain snapshot modes, such as when_needed, that were previously not available for some connectors, can now be used with any connector.

By default, after a snapshot completes, Debezium immediately transitions to the streaming phase. You can now add the streaming.delay.ms property to the connector configuration to specify a delay between the end of the snapshot and the start of the streaming phase. This property can help you to prevent unnecessary re-snapshotting, and improve the overall stability of the connector during the transition from snapshot to streaming.

Use this property in combination with the offset.flush.interval.ms property to ensure that offsets are properly flushed before streaming begins. To permit at least one offset flush interval to occur before streaming begins, set streaming.delay.ms to a value that is slightly greater than the value of offset.flush.interval.ms.
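For example, if the Kafka Connect worker flushes offsets every 60 seconds, you might set a slightly longer delay; the first property is set in the worker configuration, the second in the connector configuration, and the values shown are illustrative only:

offset.flush.interval.ms=60000
streaming.delay.ms=65000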

In some pipelines, event ordering is critical for consuming applications. Certain operations, such as re-partitioning a Kafka topic, can disrupt the normal ordering of events in a data pipeline. Errors might occur when an application attempts to reconstruct the order of events.

In this release, when you enable transaction metadata, metadata events encode their transaction order. The encoded ordering field provides consumers with the information that they require to correctly reconstruct the transaction timeline.

1.4.2.3. JDBC sink connector GA features

The JDBC sink connector now supports mapping source columns to Kafka ARRAY-based payload field types. With Debezium 2.7.3, you can serialize ARRAY-based fields to a target PostgreSQL database with no change in configuration.

1.4.2.4. MongoDB connector GA features

The incremental snapshot process is instrumental in various recovery situations for collecting all or part of the data set from a source table or collection. Relational connectors have long supported supplying an additional-conditions value in the incremental snapshot signal to restrict the data set, providing for targeted resynchronization of specific rows of data. Beginning with this release, the MongoDB connector also supports this capability (DBZ-7138). Unlike with relational databases, you supply the additional-conditions value in JSON format. The connector applies the condition to the specified collection by using the find operation to obtain the subset of documents that are to be incrementally snapshotted.
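The following sketch shows what an incremental snapshot signal with an additional condition might look like for a MongoDB collection; the collection name and the filter document are hypothetical:

{
  "type": "execute-snapshot",
  "data": {
    "type": "incremental",
    "data-collections": ["inventory.products"],
    "additional-conditions": [
      {
        "data-collection": "inventory.products",
        "filter": "{\"quantity\": {\"$lt\": 5}}"
      }
    ]
  }
}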

Beginning with this release, the Debezium MongoDB ExtractNewDocumentState single message transformation now adds an _id attribute to the event payload of delete events. The addition of this attribute enhances the ability of consuming applications to make greater use of the events that the SMT generates.

1.4.2.5. Oracle connector GA features

1.4.2.5.1. Oracle RAC support

See Section 1.4.2.1.1, “Support for Oracle Real Application Clusters (RAC)”.

Debezium treats a RAW column type as a series of bytes, and therefore, change events that contain RAW columns use a schema type of BYTES. In the absence of information about how a consuming application uses the data in RAW columns, this default makes sense. However, to support applications that are designed to consume the data as STRING types, rather than as BYTES, Debezium now provides a RawToStringConverter, which emits RAW columns as STRING values.
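The following sketch shows one way to register the converter in the connector configuration. The alias (raw-to-string) is arbitrary, and the fully qualified class name is an assumption based on the usual packaging of Debezium Oracle converters; see the Debezium User Guide for the exact name:

converters=raw-to-string
raw-to-string.type=io.debezium.connector.oracle.converters.RawToStringConverter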

For information about how to configure the converter, see the Debezium User Guide.

Although ROW_ID is not unique across all rows of a table for the table’s lifespan, it can be useful in situations in which the lifecycle of the table and its rows is managed in a strict way. In response to community requests, this release adds a new row_id field to the Oracle connector’s change event source information block. This field is populated with the ROW_ID value under the following conditions:

  • Only populated from streaming events for inserts, updates, and deletes.
  • Snapshot events do not contain a row_id value.
  • Only the LogMiner and XStream streaming adapters provide the field; OpenLogReplicator is not supported.

Because the field is marked as optional, events that do not match these criteria do not include a row_id field.

In earlier versions of Debezium, the Oracle connector created the LogMiner flush table in the default tablespace of the connector user account. This behavior could cause problems if a DBA wanted to move the flush table to a different tablespace. Previously, placing the flush table in the preferred tablespace required several steps. First, you modified the user account to specify a new default tablespace, or created a different account that specified that tablespace. Then, you re-created the table so that it was placed in the correct location.

Beginning with this release, you can use the log.mining.flush.table.name property to specify where to create the flush table, as shown in the following example:

Example 1.4. Configuration that uses a custom schema name to specify the flush table location

log.mining.flush.table.name=THE_OTHER_SCHEMA.LOG_MINING_FLUSH_TABLE

The schema name is optional. If you do not specify a schema name, the connector follows the legacy behavior and uses the default tablespace of the connector user account to create the flush table and check for its existence.

The Debezium Oracle connector can support thousands of tables in a single connector deployment. When a high volume of changes occurs in the tables of a configured database, you might want to filter the dataset at the database level, before changes are passed to Debezium for processing, by customizing the query filter to use the IN mode.

In earlier versions, a SQL error resulted if the connector’s log.mining.query.filter.mode was set to in and the table include list contained more than 1000 elements, because Oracle does not permit more than 1000 elements within an IN clause. To address this limitation, Debezium 2.7.3 splits the elements into multiple IN clauses of at most 1000 items each, and combines the clauses with OR.

The Debezium Oracle connector can now report the original SQL for an operation in the source information block of INSERT, UPDATE, and DELETE event records. To configure the connector to include the REDO SQL as part of the change event, set the log.mining.include.redo.sql property as shown in the following example:

"log.mining.include.redo.sql": "true"

This feature is opt-in only; you must explicitly enable it.

Important

Enabling this feature can more than double the size of event payloads.

When this option is enabled, the source information block contains a new field redo_sql, as shown in the following example:

Example 1.5. INSERT source information block with REDO SQL enabled

"source": {
  ...
  "redo_sql": "INSERT INTO \"DEBEZIUM\".\"TEST\" (\"ID\",\"DATA\") values ('1', 'Test');"
}
Warning

Because of the way that LogMiner reconstructs the SQL related to CLOB, BLOB, and XML data types, you cannot enable the use of REDO SQL if lob.enabled is set to true. If you enable both log.mining.include.redo.sql and lob.enabled, after you start the connector it reports a configuration error.

This release adds a new delay strategy for transaction registration when the connector uses LogMiner. This strategy delays the creation of the transaction record in the buffer until the connector captures the first change for that transaction.

Note

If you set lob.enabled to true, you cannot use this delay strategy, because of the way that the connector handles specific operations in these two modes.

Delaying transaction registration has a number of benefits, which include:

  • Reduces the overhead on the transaction cache, especially when the connector processes a large number of concurrent transactions.
  • Prevents the connector from capturing long-running transactions that have no changes.
  • Facilitates the advance of the low-watermark SCN in the offsets in some scenarios.

1.4.2.6. PostgreSQL connector GA features

See Section 1.4.2.1, “Features promoted to General Availability”

1.4.2.7. SQL Server connector GA features

The Debezium SQL Server connector uses a common SQL Server stored procedure called fn_cdc_get_all_changes…​ to fetch all of the relevant captured changes for a given table. This query performs several unions and then returns data from only one of the union sub-queries, which can be inefficient.

This release of the connector introduces the configuration property data.query.mode, which specifies the method that the connector uses to gather details about table changes. The default behavior remains unchanged from earlier releases, and is set by using the value function to delegate to the previously mentioned stored procedure.

To enhance the efficiency with which the connector gathers changes, configure the connector to build the query internally by setting the value of data.query.mode to direct.
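For example, the following setting configures the connector to build the query internally:

data.query.mode=direct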

Debezium 2.7.3 adds support for the heartbeat.action.query connector configuration property to the SQL Server connector. The heartbeat.action.query property enables the connector to perform a write operation to the source database on an interval defined by heartbeat.interval.ms. The write operation is meant to produce a change event that is captured by the connector, and is then sent to Kafka or the target system.

In an active database that captures changes regularly, the constant stream of changes is sufficient to keep offsets synchronized with the read position in the transaction logs. However, if the connector captures changes from a source in which more changes occur in non-captured tables than in captured tables, setting heartbeat.action.query can help to keep the read position in the offsets synchronized despite the lower capture activity.
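The following sketch shows one possible configuration. It assumes a hypothetical single-row dbo.debezium_heartbeat table that you create yourself and enable for capture; the interval value is illustrative only:

heartbeat.interval.ms=10000
heartbeat.action.query=UPDATE dbo.debezium_heartbeat SET last_heartbeat = GETDATE()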

1.4.3. Technology Preview features

The following Technology Preview features are available in Debezium 2.7.3:

Important

Technology Preview features are not supported with Red Hat production service-level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend implementing any Technology Preview features in production environments. Technology Preview features provide early access to upcoming product innovations, enabling you to test functionality and provide feedback during the development process. For more information about support scope, see Technology Preview Features Support Scope.

1.4.3.1. Debezium connector for MariaDB (DBZ-7693)

In the previous release, you could configure the Debezium MySQL connector to run against a MariaDB database deployment. Debezium 2.7.3 goes further by introducing a new standalone connector implementation for MariaDB.

With this addition to the suite of connectors, Debezium withdraws the previous ability to set the connection.adapter configuration to enable use of the MySQL connector with a MariaDB database.

1.4.3.2. Compatibility with Oracle Database 23ai

This release of Debezium provides integration with Oracle 23ai as a Technology Preview feature.

For more information, see Oracle binary character LOB types in the Debezium User Guide.

The ability to use multiple parallel threads to perform an initial snapshot is available as a Technology Preview feature for all of the Debezium SQL-based connectors, with the exception of the MySQL connector. For the Debezium MySQL connector, a Developer Preview of parallel initial snapshots is available.

To configure connectors to use parallel initial snapshots, set the snapshot.max.threads property in the connector configuration to a value greater than 1.

For more information, see snapshot.max.threads in the Debezium User Guide documentation for your connector.

1.4.3.5. CloudEvents converter

The CloudEvents converter emits change event records that conform to the CloudEvents specification. The CloudEvents change event envelope can be JSON or Avro and each envelope type supports JSON or Avro as the data format. For more information, see CloudEvents converter.

1.4.3.6. Custom converters

In cases where the default data type conversions do not meet your needs, you can create custom converters to use with a connector. For more information, see Custom-developed converters.

1.4.4. Developer Preview features

The following Developer Preview features are available in Debezium 2.7.3:

Important

Developer Preview features are not supported by Red Hat in any way and are not functionally complete or production-ready. Do not use Developer Preview software for production or business-critical workloads. Developer Preview software provides early access to upcoming product software in advance of its possible inclusion in a Red Hat product offering. Customers can use this software to test functionality and provide feedback during the development process. This software might not have any documentation, is subject to change or removal at any time, and has received limited testing. Red Hat might provide ways to submit feedback on Developer Preview software without an associated SLA.

For more information about the support scope of Red Hat Developer Preview software, see Developer Preview Support Scope.

1.4.4.1. Debezium Server (Developer Preview)

Debezium Server is a ready-to-use application that streams change events from a data source directly to a configured Kafka or Redis data sink. Unlike the generally available Debezium connectors, the deployment of Debezium Server has no dependencies on Apache Kafka Connect. For more information about the Debezium Server Developer Preview, see the Debezium User Guide.

In previous iterations of the Debezium MongoDB connector, change streams could be opened against the deployment and database scopes only, which was not always ideal in environments with restrictive permissions. Debezium 2.7.3 introduces a new change stream mode in which the connector operates within the scope of a single collection only, allowing for more granular permission configurations.

To limit the capture scope of a MongoDB connector to a specific collection, set the value of the capture.scope property in the connector configuration to collection. Use this setting when you intend for a connector to capture changes only from a single MongoDB collection.
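For example:

capture.scope=collection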

Certain limitations exist when you use this feature. If you set the value of capture.scope to collection, the connector cannot use the default source signaling channel. Because the source channel must be enabled to process incremental snapshot signals, including signals sent through the Kafka, JMX, or file channels, a connector that sets capture.scope to collection cannot perform incremental snapshots.

To improve snapshot performance, the MySQL connector implements parallelization to concurrently capture snapshot change events and generate schema events for tables. By running snapshots and generating schema events in parallel, the connector reduces the time that is required to capture the schema for many tables in your database.

The Debezium initial snapshot for MySQL has always been single-threaded. This limitation primarily stems from the complexities of ensuring data consistency across multiple transactions.

In this release, you can configure a MySQL connector to use multiple threads to execute table-level snapshots in parallel.

In order to take advantage of this new feature, add the snapshot.max.threads property to the connector configuration, and set the property to a value greater than 1.

Example 1.6. Example configuration using parallel snapshots

snapshot.max.threads=4

Based on the configuration in the preceding example, the connector can snapshot a maximum of four tables at once. If the number of tables to snapshot is greater than four, after a thread finishes processing one of the first four tables, it then finds the next table in the queue and begins to perform a snapshot. The process continues until the connector finishes performing snapshots on all of the designated tables.

For more information, see snapshot.max.threads in the Debezium User Guide.

When the Debezium connector for Oracle connects to a production or primary database, it uses an internal flush table to manage the flush cycles of the Oracle Log Writer Buffer (LGWR) process. The flush process requires that the user account through which the connector accesses the database has permission to create and write to this flush table. However, logical stand-by databases often have restrictive data manipulation rules, and may even be read-only. As a result, writing to the database might not be feasible.

To support an Oracle read-only logical stand-by database, Debezium introduces a property to disable the creation and management of the flush table. You can use this feature with both Oracle Standalone and Oracle RAC installations.

To enable the Oracle connector to use a read-only logical stand-by, add the following connector option:

internal.log.mining.read.only=true

For more information, see the Oracle connector documentation in the Debezium User Guide.

1.4.4.4.2. Oracle LogMiner hybrid mining strategy

You can now configure the Debezium connector for Oracle to use the LogMiner hybrid mining strategy. This strategy is designed to support all of the schema evolution features of the default mining strategy while also taking advantage of the performance optimizations offered by the online catalog strategy. To configure the connector to use this strategy, set the log.mining.strategy configuration property to the value hybrid.
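For example:

log.mining.strategy=hybrid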

With the online_catalog strategy, a schema change and a data change can occur within the same mining step. When this happens, LogMiner cannot reconstruct the SQL correctly, and the resulting table is assigned a name such as OBJ# xxxxxx; similarly, columns are represented as COL1, COL2, and so on. To avoid this situation with the online catalog strategy, you must carefully choreograph schema changes so that a mining step never observes a schema change and a data change together; however, this is not always feasible.

By contrast, the hybrid strategy tracks a table’s object ID at the database level, and then uses this ID to look up the schema associated with the table from the Debezium relational table model. Debezium is thus able to handle the specific corner cases that Oracle LogMiner cannot. The table name is taken from the relational model’s table name, and columns are mapped by column position.

Unfortunately, Oracle does not provide a way to reconstruct failed SQL operations for CLOB, BLOB, and XML data types. As a result, you cannot configure the connector to use the hybrid strategy if the value of lob.enabled is set to true. If you attempt to start a connector that is configured to use the hybrid strategy and has lob.enabled set to true, the connector fails to start and reports a configuration error.

In this release, Debezium provides support for exactly-once semantics for the PostgreSQL connector. Exactly-once delivery for PostgreSQL applies only to the streaming phase; exactly-once delivery does not apply to snapshots.

Debezium was designed to provide at-least-once delivery with a goal of ensuring that connectors capture all change events that occur in the monitored sources. In KIP-618 the Apache Kafka community proposes a solution to address problems that occur when a producer retries a message. Source connectors sometimes resend batches of events to the Kafka broker, even after the broker previously committed the batch. This situation can result in duplicate events being sent to consumers (sink connectors), which can cause problems for consumers that do not handle duplicates easily.

No connector configuration changes are required to enable exactly-once delivery. However, exactly-once delivery must be configured in your Kafka Connect worker configuration. For information about setting the required Kafka Connect configuration properties, see KIP-618.
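For example, for source connectors, a Kafka Connect worker that supports KIP-618 enables the feature through the following worker configuration property, which is defined by Apache Kafka rather than by Debezium:

exactly.once.source.support=enabled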

Note

To set exactly.once.support to required in the Kafka worker configuration, all connectors in the Kafka Connect cluster must support exactly-once delivery. If you attempt to set this option in a cluster in which workers do not consistently support exactly-once delivery, the connectors that do not support this feature fail validation at start-up.

1.4.5. Other updates in this release

This Debezium 2.7.3 release provides multiple other feature updates and fixes. For a complete list, see the Debezium 2.7.3 Enhancement Advisory (RHEA-2024:139598).

1.5. Deprecated features

The following features are deprecated and will be removed in a future release:

Deprecation of schema_only and schema_only_recovery snapshot modes
  • The schema_only_recovery mode is deprecated and is replaced by the recovery mode.
  • The schema_only mode is deprecated and is replaced by the no_data mode.
Important

The current release continues to include the deprecated snapshot modes, but they are scheduled for removal in a future release. To prepare for their removal, adjust any scripts, configurations, and processes that depend on these deprecated modes.

For information about features that were deprecated in the previous release, see the Debezium 2.5.4 Release Notes.

1.6. Known issues

The following known issue affects Debezium 2.7.3:
