Chapter 1. Debezium 2.7.3 release notes
Debezium is a distributed change data capture platform that captures row-level changes that occur in database tables and then passes corresponding change event records to Apache Kafka topics. Applications can read these change event streams and access the change events in the order in which they occurred. Debezium is built on Apache Kafka and is deployed and integrated with Streams for Apache Kafka on OpenShift Container Platform or on Red Hat Enterprise Linux.
The following topics provide release details:
- Section 1.1, “Debezium database connectors”
- Section 1.2, “Debezium supported configurations”
- Section 1.3, “Debezium installation options”
- Section 1.4, “Debezium 2.7.3 features and improvements”
- Section 1.4.1, “Breaking changes”
- Section 1.4.2, “General availability features”
- Section 1.4.3, “Technology Preview features”
- Section 1.4.4, “Developer Preview features”
- Section 1.4.5, “Other updates in this release”
- Section 1.5, “Deprecated features”
- Section 1.6, “Known issues”
1.1. Debezium database connectors
Debezium provides source connectors and sink connectors that are based on Kafka Connect. Connectors are available for the following common databases:
- Source connectors
- Db2
- MariaDB (Technology Preview)
- MongoDB
- MySQL
- Oracle
- PostgreSQL
- SQL Server
- Sink connectors
- JDBC Sink connector
1.1.1. Connector usage notes
- Db2
  - The Debezium Db2 connector does not include the Db2 JDBC driver (jcc-11.5.0.0.jar). See the Db2 connector deployment instructions for information about how to deploy the necessary JDBC driver.
  - The Db2 connector requires the use of the abstract syntax notation (ASN) libraries, which are available as a standard part of Db2 for Linux.
  - To use the ASN libraries, you must have a license for IBM InfoSphere Data Replication (IIDR). You do not have to install IIDR to use the libraries.
- Oracle
  - The Debezium Oracle connector does not include the Oracle JDBC driver (21.6.0.0). See the Oracle connector deployment instructions for information about how to deploy the necessary JDBC driver.
- PostgreSQL
  - To use the Debezium PostgreSQL connector, you must use the pgoutput logical decoding output plug-in, which is the default for PostgreSQL versions 10 and later.
1.2. Debezium supported configurations
For information about Debezium supported configurations, including information about supported database versions, see the Debezium 2.7.3 Supported configurations page.
1.2.1. Streams for Apache Kafka API version
Debezium runs on Streams for Apache Kafka 2.7.
Streams for Apache Kafka supports the v1beta2 API version, which updates the schemas of the Streams for Apache Kafka custom resources. Older API versions are not supported.
1.3. Debezium installation options
You can install Debezium with Streams for Apache Kafka on OpenShift or on Red Hat Enterprise Linux.
1.4. Debezium 2.7.3 features and improvements
1.4.1. Breaking changes
Breaking changes represent significant differences in connector behavior and require configuration changes that are not compatible with earlier Debezium versions.
Debezium 2.7.3 introduces breaking changes that affect the following components:
- Section 1.4.1.1, “Breaking changes relevant to all connectors”
- Section 1.4.1.2, “CloudEvents converter breaking changes”
- Section 1.4.1.3, “JDBC sink connector breaking changes”
- Section 1.4.1.4, “MongoDB connector breaking changes”
- Section 1.4.1.5, “Oracle connector breaking changes”
- Section 1.4.1.6, “SQL Server connector breaking changes”
For information about breaking changes in the previous Debezium release, see the Debezium 2.5.4 Release Notes.
1.4.1.1. Breaking changes relevant to all connectors
The following breaking changes apply to all connectors:
1.4.1.1.1. Configurable query timeout property (query.timeout.ms) (DBZ-7616)
Under certain circumstances, such as database communication errors, JDBC queries can become stuck in a persistent hung state. To mitigate this problem, Debezium now provides a query.timeout.ms property that enables you to configure how long the connector waits for query responses. To disable timeout handling, you can set the value to 0. The default value is 600000 ms (600 seconds).
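For example, the following setting, shown here as a sketch with an illustrative value, configures the connector to abandon queries that receive no response within two minutes:
query.timeout.ms=120000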
1.4.1.1.2. Timestamp converter improvements (DBZ-7022)
The Debezium TimezoneConverter SMT enables you to convert outgoing payload time values to a specified time zone. The original implementation restricted conversion to values within the before or after parts of the payload. You can now use the converter to convert other time-based fields in the metadata, such as ts_ms in the source information block.
This change helps to improve the calculation of lag metrics when the timezone setting of the JVM that runs the connector does not match the setting on the source database server, which skews the difference between the envelope ts_ms and source ts_ms values. By configuring the connector to use the TimezoneConverter to convert metadata fields to a consistent timezone, you can more easily calculate the accurate lag time between those two fields.
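As a minimal sketch, an SMT configuration of this kind might look like the following; the transform alias (tzconvert) and the target time zone are illustrative assumptions:
transforms=tzconvert
transforms.tzconvert.type=io.debezium.transforms.TimezoneConverter
transforms.tzconvert.converted.timezone=UTC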
1.4.1.1.3. Signal table watermark metadata
Except when you run the MySQL connector in read-only mode, running an incremental snapshot requires a signal table to write open and close markers to coordinate the change boundaries with the data recorded in the transaction logs. In some cases, having a record of when the snapshot window opened and when it closed can be helpful.
Debezium now provides details about these signal markers in the data column of the signal table, as shown in the following examples:
Example 1.1. Window open marker
{"openWindowTimestamp": "<window-open-time>"}
{"openWindowTimestamp": "<window-open-time>"}
Example 1.2. Window close marker
{"openWindowTimestamp": "<window-open-time>", "closeWindowTimestamp": "<window-close-time>"}
{"openWindowTimestamp": "<window-open-time>", "closeWindowTimestamp": "<window-close-time>"}
1.4.1.2. CloudEvents converter breaking changes
1.4.1.2.1. CloudEvents schema name customization (DBZ-7284)
By default, the CloudEvents converter automatically generates the schema for a CloudEvent message. However, if the automatically generated schema names are not sufficient, you can adjust the configuration by specifying a dataSchemaName in the metadata.source property. You can set dataSchemaName to one of the following values:
- generate: Maintains the default behavior that is described in the preceding paragraph.
- header: Extracts the schema name directly from the specified event header field.
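For illustration only, a converter configuration that extracts the data schema name from a header might resemble the following sketch; the other entries shown in the metadata.source list are assumptions for this example:
value.converter=io.debezium.converters.CloudEventsConverter
value.converter.metadata.source=value,id:generate,type:generate,dataSchemaName:header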
1.4.1.3. JDBC sink connector breaking changes
1.4.1.3.1. JDBC sink value serialization changes (DBZ-7191)
In some cases, when the JDBC sink connector consumed an event that included fields with null values, it incorrectly wrote the default field value to the target database, rather than the value NULL. This behavior is corrected in this release, and the connector now writes NULL for such fields.
1.4.1.4. MongoDB connector breaking changes
1.4.1.4.1. MongoDB connector no longer supports the replica_set connection mode (DBZ-7260)
The deprecated replica_set mode for the MongoDB connector is now removed. The replica_set connection mode enabled the Debezium MongoDB connector to use separate Kafka Connect tasks to process each database shard individually, within the confines of the tasks.max configuration. This configuration circumvented the MongoDB router (mongos). However, the MongoDB database is designed with the expectation that clients always connect through the router. As a result, the replica_set connection mode could cause problems with signaling and incremental snapshot support, and added complexity to the codebase.
Removing the replica_set connection mode restricts the connector to single task operation. If you previously configured the MongoDB connector to use replica_set mode, to continue using the connector after the upgrade to Debezium 2.7.3, you must adjust the configuration. For more information, see the MongoDB connector documentation.
1.4.1.5. Oracle connector breaking changes
1.4.1.5.1. Support for Oracle 12c is discontinued
Support for Oracle 12c was deprecated in a previous release. Beginning with Debezium 2.7.3, you can no longer use Debezium with an Oracle 12c database.
Oracle Database 12c Release 1 (12.1) reached its end of life on July 31, 2022. After this date, Oracle no longer provides bug fixes, security updates, or patches for this version. Debezium 2.1.4 was the last release that advertised support for Oracle 12c.
Separately, a fix in this release ensures that the connector now correctly emits the affected columns according to the configured decimal handling mode. After you upgrade the connector, this change in behavior might cause problems in deployments that use strict schema registry compatibility rules.
1.4.1.6. SQL Server connector breaking changes
1.4.1.6.1. SQL Server connector correctly honors setting of the store.only.captured.tables.ddl configuration property
The SQL Server connector was not capturing all schemas when the connector was first deployed, and instead, was only capturing the schemas based on the tables defined in the configuration’s include list. This was a bug that could prevent users from easily adding new tables to the connector when expecting that the new table’s schema would already exist in the schema history topic. The connector now correctly honors the store.only.captured.tables.ddl configuration option (DBZ-7593).
For existing connector deployments, if you do not explicitly set the store.only.captured.tables.ddl property for the schema history topic, the connector begins capturing schema changes for all relevant tables in your database. To prevent this and retain the prior behavior, adjust your connector configuration by adding the schema.history.internal.store.only.captured.tables.ddl property with a value of true, as shown in the following example:
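schema.history.internal.store.only.captured.tables.ddl=true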
1.4.1.6.2. New default value for max.iteration.transactions (DBZ-7750)
In older versions of Debezium, the SQL Server connector would process all transactions available during a poll iteration. When there was a high volume of traffic, memory overloads could result, leading to poor performance. Although the max.iteration.transactions configuration property already exists to address this problem, it was previously configured with a default value of 0. Based on the former default value, the connector would still attempt to process all transactions. To limit the number of transactions that the connector processes during a poll iteration, in this release, the default value of the property is now set to 500.
1.4.1.6.3. MBean name for the JMX signal channel now includes the taskId (DBZ-8137)
Normal JMX metrics are registered with a taskId attribute, because SQL Server supports spawning a unique task per database mapping. In past releases, the JMX signal channel did not honor this, which could prevent the connector from starting a channel for each task. This has been fixed, and the MBean name of the JMX signal channel now includes a taskId in its name. By including the taskId in the MBean name, it is now possible to uniquely identify the signal channel for each database task when you deploy a single connector to stream changes from multiple SQL Server databases.
1.4.2. General availability features
Debezium 2.7.3 provides new features for the following connectors:
- Section 1.4.2.1, “Features promoted to General Availability”
- Section 1.4.2.2, “GA features that apply to all source connectors”
- Section 1.4.2.3, “JDBC sink connector GA features”
- Section 1.4.2.4, “MongoDB connector GA features”
- Section 1.4.2.5, “Oracle connector GA features”
- Section 1.4.2.6, “PostgreSQL connector GA features”
- Section 1.4.2.7, “SQL Server connector GA features”
1.4.2.1. Features promoted to General Availability
The following features are promoted from Technology Preview to General Availability in the Debezium 2.7.3 release:
1.4.2.1.1. Support for Oracle Real Application Clusters (RAC)
In earlier releases, a Technology Preview feature provided the option to test the use of the Debezium connector for Oracle Database in an Oracle RAC environment. Use of the connector in an Oracle RAC environment is now fully supported.
1.4.2.1.2. Streaming from a PostgreSQL 16 standby (DBZ-7181)
When you use the Debezium connector with a PostgreSQL 16 database, you can now define replication slots on standby instances of the database. Based on this change in the database, you can now connect Debezium to a standby PostgreSQL 16 server and stream changes from it. By performing change data capture from a replica, rather than from the production system, you can better distribute server load, particularly in a very active database.
1.4.2.2. GA features that apply to all source connectors
1.4.2.2.1. Improved event timestamp precision
To improve the precision of timestamps in change events, change events now include four additional fields, two in the envelope, and two in the source information block, as shown in the following example:
Example 1.3. Timestamp fields added to provide greater precision
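The following fragment is a representative sketch; the timestamp values shown are illustrative placeholders:
{
  "ts_ms": 1717501740123,
  "ts_us": 1717501740123456,
  "ts_ns": 1717501740123456789,
  "source": {
    ...
    "ts_ms": 1717501740120,
    "ts_us": 1717501740120450,
    "ts_ns": 1717501740120450500
  }
}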
The envelope values provide microsecond (ts_us) and nanosecond (ts_ns) values. The source information block can include both microsecond and nanosecond precision values. If the source database does not provide a sufficient level of precision, the timestamp values are truncated to a lower precision.
1.4.2.2.2. Improved handling of truncated array fields (DBZ-7925, DBZ-8189)
In earlier versions of Debezium, the column.truncate.* functionality returned a sliced ByteBuffer based on the truncation configuration. Although this worked when using Avro, the truncation was not honored if your connector configuration used the JsonConverter, because that converter operated on the entire underlying array rather than the given slice.
The column truncation logic now explicitly creates a ByteBuffer based on a new array. This change allows the JsonConverter to respect the truncated column value during serialization to Kafka.
The column.truncate.to.length.chars configuration property was improved to support array field types along with the string types that it previously supported. For more information, see the documentation for the SQL-based source connectors in the Debezium User Guide.
1.4.2.2.3. Unified snapshot modes (DBZ-7233)
The snapshot process is an integral part of each connector’s lifecycle, and enables you to collect and send all of the historical data that is present in a data store to a specified target system. In earlier releases, Debezium connectors did not support a consistent set of snapshot modes. This lack of consistency was particularly confusing for users who work with multiple connector types.
In this release, Debezium implements a consistent set of snapshot modes for all connectors. The same snapshot modes are available to all connectors, with the exception of the never mode, which remains specific to the MySQL connector. As a result, certain snapshot modes, such as when_needed, that were previously not available for some connectors, can now be used with any connector.
1.4.2.2.4. Optional delay between snapshot and streaming (DBZ-7902)
By default, after a snapshot completes, Debezium immediately transitions to the streaming phase. You can now add the streaming.delay.ms property to the connector configuration to specify a delay between the end of the snapshot and the start of the streaming phase. This property can help you to prevent unnecessary re-snapshotting, and improve the overall stability of the connector during the transition from snapshot to streaming.
Use this property in combination with the offset.flush.interval.ms property to ensure that offsets are properly flushed before streaming begins. To permit at least one offset flush interval to occur before streaming begins, set streaming.delay.ms to a value that is slightly greater than the value of offset.flush.interval.ms.
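For example, assuming the default Kafka Connect worker setting of offset.flush.interval.ms=60000, the following connector setting, shown as a sketch, delays streaming by slightly more than one flush interval:
streaming.delay.ms=65000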
1.4.2.2.5. Transaction metadata encoded ordering (DBZ-7698)
In some pipelines, event ordering is critical for consuming applications. Certain operations, such as re-partitioning a Kafka topic, can disrupt the normal ordering of events in a data pipeline. Errors might occur when an application attempts to reconstruct the order of events.
In this release, when you enable transaction metadata, metadata events encode their transaction order. The encoded ordering field provides consumers with the information that they require to correctly reconstruct the transaction timeline.
1.4.2.3. JDBC sink connector GA features
1.4.2.3.1. PostgreSQL Arrays with the JDBC sink (DBZ-7752)
The JDBC sink connector supports mapping Kafka ARRAY-based payload fields to target database columns. With Debezium 2.7.3, you can now serialize ARRAY-based fields to a target PostgreSQL database, with no change in configuration.
1.4.2.4. MongoDB connector GA features
1.4.2.4.1. Predicate conditions for MongoDB incremental snapshots
The incremental snapshot process is instrumental in various recovery situations for collecting all or part of the data set from a source table or collection. Relational connectors have long supported supplying an additional-conditions value on the incremental snapshot signal to restrict the data set, providing for targeted resynchronization of specific rows of data. This is now also possible with MongoDB (DBZ-7138). Unlike with relational databases, you supply additional-conditions in JSON format. The condition is applied to the specified collection by using the find operation to obtain the subset of documents to be incrementally snapshotted.
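For illustration only, an execute-snapshot signal that restricts the snapshot to a subset of documents might look like the following sketch; the collection name and filter expression are assumptions for this example:
{
  "type": "execute-snapshot",
  "data": {
    "data-collections": ["inventory.customers"],
    "type": "incremental",
    "additional-conditions": [
      {
        "data-collection": "inventory.customers",
        "filter": "{ \"status\": \"active\" }"
      }
    ]
  }
}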
1.4.2.4.2. ExtractNewDocumentState includes document id for MongoDB deletes (DBZ-7695)
Beginning with this release, the Debezium MongoDB ExtractNewDocumentState single message transformation now adds an _id attribute to the event payload of delete events. The addition of this attribute enhances the ability of consuming applications to make use of the events that the SMT generates.
1.4.2.5. Oracle connector GA features
1.4.2.5.1. Oracle RAC support
See Section 1.4.2.1.1, “Support for Oracle Real Application Clusters (RAC)”.
1.4.2.5.2. Oracle RAW data type to STRING converter (DBZ-7753)
Debezium treats a RAW column type as a series of bytes, and therefore, change events that contain RAW columns use a schema type of BYTES. In the absence of information about how a consuming application uses the data in RAW columns, this default makes sense. However, to support applications that are designed to consume data emitted as STRING types, rather than as BYTES, Debezium now provides a RawToStringConverter, which automatically emits RAW columns as STRING values.
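As a sketch, registering the converter follows the standard Debezium custom converter pattern; the alias (raw2string) is arbitrary, and the fully qualified class name shown is an assumption based on the converter name in this note:
converters=raw2string
raw2string.type=io.debezium.connector.oracle.converters.RawToStringConverter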
For information about how to configure the converter, see the Debezium User Guide.
1.4.2.5.3. Oracle ROW_ID included in change events (DBZ-4332)
While ROW_ID is not unique across all rows of a table for the table’s lifespan, it can be used in certain situations when the lifecycle of the table and rows are managed in a very strict way. At the community’s request, this release adds a new row_id field to the Oracle connector’s change event source information block. This field is populated with the ROW_ID value under the following conditions:
- The field is populated only in streaming events for inserts, updates, and deletes.
- Snapshot events do not contain a row_id value.
- The field is provided only by the LogMiner and XStream streaming adapters; OpenLogReplicator is not supported.

Because the row_id field is marked as optional, any event that does not match these criteria omits the field.
1.4.2.5.4. Oracle flush table with custom schema names (DBZ-7819)
In earlier versions of Debezium, the Oracle connector was designed to create the LogMiner flush table in the default tablespace of the connector user account. This could cause problems if the DBA wanted to move the flush table to a different tablespace. Previously, adding the flush table to the preferred tablespace required several steps. First, you would modify the user account to specify a new default tablespace, or create a different account that specified that tablespace. Then, you would have to re-create the table so that it was placed in the correct location.
Beginning with this release, you can use the log.mining.flush.table.name property to specify where to create the flush table, as shown in the following example:
Example 1.4. Configuration that uses a custom schema name to specify the flush table location
log.mining.flush.table.name=THE_OTHER_SCHEMA.LOG_MINING_FLUSH_TABLE
The schema name is optional. If you do not specify a schema name, the connector follows the legacy behavior and uses the default tablespace of the connector user account to create the flush table and check for its existence.
1.4.2.5.5. Oracle query filter with large numbers of tables (DBZ-7847)
The Debezium Oracle connector can support thousands of tables in a single connector deployment. When a high volume of changes occurs across the tables of a configured database, you might want to customize the query filter by using the IN mode, so that the dataset is filtered at the database level before changes are passed to Debezium for processing.
In earlier versions, a SQL error would result if the connector’s log.mining.query.filter.mode was set to in, and the table include list contained more than 1000 elements, because Oracle does not permit more than 1000 elements within an IN clause. To address this limitation, Debezium 2.7.3 uses a disjunction across multiple IN clauses of at most 1000 elements each.
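For illustration, the generated predicate takes roughly the following shape, with each IN list capped at 1000 elements; the table names are placeholders, and the actual query that Debezium generates differs in its details:
WHERE TABLE_NAME IN ('TABLE_0001', ..., 'TABLE_1000')
   OR TABLE_NAME IN ('TABLE_1001', ..., 'TABLE_1500')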
1.4.2.5.6. Oracle Redo SQL per event with LogMiner (DBZ-6960)
The Debezium Oracle connector can now report the original SQL for an operation in the source information block of INSERT, UPDATE, and DELETE event records. To configure the connector to include the REDO SQL as part of the change event, set the log.mining.include.redo.sql property as shown in the following example:
"log.mining.include.redo.sql": "true"
"log.mining.include.redo.sql": "true"
This is an opt-in feature; to use it, you must explicitly enable it.
Enabling this feature can more than double the size of event payloads.
When this option is enabled, the source information block contains a new field redo_sql, as shown in the following example:
Example 1.5. INSERT source information block with REDO SQL enabled
"source": {
...
"redo_sql": "INSERT INTO \"DEBEZIUM\".\"TEST\" (\"ID\",\"DATA\") values ('1', 'Test');"
}
"source": {
...
"redo_sql": "INSERT INTO \"DEBEZIUM\".\"TEST\" (\"ID\",\"DATA\") values ('1', 'Test');"
}
Because of the way that LogMiner reconstructs the SQL related to CLOB, BLOB, and XML data types, you cannot enable the use of REDO SQL if lob.enabled is set to true. If you enable both log.mining.include.redo.sql and lob.enabled, after you start the connector it reports a configuration error.
1.4.2.5.7. Oracle LogMiner transaction buffer improvements
A new delay strategy for transaction registration has been added when using LogMiner. This strategy effectively delays the creation of the transaction record in the buffer until after the connector captures the first change for that transaction.
If you set lob.enabled to true, you cannot use this delay strategy, because of the way that the connector handles specific operations in these two modes.
Delaying transaction registration has a number of benefits, which include:
- Reduces the overhead on the transaction cache, especially when the connector processes large numbers of concurrent transactions.
- Prevents the connector from capturing long-running transactions that have no changes.
- Facilitates the advance of the low-watermark SCN in the offsets in some scenarios.
1.4.2.6. PostgreSQL connector GA features
See Section 1.4.2.1, “Features promoted to General Availability”.
1.4.2.7. SQL Server connector GA features
1.4.2.7.1. SQL Server query improvements (DBZ-7273)
The Debezium SQL Server connector uses a common SQL Server stored procedure called fn_cdc_get_all_changes… to fetch all of the relevant captured changes for a given table. This query performs several unions and then returns data from only one of the union sub-queries, which can be inefficient.
This release of the connector introduces the configuration property data.query.mode, which specifies the method that the connector uses to gather details about table changes. The default behavior remains unchanged from earlier releases, and is set by using the value function to delegate to the previously mentioned stored procedure.
To enhance the efficiency with which the connector gathers changes, configure the connector to build the query internally by setting the value of data.query.mode to direct.
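For example:
data.query.mode=direct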
1.4.2.7.2. Heartbeat action query now supported for the SQL Server connector (DBZ-7801)
Debezium 2.7.3 adds support for the heartbeat.action.query connector configuration property to the SQL Server connector. The heartbeat.action.query property enables the connector to perform a write operation to the source database on an interval defined by heartbeat.interval.ms. The write operation is meant to produce a change event that is captured by the connector, and is then sent to Kafka or the target system.
In an active database that is capturing changes regularly, the constant stream of changes is sufficient to keep offsets synchronized with the read position in the transaction logs. However, if the connector captures changes from a source in which more changes occur in non-captured tables than in captured tables, setting heartbeat.action.query can help to keep the read position in the offsets synchronized, despite the lower capture activity.
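As an illustrative sketch only, the following configuration assumes a hypothetical heartbeat table, dbo.debezium_heartbeat, that exists in the source database, is enabled for CDC, and is included in the connector’s capture list:
heartbeat.interval.ms=60000
heartbeat.action.query=UPDATE dbo.debezium_heartbeat SET last_heartbeat = GETDATE()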
1.4.3. Technology Preview features
The following Technology Preview features are available in Debezium 2.7.3:
- Section 1.4.3.1, “Debezium connector for MariaDB (DBZ-7693)”
- Section 1.4.3.2, “Compatibility with Oracle Database 23ai”
- Section 1.4.3.3, “Use of large data types (BLOB, CLOB, and NCLOB) with the Oracle connector”
- Section 1.4.3.4, “Parallel initial snapshots for SQL connectors”
- Section 1.4.3.5, “CloudEvents converter”
- Section 1.4.3.6, “Custom converters”
Technology Preview features are not supported with Red Hat production service-level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend implementing any Technology Preview features in production environments. Technology Preview features provide early access to upcoming product innovations, enabling you to test functionality and provide feedback during the development process. For more information about support scope, see Technology Preview Features Support Scope.
1.4.3.1. Debezium connector for MariaDB (DBZ-7693)
In the previous release, you could configure the Debezium MySQL connector to run against a MariaDB database deployment. Debezium 2.7.3 goes further by introducing a new standalone connector implementation for MariaDB.
With this addition to the suite of connectors, Debezium withdraws the previous ability to set the connection.adapter configuration to enable use of the MySQL connector with a MariaDB database.
1.4.3.2. Compatibility with Oracle Database 23ai
This release of Debezium provides integration with Oracle 23ai as a Technology Preview feature.
1.4.3.3. Use of large data types (BLOB, CLOB, and NCLOB) with the Oracle connector
For more information, see Oracle binary character LOB types in the Debezium User Guide.
1.4.3.4. Parallel initial snapshots for SQL connectors
The ability to use multiple parallel threads to perform an initial snapshot is available as a Technology Preview feature for all of the Debezium SQL connectors, with the exception of the MySQL connector. For the Debezium MySQL connector, parallel initial snapshots are available as a Developer Preview feature.
To configure connectors to use parallel initial snapshots, set the snapshot.max.threads property in the connector configuration to a value greater than 1.
For more information, see snapshot.max.threads in the Debezium User Guide documentation for your connector.
1.4.3.5. CloudEvents converter
The CloudEvents converter emits change event records that conform to the CloudEvents specification. The CloudEvents change event envelope can be JSON or Avro and each envelope type supports JSON or Avro as the data format. For more information, see CloudEvents converter.
1.4.3.6. Custom converters
In cases where the default data type conversions do not meet your needs, you can create custom converters to use with a connector. For more information, see Custom-developed converters.
1.4.4. Developer Preview features
The following Developer Preview features are available in Debezium 2.7.3:
- Section 1.4.4.1, “Debezium Server (Developer Preview)”
- Section 1.4.4.2, “MongoDB connector Developer Preview features”
- Section 1.4.4.3, “MySQL connector Developer Preview features”
- Section 1.4.4.4, “Oracle connector Developer Preview features”
- Section 1.4.4.5, “PostgreSQL connector Developer Preview features”
Developer Preview features are not supported by Red Hat in any way and are not functionally complete or production-ready. Do not use Developer Preview software for production or business-critical workloads. Developer Preview software provides early access to upcoming product software in advance of its possible inclusion in a Red Hat product offering. Customers can use this software to test functionality and provide feedback during the development process. This software might not have any documentation, is subject to change or removal at any time, and has received limited testing. Red Hat might provide ways to submit feedback on Developer Preview software without an associated SLA.
For more information about the support scope of Red Hat Developer Preview software, see Developer Preview Support Scope.
1.4.4.1. Debezium Server (Developer Preview)
Debezium Server is a ready-to-use application that streams change events from a data source directly to a configured Kafka or Redis data sink. Unlike the generally available Debezium connectors, the deployment of Debezium Server has no dependencies on Apache Kafka Connect. For more information about the Debezium Server Developer Preview, see the Debezium User Guide.
1.4.4.2. MongoDB connector Developer Preview features
1.4.4.2.1. MongoDB collection-scoped streaming (Developer Preview) (DBZ-7760)
In previous iterations of the Debezium MongoDB connector, change streams could be opened against the deployment and database scopes, which was not always ideal for restrictive permission environments. Debezium 2.7.3 introduces a new change stream mode in which the connector operates within the scope of a single collection only, allowing for more granular permissions configurations.
To limit the capture scope of a MongoDB connector to a specific collection, set the value of the capture.scope property in the connector configuration to collection. Use this setting when you intend for a connector to capture changes only from a single MongoDB collection.
Certain limitations exist when using this feature. If you set the value of capture.scope to collection, the connector cannot use the default source signaling channel. Enabling the source channel for a connector is required to permit processing of incremental snapshot signals, including signals sent via the Kafka, JMX, or file channels. Thus, if the connector configuration sets the collection option for the capture.scope property, the connector cannot perform incremental snapshots.
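For example:
capture.scope=collection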
1.4.4.3. MySQL connector Developer Preview features
1.4.4.3.1. MySQL parallel schema snapshots (DBZ-6472) (Developer Preview)
To improve snapshot performance, the MySQL connector implements parallelization to concurrently produce snapshot change events and generate schema events for tables. By running snapshots and generating schema events in parallel, the connector reduces the time required to capture the schema for many tables in your database.
1.4.4.3.2. MySQL parallel initial snapshots (Developer Preview)
The Debezium initial snapshot for MySQL has always been single-threaded. This limitation primarily stems from the complexities of ensuring data consistency across multiple transactions.
In this release, you can configure a MySQL connector to use multiple threads to execute table-level snapshots in parallel.
To take advantage of this new feature, add the snapshot.max.threads property to the connector configuration, and set it to a value greater than 1.
Example 1.6. Example configuration using parallel snapshots
snapshot.max.threads=4
Based on the configuration in the preceding example, the connector can snapshot a maximum of four tables at once. If the number of tables to snapshot is greater than four, after a thread finishes processing one of the first four tables, it then finds the next table in the queue and begins to perform a snapshot. The process continues until the connector finishes performing snapshots on all of the designated tables.
For more information, see snapshot.max.threads in the Debezium User Guide.
1.4.4.4. Oracle connector Developer Preview features
1.4.4.4.1. Ingesting changes from a logical standby (Developer Preview)
When the Debezium connector for Oracle connects to a production or primary database, it uses an internal flush table to manage the flush cycles of the Oracle Log Writer Buffer (LGWR) process. The flush process requires that the user account through which the connector accesses the database has permission to create and write to this flush table. However, logical standby databases often have restrictive data manipulation rules, and might even be read-only. As a result, writing to the database might not be feasible.
To support an Oracle read-only logical standby database, Debezium introduces a property to disable the creation and management of the flush table. You can use this feature with both Oracle Standalone and Oracle RAC installations.
To enable the Oracle connector to use a read-only logical standby, add the following connector option:
internal.log.mining.read.only=true
For more information, see the Oracle connector documentation in the Debezium User Guide.
1.4.4.4.2. Oracle LogMiner hybrid mining strategy
You can now configure the Debezium connector for Oracle to use the LogMiner hybrid mining strategy. This strategy is designed to support all of the schema evolution features of the default mining strategy while also taking advantage of the performance optimizations offered by the online catalog strategy. To configure the connector to use this strategy, set the log.mining.strategy configuration property to the value hybrid.
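For example:
log.mining.strategy=hybrid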
With the online_catalog strategy, a schema change and a data change can occur within the same mining step. When this happens, LogMiner cannot reconstruct the SQL correctly, and the resulting table is assigned a name such as OBJ# xxxxxx; similarly, columns are represented as COL1, COL2, and so on. Avoiding this situation with the online catalog strategy requires you to carefully choreograph schema changes so that a mining step never observes a schema change and a data change together; however, this is not always feasible.
By contrast, the hybrid strategy tracks a table’s object ID at the database level, and then uses this ID to look up the schema associated with the table from Debezium’s relational table model. Debezium is thus able to do what Oracle LogMiner cannot do in these specific corner cases. The table name is taken from the relational model’s table name, and columns are mapped by column position.
Unfortunately, Oracle does not provide a way to reconstruct failed SQL operations for CLOB, BLOB, and XML data types. As a result, you cannot configure the connector to use the hybrid strategy if the value of lob.enabled is set to true. If you attempt to start a connector that is configured to use the hybrid strategy and has lob.enabled set to true, the connector fails to start and reports a configuration error.
1.4.4.5. PostgreSQL connector Developer Preview features
1.4.4.5.1. Exactly-once delivery for PostgreSQL streaming (Developer Preview)
In this release, Debezium provides support for exactly-once semantics for the PostgreSQL connector. Exactly-once delivery for PostgreSQL applies only to the streaming phase; exactly-once delivery does not apply to snapshots.
Debezium was designed to provide at-least-once delivery, with the goal of ensuring that connectors capture all change events that occur in the monitored sources. In KIP-618, the Apache Kafka community proposes a solution to address problems that occur when a producer retries a message. Source connectors sometimes resend batches of events to the Kafka broker, even after the broker previously committed the batch. This situation can result in duplicate events being sent to consumers (sink connectors), which can cause problems for consumers that do not handle duplicates easily.
No connector configuration changes are required to enable exactly-once delivery. However, exactly-once delivery must be configured in your Kafka Connect worker configuration. For information about setting the required Kafka Connect configuration properties, see KIP-618.
To set exactly.once.support to required, all connectors in the Kafka Connect cluster must support exactly-once delivery. If you attempt to set this option in a cluster in which workers do not consistently support exactly-once delivery, the connectors that do not support this feature fail validation at start-up.
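For illustration, a minimal sketch of the related settings on a Kafka Connect version that implements KIP-618; the first property belongs in the worker configuration, and the second in the connector configuration:
exactly.once.source.support=enabled
exactly.once.support=required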
1.4.5. Other updates in this release
This Debezium 2.7.3 release provides multiple other feature updates and fixes. For a complete list, see the Debezium 2.7.3 Enhancement Advisory (RHEA-2024:139598).
1.5. Deprecated features
The following features are deprecated and will be removed in a future release:
- Deprecation of schema_only and schema_only_recovery snapshot modes
  - The schema_only_recovery mode is deprecated and is replaced by the recovery mode.
  - The schema_only mode is deprecated and is replaced by the no_data mode.
The current release continues to include the deprecated snapshot modes, but they are scheduled for removal in a future release. To prepare for their removal, adjust any scripts, configurations, and processes that depend on these deprecated modes.
For information about features that were deprecated in the previous release, see the Debezium 2.5.4 Release Notes.
1.6. Known issues
The following known issue affects Debezium 2.7.3: