Chapter 8. Known issues
Known issues in Streams for Apache Kafka 3.1 on OpenShift.
8.1. Apache Kafka
A summary of known issues for Apache Kafka.
8.1.1. Intra-broker log directory reassignment can cause a log directory to go offline
When using multiple log directories per broker (JBOD) and performing intra-broker log directory reassignment (moving replicas between log directories on the same broker), Apache Kafka can incorrectly mark a log directory as failed if a transient filesystem or I/O error occurs during the operation.
This issue is caused by a race condition between background log flush operations and file deletion during replica movement. Under these conditions, Kafka may encounter a NoSuchFileException or a related I/O error and treat it as a fatal storage failure. As a result, the broker takes the entire log directory offline to protect data integrity, and any partitions stored on that directory become unavailable. The log directory can remain marked as failed even after the reassignment completes.
This behavior affects intra-broker log directory reassignment only. Inter-broker partition reassignment is not affected.
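For reference, an intra-broker move of this kind is requested by including log_dirs entries in the reassignment JSON passed to the kafka-reassign-partitions.sh tool. The topic name, broker ID, and directory path below are illustrative:

```json
{
  "version": 1,
  "partitions": [
    {
      "topic": "my-topic",
      "partition": 0,
      "replicas": [1],
      "log_dirs": ["/var/lib/kafka/data-1/kafka-log1"]
    }
  ]
}
```

Because the replica stays on broker 1 and only the target log directory changes, this is an intra-broker move and is subject to the issue described above.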
Workaround
Restart the affected Kafka broker. On restart, the broker re-scans the log directories and marks the disk as healthy if no underlying filesystem issue is present.
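On OpenShift, a broker pod managed by Streams for Apache Kafka can be restarted cleanly by annotating it for a manual rolling update rather than deleting it directly. The cluster and pod names below are illustrative:

```
oc annotate pod my-cluster-kafka-0 strimzi.io/manual-rolling-update="true"
```

The Cluster Operator restarts the annotated pod during the next reconciliation.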
This is a known issue in Apache Kafka. Work to address this issue is being tracked in the Apache Kafka issue tracker (KAFKA-19571). A fix will be included in a future release of Red Hat Streams for Apache Kafka once it is available in the underlying Apache Kafka distribution.
8.1.2. Enabling eligible.leader.replicas.version
A change in Apache Kafka 4.0.0 introduces restrictions on how the min.insync.replicas configuration is managed when the Eligible Leader Replicas (ELR) feature is enabled by setting eligible.leader.replicas.version to 1.
ELR is disabled by default. Enabling ELR without understanding these restrictions can lead to configuration reconciliation issues or cluster instability.
When ELR is enabled, the behavior of min.insync.replicas changes as follows:
Cluster-level (default):
- Cannot be removed.
- Changing it clears the ELR configuration from all topics.
Broker-level:
- Ignored entirely.
- Cannot be changed.
- It is recommended to remove all broker-level min.insync.replicas settings before enabling ELR.
Topic-level:
- Changing min.insync.replicas clears the ELR configuration for that topic.
Workaround
Only set eligible.leader.replicas.version to 1 if you have a specific use case that requires ELR.
If you must use ELR:
- Remove all broker-level min.insync.replicas configurations. These settings are ignored when ELR is enabled but can lead to reconciliation errors and, in some cases, leaderless partitions during rolling updates or broker restarts.
- Avoid changing the cluster-level default unless you intend to remove ELR configuration from all topics.
- Avoid changing the topic-level configuration unless you want to remove ELR configuration from that topic.
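If you do decide to enable ELR after removing the broker-level settings, the feature level is raised with the standard Kafka feature tool. The bootstrap address below is illustrative:

```
./bin/kafka-features.sh --bootstrap-server my-cluster-kafka-bootstrap:9092 \
  upgrade --feature eligible.leader.replicas.version=1
```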
8.2. Streams for Apache Kafka
A summary of known issues for Streams for Apache Kafka.
8.2.1. Multi-version upgrades from the OperatorHub LTS channel
Currently, multi-version upgrades between Long Term Support (LTS) versions are not supported through the Operator Lifecycle Manager (OLM) when using the OperatorHub LTS channel.
For example, you cannot directly upgrade from version 2.2 LTS to version 2.9 LTS. Instead, you must perform incremental upgrades, stepping through each intermediate minor version to reach version 2.9.
8.2.2. Cruise Control CPU utilization estimation
Cruise Control for Streams for Apache Kafka has a known issue related to the estimation of CPU utilization. CPU utilization is calculated as a percentage of the defined capacity of a broker pod. The issue occurs when running Kafka brokers across nodes with varying numbers of CPU cores. For example, node1 might have 2 CPU cores and node2 might have 4 CPU cores. In this situation, Cruise Control can underestimate or overestimate the CPU load of brokers. The issue can prevent cluster rebalances when a pod is under heavy load.
There are two workarounds for this issue.
Workaround one
Equal CPU requests and limits: You can set CPU requests equal to CPU limits in Kafka.spec.kafka.resources. That way, all CPU resources are reserved upfront and are always available. This configuration allows Cruise Control to properly evaluate the CPU utilization when preparing the rebalance proposals based on CPU goals.
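For example, fixing the CPU capacity upfront in the Kafka custom resource might look like this fragment (the values are illustrative):

```yaml
spec:
  kafka:
    resources:
      requests:
        cpu: "2"
      limits:
        cpu: "2"  # equal to the request, so the broker's CPU capacity is fixed
```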
Workaround two
Exclude CPU goals: You can exclude CPU goals from the hard and default goals specified in the Cruise Control configuration.
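For example, the CpuCapacityGoal can be dropped from the hard goals in spec.cruiseControl.config. The goal list shown here is a trimmed, illustrative fragment; apply the same change to default.goals if it lists CPU goals:

```yaml
spec:
  cruiseControl:
    config:
      hard.goals: >
        com.linkedin.kafka.cruisecontrol.analyzer.goals.RackAwareGoal,
        com.linkedin.kafka.cruisecontrol.analyzer.goals.ReplicaCapacityGoal,
        com.linkedin.kafka.cruisecontrol.analyzer.goals.DiskCapacityGoal,
        com.linkedin.kafka.cruisecontrol.analyzer.goals.NetworkInboundCapacityGoal,
        com.linkedin.kafka.cruisecontrol.analyzer.goals.NetworkOutboundCapacityGoal
      # CpuCapacityGoal intentionally omitted from the list above
```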
For more information, see Insufficient CPU capacity.
8.2.3. JMX authentication when running in FIPS mode
When running Streams for Apache Kafka in FIPS mode with JMX authentication enabled, clients may fail authentication. To work around this issue, do not enable JMX authentication while running in FIPS mode. We are investigating the issue and working to resolve it in a future release.
8.3. Kafka Bridge
There are no new or existing known issues for the Kafka Bridge.
8.4. Proxy
A summary of known issues for Streams for Apache Kafka Proxy.
8.4.1. Proxy pod may restart without a user-initiated configuration change
The Streams for Apache Kafka Proxy Operator may trigger unnecessary restarts of the proxy pod even when no user-initiated configuration changes have occurred.
This issue stems from how the Operator tracks changes to ConfigMap and Secret resources. These resources do not include a generation field, so the Operator falls back to using the resourceVersion and UUID to detect changes. However, resourceVersion can be incremented by non-user activity, such as etcd performing automatic encryption key rotation (typically weekly), resulting in a new resourceVersion.
As a result, the Operator may interpret system-driven updates as user changes and trigger a restart of the proxy pod. This issue will be addressed in a future release.
8.4.2. Message production fails for records ~1MB with Record Encryption filter
When using the Record Encryption filter, producing Kafka messages with a size of approximately 1MB (specifically 1,048,319 bytes or larger) will fail. This is caused by an internal limitation in the encryption filter that incorrectly calculates the required encryption operations for records of this size, even if the Kafka broker is configured to accept larger messages.
Symptom
- Kafka producers will repeatedly fail with NETWORK_EXCEPTION errors and disconnect from the proxy.
- The proxy logs will show a WARN message with a DekUsageException error, stating: The Encryptor has no more operations allowed. The connection is then closed.
Workaround
Ensure individual Kafka records are kept below the 1,048,319-byte threshold when the Record Encryption filter is active.
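One way to apply this workaround is a client-side guard that checks the serialized record size before the record is handed to a producer. The sketch below is illustrative: the function name is hypothetical, and only the 1,048,319-byte threshold comes from the issue described above.

```python
from typing import Optional

# Records whose serialized size reaches this threshold fail behind the
# Record Encryption filter (per the known issue above).
MAX_ENCRYPTABLE_RECORD_BYTES = 1_048_319

def is_record_encryptable(key: Optional[bytes], value: bytes) -> bool:
    """Return True if the serialized key + value stay below the threshold."""
    size = (len(key) if key else 0) + len(value)
    return size < MAX_ENCRYPTABLE_RECORD_BYTES
```

A producer wrapper could call this check and reject or split oversized records before they reach the proxy.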
8.4.3. Describing consumer groups fails when using the Authorization filter
When using the Authorization filter, attempts to describe consumer groups through the proxy fail.
This issue occurs when running the kafka-consumer-groups --describe command against a proxy configured with the Authorization filter.
Symptom
- The kafka-consumer-groups command fails with the following error:
org.apache.kafka.common.errors.UnsupportedVersionException: The node does not support DESCRIBE_GROUPS
Workaround
Describe consumer groups by connecting directly to the Kafka cluster rather than through the proxy.
8.4.4. Proxy fails to start when the SASL Inspection filter includes SASL subject builder configuration
The proxy cannot be configured with a SASL Inspection filter that includes SASL subject builder configuration. This issue is caused by a configuration parsing defect. If such a configuration is applied, the proxy fails during startup.
Symptom
- The proxy fails to start and throws a PluginConfigurationException with a ClassCastException, indicating that a LinkedHashMap cannot be cast to the expected subject builder configuration class.
Impact
This issue prevents mapping the SASL authorized ID. As a result, it may not be possible to define concise or targeted ACL rules.
Workaround
There is no workaround. Do not configure SASL subject builder settings in the SASL Inspection filter until this issue is resolved.
8.5. Console
A summary of known issues for Streams for Apache Kafka Console.
8.5.1. Role bindings apply to all Kafka clusters with the same name, regardless of namespace
The console allows Kafka clusters to have the same name in different namespaces. However, when configuring a role to grant role-based access control (RBAC) permissions to a specific Kafka cluster by name, the configuration does not account for the cluster’s namespace. As a result, permissions granted to a resource by name apply to all Kafka clusters with that name, across all namespaces.
Symptom
A role intended to grant access to a single Kafka cluster (my-kafka in namespace1) inadvertently grants access to all other clusters with the same name in different namespaces, such as my-kafka in namespace2. It is not possible to configure a role that distinguishes between Kafka clusters with identical names located in separate namespaces.
Workaround
To isolate permissions for a specific Kafka cluster, ensure that all cluster names are globally unique across all namespaces.