Chapter 8. Known issues
This section lists the known issues for AMQ Streams 2.3 on OpenShift.
8.1. Kafka Bridge sending messages with CORS enabled
If Cross-Origin Resource Sharing (CORS) is enabled for the Kafka Bridge, a 400 Bad Request error is returned when sending an HTTP request to produce messages.
Workaround
To avoid this error, disable CORS in the Kafka Bridge configuration. This issue will be fixed in a future release of AMQ Streams.
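In the KafkaBridge custom resource, CORS is enabled through the http.cors properties, so omitting that block keeps CORS disabled. The following is a minimal sketch, not the full configuration; the bridge name and bootstrap address are assumptions for illustration.

Example KafkaBridge configuration with CORS disabled

apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaBridge
metadata:
  name: my-bridge
spec:
  replicas: 1
  bootstrapServers: my-cluster-kafka-bootstrap:9092
  http:
    port: 8080
    # Omit the cors block to keep CORS disabled:
    # cors:
    #   allowedOrigins: "https://example.com"
    #   allowedMethods: "GET,POST,PUT,DELETE,OPTIONS,PATCH"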
To use CORS, you can deploy Red Hat 3scale for the Kafka Bridge.
- For information on deploying 3scale, see Using 3scale API Management with the AMQ Streams Kafka Bridge.
- For information on CORS request handling by 3scale, see Administering the API Gateway.
8.2. AMQ Streams Cluster Operator on IPv6 clusters
The AMQ Streams Cluster Operator does not start on Internet Protocol version 6 (IPv6) clusters.
Workaround
There are two workarounds for this issue.
Workaround one: Set the KUBERNETES_MASTER environment variable
- Display the address of the Kubernetes master node of your OpenShift Container Platform cluster:

oc cluster-info
Kubernetes master is running at <master_address>
# ...

- Copy the address of the master node.
- List all Operator subscriptions:

oc get subs -n <operator_namespace>

- Edit the Subscription resource for AMQ Streams:

oc edit sub amq-streams -n <operator_namespace>

- In spec.config.env, add the KUBERNETES_MASTER environment variable, set to the address of the Kubernetes master node. For example:

apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: amq-streams
  namespace: <operator_namespace>
spec:
  channel: amq-streams-1.8.x
  installPlanApproval: Automatic
  name: amq-streams
  source: mirror-amq-streams
  sourceNamespace: openshift-marketplace
  config:
    env:
    - name: KUBERNETES_MASTER
      value: MASTER-ADDRESS
- Save and exit the editor.
- Check that the Subscription was updated:

oc get sub amq-streams -n <operator_namespace>

- Check that the Cluster Operator Deployment was updated to use the new environment variable:

oc get deployment <cluster_operator_deployment_name>
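To confirm that the new environment variable reached the Cluster Operator container, you can also list the environment of the deployment. This is a sketch; substitute your actual Cluster Operator deployment name:

oc set env deployment/<cluster_operator_deployment_name> --list | grep KUBERNETES_MASTER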
Workaround two: Disable hostname verification
- List all Operator subscriptions:

oc get subs -n <operator_namespace>

- Edit the Subscription resource for AMQ Streams:

oc edit sub amq-streams -n <operator_namespace>

- In spec.config.env, add the KUBERNETES_DISABLE_HOSTNAME_VERIFICATION environment variable, set to true. For example:

apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: amq-streams
  namespace: <operator_namespace>
spec:
  channel: amq-streams-1.8.x
  installPlanApproval: Automatic
  name: amq-streams
  source: mirror-amq-streams
  sourceNamespace: openshift-marketplace
  config:
    env:
    - name: KUBERNETES_DISABLE_HOSTNAME_VERIFICATION
      value: "true"
- Save and exit the editor.
- Check that the Subscription was updated:

oc get sub amq-streams -n <operator_namespace>

- Check that the Cluster Operator Deployment was updated to use the new environment variable:

oc get deployment <cluster_operator_deployment_name>
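Alternatively, you can print the container environment directly from the Deployment spec. A sketch, assuming the Cluster Operator runs as the first container in the pod template:

oc get deployment <cluster_operator_deployment_name> -o jsonpath='{.spec.template.spec.containers[0].env}'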
8.3. Cruise Control CPU utilization estimation
Cruise Control for AMQ Streams has a known issue in its estimation of CPU utilization. CPU utilization is calculated as a percentage of the defined capacity of a broker pod. The issue occurs when Kafka brokers run across nodes with varying numbers of CPU cores. For example, node1 might have 2 CPU cores and node2 might have 4 CPU cores. In this situation, Cruise Control can underestimate or overestimate the CPU load of brokers. The issue can prevent cluster rebalances when a pod is under heavy load.
Workaround
There are two workarounds for this issue.
Workaround one: Equal CPU requests and limits
You can set CPU requests equal to CPU limits in Kafka.spec.kafka.resources. That way, all CPU resources are reserved upfront and are always available. This configuration allows Cruise Control to properly evaluate CPU utilization when preparing rebalance proposals based on CPU goals.
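The following sketch shows the idea; the CPU and memory values are illustrative, not recommendations.

Example Kafka configuration with equal CPU requests and limits

apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: my-cluster
spec:
  kafka:
    # Setting requests equal to limits gives each broker a guaranteed
    # CPU allocation that Cruise Control can measure against.
    resources:
      requests:
        cpu: "2"
        memory: 8Gi
      limits:
        cpu: "2"
        memory: 8Gi
    # ...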
Workaround two: Exclude CPU goals
You can exclude CPU goals from the hard and default goals specified in the Cruise Control configuration.
Example Cruise Control configuration without CPU goals
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: my-cluster
spec:
  kafka:
    # ...
  zookeeper:
    # ...
  entityOperator:
    topicOperator: {}
    userOperator: {}
  cruiseControl:
    brokerCapacity:
      inboundNetwork: 10000KB/s
      outboundNetwork: 10000KB/s
    config:
      hard.goals: >
        com.linkedin.kafka.cruisecontrol.analyzer.goals.RackAwareGoal,
        com.linkedin.kafka.cruisecontrol.analyzer.goals.MinTopicLeadersPerBrokerGoal,
        com.linkedin.kafka.cruisecontrol.analyzer.goals.ReplicaCapacityGoal,
        com.linkedin.kafka.cruisecontrol.analyzer.goals.DiskCapacityGoal,
        com.linkedin.kafka.cruisecontrol.analyzer.goals.NetworkInboundCapacityGoal,
        com.linkedin.kafka.cruisecontrol.analyzer.goals.NetworkOutboundCapacityGoal
      default.goals: >
        com.linkedin.kafka.cruisecontrol.analyzer.goals.RackAwareGoal,
        com.linkedin.kafka.cruisecontrol.analyzer.goals.MinTopicLeadersPerBrokerGoal,
        com.linkedin.kafka.cruisecontrol.analyzer.goals.ReplicaCapacityGoal,
        com.linkedin.kafka.cruisecontrol.analyzer.goals.DiskCapacityGoal,
        com.linkedin.kafka.cruisecontrol.analyzer.goals.NetworkInboundCapacityGoal,
        com.linkedin.kafka.cruisecontrol.analyzer.goals.NetworkOutboundCapacityGoal,
        com.linkedin.kafka.cruisecontrol.analyzer.goals.ReplicaDistributionGoal,
        com.linkedin.kafka.cruisecontrol.analyzer.goals.PotentialNwOutGoal,
        com.linkedin.kafka.cruisecontrol.analyzer.goals.DiskUsageDistributionGoal,
        com.linkedin.kafka.cruisecontrol.analyzer.goals.NetworkInboundUsageDistributionGoal,
        com.linkedin.kafka.cruisecontrol.analyzer.goals.NetworkOutboundUsageDistributionGoal,
        com.linkedin.kafka.cruisecontrol.analyzer.goals.TopicReplicaDistributionGoal,
        com.linkedin.kafka.cruisecontrol.analyzer.goals.LeaderReplicaDistributionGoal,
        com.linkedin.kafka.cruisecontrol.analyzer.goals.LeaderBytesInDistributionGoal
For more information, see Insufficient CPU capacity.
8.4. User Operator scalability
The User Operator can time out when creating multiple users at the same time because reconciliation can take too long.
Workaround
If you encounter this issue, reduce the number of users you are creating at the same time, and wait until they are ready before creating more users.
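For example, you might apply KafkaUser resources in batches and wait for each batch to become ready before applying the next. This is a sketch; the file names are assumptions:

oc apply -f users-batch-1.yaml
oc wait kafkauser --all -n <namespace> --for=condition=Ready --timeout=300s
oc apply -f users-batch-2.yaml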
8.5. OAuth password grants configuration
The Kafka Bridge does not currently handle OAuth password grants correctly; the OAuth authentication is not configured properly.
This will be fixed for the next release.
Issue Number | Description
| Newly added OAuth Password Grant feature not working in Kafka Bridge
8.6. OpenTelemetry: running Jaeger with TLS enabled
Support for tracing using OpenTelemetry is built into the following Kafka components:
- Kafka Connect
- MirrorMaker
- MirrorMaker 2
- AMQ Streams Kafka Bridge
When you use the Jaeger exporter, trace data is sent to the Jaeger gRPC endpoint. By default, this endpoint does not have TLS enabled, but it can be configured to use TLS when the Jaeger instance is deployed using a Jaeger operator. For example, the Red Hat OpenShift distributed tracing operator, which is a Jaeger operator, automatically enables TLS when running on OpenShift. Jaeger instances with TLS enabled on the gRPC endpoint are not supported on AMQ Streams.
There are two workarounds for this issue.
Workaround one: Disable TLS on the gRPC endpoint
Create a Jaeger custom resource and disable TLS on the gRPC port by specifying the following properties:
- collector.grpc.tls.enabled: false
- reporter.grpc.tls.enabled: false
Example Jaeger custom resource to disable TLS
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: my-jaeger
spec:
  allInOne:
    options:
      agent.grpc.tls.enabled: false
      collector.grpc.tls.enabled: false
This configuration uses the allInOne strategy, which deploys all Jaeger components in a single pod. Other deployment strategies, such as the production strategy for production environments, deploy the Jaeger components in separate pods for increased scalability and reliability.
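For reference, a minimal sketch of a Jaeger custom resource that selects the production strategy; the TLS options shown above would still need to be disabled on the relevant components.

Example Jaeger custom resource using the production strategy

apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: my-jaeger
spec:
  # Deploys the collector and query components in separate pods
  strategy: production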
Workaround two: Export traces through an OpenTelemetry collector
Use an OpenTelemetry collector to receive, process, and export the trace data. To resolve the issue by exporting trace data through a collector, follow these steps:
- Deploy the Red Hat OpenShift distributed tracing collection operator.
- Configure an OpenTelemetryCollector custom resource to deploy a collector that receives trace data through a non-TLS-enabled endpoint and passes it to a TLS-enabled endpoint.
- In the custom resource, specify the receivers properties to create a non-TLS-enabled Jaeger gRPC endpoint on port 14250. You can also create other endpoints, such as an OTLP endpoint, if you are using other tracing systems.
- Specify the exporters properties to point to the TLS-enabled Jaeger gRPC endpoint.
- Declare the pipeline configuration in the pipelines properties of the custom resource.
In the following example, the pipeline runs from the Jaeger and OTLP receivers to a Jaeger gRPC endpoint.
Example OpenTelemetry collector configuration
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: cluster-collector
  namespace: <namespace>
spec:
  mode: deployment
  config: |
    receivers:
      otlp:
        protocols:
          grpc:
          http:
      jaeger:
        protocols:
          grpc:
    exporters:
      jaeger:
        endpoint: jaeger-all-in-one-inmemory-collector-headless.openshift-distributed-tracing.svc.cluster.local:14250
        tls:
          ca_file: "/var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt"
    service:
      pipelines:
        traces:
          receivers: [otlp,jaeger]
          exporters: [jaeger]
To use the collector, you then need to specify the collector endpoint as the exporter endpoint in the tracing configuration.
Example tracing configuration for Kafka Connect using OpenTelemetry
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnect
metadata:
  name: my-connect-cluster
spec:
  #...
  template:
    connectContainer:
      env:
        - name: OTEL_SERVICE_NAME
          value: my-otel-service
        - name: OTEL_EXPORTER_JAEGER_ENDPOINT
          value: "http://jaeger-all-in-one-inmemory-collector-headless.openshift-distributed-tracing.svc.cluster.local:14250"
  tracing:
    type: opentelemetry
  #...