Appendix A. Frequently asked questions
A.1. Questions related to the Cluster Operator
A.1.1. Why do I need cluster administrator privileges to install AMQ Streams?
To install AMQ Streams, you need to be able to create the following cluster-scoped resources:
- Custom Resource Definitions (CRDs) to instruct OpenShift about resources that are specific to AMQ Streams, such as Kafka and KafkaConnect
- ClusterRoles and ClusterRoleBindings
Cluster-scoped resources, which are not scoped to a particular OpenShift namespace, typically require cluster administrator privileges to install.
As a cluster administrator, you can inspect all the resources being installed (in the /install/ directory) to ensure that the ClusterRoles do not grant unnecessary privileges.
After installation, the Cluster Operator runs as a regular Deployment, so any standard (non-admin) OpenShift user with privileges to access the Deployment can configure it. The cluster administrator can grant standard users the privileges necessary to manage Kafka custom resources.
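For illustration, here is a minimal sketch of the kind of CRD the installation files contain. The exact contents and apiVersion depend on your AMQ Streams version, so treat this as an outline rather than the shipped file:
# Sketch of a CRD for the Kafka custom resource (abbreviated)
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: kafkas.kafka.strimzi.io
spec:
  group: kafka.strimzi.io
  names:
    kind: Kafka
    plural: kafkas
  # Kafka resources themselves live in a namespace, but the CRD
  # that defines them is a cluster-scoped resource
  scope: Namespaced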
A.1.2. Why does the Cluster Operator need to create ClusterRoleBindings?
OpenShift has built-in privilege escalation prevention, which means that the Cluster Operator cannot grant privileges it does not have itself. In particular, it cannot grant such privileges in a namespace it cannot access. Therefore, the Cluster Operator must have the privileges necessary for all the components it orchestrates.
The Cluster Operator needs to be able to grant access so that:
- The Topic Operator can manage KafkaTopics, by creating Roles and RoleBindings in the namespace that the operator runs in
- The User Operator can manage KafkaUsers, by creating Roles and RoleBindings in the namespace that the operator runs in
- The failure domain of a Node is discovered by AMQ Streams, by creating a ClusterRoleBinding
When using rack-aware partition assignment, the broker pod needs to be able to get information about the Node it is running on, for example, the Availability Zone in Amazon AWS. A Node is a cluster-scoped resource, so access to it can only be granted through a ClusterRoleBinding, not a namespace-scoped RoleBinding.
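The following is a minimal sketch of the pair of resources involved; the resource and service account names are illustrative (the real names are created by the Cluster Operator and depend on your cluster name and namespace):
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: strimzi-kafka-broker              # illustrative name
rules:
  # read-only access to Node metadata, used for rack awareness
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: strimzi-my-cluster-kafka-init     # illustrative name
subjects:
  - kind: ServiceAccount
    name: my-cluster-kafka                # the broker pods' service account (assumed)
    namespace: my-project
roleRef:
  kind: ClusterRole
  name: strimzi-kafka-broker
  apiGroup: rbac.authorization.k8s.io
Because the ClusterRole only grants read access to Node metadata, binding it cluster-wide does not meaningfully widen the broker's privileges.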
A.1.3. Can standard OpenShift users create Kafka custom resources?
By default, standard OpenShift users will not have the privileges necessary to manage the custom resources handled by the Cluster Operator. The cluster administrator can grant a user the necessary privileges using OpenShift RBAC resources.
For more information, see Designating AMQ Streams administrators in the Deploying and Upgrading AMQ Streams on OpenShift guide.
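As a rough sketch of what such a grant can look like (the ClusterRole name, namespace, and user here are assumptions, not the exact resources from the guide), a cluster administrator could define a ClusterRole covering the AMQ Streams custom resources and bind it to a user in a single namespace with a RoleBinding:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: strimzi-admin                     # illustrative name
rules:
  # full access to all AMQ Streams custom resources
  - apiGroups: ["kafka.strimzi.io"]
    resources: ["*"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: developer-strimzi-admin           # illustrative name
  namespace: my-project                   # the grant applies only in this namespace
subjects:
  - kind: User
    name: developer                       # illustrative user
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: strimzi-admin
  apiGroup: rbac.authorization.k8s.io
Using a RoleBinding rather than a ClusterRoleBinding keeps the grant scoped to a single namespace.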
A.1.4. What do the failed to acquire lock warnings in the log mean?
For each cluster, the Cluster Operator executes only one operation at a time. The Cluster Operator uses locks to make sure that there are never two parallel operations running for the same cluster. Other operations must wait until the current operation completes and the lock is released.
INFO: Examples of cluster operations include cluster creation, rolling update, scale down, and scale up.
If the waiting time for the lock takes too long, the operation times out and the following warning message is printed to the log:
2018-03-04 17:09:24 WARNING AbstractClusterOperations:290 - Failed to acquire lock for kafka cluster lock::kafka::myproject::my-cluster
Depending on the exact configuration of STRIMZI_FULL_RECONCILIATION_INTERVAL_MS and STRIMZI_OPERATION_TIMEOUT_MS, this warning message might appear occasionally without indicating any underlying issues. Operations that time out are picked up in the next periodic reconciliation, so that the operation can acquire the lock and execute again.
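Both settings are environment variables on the Cluster Operator Deployment. The following excerpt is a sketch of where they live; the Deployment name and the values shown are assumptions, so check the installation files for your version:
# Excerpt from the Cluster Operator Deployment (illustrative)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: strimzi-cluster-operator
spec:
  template:
    spec:
      containers:
        - name: strimzi-cluster-operator
          env:
            # interval between periodic full reconciliations, in milliseconds
            - name: STRIMZI_FULL_RECONCILIATION_INTERVAL_MS
              value: "120000"
            # how long an operation may run before timing out, in milliseconds
            - name: STRIMZI_OPERATION_TIMEOUT_MS
              value: "300000"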
Should this message appear periodically, even in situations when there should be no other operations running for a given cluster, it might indicate that the lock was not properly released due to an error. If this is the case, try restarting the Cluster Operator.
A.1.5. Why is hostname verification failing when connecting to NodePorts using TLS?
Currently, off-cluster access using NodePorts with TLS encryption enabled does not support TLS hostname verification. As a result, the clients that verify the hostname will fail to connect. For example, the Java client will fail with the following exception:
Caused by: java.security.cert.CertificateException: No subject alternative names matching IP address 168.72.15.231 found
    at sun.security.util.HostnameChecker.matchIP(HostnameChecker.java:168)
    at sun.security.util.HostnameChecker.match(HostnameChecker.java:94)
    at sun.security.ssl.X509TrustManagerImpl.checkIdentity(X509TrustManagerImpl.java:455)
    at sun.security.ssl.X509TrustManagerImpl.checkIdentity(X509TrustManagerImpl.java:436)
    at sun.security.ssl.X509TrustManagerImpl.checkTrusted(X509TrustManagerImpl.java:252)
    at sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:136)
    at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1501)
    ... 17 more
To connect, you must disable hostname verification. In the Java client, you can do this by setting the configuration option ssl.endpoint.identification.algorithm to an empty string.
When configuring the client using a properties file, you can do it this way:
ssl.endpoint.identification.algorithm=
When configuring the client directly in Java, set the configuration option to an empty string:
props.put("ssl.endpoint.identification.algorithm", "");