Chapter 1. Deployment overview
AMQ Streams simplifies the process of running Apache Kafka in an OpenShift cluster.
This guide provides instructions for deploying and managing AMQ Streams. Deployment options and steps are covered using the example installation files included with AMQ Streams. While the guide highlights important configuration considerations, it does not cover all available options. For a deeper understanding of the Kafka component configuration options, refer to the AMQ Streams Custom Resource API Reference.
In addition to deployment instructions, the guide offers pre- and post-deployment guidance. It covers setting up and securing client access to your Kafka cluster. Furthermore, it explores additional deployment options such as metrics integration, distributed tracing, and cluster management tools like Cruise Control and the AMQ Streams Drain Cleaner. You’ll also find recommendations on managing AMQ Streams and fine-tuning Kafka configuration for optimal performance.
Upgrade instructions are provided for both AMQ Streams and Kafka, to help keep your deployment up to date.
AMQ Streams is designed to be compatible with all types of OpenShift clusters, irrespective of their distribution. Whether you deploy to a public or private cloud or set up a local development environment, the instructions in this guide apply.
1.1. AMQ Streams custom resources
Deployment of Kafka components to an OpenShift cluster using AMQ Streams is highly configurable through the application of custom resources. These custom resources are created as instances of APIs added by Custom Resource Definitions (CRDs) to extend OpenShift resources.
CRDs act as configuration instructions to describe the custom resources in an OpenShift cluster, and are provided with AMQ Streams for each Kafka component used in a deployment, as well as users and topics. CRDs and custom resources are defined as YAML files. Example YAML files are provided with the AMQ Streams distribution.
CRDs also allow AMQ Streams resources to benefit from native OpenShift features like CLI accessibility and configuration validation.
1.1.1. AMQ Streams custom resource example
CRDs require a one-time installation in a cluster to define the schemas used to instantiate and manage AMQ Streams-specific resources.
After a new custom resource type is added to your cluster by installing a CRD, you can create instances of the resource based on its specification.
Depending on the cluster setup, installation typically requires cluster admin privileges.
Access to manage custom resources is limited to AMQ Streams administrators. For more information, see Section 4.6, “Designating AMQ Streams administrators”.
A CRD defines a new `kind` of resource, such as `kind:Kafka`, within an OpenShift cluster.
The Kubernetes API server allows custom resources to be created based on the `kind` and understands from the CRD how to validate and store the custom resource when it is added to the OpenShift cluster.
Each AMQ Streams-specific custom resource conforms to the schema defined by the CRD for the resource’s `kind`. The custom resources for AMQ Streams components have common configuration properties, which are defined under `spec`.
To understand the relationship between a CRD and a custom resource, let’s look at a sample of the CRD for a Kafka topic.
Kafka topic CRD

    apiVersion: kafka.strimzi.io/v1beta2
    kind: CustomResourceDefinition
    metadata: # (1)
      name: kafkatopics.kafka.strimzi.io
      labels:
        app: strimzi
    spec: # (2)
      group: kafka.strimzi.io
      versions:
        v1beta2
      scope: Namespaced
      names:
        # ...
        singular: kafkatopic
        plural: kafkatopics
        shortNames:
        - kt # (3)
      additionalPrinterColumns: # (4)
          # ...
      subresources:
        status: {} # (5)
      validation: # (6)
        openAPIV3Schema:
          properties:
            spec:
              type: object
              properties:
                partitions:
                  type: integer
                  minimum: 1
                replicas:
                  type: integer
                  minimum: 1
                  maximum: 32767
          # ...

- (1) The metadata for the topic CRD, its name and a label to identify the CRD.
- (2) The specification for this CRD, including the group (domain) name, the plural name and the supported schema version, which are used in the URL to access the API of the topic. The other names are used to identify instance resources in the CLI. For example, `oc get kafkatopic my-topic` or `oc get kafkatopics`.
- (3) The shortname can be used in CLI commands. For example, `oc get kt` can be used as an abbreviation instead of `oc get kafkatopic`.
- (4) The information presented when using a `get` command on the custom resource.
- (5) The current status of the CRD as described in the schema reference for the resource.
- (6) openAPIV3Schema validation provides validation for the creation of topic custom resources. For example, a topic requires at least one partition and one replica.
You can identify the CRD YAML files supplied with the AMQ Streams installation files, because the file names contain an index number followed by ‘Crd’.
Here is a corresponding example of a `KafkaTopic` custom resource.
Kafka topic custom resource

    apiVersion: kafka.strimzi.io/v1beta2
    kind: KafkaTopic # (1)
    metadata:
      name: my-topic
      labels:
        strimzi.io/cluster: my-cluster # (2)
    spec: # (3)
      partitions: 1
      replicas: 1
      config:
        retention.ms: 7200000
        segment.bytes: 1073741824
    status:
      conditions: # (4)
      - lastTransitionTime: "2019-08-20T11:37:00.706Z"
        status: "True"
        type: Ready
      observedGeneration: 1
      # ...

- (1) The `kind` and `apiVersion` identify the CRD of which the custom resource is an instance.
- (2) A label, applicable only to `KafkaTopic` and `KafkaUser` resources, that defines the name of the Kafka cluster (which is the same as the name of the `Kafka` resource) to which a topic or user belongs.
- (3) The spec shows the number of partitions and replicas for the topic as well as the configuration parameters for the topic itself. In this example, the retention period for a message to remain in the topic and the segment file size for the log are specified.
- (4) Status conditions for the `KafkaTopic` resource. The `type` condition changed to `Ready` at the `lastTransitionTime`.
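The cluster label described in callout 2 applies to `KafkaUser` resources in the same way. The following is a minimal sketch of a user resource, not a complete configuration. The user name `my-user` and the choice of SCRAM-SHA-512 authentication are assumptions made for illustration.

```yaml
# Illustrative KafkaUser; the name and authentication type are example values
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaUser
metadata:
  name: my-user
  labels:
    # Must match the name of the Kafka resource the user belongs to
    strimzi.io/cluster: my-cluster
spec:
  authentication:
    type: scram-sha-512
```

As with the topic example, the label ties the resource to the Kafka cluster named `my-cluster`.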
Custom resources can be applied to a cluster through the platform CLI. When the custom resource is created, it uses the same validation as the built-in resources of the Kubernetes API.
After a `KafkaTopic` custom resource is created, the Topic Operator is notified and corresponding Kafka topics are created in AMQ Streams.
1.2. AMQ Streams operators
AMQ Streams operators are purpose-built with specialist operational knowledge to effectively manage Kafka on OpenShift. Each operator performs a distinct function.
- Cluster Operator
- The Cluster Operator handles the deployment and management of Apache Kafka clusters on OpenShift. It automates the setup of Kafka brokers, and other Kafka components and resources.
- Topic Operator
- The Topic Operator manages the creation, configuration, and deletion of topics within Kafka clusters.
- User Operator
- The User Operator manages Kafka users that require access to Kafka brokers.
When you deploy AMQ Streams, you first deploy the Cluster Operator. The Cluster Operator is then ready to handle the deployment of Kafka. You can also deploy the Topic Operator and User Operator using the Cluster Operator (recommended) or as standalone operators. You would use a standalone operator with a Kafka cluster that is not managed by the Cluster Operator.
The Topic Operator and User Operator are part of the Entity Operator. The Cluster Operator can deploy one or both operators based on the Entity Operator configuration.
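As a rough sketch of the Entity Operator configuration, a `Kafka` resource can enable both operators with empty objects so that default settings are used. The cluster name `my-cluster` is an example value, and the Kafka and ZooKeeper sections are elided.

```yaml
# Sketch of a Kafka resource that deploys both operators through the Entity Operator
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: my-cluster
spec:
  kafka:
    # ...
  zookeeper:
    # ...
  entityOperator:
    topicOperator: {}   # omit this property to skip the Topic Operator
    userOperator: {}    # omit this property to skip the User Operator
```

Leaving out either `topicOperator` or `userOperator` deploys only the remaining operator.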
To deploy the standalone operators, you need to set environment variables to connect to a Kafka cluster. These environment variables do not need to be set if you are deploying the operators using the Cluster Operator as they will be set by the Cluster Operator.
1.2.1. Watching AMQ Streams resources in OpenShift namespaces
Operators watch and manage AMQ Streams resources in OpenShift namespaces. The Cluster Operator can watch a single namespace, multiple namespaces, or all namespaces in an OpenShift cluster. The Topic Operator and User Operator can watch a single namespace.
- The Cluster Operator watches for `Kafka` resources
- The Topic Operator watches for `KafkaTopic` resources
- The User Operator watches for `KafkaUser` resources
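The namespaces watched by the Cluster Operator are typically set through the `STRIMZI_NAMESPACE` environment variable in the Cluster Operator `Deployment`. The fragment below is a sketch only. The namespace names are hypothetical, and watching more than one namespace usually also requires role bindings to be installed in each watched namespace.

```yaml
# Fragment of the Cluster Operator Deployment (container env section)
env:
  - name: STRIMZI_NAMESPACE
    # Comma-separated list to watch multiple namespaces (example names)
    value: watched-namespace-1,watched-namespace-2
    # To watch all namespaces, the value would be "*" instead
```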
The Topic Operator and the User Operator can each watch only a single Kafka cluster in a namespace, and each can be connected to only a single Kafka cluster.
If multiple Topic Operators watch the same namespace, name collisions and topic deletion can occur. This is because each Kafka cluster uses Kafka topics that have the same name (such as `__consumer_offsets`). Make sure that only one Topic Operator watches a given namespace.
When using multiple User Operators with a single namespace, a user with a given username can exist in more than one Kafka cluster.
If you deploy the Topic Operator and User Operator using the Cluster Operator, they watch the Kafka cluster deployed by the Cluster Operator by default. You can also specify a namespace using `watchedNamespace` in the operator configuration.
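A minimal sketch of the `watchedNamespace` setting, shown as a fragment of the `Kafka` resource `spec`. The namespace names are hypothetical.

```yaml
# Fragment of a Kafka resource spec; namespace names are example values
entityOperator:
  topicOperator:
    watchedNamespace: my-topic-namespace   # namespace watched for KafkaTopic resources
  userOperator:
    watchedNamespace: my-user-namespace    # namespace watched for KafkaUser resources
```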
For a standalone deployment of each operator, you specify a namespace and connection to the Kafka cluster to watch in the configuration.
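For a standalone operator, the namespace and the Kafka connection are passed as environment variables in the operator `Deployment`. The fragment below is a sketch that assumes the `STRIMZI_NAMESPACE` and `STRIMZI_KAFKA_BOOTSTRAP_SERVERS` variables. The bootstrap address is a hypothetical value, and other required variables are elided.

```yaml
# Fragment of a standalone Topic Operator or User Operator Deployment (container env section)
env:
  - name: STRIMZI_NAMESPACE
    valueFrom:
      fieldRef:
        fieldPath: metadata.namespace   # watch the namespace the operator runs in
  - name: STRIMZI_KAFKA_BOOTSTRAP_SERVERS
    value: my-kafka-bootstrap-address:9092   # example address of the Kafka cluster
  # ...
```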
1.2.2. Managing RBAC resources
The Cluster Operator creates and manages role-based access control (RBAC) resources for AMQ Streams components that need access to OpenShift resources.
For the Cluster Operator to function, it needs permission within the OpenShift cluster to interact with Kafka resources, such as `Kafka` and `KafkaConnect`, as well as managed resources like `ConfigMap`, `Pod`, `Deployment`, and `Service`.
Permission is specified through the following OpenShift RBAC resources:
- `ServiceAccount`
- `Role` and `ClusterRole`
- `RoleBinding` and `ClusterRoleBinding`
1.2.2.1. Delegating privileges to AMQ Streams components
The Cluster Operator runs under a service account called `strimzi-cluster-operator`. It is assigned cluster roles that give it permission to create the RBAC resources for AMQ Streams components. Role bindings associate the cluster roles with the service account.
OpenShift prevents components operating under one `ServiceAccount` from granting another `ServiceAccount` privileges that the granting `ServiceAccount` does not have. Because the Cluster Operator creates the `RoleBinding` and `ClusterRoleBinding` RBAC resources needed by the resources it manages, it requires a role that gives it the same privileges.
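To illustrate how a role binding associates a cluster role with the service account, here is a generic sketch of a `RoleBinding`. It is not one of the files shipped with AMQ Streams. The values in angle brackets are placeholders that must be replaced.

```yaml
# Generic RoleBinding sketch; values in angle brackets are placeholders
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: <role_binding_name>
  namespace: <my_namespace>
subjects:
  - kind: ServiceAccount
    name: strimzi-cluster-operator   # service account the Cluster Operator runs under
    namespace: <my_namespace>
roleRef:
  kind: ClusterRole
  name: <cluster_role_name>          # cluster role whose rights are granted in this namespace
  apiGroup: rbac.authorization.k8s.io
```

A `RoleBinding` that references a `ClusterRole` grants the rights only within the binding's own namespace, whereas a `ClusterRoleBinding` grants them cluster-wide.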
The following sections describe the RBAC resources required by the Cluster Operator.
1.2.2.2. `ClusterRole` resources
The Cluster Operator uses `ClusterRole` resources to provide the necessary access to resources. Depending on the OpenShift cluster setup, a cluster administrator might be needed to create the cluster roles.
Cluster administrator rights are only needed for the creation of `ClusterRole` resources. The Cluster Operator will not run under a cluster admin account.
The RBAC resources follow the principle of least privilege and contain only those privileges needed by the Cluster Operator to operate the cluster of the Kafka component.
All cluster roles are required by the Cluster Operator in order to delegate privileges.
| Name | Description |
|---|---|
| | Access rights for namespace-scoped resources used by the Cluster Operator to deploy and manage the operands. |
| | Access rights for cluster-scoped resources used by the Cluster Operator to deploy and manage the operands. |
| | Access rights used by the Cluster Operator for leader election. |
| | Access rights used by the Cluster Operator to watch and manage the AMQ Streams custom resources. |
| | Access rights to allow Kafka brokers to get the topology labels from OpenShift worker nodes when rack-awareness is used. |
| | Access rights used by the Topic and User Operators to manage Kafka users and topics. |
| | Access rights to allow Kafka Connect, MirrorMaker (1 and 2), and Kafka Bridge to get the topology labels from OpenShift worker nodes when rack-awareness is used. |
1.2.2.3. `ClusterRoleBinding` resources
The Cluster Operator uses `ClusterRoleBinding` and `RoleBinding` resources to associate its `ClusterRole` with its `ServiceAccount`. Cluster role bindings are required by cluster roles containing cluster-scoped resources.
| Name | Description |
|---|---|
| | Grants the Cluster Operator the rights from the |
| | Grants the Cluster Operator the rights from the |
| | Grants the Cluster Operator the rights from the |

| Name | Description |
|---|---|
| | Grants the Cluster Operator the rights from the |
| | Grants the Cluster Operator the rights from the |
| | Grants the Cluster Operator the rights from the |
| | Grants the Cluster Operator the rights from the |
1.2.2.4. `ServiceAccount` resources
The Cluster Operator runs using the `strimzi-cluster-operator` `ServiceAccount`. This service account grants it the privileges it requires to manage the operands. The Cluster Operator creates additional `ClusterRoleBinding` and `RoleBinding` resources to delegate some of these RBAC rights to the operands.
Each of the operands uses its own service account created by the Cluster Operator. This allows the Cluster Operator to follow the principle of least privilege and give the operands only the access rights that they really need.
| Name | Used by |
|---|---|
| | ZooKeeper pods |
| | Kafka broker pods |
| | Entity Operator |
| | Cruise Control pods |
| | Kafka Exporter pods |
| | Kafka Connect pods |
| | MirrorMaker pods |
| | MirrorMaker 2 pods |
| | Kafka Bridge pods |
1.3. Using the Kafka Bridge to connect with a Kafka cluster
You can use the AMQ Streams Kafka Bridge API to create and manage consumers and send and receive records over HTTP rather than the native Kafka protocol.
When you set up the Kafka Bridge you configure HTTP access to the Kafka cluster. You can then use the Kafka Bridge to produce and consume messages from the cluster, as well as performing other operations through its REST interface.
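As a minimal sketch of configuring HTTP access, a `KafkaBridge` custom resource might look like the following. The resource name, bootstrap address, and port are example values chosen for illustration, and the bootstrap address assumes a Kafka cluster named `my-cluster` in the same namespace.

```yaml
# Illustrative KafkaBridge resource; names, address, and port are example values
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaBridge
metadata:
  name: my-bridge
spec:
  replicas: 1
  # Bootstrap service of the Kafka cluster the bridge connects to
  bootstrapServers: my-cluster-kafka-bootstrap:9092
  http:
    port: 8080   # port on which the bridge serves its HTTP API
```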
Additional resources
- For information on installing and using the Kafka Bridge, see Using the AMQ Streams Kafka Bridge.
1.4. Seamless FIPS support
Federal Information Processing Standards (FIPS) are standards for computer security and interoperability. When running AMQ Streams on a FIPS-enabled OpenShift cluster, the OpenJDK used in AMQ Streams container images automatically switches to FIPS mode. From version 2.3, AMQ Streams can run on FIPS-enabled OpenShift clusters without any changes or special configuration. It uses only the FIPS-compliant security libraries from the OpenJDK.
Minimum password length
When running in FIPS mode, SCRAM-SHA-512 passwords must be at least 32 characters long. From AMQ Streams 2.3, the default password length in the AMQ Streams User Operator is also set to 32 characters. If you have a Kafka cluster with custom configuration that uses a password length of less than 32 characters, you need to update your configuration. If you have any users with passwords shorter than 32 characters, you need to regenerate their passwords with the required length. You can do that, for example, by deleting the user secret and waiting for the User Operator to create a new password with the appropriate length.
If you are using FIPS-enabled OpenShift clusters, you may experience higher memory consumption compared to regular OpenShift clusters. To avoid any issues, we suggest increasing the memory request to at least 512Mi.
1.5. Document Conventions
User-replaced values
User-replaced values, also known as replaceables, are shown with angle brackets (< >). Underscores ( _ ) are used for multi-word values. If the value refers to code or commands, `monospace` is also used.
For example, the following code shows that `<my_namespace>` must be replaced by the correct namespace name:

    sed -i 's/namespace: .*/namespace: <my_namespace>/' install/cluster-operator/*RoleBinding*.yaml