Chapter 1. Deployment overview


AMQ Streams simplifies the process of running Apache Kafka in an OpenShift cluster.

This guide provides instructions for deploying and managing AMQ Streams. Deployment options and steps are covered using the example installation files included with AMQ Streams. While the guide highlights important configuration considerations, it does not cover all available options. For a deeper understanding of the Kafka component configuration options, refer to the AMQ Streams Custom Resource API Reference.

In addition to deployment instructions, the guide offers pre- and post-deployment guidance. It covers setting up and securing client access to your Kafka cluster. Furthermore, it explores additional deployment options such as metrics integration, distributed tracing, and cluster management tools like Cruise Control and the AMQ Streams Drain Cleaner. You’ll also find recommendations on managing AMQ Streams and fine-tuning Kafka configuration for optimal performance.

Upgrade instructions are provided for both AMQ Streams and Kafka, to help keep your deployment up to date.

AMQ Streams is designed to be compatible with all types of OpenShift clusters, irrespective of their distribution. Whether your deployment involves public or private clouds, or if you are setting up a local development environment, the instructions in this guide are applicable in all cases.

1.1. AMQ Streams custom resources

Deployment of Kafka components to an OpenShift cluster using AMQ Streams is highly configurable through the application of custom resources. These custom resources are created as instances of APIs added by Custom Resource Definitions (CRDs) to extend OpenShift resources.

CRDs act as configuration instructions to describe the custom resources in an OpenShift cluster, and are provided with AMQ Streams for each Kafka component used in a deployment, as well as users and topics. CRDs and custom resources are defined as YAML files. Example YAML files are provided with the AMQ Streams distribution.

CRDs also allow AMQ Streams resources to benefit from native OpenShift features like CLI accessibility and configuration validation.

1.1.1. AMQ Streams custom resource example

CRDs require a one-time installation in a cluster to define the schemas used to instantiate and manage AMQ Streams-specific resources.

After a new custom resource type is added to your cluster by installing a CRD, you can create instances of the resource based on its specification.

Depending on the cluster setup, installation typically requires cluster admin privileges.

Note

Access to manage custom resources is limited to AMQ Streams administrators. For more information, see Section 4.6, “Designating AMQ Streams administrators”.

A CRD defines a new kind of resource, such as kind:Kafka, within an OpenShift cluster.

The Kubernetes API server allows custom resources to be created based on the kind and understands from the CRD how to validate and store the custom resource when it is added to the OpenShift cluster.

Each AMQ Streams-specific custom resource conforms to the schema defined by the CRD for the resource’s kind. The custom resources for AMQ Streams components have common configuration properties, which are defined under spec.

To understand the relationship between a CRD and a custom resource, let’s look at a sample of the CRD for a Kafka topic.

Kafka topic CRD

apiVersion: kafka.strimzi.io/v1beta2
kind: CustomResourceDefinition
metadata: 1
  name: kafkatopics.kafka.strimzi.io
  labels:
    app: strimzi
spec: 2
  group: kafka.strimzi.io
  versions:
    v1beta2
  scope: Namespaced
  names:
    # ...
    singular: kafkatopic
    plural: kafkatopics
    shortNames:
    - kt 3
  additionalPrinterColumns: 4
      # ...
  subresources:
    status: {} 5
  validation: 6
    openAPIV3Schema:
      properties:
        spec:
          type: object
          properties:
            partitions:
              type: integer
              minimum: 1
            replicas:
              type: integer
              minimum: 1
              maximum: 32767
      # ...

1
The metadata for the topic CRD, its name and a label to identify the CRD.
2
The specification for this CRD, including the group (domain) name, the plural name and the supported schema version, which are used in the URL to access the API of the topic. The other names are used to identify instance resources in the CLI. For example, oc get kafkatopic my-topic or oc get kafkatopics.
3
The shortname can be used in CLI commands. For example, oc get kt can be used as an abbreviation instead of oc get kafkatopic.
4
The information presented when using a get command on the custom resource.
5
The current status of the CRD as described in the schema reference for the resource.
6
openAPIV3Schema validation provides validation for the creation of topic custom resources. For example, a topic requires at least one partition and one replica.
Note

You can identify the CRD YAML files supplied with the AMQ Streams installation files, because the file names contain an index number followed by ‘Crd’.

Here is a corresponding example of a KafkaTopic custom resource.

Kafka topic custom resource

apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic 1
metadata:
  name: my-topic
  labels:
    strimzi.io/cluster: my-cluster 2
spec: 3
  partitions: 1
  replicas: 1
  config:
    retention.ms: 7200000
    segment.bytes: 1073741824
status:
  conditions: 4
    lastTransitionTime: "2019-08-20T11:37:00.706Z"
    status: "True"
    type: Ready
  observedGeneration: 1
  / ...

1
The kind and apiVersion identify the CRD of which the custom resource is an instance.
2
A label, applicable only to KafkaTopic and KafkaUser resources, that defines the name of the Kafka cluster (which is same as the name of the Kafka resource) to which a topic or user belongs.
3
The spec shows the number of partitions and replicas for the topic as well as the configuration parameters for the topic itself. In this example, the retention period for a message to remain in the topic and the segment file size for the log are specified.
4
Status conditions for the KafkaTopic resource. The type condition changed to Ready at the lastTransitionTime.

Custom resources can be applied to a cluster through the platform CLI. When the custom resource is created, it uses the same validation as the built-in resources of the Kubernetes API.

After a KafkaTopic custom resource is created, the Topic Operator is notified and corresponding Kafka topics are created in AMQ Streams.

1.2. AMQ Streams operators

AMQ Streams operators are purpose-built with specialist operational knowledge to effectively manage Kafka on OpenShift. Each operator performs a distinct function.

Cluster Operator
The Cluster Operator handles the deployment and management of Apache Kafka clusters on OpenShift. It automates the setup of Kafka brokers, and other Kafka components and resources.
Topic Operator
The Topic Operator manages the creation, configuration, and deletion of topics within Kafka clusters.
User Operator
The User Operator manages Kafka users that require access to Kafka brokers.

When you deploy AMQ Streams, you first deploy the Cluster Operator. The Cluster Operator is then ready to handle the deployment of Kafka. You can also deploy the Topic Operator and User Operator using the Cluster Operator (recommended) or as standalone operators. You would use a standalone operator with a Kafka cluster that is not managed by the Cluster Operator.

The Topic Operator and User Operator are part of the Entity Operator. The Cluster Operator can deploy one or both operators based on the Entity Operator configuration.

Important

To deploy the standalone operators, you need to set environment variables to connect to a Kafka cluster. These environment variables do not need to be set if you are deploying the operators using the Cluster Operator as they will be set by the Cluster Operator.

1.2.1. Watching AMQ Streams resources in OpenShift namespaces

Operators watch and manage AMQ Streams resources in OpenShift namespaces. The Cluster Operator can watch a single namespace, multiple namespaces, or all namespaces in an OpenShift cluster. The Topic Operator and User Operator can watch a single namespace.

  • The Cluster Operator watches for Kafka resources
  • The Topic Operator watches for KafkaTopic resources
  • The User Operator watches for KafkaUser resources

The Topic Operator and the User Operator can only watch a single Kafka cluster in a namespace. And they can only be connected to a single Kafka cluster.

If multiple Topic Operators watch the same namespace, name collisions and topic deletion can occur. This is because each Kafka cluster uses Kafka topics that have the same name (such as __consumer_offsets). Make sure that only one Topic Operator watches a given namespace.

When using multiple User Operators with a single namespace, a user with a given username can exist in more than one Kafka cluster.

If you deploy the Topic Operator and User Operator using the Cluster Operator, they watch the Kafka cluster deployed by the Cluster Operator by default. You can also specify a namespace using watchedNamespace in the operator configuration.

For a standalone deployment of each operator, you specify a namespace and connection to the Kafka cluster to watch in the configuration.

1.2.2. Managing RBAC resources

The Cluster Operator creates and manages role-based access control (RBAC) resources for AMQ Streams components that need access to OpenShift resources.

For the Cluster Operator to function, it needs permission within the OpenShift cluster to interact with Kafka resources, such as Kafka and KafkaConnect, as well as managed resources like ConfigMap, Pod, Deployment, and Service.

Permission is specified through the following OpenShift RBAC resources:

  • ServiceAccount
  • Role and ClusterRole
  • RoleBinding and ClusterRoleBinding

1.2.2.1. Delegating privileges to AMQ Streams components

The Cluster Operator runs under a service account called strimzi-cluster-operator. It is assigned cluster roles that give it permission to create the RBAC resources for AMQ Streams components. Role bindings associate the cluster roles with the service account.

OpenShift prevents components operating under one ServiceAccount from granting another ServiceAccount privileges that the granting ServiceAccount does not have. Because the Cluster Operator creates the RoleBinding and ClusterRoleBinding RBAC resources needed by the resources it manages, it requires a role that gives it the same privileges.

The following sections describe the RBAC resources required by the Cluster Operator.

1.2.2.2. ClusterRole resources

The Cluster Operator uses ClusterRole resources to provide the necessary access to resources. Depending on the OpenShift cluster setup, a cluster administrator might be needed to create the cluster roles.

Note

Cluster administrator rights are only needed for the creation of ClusterRole resources. The Cluster Operator will not run under a cluster admin account.

The RBAC resources follow the principle of least privilege and contain only those privileges needed by the Cluster Operator to operate the cluster of the Kafka component.

All cluster roles are required by the Cluster Operator in order to delegate privileges.

Table 1.1. ClusterRole resources
NameDescription

strimzi-cluster-operator-namespaced

Access rights for namespace-scoped resources used by the Cluster Operator to deploy and manage the operands.

strimzi-cluster-operator-global

Access rights for cluster-scoped resources used by the Cluster Operator to deploy and manage the operands.

strimzi-cluster-operator-leader-election

Access rights used by the Cluster Operator for leader election.

strimzi-cluster-operator-watched

Access rights used by the Cluster Operator to watch and manage the AMQ Streams custom resources.

strimzi-kafka-broker

Access rights to allow Kafka brokers to get the topology labels from OpenShift worker nodes when rack-awareness is used.

strimzi-entity-operator

Access rights used by the Topic and User Operators to manage Kafka users and topics.

strimzi-kafka-client

Access rights to allow Kafka Connect, MirrorMaker (1 and 2), and Kafka Bridge to get the topology labels from OpenShift worker nodes when rack-awareness is used.

1.2.2.3. ClusterRoleBinding resources

The Cluster Operator uses ClusterRoleBinding and RoleBinding resources to associate its ClusterRole with its ServiceAccount. Cluster role bindings are required by cluster roles containing cluster-scoped resources.

Table 1.2. ClusterRoleBinding resources
NameDescription

strimzi-cluster-operator

Grants the Cluster Operator the rights from the strimzi-cluster-operator-global cluster role.

strimzi-cluster-operator-kafka-broker-delegation

Grants the Cluster Operator the rights from the strimzi-entity-operator cluster role.

strimzi-cluster-operator-kafka-client-delegation

Grants the Cluster Operator the rights from the strimzi-kafka-client cluster role.

Table 1.3. RoleBinding resources
NameDescription

strimzi-cluster-operator

Grants the Cluster Operator the rights from the strimzi-cluster-operator-namespaced cluster role.

strimzi-cluster-operator-leader-election

Grants the Cluster Operator the rights from the strimzi-cluster-operator-leader-election cluster role.

strimzi-cluster-operator-watched

Grants the Cluster Operator the rights from the strimzi-cluster-operator-watched cluster role.

strimzi-cluster-operator-entity-operator-delegation

Grants the Cluster Operator the rights from the strimzi-cluster-operator-entity-operator-delegation cluster role.

1.2.2.4. ServiceAccount resources

The Cluster Operator runs using the strimzi-cluster-operator ServiceAccount. This service account grants it the privileges it requires to manage the operands. The Cluster Operator creates additional ClusterRoleBinding and RoleBinding resources to delegate some of these RBAC rights to the operands.

Each of the operands uses its own service account created by the Cluster Operator. This allows the Cluster Operator to follow the principle of least privilege and give the operands only the access rights that are really need.

Table 1.4. ServiceAccount resources
NameUsed by

<cluster_name>-zookeeper

ZooKeeper pods

<cluster_name>-kafka

Kafka broker pods

<cluster_name>-entity-operator

Entity Operator

<cluster_name>-cruise-control

Cruise Control pods

<cluster_name>-kafka-exporter

Kafka Exporter pods

<cluster_name>-connect

Kafka Connect pods

<cluster_name>-mirror-maker

MirrorMaker pods

<cluster_name>-mirrormaker2

MirrorMaker 2 pods

<cluster_name>-bridge

Kafka Bridge pods

1.3. Using the Kafka Bridge to connect with a Kafka cluster

You can use the AMQ Streams Kafka Bridge API to create and manage consumers and send and receive records over HTTP rather than the native Kafka protocol.

When you set up the Kafka Bridge you configure HTTP access to the Kafka cluster. You can then use the Kafka Bridge to produce and consume messages from the cluster, as well as performing other operations through its REST interface.

Additional resources

1.4. Seamless FIPS support

Federal Information Processing Standards (FIPS) are standards for computer security and interoperability. When running AMQ Streams on a FIPS-enabled OpenShift cluster, the OpenJDK used in AMQ Streams container images automatically switches to FIPS mode. From version 2.3, AMQ Streams can run on FIPS-enabled OpenShift clusters without any changes or special configuration. It uses only the FIPS-compliant security libraries from the OpenJDK.

Minimum password length

When running in the FIPS mode, SCRAM-SHA-512 passwords need to be at least 32 characters long. From AMQ Streams 2.3, the default password length in AMQ Streams User Operator is set to 32 characters as well. If you have a Kafka cluster with custom configuration that uses a password length that is less than 32 characters, you need to update your configuration. If you have any users with passwords shorter than 32 characters, you need to regenerate a password with the required length. You can do that, for example, by deleting the user secret and waiting for the User Operator to create a new password with the appropriate length.

Important

If you are using FIPS-enabled OpenShift clusters, you may experience higher memory consumption compared to regular OpenShift clusters. To avoid any issues, we suggest increasing the memory request to at least 512Mi.

1.5. Document Conventions

User-replaced values

User-replaced values, also known as replaceables, are shown in with angle brackets (< >). Underscores ( _ ) are used for multi-word values. If the value refers to code or commands, monospace is also used.

For example, the following code shows that <my_namespace> must be replaced by the correct namespace name:

sed -i 's/namespace: .*/namespace: <my_namespace>' install/cluster-operator/*RoleBinding*.yaml

1.6. Additional resources

Red Hat logoGithubRedditYoutubeTwitter

Learn

Try, buy, & sell

Communities

About Red Hat Documentation

We help Red Hat users innovate and achieve their goals with our products and services with content they can trust.

Making open source more inclusive

Red Hat is committed to replacing problematic language in our code, documentation, and web properties. For more details, see the Red Hat Blog.

About Red Hat

We deliver hardened solutions that make it easier for enterprises to work across platforms and environments, from the core datacenter to the network edge.

© 2024 Red Hat, Inc.