Chapter 1. Deployment overview
AMQ Streams simplifies the process of running Apache Kafka in an OpenShift cluster.
This guide provides instructions on all the options available for deploying and upgrading AMQ Streams, describing what is deployed, and the order of deployment required to run Apache Kafka in an OpenShift cluster.
As well as describing the deployment steps, the guide also provides pre- and post-deployment instructions to prepare for and verify a deployment. The guide also describes additional deployment options for introducing metrics.
Upgrade instructions are provided for AMQ Streams and Kafka upgrades.
AMQ Streams is designed to work on all types of OpenShift cluster regardless of distribution, from public and private clouds to local deployments intended for development.
1.1. Configuring a deployment
The deployment procedures in this guide are designed to help you set up the initial structure of your deployment. After setting up the structure, you can use custom resources to configure the deployment to your precise needs. The deployment procedures use the example installation files provided with AMQ Streams. The procedures highlight any important configuration considerations, but they do not describe all the configuration options available.
You might want to review the configuration options available for Kafka components before you deploy AMQ Streams. For more information on the configuration options, see Configuring AMQ Streams on OpenShift.
1.1.1. Securing Kafka
On deployment, the Cluster Operator automatically sets up TLS certificates for data encryption and authentication within your cluster.
AMQ Streams provides additional configuration options for encryption, authentication and authorization:
- Secure data exchange between the Kafka cluster and clients by Managing secure access to Kafka.
- Configure your deployment to use an authorization server to provide OAuth 2.0 authentication and OAuth 2.0 authorization.
- Secure Kafka using your own certificates.
1.1.2. Monitoring a deployment
AMQ Streams supports additional deployment options to monitor your deployment.
- Extract metrics and monitor Kafka components by deploying Prometheus and Grafana with your Kafka cluster.
- Extract additional metrics, particularly related to monitoring consumer lag, by deploying Kafka Exporter with your Kafka cluster.
- Track messages end-to-end by setting up distributed tracing.
1.1.3. CPU and memory resource limits and requests
By default, the AMQ Streams Cluster Operator does not specify requests and limits for CPU and memory resources for any operands it deploys.
Having sufficient resources is important for applications like Kafka to be stable and deliver good performance.
The right amount of resources you should use depends on the specific requirements and use-cases.
You should consider configuring the CPU and memory resources. You can set resource requests and limits for each container in the AMQ Streams custom resources.
1.2. AMQ Streams custom resources
A deployment of Kafka components to an OpenShift cluster using AMQ Streams is highly configurable through the application of custom resources. Custom resources are created as instances of APIs added by Custom resource definitions (CRDs) to extend OpenShift resources.
CRDs act as configuration instructions to describe the custom resources in an OpenShift cluster, and are provided with AMQ Streams for each Kafka component used in a deployment, as well as users and topics. CRDs and custom resources are defined as YAML files. Example YAML files are provided with the AMQ Streams distribution.
CRDs also allow AMQ Streams resources to benefit from native OpenShift features like CLI accessibility and configuration validation.
1.2.1. AMQ Streams custom resource example
CRDs require a one-time installation in a cluster to define the schemas used to instantiate and manage AMQ Streams-specific resources.
After a new custom resource type is added to your cluster by installing a CRD, you can create instances of the resource based on its specification.
Depending on the cluster setup, installation typically requires cluster admin privileges.
Access to manage custom resources is limited to AMQ Streams administrators. For more information, see Designating AMQ Streams administrators.
A CRD defines a new kind
of resource, such as kind:Kafka
, within an OpenShift cluster.
The Kubernetes API server allows custom resources to be created based on the kind
and understands from the CRD how to validate and store the custom resource when it is added to the OpenShift cluster.
When CRDs are deleted, custom resources of that type are also deleted. Additionally, the resources created by the custom resource, such as pods and statefulsets are also deleted.
Each AMQ Streams-specific custom resource conforms to the schema defined by the CRD for the resource’s kind
. The custom resources for AMQ Streams components have common configuration properties, which are defined under spec
.
To understand the relationship between a CRD and a custom resource, let’s look at a sample of the CRD for a Kafka topic.
Kafka topic CRD
apiVersion: kafka.strimzi.io/v1beta2 kind: CustomResourceDefinition metadata: 1 name: kafkatopics.kafka.strimzi.io labels: app: strimzi spec: 2 group: kafka.strimzi.io versions: v1beta2 scope: Namespaced names: # ... singular: kafkatopic plural: kafkatopics shortNames: - kt 3 additionalPrinterColumns: 4 # ... subresources: status: {} 5 validation: 6 openAPIV3Schema: properties: spec: type: object properties: partitions: type: integer minimum: 1 replicas: type: integer minimum: 1 maximum: 32767 # ...
- 1
- The metadata for the topic CRD, its name and a label to identify the CRD.
- 2
- The specification for this CRD, including the group (domain) name, the plural name and the supported schema version, which are used in the URL to access the API of the topic. The other names are used to identify instance resources in the CLI. For example,
oc get kafkatopic my-topic
oroc get kafkatopics
. - 3
- The shortname can be used in CLI commands. For example,
oc get kt
can be used as an abbreviation instead ofoc get kafkatopic
. - 4
- The information presented when using a
get
command on the custom resource. - 5
- The current status of the CRD as described in the schema reference for the resource.
- 6
- openAPIV3Schema validation provides validation for the creation of topic custom resources. For example, a topic requires at least one partition and one replica.
You can identify the CRD YAML files supplied with the AMQ Streams installation files, because the file names contain an index number followed by ‘Crd’.
Here is a corresponding example of a KafkaTopic
custom resource.
Kafka topic custom resource
apiVersion: kafka.strimzi.io/v1beta2 kind: KafkaTopic 1 metadata: name: my-topic labels: strimzi.io/cluster: my-cluster 2 spec: 3 partitions: 1 replicas: 1 config: retention.ms: 7200000 segment.bytes: 1073741824 status: conditions: 4 lastTransitionTime: "2019-08-20T11:37:00.706Z" status: "True" type: Ready observedGeneration: 1 / ...
- 1
- The
kind
andapiVersion
identify the CRD of which the custom resource is an instance. - 2
- A label, applicable only to
KafkaTopic
andKafkaUser
resources, that defines the name of the Kafka cluster (which is same as the name of theKafka
resource) to which a topic or user belongs. - 3
- The spec shows the number of partitions and replicas for the topic as well as the configuration parameters for the topic itself. In this example, the retention period for a message to remain in the topic and the segment file size for the log are specified.
- 4
- Status conditions for the
KafkaTopic
resource. Thetype
condition changed toReady
at thelastTransitionTime
.
Custom resources can be applied to a cluster through the platform CLI. When the custom resource is created, it uses the same validation as the built-in resources of the Kubernetes API.
After a KafkaTopic
custom resource is created, the Topic Operator is notified and corresponding Kafka topics are created in AMQ Streams.
1.3. Using the Kafka Bridge to connect with a Kafka cluster
You can use the AMQ Streams Kafka Bridge API to create and manage consumers and send and receive records over HTTP rather than the native Kafka protocol.
When you set up the Kafka Bridge you configure HTTP access to the Kafka cluster. You can then use the Kafka Bridge to produce and consume messages from the cluster, as well as performing other operations through its REST interface.
Additional resources
- For information on installing and using the Kafka Bridge, see Using the AMQ Streams Kafka Bridge.
1.4. Document Conventions
User-replaced values
User-replaced values, also known as replaceables, are shown in italics with angle brackets (< >). Underscores ( _ ) are used for multi-word values. If the value refers to code or commands, monospace
is also used.
For example, in the following code, you will want to replace <my_namespace>
with the name of your namespace:
sed -i 's/namespace: .*/namespace: <my_namespace>/' install/cluster-operator/*RoleBinding*.yaml