Chapter 8. Setting up metrics and dashboards for AMQ Streams
You can use Prometheus and Grafana to monitor your AMQ Streams deployment.
You can monitor your AMQ Streams deployment by viewing key metrics on dashboards and setting up alerts that trigger under certain conditions. Metrics are available for each of the components of AMQ Streams.
You can also collect metrics specific to oauth authentication and opa or keycloak authorization. You do this by setting the enableMetrics property to true in the listener configuration of the Kafka resource. For example, set enableMetrics to true in spec.kafka.listeners.authentication and spec.kafka.authorization. Similarly, you can enable metrics for oauth authentication in the KafkaBridge, KafkaConnect, KafkaMirrorMaker, and KafkaMirrorMaker2 custom resources.
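The following snippet is a minimal sketch of this configuration. It assumes a listener with oauth authentication and opa authorization are already configured; the cluster name, listener name, and port are illustrative, and the other required oauth and opa connection settings are omitted:
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: my-cluster
spec:
  kafka:
    # ...
    listeners:
      - name: external
        port: 9094
        type: route
        tls: true
        authentication:
          type: oauth
          enableMetrics: true
          # ... other oauth settings
    authorization:
      type: opa
      enableMetrics: true
      # ... other opa settings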
To provide metrics information, AMQ Streams uses Prometheus rules and Grafana dashboards.
When configured with a set of rules for each component of AMQ Streams, Prometheus consumes key metrics from the pods that are running in your cluster. Grafana then visualizes those metrics on dashboards. AMQ Streams includes example Grafana dashboards that you can customize to suit your deployment.
AMQ Streams employs monitoring for user-defined projects (an OpenShift feature) to simplify the Prometheus setup process.
Depending on your requirements, you can use your own Prometheus and Grafana installation or the example installation files provided by AMQ Streams. With Prometheus and Grafana set up, you can use the example Grafana dashboards provided by AMQ Streams for monitoring.
Additionally, you can configure your deployment to track messages end-to-end by setting up distributed tracing.
AMQ Streams provides example installation files for Prometheus and Grafana. You can use these files as a starting point when trying out monitoring of AMQ Streams. For further support, try engaging with the Prometheus and Grafana developer communities.
Supporting documentation for metrics and monitoring tools
For more information on the metrics and monitoring tools, refer to the supporting documentation:
- Prometheus
- Prometheus configuration
- Kafka Exporter
- Grafana Labs
- Apache Kafka Monitoring describes JMX metrics exposed by Apache Kafka
- ZooKeeper JMX describes JMX metrics exposed by Apache ZooKeeper
8.1. Monitoring consumer lag with Kafka Exporter
Kafka Exporter is an open source project to enhance monitoring of Apache Kafka brokers and clients. You can configure the Kafka resource to deploy Kafka Exporter with your Kafka cluster. Kafka Exporter extracts additional metrics data from Kafka brokers related to offsets, consumer groups, consumer lag, and topics. The metrics data is used, for example, to help identify slow consumers. Lag data is exposed as Prometheus metrics, which can then be presented in Grafana for analysis.
Kafka Exporter reads from the __consumer_offsets topic, which stores information on committed offsets for consumer groups. For Kafka Exporter to work properly, consumer groups need to be in use.
A Grafana dashboard for Kafka Exporter is one of a number of example Grafana dashboards provided by AMQ Streams.
Kafka Exporter provides only additional metrics related to consumer lag and consumer offsets. For regular Kafka metrics, you have to configure the Prometheus metrics in Kafka brokers.
Consumer lag indicates the difference between the rate of production and consumption of messages. Specifically, consumer lag for a given consumer group indicates the delay between the last message written to a partition and the message currently being picked up by that consumer.
The lag reflects the position of the consumer offset in relation to the end of the partition log.
Consumer lag between the producer and consumer offset
This difference is sometimes referred to as the delta between the producer offset and consumer offset: the read and write positions in the Kafka broker topic partitions.
Suppose a topic streams 100 messages a second. A lag of 1000 messages between the producer offset (the topic partition head) and the last offset the consumer has read means a 10-second delay.
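As an illustration, Kafka Exporter typically exposes this delta through a metric named kafka_consumergroup_lag (the exact metric name depends on the exporter version), with labels for the consumer group, topic, and partition. A Prometheus query such as the following, using assumed group and topic names, sums the lag across a topic's partitions:
sum(kafka_consumergroup_lag{consumergroup="my-group", topic="my-topic"})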
The importance of monitoring consumer lag
For applications that rely on the processing of (near) real-time data, it is critical to monitor consumer lag to check that it does not become too big. The greater the lag becomes, the further the process moves from the real-time processing objective.
Consumer lag, for example, might be a result of consuming too much old data that has not been purged, or through unplanned shutdowns.
Reducing consumer lag
Use the Grafana charts to analyze lag and to check if actions to reduce lag are having an impact on an affected consumer group. If, for example, Kafka brokers are adjusted to reduce lag, the dashboard will show the Lag by consumer group chart going down and the Messages consumed per minute chart going up.
Typical actions to reduce lag include:
- Scaling up consumer groups by adding new consumers
- Increasing the retention time for a message to remain in a topic
- Adding more disk capacity to increase the message buffer
Actions to reduce consumer lag depend on the underlying infrastructure and the use cases AMQ Streams is supporting. For instance, a lagging consumer is less likely to benefit from the broker being able to service a fetch request from its disk cache. And in certain cases, it might be acceptable to automatically drop messages until a consumer has caught up.
8.2. Monitoring Cruise Control operations
Cruise Control monitors Kafka brokers in order to track the utilization of brokers, topics, and partitions. Cruise Control also provides a set of metrics for monitoring its own performance.
The Cruise Control metrics reporter collects raw metrics data from Kafka brokers. The data is produced to topics that are automatically created by Cruise Control. The metrics are used to generate optimization proposals for Kafka clusters.
Cruise Control metrics are available for real-time monitoring of Cruise Control operations. For example, you can use Cruise Control metrics to monitor the status of rebalancing operations that are running or provide alerts on any anomalies that are detected in an operation’s performance.
You expose Cruise Control metrics by enabling the Prometheus JMX Exporter in the Cruise Control configuration.
For a full list of available Cruise Control metrics, which are known as sensors, see the Cruise Control documentation.
8.2.1. Exposing Cruise Control metrics
If you want to expose metrics on Cruise Control operations, configure the Kafka resource to deploy Cruise Control and enable Prometheus metrics in the deployment. You can use your own configuration or use the example kafka-cruise-control-metrics.yaml file provided by AMQ Streams.
You add the configuration to the metricsConfig of the cruiseControl property in the Kafka resource. The configuration enables the Prometheus JMX Exporter to expose Cruise Control metrics through an HTTP endpoint. The HTTP endpoint is scraped by the Prometheus server.
Example metrics configuration for Cruise Control
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: my-cluster
spec:
  # ...
  cruiseControl:
    # ...
    metricsConfig:
      type: jmxPrometheusExporter
      valueFrom:
        configMapKeyRef:
          name: cruise-control-metrics
          key: metrics-config.yml
---
kind: ConfigMap
apiVersion: v1
metadata:
  name: cruise-control-metrics
  labels:
    app: strimzi
data:
  metrics-config.yml: |
    # metrics configuration...
8.2.2. Viewing Cruise Control metrics
After you expose the Cruise Control metrics, you can use Prometheus or another suitable monitoring system to view information on the metrics data. AMQ Streams provides an example Grafana dashboard to display visualizations of Cruise Control metrics. The dashboard is a JSON file called strimzi-cruise-control.json. The exposed metrics provide the monitoring data when you enable the Grafana dashboard.
8.2.2.1. Monitoring balancedness scores
Cruise Control metrics include a balancedness score. Balancedness is the measure of how evenly a workload is distributed in a Kafka cluster.
The Cruise Control metric for balancedness score (balancedness-score) might differ from the balancedness score in the KafkaRebalance resource. Cruise Control calculates each score using anomaly.detection.goals, which might not be the same as the default.goals used in the KafkaRebalance resource. The anomaly.detection.goals are specified in the spec.cruiseControl.config of the Kafka custom resource.
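For example, a sketch of how anomaly detection goals might be specified; the goal classes shown are illustrative Cruise Control goals and not necessarily the ones you should use:
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: my-cluster
spec:
  # ...
  cruiseControl:
    config:
      anomaly.detection.goals: >
        com.linkedin.kafka.cruisecontrol.analyzer.goals.RackAwareGoal,
        com.linkedin.kafka.cruisecontrol.analyzer.goals.ReplicaCapacityGoal,
        com.linkedin.kafka.cruisecontrol.analyzer.goals.DiskCapacityGoal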
Refreshing the KafkaRebalance resource fetches an optimization proposal. The latest cached optimization proposal is fetched if one of the following conditions applies:
- KafkaRebalance goals match the goals configured in the default.goals section of the Kafka resource
- KafkaRebalance goals are not specified
Otherwise, Cruise Control generates a new optimization proposal based on KafkaRebalance goals. If new proposals are generated with each refresh, this can impact performance monitoring.
8.2.2.2. Alerts on anomaly detection
Cruise control’s anomaly detector provides metrics data for conditions that block the generation of optimization goals, such as broker failures. If you want more visibility, you can use the metrics provided by the anomaly detector to set up alerts and send out notifications. You can set up Cruise Control’s anomaly notifier to route alerts based on these metrics through a specified notification channel. Alternatively, you can set up Prometheus to scrape the metrics data provided by the anomaly detector and generate alerts. Prometheus Alertmanager can then route the alerts generated by Prometheus.
The Cruise Control documentation provides information on AnomalyDetector metrics and the anomaly notifier.
8.3. Example metrics files
You can find example Grafana dashboards and other metrics configuration files in the example configuration files provided by AMQ Streams.
Example metrics files provided with AMQ Streams
metrics
├── grafana-dashboards 1
│   ├── strimzi-cruise-control.json
│   ├── strimzi-kafka-bridge.json
│   ├── strimzi-kafka-connect.json
│   ├── strimzi-kafka-exporter.json
│   ├── strimzi-kafka-mirror-maker-2.json
│   ├── strimzi-kafka.json
│   ├── strimzi-operators.json
│   └── strimzi-zookeeper.json
├── grafana-install
│   └── grafana.yaml 2
├── prometheus-additional-properties
│   └── prometheus-additional.yaml 3
├── prometheus-alertmanager-config
│   └── alert-manager-config.yaml 4
├── prometheus-install
│   ├── alert-manager.yaml 5
│   ├── prometheus-rules.yaml 6
│   ├── prometheus.yaml 7
│   └── strimzi-pod-monitor.yaml 8
├── kafka-bridge-metrics.yaml 9
├── kafka-connect-metrics.yaml 10
├── kafka-cruise-control-metrics.yaml 11
├── kafka-metrics.yaml 12
└── kafka-mirror-maker-2-metrics.yaml 13
1. Example Grafana dashboards for the different AMQ Streams components.
2. Installation file for the Grafana image.
3. Additional configuration to scrape metrics for CPU, memory and disk volume usage, which comes directly from the OpenShift cAdvisor agent and kubelet on the nodes.
4. Hook definitions for sending notifications through Alertmanager.
5. Resources for deploying and configuring Alertmanager.
6. Alerting rules examples for use with Prometheus Alertmanager (deployed with Prometheus).
7. Installation resource file for the Prometheus image.
8. PodMonitor definitions translated by the Prometheus Operator into jobs for the Prometheus server to be able to scrape metrics data directly from pods.
9. Kafka Bridge resource with metrics enabled.
10. Metrics configuration that defines Prometheus JMX Exporter relabeling rules for Kafka Connect.
11. Metrics configuration that defines Prometheus JMX Exporter relabeling rules for Cruise Control.
12. Metrics configuration that defines Prometheus JMX Exporter relabeling rules for Kafka and ZooKeeper.
13. Metrics configuration that defines Prometheus JMX Exporter relabeling rules for Kafka Mirror Maker 2.0.
8.3.1. Example Prometheus metrics configuration
AMQ Streams uses the Prometheus JMX Exporter to expose metrics through an HTTP endpoint, which can be scraped by the Prometheus server.
Grafana dashboards are dependent on Prometheus JMX Exporter relabeling rules, which are defined for AMQ Streams components in the custom resource configuration.
A label is a name-value pair. Relabeling is the process of writing a label dynamically. For example, the value of a label may be derived from the name of a Kafka server and client ID.
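As an illustration only (not copied verbatim from the example files), a Prometheus JMX Exporter relabeling rule in the referenced metrics ConfigMap takes roughly this form, matching a Kafka MBean and deriving the metric name and a clientId label from its attributes:
lowercaseOutputName: true
rules:
  # Match kafka.server MBeans that carry a clientId attribute and expose
  # them as kafka_server_<type>_<name> with the client ID as a label
  - pattern: kafka.server<type=(.+), name=(.+), clientId=(.+)><>Value
    name: kafka_server_$1_$2
    labels:
      clientId: "$3"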
AMQ Streams provides example custom resource configuration YAML files with relabeling rules. When deploying Prometheus metrics configuration, you can deploy the example custom resource or copy the metrics configuration to your own custom resource definition.
Component | Custom resource | Example YAML file
---|---|---
Kafka and ZooKeeper | Kafka | kafka-metrics.yaml
Kafka Connect | KafkaConnect | kafka-connect-metrics.yaml
Kafka MirrorMaker 2.0 | KafkaMirrorMaker2 | kafka-mirror-maker-2-metrics.yaml
Kafka Bridge | KafkaBridge | kafka-bridge-metrics.yaml
Cruise Control | Kafka | kafka-cruise-control-metrics.yaml
8.3.2. Example Prometheus rules for alert notifications
Example Prometheus rules for alert notifications are included with the example metrics configuration files provided by AMQ Streams. The rules are specified in the example prometheus-rules.yaml file for use in a Prometheus deployment.
Alerting rules provide notifications about specific conditions observed in metrics. Rules are declared on the Prometheus server, but Prometheus Alertmanager is responsible for alert notifications.
Prometheus alerting rules describe conditions using PromQL expressions that are continuously evaluated.
When an alert expression becomes true, the condition is met and the Prometheus server sends alert data to the Alertmanager. Alertmanager then sends out a notification using the communication method configured for its deployment.
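A minimal sketch of what such a rule can look like is shown below. It is not copied from the example file, and the metric name assumes the Kafka relabeling rules from kafka-metrics.yaml; the rule fires if any partitions remain under-replicated for longer than the for period:
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: my-kafka-rules
  labels:
    role: alert-rules
    app: strimzi
spec:
  groups:
    - name: kafka
      rules:
        - alert: UnderReplicatedPartitions
          expr: kafka_server_replicamanager_underreplicatedpartitions > 0
          for: 10s
          labels:
            severity: warning
          annotations:
            summary: 'Kafka under-replicated partitions'
            description: 'There are {{ $value }} under-replicated partitions'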
General points about the alerting rule definitions:
- A for property is used with the rules to determine the period of time a condition must persist before an alert is triggered.
- A tick is a basic ZooKeeper time unit, which is measured in milliseconds and configured using the tickTime parameter of Kafka.spec.zookeeper.config. For example, if ZooKeeper tickTime=3000, 3 ticks (3 x 3000) equals 9000 milliseconds.
- The availability of the ZookeeperRunningOutOfSpace metric and alert is dependent on the OpenShift configuration and storage implementation used. Storage implementations for certain platforms may not be able to supply the information on available space required for the metric to provide an alert.
Alertmanager can be configured to use email, chat messages or other notification methods. Adapt the default configuration of the example rules according to your specific needs.
8.3.2.1. Example alerting rules
The prometheus-rules.yaml file contains example rules for the following components:
- Kafka
- ZooKeeper
- Entity Operator
- Kafka Connect
- Kafka Bridge
- MirrorMaker
- Kafka Exporter
A description of each of the example rules is provided in the file.
8.3.3. Example Grafana dashboards
If you deploy Prometheus to provide metrics, you can use the example Grafana dashboards provided with AMQ Streams to monitor AMQ Streams components.
Example dashboards are provided in the examples/metrics/grafana-dashboards directory as JSON files.
All dashboards provide JVM metrics, as well as metrics specific to the component. For example, the Grafana dashboard for AMQ Streams operators provides information on the number of reconciliations or custom resources they are processing.
The example dashboards don’t show all the metrics supported by Kafka. The dashboards are populated with a representative set of metrics for monitoring.
Component | Example JSON file
---|---
AMQ Streams operators | strimzi-operators.json
Kafka | strimzi-kafka.json
ZooKeeper | strimzi-zookeeper.json
Kafka Connect | strimzi-kafka-connect.json
Kafka MirrorMaker 2.0 | strimzi-kafka-mirror-maker-2.json
Kafka Bridge | strimzi-kafka-bridge.json
Cruise Control | strimzi-cruise-control.json
Kafka Exporter | strimzi-kafka-exporter.json
When metrics are not available to the Kafka Exporter, because there is no traffic in the cluster yet, the Kafka Exporter Grafana dashboard shows N/A for numeric fields and No data to show for graphs.
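If the cluster has no traffic yet, one way to generate some test messages is to run a temporary producer pod against the Kafka bootstrap service. This is only a sketch: the image, cluster name, and topic below are placeholders that you replace with your own values.
oc run kafka-producer -ti --rm=true --restart=Never \
  --image=<kafka-image> \
  -- bin/kafka-console-producer.sh \
  --bootstrap-server my-cluster-kafka-bootstrap:9092 \
  --topic my-topic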
8.4. Deploying Prometheus metrics configuration
Deploy Prometheus metrics configuration to use Prometheus with AMQ Streams. Use the metricsConfig property to enable and configure Prometheus metrics.
You can use your own configuration or the example custom resource configuration files provided with AMQ Streams.
- kafka-metrics.yaml
- kafka-connect-metrics.yaml
- kafka-mirror-maker-2-metrics.yaml
- kafka-bridge-metrics.yaml
- kafka-cruise-control-metrics.yaml
The example configuration files have relabeling rules and the configuration required to enable Prometheus metrics. Prometheus scrapes metrics from target HTTP endpoints. The example files are a good way to try Prometheus with AMQ Streams.
To apply the relabeling rules and metrics configuration, do one of the following:
- Copy the example configuration to your own custom resources
- Deploy the custom resource with the metrics configuration
If you want to include Kafka Exporter metrics, add kafkaExporter configuration to your Kafka resource.
Kafka Exporter provides only additional metrics related to consumer lag and consumer offsets. For regular Kafka metrics, you have to configure the Prometheus metrics in Kafka brokers.
This procedure shows how to deploy Prometheus metrics configuration in the Kafka resource. The process is the same when using the example files for other resources.
Procedure
Deploy the example custom resource with the Prometheus configuration.
For example, for each Kafka resource you apply the kafka-metrics.yaml file.
Deploying the example configuration
oc apply -f kafka-metrics.yaml
Alternatively, you can copy the example configuration in kafka-metrics.yaml to your own Kafka resource.
Copying the example configuration
oc edit kafka <kafka-configuration-file>
Copy the metricsConfig property and the ConfigMap it references to your Kafka resource.
Example metrics configuration for Kafka
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: my-cluster
spec:
  kafka:
    # ...
    metricsConfig: 1
      type: jmxPrometheusExporter
      valueFrom:
        configMapKeyRef:
          name: kafka-metrics
          key: kafka-metrics-config.yml
---
kind: ConfigMap 2
apiVersion: v1
metadata:
  name: kafka-metrics
  labels:
    app: strimzi
data:
  kafka-metrics-config.yml: |
    # metrics configuration...
Note: For Kafka Bridge, you specify the enableMetrics property and set it to true.
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaBridge
metadata:
  name: my-bridge
spec:
  # ...
  bootstrapServers: my-cluster-kafka:9092
  http:
    # ...
  enableMetrics: true
  # ...
To deploy Kafka Exporter, add kafkaExporter configuration. kafkaExporter configuration is only specified in the Kafka resource.
Example configuration for deploying Kafka Exporter
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: my-cluster
spec:
  # ...
  kafkaExporter:
    image: my-registry.io/my-org/my-exporter-cluster:latest 1
    groupRegex: ".*" 2
    topicRegex: ".*" 3
    resources: 4
      requests:
        cpu: 200m
        memory: 64Mi
      limits:
        cpu: 500m
        memory: 128Mi
    logging: debug 5
    enableSaramaLogging: true 6
    template: 7
      pod:
        metadata:
          labels:
            label1: value1
        imagePullSecrets:
          - name: my-docker-credentials
        securityContext:
          runAsUser: 1000001
          fsGroup: 0
        terminationGracePeriodSeconds: 120
    readinessProbe: 8
      initialDelaySeconds: 15
      timeoutSeconds: 5
    livenessProbe: 9
      initialDelaySeconds: 15
      timeoutSeconds: 5
  # ...
1. ADVANCED OPTION: Container image configuration, which is recommended only in special situations.
2. A regular expression to specify the consumer groups to include in the metrics.
3. A regular expression to specify the topics to include in the metrics.
4. CPU and memory resources to reserve for the Kafka Exporter container.
5. Logging configuration, to log messages with a given severity (debug, info, warn, error, fatal) or above.
6. Boolean to enable Sarama logging, a Go client library used by Kafka Exporter.
7. Customization of the deployment template and pods.
8. Healthcheck readiness probe.
9. Healthcheck liveness probe.
For Kafka Exporter to be able to work properly, consumer groups need to be in use.
Additional resources
8.5. Viewing Kafka metrics and dashboards in OpenShift
When AMQ Streams is deployed to OpenShift Container Platform, metrics are provided through monitoring for user-defined projects. This OpenShift feature gives developers access to a separate Prometheus instance for monitoring their own projects (for example, a Kafka project).
If monitoring for user-defined projects is enabled, the openshift-user-workload-monitoring project contains the following components:
- A Prometheus Operator
- A Prometheus instance (automatically deployed by the Prometheus Operator)
- A Thanos Ruler instance
AMQ Streams uses these components to consume metrics.
A cluster administrator must enable monitoring for user-defined projects and then grant developers and other users permission to monitor applications within their own projects.
Grafana deployment
You can deploy a Grafana instance to the project containing your Kafka cluster. The example Grafana dashboards can then be used to visualize Prometheus metrics for AMQ Streams in the Grafana user interface.
The openshift-monitoring project provides monitoring for core platform components. Do not use the Prometheus and Grafana components in this project to configure monitoring for AMQ Streams on OpenShift Container Platform 4.x.
Procedure outline
To set up AMQ Streams monitoring in OpenShift Container Platform, follow the procedures in the next sections in order.
8.5.1. Prerequisites
- You have deployed the Prometheus metrics configuration using the example YAML files.
- Monitoring for user-defined projects is enabled. A cluster administrator has created a cluster-monitoring-config config map in your OpenShift cluster.
- A cluster administrator has assigned you a monitoring-rules-edit or monitoring-edit role.
For more information on creating a cluster-monitoring-config config map and granting users permission to monitor user-defined projects, see OpenShift Container Platform Monitoring.
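For reference, a minimal sketch of the config map a cluster administrator might create to enable monitoring for user-defined projects (check the OpenShift documentation for the full set of options):
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    enableUserWorkload: true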
8.5.2. Additional resources
- OpenShift Container Platform Monitoring
8.5.3. Deploying the Prometheus resources
Use Prometheus to obtain monitoring data in your Kafka cluster.
You can use your own Prometheus deployment or deploy Prometheus using the example metrics configuration files provided by AMQ Streams. To use the example files, you configure and deploy the PodMonitor resources. The PodMonitors scrape data directly from pods for Apache Kafka, ZooKeeper, Operators, the Kafka Bridge, and Cruise Control.
Then, you deploy the example alerting rules for Alertmanager.
Prerequisites
- A running Kafka cluster.
- Check the example alerting rules provided with AMQ Streams.
Procedure
Check that monitoring for user-defined projects is enabled:
oc get pods -n openshift-user-workload-monitoring
If enabled, pods for the monitoring components are returned. For example:
NAME                                   READY   STATUS    RESTARTS   AGE
prometheus-operator-5cc59f9bc6-kgcq8   1/1     Running   0          25s
prometheus-user-workload-0             5/5     Running   1          14s
prometheus-user-workload-1             5/5     Running   1          14s
thanos-ruler-user-workload-0           3/3     Running   0          14s
thanos-ruler-user-workload-1           3/3     Running   0          14s
If no pods are returned, monitoring for user-defined projects is disabled. See the Prerequisites in Section 8.5, “Viewing Kafka metrics and dashboards in OpenShift”.
Multiple PodMonitor resources are defined in examples/metrics/prometheus-install/strimzi-pod-monitor.yaml.
For each PodMonitor resource, edit the spec.namespaceSelector.matchNames property:
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: cluster-operator-metrics
  labels:
    app: strimzi
spec:
  selector:
    matchLabels:
      strimzi.io/kind: cluster-operator
  namespaceSelector:
    matchNames:
      - <project-name> 1
  podMetricsEndpoints:
    - path: /metrics
      port: http
# ...
1. The project where the pods to scrape the metrics from are running, for example, Kafka.
Deploy the strimzi-pod-monitor.yaml file to the project where your Kafka cluster is running:
oc apply -f strimzi-pod-monitor.yaml -n MY-PROJECT
Deploy the example Prometheus rules to the same project:
oc apply -f prometheus-rules.yaml -n MY-PROJECT
8.5.4. Creating a service account for Grafana
A Grafana instance for AMQ Streams needs to run with a service account that is assigned the cluster-monitoring-view role.
Create a service account if you are using Grafana to present metrics for monitoring.
Prerequisites
Procedure
Create a ServiceAccount for Grafana. Here the resource is named grafana-serviceaccount.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: grafana-serviceaccount
  labels:
    app: strimzi
Deploy the ServiceAccount to the project containing your Kafka cluster:
oc apply -f GRAFANA-SERVICEACCOUNT -n MY-PROJECT
Create a ClusterRoleBinding resource that assigns the cluster-monitoring-view role to the Grafana ServiceAccount. Here the resource is named grafana-cluster-monitoring-binding.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: grafana-cluster-monitoring-binding
  labels:
    app: strimzi
subjects:
  - kind: ServiceAccount
    name: grafana-serviceaccount
    namespace: <my-project> 1
roleRef:
  kind: ClusterRole
  name: cluster-monitoring-view
  apiGroup: rbac.authorization.k8s.io
1. Name of your project.
Deploy the ClusterRoleBinding to the project containing your Kafka cluster:
oc apply -f <grafana-cluster-monitoring-binding> -n <my-project>
8.5.5. Deploying Grafana with a Prometheus datasource
Deploy Grafana to present Prometheus metrics. A Grafana application requires configuration for the OpenShift Container Platform monitoring stack.
OpenShift Container Platform includes a Thanos Querier instance in the openshift-monitoring project. Thanos Querier is used to aggregate platform metrics.
To consume the required platform metrics, your Grafana instance requires a Prometheus data source that can connect to Thanos Querier. To configure this connection, you create a config map that authenticates, by using a token, to the oauth-proxy sidecar that runs alongside Thanos Querier. A datasource.yaml file is used as the source of the config map.
Finally, you deploy the Grafana application with the config map mounted as a volume to the project containing your Kafka cluster.
Procedure
Get the access token of the Grafana ServiceAccount:
oc serviceaccounts get-token grafana-serviceaccount -n MY-PROJECT
Copy the access token to use in the next step.
Create a datasource.yaml file containing the Thanos Querier configuration for Grafana.
Paste the access token into the httpHeaderValue1 property as indicated.
apiVersion: 1

datasources:
  - name: Prometheus
    type: prometheus
    url: https://thanos-querier.openshift-monitoring.svc.cluster.local:9091
    access: proxy
    basicAuth: false
    withCredentials: false
    isDefault: true
    jsonData:
      timeInterval: 5s
      tlsSkipVerify: true
      httpHeaderName1: "Authorization"
    secureJsonData:
      httpHeaderValue1: "Bearer ${GRAFANA-ACCESS-TOKEN}" 1
    editable: true
1. GRAFANA-ACCESS-TOKEN: The value of the access token for the Grafana ServiceAccount.
Create a config map named grafana-config from the datasource.yaml file:
oc create configmap grafana-config --from-file=datasource.yaml -n MY-PROJECT
Create a Grafana application consisting of a Deployment and a Service.
The grafana-config config map is mounted as a volume for the datasource configuration.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: grafana
  labels:
    app: strimzi
spec:
  replicas: 1
  selector:
    matchLabels:
      name: grafana
  template:
    metadata:
      labels:
        name: grafana
    spec:
      serviceAccountName: grafana-serviceaccount
      containers:
        - name: grafana
          image: grafana/grafana:9.3.1
          ports:
            - name: grafana
              containerPort: 3000
              protocol: TCP
          volumeMounts:
            - name: grafana-data
              mountPath: /var/lib/grafana
            - name: grafana-logs
              mountPath: /var/log/grafana
            - name: grafana-config
              mountPath: /etc/grafana/provisioning/datasources/datasource.yaml
              readOnly: true
              subPath: datasource.yaml
          readinessProbe:
            httpGet:
              path: /api/health
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 10
          livenessProbe:
            httpGet:
              path: /api/health
              port: 3000
            initialDelaySeconds: 15
            periodSeconds: 20
      volumes:
        - name: grafana-data
          emptyDir: {}
        - name: grafana-logs
          emptyDir: {}
        - name: grafana-config
          configMap:
            name: grafana-config
---
apiVersion: v1
kind: Service
metadata:
  name: grafana
  labels:
    app: strimzi
spec:
  ports:
    - name: grafana
      port: 3000
      targetPort: 3000
      protocol: TCP
  selector:
    name: grafana
  type: ClusterIP
Deploy the Grafana application to the project containing your Kafka cluster:
oc apply -f <grafana-application> -n <my-project>
8.5.6. Creating a route to the Grafana Service
You can access the Grafana user interface through a Route that exposes the Grafana service.
Prerequisites
Procedure
Create an edge route to the grafana service:
oc create route edge <my-grafana-route> --service=grafana --namespace=KAFKA-NAMESPACE
8.5.7. Importing the example Grafana dashboards
Use Grafana to provide visualizations of Prometheus metrics on customizable dashboards.
AMQ Streams provides example dashboard configuration files for Grafana in JSON format.
- examples/metrics/grafana-dashboards
This procedure uses the example Grafana dashboards.
The example dashboards are a good starting point for monitoring key metrics, but they don’t show all the metrics supported by Kafka. You can modify the example dashboards or add other metrics, depending on your infrastructure.
Prerequisites
Procedure
Get the details of the Route to the Grafana Service. For example:
oc get routes

NAME               HOST/PORT                          PATH   SERVICES
MY-GRAFANA-ROUTE   MY-GRAFANA-ROUTE-amq-streams.net          grafana
- In a web browser, access the Grafana login screen using the URL for the Route host and port.
Enter your user name and password, and then click Log In.
The default Grafana user name and password are both admin. After logging in for the first time, you can change the password.
- In Configuration > Data Sources, check that the Prometheus data source was created. The data source was created in Section 8.5.5, “Deploying Grafana with a Prometheus datasource”.
- Click the + icon and then click Import.
- In examples/metrics/grafana-dashboards, copy the JSON of the dashboard to import.
- Paste the JSON into the text box, and then click Load.
- Repeat steps 5-7 for the other example Grafana dashboards.
The imported Grafana dashboards are available to view from the Dashboards home page.