Chapter 7. Monitoring
Monitoring data allows you to monitor the performance and health of AMQ Streams. You can configure your deployment to capture metrics data for analysis and notifications.
Metrics data is useful when investigating issues with connectivity and data delivery. For example, metrics data can identify under-replicated partitions or the rate at which messages are consumed. Alerting rules can provide time-critical notifications on such metrics through a specified communications channel. Monitoring visualizations present real-time metrics data to help determine when and how to update the configuration of your deployment. Example metrics configuration files are provided with AMQ Streams.
Distributed tracing complements the gathering of metrics data by providing a facility for end-to-end tracking of messages through AMQ Streams.
Metrics and monitoring tools
AMQ Streams can employ the following tools for metrics and monitoring:
- Prometheus pulls metrics from Kafka, ZooKeeper and Kafka Connect clusters. The Prometheus Alertmanager plugin handles alerts and routes them to a notification service.
- Kafka Exporter adds additional Prometheus metrics
- Grafana provides dashboard visualizations of Prometheus metrics
- Jaeger provides distributed tracing support to track transactions between applications
Additional resources
7.1. Prometheus
Prometheus can extract metrics data from Kafka, Kafka Connect and ZooKeeper.
In order to use Prometheus to obtain metrics data and provide notifications, Prometheus and the Prometheus Alertmanager plugin must be deployed. Kafka resources must also be deployed or redeployed with Prometheus configuration to expose the metrics data.
Prometheus scrapes the exposed metrics data for monitoring. Alertmanager issues notifications on conditions that indicate potential issues based on pre-defined alerting rules.
Sample metrics and alerting rules configuration files are provided with AMQ Streams. The sample alerting mechanism provided with AMQ Streams is configured to send notifications to a Slack channel.
7.2. Grafana
Grafana uses the metrics data exposed by Prometheus to present dashboard visualizations for monitoring.
A deployment of Grafana is required, with Prometheus added as a data source. Example dashboards, supplied with AMQ Streams as JSON files, are imported through the Grafana interface to present monitoring data.
7.3. Kafka Exporter
Kafka Exporter is an open source project to enhance monitoring of Apache Kafka brokers and clients. Kafka Exporter is deployed with a Kafka cluster to extract additional Prometheus metrics data from Kafka brokers related to offsets, consumer groups, consumer lag, and topics. You can use the Grafana dashboard provided to visualize the data collected by Prometheus from Kafka Exporter.
A sample configuration file, alerting rules and Grafana dashboard for Kafka Exporter are provided with AMQ Streams.
7.4. Distributed tracing
Within a Kafka deployment, distributed tracing using Jaeger is supported for:
- MirrorMaker to trace messages from a source cluster to a target cluster
- Kafka Connect to trace messages consumed and produced by Kafka Connect
- Kafka Bridge to trace messages consumed and produced by Kafka Bridge, and HTTP requests from client applications
Template configuration properties are set for the Kafka resources, which describe tracing environment variables.
Tracing for Kafka clients
Client applications, such as Kafka producers and consumers, can also be set up so that transactions are monitored. Clients are configured with a tracing profile, and a tracer is initialized for the client application to use.