Chapter 7. Monitoring
Streams for Apache Kafka Proxy supports key observability features to help you understand the performance and health of your proxy instances.
The Streams for Apache Kafka Proxy and Streams for Apache Kafka Proxy Operator generate metrics for real-time monitoring and alerting, as well as logs that capture their actions and behavior. You can integrate these metrics with a monitoring system like Prometheus for ingestion and analysis, while configuring log levels to control the granularity of logged information.
7.1. Overview of proxy metrics
The proxy provides metrics for both connections and messages. These metrics are categorized into downstream (client-side) and upstream (broker-side) groups. They allow users to assess the impact of the proxy and its filters on their Kafka system.
- Connection metrics count the incoming connections from clients (downstream) and the outgoing connections that the proxy makes to the Kafka brokers (upstream).
- Message metrics count the number of Kafka protocol request and response messages that flow through the proxy.
7.1.1. Connection metrics
Connection metrics count the TCP connections made from the client to the proxy (kroxylicious_client_to_proxy_connections_total) and from the proxy to the broker (kroxylicious_proxy_to_server_connections_total). These metrics count connection attempts, so the count is incremented even if the connection attempt ultimately fails.
In addition to the count metrics, there are active connection gauge metrics that track the current number of open connections, and error metrics.
- If an error occurs while the proxy is accepting a connection from the client, the kroxylicious_client_to_proxy_errors_total metric is incremented by one.
- If an error occurs while the proxy is attempting a connection to a broker, the kroxylicious_proxy_to_server_errors_total metric is incremented by one.
Connection and connection error metrics include the following labels: virtual_cluster (the virtual cluster’s name) and node_id (the broker’s node ID). When the client connects to the bootstrap endpoint of the virtual cluster, a node ID value of bootstrap is recorded.
The kroxylicious_client_to_proxy_errors_total metric also counts connection errors that occur before a virtual cluster has been identified. For these specific errors, the virtual_cluster and node_id labels are set to an empty string ("").
Error conditions signaled within the Kafka protocol response (such as RESOURCE_NOT_FOUND or UNKNOWN_TOPIC_ID) are not classed as errors by these metrics.
7.1.1.1. Understanding connection counter vs gauge metrics
The proxy provides both counter and gauge metrics for connections:
- Connection counters (kroxylicious_*_connections_total) track the total number of connection attempts over time. These values only increase and provide a historical view of connection activity.
- Active connection gauges (kroxylicious_*_active_connections) show the current number of open connections at any given moment. These values increase when connections are established and decrease when connections are closed.
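| Metric Name | Type | Labels | Description |
|---|---|---|---|
| kroxylicious_client_to_proxy_connections_total | Counter | virtual_cluster, node_id | Incremented by one every time a connection is accepted from a client by the proxy. |
| kroxylicious_client_to_proxy_errors_total | Counter | virtual_cluster, node_id | Incremented by one every time a connection is closed due to any downstream error. |
| kroxylicious_proxy_to_server_connections_total | Counter | virtual_cluster, node_id | Incremented by one every time a connection is made to the server from the proxy. |
| kroxylicious_proxy_to_server_errors_total | Counter | virtual_cluster, node_id | Incremented by one every time a connection is closed due to any upstream error. |
| kroxylicious_client_to_proxy_active_connections | Gauge | virtual_cluster, node_id | Shows the current number of active TCP connections from clients to the proxy. |
| kroxylicious_proxy_to_server_active_connections | Gauge | virtual_cluster, node_id | Shows the current number of active TCP connections from the proxy to servers. |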
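To see how the two metric types differ in practice, the following sketch totals the connection counters and the active connection gauges from a saved scrape of the proxy’s /metrics endpoint. The file name metrics.txt, the label values, and every number in it are invented for illustration, and the client-side metric names are assumed to follow the kroxylicious_*_connections_total and kroxylicious_*_active_connections patterns described above.

```shell
#!/bin/sh
# Hypothetical scrape output in Prometheus text format; all values are invented.
cat > metrics.txt <<'EOF'
kroxylicious_client_to_proxy_connections_total{virtual_cluster="demo",node_id="bootstrap"} 12
kroxylicious_client_to_proxy_connections_total{virtual_cluster="demo",node_id="0"} 30
kroxylicious_client_to_proxy_active_connections{virtual_cluster="demo",node_id="bootstrap"} 1
kroxylicious_client_to_proxy_active_connections{virtual_cluster="demo",node_id="0"} 4
EOF

# Sum the sample value (last field) of every line whose metric name matches $1.
sum_metric() {
  awk -v name="$1" 'index($0, name "{") == 1 { total += $NF } END { print total + 0 }' "$2"
}

attempts=$(sum_metric kroxylicious_client_to_proxy_connections_total metrics.txt)
open=$(sum_metric kroxylicious_client_to_proxy_active_connections metrics.txt)
echo "connection attempts counted so far: $attempts"
echo "connections open right now: $open"
```

The counter total can only grow between scrapes, whereas the gauge total can go up or down as clients connect and disconnect.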
7.1.2. Message metrics
Message metrics count and record the sizes of the Kafka protocol requests and responses that flow through the proxy.
Use these metrics to help understand:
- the number of messages flowing through the proxy.
- the overall volume of data through the proxy.
- the effect the filters are having on the messages.
Downstream metrics
- kroxylicious_client_to_proxy_request_total counts requests as they arrive from the client.
- kroxylicious_proxy_to_client_response_total counts responses as they are returned to the client.
- kroxylicious_client_to_proxy_request_size_bytes is incremented by the size of each request as it arrives from the client.
- kroxylicious_proxy_to_client_response_size_bytes is incremented by the size of each response as it is returned to the client.
Upstream metrics
- kroxylicious_proxy_to_server_request_total counts requests as they go to the broker.
- kroxylicious_server_to_proxy_response_total counts responses as they are returned by the broker.
- kroxylicious_proxy_to_server_request_size_bytes is incremented by the size of each request as it goes to the broker.
- kroxylicious_server_to_proxy_response_size_bytes is incremented by the size of each response as it is returned by the broker.
The size recorded is the encoded size of the protocol message. It includes the 4-byte message size field.
Filters can alter the flow of messages through the proxy or the content of the messages. These changes are visible in the metrics.
- If a filter sends a short-circuit response or closes a connection, the downstream message counters exceed the upstream counters.
- If a filter changes the size of a message, the downstream size metrics differ from the upstream size metrics.
Figure 7.1. Downstream and upstream message metrics in the proxy
Message metrics include the following labels: virtual_cluster (the virtual cluster’s name), node_id (the broker’s node ID), api_key (the message type), api_version, and decoded (a flag indicating if the message was decoded by the proxy).
When the client connects to the bootstrap endpoint of the virtual cluster, metrics are recorded with a node ID value of bootstrap.
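| Metric Name | Type | Labels | Description |
|---|---|---|---|
| kroxylicious_client_to_proxy_request_total | Counter | virtual_cluster, node_id, api_key, api_version, decoded | Incremented by one every time a request arrives at the proxy from a client. |
| kroxylicious_proxy_to_server_request_total | Counter | virtual_cluster, node_id, api_key, api_version, decoded | Incremented by one every time a request goes from the proxy to a server. |
| kroxylicious_server_to_proxy_response_total | Counter | virtual_cluster, node_id, api_key, api_version, decoded | Incremented by one every time a response arrives at the proxy from a server. |
| kroxylicious_proxy_to_client_response_total | Counter | virtual_cluster, node_id, api_key, api_version, decoded | Incremented by one every time a response goes from the proxy to a client. |
| kroxylicious_client_to_proxy_request_size_bytes | Distribution | virtual_cluster, node_id, api_key, api_version, decoded | Incremented by the size of the message each time a request arrives at the proxy from a client. |
| kroxylicious_proxy_to_server_request_size_bytes | Distribution | virtual_cluster, node_id, api_key, api_version, decoded | Incremented by the size of the message each time a request goes from the proxy to a server. |
| kroxylicious_server_to_proxy_response_size_bytes | Distribution | virtual_cluster, node_id, api_key, api_version, decoded | Incremented by the size of the message each time a response arrives at the proxy from a server. |
| kroxylicious_proxy_to_client_response_size_bytes | Distribution | virtual_cluster, node_id, api_key, api_version, decoded | Incremented by the size of the message each time a response goes from the proxy to a client. |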
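The comparison between downstream and upstream request counters described in this section can be sketched against a saved scrape of the /metrics endpoint. The file name, label values, and numbers below are invented for illustration. In a running deployment the same subtraction would more likely be written as a query in your monitoring system.

```shell
#!/bin/sh
# Hypothetical scrape output; the label values and numbers are invented.
cat > message-metrics.txt <<'EOF'
kroxylicious_client_to_proxy_request_total{virtual_cluster="demo",node_id="0",api_key="PRODUCE",api_version="9",decoded="true"} 100
kroxylicious_client_to_proxy_request_total{virtual_cluster="demo",node_id="0",api_key="FETCH",api_version="13",decoded="false"} 50
kroxylicious_proxy_to_server_request_total{virtual_cluster="demo",node_id="0",api_key="PRODUCE",api_version="9",decoded="true"} 97
kroxylicious_proxy_to_server_request_total{virtual_cluster="demo",node_id="0",api_key="FETCH",api_version="13",decoded="false"} 50
EOF

# Requests received from clients minus requests forwarded to brokers.
not_forwarded=$(awk '
  index($0, "kroxylicious_client_to_proxy_request_total{") == 1 { down += $NF }
  index($0, "kroxylicious_proxy_to_server_request_total{") == 1 { up += $NF }
  END { print down - up }
' message-metrics.txt)
echo "requests not forwarded upstream: $not_forwarded"
```

A positive difference suggests that some requests were answered or dropped inside the proxy, for example by a filter that short-circuits. A difference of zero suggests that every request was forwarded.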
7.2. Overview of operator metrics
The Streams for Apache Kafka Proxy Operator is implemented using the Java Operator SDK. The Java Operator SDK exposes metrics that allow its behavior to be understood. These metrics are enabled by default in the Streams for Apache Kafka Proxy Operator.
Refer to the Java Operator SDK metric documentation to learn more about metrics.
7.3. Integrating with OpenShift user-workload monitoring
When Streams for Apache Kafka Proxy is deployed on OpenShift Container Platform, Prometheus metrics can be collected through OpenShift’s monitoring for user-defined projects.
This OpenShift feature provides a dedicated Prometheus instance that enables developers to monitor workloads in their own projects (namespaces). The monitoring stack automatically scrapes metrics from eligible targets defined by custom resources such as PodMonitor.
Streams for Apache Kafka Proxy integrates with this monitoring stack by using a PodMonitor custom resource. To enable metric scraping, define a PodMonitor in the same namespace as the proxy deployment. This resource identifies the proxy pods and exposes their /metrics endpoints to the Prometheus instance, without requiring manual Prometheus configuration.
7.4. Ingesting metrics
Metrics from the Streams for Apache Kafka Proxy and Streams for Apache Kafka Proxy Operator can be ingested into your Prometheus instance. The proxy and the operator each expose an HTTP endpoint for Prometheus metrics at the /metrics address. The endpoint does not require authentication.
For the Proxy, the port that exposes the scrape endpoint is named management. For the Operator, the port is named http.
Prometheus can be configured to ingest the metrics from the scrape endpoints.
This guide assumes monitoring for user-defined projects is enabled on your OpenShift cluster. For more information, see the OpenShift Monitoring guide.
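As a rough illustration of what a PodMonitor for these endpoints might look like, the sketch below prints a manifest for a given port name. The port names (management and http) and the /metrics path come from this section. The resource names, the podmonitor_manifest function name, and the selector label are assumptions. Replace the selector with labels that are actually set on your proxy or operator pods.

```shell
#!/bin/sh
# Print a PodMonitor manifest.
# Arguments: resource name, component label value, port name.
# The app.kubernetes.io/component selector label is an assumption, not a documented value.
podmonitor_manifest() {
  cat <<EOF
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: $1
spec:
  selector:
    matchLabels:
      app.kubernetes.io/component: $2
  podMetricsEndpoints:
    - port: $3
      path: /metrics
EOF
}

proxy_manifest=$(podmonitor_manifest proxy-metrics proxy management)
operator_manifest=$(podmonitor_manifest operator-metrics operator http)
printf '%s\n' "$proxy_manifest"

# To apply against a cluster (requires oc and a logged-in session):
#   podmonitor_manifest proxy-metrics proxy management | oc apply -n <namespace> -f -
```

Generating both manifests from one function keeps the only intended differences, the selector and the port name, visible in the two calls.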
7.4.1. Ingesting operator metrics
This procedure describes how to ingest metrics from the Streams for Apache Kafka Proxy Operator into Prometheus.
Prerequisites
- Streams for Apache Kafka Proxy Operator is installed.
- Monitoring for user-defined projects is enabled on your OpenShift cluster and a Prometheus instance has been created. For more information, see the OpenShift Monitoring guide.
Procedure
Apply the PodMonitor configuration.
The Prometheus Operator reconfigures Prometheus automatically. Prometheus begins to scrape the Streams for Apache Kafka Proxy Operator’s metrics regularly.
Check the metrics are being ingested using a PromQL query such as:
operator_sdk_reconciliations_queue_size_kafkaproxyreconciler{kind="KafkaProxy", group="kroxylicious.io"}
7.4.2. Ingesting proxy metrics
This procedure describes how to ingest metrics from the Streams for Apache Kafka Proxy into Prometheus.
Prerequisites
- Streams for Apache Kafka Proxy Operator is installed.
- An instance of Streams for Apache Kafka Proxy deployed by the operator.
- Monitoring for user-defined projects is enabled on your OpenShift cluster and a Prometheus instance has been created. For more information, see the OpenShift Monitoring guide.
Procedure
Apply the PodMonitor configuration:
The Prometheus Operator reconfigures Prometheus automatically. Prometheus begins to scrape the proxy’s metrics regularly.
Check the metrics are being ingested using a PromQL query such as:
kroxylicious_build_info
7.5. Setting log levels
You can independently control the logging level of both the Streams for Apache Kafka Proxy Operator and the Streams for Apache Kafka Proxy.
In both cases, logging levels are controlled using two environment variables:
- KROXYLICIOUS_APP_LOG_LEVEL controls the logging level of the application (io.kroxylicious loggers). It defaults to INFO.
- KROXYLICIOUS_ROOT_LOG_LEVEL controls the logging level at the root. It defaults to WARN.
When trying to diagnose a problem, start by raising KROXYLICIOUS_APP_LOG_LEVEL. If more detailed diagnostics are required, try raising KROXYLICIOUS_ROOT_LOG_LEVEL. Both the proxy and operator use Apache Log4j 2 and accept the logging levels it understands: TRACE, DEBUG, INFO, WARN, and ERROR.
WARNING: Running the operator or the proxy at elevated logging levels, such as DEBUG or TRACE, can generate a large volume of logs, which may consume significant storage and affect performance. Run at these levels only as long as necessary.
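Because a mistyped variable or level name is easy to miss, a small wrapper can check both before building the oc set env command used in the procedures that follow. This is only a sketch: the function name log_level_command and the example namespace and deployment names are hypothetical, and the function prints the command rather than running it.

```shell
#!/bin/sh
# Print an `oc set env` command for a log level, rejecting unknown names.
# Arguments: namespace, deployment name, variable name, level.
log_level_command() {
  case "$3" in
    KROXYLICIOUS_APP_LOG_LEVEL|KROXYLICIOUS_ROOT_LOG_LEVEL) ;;
    *) echo "unsupported variable: $3" >&2; return 1 ;;
  esac
  case "$4" in
    TRACE|DEBUG|INFO|WARN|ERROR) ;;
    *) echo "unsupported level: $4" >&2; return 1 ;;
  esac
  echo "oc set env -n $1 deployment $2 $3=$4"
}

# Example with placeholder names; review the printed command before running it.
cmd=$(log_level_command my-namespace my-proxy KROXYLICIOUS_APP_LOG_LEVEL DEBUG)
echo "$cmd"
```

Printing the command instead of executing it leaves the decision to run at an elevated level, and for how long, with the operator of the cluster.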
7.5.1. Overriding proxy logging levels
This procedure describes how to override the logging level of the Streams for Apache Kafka Proxy.
Prerequisites
- An instance of Streams for Apache Kafka Proxy deployed by the Streams for Apache Kafka Proxy Operator.
Procedure
Apply the KROXYLICIOUS_APP_LOG_LEVEL or KROXYLICIOUS_ROOT_LOG_LEVEL environment variable to the proxy’s OpenShift Deployment resource:
oc set env -n <namespace> deployment <deployment_name> KROXYLICIOUS_APP_LOG_LEVEL=DEBUG
The Deployment resource has the same name as the KafkaProxy. OpenShift recreates the proxy pod automatically.
Verify that the new logging level has taken effect:
oc logs -f -n <namespace> deployment/<deployment_name>
7.5.1.1. Reverting proxy logging levels
This procedure describes how to revert the logging level of the Streams for Apache Kafka Proxy back to its defaults.
Prerequisites
- An instance of Streams for Apache Kafka Proxy deployed by the Streams for Apache Kafka Proxy Operator.
Procedure
Remove the KROXYLICIOUS_APP_LOG_LEVEL or KROXYLICIOUS_ROOT_LOG_LEVEL environment variable from the proxy’s OpenShift Deployment:
oc set env -n <namespace> deployment <deployment_name> KROXYLICIOUS_APP_LOG_LEVEL-
OpenShift recreates the proxy pod automatically.
Verify that the logging level has reverted to its default:
oc logs -f -n <namespace> deployment/<deployment_name>
7.5.2. Overriding operator logging levels (operator installed by OLM)
This procedure describes how to override the logging level of the Streams for Apache Kafka Proxy Operator. It applies when the operator was installed by OLM.
Prerequisites
- Streams for Apache Kafka Proxy Operator installed using OLM.
Procedure
Identify the name of the Subscription resource that has installed the operator and its namespace:
oc get subscriptions.operators.coreos.com --all-namespaces | grep kroxylicious
Apply the KROXYLICIOUS_APP_LOG_LEVEL or KROXYLICIOUS_ROOT_LOG_LEVEL environment variable to the Subscription:
oc patch subscription -n <namespace> <subscription_name> -p '{"spec":{"config":{"env":[{"name":"KROXYLICIOUS_APP_LOG_LEVEL","value":"DEBUG"}]}}}' --type=merge
OpenShift recreates the operator pod automatically.
Verify that the new logging level has taken effect:
oc logs -f -n <namespace> deployment/kroxylicious-operator
7.5.2.1. Reverting operator logging levels
This procedure describes how to revert the logging level of the Streams for Apache Kafka Proxy Operator back to its defaults.
Prerequisites
- Streams for Apache Kafka Proxy Operator installed using OLM.
Procedure
Remove the KROXYLICIOUS_APP_LOG_LEVEL or KROXYLICIOUS_ROOT_LOG_LEVEL environment variable from the Subscription:
oc patch subscription -n <namespace> <subscription_name> -p '{"spec":{"config":{"env":[]}}}' --type=merge
OpenShift recreates the operator pod automatically.
Verify that the logging level has reverted to its default:
oc logs -f -n <namespace> deployment/kroxylicious-operator
7.5.3. Overriding the operator logging level (operator installed by bundle)
This procedure describes how to override the logging level of the Streams for Apache Kafka Proxy Operator. It applies when the operator was installed from the YAML bundle.
Prerequisites
- Streams for Apache Kafka Proxy Operator installed from the YAML bundle.
Procedure
Apply the KROXYLICIOUS_APP_LOG_LEVEL or KROXYLICIOUS_ROOT_LOG_LEVEL environment variable to the operator’s OpenShift Deployment:
oc set env -n kroxylicious-operator deployment kroxylicious-operator KROXYLICIOUS_APP_LOG_LEVEL=DEBUG
OpenShift recreates the operator pod automatically.
Verify that the new logging level has taken effect:
oc logs -f -n kroxylicious-operator deployment/kroxylicious-operator
7.5.3.1. Reverting operator logging levels
This procedure describes how to revert the logging level of the Streams for Apache Kafka Proxy Operator back to its defaults.
Prerequisites
- Streams for Apache Kafka Proxy Operator installed from the YAML bundle.
Procedure
Remove the KROXYLICIOUS_APP_LOG_LEVEL or KROXYLICIOUS_ROOT_LOG_LEVEL environment variable from the operator’s OpenShift Deployment:
oc set env -n kroxylicious-operator deployment kroxylicious-operator KROXYLICIOUS_APP_LOG_LEVEL-
OpenShift recreates the operator pod automatically.
Verify that the logging level has reverted to its default:
oc logs -f -n kroxylicious-operator deployment/kroxylicious-operator