Chapter 11. Monitoring the Network Observability Operator
You can use the web console to monitor alerts related to the health of the Network Observability Operator.
11.1. Health dashboards
Metrics about health and resource usage of the Network Observability Operator are located in the Observe → Dashboards page in the web console, grouped into the following categories:
- Flows per second
- Sampling
- Errors last minute
- Dropped flows per second
- Flowlogs-pipeline statistics
- Flowlogs-pipeline statistics views
- eBPF agent statistics views
- Operator statistics
- Resource usage
11.2. Health alerts
A health alert banner that directs you to the dashboard can appear on the Network Traffic and Home pages if an alert is triggered. Alerts are generated in the following cases:
- The NetObservLokiError alert occurs if the flowlogs-pipeline workload is dropping flows because of Loki errors, such as when the Loki ingestion rate limit has been reached.
- The NetObservNoFlows alert occurs if no flows are ingested for a certain amount of time.
- The NetObservFlowsDropped alert occurs if the Network Observability eBPF agent hashmap table is full and the eBPF agent processes flows with degraded performance, or when the capacity limiter is triggered.
11.3. Viewing health information
You can access metrics about health and resource usage of the Network Observability Operator from the Dashboards page in the web console.
Prerequisites
- You have the Network Observability Operator installed.
- You have access to the cluster as a user with the cluster-admin role or with view permissions for all projects.
Procedure
- From the Administrator perspective in the web console, navigate to Observe → Dashboards.
- From the Dashboards dropdown, select Netobserv/Health.
- View the metrics about the health of the Operator that are displayed on the page.
11.3.1. Disabling health alerts
You can opt out of health alerting by editing the FlowCollector resource:
- In the web console, navigate to Operators → Installed Operators.
- Under the Provided APIs heading for the NetObserv Operator, select Flow Collector.
- Select cluster, and then select the YAML tab.
- Add spec.processor.metrics.disableAlerts to disable health alerts. You can specify one alert or a list with both types of alerts to disable.
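The disableAlerts setting from the last step can be sketched as the following FlowCollector fragment. This is a minimal sketch rather than the original sample: the apiVersion shown is an assumption and should match the version served on your cluster, and the two alert names are the ones that the health alerts section associates with the flowlogs-pipeline workload and flow ingestion.

```yaml
# Minimal sketch; the apiVersion is an assumption and may differ on your cluster.
apiVersion: flows.netobserv.io/v1beta2
kind: FlowCollector
metadata:
  name: cluster
spec:
  processor:
    metrics:
      # List one alert, or both, to disable.
      disableAlerts: [NetObservLokiError, NetObservNoFlows]
```

To disable a single alert, keep only that name in the list.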
11.4. Using the eBPF agent alert
An alert, NetObservAgentFlowsDropped, is triggered when the Network Observability eBPF agent hashmap table is full or when the capacity limiter is triggered. If you see this alert, consider increasing the cacheMaxFlows value in the FlowCollector resource, as described in the following procedure.
Increasing the cacheMaxFlows value might increase the memory usage of the eBPF agent.
Procedure
- In the web console, navigate to Operators → Installed Operators.
- Under the Provided APIs heading for the Network Observability Operator, select Flow Collector.
- Select cluster, and then select the YAML tab.
- Increase the spec.agent.ebpf.cacheMaxFlows value from its value at the time of the NetObservAgentFlowsDropped alert.
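The change in the last step might look like the following FlowCollector fragment. This is a sketch under stated assumptions, not the original sample: the apiVersion is assumed and should match the version served on your cluster, and the number shown is purely illustrative. Choose a value higher than the one in effect when the alert fired.

```yaml
# Minimal sketch; the apiVersion is an assumption and may differ on your cluster.
apiVersion: flows.netobserv.io/v1beta2
kind: FlowCollector
metadata:
  name: cluster
spec:
  agent:
    type: eBPF
    ebpf:
      # Illustrative value only: set this higher than the value that was
      # configured at the time of the NetObservAgentFlowsDropped alert.
      cacheMaxFlows: 200000
```

Because a larger flow cache can raise the memory usage of the eBPF agent, it may be safer to increase the value in steps and watch the Resource usage category of the health dashboard after each change.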