OpenShift Container Platform 4.11: Network Observability
Chapter 8. Monitoring the Network Observability Operator

You can use the web console to monitor alerts related to the health of the Network Observability Operator.

8.1. Viewing health information

You can access metrics about the health and resource usage of the Network Observability Operator from the Dashboards page in the web console. When an alert is triggered, a health alert banner that directs you to the dashboard can appear on the Network Traffic and Home pages. Alerts are generated in the following cases:

The NetObservLokiError alert occurs if the flowlogs-pipeline workload is dropping flows because of Loki errors, for example when the Loki ingestion rate limit has been reached.
The NetObservNoFlows alert occurs if no flows are ingested for a certain amount of time.

Prerequisites

You have the Network Observability Operator installed.
You have access to the cluster as a user with the cluster-admin role or with view permissions for all projects.

Procedure

From the Administrator perspective in the web console, navigate to Observe → Dashboards.
From the Dashboards dropdown, select Netobserv/Health. Metrics about the health of the Operator are displayed on the page.

8.1.1. Disabling health alerts

You can opt out of health alerting by editing the FlowCollector resource:

In the web console, navigate to Operators → Installed Operators.
Under the Provided APIs heading for the NetObserv Operator, select Flow Collector.
Select cluster, then select the YAML tab.
Add spec.processor.metrics.disableAlerts to disable health alerts, as in the following YAML sample:

apiVersion: flows.netobserv.io/v1alpha1
kind: FlowCollector
metadata:
  name: cluster
spec:
  processor:
    metrics:
      disableAlerts: [NetObservLokiError, NetObservNoFlows]

In disableAlerts, you can specify a single alert or a list containing both types of alerts to disable.
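
As a minimal sketch of the single-alert case, the following variant of the sample above disables only NetObservLokiError. It assumes the same FlowCollector resource named cluster and the same API version as the sample; the comment is added here for illustration.

```yaml
apiVersion: flows.netobserv.io/v1alpha1
kind: FlowCollector
metadata:
  name: cluster
spec:
  processor:
    metrics:
      # Only the Loki error alert is disabled here.
      # NetObservNoFlows is not listed, so it stays enabled.
      disableAlerts: [NetObservLokiError]
```

With this setting, the Operator should still raise NetObservNoFlows when no flows are ingested.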

8.2. Creating Loki rate limit alerts for the NetObserv dashboard

You can create custom alerting rules for the NetObserv dashboard metrics to trigger alerts when Loki rate limits have been reached.

An example of an alerting rule configuration YAML file is as follows:

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: loki-alerts
  namespace: openshift-operators-redhat
spec:
  groups:
  - name: LokiRateLimitAlerts
    rules:
    - alert: LokiTenantRateLimit
      annotations:
        message: |-
          {{ $labels.job }} {{ $labels.route }} is experiencing 429 errors.
        summary: "At least one request received the rate limit error code (429)."
      expr: sum(irate(loki_request_duration_seconds_count{status_code="429"}[1m])) by (job, namespace, route) / sum(irate(loki_request_duration_seconds_count[1m])) by (job, namespace, route) * 100 > 0
      for: 10s
      labels:
        severity: warning
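
Read as arithmetic, the expr field in the rule above computes, for each (job, namespace, route) group, the percentage of Loki requests that returned status code 429. The notation below is only a sketch of that reading: the symbols g, c_s, and p_429 are introduced here for illustration and are not names used by Loki or Prometheus.

```latex
% g      : one (job, namespace, route) group
% c_s    : the loki_request_duration_seconds_count series s in group g
% irate  : per-second instant rate over the 1m window, as in the rule
\[
  p_{429}(g) \;=\;
  \frac{\displaystyle\sum_{s \in g,\ \text{status\_code}=429} \operatorname{irate}_{1m}(c_s)}
       {\displaystyle\sum_{s \in g} \operatorname{irate}_{1m}(c_s)}
  \times 100,
  \qquad \text{condition: } p_{429}(g) > 0
\]
```

Because the threshold is 0, any nonzero share of 429 responses satisfies the condition, and the for: 10s field means the condition must hold for 10 seconds before the alert fires.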
