
Chapter 11. Troubleshooting Network Observability


To diagnose Network Observability issues, you can perform the following troubleshooting actions.

11.1. Using the must-gather tool

You can use the must-gather tool to collect information about the Network Observability Operator resources and cluster-wide resources, such as pod logs, FlowCollector, and webhook configurations.

Procedure

  1. Navigate to the directory where you want to store the must-gather data.
  2. Run the following command to collect cluster-wide must-gather resources:

    $ oc adm must-gather \
     --image-stream=openshift/must-gather \
     --image=quay.io/netobserv/must-gather
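
The two steps above can be combined into one helper. The following is a minimal sketch, not an official script: the function name and the timestamped directory naming are assumptions made for illustration, and it relies on the `--dest-dir` option of `oc adm must-gather` to choose where the data is written.

```shell
# Hypothetical helper: collect must-gather data into a dedicated directory.
# The default directory name (timestamped) is an assumption for illustration.
collect_netobserv_must_gather() {
  local dest
  dest="${1:-must-gather-netobserv-$(date +%Y%m%d-%H%M%S)}"
  mkdir -p "$dest" || return 1
  oc adm must-gather \
    --image-stream=openshift/must-gather \
    --image=quay.io/netobserv/must-gather \
    --dest-dir="$dest"
}
```

Calling `collect_netobserv_must_gather /tmp/netobserv-debug` would write the collected data under that path instead of the current directory.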

11.2. Configuring network traffic menu entry in the OpenShift Container Platform console

If the network traffic menu entry is not listed in the Observe menu of the OpenShift Container Platform console, configure it manually.

Prerequisites

  • You have installed OpenShift Container Platform version 4.10 or newer.

Procedure

  1. Check if the spec.consolePlugin.register field is set to true by running the following command:

    $ oc -n netobserv get flowcollector cluster -o yaml

    Example output

    apiVersion: flows.netobserv.io/v1alpha1
    kind: FlowCollector
    metadata:
      name: cluster
    spec:
      consolePlugin:
        register: false

  2. Optional: Add the netobserv-plugin plugin by manually editing the Console Operator config:

    $ oc edit console.operator.openshift.io cluster

    Example output

    ...
    spec:
      plugins:
      - netobserv-plugin
    ...

  3. Optional: Set the spec.consolePlugin.register field to true by running the following command:

    $ oc -n netobserv edit flowcollector cluster -o yaml

    Example output

    apiVersion: flows.netobserv.io/v1alpha1
    kind: FlowCollector
    metadata:
      name: cluster
    spec:
      consolePlugin:
        register: true

  4. Ensure the status of console pods is Running by running the following command:

    $ oc get pods -n openshift-console -l app=console

  5. Restart the console pods by running the following command:

    $ oc delete pods -n openshift-console -l app=console

  6. Clear your browser cache and history.
  7. Check the status of Network Observability plugin pods by running the following command:

    $ oc get pods -n netobserv -l app=netobserv-plugin

    Example output

    NAME                                READY   STATUS    RESTARTS   AGE
    netobserv-plugin-68c7bbb9bb-b69q6   1/1     Running   0          21s

  8. Check the logs of the Network Observability plugin pods by running the following command:

    $ oc logs -n netobserv -l app=netobserv-plugin

    Example output

    time="2022-12-13T12:06:49Z" level=info msg="Starting netobserv-console-plugin [build version: , build date: 2022-10-21 15:15] at log level info" module=main
    time="2022-12-13T12:06:49Z" level=info msg="listening on https://:9001" module=server
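
Steps 1 and 3 of this procedure can also be done without an interactive editor. The following is a minimal sketch under that assumption: the function names are hypothetical, and it uses `oc get -o jsonpath` to read the field and `oc patch --type=merge` to set it, with the same resource name and namespace flag as the commands above.

```shell
# Read the current value of spec.consolePlugin.register.
get_plugin_register() {
  oc -n netobserv get flowcollector cluster \
    -o jsonpath='{.spec.consolePlugin.register}'
}

# Set the field to true with a merge patch instead of an interactive edit.
enable_plugin_register() {
  oc -n netobserv patch flowcollector cluster --type=merge \
    -p '{"spec":{"consolePlugin":{"register":true}}}'
}

# Hypothetical wrapper: patch only when registration is not already enabled.
ensure_plugin_registered() {
  if [ "$(get_plugin_register)" != "true" ]; then
    enable_plugin_register
  fi
}
```

After running `ensure_plugin_registered`, you would still restart the console pods and clear the browser cache as described in the procedure.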

11.3. Flowlogs-Pipeline does not consume network flows after installing Kafka

If you deployed the flow collector first with deploymentModel: KAFKA and then deployed Kafka, the flow collector might not connect correctly to Kafka. If Flowlogs-pipeline does not consume network flows from Kafka, manually restart the flow-pipeline pods.

Procedure

  1. Delete the flow-pipeline pods to restart them by running the following command:

    $ oc delete pods -n netobserv -l app=flowlogs-pipeline-transformer
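
To confirm that the replacement pods come back, the restart can be followed by a listing of the same label selector. This is a minimal sketch with a hypothetical function name; it only reuses the namespace and label from the command above.

```shell
# Hypothetical helper: restart the flow-pipeline pods, then list the new ones.
restart_flowlogs_pipeline() {
  oc delete pods -n netobserv -l app=flowlogs-pipeline-transformer || return 1
  # List the replacement pods so you can confirm that they reach Running.
  oc get pods -n netobserv -l app=flowlogs-pipeline-transformer
}
```

The replacement pods might take a few seconds to appear, so you might need to repeat the listing.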

11.4. Failing to see network flows from both br-int and br-ex interfaces

br-ex and br-int are virtual bridge devices that operate at OSI layer 2. The eBPF agent works at the IP and TCP levels, layers 3 and 4 respectively. You can expect the eBPF agent to capture the network traffic passing through br-ex and br-int when that traffic is processed by other interfaces, such as physical host or virtual pod interfaces. If you restrict the eBPF agent network interfaces to attach only to br-ex and br-int, you do not see any network flow.

Manually remove the part in the interfaces or excludeInterfaces field that restricts the network interfaces to br-int and br-ex.

Procedure

  1. Remove the interfaces: [ 'br-int', 'br-ex' ] field. This allows the agent to fetch information from all the interfaces. Alternatively, you can specify a Layer-3 interface, for example, eth0. Run the following command:

    $ oc edit -n netobserv flowcollector cluster -o yaml

    Example output

    apiVersion: flows.netobserv.io/v1alpha1
    kind: FlowCollector
    metadata:
      name: cluster
    spec:
      agent:
        type: EBPF
        ebpf:
          interfaces: [ 'br-int', 'br-ex' ] # Specifies the network interfaces.
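
As an alternative to the interactive edit, the field can be removed with a JSON patch. This is a minimal sketch, assuming the field lives at spec.agent.ebpf.interfaces as in the example above; the function name is hypothetical.

```shell
# Hypothetical helper: remove the interfaces restriction from the FlowCollector.
# A JSON patch "remove" operation fails if the path does not exist.
remove_interface_restriction() {
  oc -n netobserv patch flowcollector cluster --type=json \
    -p '[{"op":"remove","path":"/spec/agent/ebpf/interfaces"}]'
}
```

Because the remove operation fails when the field is absent, an error here can simply mean that no restriction was configured.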

11.5. Network Observability controller manager pod runs out of memory

You can increase memory limits for the Network Observability Operator by editing the spec.config.resources.limits.memory specification in the Subscription object.

Procedure

  1. In the web console, navigate to Operators → Installed Operators.
  2. Click Network Observability and then select Subscription.
  3. From the Actions menu, click Edit Subscription.

    1. Alternatively, you can use the CLI to open the YAML configuration for the Subscription object by running the following command:

      $ oc edit subscription netobserv-operator -n openshift-netobserv-operator

  4. Edit the Subscription object to add the config.resources.limits.memory specification and set the value to account for your memory requirements. See the Additional resources for more information about resource considerations:

    apiVersion: operators.coreos.com/v1alpha1
    kind: Subscription
    metadata:
      name: netobserv-operator
      namespace: openshift-netobserv-operator
    spec:
      channel: stable
      config:
        resources:
          limits:
            memory: 800Mi # For example, you can increase the memory limit to 800Mi.
          requests:
            cpu: 100m
            memory: 100Mi
      installPlanApproval: Automatic
      name: netobserv-operator
      source: redhat-operators
      sourceNamespace: openshift-marketplace
      startingCSV: <network_observability_operator_latest_version> # Do not edit this value. It changes depending on the most current release of the Operator.
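
The same change can be applied non-interactively with a merge patch. This is a minimal sketch, not the documented procedure: the function name is hypothetical, and the default of 800Mi is just the example value used above.

```shell
# Hypothetical helper: set the Operator memory limit in the Subscription.
# Usage: set_operator_memory_limit [limit], for example 800Mi (the default here).
set_operator_memory_limit() {
  local limit
  limit="${1:-800Mi}"
  oc patch subscription netobserv-operator -n openshift-netobserv-operator \
    --type=merge \
    -p "{\"spec\":{\"config\":{\"resources\":{\"limits\":{\"memory\":\"${limit}\"}}}}}"
}
```

A merge patch only touches the fields it names, so the existing requests values and the startingCSV field are left as they are.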

11.6. Troubleshooting Loki ResourceExhausted error

Loki may return a ResourceExhausted error when network flow data sent by Network Observability exceeds the configured maximum message size. If you are using the Red Hat Loki Operator, this maximum message size is configured to 100 MiB.

Procedure

  1. Navigate to Operators → Installed Operators, viewing All projects from the Project drop-down menu.
  2. In the Provided APIs list, select the Network Observability Operator.
  3. Click Flow Collector, then click the YAML view tab.

    1. If you are using the Loki Operator, check that the spec.loki.batchSize value does not exceed 98 MiB.
    2. If you are using a Loki installation method that is different from the Red Hat Loki Operator, such as Grafana Loki, verify that the grpc_server_max_recv_msg_size Grafana Loki server setting is higher than the FlowCollector resource spec.loki.batchSize value. If it is not, you must either increase the grpc_server_max_recv_msg_size value, or decrease the spec.loki.batchSize value so that it is lower than the limit.
  4. Click Save if you edited the FlowCollector.
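
The 98 MiB check in step 3 can be scripted. The following is a minimal sketch that assumes spec.loki.batchSize is expressed in bytes; verify this against the FlowCollector API reference for your version. The function and variable names are hypothetical.

```shell
# 98 MiB expressed in bytes (98 * 1024 * 1024 = 102760448).
MAX_BATCH_BYTES=$((98 * 1024 * 1024))

# Succeeds when the given size (assumed to be in bytes) is within the limit.
batch_size_within_limit() {
  [ "$1" -le "$MAX_BATCH_BYTES" ]
}

# Hypothetical helper: read the configured value and compare it to the limit.
check_loki_batch_size() {
  local size
  size="$(oc -n netobserv get flowcollector cluster \
    -o jsonpath='{.spec.loki.batchSize}')"
  if batch_size_within_limit "$size"; then
    echo "batchSize ${size} is within the limit"
  else
    echo "batchSize ${size} exceeds ${MAX_BATCH_BYTES} bytes"
    return 1
  fi
}
```

If the field is unset, the comparison fails and the helper reports the value as exceeding the limit, so an empty value in the message means that the default applies rather than that the limit is exceeded.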

11.7. Resource troubleshooting

11.8. LokiStack rate limit errors

A rate limit placed on the Loki tenant can result in temporary loss of data and a 429 error: Per stream rate limit exceeded (limit:xMB/sec) while attempting to ingest for stream. Consider setting an alert to notify you of this error. For more information, see "Creating Loki rate limit alerts for the NetObserv dashboard" in the Additional resources of this section.

You can update the LokiStack CRD with the perStreamRateLimit and perStreamRateLimitBurst specifications, as shown in the following procedure.

Procedure

  1. Navigate to Operators → Installed Operators, viewing All projects from the Project dropdown.
  2. Look for Loki Operator, and select the LokiStack tab.
  3. Create or edit an existing LokiStack instance using the YAML view to add the perStreamRateLimit and perStreamRateLimitBurst specifications:

    apiVersion: loki.grafana.com/v1
    kind: LokiStack
    metadata:
      name: loki
      namespace: netobserv
    spec:
      limits:
        global:
          ingestion:
            perStreamRateLimit: 6 # The default value for perStreamRateLimit is 3.
            perStreamRateLimitBurst: 30 # The default value for perStreamRateLimitBurst is 15.
      tenants:
        mode: openshift-network
      managementState: Managed
  4. Click Save.

Verification

After you update the perStreamRateLimit and perStreamRateLimitBurst specifications, the pods in your cluster restart and the 429 rate-limit error no longer occurs.
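
For an existing LokiStack instance, the two limits can also be set from the CLI with a merge patch. This is a minimal sketch that assumes the instance is named loki in the netobserv namespace, as in the example YAML; the function name is hypothetical, and the defaults of 6 and 30 are the example values from the procedure.

```shell
# Hypothetical helper: raise the per-stream rate limits on the LokiStack.
# Usage: set_loki_stream_limits [rate] [burst]
set_loki_stream_limits() {
  local rate burst
  rate="${1:-6}"
  burst="${2:-30}"
  oc patch lokistack loki -n netobserv --type=merge \
    -p "{\"spec\":{\"limits\":{\"global\":{\"ingestion\":{\"perStreamRateLimit\":${rate},\"perStreamRateLimitBurst\":${burst}}}}}}"
}
```

For example, `set_loki_stream_limits 6 30` would apply the values shown in the procedure.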
