Chapter 3. Troubleshooting OVN-Kubernetes
OVN-Kubernetes has many sources of built-in health checks and logs. Follow the instructions in these sections to examine your cluster. If a support case is necessary, follow the support guide to collect additional information through a must-gather. Use the -- gather_network_logs argument only when instructed by support.
3.1. Monitoring OVN-Kubernetes health by using readiness probes
The ovnkube-control-plane and ovnkube-node pods have containers configured with readiness probes.
Prerequisites
- Access to the OpenShift CLI (oc).
- You have access to the cluster with cluster-admin privileges.
- You have installed jq.
Procedure
- Review the details of the ovnkube-node readiness probe by running the following command:

    $ oc get pods -n openshift-ovn-kubernetes -l app=ovnkube-node \
      -o json | jq '.items[0].spec.containers[] | .name,.readinessProbe'

  The readiness probe for the northbound and southbound database containers in the ovnkube-node pod checks for the health of the databases and the ovnkube-controller container.

  The ovnkube-controller container in the ovnkube-node pod has a readiness probe to verify the presence of the OVN-Kubernetes CNI configuration file, the absence of which would indicate that the pod is not running or is not ready to accept requests to configure pods.

- Show all events, including the probe failures, for the namespace by using the following command:

    $ oc get events -n openshift-ovn-kubernetes

- Show the events for just a specific pod:

    $ oc describe pod ovnkube-node-9lqfk -n openshift-ovn-kubernetes

- Show the messages and statuses from the cluster network operator:

    $ oc get co/network -o json | jq '.status.conditions[]'

- Show the ready status of each container in ovnkube-node pods by running the following script:

    $ for p in $(oc get pods --selector app=ovnkube-node -n openshift-ovn-kubernetes \
      -o jsonpath='{range.items[*]}{" "}{.metadata.name}'); do echo === $p ===; \
      oc get pods -n openshift-ovn-kubernetes $p -o json | jq '.status.containerStatuses[] | .name, .ready'; \
      done

  Note: The expectation is that all container statuses report true. Failure of a readiness probe sets the status to false.
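On a cluster with many nodes, the per-container output of the readiness loop can be long. The following is a minimal sketch that keeps only the containers reporting false. The helper name not_ready_containers and its three-column input format are assumptions made for this illustration; they are not part of oc or OVN-Kubernetes.

```shell
# Hypothetical helper: reads lines of the form "<pod> <container> <ready>"
# on stdin and prints "<pod>/<container>" for every container that is not ready.
not_ready_containers() {
  awk '$3 == "false" { print $1 "/" $2 }'
}

# Possible usage on a live cluster (the jq expression emits one line per container):
#   oc get pods -n openshift-ovn-kubernetes -l app=ovnkube-node -o json \
#     | jq -r '.items[] | .metadata.name as $p
#              | .status.containerStatuses[] | "\($p) \(.name) \(.ready)"' \
#     | not_ready_containers
```

An empty result means that every container in the selected pods reports ready.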
3.2. Viewing OVN-Kubernetes alerts in the console
The Alerting UI provides detailed information about alerts and their governing alerting rules and silences.
Prerequisites
- You have access to the cluster as a developer or as a user with view permissions for the project that you are viewing metrics for.
Procedure (UI)
- In the Administrator perspective, select Observe → Alerting. The three main pages in the Alerting UI in this perspective are the Alerts, Silences, and Alerting Rules pages.
- View the rules for OVN-Kubernetes alerts by selecting Observe → Alerting → Alerting Rules.
3.3. Viewing OVN-Kubernetes alerts in the CLI
You can get information about alerts and their governing alerting rules and silences from the command line.
Prerequisites
- Access to the cluster as a user with the cluster-admin role.
- The OpenShift CLI (oc) installed.
- You have installed jq.
Procedure
- View active or firing alerts by running the following commands.

  - Set the alert manager route environment variable by running the following command:

        $ ALERT_MANAGER=$(oc get route alertmanager-main -n openshift-monitoring \
          -o jsonpath='{@.spec.host}')

  - Issue a curl request to the alert manager route API by running the following command, replacing $ALERT_MANAGER with the URL of your Alertmanager instance:

        $ curl -s -k -H "Authorization: Bearer $(oc create token prometheus-k8s -n openshift-monitoring)" https://$ALERT_MANAGER/api/v1/alerts | jq '.data[] | "\(.labels.severity) \(.labels.alertname) \(.labels.pod) \(.labels.container) \(.labels.endpoint) \(.labels.instance)"'

- View alerting rules by running the following command:

    $ oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -s 'http://localhost:9090/api/v1/rules' | jq '.data.groups[].rules[] | select(((.name|contains("ovn")) or (.name|contains("OVN")) or (.name|contains("Ovn")) or (.name|contains("North")) or (.name|contains("South"))) and .type=="alerting")'
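The alert query above prints one line per alert, beginning with the severity. As a rough sketch, those lines can be tallied by severity to get a quick overview. The helper name count_alerts_by_severity is hypothetical, and the sketch assumes the jq filter is run with the -r option so that the lines are not wrapped in quotation marks.

```shell
# Hypothetical helper: reads "<severity> <alertname> ..." lines on stdin and
# prints one "<severity> <count>" line per severity, sorted by severity name.
count_alerts_by_severity() {
  awk '{ n[$1]++ } END { for (s in n) print s, n[s] }' | sort
}

# Possible usage: append "| count_alerts_by_severity" to the curl pipeline
# after changing "jq" to "jq -r" so that each alert is printed as a raw line.
```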
3.4. Viewing the OVN-Kubernetes logs using the CLI
You can view the logs for each container in the ovnkube-master and ovnkube-node pods by using the OpenShift CLI (oc).
Prerequisites
- Access to the cluster as a user with the cluster-admin role.
- Access to the OpenShift CLI (oc).
- You have installed jq.
Procedure
- View the log for a specific pod:

    $ oc logs -f <pod_name> -c <container_name> -n <namespace>

  where:

  - -f: Optional. Specifies that the output follows what is being written into the logs.
  - <pod_name>: Specifies the name of the pod.
  - <container_name>: Optional. Specifies the name of a container. When a pod has more than one container, you must specify the container name.
  - <namespace>: Specifies the namespace the pod is running in.

  For example:

    $ oc logs ovnkube-node-5dx44 -n openshift-ovn-kubernetes

    $ oc logs -f ovnkube-node-5dx44 -c ovnkube-controller -n openshift-ovn-kubernetes

  The contents of the log files are printed out.

- Examine the most recent entries in all the containers in the ovnkube-node pods:

    $ for p in $(oc get pods --selector app=ovnkube-node -n openshift-ovn-kubernetes \
      -o jsonpath='{range.items[*]}{" "}{.metadata.name}'); \
      do echo === $p ===; for container in $(oc get pods -n openshift-ovn-kubernetes $p \
      -o json | jq -r '.status.containerStatuses[] | .name');do echo ---$container---; \
      oc logs -c $container $p -n openshift-ovn-kubernetes --tail=5; done; done

- View the last 5 lines of every log in every container in an ovnkube-node pod by using the following command:

    $ oc logs -l app=ovnkube-node -n openshift-ovn-kubernetes --all-containers --tail 5
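When scanning the combined logs, it can help to keep only warning and error lines. The following sketch filters for the two line formats that appear in this chapter's example output: klog lines (a severity letter followed by the date, as printed by ovnkube-controller) and OVS-style lines (timestamp|sequence|module|LEVEL|message, as printed by ovn-controller). The helper name log_problems is an assumption for this illustration.

```shell
# Hypothetical helper: keeps only warning/error/fatal klog lines and
# WARN/ERR/EMER OVS-style log lines. Lines may optionally carry the
# "[pod/<pod>/<container>] " prefix that "oc logs --prefix" adds.
log_problems() {
  grep -E '^(\[[^]]*\] )?([WEF][0-9]{4} [0-9]{2}:[0-9]{2}:[0-9]{2}|[^|]*\|[0-9]+\|[^|]*\|(WARN|ERR|EMER)\|)'
}

# Possible usage:
#   oc logs -l app=ovnkube-node -n openshift-ovn-kubernetes \
#     --all-containers --prefix --tail 200 | log_problems
```

Because grep exits with a nonzero status when nothing matches, an empty result with exit status 1 simply means that no warning or error lines were found in the sampled output.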
3.5. Viewing the OVN-Kubernetes logs using the web console
You can view the logs for each container in the ovnkube-master and ovnkube-node pods in the web console.
Prerequisites
- Access to the OpenShift CLI (oc).
Procedure
- In the OpenShift Container Platform console, navigate to Workloads → Pods or navigate to the pod through the resource you want to investigate.
- Select the openshift-ovn-kubernetes project from the drop-down menu.
- Click the name of the pod you want to investigate.
- Click Logs. By default, for the ovnkube-master pod, the logs associated with the northd container are displayed.
- Use the drop-down menu to select logs for each container in turn.
3.5.1. Changing the OVN-Kubernetes log levels
The default log level for OVN-Kubernetes is 4. To debug OVN-Kubernetes, set the log level to 5. Follow this procedure to increase the OVN-Kubernetes log level when you need to debug an issue.
Prerequisites
- You have access to the cluster with cluster-admin privileges.
- You have access to the OpenShift Container Platform web console.
Procedure
- Run the following command to get detailed information for all pods in the OVN-Kubernetes project:

    $ oc get po -o wide -n openshift-ovn-kubernetes

  Example output

    NAME                                     READY   STATUS    RESTARTS       AGE    IP           NODE                                       NOMINATED NODE   READINESS GATES
    ovnkube-control-plane-65497d4548-9ptdr   2/2     Running   2 (128m ago)   147m   10.0.0.3     ci-ln-3njdr9b-72292-5nwkp-master-0         <none>           <none>
    ovnkube-control-plane-65497d4548-j6zfk   2/2     Running   0              147m   10.0.0.5     ci-ln-3njdr9b-72292-5nwkp-master-2         <none>           <none>
    ovnkube-node-5dx44                       8/8     Running   0              146m   10.0.0.3     ci-ln-3njdr9b-72292-5nwkp-master-0         <none>           <none>
    ovnkube-node-dpfn4                       8/8     Running   0              146m   10.0.0.4     ci-ln-3njdr9b-72292-5nwkp-master-1         <none>           <none>
    ovnkube-node-kwc9l                       8/8     Running   0              134m   10.0.128.2   ci-ln-3njdr9b-72292-5nwkp-worker-a-2fjcj   <none>           <none>
    ovnkube-node-mcrhl                       8/8     Running   0              134m   10.0.128.4   ci-ln-3njdr9b-72292-5nwkp-worker-c-v9x5v   <none>           <none>
    ovnkube-node-nsct4                       8/8     Running   0              146m   10.0.0.5     ci-ln-3njdr9b-72292-5nwkp-master-2         <none>           <none>
    ovnkube-node-zrj9f                       8/8     Running   0              134m   10.0.128.3   ci-ln-3njdr9b-72292-5nwkp-worker-b-v78h7   <none>           <none>

- Create a ConfigMap file similar to the following example and use a filename such as env-overrides.yaml:

  Example ConfigMap file

    kind: ConfigMap
    apiVersion: v1
    metadata:
      name: env-overrides
      namespace: openshift-ovn-kubernetes
    data:
      ci-ln-3njdr9b-72292-5nwkp-master-0: |
        # This sets the log level for the ovn-kubernetes node process:
        OVN_KUBE_LOG_LEVEL=5
        # You might also/instead want to enable debug logging for ovn-controller:
        OVN_LOG_LEVEL=dbg
      ci-ln-3njdr9b-72292-5nwkp-master-2: |
        # This sets the log level for the ovn-kubernetes node process:
        OVN_KUBE_LOG_LEVEL=5
        # You might also/instead want to enable debug logging for ovn-controller:
        OVN_LOG_LEVEL=dbg
      _master: |
        # This sets the log level for the ovn-kubernetes master process as well as the ovn-dbchecker:
        OVN_KUBE_LOG_LEVEL=5
        # You might also/instead want to enable debug logging for northd, nbdb and sbdb on all masters:
        OVN_LOG_LEVEL=dbg

- Apply the ConfigMap file by using the following command:

    $ oc apply -n openshift-ovn-kubernetes -f env-overrides.yaml

  Example output

    configmap/env-overrides created

- Restart the ovnkube pods to apply the new log level by using the following commands:

    $ oc delete pod -n openshift-ovn-kubernetes \
      --field-selector spec.nodeName=ci-ln-3njdr9b-72292-5nwkp-master-0 -l app=ovnkube-node

    $ oc delete pod -n openshift-ovn-kubernetes \
      --field-selector spec.nodeName=ci-ln-3njdr9b-72292-5nwkp-master-2 -l app=ovnkube-node

    $ oc delete pod -n openshift-ovn-kubernetes -l app=ovnkube-node

- To verify that the ConfigMap settings have been applied for a specific pod, run the following command:

    $ oc logs -n openshift-ovn-kubernetes --all-containers --prefix ovnkube-node-<xxxx> | grep -E -m 10 '(Logging config:|vconsole|DBG)'

  where:

  - <xxxx>: Specifies the random sequence of letters for a pod from the previous step.

  Example output
    [pod/ovnkube-node-2cpjc/sbdb] + exec /usr/share/ovn/scripts/ovn-ctl --no-monitor '--ovn-sb-log=-vconsole:info -vfile:off -vPATTERN:console:%D{%Y-%m-%dT%H:%M:%S.###Z}|%05N|%c%T|%p|%m' run_sb_ovsdb
    [pod/ovnkube-node-2cpjc/ovnkube-controller] I1012 14:39:59.984506 35767 config.go:2247] Logging config: {File: CNIFile:/var/log/ovn-kubernetes/ovn-k8s-cni-overlay.log LibovsdbFile:/var/log/ovnkube/libovsdb.log Level:5 LogFileMaxSize:100 LogFileMaxBackups:5 LogFileMaxAge:0 ACLLoggingRateLimit:20}
    [pod/ovnkube-node-2cpjc/northd] + exec ovn-northd --no-chdir -vconsole:info -vfile:off '-vPATTERN:console:%D{%Y-%m-%dT%H:%M:%S.###Z}|%05N|%c%T|%p|%m' --pidfile /var/run/ovn/ovn-northd.pid --n-threads=1
    [pod/ovnkube-node-2cpjc/nbdb] + exec /usr/share/ovn/scripts/ovn-ctl --no-monitor '--ovn-nb-log=-vconsole:info -vfile:off -vPATTERN:console:%D{%Y-%m-%dT%H:%M:%S.###Z}|%05N|%c%T|%p|%m' run_nb_ovsdb
    [pod/ovnkube-node-2cpjc/ovn-controller] 2023-10-12T14:39:54.552Z|00002|hmap|DBG|lib/shash.c:114: 1 bucket with 6+ nodes, including 1 bucket with 6 nodes (32 nodes total across 32 buckets)
    [pod/ovnkube-node-2cpjc/ovn-controller] 2023-10-12T14:39:54.553Z|00003|hmap|DBG|lib/shash.c:114: 1 bucket with 6+ nodes, including 1 bucket with 6 nodes (64 nodes total across 64 buckets)
    [pod/ovnkube-node-2cpjc/ovn-controller] 2023-10-12T14:39:54.553Z|00004|hmap|DBG|lib/shash.c:114: 1 bucket with 6+ nodes, including 1 bucket with 7 nodes (32 nodes total across 32 buckets)
    [pod/ovnkube-node-2cpjc/ovn-controller] 2023-10-12T14:39:54.553Z|00005|reconnect|DBG|unix:/var/run/openvswitch/db.sock: entering BACKOFF
    [pod/ovnkube-node-2cpjc/ovn-controller] 2023-10-12T14:39:54.553Z|00007|reconnect|DBG|unix:/var/run/openvswitch/db.sock: entering CONNECTING
    [pod/ovnkube-node-2cpjc/ovn-controller] 2023-10-12T14:39:54.553Z|00008|ovsdb_cs|DBG|unix:/var/run/openvswitch/db.sock: SERVER_SCHEMA_REQUESTED -> SERVER_SCHEMA_REQUESTED at lib/ovsdb-cs.c:423
- Optional: Check that the ConfigMap file has been applied by running the following command:

    $ for f in $(oc -n openshift-ovn-kubernetes get po -l 'app=ovnkube-node' --no-headers -o custom-columns=N:.metadata.name) ; \
      do echo "---- $f ----" ; \
      oc -n openshift-ovn-kubernetes exec -c ovnkube-controller $f -- pgrep -a -f init-ovnkube-controller | grep -P -o '^.*loglevel\s+\d' ; \
      done

  Example output

    ---- ovnkube-node-2dt57 ----
    60981 /usr/bin/ovnkube --init-ovnkube-controller xpst8-worker-c-vmh5n.c.openshift-qe.internal --init-node xpst8-worker-c-vmh5n.c.openshift-qe.internal --config-file=/run/ovnkube-config/ovnkube.conf --ovn-empty-lb-events --loglevel 4
    ---- ovnkube-node-4zznh ----
    178034 /usr/bin/ovnkube --init-ovnkube-controller xpst8-master-2.c.openshift-qe.internal --init-node xpst8-master-2.c.openshift-qe.internal --config-file=/run/ovnkube-config/ovnkube.conf --ovn-empty-lb-events --loglevel 4
    ---- ovnkube-node-548sx ----
    77499 /usr/bin/ovnkube --init-ovnkube-controller xpst8-worker-a-fjtnb.c.openshift-qe.internal --init-node xpst8-worker-a-fjtnb.c.openshift-qe.internal --config-file=/run/ovnkube-config/ovnkube.conf --ovn-empty-lb-events --loglevel 4
    ---- ovnkube-node-6btrf ----
    73781 /usr/bin/ovnkube --init-ovnkube-controller xpst8-worker-b-p8rww.c.openshift-qe.internal --init-node xpst8-worker-b-p8rww.c.openshift-qe.internal --config-file=/run/ovnkube-config/ovnkube.conf --ovn-empty-lb-events --loglevel 4
    ---- ovnkube-node-fkc9r ----
    130707 /usr/bin/ovnkube --init-ovnkube-controller xpst8-master-0.c.openshift-qe.internal --init-node xpst8-master-0.c.openshift-qe.internal --config-file=/run/ovnkube-config/ovnkube.conf --ovn-empty-lb-events --loglevel 5
    ---- ovnkube-node-tk9l4 ----
    181328 /usr/bin/ovnkube --init-ovnkube-controller xpst8-master-1.c.openshift-qe.internal --init-node xpst8-master-1.c.openshift-qe.internal --config-file=/run/ovnkube-config/ovnkube.conf --ovn-empty-lb-events --loglevel 4
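The verification output above is wide, and the log level sits at the end of each line. As a sketch, the output can be condensed to one "pod level" pair per line. The helper name summarize_loglevels is hypothetical, and the sketch assumes the "---- <pod> ----" separator and the --loglevel argument appear exactly as in the example output.

```shell
# Hypothetical helper: reads the "---- <pod> ----" separators and process lines
# printed by the verification loop, and prints "<pod> <loglevel>" for each pod.
summarize_loglevels() {
  awk '
    /^---- .* ----$/ { pod = $2; next }
    match($0, /--loglevel[ \t]+[0-9]+/) {
      # Split the matched "--loglevel N" text and keep the number.
      split(substr($0, RSTART, RLENGTH), parts, /[ \t]+/)
      print pod, parts[2]
    }
  '
}

# Possible usage: pipe the verification loop into the helper, then list the
# pods that are not yet running at log level 5:
#   <verification loop> | summarize_loglevels | awk '$2 != 5'
```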
3.6. Checking the OVN-Kubernetes pod network connectivity
The connectivity check controller, in OpenShift Container Platform 4.10 and later, orchestrates connection verification checks in your cluster. These include checks of the Kubernetes API, the OpenShift API, and individual nodes. The results of the connection tests are stored in PodNetworkConnectivityCheck objects in the openshift-network-diagnostics namespace. Connection tests are performed every minute in parallel.
Prerequisites
- Access to the OpenShift CLI (oc).
- Access to the cluster as a user with the cluster-admin role.
- You have installed jq.
Procedure
- To list the current PodNetworkConnectivityCheck objects, enter the following command:

    $ oc get podnetworkconnectivitychecks -n openshift-network-diagnostics

- View the most recent success for each connection object by using the following command:

    $ oc get podnetworkconnectivitychecks -n openshift-network-diagnostics \
      -o json | jq '.items[]| .spec.targetEndpoint,.status.successes[0]'

- View the most recent failures for each connection object by using the following command:

    $ oc get podnetworkconnectivitychecks -n openshift-network-diagnostics \
      -o json | jq '.items[]| .spec.targetEndpoint,.status.failures[0]'

- View the most recent outages for each connection object by using the following command:

    $ oc get podnetworkconnectivitychecks -n openshift-network-diagnostics \
      -o json | jq '.items[]| .spec.targetEndpoint,.status.outages[0]'

  The connectivity check controller also logs metrics from these checks into Prometheus.

- View all the metrics by running the following command:

    $ oc exec prometheus-k8s-0 -n openshift-monitoring -- \
      promtool query instant http://localhost:9090 \
      '{component="openshift-network-diagnostics"}'

- View the latency between the source pod and the OpenShift API service for the last 5 minutes:

    $ oc exec prometheus-k8s-0 -n openshift-monitoring -- \
      promtool query instant http://localhost:9090 \
      '{component="openshift-network-diagnostics"}'
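The failure query above prints an entry for every connection object, including null for checks that have never failed. A minimal sketch for listing only the checks that have a recorded failure follows. The helper name checks_with_failures is hypothetical. The jsonpath expression in the usage comment assumes that each failure entry carries a time field and that oc prints an empty string when a check has no failures; verify both against your cluster before relying on it.

```shell
# Hypothetical helper: reads "<check-name> <time-of-latest-failure>" lines on
# stdin, where the second column is empty for checks without a failure, and
# prints only the checks that have a recorded failure.
checks_with_failures() {
  awk 'NF >= 2 { print $1, $2 }'
}

# Possible usage (field names are assumptions, see the note above):
#   oc get podnetworkconnectivitychecks -n openshift-network-diagnostics \
#     -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.status.failures[0].time}{"\n"}{end}' \
#     | checks_with_failures
```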
3.7. Checking OVN-Kubernetes network traffic with OVS sampling using the CLI
Checking OVN-Kubernetes network traffic with OVS sampling is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.
For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.
OVN-Kubernetes network traffic can be viewed with OVS sampling via the CLI for the following network APIs:
- NetworkPolicy
- AdminNetworkPolicy
- BaselineNetworkPolicy
- UserDefinedNetwork isolation
- EgressFirewall
- Multicast ACLs
Scripts for these networking events are found in the /usr/bin/ovnkube-observ path of each OVN-Kubernetes node.
Although both the Network Observability Operator and OVS sampling help with debugging, they serve different purposes. The Network Observability Operator is intended for observing network events, whereas checking OVN-Kubernetes network traffic with OVS sampling by using the CLI is intended to help with packet tracing. OVS sampling can be used while the Network Observability Operator is installed, but that is not a requirement.
Administrators can add the -add-ovs-collector option to view network traffic across the node, or pass in additional flags to filter results for specific pods. Additional flags can be found in the "OVN-Kubernetes network traffic with OVS sampling flags" section.
Use the following procedure to view OVN-Kubernetes network traffic using the CLI.
Prerequisites
- You are logged in to the cluster as a user with cluster-admin privileges.
- You have created a source pod and a destination pod and run traffic between them.
- You have created at least one of the following network APIs: NetworkPolicy, AdminNetworkPolicy, BaselineNetworkPolicy, UserDefinedNetwork isolation, multicast, or egress firewalls.
Procedure
- To enable the OVNObservability with OVS sampling feature, enable the TechPreviewNoUpgrade feature set in the FeatureGate CR named cluster by entering the following command:

    $ oc patch --type=merge --patch '{"spec": {"featureSet": "TechPreviewNoUpgrade"}}' featuregate/cluster

  Example output

    featuregate.config.openshift.io/cluster patched

- Confirm that the OVNObservability feature is enabled by entering the following command:

    $ oc get featuregate cluster -o yaml

  Example output

    featureGates:
    # ...
      enabled:
      - name: OVNObservability

- Obtain a list of the pods inside of the namespace in which you have created one of the relevant network APIs by entering the following command. Note the NODE name of the pods, as they are used in the following step.

    $ oc get pods -n <namespace> -o wide

  Example output

    NAME              READY   STATUS    RESTARTS   AGE   IP            NODE                                       NOMINATED NODE   READINESS GATES
    destination-pod   1/1     Running   0          53s   10.131.0.23   ci-ln-1gqp7b2-72292-bb9dv-worker-a-gtmpc   <none>           <none>
    source-pod        1/1     Running   0          56s   10.131.0.22   ci-ln-1gqp7b2-72292-bb9dv-worker-a-gtmpc   <none>           <none>

- Obtain a list of OVN-Kubernetes pods and locate the pod that shares the same NODE as the pods from the previous step by entering the following command:

    $ oc get pods -n openshift-ovn-kubernetes -o wide

  Example output

    NAME                 ... READY   STATUS    RESTARTS      AGE   IP           NODE                                       NOMINATED NODE
    ovnkube-node-jzn5b       8/8     Running   1 (34m ago)   37m   10.0.128.2   ci-ln-1gqp7b2-72292-bb9dv-worker-a-gtmpc   <none>
    ...

- Open a bash shell inside of the ovnkube-node pod by entering the following command:

    $ oc exec -it <pod_name> -n openshift-ovn-kubernetes -- bash

- While inside of the ovnkube-node pod, you can run the ovnkube-observ -add-ovs-collector script to show network events using the OVS collector. For example:

    # /usr/bin/ovnkube-observ -add-ovs-collector

  Example output

    ...
    2024/12/02 19:41:41.327584 OVN-K message: Allowed by default allow from local node policy, direction ingress
    2024/12/02 19:41:41.327593 src=10.131.0.2, dst=10.131.0.6
    2024/12/02 19:41:41.327692 OVN-K message: Allowed by default allow from local node policy, direction ingress
    2024/12/02 19:41:41.327715 src=10.131.0.6, dst=10.131.0.2
    ...

- You can filter the content by type, such as source pods, by entering the following command with the -filter-src-ip flag and your pod's IP address. For example:

    # /usr/bin/ovnkube-observ -add-ovs-collector -filter-src-ip <pod_ip_address>

  Example output

    ...
    Found group packets, id 14
    2024/12/10 16:27:12.456473 OVN-K message: Allowed by admin network policy allow-egress-group1, direction Egress
    2024/12/10 16:27:12.456570 src=10.131.0.22, dst=10.131.0.23
    2024/12/10 16:27:14.484421 OVN-K message: Allowed by admin network policy allow-egress-group1, direction Egress
    2024/12/10 16:27:14.484428 src=10.131.0.22, dst=10.131.0.23
    2024/12/10 16:27:12.457222 OVN-K message: Allowed by network policy test:allow-ingress-from-specific-pod, direction Ingress
    2024/12/10 16:27:12.457228 src=10.131.0.22, dst=10.131.0.23
    2024/12/10 16:27:12.457288 OVN-K message: Allowed by network policy test:allow-ingress-from-specific-pod, direction Ingress
    2024/12/10 16:27:12.457299 src=10.131.0.22, dst=10.131.0.23
    ...

  For a full list of flags that can be passed in with /usr/bin/ovnkube-observ, see "OVN-Kubernetes network traffic with OVS sampling flags".
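The sampling output repeats the same policy message for every sampled packet. As a rough sketch, the messages can be counted to see which policies match most often. The helper name count_observ_messages is an assumption for this illustration, and the sketch relies only on the "OVN-K message: " marker that appears in the example output above.

```shell
# Hypothetical helper: reads ovnkube-observ output on stdin and prints
# "<count> <message>" for each distinct text that follows "OVN-K message: ".
count_observ_messages() {
  awk '
    {
      i = index($0, "OVN-K message: ")
      # 15 is the length of the marker "OVN-K message: ".
      if (i > 0) count[substr($0, i + 15)]++
    }
    END { for (m in count) print count[m], m }
  ' | sort -k2
}

# Possible usage: save the samples to a file first, for example by redirecting
# the ovnkube-observ output, and then run:
#   count_observ_messages < samples.txt
```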
3.7.1. OVN-Kubernetes network traffic with OVS sampling flags
The following flags are available to view OVN-Kubernetes network traffic by using the CLI. Append these flags to the following syntax in your terminal after you have opened a bash shell inside of the ovnkube-node pod:
Command syntax
# /usr/bin/ovnkube-observ <flag>
| Flag | Description |
|---|---|
|  | Returns a complete list of flags that can be used with the command. |
| -add-ovs-collector | Add OVS collector to enable sampling. Use with caution. Make sure no one else is using observability. |
|  | Enrich samples with NBDB data. Defaults to |
|  | Filter only packets to a given destination IP. |
| -filter-src-ip | Filters only packets from a given source IP. |
|  | Print raw sample cookie with psample group_id. |
|  | Output file to write the samples to. |
|  | Print full received packet. When false, only source and destination IPs are printed with every sample. |