Chapter 6. Monitoring and troubleshooting networks
The diagnostic process of monitoring and troubleshooting network connectivity in Red Hat OpenStack Platform is similar to the diagnostic process for physical networks. If you use VLANs, you can consider the virtual infrastructure as a trunked extension of the physical network, rather than a wholly separate environment. There are some differences between troubleshooting an ML2/OVS network and the default, ML2/OVN network.
6.1. Basic ping testing
The ping command is a useful tool for analyzing network connectivity problems. The results serve as a basic indicator of network connectivity, but they do not rule out every connectivity issue, such as a firewall that blocks the actual application traffic. The ping command sends traffic to specific destinations, and then reports whether the attempts were successful.
The ping command is an ICMP operation. To use ping, you must allow ICMP traffic to traverse any intermediary firewalls.
Ping tests are most useful when run from the machine experiencing network issues, so it might be necessary to connect to the command line through the VNC management console if the machine seems to be completely offline.
For example, the following ping test command must validate multiple layers of network infrastructure to succeed: name resolution, IP routing, and network switching must all function correctly.
$ ping www.example.com
PING e1890.b.akamaiedge.net (125.56.247.214) 56(84) bytes of data.
64 bytes from a125-56.247-214.deploy.akamaitechnologies.com (125.56.247.214): icmp_seq=1 ttl=54 time=13.4 ms
64 bytes from a125-56.247-214.deploy.akamaitechnologies.com (125.56.247.214): icmp_seq=2 ttl=54 time=13.5 ms
64 bytes from a125-56.247-214.deploy.akamaitechnologies.com (125.56.247.214): icmp_seq=3 ttl=54 time=13.4 ms
^C
You can terminate the ping command with Ctrl-c, after which a summary of the results is presented. Zero percent packet loss indicates that the connection was stable and did not time out.
--- e1890.b.akamaiedge.net ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2003ms
rtt min/avg/max/mdev = 13.461/13.498/13.541/0.100 ms
The results of a ping test can be very revealing, depending on which destination you test. For example, in the following diagram VM1 is experiencing some form of connectivity issue. The possible destinations are numbered in blue, and the conclusions drawn from a successful or failed result are presented:
The internet - a common first step is to send a ping test to an internet location, such as www.example.com.
- Success: This test indicates that all the various network points in between the machine and the Internet are functioning correctly. This includes the virtual and physical network infrastructure.
- Failure: There are various ways in which a ping test to a distant internet location can fail. If other machines on your network are able to successfully ping the internet, that proves the internet connection is working, and the issue is likely within the configuration of the local machine.
Physical router - This is the router interface that the network administrator designates to direct traffic onward to external destinations.
- Success: Ping tests to the physical router can determine whether the local network and underlying switches are functioning. These packets do not traverse the router, so they do not prove whether there is a routing issue present on the default gateway.
- Failure: This indicates that the problem lies between VM1 and the default gateway. The router/switches might be down, or you may be using an incorrect default gateway. Compare the configuration with that on another server that you know is functioning correctly. Try pinging another server on the local network.
Neutron router - This is the virtual SDN (Software-defined Networking) router that Red Hat OpenStack Platform uses to direct the traffic of virtual machines.
- Success: The firewall allows ICMP traffic, and the Networking node is online.
- Failure: Confirm whether ICMP traffic is permitted in the security group of the instance. Check that the Networking node is online, confirm that all the required services are running, and review the L3 agent log (/var/log/neutron/l3-agent.log).
Physical switch - The physical switch manages traffic between nodes on the same physical network.
- Success: Traffic sent by a VM to the physical switch must pass through the virtual network infrastructure, indicating that this segment is functioning correctly.
- Failure: Check that the physical switch port is configured to trunk the required VLANs.
VM2 - Attempt to ping a VM on the same subnet, on the same Compute node.
- Success: The NIC driver and basic IP configuration on VM1 are functional.
- Failure: Validate the network configuration on VM1. Alternatively, a firewall on VM2 might be blocking ping traffic. In addition, verify the virtual switching configuration and review the Open vSwitch log files.
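The destination-by-destination reasoning above can be sketched as a small shell helper that probes targets from nearest to farthest and reports the first one that does not answer. The function names probe and first_unreachable are illustrative assumptions, not part of RHOSP, and the sketch assumes the iputils version of ping.

```shell
# probe TARGET: succeed if TARGET answers ICMP echo requests.
# Assumes the iputils ping options -c (count) and -W (reply timeout, seconds).
probe() {
  ping -c 3 -W 2 "$1" >/dev/null 2>&1
}

# first_unreachable TARGET...: test the targets in the order given (nearest
# first) and print the first one that fails. Returns 0 if every target answers.
first_unreachable() {
  for target in "$@"; do
    if ! probe "$target"; then
      printf '%s\n' "$target"
      return 1
    fi
  done
  return 0
}
```

For example, running first_unreachable with the address of VM2, the neutron router, the physical router, and www.example.com, in that order, prints the first destination that fails, which narrows down the segment to investigate. A failure can also mean that ICMP is filtered, so treat the result as a starting point only.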
6.2. Viewing current port status
A basic troubleshooting task is to create an inventory of all of the ports attached to a router and determine the port status (DOWN or ACTIVE).
Procedure
To view all the ports that attach to the router named r1, run the following command:
# openstack port list --router r1
Sample output
+--------------------------------------+------+-------------------+--------------------------------------------------------------------------------------+
| id                                   | name | mac_address       | fixed_ips                                                                            |
+--------------------------------------+------+-------------------+--------------------------------------------------------------------------------------+
| b58d26f0-cc03-43c1-ab23-ccdb1018252a |      | fa:16:3e:94:a7:df | {"subnet_id": "a592fdba-babd-48e0-96e8-2dd9117614d3", "ip_address": "192.168.200.1"} |
| c45e998d-98a1-4b23-bb41-5d24797a12a4 |      | fa:16:3e:ee:6a:f7 | {"subnet_id": "43f8f625-c773-4f18-a691-fd4ebfb3be54", "ip_address": "172.24.4.225"}  |
+--------------------------------------+------+-------------------+--------------------------------------------------------------------------------------+
To view the details of each port, run the following command. Include the port ID of the port that you want to view. The result includes the port status, indicated in the following example as having an ACTIVE state:
# openstack port show b58d26f0-cc03-43c1-ab23-ccdb1018252a
Sample output
+-----------------------+--------------------------------------------------------------------------------------+
| Field                 | Value                                                                                |
+-----------------------+--------------------------------------------------------------------------------------+
| admin_state_up        | True                                                                                 |
| allowed_address_pairs |                                                                                      |
| binding:host_id       | node.example.com                                                                     |
| binding:profile       | {}                                                                                   |
| binding:vif_details   | {"port_filter": true, "ovs_hybrid_plug": true}                                       |
| binding:vif_type      | ovs                                                                                  |
| binding:vnic_type     | normal                                                                               |
| device_id             | 49c6ebdc-0e62-49ad-a9ca-58cea464472f                                                 |
| device_owner          | network:router_interface                                                             |
| extra_dhcp_opts       |                                                                                      |
| fixed_ips             | {"subnet_id": "a592fdba-babd-48e0-96e8-2dd9117614d3", "ip_address": "192.168.200.1"} |
| id                    | b58d26f0-cc03-43c1-ab23-ccdb1018252a                                                 |
| mac_address           | fa:16:3e:94:a7:df                                                                    |
| name                  |                                                                                      |
| network_id            | 63c24160-47ac-4140-903d-8f9a670b0ca4                                                 |
| security_groups       |                                                                                      |
| status                | ACTIVE                                                                               |
| tenant_id             | d588d1112e0f496fb6cac22f9be45d49                                                     |
+-----------------------+--------------------------------------------------------------------------------------+
- Perform step 2 for each port to determine its status.
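Repeating the port show command for every port can be scripted. The following is a minimal sketch, assuming that your OpenStack client supports the -f value and -c output options and names the port list column ID (older clients, such as the one in the sample output, might use id). The function name router_port_status is hypothetical.

```shell
# router_port_status ROUTER: print "<port-id> <status>" for every port that
# is attached to ROUTER. Requires sourced OpenStack credentials.
router_port_status() {
  router="$1"
  # Column name "ID" is an assumption; adjust it to match your client.
  openstack port list --router "$router" -f value -c ID |
  while read -r port_id; do
    # Redirect stdin so the inner command cannot consume the list of IDs.
    status=$(openstack port show "$port_id" -f value -c status </dev/null)
    printf '%s %s\n' "$port_id" "$status"
  done
}
```

Running router_port_status r1 lists each port of the router r1 with its status on one line, so any port in the DOWN state stands out.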
6.3. Troubleshooting connectivity to VLAN provider networks
OpenStack Networking can trunk VLAN networks through to the SDN switches. Support for VLAN-tagged provider networks means that virtual instances can integrate with server subnets in the physical network.
Procedure
Ping the gateway with ping <gateway-IP-address>.
Consider this example, in which a network is created with these commands:
# openstack network create --provider-network-type vlan --provider-physical-network phy-eno1 --provider-segment 120 provider
# openstack subnet create --no-dhcp --allocation-pool start=192.168.120.1,end=192.168.120.153 --gateway 192.168.120.254 --network provider public_subnet
In this example, the gateway IP address is 192.168.120.254.
$ ping 192.168.120.254
If the ping fails, do the following:
Confirm that you have network flow for the associated VLAN.
It is possible that the VLAN ID has not been set. In this example, OpenStack Networking is configured to trunk VLAN 120 to the provider network. (See --provider-segment 120 in the example in step 1.)
Confirm the VLAN flow on the bridge interface using the command ovs-ofctl dump-flows <bridge-name>.
In this example the bridge is named br-ex:
# ovs-ofctl dump-flows br-ex
NXST_FLOW reply (xid=0x4):
 cookie=0x0, duration=987.521s, table=0, n_packets=67897, n_bytes=14065247, idle_age=0, priority=1 actions=NORMAL
 cookie=0x0, duration=986.979s, table=0, n_packets=8, n_bytes=648, idle_age=977, priority=2,in_port=12 actions=drop
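When a flow dump is long, it can help to show only the flows that are actually discarding traffic. The following sketch filters dump-flows output for flows with a drop action and a non-zero packet counter. The function name dropped_flows is hypothetical, and a drop flow with hits is not necessarily a fault, so treat the result as a pointer for further inspection.

```shell
# dropped_flows: read `ovs-ofctl dump-flows` output on stdin and print the
# flows whose action is drop and whose n_packets counter is greater than zero.
dropped_flows() {
  awk '/actions=drop/ && match($0, /n_packets=[0-9]+/) {
    count = substr($0, RSTART + 10, RLENGTH - 10)  # skip "n_packets="
    if (count + 0 > 0) print
  }'
}
```

For example, sudo ovs-ofctl dump-flows br-ex | dropped_flows applied to the sample output above prints only the priority=2,in_port=12 flow, which has dropped 8 packets.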
6.4. Reviewing the VLAN configuration and log files
To help validate or troubleshoot a deployment, you can:
- verify the registration and status of Red Hat OpenStack Platform (RHOSP) Networking service (neutron) agents.
- validate network configuration values such as VLAN ranges.
Procedure
Use the openstack network agent list command to verify that the RHOSP Networking service agents are up and registered with the correct host names.
(overcloud)[stack@undercloud~]$ openstack network agent list
+--------------------------------------+--------------------+-----------------------+-------+----------------+
| id                                   | agent_type         | host                  | alive | admin_state_up |
+--------------------------------------+--------------------+-----------------------+-------+----------------+
| a08397a8-6600-437d-9013-b2c5b3730c0c | Metadata agent     | rhelosp.example.com   | :-)   | True           |
| a5153cd2-5881-4fc8-b0ad-be0c97734e6a | L3 agent           | rhelosp.example.com   | :-)   | True           |
| b54f0be7-c555-43da-ad19-5593a075ddf0 | DHCP agent         | rhelosp.example.com   | :-)   | True           |
| d2be3cb0-4010-4458-b459-c5eb0d4d354b | Open vSwitch agent | rhelosp.example.com   | :-)   | True           |
+--------------------------------------+--------------------+-----------------------+-------+----------------+
Review /var/log/containers/neutron/openvswitch-agent.log. Look for confirmation that the creation process used the ovs-ofctl command to configure VLAN trunking.
Validate external_network_bridge in the /etc/neutron/l3_agent.ini file. If there is a hardcoded value in the external_network_bridge parameter, you cannot use a provider network with the L3-agent, and you cannot create the necessary flows. The external_network_bridge value must be in the format external_network_bridge = "".
Check the network_vlan_ranges value in the /etc/neutron/plugin.ini file. For provider networks, do not specify the numeric VLAN ID. Specify IDs only when using VLAN isolated project networks.
Validate the OVS agent configuration file bridge mappings, to confirm that the bridge mapped to phy-eno1 exists and is properly connected to eno1.
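The configuration checks above come down to reading a few key = value settings. The following sketch prints the value of one key from INI-style text. The function name ini_value is hypothetical, and the helper ignores section headers, so it is only suitable for keys that appear once in the file.

```shell
# ini_value KEY: read INI-style text on stdin and print the value assigned to
# KEY. Commented-out lines are skipped because they do not start with KEY.
ini_value() {
  sed -n "s/^[[:space:]]*$1[[:space:]]*=[[:space:]]*//p"
}
```

For example, ini_value network_vlan_ranges < /etc/neutron/plugin.ini prints the configured ranges, and ini_value external_network_bridge < /etc/neutron/l3_agent.ini shows whether a hardcoded bridge is set.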
6.5. Performing basic ICMP testing within the ML2/OVN namespace
As a basic troubleshooting step, you can attempt to ping an instance from an OVN metadata interface that is on the same layer 2 network.
Prerequisites
- RHOSP deployment, with ML2/OVN as the Networking service (neutron) default mechanism driver.
Procedure
Log in to the overcloud using your Red Hat OpenStack Platform credentials.
Run the openstack server list command to obtain the name of a VM instance.
Run the openstack server show command to determine the Compute node on which the instance is running.
Example
$ openstack server show my_instance -c OS-EXT-SRV-ATTR:host \
-c addresses
Sample output
+----------------------+-------------------------------------------------+
| Field                | Value                                           |
+----------------------+-------------------------------------------------+
| OS-EXT-SRV-ATTR:host | compute0.ctlplane.example.com                   |
| addresses            | finance-network1=192.0.2.2; provider-           |
|                      | storage=198.51.100.13                           |
+----------------------+-------------------------------------------------+
Log in to the Compute node host.
Example
$ ssh tripleo-admin@compute0.ctlplane
Run the ip netns list command to see the OVN metadata namespaces.
Sample output
ovnmeta-07384836-6ab1-4539-b23a-c581cf072011 (id: 1)
ovnmeta-df9c28ea-c93a-4a60-b913-1e611d6f15aa (id: 0)
Using the metadata namespace, run an ip netns exec command to ping the associated network.
Example
$ sudo ip netns exec ovnmeta-df9c28ea-c93a-4a60-b913-1e611d6f15aa \
ping 192.0.2.2
Sample output
PING 192.0.2.2 (192.0.2.2) 56(84) bytes of data.
64 bytes from 192.0.2.2: icmp_seq=1 ttl=64 time=0.470 ms
64 bytes from 192.0.2.2: icmp_seq=2 ttl=64 time=0.483 ms
64 bytes from 192.0.2.2: icmp_seq=3 ttl=64 time=0.183 ms
64 bytes from 192.0.2.2: icmp_seq=4 ttl=64 time=0.296 ms
64 bytes from 192.0.2.2: icmp_seq=5 ttl=64 time=0.307 ms
^C
--- 192.0.2.2 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 122ms
rtt min/avg/max/mdev = 0.183/0.347/0.483/0.116 ms
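If a Compute node hosts many namespaces, you might want only the OVN metadata namespace names, without the (id: N) annotation, so that you can pass them to ip netns exec. The following is a minimal sketch; the function name ovnmeta_namespaces is hypothetical.

```shell
# ovnmeta_namespaces: read `ip netns list` output on stdin and print only the
# OVN metadata namespace names, dropping the trailing "(id: N)" annotation.
ovnmeta_namespaces() {
  awk '/^ovnmeta-/ { print $1 }'
}
```

For example, ip netns list | ovnmeta_namespaces prints one namespace name per line. Only the namespace that is attached to the same layer 2 network as the instance is expected to reach it.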
Additional resources
- server show in the Command line interface reference
6.6. Troubleshooting from within project networks (ML2/OVS)
In Red Hat OpenStack Platform (RHOSP) ML2/OVS networks, all project traffic is contained within network namespaces so that projects can configure networks without interfering with each other. For example, network namespaces allow different projects to use the same subnet range, such as 192.168.1.0/24, without interference between them.
Prerequisites
- RHOSP deployment, with ML2/OVS as the Networking service (neutron) default mechanism driver.
Procedure
Determine which network namespace contains the network by listing all of the project networks using the openstack network list command:
$ openstack network list
In this output, note the ID for the web-servers network (9cb32fe0-d7fb-432c-b116-f483c6497b08). The network ID is appended to the name of the network namespace, which enables you to identify the namespace in the next step.
Sample output
+--------------------------------------+-------------+-------------------------------------------------------+
| id                                   | name        | subnets                                               |
+--------------------------------------+-------------+-------------------------------------------------------+
| 9cb32fe0-d7fb-432c-b116-f483c6497b08 | web-servers | 453d6769-fcde-4796-a205-66ee01680bba 192.168.212.0/24 |
| a0cc8cdd-575f-4788-a3e3-5df8c6d0dd81 | private     | c1e58160-707f-44a7-bf94-8694f29e74d3 10.0.0.0/24      |
| baadd774-87e9-4e97-a055-326bb422b29b | private     | 340c58e1-7fe7-4cf2-96a7-96a0a4ff3231 192.168.200.0/24 |
| 24ba3a36-5645-4f46-be47-f6af2a7d8af2 | public      | 35f3d2cb-6e4b-4527-a932-952a395c4bb3 172.24.4.224/28  |
+--------------------------------------+-------------+-------------------------------------------------------+
List all the network namespaces using the ip netns list command:
# ip netns list
The output contains a namespace that matches the web-servers network ID. In this output, the namespace is qdhcp-9cb32fe0-d7fb-432c-b116-f483c6497b08.
Sample output
qdhcp-9cb32fe0-d7fb-432c-b116-f483c6497b08
qrouter-31680a1c-9b3e-4906-bd69-cb39ed5faa01
qrouter-62ed467e-abae-4ab4-87f4-13a9937fbd6b
qdhcp-a0cc8cdd-575f-4788-a3e3-5df8c6d0dd81
qrouter-e9281608-52a6-4576-86a6-92955df46f56
Examine the configuration of the web-servers network by running commands within the namespace, prefixing the troubleshooting commands with ip netns exec <namespace>.
In this example, the route -n command is used.
Example
# ip netns exec qrouter-62ed467e-abae-4ab4-87f4-13a9937fbd6b route -n
Sample output
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         172.24.4.225    0.0.0.0         UG    0      0        0 qg-8d128f89-87
172.24.4.224    0.0.0.0         255.255.255.240 U     0      0        0 qg-8d128f89-87
192.168.200.0   0.0.0.0         255.255.255.0   U     0      0        0 qr-8efd6357-96
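Matching a network ID against the namespace list by eye is error-prone when many namespaces exist. The following sketch automates the lookup described in this procedure. The function name netns_for_network is hypothetical.

```shell
# netns_for_network NETWORK_ID: read `ip netns list` output on stdin and print
# every namespace whose name contains NETWORK_ID.
netns_for_network() {
  awk -v id="$1" 'index($1, id) > 0 { print $1 }'
}
```

For example, ip netns list | netns_for_network 9cb32fe0-d7fb-432c-b116-f483c6497b08 prints the qdhcp namespace of the web-servers network from the sample output.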
6.7. Performing advanced ICMP testing within the namespace (ML2/OVS)
You can troubleshoot Red Hat OpenStack Platform (RHOSP) ML2/OVS networks by using a combination of tcpdump and ping commands.
Prerequisites
- RHOSP deployment, with ML2/OVS as the Networking service (neutron) default mechanism driver.
Procedure
Capture ICMP traffic using the tcpdump command:
Example
# ip netns exec qrouter-62ed467e-abae-4ab4-87f4-13a9937fbd6b tcpdump -qnntpi any icmp
In a separate command line window, perform a ping test to an external network:
Example
# ip netns exec qrouter-62ed467e-abae-4ab4-87f4-13a9937fbd6b ping www.example.com
In the terminal running the tcpdump session, observe detailed results of the ping test.
Sample output
tcpdump: listening on any, link-type LINUX_SLL (Linux cooked), capture size 65535 bytes
IP (tos 0xc0, ttl 64, id 55447, offset 0, flags [none], proto ICMP (1), length 88)
    172.24.4.228 > 172.24.4.228: ICMP host 192.168.200.20 unreachable, length 68
IP (tos 0x0, ttl 64, id 22976, offset 0, flags [DF], proto UDP (17), length 60)
    172.24.4.228.40278 > 192.168.200.21: [bad udp cksum 0xfa7b -> 0xe235!] UDP, length 32
When you perform a tcpdump analysis of traffic, you see the responding packets heading to the router interface rather than to the VM instance. This is expected behavior, as the qrouter performs Destination Network Address Translation (DNAT) on the return packets.
6.8. Creating aliases for OVN troubleshooting commands
You run OVN commands, such as ovn-nbctl show, in the ovn_controller container. The container runs on the Controller node and Compute nodes. To simplify your access to the commands, create and source a script that defines aliases.
Prerequisites
- Red Hat OpenStack Platform deployment with ML2/OVN as the default mechanism driver.
Procedure
Log in to the Controller host as a user that has the necessary privileges to access the OVN containers.
Example
$ ssh tripleo-admin@controller-0.ctlplane
Create a shell script file that contains the ovn commands that you want to run.
Example
$ vi ~/bin/ovn-alias.sh
Add the ovn commands, and save the script file.
Example
In this example, the ovn-sbctl, ovn-nbctl, and ovn-trace commands have been added to an alias file:
REMOTE_IP=$(sudo ovs-vsctl get open . external_ids:ovn-remote)
NBDB=$(echo $REMOTE_IP | sed 's/6642/6641/g')
SBDB=$REMOTE_IP
alias ovn-sbctl="sudo podman exec ovn_controller ovn-sbctl --db=$SBDB"
alias ovn-nbctl="sudo podman exec ovn_controller ovn-nbctl --db=$NBDB"
alias ovn-trace="sudo podman exec ovn_controller ovn-trace --db=$SBDB"
- Repeat the steps in this procedure on the Compute host.
Validation
Source the script file.
Example
# source ~/bin/ovn-alias.sh
Run a command to confirm that your script file works properly.
Example
# ovn-nbctl show
Sample output
switch 26ce22db-1795-41bd-b561-9827cbd81778 (neutron-f8e79863-6c58-43d0-8f7d-8ec4a423e13b) (aka internal_network)
    port 1913c3ae-8475-4b60-a479-df7bcce8d9c8
        addresses: ["fa:16:3e:33:c1:fc 192.168.254.76"]
    port 1aabaee3-b944-4da2-bf0a-573215d3f3d9
        addresses: ["fa:16:3e:16:cb:ce 192.168.254.74"]
    port 7e000980-59f9-4a0f-b76a-4fdf4e86f27b
        type: localport
        addresses: ["fa:16:3e:c9:30:ed 192.168.254.2"]
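The alias script derives the northbound database address from the southbound address by replacing 6642 with 6641 everywhere in the string, which would also alter an IP address that happens to contain 6642. The following sketch is a stricter variant that rewrites only a port at the end of each address. The function name nb_from_sb is hypothetical, and the port numbers are taken from the sed expression in the alias script.

```shell
# nb_from_sb REMOTE: print REMOTE with each trailing ":6642" port replaced by
# ":6641". A port counts as trailing when it is followed by a comma, a double
# quote, or the end of the string.
nb_from_sb() {
  printf '%s\n' "$1" | sed -E 's/:6642(,|"|$)/:6641\1/g'
}
```

In the alias script, this could replace the NBDB assignment, for example NBDB=$(nb_from_sb "$REMOTE_IP").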
Additional resources
- ovn-nbctl --help command
- ovn-sbctl --help command
- ovn-trace --help command
6.9. Monitoring OVN logical flows
OVN uses logical flows, which are tables of flows that each have a priority, a match, and actions. These logical flows are distributed to the ovn-controller running on each Red Hat OpenStack Platform (RHOSP) Compute node. Use the ovn-sbctl lflow-list command on the Controller node to view the full set of logical flows.
Prerequisites
- RHOSP deployment with ML2/OVN as the Networking service (neutron) default mechanism driver.
Create an alias file for the OVN database commands.
See Section 6.8, “Creating aliases for OVN troubleshooting commands”.
Procedure
Log in to the Controller host as a user that has the necessary privileges to access the OVN containers.
Example
$ ssh tripleo-admin@controller-0.ctlplane
Source the alias file for the OVN database commands.
For more information, see Section 6.8, “Creating aliases for OVN troubleshooting commands”.
Example
$ source ~/bin/ovn-alias.sh
View the logical flows:
$ ovn-sbctl lflow-list
Inspect the output.
Sample output
Datapath: "sw0" (d7bf4a7b-e915-4502-8f9d-5995d33f5d10) Pipeline: ingress
  table=0 (ls_in_port_sec_l2 ), priority=100 , match=(eth.src[40]), action=(drop;)
  table=0 (ls_in_port_sec_l2 ), priority=100 , match=(vlan.present), action=(drop;)
  table=0 (ls_in_port_sec_l2 ), priority=50 , match=(inport == "sw0-port1" && eth.src == {00:00:00:00:00:01}), action=(next;)
  table=0 (ls_in_port_sec_l2 ), priority=50 , match=(inport == "sw0-port2" && eth.src == {00:00:00:00:00:02}), action=(next;)
  table=1 (ls_in_port_sec_ip ), priority=0 , match=(1), action=(next;)
  table=2 (ls_in_port_sec_nd ), priority=90 , match=(inport == "sw0-port1" && eth.src == 00:00:00:00:00:01 && arp.sha == 00:00:00:00:00:01), action=(next;)
  table=2 (ls_in_port_sec_nd ), priority=90 , match=(inport == "sw0-port1" && eth.src == 00:00:00:00:00:01 && ip6 && nd && ((nd.sll == 00:00:00:00:00:00 || nd.sll == 00:00:00:00:00:01) || ((nd.tll == 00:00:00:00:00:00 || nd.tll == 00:00:00:00:00:01)))), action=(next;)
  table=2 (ls_in_port_sec_nd ), priority=90 , match=(inport == "sw0-port2" && eth.src == 00:00:00:00:00:02 && arp.sha == 00:00:00:00:00:02), action=(next;)
  table=2 (ls_in_port_sec_nd ), priority=90 , match=(inport == "sw0-port2" && eth.src == 00:00:00:00:00:02 && ip6 && nd && ((nd.sll == 00:00:00:00:00:00 || nd.sll == 00:00:00:00:00:02) || ((nd.tll == 00:00:00:00:00:00 || nd.tll == 00:00:00:00:00:02)))), action=(next;)
  table=2 (ls_in_port_sec_nd ), priority=80 , match=(inport == "sw0-port1" && (arp || nd)), action=(drop;)
  table=2 (ls_in_port_sec_nd ), priority=80 , match=(inport == "sw0-port2" && (arp || nd)), action=(drop;)
  table=2 (ls_in_port_sec_nd ), priority=0 , match=(1), action=(next;)
  table=3 (ls_in_pre_acl ), priority=0 , match=(1), action=(next;)
  table=4 (ls_in_pre_lb ), priority=0 , match=(1), action=(next;)
  table=5 (ls_in_pre_stateful ), priority=100 , match=(reg0[0] == 1), action=(ct_next;)
  table=5 (ls_in_pre_stateful ), priority=0 , match=(1), action=(next;)
  table=6 (ls_in_acl ), priority=0 , match=(1), action=(next;)
  table=7 (ls_in_qos_mark ), priority=0 , match=(1), action=(next;)
  table=8 (ls_in_lb ), priority=0 , match=(1), action=(next;)
  table=9 (ls_in_stateful ), priority=100 , match=(reg0[1] == 1), action=(ct_commit(ct_label=0/1); next;)
  table=9 (ls_in_stateful ), priority=100 , match=(reg0[2] == 1), action=(ct_lb;)
  table=9 (ls_in_stateful ), priority=0 , match=(1), action=(next;)
  table=10(ls_in_arp_rsp ), priority=0 , match=(1), action=(next;)
  table=11(ls_in_dhcp_options ), priority=0 , match=(1), action=(next;)
  table=12(ls_in_dhcp_response), priority=0 , match=(1), action=(next;)
  table=13(ls_in_l2_lkup ), priority=100 , match=(eth.mcast), action=(outport = "_MC_flood"; output;)
  table=13(ls_in_l2_lkup ), priority=50 , match=(eth.dst == 00:00:00:00:00:01), action=(outport = "sw0-port1"; output;)
  table=13(ls_in_l2_lkup ), priority=50 , match=(eth.dst == 00:00:00:00:00:02), action=(outport = "sw0-port2"; output;)
Datapath: "sw0" (d7bf4a7b-e915-4502-8f9d-5995d33f5d10) Pipeline: egress
  table=0 (ls_out_pre_lb ), priority=0 , match=(1), action=(next;)
  table=1 (ls_out_pre_acl ), priority=0 , match=(1), action=(next;)
  table=2 (ls_out_pre_stateful), priority=100 , match=(reg0[0] == 1), action=(ct_next;)
  table=2 (ls_out_pre_stateful), priority=0 , match=(1), action=(next;)
  table=3 (ls_out_lb ), priority=0 , match=(1), action=(next;)
  table=4 (ls_out_acl ), priority=0 , match=(1), action=(next;)
  table=5 (ls_out_qos_mark ), priority=0 , match=(1), action=(next;)
  table=6 (ls_out_stateful ), priority=100 , match=(reg0[1] == 1), action=(ct_commit(ct_label=0/1); next;)
  table=6 (ls_out_stateful ), priority=100 , match=(reg0[2] == 1), action=(ct_lb;)
  table=6 (ls_out_stateful ), priority=0 , match=(1), action=(next;)
  table=7 (ls_out_port_sec_ip ), priority=0 , match=(1), action=(next;)
  table=8 (ls_out_port_sec_l2 ), priority=100 , match=(eth.mcast), action=(output;)
  table=8 (ls_out_port_sec_l2 ), priority=50 , match=(outport == "sw0-port1" && eth.dst == {00:00:00:00:00:01}), action=(output;)
  table=8 (ls_out_port_sec_l2 ), priority=50 , match=(outport == "sw0-port2" && eth.dst == {00:00:00:00:00:02}), action=(output;)
Key differences between OVN and OpenFlow include:
- OVN ports are logical entities that reside somewhere on a network, not physical ports on a single switch.
- OVN gives each table in the pipeline a name in addition to its number. The name describes the purpose of that stage in the pipeline.
- The OVN match syntax supports complex Boolean expressions.
- The actions supported in OVN logical flows extend beyond those of OpenFlow. You can implement higher level features, such as DHCP, in the OVN logical flow syntax.
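Because each table in the pipeline has a name, you can narrow a long lflow-list output to a single stage. The following is a minimal sketch; the function name lflows_in_stage is hypothetical, and it assumes that the stage name is the first parenthesized word on each flow line, as in the sample output.

```shell
# lflows_in_stage STAGE: read `ovn-sbctl lflow-list` output on stdin and print
# the flows that belong to the named stage, for example ls_in_l2_lkup.
lflows_in_stage() {
  awk -v stage="$1" '{
    # The stage name follows "(" and can be padded with spaces before ")".
    if (match($0, /\([a-z0-9_]+ *\)/)) {
      name = substr($0, RSTART + 1, RLENGTH - 2)
      sub(/ +$/, "", name)
      if (name == stage) print
    }
  }'
}
```

For example, ovn-sbctl lflow-list | lflows_in_stage ls_in_l2_lkup shows only the layer 2 lookup flows, which indicate the output port that is chosen for each destination MAC address.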
Run an OVN trace.
The ovn-trace command can simulate how a packet travels through the OVN logical flows, or help you determine why a packet is dropped. Provide the ovn-trace command with the following parameters:
- DATAPATH: The logical switch or logical router where the simulated packet starts.
- MICROFLOW: The simulated packet, in the syntax used by the ovn-sb database.
Example
This example displays the --minimal output option on a simulated packet and shows that the packet reaches its destination:
$ ovn-trace --minimal sw0 'inport == "sw0-port1" && eth.src == 00:00:00:00:00:01 && eth.dst == 00:00:00:00:00:02'
Sample output
# reg14=0x1,vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,dl_type=0x0000
output("sw0-port2");
Example
In more detail, the --summary output for this same simulated packet shows the full execution pipeline:
$ ovn-trace --summary sw0 'inport == "sw0-port1" && eth.src == 00:00:00:00:00:01 && eth.dst == 00:00:00:00:00:02'
Sample output
# reg14=0x1,vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,dl_type=0x0000
ingress(dp="sw0", inport="sw0-port1") {
    outport = "sw0-port2";
    output;
    egress(dp="sw0", inport="sw0-port1", outport="sw0-port2") {
        output;
        /* output to "sw0-port2", type "" */;
    };
};
The sample output shows:
- The packet enters the sw0 network from the sw0-port1 port and runs the ingress pipeline.
- The outport variable is set to sw0-port2, indicating that the intended destination for this packet is sw0-port2.
- The packet is output from the ingress pipeline, which brings it to the egress pipeline for sw0 with the outport variable set to sw0-port2.
- The output action is executed in the egress pipeline, which outputs the packet to the current value of the outport variable, which is sw0-port2.
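When you run many traces, for example in a script that checks several source and destination pairs, it can be convenient to reduce the --minimal output to the destination port alone. The following sketch does that; the function name trace_outports is hypothetical, and it assumes that output actions appear on their own line in the form shown in the --minimal sample output.

```shell
# trace_outports: read `ovn-trace --minimal` output on stdin and print the
# logical port named in each output("...") action, one port per line.
trace_outports() {
  sed -n 's/^[[:space:]]*output("\([^"]*\)");.*/\1/p'
}
```

An empty result suggests that the trace did not end in an output action, in which case the --summary or default detailed output is the place to look for the flow that dropped the packet.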
Additional resources
- Section 6.8, “Creating aliases for OVN troubleshooting commands”
- ovn-sbctl --help command
- ovn-trace --help command
6.10. Monitoring OpenFlows
You can use the ovs-ofctl dump-flows command to monitor the OpenFlow flows on a logical switch in your Red Hat OpenStack Platform (RHOSP) network.
Prerequisites
- RHOSP deployment with ML2/OVN as the Networking service (neutron) default mechanism driver.
Procedure
Log in to the Controller host as a user that has the necessary privileges to access the OVN containers.
Example
$ ssh tripleo-admin@controller-0.ctlplane
Run the ovs-ofctl dump-flows command.
Example
$ sudo ovs-ofctl dump-flows br-int
Inspect the output, which resembles the following output.
Sample output
$ ovs-ofctl dump-flows br-int
NXST_FLOW reply (xid=0x4):
 cookie=0x0, duration=72.132s, table=0, n_packets=0, n_bytes=0, idle_age=72, priority=10,in_port=1,dl_src=00:00:00:00:00:01 actions=resubmit(,1)
 cookie=0x0, duration=60.565s, table=0, n_packets=0, n_bytes=0, idle_age=60, priority=10,in_port=2,dl_src=00:00:00:00:00:02 actions=resubmit(,1)
 cookie=0x0, duration=28.127s, table=0, n_packets=0, n_bytes=0, idle_age=28, priority=0 actions=drop
 cookie=0x0, duration=13.887s, table=1, n_packets=0, n_bytes=0, idle_age=13, priority=0,in_port=1 actions=output:2
 cookie=0x0, duration=4.023s, table=1, n_packets=0, n_bytes=0, idle_age=4, priority=0,in_port=2 actions=output:1
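On a busy integration bridge the flow dump can run to thousands of lines. A per-table count gives a quick overview before you inspect individual flows. The following is a minimal sketch; the function name flows_per_table is hypothetical.

```shell
# flows_per_table: read `ovs-ofctl dump-flows` output on stdin and print one
# "<table-number> <flow-count>" line per OpenFlow table, in numeric order.
flows_per_table() {
  sed -n 's/.*table=\([0-9][0-9]*\),.*/\1/p' |
    sort -n |
    uniq -c |
    awk '{ print $2, $1 }'
}
```

For the sample output above, sudo ovs-ofctl dump-flows br-int | flows_per_table reports three flows in table 0 and two flows in table 1.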
Additional resources
- ovs-ofctl --help command
6.11. Monitoring OVN database status
You can use the ovs-appctl command to monitor connections between OVN database servers.
Prerequisites
- RHOSP deployment with ML2/OVN as the Networking service (neutron) default mechanism driver.
Procedure
Log in to a Controller host as a user that has the necessary privileges to access the OVN containers.
Monitoring from a server on a single Controller host provides the information that you need to verify basic cluster health and to diagnose many types of problems. For a thorough analysis, perform this procedure on all Controllers.
Example
$ ssh tripleo-admin@controller-0.ctlplane
Run the ovs-appctl command.
Example: northbound database
$ ovs-appctl -t /var/lib/openvswitch/ovn/ovnnb_db.ctl cluster/status OVN_Northbound
Example: southbound database
$ ovs-appctl -t /var/lib/openvswitch/ovn/ovnsb_db.ctl cluster/status OVN_Southbound
Inspect the output, which resembles the following output.
Sample output: southbound database
This sample output was generated on server 1114, which was a follower at the time.
1114
Name: OVN_Southbound
Cluster ID: 017a (017add73-58f1-4fcd-ae35-bacc0f07ce57)
Server ID: 1114 (1114865d-4f42-443a-b758-d4431fc35748)
Address: tcp:[fd00:fd00:fd00:2000::4a]:6644
Status: cluster member
Role: follower
Term: 90
Leader: ca6e
Vote: ca6e
Last Election started 27881511 ms ago, reason: leadership_transfer
Last Election won: 27881503 ms ago
Election timer: 16000
Log: [51470, 51737]
Entries not yet committed: 0
Entries not yet applied: 0
Connections: ->ca6e ->0f90 <-ca6e <-0f90
Disconnections: 0
Servers:
    1114 (1114 at tcp:[fd00:fd00:fd00:2000::4a]:6644) (self)
    ca6e (ca6e at tcp:[fd00:fd00:fd00:2000::18f]:6644) last msg 5141 ms ago
    0f90 (0f90 at tcp:[fd00:fd00:fd00:2000::2e0]:6644) last msg 22106129 ms ago
Diagnostic indications from sample output
A right-pointing arrow (->) represents an outbound connection from this server to another server. A left-pointing arrow (<-) represents an inbound connection from another server to this server.
- All servers are active and connected
Connections: ->ca6e ->0f90 <-ca6e <-0f90
This three-node cluster appears healthy. The server 1114 has inbound and outbound connections with the other two servers, ca6e and 0f90.
- A server is disconnected from the cluster
Connections: ->ca6e (->0f90) <-ca6e
The incoming connection from server 0f90 is not listed. The parentheses around the outgoing connection indicate that outbound messages to 0f90 failed. For most situations, connecting to any server in the cluster provides enough information to determine whether there are issues with the cluster. Running the diagnostics on all servers provides more detailed information and might detect problems that you cannot detect from a single server.
- The cluster has lost quorum
Role: candidate ... Leader: unknown
This server is a candidate and the leader is unknown.
- The ovsdb-server is down on this node
2024-03-27T22:10:28Z|00001|unixctl|WARN|failed to connect to /var/lib/openvswitch/ovn/ovnsb_db.ctl
ovs-appctl: cannot connect to "/var/lib/openvswitch/ovn/ovnsb_db.ctl" (Connection refused)
<exits with non-zero status>
In this case, you cannot get all the information you need from a single server. For example, you cannot determine whether the other servers are running. If the server is down, run ovs-appctl on another server.
- Time since last message to leader from each follower (only updated on leader)
Servers:
    1114 (1114 at tcp:[fd00:fd00:fd00:2000::4a]:6644) next_index=51737 match_index=51736 last msg 224 ms ago
    ca6e (ca6e at tcp:[fd00:fd00:fd00:2000::18f]:6644) (self) next_index=51470 match_index=51736
    0f90 (0f90 at tcp:[fd00:fd00:fd00:2000::2e0]:6644) next_index=51737 match_index=51736 last msg 224 ms ago
Log on to the cluster leader host and run ovs-appctl. Note that a new leader can be elected at any time.
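The diagnostic indications above can be condensed into a rough one-word summary for use in a monitoring script. The following sketch reads cluster/status output and applies only the rules described in this section: a candidate role or an unknown leader is reported as lost quorum, and a parenthesized entry on the Connections line is reported as a degraded connection. The function name cluster_health and the three result words are hypothetical.

```shell
# cluster_health: read `ovs-appctl ... cluster/status` output on stdin and
# print "no-quorum", "degraded", or "ok" based on the Role, Leader, and
# Connections lines.
cluster_health() {
  awk '
    { sub(/^[ \t]+/, "") }                       # tolerate indented output
    /^Role: /        { role = substr($0, 7) }    # text after "Role: "
    /^Leader: /      { leader = substr($0, 9) }  # text after "Leader: "
    /^Connections: / { if (index($0, "(") > 0) failed = 1 }
    END {
      if (role == "candidate" || leader == "unknown") print "no-quorum"
      else if (failed) print "degraded"
      else print "ok"
    }'
}
```

For example, pipe the output of the southbound cluster/status command shown earlier into cluster_health. If ovs-appctl itself cannot connect, it exits with a non-zero status, so check its exit status separately.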
Additional resources
- ovs-appctl --help command
6.12. Validating your ML2/OVN deployment
Validating the ML2/OVN networks on your Red Hat OpenStack Platform (RHOSP) deployment consists of creating a test network and subnet and performing diagnostic tasks such as verifying that specific containers are running.
Prerequisites
- New deployment of RHOSP, with ML2/OVN as the Networking service (neutron) default mechanism driver.
Create an alias file for the OVN database commands.
See Section 6.8, “Creating aliases for OVN troubleshooting commands”.
Procedure
Create a test network and subnet.
NETWORK_ID=\
$(openstack network create internal_network | awk '/\| id/ {print $4}')
openstack subnet create internal_subnet \
  --network $NETWORK_ID \
  --dns-nameserver 8.8.8.8 \
  --subnet-range 192.168.254.0/24
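The awk filter in the command above depends on the table layout that the client prints. The following sketch shows which row and field the pattern selects. `extract_id` is a hypothetical helper name, and the UUID in the sample table is a made-up placeholder, not real output.

```shell
# Hypothetical helper: print the value of the "id" row from an
# OpenStack CLI table read on stdin. Same awk filter as the procedure.
extract_id() {
    awk '/\| id/ {print $4}'
}

# Illustrative table only; the UUID is a placeholder.
printf '%s\n' \
  '+-------+--------------------------------------+' \
  '| Field | Value                                |' \
  '+-------+--------------------------------------+' \
  '| id    | 11111111-2222-3333-4444-555555555555 |' \
  '| name  | internal_network                     |' \
  '+-------+--------------------------------------+' | extract_id
```

If the filter returns nothing, NETWORK_ID is empty and the subnet create command fails, so it can help to echo the variable before continuing. As an alternative that avoids table parsing, the client's value formatter (`-f value -c id`) prints only the requested column.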
If you encounter errors, perform the steps that follow.
Verify that the relevant containers are running on the Controller host:
Log in to the Controller host as a user that has the necessary privileges to access the OVN containers.
Example
$ ssh tripleo-admin@controller-0.ctlplane
Enter the following command:
$ sudo podman ps -a --format="{{.Names}}"|grep ovn
As shown in the following sample, the output should list the OVN containers:
Sample output
container-puppet-ovn_controller
ovn_cluster_north_db_server
ovn_cluster_south_db_server
ovn_cluster_northd
ovn_controller
Verify that the relevant containers are running on the Compute host:
Log in to the Compute host as a user that has the necessary privileges to access the OVN containers.
Example
$ ssh tripleo-admin@compute-0.ctlplane
Enter the following command:
$ sudo podman ps -a --format="{{.Names}}"|grep ovn
As shown in the following sample, the output should list the OVN containers:
Sample output
container-puppet-ovn_controller
ovn_metadata_agent
ovn_controller
Inspect log files for error messages.
grep -r ERR /var/log/containers/openvswitch/ /var/log/containers/neutron/
Source an alias file to run the OVN database commands.
For more information, see Section 6.8, “Creating aliases for OVN troubleshooting commands”.
Example
$ source ~/ovn-alias.sh
Query the northbound and southbound databases to check for responsiveness.
# ovn-nbctl show
# ovn-sbctl show
Attempt to ping an instance from an OVN metadata interface that is on the same layer 2 network.
For more information, see Section 6.5, “Performing basic ICMP testing within the ML2/OVN namespace”.
- If you need to contact Red Hat for support, perform the steps described in this Red Hat Solution, How to collect all required logs for Red Hat Support to investigate an OpenStack issue.
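The container checks in this procedure can be scripted so that a missing container stands out. The following is a minimal sketch under stated assumptions: `missing_containers` is a hypothetical helper name, and the expected names are taken from the sample output above and might differ in your deployment.

```shell
# Hypothetical helper: read container names on stdin (one per line) and
# report each expected name, passed as an argument, that is absent.
# The body runs in a subshell, so "exit" sets the return status.
missing_containers() (
    names=$(cat)
    status=0
    for expected in "$@"; do
        # -x matches whole lines only, -F treats the name as a fixed string.
        if ! printf '%s\n' "$names" | grep -qxF -- "$expected"; then
            echo "missing: $expected"
            status=1
        fi
    done
    exit "$status"
)

# Example use on a Compute host (requires podman, so it is commented out):
# sudo podman ps --format "{{.Names}}" | missing_containers ovn_controller ovn_metadata_agent
```

Note that `podman ps -a` also lists stopped containers. Omitting `-a`, as in the commented example, restricts the list to running containers.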
Additional resources
- network create in the Command line interface reference
- subnet create in the Command line interface reference
- Section 6.8, “Creating aliases for OVN troubleshooting commands”
- ovn-nbctl --help command
- ovn-sbctl --help command
6.13. Setting the logging mode for ML2/OVN
Set ML2/OVN logging to debug mode for additional troubleshooting information. Set logging back to info mode to use less disk space when you do not need additional debugging information.
Prerequisites
- Red Hat OpenStack Platform deployment with ML2/OVN as the default mechanism driver.
Procedure
Log in to the Controller or Compute node where you want to set the logging mode as a user that has the necessary privileges to access the OVN containers.
Example
$ ssh tripleo-admin@controller-0.ctlplane
Set the ML2/OVN logging mode.
- Debug logging mode
$ sudo podman exec -it ovn_controller ovn-appctl -t ovn-controller vlog/set dbg
- Info logging mode
$ sudo podman exec -it ovn_controller ovn-appctl -t ovn-controller vlog/set info
Verification
Confirm that the ovn-controller container log now contains debug messages:
$ sudo grep DBG /var/log/containers/openvswitch/ovn-controller.log
Sample output
You should see recent log messages that contain the string |DBG|:
2022-09-29T20:52:54.638Z|00170|vconn(ovn_pinctrl0)|DBG|unix:/var/run/openvswitch/br-int.mgmt: received: OFPT_ECHO_REQUEST (OF1.5) (xid=0x0): 0 bytes of payload
2022-09-29T20:52:54.638Z|00171|vconn(ovn_pinctrl0)|DBG|unix:/var/run/openvswitch/br-int.mgmt: sent (Success): OFPT_ECHO_REPLY (OF1.5) (xid=0x0): 0 bytes of payload
Confirm that the ovn-controller container log contains a string similar to the following:
...received request vlog/set["info"], id=0
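The grep in the verification matches debug messages anywhere in the log, including old ones. As a rough sketch, the following helper counts messages of a given level in only the most recent lines. `count_recent_level` is a hypothetical name, and the default of 200 lines is an arbitrary assumption.

```shell
# Hypothetical helper: count log lines of a given level among the last
# N lines of a file.
#   $1 = log file, $2 = level (for example DBG), $3 = lines to inspect
count_recent_level() {
    # grep -c prints 0 but exits non-zero when nothing matches;
    # "|| true" keeps the function's status at zero in that case.
    tail -n "${3:-200}" "$1" | grep -c -F "|$2|" || true
}

# Example use (requires read access to the log, so it is commented out):
# sudo sh -c '...' or run as root:
# count_recent_level /var/log/containers/openvswitch/ovn-controller.log DBG
```

A count of zero shortly after switching to debug mode suggests that the change did not take effect on that node.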
6.14. Fixing OVN controllers that fail to register on edge sites
- Issue
OVN controllers on Red Hat OpenStack Platform (RHOSP) edge sites fail to register.
Note: This error can occur on RHOSP 17.1 ML2/OVN deployments that were updated from an earlier RHOSP version: RHOSP 16.1.7 and earlier, or RHOSP 16.2.0.
- Sample error
The error encountered is similar to the following:
2021-04-12T09:14:48.994Z|04754|ovsdb_idl|WARN|transaction error: {"details":"Transaction causes multiple rows in \"Encap\" table to have identical values (geneve and \"10.14.2.7\") for index on columns \"type\" and \"ip\". First row, with UUID 3973cad5-eb8a-4f29-85c3-c105d861c0e0, was inserted by this transaction. Second row, with UUID f06b71a8-4162-475b-8542-d27db3a9097a, existed in the database before this transaction and was not modified by the transaction.","error":"constraint violation"}
- Cause
If the ovn-controller process replaces the hostname, it registers another chassis entry, which includes another encap entry. For more information, see BZ#1948472.
- Resolution
Follow these steps to resolve the problem:
If you have not already, create aliases for the necessary OVN database commands that you will use later in this procedure.
For more information, see Creating aliases for OVN troubleshooting commands.
Log in to the Controller host as a user that has the necessary privileges to access the OVN containers.
Example
$ ssh tripleo-admin@controller-0.ctlplane
Obtain the IP address from the /var/log/containers/openvswitch/ovn-controller.log file.
Confirm that the IP address is correct:
ovn-sbctl list encap |grep -a3 <IP address from ovn-controller.log>
Delete the chassis that contains the IP address:
ovn-sbctl chassis-del <chassis-id>
Check the Chassis_Private table to confirm that the chassis has been removed:
ovn-sbctl find Chassis_Private chassis="[]"
If any entries are reported, remove them with the following command:
$ ovn-sbctl destroy Chassis_Private <listed_id>
Restart the following containers:
- tripleo_ovn_controller
- tripleo_ovn_metadata_agent
$ sudo systemctl restart tripleo_ovn_controller
$ sudo systemctl restart tripleo_ovn_metadata_agent
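The step that obtains the IP address from the log can be sketched as a small filter. This is an illustration only: `encap_conflict_ips` is a hypothetical helper name, and it assumes that the warning has the same shape as the sample error above and that the address is IPv4.

```shell
# Hypothetical helper: read ovn-controller log text on stdin and print
# the IPv4 addresses named in Encap constraint-violation warnings.
encap_conflict_ips() {
    grep -F 'Encap' \
        | grep -oE 'identical values \([^)]*\)' \
        | grep -oE '([0-9]{1,3}\.){3}[0-9]{1,3}' \
        | sort -u
}

# Example use (requires read access to the log, so it is commented out):
# encap_conflict_ips < /var/log/containers/openvswitch/ovn-controller.log
```

Restricting the search to the parenthesized "identical values" portion avoids matching timestamps or other numbers elsewhere in the line.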
Verification
Confirm that OVN agents are running:
$ openstack network agent list -c "Agent Type" -c State -c Binary
Sample output
+------------------------------+-------+----------------------------+
| Agent Type                   | State | Binary                     |
+------------------------------+-------+----------------------------+
| OVN Controller Gateway agent | UP    | ovn-controller             |
| OVN Controller Gateway agent | UP    | ovn-controller             |
| OVN Controller agent         | UP    | ovn-controller             |
| OVN Metadata agent           | UP    | neutron-ovn-metadata-agent |
| OVN Controller Gateway agent | UP    | ovn-controller             |
+------------------------------+-------+----------------------------+
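On deployments with many agents, scanning the table by eye is error-prone. The following sketch filters the table for agents whose state is not UP. `agents_not_up` is a hypothetical helper name, and it assumes the three-column layout shown in the sample output (Agent Type, State, Binary).

```shell
# Hypothetical helper: read the agent table on stdin and print every
# agent whose State column is not UP. Returns non-zero if any is found.
agents_not_up() {
    awk -F'|' '
        # Data rows have at least four pipe-separated fields;
        # border rows have one, and the header row is skipped by name.
        NF >= 4 && $2 !~ /Agent Type/ {
            type = $2; state = $3
            gsub(/^ +| +$/, "", type)
            gsub(/^ +| +$/, "", state)
            if (state != "UP") { print type ": " state; bad = 1 }
        }
        END { exit bad + 0 }
    '
}

# Example use (requires the openstack client, so it is commented out):
# openstack network agent list -c "Agent Type" -c State -c Binary | agents_not_up
```

No output and a zero return status indicate that every listed agent reports UP.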
6.15. ML2/OVN log files
Log files track events related to the deployment and operation of the ML2/OVN mechanism driver.
Nodes | Log | Path /var/log/containers/openvswitch... |
---|---|---|
Controller, Compute, Networking | OVS northbound database server | |
Controller | OVS northbound database server | |
Controller | OVS southbound database server | |
Controller | OVN northbound database server | |