Chapter 6. Monitoring and Troubleshooting networks

The diagnostic process of monitoring and troubleshooting network connectivity in Red Hat OpenStack Platform is similar to the diagnostic process for physical networks. If you use VLANs, you can consider the virtual infrastructure as a trunked extension of the physical network, rather than a wholly separate environment. There are some differences between troubleshooting an ML2/OVS network and the default, ML2/OVN network.

6.1. Basic ping testing

The ping command is a useful tool for analyzing network connectivity problems. The results serve as a basic indicator of network connectivity, but might not entirely exclude all connectivity issues, such as a firewall blocking the actual application traffic. The ping command sends traffic to specific destinations, and then reports back whether the attempts were successful.

Note

The ping command is an ICMP operation. To use ping, you must allow ICMP traffic to traverse any intermediary firewalls.

Ping tests are most useful when run from the machine experiencing network issues, so it may be necessary to connect to the command line via the VNC management console if the machine seems to be completely offline.

For example, the following ping test command must validate multiple layers of network infrastructure in order to succeed: name resolution, IP routing, and network switching must all function correctly.

$ ping www.example.com

PING e1890.b.akamaiedge.net (125.56.247.214) 56(84) bytes of data.
64 bytes from a125-56.247-214.deploy.akamaitechnologies.com (125.56.247.214): icmp_seq=1 ttl=54 time=13.4 ms
64 bytes from a125-56.247-214.deploy.akamaitechnologies.com (125.56.247.214): icmp_seq=2 ttl=54 time=13.5 ms
64 bytes from a125-56.247-214.deploy.akamaitechnologies.com (125.56.247.214): icmp_seq=3 ttl=54 time=13.4 ms
^C

You can terminate the ping command with Ctrl-c, after which a summary of the results is presented. Zero percent packet loss indicates that the connection was stable and did not time out.

--- e1890.b.akamaiedge.net ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2003ms
rtt min/avg/max/mdev = 13.461/13.498/13.541/0.100 ms
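When you script this check, you can pull the packet loss figure out of the summary line. The following is a minimal sketch, not part of the product tooling: the function name packet_loss is hypothetical, and it assumes the Linux iputils summary format shown in the sample output.

```shell
# Extract the packet loss percentage from the summary that ping prints.
# Assumes the iputils format: "... 3 received, 0% packet loss, time 2003ms".
# packet_loss is a hypothetical helper name used only for illustration.
packet_loss() {
    sed -n 's/.* \([0-9.][0-9.]*\)% packet loss.*/\1/p'
}
```

For example, ping -c 3 www.example.com | packet_loss prints only the percentage, which a wrapper script can compare against a threshold.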

The results of a ping test can be very revealing, depending on which destination you test. For example, in the following diagram VM1 is experiencing some form of connectivity issue. The possible destinations are numbered in blue, and the conclusions drawn from a successful or failed result are presented:

A sample network
  1. The internet - a common first step is to send a ping test to an internet location, such as www.example.com.

    • Success: This test indicates that all the network points between the machine and the internet are functioning correctly. This includes the virtual and physical network infrastructure.
    • Failure: There are various ways in which a ping test to a distant internet location can fail. If other machines on your network are able to successfully ping the internet, that proves the internet connection is working, and the issue is likely within the configuration of the local machine.
  2. Physical router - This is the router interface that the network administrator designates to direct traffic onward to external destinations.

    • Success: Ping tests to the physical router can determine whether the local network and underlying switches are functioning. These packets do not traverse the router, so they do not prove whether there is a routing issue present on the default gateway.
    • Failure: This indicates that the problem lies between VM1 and the default gateway. The router/switches might be down, or you may be using an incorrect default gateway. Compare the configuration with that on another server that you know is functioning correctly. Try pinging another server on the local network.
  3. Neutron router - This is the virtual SDN (Software-defined Networking) router that Red Hat OpenStack Platform uses to direct the traffic of virtual machines.

    • Success: The firewall is allowing ICMP traffic, and the Networking node is online.
    • Failure: Confirm whether ICMP traffic is permitted in the security group of the instance. Check that the Networking node is online, confirm that all the required services are running, and review the L3 agent log (/var/log/neutron/l3-agent.log).
  4. Physical switch - The physical switch manages traffic between nodes on the same physical network.

    • Success: Traffic sent by a VM to the physical switch must pass through the virtual network infrastructure, indicating that this segment is functioning correctly.
    • Failure: Check that the physical switch port is configured to trunk the required VLANs.
  5. VM2 - Attempt to ping a VM on the same subnet, on the same Compute node.

    • Success: The NIC driver and basic IP configuration on VM1 are functional.
    • Failure: Validate the network configuration on VM1. Alternatively, the firewall on VM2 might be blocking ping traffic. In addition, verify the virtual switching configuration and review the Open vSwitch log files.
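The destination-by-destination approach described above can be sketched as a small loop that stops at the first destination that does not answer. This is an illustrative sketch only: the function name check_path and the PING override variable are hypothetical, and the order in which you list the destinations is your own choice.

```shell
# Ping each destination in the order given and stop at the first failure.
# PING is a hypothetical override hook; by default the system ping is used.
# -c 3 sends three probes and -W 2 waits up to two seconds for each reply.
check_path() {
    local target
    for target in "$@"; do
        if "${PING:-ping}" -c 3 -W 2 "$target" >/dev/null 2>&1; then
            printf 'ok %s\n' "$target"
        else
            printf 'FAIL %s\n' "$target"
            return 1
        fi
    done
}
```

For example, check_path <VM2-IP> <neutron-router-IP> <gateway-IP> www.example.com reports the first hop that fails, which narrows down the segment to investigate.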

6.2. Viewing current port status

A basic troubleshooting task is to create an inventory of all of the ports attached to a router and determine the port status (DOWN or ACTIVE).

Procedure

  1. To view all the ports that attach to the router named r1, run the following command:

    # openstack port list --router r1

    Sample output

    +--------------------------------------+------+-------------------+--------------------------------------------------------------------------------------+
    | id                                   | name | mac_address       | fixed_ips                                                                            |
    +--------------------------------------+------+-------------------+--------------------------------------------------------------------------------------+
    | b58d26f0-cc03-43c1-ab23-ccdb1018252a |      | fa:16:3e:94:a7:df | {"subnet_id": "a592fdba-babd-48e0-96e8-2dd9117614d3", "ip_address": "192.168.200.1"} |
    | c45e998d-98a1-4b23-bb41-5d24797a12a4 |      | fa:16:3e:ee:6a:f7 | {"subnet_id": "43f8f625-c773-4f18-a691-fd4ebfb3be54", "ip_address": "172.24.4.225"}  |
    +--------------------------------------+------+-------------------+--------------------------------------------------------------------------------------+

  2. To view the details of each port, run the following command. Include the port ID of the port that you want to view. The result includes the port status, indicated in the following example as having an ACTIVE state:

    # openstack port show b58d26f0-cc03-43c1-ab23-ccdb1018252a

    Sample output

    +-----------------------+--------------------------------------------------------------------------------------+
    | Field                 | Value                                                                                |
    +-----------------------+--------------------------------------------------------------------------------------+
    | admin_state_up        | True                                                                                 |
    | allowed_address_pairs |                                                                                      |
    | binding:host_id       | node.example.com                                                                     |
    | binding:profile       | {}                                                                                   |
    | binding:vif_details   | {"port_filter": true, "ovs_hybrid_plug": true}                                       |
    | binding:vif_type      | ovs                                                                                  |
    | binding:vnic_type     | normal                                                                               |
    | device_id             | 49c6ebdc-0e62-49ad-a9ca-58cea464472f                                                 |
    | device_owner          | network:router_interface                                                             |
    | extra_dhcp_opts       |                                                                                      |
    | fixed_ips             | {"subnet_id": "a592fdba-babd-48e0-96e8-2dd9117614d3", "ip_address": "192.168.200.1"} |
    | id                    | b58d26f0-cc03-43c1-ab23-ccdb1018252a                                                 |
    | mac_address           | fa:16:3e:94:a7:df                                                                    |
    | name                  |                                                                                      |
    | network_id            | 63c24160-47ac-4140-903d-8f9a670b0ca4                                                 |
    | security_groups       |                                                                                      |
    | status                | ACTIVE                                                                               |
    | tenant_id             | d588d1112e0f496fb6cac22f9be45d49                                                     |
    +-----------------------+--------------------------------------------------------------------------------------+

  3. Perform step 2 for each port to determine its status.
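Repeating step 2 for every port can be scripted. The following is a minimal sketch under stated assumptions: the function name router_port_status and the OPENSTACK override variable are hypothetical, and the column name passed to -c on the list command (ID here) can differ between client versions, so adjust it to match the header that your client prints.

```shell
# Print "<port-id> <status>" for every port attached to a router.
# OPENSTACK is a hypothetical override hook; the default is the openstack CLI.
# Assumption: the port list command exposes a column named "ID".
router_port_status() {
    local router="$1" port_id
    "${OPENSTACK:-openstack}" port list --router "$router" -f value -c ID |
    while read -r port_id; do
        # Redirect stdin so the inner command cannot consume the port list.
        printf '%s %s\n' "$port_id" \
            "$("${OPENSTACK:-openstack}" port show "$port_id" -f value -c status </dev/null)"
    done
}
```

For example, router_port_status r1 | grep -v ACTIVE lists only the ports that need attention.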

6.3. Troubleshooting connectivity to VLAN provider networks

OpenStack Networking can trunk VLAN networks through to the SDN switches. Support for VLAN-tagged provider networks means that virtual instances can integrate with server subnets in the physical network.

Procedure

  1. Ping the gateway with ping <gateway-IP-address>.

    Consider this example, in which a network is created with these commands:

    # openstack network create --provider-network-type vlan --provider-physical-network phy-eno1 --provider-segment 120 provider
    # openstack subnet create --no-dhcp --allocation-pool start=192.168.120.1,end=192.168.120.153 --gateway 192.168.120.254 --network  provider public_subnet

    In this example, the gateway IP address is 192.168.120.254.

    $ ping 192.168.120.254
  2. If the ping fails, do the following:

    1. Confirm that you have network flow for the associated VLAN.

      It is possible that the VLAN ID has not been set. In this example, OpenStack Networking is configured to trunk VLAN 120 to the provider network. (See --provider-segment 120 in the example in step 1.)

    2. Confirm the VLAN flow on the bridge interface using the command, ovs-ofctl dump-flows <bridge-name>.

      In this example the bridge is named br-ex:

      # ovs-ofctl dump-flows br-ex
      
       NXST_FLOW reply (xid=0x4):
        cookie=0x0, duration=987.521s, table=0, n_packets=67897, n_bytes=14065247, idle_age=0, priority=1 actions=NORMAL
        cookie=0x0, duration=986.979s, table=0, n_packets=8, n_bytes=648, idle_age=977, priority=2,in_port=12 actions=drop
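On a busy bridge the flow dump can be long, so it can help to filter it for the VLAN in question. The following is a minimal sketch: the function name vlan_flows is hypothetical, and it assumes that VLAN handling appears in the dump as a dl_vlan= match or a mod_vlan_vid: action, which is how ovs-ofctl commonly prints them.

```shell
# Keep only the flows that match on, or rewrite to, the given VLAN ID.
# vlan_flows is a hypothetical helper; it reads a flow dump on stdin.
vlan_flows() {
    local vlan="$1"
    grep -E "(dl_vlan=|mod_vlan_vid:)${vlan}([^0-9]|\$)"
}
```

For example, ovs-ofctl dump-flows br-ex | vlan_flows 120 prints nothing for the sample output above, which would suggest that no flow for VLAN 120 is installed on that bridge.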

6.4. Reviewing the VLAN configuration and log files

To help validate or troubleshoot a deployment, you can:

  • verify the registration and status of Red Hat OpenStack Platform (RHOSP) Networking service (neutron) agents.
  • validate network configuration values such as VLAN ranges.

Procedure

  1. Use the openstack network agent list command to verify that the RHOSP Networking service agents are up and registered with the correct host names.

    (overcloud)[stack@undercloud~]$ openstack network agent list
    +--------------------------------------+--------------------+-----------------------+-------+----------------+
    | id                                   | agent_type         | host                  | alive | admin_state_up |
    +--------------------------------------+--------------------+-----------------------+-------+----------------+
    | a08397a8-6600-437d-9013-b2c5b3730c0c | Metadata agent     | rhelosp.example.com   | :-)   | True           |
    | a5153cd2-5881-4fc8-b0ad-be0c97734e6a | L3 agent           | rhelosp.example.com   | :-)   | True           |
    | b54f0be7-c555-43da-ad19-5593a075ddf0 | DHCP agent         | rhelosp.example.com   | :-)   | True           |
    | d2be3cb0-4010-4458-b459-c5eb0d4d354b | Open vSwitch agent | rhelosp.example.com   | :-)   | True           |
    +--------------------------------------+--------------------+-----------------------+-------+----------------+
  2. Review /var/log/containers/neutron/openvswitch-agent.log. Look for confirmation that the creation process used the ovs-ofctl command to configure VLAN trunking.
  3. Validate external_network_bridge in the /etc/neutron/l3_agent.ini file. If there is a hardcoded value in the external_network_bridge parameter, you cannot use a provider network with the L3 agent, and you cannot create the necessary flows. The external_network_bridge value must be empty, in the format external_network_bridge = "".
  4. Check the network_vlan_ranges value in the /etc/neutron/plugin.ini file. For provider networks, do not specify the numeric VLAN ID. Specify IDs only when using VLAN isolated project networks.
  5. Validate the OVS agent configuration file bridge mappings, to confirm that the bridge mapped to phy-eno1 exists and is properly connected to eno1.
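The agent check in step 1 can be reduced to a filter over the table output. This is a sketch only: the function name dead_agents is hypothetical, and it assumes that the client marks an agent that is not alive with XXX in the alive column and that the agent type and host are the second and third table columns, as in the sample output above.

```shell
# Print "<agent type> on <host>" for every agent whose alive column is XXX.
# dead_agents is a hypothetical helper; it reads the table output on stdin.
dead_agents() {
    awk -F'|' '
        # Until the header row is seen, look for the column named "alive".
        !col {
            for (i = 1; i <= NF; i++) {
                h = tolower($i); gsub(/ /, "", h)
                if (h == "alive") col = i
            }
            next
        }
        # Data rows: report agents that are not alive.
        $col ~ /XXX/ {
            gsub(/^ +| +$/, "", $3); gsub(/^ +| +$/, "", $4)
            print $3 " on " $4
        }
    '
}
```

For example, openstack network agent list | dead_agents prints nothing when every agent reports in.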

6.5. Performing basic ICMP testing within the ML2/OVN namespace

As a basic troubleshooting step, you can attempt to ping an instance from an OVN metadata interface that is on the same layer 2 network.

Prerequisites

  • RHOSP deployment, with ML2/OVN as the Networking service (neutron) default mechanism driver.

Procedure

  1. Log in to the overcloud using your Red Hat OpenStack Platform credentials.
  2. Run the openstack server list command to obtain the name of a VM instance.
  3. Run the openstack server show command to determine the Compute node on which the instance is running.

    Example

    $ openstack server show my_instance -c OS-EXT-SRV-ATTR:host \
    -c addresses

    Sample output

    +----------------------+-------------------------------------------------+
    | Field                | Value                                           |
    +----------------------+-------------------------------------------------+
    | OS-EXT-SRV-ATTR:host | compute0.ctlplane.example.com                   |
    | addresses            | finance-network1=192.0.2.2; provider-           |
    |                      | storage=198.51.100.13                           |
    +----------------------+-------------------------------------------------+

  4. Log in to the Compute node host.

    Example

    $ ssh tripleo-admin@compute0.ctlplane

  5. Run the ip netns list command to see the OVN metadata namespaces.

    Sample output

    ovnmeta-07384836-6ab1-4539-b23a-c581cf072011 (id: 1)
    ovnmeta-df9c28ea-c93a-4a60-b913-1e611d6f15aa (id: 0)

  6. Using the metadata namespace, run an ip netns exec command to ping the associated network.

    Example

    $ sudo ip netns exec ovnmeta-df9c28ea-c93a-4a60-b913-1e611d6f15aa \
    ping 192.0.2.2

    Sample output

    PING 192.0.2.2 (192.0.2.2) 56(84) bytes of data.
    64 bytes from 192.0.2.2: icmp_seq=1 ttl=64 time=0.470 ms
    64 bytes from 192.0.2.2: icmp_seq=2 ttl=64 time=0.483 ms
    64 bytes from 192.0.2.2: icmp_seq=3 ttl=64 time=0.183 ms
    64 bytes from 192.0.2.2: icmp_seq=4 ttl=64 time=0.296 ms
    64 bytes from 192.0.2.2: icmp_seq=5 ttl=64 time=0.307 ms
    ^C
    --- 192.0.2.2 ping statistics ---
    5 packets transmitted, 5 received, 0% packet loss, time 122ms
    rtt min/avg/max/mdev = 0.183/0.347/0.483/0.116 ms
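When a Compute node hosts several ovnmeta namespaces, it is not always obvious which one is on the same layer 2 network as the instance. The following minimal sketch tries each namespace in turn and prints the first one from which the instance answers. The function name find_metadata_ns and the IP override variable are hypothetical, and the function must run with root privileges because it calls ip netns exec.

```shell
# Print the first ovnmeta namespace from which the address answers a ping.
# IP is a hypothetical override hook; the default is the ip command.
# Run as root, because "ip netns exec" requires elevated privileges.
find_metadata_ns() {
    local addr="$1" ns
    for ns in $("${IP:-ip}" netns list | awk '/^ovnmeta-/ {print $1}'); do
        if "${IP:-ip}" netns exec "$ns" ping -c 1 -W 1 "$addr" >/dev/null 2>&1; then
            echo "$ns"
            return 0
        fi
    done
    return 1
}
```

For example, find_metadata_ns 192.0.2.2 would print ovnmeta-df9c28ea-c93a-4a60-b913-1e611d6f15aa in the scenario shown above. A non-zero exit status means that no namespace reached the instance.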

6.6. Troubleshooting from within project networks (ML2/OVS)

In Red Hat OpenStack Platform (RHOSP) ML2/OVS networks, all project traffic is contained within network namespaces so that projects can configure networks without interfering with each other. For example, network namespaces allow different projects to use the same subnet range, such as 192.168.1.0/24, without interference between them.

Prerequisites

  • RHOSP deployment, with ML2/OVS as the Networking service (neutron) default mechanism driver.

Procedure

  1. Determine which network namespace contains the network, by listing all of the project networks using the openstack network list command:

    $ openstack network list

    In this output, note the ID for the web-servers network (9cb32fe0-d7fb-432c-b116-f483c6497b08). The name of the network namespace ends with the network ID, which enables you to identify the namespace in the next step.

    Sample output

    +--------------------------------------+-------------+-------------------------------------------------------+
    | id                                   | name        | subnets                                               |
    +--------------------------------------+-------------+-------------------------------------------------------+
    | 9cb32fe0-d7fb-432c-b116-f483c6497b08 | web-servers | 453d6769-fcde-4796-a205-66ee01680bba 192.168.212.0/24 |
    | a0cc8cdd-575f-4788-a3e3-5df8c6d0dd81 | private     | c1e58160-707f-44a7-bf94-8694f29e74d3 10.0.0.0/24      |
    | baadd774-87e9-4e97-a055-326bb422b29b | private     | 340c58e1-7fe7-4cf2-96a7-96a0a4ff3231 192.168.200.0/24 |
    | 24ba3a36-5645-4f46-be47-f6af2a7d8af2 | public      | 35f3d2cb-6e4b-4527-a932-952a395c4bb3 172.24.4.224/28  |
    +--------------------------------------+-------------+-------------------------------------------------------+

  2. List all the network namespaces using the ip netns list command:

    # ip netns list

    The output contains a namespace that matches the web-servers network ID.

    In this output, the namespace is qdhcp-9cb32fe0-d7fb-432c-b116-f483c6497b08.

    Sample output

    qdhcp-9cb32fe0-d7fb-432c-b116-f483c6497b08
    qrouter-31680a1c-9b3e-4906-bd69-cb39ed5faa01
    qrouter-62ed467e-abae-4ab4-87f4-13a9937fbd6b
    qdhcp-a0cc8cdd-575f-4788-a3e3-5df8c6d0dd81
    qrouter-e9281608-52a6-4576-86a6-92955df46f56

  3. Examine the configuration of the web-servers network by running commands within the namespace, prefixing the troubleshooting commands with ip netns exec <namespace>.

    In this example, the route -n command is used.

    Example

    # ip netns exec qrouter-62ed467e-abae-4ab4-87f4-13a9937fbd6b route -n

    Sample output

    Kernel IP routing table
    Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
    0.0.0.0         172.24.4.225    0.0.0.0         UG    0      0        0 qg-8d128f89-87
    172.24.4.224    0.0.0.0         255.255.255.240 U     0      0        0 qg-8d128f89-87
    192.168.200.0   0.0.0.0         255.255.255.0   U     0      0        0 qr-8efd6357-96
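Because the DHCP namespace name is the network ID with a qdhcp- prefix, the lookup in steps 1 and 2 can be combined. The following is a minimal sketch: the function name dhcp_namespace and the OPENSTACK override variable are hypothetical.

```shell
# Print the qdhcp namespace name for a project network.
# OPENSTACK is a hypothetical override hook; the default is the openstack CLI.
dhcp_namespace() {
    local net_id
    net_id=$("${OPENSTACK:-openstack}" network show "$1" -f value -c id) || return 1
    echo "qdhcp-${net_id}"
}
```

For example, dhcp_namespace web-servers prints qdhcp-9cb32fe0-d7fb-432c-b116-f483c6497b08 for the sample network list above. Two networks in that list share the name private, so pass the network ID rather than the name when names are not unique.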

6.7. Performing advanced ICMP testing within the namespace (ML2/OVS)

You can troubleshoot Red Hat OpenStack Platform (RHOSP) ML2/OVS networks by using a combination of the tcpdump and ping commands.

Prerequisites

  • RHOSP deployment, with ML2/OVS as the Networking service (neutron) default mechanism driver.

Procedure

  1. Capture ICMP traffic using the tcpdump command:

    Example

    # ip netns exec qrouter-62ed467e-abae-4ab4-87f4-13a9937fbd6b tcpdump -qnntpi any icmp

  2. In a separate command line window, perform a ping test to an external network:

    Example

    # ip netns exec qrouter-62ed467e-abae-4ab4-87f4-13a9937fbd6b ping www.example.com

  3. In the terminal running the tcpdump session, observe detailed results of the ping test.

    Sample output

    tcpdump: listening on any, link-type LINUX_SLL (Linux cooked), capture size 65535 bytes
    IP (tos 0xc0, ttl 64, id 55447, offset 0, flags [none], proto ICMP (1), length 88)
        172.24.4.228 > 172.24.4.228: ICMP host 192.168.200.20 unreachable, length 68
    	IP (tos 0x0, ttl 64, id 22976, offset 0, flags [DF], proto UDP (17), length 60)
        172.24.4.228.40278 > 192.168.200.21: [bad udp cksum 0xfa7b -> 0xe235!] UDP, length 32

Note

When you perform a tcpdump analysis of traffic, you see the responding packets heading to the router interface rather than to the VM instance. This is expected behavior, as the qrouter performs Destination Network Address Translation (DNAT) on the return packets.
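For longer captures, counting echo requests and replies is often quicker than reading every line. The following is a minimal sketch: the function name icmp_summary is hypothetical, and it assumes that tcpdump labels the packets with the strings "ICMP echo request" and "ICMP echo reply". Because the capture uses the any pseudo-interface, the same packet can be counted once for each interface that it crosses.

```shell
# Count ICMP echo requests and replies in tcpdump text output read from stdin.
# icmp_summary is a hypothetical helper name used only for illustration.
icmp_summary() {
    awk '
        /ICMP echo request/ { req++ }
        /ICMP echo reply/   { rep++ }
        END { printf "requests=%d replies=%d\n", req, rep }
    '
}
```

For example, you can pipe the tcpdump command from step 1 into icmp_summary and add the -c 20 option so that the capture ends after 20 packets. Requests without matching replies suggest that the return path is the place to look.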

6.8. Creating aliases for OVN troubleshooting commands

You run OVN commands, such as ovn-nbctl show, in the ovn_controller container. The container runs on the Controller node and Compute nodes. To simplify your access to the commands, create and source a script that defines aliases.

Prerequisites

  • Red Hat OpenStack Platform deployment with ML2/OVN as the default mechanism driver.

Procedure

  1. Log in to the Controller host as a user that has the necessary privileges to access the OVN containers.

    Example

    $ ssh tripleo-admin@controller-0.ctlplane

  2. Create a shell script file that contains the ovn commands that you want to run.

    Example

    $ vi ~/bin/ovn-alias.sh

  3. Add the ovn commands, and save the script file.

    Example

    In this example, the ovn-sbctl, ovn-nbctl, and ovn-trace commands have been added to an alias file:

    REMOTE_IP=$(sudo ovs-vsctl get open . external_ids:ovn-remote)
    NBDB=$(echo $REMOTE_IP | sed 's/6642/6641/g')
    SBDB=$REMOTE_IP
    alias ovn-sbctl="sudo podman exec ovn_controller ovn-sbctl --db=$SBDB"
    alias ovn-nbctl="sudo podman exec ovn_controller ovn-nbctl --db=$NBDB"
    alias ovn-trace="sudo podman exec ovn_controller ovn-trace --db=$SBDB"
  4. Repeat the steps in this procedure on the Compute host.
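The alias file derives the northbound database address from the southbound one by replacing port 6642 with 6641. The sed expression in the example replaces every occurrence of 6642, including one that happens to appear inside an address. The following is a hedged alternative sketch, not the documented method: the function name sb_to_nb is hypothetical, and it rewrites 6642 only where it is the port at the end of a remote.

```shell
# Convert an ovn-remote value (southbound, port 6642) into the matching
# northbound value (port 6641). Only a trailing ":6642" of each remote in a
# comma-separated list is rewritten; the quotes that ovs-vsctl prints are kept.
# sb_to_nb is a hypothetical helper name used only for illustration.
sb_to_nb() {
    sed -E 's/:6642("?)(,|$)/:6641\1\2/g'
}
```

For example, echo "$REMOTE_IP" | sb_to_nb could replace the sed call in the alias file if your addresses might contain the digits 6642.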

Validation

  1. Source the script file.

    Example

    $ source ~/bin/ovn-alias.sh

  2. Run a command to confirm that your script file works properly.

    Example

    # ovn-nbctl show

    Sample output

    switch 26ce22db-1795-41bd-b561-9827cbd81778 (neutron-f8e79863-6c58-43d0-8f7d-8ec4a423e13b) (aka internal_network)
    	port 1913c3ae-8475-4b60-a479-df7bcce8d9c8
        	addresses: ["fa:16:3e:33:c1:fc 192.168.254.76"]
    	port 1aabaee3-b944-4da2-bf0a-573215d3f3d9
        	addresses: ["fa:16:3e:16:cb:ce 192.168.254.74"]
    	port 7e000980-59f9-4a0f-b76a-4fdf4e86f27b
        	type: localport
        	addresses: ["fa:16:3e:c9:30:ed 192.168.254.2"]

Additional resources

  • ovn-nbctl --help command
  • ovn-sbctl --help command
  • ovn-trace --help command

6.9. Monitoring OVN logical flows

OVN uses logical flows, which are tables of flows with a priority, match, and actions. These logical flows are distributed to the ovn-controller running on each Red Hat OpenStack Platform (RHOSP) Compute node. Use the ovn-sbctl lflow-list command on the Controller node to view the full set of logical flows.

Prerequisites

Procedure

  1. Log in to the Controller host as a user that has the necessary privileges to access the OVN containers.

    Example

    $ ssh tripleo-admin@controller-0.ctlplane

  2. Source the alias file for the OVN database commands.

    For more information, see Section 6.8, “Creating aliases for OVN troubleshooting commands”.

    Example

    $ source ~/bin/ovn-alias.sh

  3. View the logical flows:

    $ ovn-sbctl lflow-list
  4. Inspect the output.

    Sample output

    Datapath: "sw0" (d7bf4a7b-e915-4502-8f9d-5995d33f5d10)  Pipeline: ingress
      table=0 (ls_in_port_sec_l2  ), priority=100  , match=(eth.src[40]), action=(drop;)
      table=0 (ls_in_port_sec_l2  ), priority=100  , match=(vlan.present), action=(drop;)
      table=0 (ls_in_port_sec_l2  ), priority=50   , match=(inport == "sw0-port1" && eth.src == {00:00:00:00:00:01}), action=(next;)
      table=0 (ls_in_port_sec_l2  ), priority=50   , match=(inport == "sw0-port2" && eth.src == {00:00:00:00:00:02}), action=(next;)
      table=1 (ls_in_port_sec_ip  ), priority=0    , match=(1), action=(next;)
      table=2 (ls_in_port_sec_nd  ), priority=90   , match=(inport == "sw0-port1" && eth.src == 00:00:00:00:00:01 && arp.sha == 00:00:00:00:00:01), action=(next;)
      table=2 (ls_in_port_sec_nd  ), priority=90   , match=(inport == "sw0-port1" && eth.src == 00:00:00:00:00:01 && ip6 && nd && ((nd.sll == 00:00:00:00:00:00 || nd.sll == 00:00:00:00:00:01) || ((nd.tll == 00:00:00:00:00:00 || nd.tll == 00:00:00:00:00:01)))), action=(next;)
      table=2 (ls_in_port_sec_nd  ), priority=90   , match=(inport == "sw0-port2" && eth.src == 00:00:00:00:00:02 && arp.sha == 00:00:00:00:00:02), action=(next;)
      table=2 (ls_in_port_sec_nd  ), priority=90   , match=(inport == "sw0-port2" && eth.src == 00:00:00:00:00:02 && ip6 && nd && ((nd.sll == 00:00:00:00:00:00 || nd.sll == 00:00:00:00:00:02) || ((nd.tll == 00:00:00:00:00:00 || nd.tll == 00:00:00:00:00:02)))), action=(next;)
      table=2 (ls_in_port_sec_nd  ), priority=80   , match=(inport == "sw0-port1" && (arp || nd)), action=(drop;)
      table=2 (ls_in_port_sec_nd  ), priority=80   , match=(inport == "sw0-port2" && (arp || nd)), action=(drop;)
      table=2 (ls_in_port_sec_nd  ), priority=0    , match=(1), action=(next;)
      table=3 (ls_in_pre_acl      ), priority=0    , match=(1), action=(next;)
      table=4 (ls_in_pre_lb       ), priority=0    , match=(1), action=(next;)
      table=5 (ls_in_pre_stateful ), priority=100  , match=(reg0[0] == 1), action=(ct_next;)
      table=5 (ls_in_pre_stateful ), priority=0    , match=(1), action=(next;)
      table=6 (ls_in_acl          ), priority=0    , match=(1), action=(next;)
      table=7 (ls_in_qos_mark     ), priority=0    , match=(1), action=(next;)
      table=8 (ls_in_lb           ), priority=0    , match=(1), action=(next;)
      table=9 (ls_in_stateful     ), priority=100  , match=(reg0[1] == 1), action=(ct_commit(ct_label=0/1); next;)
      table=9 (ls_in_stateful     ), priority=100  , match=(reg0[2] == 1), action=(ct_lb;)
      table=9 (ls_in_stateful     ), priority=0    , match=(1), action=(next;)
      table=10(ls_in_arp_rsp      ), priority=0    , match=(1), action=(next;)
      table=11(ls_in_dhcp_options ), priority=0    , match=(1), action=(next;)
      table=12(ls_in_dhcp_response), priority=0    , match=(1), action=(next;)
      table=13(ls_in_l2_lkup      ), priority=100  , match=(eth.mcast), action=(outport = "_MC_flood"; output;)
      table=13(ls_in_l2_lkup      ), priority=50   , match=(eth.dst == 00:00:00:00:00:01), action=(outport = "sw0-port1"; output;)
      table=13(ls_in_l2_lkup      ), priority=50   , match=(eth.dst == 00:00:00:00:00:02), action=(outport = "sw0-port2"; output;)
    Datapath: "sw0" (d7bf4a7b-e915-4502-8f9d-5995d33f5d10)  Pipeline: egress
      table=0 (ls_out_pre_lb      ), priority=0    , match=(1), action=(next;)
      table=1 (ls_out_pre_acl     ), priority=0    , match=(1), action=(next;)
      table=2 (ls_out_pre_stateful), priority=100  , match=(reg0[0] == 1), action=(ct_next;)
      table=2 (ls_out_pre_stateful), priority=0    , match=(1), action=(next;)
      table=3 (ls_out_lb          ), priority=0    , match=(1), action=(next;)
      table=4 (ls_out_acl         ), priority=0    , match=(1), action=(next;)
      table=5 (ls_out_qos_mark    ), priority=0    , match=(1), action=(next;)
      table=6 (ls_out_stateful    ), priority=100  , match=(reg0[1] == 1), action=(ct_commit(ct_label=0/1); next;)
      table=6 (ls_out_stateful    ), priority=100  , match=(reg0[2] == 1), action=(ct_lb;)
      table=6 (ls_out_stateful    ), priority=0    , match=(1), action=(next;)
      table=7 (ls_out_port_sec_ip ), priority=0    , match=(1), action=(next;)
      table=8 (ls_out_port_sec_l2 ), priority=100  , match=(eth.mcast), action=(output;)
      table=8 (ls_out_port_sec_l2 ), priority=50   , match=(outport == "sw0-port1" && eth.dst == {00:00:00:00:00:01}), action=(output;)
      table=8 (ls_out_port_sec_l2 ), priority=50   , match=(outport == "sw0-port2" && eth.dst == {00:00:00:00:00:02}), action=(output;)

    Key differences between OVN and OpenFlow include:

    • OVN ports are logical entities that reside somewhere on a network, not physical ports on a single switch.
    • OVN gives each table in the pipeline a name in addition to its number. The name describes the purpose of that stage in the pipeline.
    • The OVN match syntax supports complex Boolean expressions.
    • The actions supported in OVN logical flows extend beyond those of OpenFlow. You can implement higher level features, such as DHCP, in the OVN logical flow syntax.
  5. Run an OVN trace.

    The ovn-trace command can simulate how a packet travels through the OVN logical flows, or help you determine why a packet is dropped. Provide the ovn-trace command with the following parameters:

    DATAPATH
    The logical switch or logical router where the simulated packet starts.
    MICROFLOW

    The simulated packet, in the syntax used by the ovn-sb database.

    Example

    This example displays the --minimal output option on a simulated packet and shows that the packet reaches its destination:

    $ ovn-trace --minimal sw0 'inport == "sw0-port1" && eth.src == 00:00:00:00:00:01 && eth.dst == 00:00:00:00:00:02'

    Sample output

    #  reg14=0x1,vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,dl_type=0x0000
        output("sw0-port2");

    Example

    In more detail, the --summary output for this same simulated packet shows the full execution pipeline:

    $ ovn-trace --summary sw0 'inport == "sw0-port1" && eth.src == 00:00:00:00:00:01 && eth.dst == 00:00:00:00:00:02'

    Sample output

    The sample output shows:

    • The packet enters the sw0 network from the sw0-port1 port and runs the ingress pipeline.
    • The outport variable is set to sw0-port2 indicating that the intended destination for this packet is sw0-port2.
    • The packet is output from the ingress pipeline, which brings it to the egress pipeline for sw0 with the outport variable set to sw0-port2.
    • The output action is executed in the egress pipeline, which outputs the packet to the current value of the outport variable, which is sw0-port2.

      #  reg14=0x1,vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,dl_type=0x0000
      ingress(dp="sw0", inport="sw0-port1") {
          outport = "sw0-port2";
          output;
          egress(dp="sw0", inport="sw0-port1", outport="sw0-port2") {
              output;
              /* output to "sw0-port2", type "" */;
          };
      };
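Two small helpers can make the steps above less error-prone: one builds the MICROFLOW expression, and one narrows the logical flow listing to flows that drop packets. Both are a minimal sketch. The function names l2_microflow and drop_flows are hypothetical, and the expression covers only the simple layer 2 case used in this section.

```shell
# Build the MICROFLOW expression for a simple layer 2 trace.
# Arguments: logical input port, source MAC, destination MAC.
l2_microflow() {
    printf 'inport == "%s" && eth.src == %s && eth.dst == %s\n' "$1" "$2" "$3"
}

# Keep only the logical flows whose action drops the packet.
# Reads ovn-sbctl lflow-list output on stdin.
drop_flows() {
    grep -F 'action=(drop;)'
}
```

For example, in an interactive shell where the aliases are sourced, ovn-trace --minimal sw0 "$(l2_microflow sw0-port1 00:00:00:00:00:01 00:00:00:00:00:02)" runs the same trace as the first example, and ovn-sbctl lflow-list | drop_flows lists the candidate flows when a trace ends in a drop.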

6.10. Monitoring OpenFlows

You can use the ovs-ofctl dump-flows command to monitor the OpenFlow flows on a logical switch in your Red Hat OpenStack Platform (RHOSP) network.

Prerequisites

  • RHOSP deployment with ML2/OVN as the Networking service (neutron) default mechanism driver.

Procedure

  1. Log in to the Controller host as a user that has the necessary privileges to access the OVN containers.

    Example

    $ ssh tripleo-admin@controller-0.ctlplane

  2. Run the ovs-ofctl dump-flows command.

    Example

    $ sudo ovs-ofctl dump-flows br-int

  3. Inspect the output, which resembles the following output.

    Sample output

    $ ovs-ofctl dump-flows br-int
    NXST_FLOW reply (xid=0x4):
     cookie=0x0, duration=72.132s, table=0, n_packets=0, n_bytes=0, idle_age=72, priority=10,in_port=1,dl_src=00:00:00:00:00:01 actions=resubmit(,1)
     cookie=0x0, duration=60.565s, table=0, n_packets=0, n_bytes=0, idle_age=60, priority=10,in_port=2,dl_src=00:00:00:00:00:02 actions=resubmit(,1)
     cookie=0x0, duration=28.127s, table=0, n_packets=0, n_bytes=0, idle_age=28, priority=0 actions=drop
     cookie=0x0, duration=13.887s, table=1, n_packets=0, n_bytes=0, idle_age=13, priority=0,in_port=1 actions=output:2
     cookie=0x0, duration=4.023s, table=1, n_packets=0, n_bytes=0, idle_age=4, priority=0,in_port=2 actions=output:1
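On a production bridge the dump can contain thousands of flows, so a per-table count is a useful first look. The following is a minimal sketch: the function name flows_per_table is hypothetical, and it assumes the table=N field format shown in the sample output.

```shell
# Count the flows in each OpenFlow table of a dump read from stdin.
# flows_per_table is a hypothetical helper name used only for illustration.
flows_per_table() {
    awk '
        {
            for (i = 1; i <= NF; i++)
                if ($i ~ /^table=[0-9]+,?$/) {
                    t = $i; gsub(/[^0-9]/, "", t)
                    n[t]++
                }
        }
        END { for (t in n) print "table " t ": " n[t] }
    ' | sort -k2,2n
}
```

For example, sudo ovs-ofctl dump-flows br-int | flows_per_table prints table 0: 3 and table 1: 2 for the sample output above.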

Additional resources

  • ovs-ofctl --help command

6.11. Monitoring OVN database status

You can use the ovs-appctl command to monitor connections between OVN database servers.

Prerequisites

  • RHOSP deployment with ML2/OVN as the Networking service (neutron) default mechanism driver.

Procedure

  1. Log in to a Controller host as a user that has the necessary privileges to access the OVN containers.

    Monitoring from a server on a single Controller host provides the information you need to verify basic cluster health and to diagnose many types of problems. For a more thorough analysis, perform this procedure on all Controllers.

    Example

    $ ssh tripleo-admin@controller-0.ctlplane

  2. Run the ovs-appctl command.

    Example: northbound database

    $ ovs-appctl -t /var/lib/openvswitch/ovn/ovnnb_db.ctl cluster/status OVN_Northbound

    Example: southbound database

    $ ovs-appctl -t /var/lib/openvswitch/ovn/ovnsb_db.ctl cluster/status OVN_Southbound

  3. Inspect the output, which resembles the following output.

    Sample output: southbound database

    This sample output was generated on server 1114, which was a follower at the time.

    1114
    Name: OVN_Southbound
    Cluster ID: 017a (017add73-58f1-4fcd-ae35-bacc0f07ce57)
    Server ID: 1114 (1114865d-4f42-443a-b758-d4431fc35748)
    Address: tcp:[fd00:fd00:fd00:2000::4a]:6644
    Status: cluster member
    Role: follower
    Term: 90
    Leader: ca6e
    Vote: ca6e
    
    Last Election started 27881511 ms ago, reason: leadership_transfer
    Last Election won: 27881503 ms ago
    Election timer: 16000
    Log: [51470, 51737]
    Entries not yet committed: 0
    Entries not yet applied: 0
    Connections: ->ca6e ->0f90 <-ca6e <-0f90
    Disconnections: 0
    Servers:
        1114 (1114 at tcp:[fd00:fd00:fd00:2000::4a]:6644) (self)
        ca6e (ca6e at tcp:[fd00:fd00:fd00:2000::18f]:6644) last msg 5141 ms ago
        0f90 (0f90 at tcp:[fd00:fd00:fd00:2000::2e0]:6644) last msg 22106129 ms ago

    Diagnostic indications from sample output

    A right-pointing arrow (->) represents an outbound connection from this server to another server. A left-pointing arrow (<-) represents an inbound connection from another server to this server.

    All servers are active and connected

    Connections: ->ca6e ->0f90 <-ca6e <-0f90

    This three-node cluster appears healthy. The server 1114 has inbound and outbound connections with the other two servers, ca6e and 0f90.

    A server is disconnected from the cluster

    Connections: ->ca6e (->0f90) <-ca6e

    The incoming connection from server 0f90 is not listed. The parentheses around the outgoing connection indicate that outbound messages to 0f90 failed. In most situations, connecting to any one server in the cluster provides enough information to determine whether the cluster has issues. Running the diagnostics on all servers provides more detail and might reveal problems that you cannot detect from a single server.

    The cluster has lost quorum

    Role: candidate
    ...
    Leader: unknown

    This server is a candidate and the leader is unknown.

    The ovsdb-server is down on this node

    2024-03-27T22:10:28Z|00001|unixctl|WARN|failed to connect to /var/lib/openvswitch/ovn/ovnsb_db.ctl
    ovs-appctl: cannot connect to "/var/lib/openvswitch/ovn/ovnsb_db.ctl" (Connection refused)
    
    <exits with non-zero status>

    In this case, you cannot get all the information you need from a single server. For example, you cannot determine whether the other servers are running. If the server is down, run ovs-appctl on another server.

    Time since last message to leader from each follower (only updated on leader)

    Servers:
        1114 (1114 at tcp:[fd00:fd00:fd00:2000::4a]:6644) next_index=51737 match_index=51736 last msg 224 ms ago
        ca6e (ca6e at tcp:[fd00:fd00:fd00:2000::18f]:6644) (self) next_index=51470 match_index=51736
        0f90 (0f90 at tcp:[fd00:fd00:fd00:2000::2e0]:6644) next_index=51737 match_index=51736 last msg 224 ms ago

    Log on to the cluster leader host and run ovs-appctl. Note that a new leader can be elected at any time.
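
The diagnostic indications above lend themselves to a scripted check. The following is a minimal sketch, assuming that the cluster/status output has the Role:, Leader:, and Connections: lines shown in the sample output. The function name check_cluster_status is hypothetical and is not part of OVN or RHOSP.

```shell
# Hypothetical helper: read `ovs-appctl ... cluster/status` output on stdin
# and report the conditions described above.
# Returns 0 when no problem is found, 1 otherwise.
check_cluster_status() {
  awk '
    /^[ \t]*Role:/        { role = $2 }
    /^[ \t]*Leader:/      { leader = $2 }
    /^[ \t]*Connections:/ {
      # A connection in parentheses, such as (->0f90), is a failed connection.
      for (i = 2; i <= NF; i++)
        if ($i ~ /^\(.*\)$/) failed = failed " " $i
    }
    END {
      if (role == "")          { print "no Role line found; is ovsdb-server running?"; bad = 1 }
      if (role == "candidate") { print "server is a candidate; the cluster might have lost quorum"; bad = 1 }
      if (leader == "unknown") { print "leader is unknown"; bad = 1 }
      if (failed != "")        { print "failed connections:" failed; bad = 1 }
      if (!bad) print "ok: role=" role " leader=" leader
      exit bad + 0
    }
  '
}
```

For example, you might pipe the southbound command from step 2 into check_cluster_status. The check only sees the cluster from one server, so a clean result on one Controller does not rule out problems that are visible only from another server.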

Additional resources

  • ovs-appctl --help command

6.12. Validating your ML2/OVN deployment

Validating the ML2/OVN networks on your Red Hat OpenStack Platform (RHOSP) deployment consists of creating a test network and subnet and performing diagnostic tasks such as verifying that specific containers are running.

Prerequisites

  • RHOSP deployment with ML2/OVN as the Networking service (neutron) default mechanism driver.

Procedure

  1. Create a test network and subnet.

    NETWORK_ID=\
    $(openstack network create internal_network | awk '/\| id/ {print $4}')
    
    openstack subnet create internal_subnet \
    --network $NETWORK_ID \
    --dns-nameserver 8.8.8.8 \
    --subnet-range 192.168.254.0/24

    If you encounter errors, perform the steps that follow.

  2. Verify that the relevant containers are running on the Controller host:

    1. Log in to the Controller host as a user that has the necessary privileges to access the OVN containers.

      Example

      $ ssh tripleo-admin@controller-0.ctlplane

    2. Enter the following command:

      $ sudo podman ps -a --format="{{.Names}}"|grep ovn

      As shown in the following sample, the output should list the OVN containers:

      Sample output

      container-puppet-ovn_controller
      ovn_cluster_north_db_server
      ovn_cluster_south_db_server
      ovn_cluster_northd
      ovn_controller

  3. Verify that the relevant containers are running on the Compute host:

    1. Log in to the Compute host as a user that has the necessary privileges to access the OVN containers.

      Example

      $ ssh tripleo-admin@compute-0.ctlplane

    2. Enter the following command:

      $ sudo podman ps -a --format="{{.Names}}"|grep ovn

      As shown in the following sample, the output should list the OVN containers:

      Sample output

      container-puppet-ovn_controller
      ovn_metadata_agent
      ovn_controller

  4. Inspect log files for error messages.

    $ sudo grep -r ERR /var/log/containers/openvswitch/ /var/log/containers/neutron/

  5. Source an alias file to run the OVN database commands.

    For more information, see Section 6.8, “Creating aliases for OVN troubleshooting commands”.

    Example

    $ source ~/ovn-alias.sh

  6. Query the northbound and southbound databases to check for responsiveness.

    $ ovn-nbctl show
    $ ovn-sbctl show

  7. Attempt to ping an instance from an OVN metadata interface that is on the same layer 2 network.

    For more information, see Section 6.5, “Performing basic ICMP testing within the ML2/OVN namespace”.

  8. If you need to contact Red Hat for support, perform the steps described in this Red Hat Solution, How to collect all required logs for Red Hat Support to investigate an OpenStack issue.
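
The container checks in steps 2 and 3 can be scripted so that a missing container stands out. The following is a minimal sketch: the function name missing_containers is hypothetical, and the list of expected names is whatever you pass as arguments, for example the names from the sample output for the host type.

```shell
# Hypothetical helper: print each expected container name (given as arguments)
# that does not appear in the list of names read from stdin.
# Returns 1 if any expected name is missing, 0 otherwise.
missing_containers() {
  names=$(cat)                      # container names, one per line
  missing=0
  for expected in "$@"; do
    # -x matches whole lines only, -F treats the name as a fixed string
    if ! printf '%s\n' "$names" | grep -qxF -- "$expected"; then
      echo "missing: $expected"
      missing=1
    fi
  done
  return "$missing"
}
```

For example, on a Compute host you might run sudo podman ps --format="{{.Names}}" | missing_containers ovn_controller ovn_metadata_agent. Without the -a option, podman ps lists only running containers, so this form also catches a container that exists but has stopped.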

6.13. Setting the logging mode for ML2/OVN

Set ML2/OVN logging to debug mode for additional troubleshooting information. Set logging back to info mode to use less disk space when you do not need additional debugging information.

Prerequisites

  • Red Hat OpenStack Platform deployment with ML2/OVN as the default mechanism driver.

Procedure

  1. Log in to the Controller or Compute node where you want to set the logging mode as a user that has the necessary privileges to access the OVN containers.

    Example

    $ ssh tripleo-admin@controller-0.ctlplane

  2. Set the ML2/OVN logging mode.

    Debug logging mode

    $ sudo podman exec -it ovn_controller ovn-appctl -t ovn-controller vlog/set dbg

    Info logging mode

    $ sudo podman exec -it ovn_controller ovn-appctl -t ovn-controller vlog/set info

Verification

  • If you set debug logging mode, confirm that the ovn-controller container log now contains debug messages:

    $ sudo grep DBG /var/log/containers/openvswitch/ovn-controller.log

    Sample output

    You should see recent log messages that contain the string |DBG|:

    2022-09-29T20:52:54.638Z|00170|vconn(ovn_pinctrl0)|DBG|unix:/var/run/openvswitch/br-int.mgmt: received: OFPT_ECHO_REQUEST (OF1.5) (xid=0x0): 0 bytes of payload
    2022-09-29T20:52:54.638Z|00171|vconn(ovn_pinctrl0)|DBG|unix:/var/run/openvswitch/br-int.mgmt: sent (Success): OFPT_ECHO_REPLY (OF1.5) (xid=0x0): 0 bytes of payload

  • If you set info logging mode, confirm that the ovn-controller container log contains a string similar to the following:

    ...received request vlog/set["info"], id=0
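
To see at a glance whether debug messages are being written, you can count log lines by severity. The following is a minimal sketch, assuming the pipe-delimited layout of the sample log lines above, in which the fourth field holds the level. The function name count_log_levels is hypothetical.

```shell
# Hypothetical helper: count log lines per severity level.
# Reads an ovn-controller log on stdin and assumes the layout
# timestamp|sequence|module|LEVEL|message, as in the sample output.
count_log_levels() {
  awk -F'|' '
    NF >= 5 { count[$4]++ }                 # skip lines without the layout
    END     { for (level in count) print level, count[level] }
  ' | sort
}
```

For example, you might run sudo cat /var/log/containers/openvswitch/ovn-controller.log | count_log_levels. The counts cover the whole file, so after you switch back to info mode, older DBG lines still appear in the total.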

6.14. Fixing OVN controllers that fail to register on edge sites

Issue

OVN controllers on Red Hat OpenStack Platform (RHOSP) edge sites fail to register.

Note

This error can occur on RHOSP 17.1 ML2/OVN deployments that were updated from an earlier RHOSP version: RHOSP 16.1.7 and earlier, or RHOSP 16.2.0.

Sample error

The error encountered is similar to the following:

2021-04-12T09:14:48.994Z|04754|ovsdb_idl|WARN|transaction error: {"details":"Transaction causes multiple rows in \"Encap\" table to have identical values (geneve and \"10.14.2.7\") for index on columns \"type\" and \"ip\".  First row, with UUID 3973cad5-eb8a-4f29-85c3-c105d861c0e0, was inserted by this transaction.  Second row, with UUID f06b71a8-4162-475b-8542-d27db3a9097a, existed in the database before this transaction and was not modified by the transaction.","error":"constraint violation"}

Cause

If the ovn-controller process replaces the hostname, it registers another chassis entry that includes another encap entry. For more information, see BZ#1948472.

Resolution

Follow these steps to resolve the problem:

  1. If you have not already, create aliases for the necessary OVN database commands that you will use later in this procedure.

    For more information, see Section 6.8, “Creating aliases for OVN troubleshooting commands”.

  2. Log in to the Controller host as a user that has the necessary privileges to access the OVN containers.

    Example

    $ ssh tripleo-admin@controller-0.ctlplane

  3. Obtain the IP address from the /var/log/containers/openvswitch/ovn-controller.log file.

  4. Confirm that the IP address is correct:

    $ ovn-sbctl list encap |grep -a3 <IP address from ovn-controller.log>

  5. Delete the chassis that contains the IP address:

    $ ovn-sbctl chassis-del <chassis-id>

  6. Check the Chassis_Private table to confirm that the chassis has been removed:

    $ ovn-sbctl find Chassis_Private chassis="[]"

  7. If any entries are reported, remove them with the following command:

    $ ovn-sbctl destroy Chassis_Private <listed_id>

  8. Restart the following containers:

    • tripleo_ovn_controller
    • tripleo_ovn_metadata_agent

      $ sudo systemctl restart tripleo_ovn_controller
      $ sudo systemctl restart tripleo_ovn_metadata_agent
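
Step 3 asks you to obtain the IP address from the ovn-controller log. The following is a minimal sketch of a filter that pulls the address out of the constraint violation message, assuming that the message has the same wording as the sample error. The function name encap_conflict_ips is hypothetical.

```shell
# Hypothetical helper: extract the conflicting encap IP address from
# "identical values (<type> and <ip>)" messages read on stdin.
# Accepts the address with or without the backslash-escaped quotes that
# appear in the sample error, and prints each distinct address once.
encap_conflict_ips() {
  sed -n 's/.*identical values ([^ ]* and [\\"]*\([^\\" )]*\)[\\"]*).*/\1/p' |
    sort -u
}
```

For example, you might run sudo cat /var/log/containers/openvswitch/ovn-controller.log | encap_conflict_ips. For the sample error, the filter prints 10.14.2.7, which you can then use in the ovn-sbctl list encap command in step 4.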

Verification

  • Confirm that OVN agents are running:

    $ openstack network agent list -c "Agent Type" -c State -c Binary

    Sample output

    +------------------------------+-------+----------------------------+
    | Agent Type                   | State | Binary                     |
    +------------------------------+-------+----------------------------+
    | OVN Controller Gateway agent | UP    | ovn-controller             |
    | OVN Controller Gateway agent | UP    | ovn-controller             |
    | OVN Controller agent         | UP    | ovn-controller             |
    | OVN Metadata agent           | UP    | neutron-ovn-metadata-agent |
    | OVN Controller Gateway agent | UP    | ovn-controller             |
    +------------------------------+-------+----------------------------+
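
When the deployment has many agents, scanning the table by eye is error-prone. The following is a minimal sketch that lists only the agents whose State is not UP. It assumes the table layout produced by the command above, with Agent Type in the first column and State in the second. The function name agents_not_up is hypothetical.

```shell
# Hypothetical helper: read the agent table on stdin and print each agent
# whose State column is not UP. Returns 1 if any such agent is found.
# Assumes the column order "Agent Type", "State", "Binary".
agents_not_up() {
  awk -F'|' '
    NF >= 4 {                                  # table rows, not border lines
      type = $2; state = $3
      gsub(/^ +| +$/, "", type)                # trim surrounding spaces
      gsub(/^ +| +$/, "", state)
      if (state != "State" && state != "UP") { print type ": " state; bad = 1 }
    }
    END { exit bad + 0 }
  '
}
```

For example, you might run openstack network agent list -c "Agent Type" -c State -c Binary | agents_not_up. No output and an exit status of 0 mean that every listed agent reports UP.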

6.15. ML2/OVN log files

Log files track events related to the deployment and operation of the ML2/OVN mechanism driver.

Table 6.1. ML2/OVN log files per node

Nodes                             Log                               Path /var/log/containers/openvswitch...
Controller, Compute, Networking   OVN controller (ovn-controller)   .../ovn-controller.log
Controller                        OVS northbound database server    .../ovsdb-server-nb.log
Controller                        OVS southbound database server    .../ovsdb-server-sb.log
Controller                        OVN northd service (ovn-northd)   .../ovn-northd.log
