Este contenido no está disponible en el idioma seleccionado.

Chapter 2. Troubleshooting node network configuration


If the node network configuration encounters an issue, the policy is automatically rolled back and the enactments report failure. This includes issues such as:

  • The configuration fails to be applied on the host.
  • The host loses connection to the default gateway.
  • The host loses connection to the API server.

You can apply changes to the node network configuration across your entire cluster by applying a node network configuration policy.

If you applied an incorrect configuration, you can use the following example to troubleshoot and correct the failed node network policy. The example attempts to apply a Linux bridge policy to a cluster that has three control plane nodes and three compute nodes. The policy is not applied because the policy references the wrong interface.

To find an error, you need to investigate the available NMState resources. You can then update the policy with the correct configuration.

Prerequisites

  • You ensured that an ens01 interface does not exist on your Linux system.

Procedure

  1. Create a policy on your cluster. The following example creates a simple bridge, br1 that has ens01 as its member:

    apiVersion: nmstate.io/v1
    kind: NodeNetworkConfigurationPolicy
    metadata:
      name: ens01-bridge-testfail
    spec:
      desiredState:
        interfaces:
          - name: br1
            description: Linux bridge with the wrong port
            type: linux-bridge
            state: up
            ipv4:
              dhcp: true
              enabled: true
            bridge:
              options:
                stp:
                  enabled: false
              port:
                - name: ens01
    # ...
    Copy to Clipboard Toggle word wrap
  2. Apply the policy to your network interface:

    $ oc apply -f ens01-bridge-testfail.yaml
    Copy to Clipboard Toggle word wrap

    Example output

    nodenetworkconfigurationpolicy.nmstate.io/ens01-bridge-testfail created
    Copy to Clipboard Toggle word wrap

  3. Verify the status of the policy by running the following command:

    $ oc get nncp
    Copy to Clipboard Toggle word wrap

    The output shows that the policy failed:

    Example output

    NAME                    STATUS
    ens01-bridge-testfail   FailedToConfigure
    Copy to Clipboard Toggle word wrap

    The policy status alone does not indicate if it failed on all nodes or a subset of nodes.

  4. List the node network configuration enactments to see if the policy was successful on any of the nodes. If the policy failed for only a subset of nodes, the output suggests that the problem is with a specific node configuration. If the policy failed on all nodes, the output suggests that the problem is with the policy.

    $ oc get nnce
    Copy to Clipboard Toggle word wrap

    The output shows that the policy failed on all nodes:

    Example output

    NAME                                         STATUS
    control-plane-1.ens01-bridge-testfail        FailedToConfigure
    control-plane-2.ens01-bridge-testfail        FailedToConfigure
    control-plane-3.ens01-bridge-testfail        FailedToConfigure
    compute-1.ens01-bridge-testfail              FailedToConfigure
    compute-2.ens01-bridge-testfail              FailedToConfigure
    compute-3.ens01-bridge-testfail              FailedToConfigure
    Copy to Clipboard Toggle word wrap

  5. View one of the failed enactments. The following command uses the output tool jsonpath to filter the output:

    $ oc get nnce compute-1.ens01-bridge-testfail -o jsonpath='{.status.conditions[?(@.type=="Failing")].message}'
    Copy to Clipboard Toggle word wrap

    Example output

    [2024-10-10T08:40:46Z INFO  nmstatectl] Nmstate version: 2.2.37
    NmstateError: InvalidArgument: Controller interface br1 is holding unknown port ens01
    Copy to Clipboard Toggle word wrap

    The previous example shows the output from an InvalidArgument error that indicates that the ens01 is an unknown port. For this example, you might need to change the port configuration in the policy configuration file.

  6. To ensure that the policy is configured properly, view the network configuration for one or all of the nodes by requesting the NodeNetworkState object. The following command returns the network configuration for the control-plane-1 node:

    $ oc get nns control-plane-1 -o yaml
    Copy to Clipboard Toggle word wrap

    The output shows that the interface name on the nodes is ens1 but the failed policy incorrectly uses ens01:

    Example output

       - ipv4:
    # ...
          name: ens1
          state: up
          type: ethernet
    Copy to Clipboard Toggle word wrap

  7. Correct the error by editing the existing policy:

    $ oc edit nncp ens01-bridge-testfail
    Copy to Clipboard Toggle word wrap
    # ...
              port:
                - name: ens1
    Copy to Clipboard Toggle word wrap

    Save the policy to apply the correction.

  8. Check the status of the policy to ensure it updated successfully:

    $ oc get nncp
    Copy to Clipboard Toggle word wrap

    Example output

    NAME                    STATUS
    ens01-bridge-testfail   SuccessfullyConfigured
    Copy to Clipboard Toggle word wrap

    The updated policy is successfully configured on all nodes in the cluster.

If you experience DNS connectivity issues when configuring nmstate in a disconnected environment, you can configure the DNS server to resolve the list of name servers for the domain root-servers.net.

Important

Ensure that the DNS server includes a name server (NS) entry for the root-servers.net zone. The DNS server does not need to forward a query to an upstream resolver, but the server must return a correct answer for the NS query.

2.2.1. Configuring the bind9 DNS named server

For a cluster configured to query a bind9 DNS server, you can add the root-servers.net zone to a configuration file that contains at least one NS record. For example you can use the /var/named/named.localhost as a zone file that already matches this criteria.

Procedure

  1. Add the root-servers.net zone at the end of the /etc/named.conf configuration file by running the following command:

    $ cat >> /etc/named.conf <<EOF
    zone "root-servers.net" IN {
        	type master;
        	file "named.localhost";
    };
    EOF
    Copy to Clipboard Toggle word wrap
  2. Restart the named service by running the following command:

    $ systemctl restart named
    Copy to Clipboard Toggle word wrap
  3. Confirm that the root-servers.net zone is present by running the following command:

    $ journalctl -u named|grep root-servers.net
    Copy to Clipboard Toggle word wrap

    Example output

    Jul 03 15:16:26 rhel-8-10 bash[xxxx]: zone root-servers.net/IN: loaded serial 0
    Jul 03 15:16:26 rhel-8-10 named[xxxx]: zone root-servers.net/IN: loaded serial 0
    Copy to Clipboard Toggle word wrap

  4. Verify that the DNS server can resolve the NS record for the root-servers.net domain by running the following command:

    $ host -t NS root-servers.net. 127.0.0.1
    Copy to Clipboard Toggle word wrap

    Example output

    Using domain server:
    Name: 127.0.0.1
    Address: 127.0.0.53
    Aliases:
    root-servers.net name server root-servers.net.
    Copy to Clipboard Toggle word wrap

2.2.2. Configuring the dnsmasq DNS server

If you are using dnsmasq as the DNS server, you can delegate resolution of the root-servers.net domain to another DNS server, for example, by creating a new configuration file that resolves root-servers.net using a DNS server that you specify.

  1. Create a configuration file that delegates the domain root-servers.net to another DNS server by running the following command:

    $ echo 'server=/root-servers.net/<DNS_server_IP>'> /etc/dnsmasq.d/delegate-root-servers.net.conf
    Copy to Clipboard Toggle word wrap
  2. Restart the dnsmasq service by running the following command:

    $ systemctl restart dnsmasq
    Copy to Clipboard Toggle word wrap
  3. Confirm that the root-servers.net domain is delegated to another DNS server by running the following command:

    $ journalctl -u dnsmasq|grep root-servers.net
    Copy to Clipboard Toggle word wrap

    Example output

    Jul 03 15:31:25 rhel-8-10 dnsmasq[1342]: using nameserver 192.168.1.1#53 for domain root-servers.net
    Copy to Clipboard Toggle word wrap

  4. Verify that the DNS server can resolve the NS record for the root-servers.net domain by running the following command:

    $ host -t NS root-servers.net. 127.0.0.1
    Copy to Clipboard Toggle word wrap

    Example output

    Using domain server:
    Name: 127.0.0.1
    Address: 127.0.0.1#53
    Aliases:
    root-servers.net name server root-servers.net.
    Copy to Clipboard Toggle word wrap

Volver arriba
Red Hat logoGithubredditYoutubeTwitter

Aprender

Pruebe, compre y venda

Comunidades

Acerca de la documentación de Red Hat

Ayudamos a los usuarios de Red Hat a innovar y alcanzar sus objetivos con nuestros productos y servicios con contenido en el que pueden confiar. Explore nuestras recientes actualizaciones.

Hacer que el código abierto sea más inclusivo

Red Hat se compromete a reemplazar el lenguaje problemático en nuestro código, documentación y propiedades web. Para más detalles, consulte el Blog de Red Hat.

Acerca de Red Hat

Ofrecemos soluciones reforzadas que facilitan a las empresas trabajar en plataformas y entornos, desde el centro de datos central hasta el perímetro de la red.

Theme

© 2025 Red Hat