11.3. Troubleshooting node network configuration
If the node network configuration encounters an issue, the Policy is automatically rolled back and the Enactments report failure. This includes issues such as:
- The configuration fails to be applied on the host.
- The host loses connection to the default gateway.
- The host loses connection to the API server.
11.3.1. Troubleshooting an incorrect NodeNetworkConfigurationPolicy configuration
You can apply changes to the node network configuration across your entire cluster by applying a NodeNetworkConfigurationPolicy. If you apply an incorrect configuration, you can use the following example to troubleshoot and correct the failed network Policy.
In this example, a Linux bridge Policy is applied to an example cluster that has 3 master nodes and 3 worker nodes. The Policy fails to be applied because it references an incorrect interface. To find the error, investigate the available nmstate resources. You can then update the Policy with the correct configuration.
Procedure
Create a Policy and apply it to your cluster. The following example creates a simple bridge on the
ens01
interface:apiVersion: nmstate.io/v1alpha1 kind: NodeNetworkConfigurationPolicy metadata: name: ens01-bridge-testfail spec: desiredState: interfaces: - name: br1 description: Linux bridge with the wrong port type: linux-bridge state: up ipv4: dhcp: true enabled: true bridge: options: stp: enabled: false port: - name: ens01
$ oc apply -f ens01-bridge-testfail.yaml
Example output
nodenetworkconfigurationpolicy.nmstate.io/ens01-bridge-testfail created
Verify the status of the Policy by running the following command:
$ oc get nncp
The output shows that the Policy failed:
Example output
NAME STATUS ens01-bridge-testfail FailedToConfigure
However the Policy status alone does not indicate if it failed on all nodes or a subset of nodes.
List the Enactments to see if the Policy was successful on any of the nodes. If the Policy failed for only a subset it suggests the problem is with specific node configuration; if the Policy failed on all nodes it suggest the problem is with the Policy.
$ oc get nnce
The output shows that the Policy failed on all nodes:
Example output
NAME STATUS master-1.ens01-bridge-testfail FailedToConfigure master-2.ens01-bridge-testfail FailedToConfigure master-3.ens01-bridge-testfail FailedToConfigure worker-1.ens01-bridge-testfail FailedToConfigure worker-2.ens01-bridge-testfail FailedToConfigure worker-3.ens01-bridge-testfail FailedToConfigure
View one of the failed Enactments and look at the traceback. The following command uses the output tool
jsonpath
to filter the output:$ oc get nnce worker-1.ens01-bridge-testfail -o jsonpath='{.status.conditions[?(@.type=="Failing")].message}'
This command returns a large traceback that has been edited for brevity:
Example output
error reconciling NodeNetworkConfigurationPolicy at desired state apply: , failed to execute nmstatectl set --no-commit --timeout 480: 'exit status 1' '' ... libnmstate.error.NmstateVerificationError: desired ======= --- name: br1 type: linux-bridge state: up bridge: options: group-forward-mask: 0 mac-ageing-time: 300 multicast-snooping: true stp: enabled: false forward-delay: 15 hello-time: 2 max-age: 20 priority: 32768 port: - name: ens01 description: Linux bridge with the wrong port ipv4: address: [] auto-dns: true auto-gateway: true auto-routes: true dhcp: true enabled: true ipv6: enabled: false mac-address: 01-23-45-67-89-AB mtu: 1500 current ======= --- name: br1 type: linux-bridge state: up bridge: options: group-forward-mask: 0 mac-ageing-time: 300 multicast-snooping: true stp: enabled: false forward-delay: 15 hello-time: 2 max-age: 20 priority: 32768 port: [] description: Linux bridge with the wrong port ipv4: address: [] auto-dns: true auto-gateway: true auto-routes: true dhcp: true enabled: true ipv6: enabled: false mac-address: 01-23-45-67-89-AB mtu: 1500 difference ========== --- desired +++ current @@ -13,8 +13,7 @@ hello-time: 2 max-age: 20 priority: 32768 - port: - - name: ens01 + port: [] description: Linux bridge with the wrong port ipv4: address: [] line 651, in _assert_interfaces_equal\n current_state.interfaces[ifname],\nlibnmstate.error.NmstateVerificationError:
The
NmstateVerificationError
lists thedesired
Policy configuration, thecurrent
configuration of the Policy on the node, and thedifference
highlighting the parameters that do not match. In this example, theport
is included in thedifference
, which suggests that the problem is the port configuration in the Policy.To ensure that the Policy is configured properly, view the network configuration for one or all of the nodes by requesting the
NodeNetworkState
. The following command returns the network configuration for themaster-1
node:$ oc get nns master-1 -o yaml
The output shows that the interface name on the nodes is
ens1
but the failed Policy incorrectly usesens01
:Example output
- ipv4: ... name: ens1 state: up type: ethernet
Correct the error by editing the existing Policy:
$ oc edit nncp ens01-bridge-testfail
... port: - name: ens1
Save the Policy to apply the correction.
Check the status of the Policy to ensure it updated successfully:
$ oc get nncp
Example output
NAME STATUS ens01-bridge-testfail SuccessfullyConfigured
The updated Policy is successfully configured on all nodes in the cluster.