Chapter 4. Debugging recommendations and known issues
Review the following section for debugging suggestions that can help you troubleshoot your deployment.
4.1. Known issues
The following list outlines current limitations.
- BZ#1857451 - Ansible forks value should have an upper limit and Current Calculation needs to change
- By default, the Ansible playbooks in mistral are configured to use 10*CPU_COUNT forks in the ansible.cfg file. When you do not use the --limit option to restrict the Ansible execution to a specific node or set of nodes, and the Ansible execution runs on all of the existing nodes, Ansible consumes almost 100% of memory.
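To make the calculation concrete, the following is a minimal sketch of the 10*CPU_COUNT default, clamped to an upper limit. The function name capped_forks and the cap value are assumptions for illustration only; director does not define them.

```shell
# Hypothetical helper (not part of director): compute the default
# forks value described above (10 * CPU count) and clamp it to an
# assumed upper limit passed as the second argument.
capped_forks() {
    cpu_count=$1
    max_forks=$2   # assumed cap; choose a value that suits your undercloud memory
    forks=$((10 * cpu_count))
    if [ "$forks" -gt "$max_forks" ]; then
        forks=$max_forks
    fi
    echo "$forks"
}
```

For example, capped_forks "$(nproc)" 100 prints the value that you might then set for forks in ansible.cfg, assuming that 100 is an acceptable ceiling for your environment.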
4.2. Introspection debugging
Review the following list of recommendations when you debug introspection.
- Check your introspection DHCP range and NICs in your undercloud.conf file
- If any of these values are incorrect, fix them, and rerun the openstack undercloud install command.
- Ensure that you do not try to introspect more than your DHCP range of nodes can allow
- The DHCP lease for each node continues to be active for approximately two minutes after introspection finishes.
- Ensure that target nodes are responsive
- If all nodes fail introspection, ensure that you can ping target nodes over the native VLAN by using the configured NIC and that the out-of-band interface credentials and addresses are correct.
- Check the introspection commands in the console
- For debugging specific nodes, watch the console when the node boots and observe introspection commands to the node. If the node stops before it completes the PXE process, check the connectivity, IP allocation, and the network load. When a node exits the BIOS and boots the introspection image, failures are rare and almost exclusively related to connectivity issues. Ensure that the heartbeat from the introspection image is not interrupted on its way to the undercloud.
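As a rough way to check the DHCP range recommendation above, the following sketch counts the addresses in an IPv4 range and compares that count with the number of nodes that you plan to introspect at the same time. The function names are hypothetical, and the start and end addresses are passed as arguments rather than read from undercloud.conf. Because leases stay active for approximately two minutes after introspection finishes, it can be worth leaving some headroom.

```shell
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address into four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Print the number of addresses in the inclusive range START..END.
dhcp_range_size() {
    start=$(ip_to_int "$1")
    end=$(ip_to_int "$2")
    echo $(( end - start + 1 ))
}

# Succeed (exit 0) when NODE_COUNT fits in the range START..END.
range_fits_nodes() {
    size=$(dhcp_range_size "$1" "$2")
    [ "$3" -le "$size" ]
}
```

For example, range_fits_nodes 192.168.24.100 192.168.24.120 30 fails, because that range holds only 21 addresses. The addresses here are illustrative values, not defaults taken from this guide.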
4.3. Deployment debugging
Use the following recommendations when you debug a deployment.
- Inspect the DHCP servers that provide addresses on the provisioning network
Any additional DHCP servers that supply addresses on the provisioning network can prevent Red Hat OpenStack Platform director from inspecting and provisioning machines.
For DHCP or PXE introspection issues, enter the following command:
$ sudo tcpdump -i any port 67 or port 68 or port 69
For DHCP or PXE deployment issues, enter the following command:
$ sudo ip netns exec qdhcp tcpdump -i <interface> port 67 or port 68 or port 69
- Check the state of your failed or foreign disks
- For failed or foreign disks, check the state of your disks to ensure that, according to the out-of-band management of the machine, the state of the failed or foreign disks is set to Up. Disks can exit the Up state during a deployment cycle and change the order that your disks appear in the base operating system.
- Use the following commands to debug failed overcloud deployments
- openstack stack failures list overcloud
- heat resource-list -n5 overcloud | grep -i fail
- less /var/lib/mistral/config-download-latest/ansible.log
To review the output of the commands, log in to the node where the failure occurs and review the log files in /var/log/ and /var/log/containers/.
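When the Ansible log is long, it can help to narrow it to the failing tasks before you page through it. The following sketch assumes that failed tasks appear in the log with the usual Ansible markers "fatal:" or "FAILED!". The function name ansible_failures is hypothetical.

```shell
# Hypothetical helper: print the lines of an Ansible log that mark a
# failed task, prefixed with their line numbers. grep returns a
# non-zero exit status when no line matches.
ansible_failures() {
    grep -n -E 'fatal:|FAILED!' "$1"
}
```

For example, ansible_failures /var/lib/mistral/config-download-latest/ansible.log | less shows only the matching lines, and the line numbers indicate where to look in the full log for context.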