11.2. Troubleshooting Hardware Introspection
The discovery and introspection process must run to completion. However, ironic's discovery daemon (ironic-inspector) times out after a default period of 1 hour if the discovery ramdisk provides no response. Sometimes this indicates a bug in the discovery ramdisk, but usually it results from an environment misconfiguration, particularly in the BIOS boot settings.
Here are some common scenarios where environment misconfiguration occurs and advice on how to diagnose and resolve them.
Errors with Starting Node Introspection
Normally the introspection process uses the baremetal introspection command, which acts as an umbrella command for ironic's services. However, if you run the introspection directly with ironic-inspector, it might fail to discover nodes in the AVAILABLE state, which is meant for deployment and not for discovery. Change the node status to the MANAGEABLE state before discovery:
$ ironic node-set-provision-state [NODE UUID] manage
Then, when discovery completes, change back to AVAILABLE before provisioning:
$ ironic node-set-provision-state [NODE UUID] provide
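The two state changes above can be wrapped around a discovery run. The following is a minimal sketch, not a supported tool: the function name with_manageable is hypothetical, and it assumes the ironic client shown above is on the path. State transitions might not complete instantly, so a real script would probably need to wait for the node to reach each state.

```shell
# Hypothetical helper: run a discovery command while the node is MANAGEABLE.
# Usage: with_manageable NODE_UUID COMMAND [ARGS...]
with_manageable() {
    node_uuid="$1"
    shift
    # AVAILABLE -> MANAGEABLE, so the node can be discovered.
    ironic node-set-provision-state "$node_uuid" manage || return 1
    # Run the caller's discovery command. On failure, stop here and
    # leave the node in MANAGEABLE for further debugging.
    "$@" || return 1
    # MANAGEABLE -> AVAILABLE, so the node can be provisioned.
    ironic node-set-provision-state "$node_uuid" provide
}
```

The node is returned to AVAILABLE only when the wrapped command succeeds, which keeps a failed node out of the deployment pool.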
Introspected Node Is Not Booting in PXE
Before a node reboots, ironic-inspector adds the MAC address of the node to the ironic-inspector chain of the Undercloud firewall. This allows the node to boot over PXE. To verify the correct configuration, run the following command:
$ sudo iptables -L
The output should display the following chain table with the MAC address:
Chain ironic-inspector (1 references)
target     prot opt source               destination
DROP       all  --  anywhere             anywhere             MAC xx:xx:xx:xx:xx:xx
ACCEPT     all  --  anywhere             anywhere
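On a busy Undercloud the listing can be long, so it can help to search it for one address. The following is a minimal sketch; the function name mac_in_listing and the example MAC address are hypothetical.

```shell
# Hypothetical helper: succeed if the given MAC address appears in an
# iptables listing read from standard input. The match ignores case
# because the listing and the caller may use different letter case.
mac_in_listing() {
    grep -i -q -F -- "$1"
}

# Example (not run here):
#   sudo iptables -L ironic-inspector | mac_in_listing 52:54:00:12:34:56 \
#       && echo "MAC address is in the chain"
```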
If the MAC address is not there, the most common cause is corruption in the ironic-inspector cache, which is stored in an SQLite database. To fix it, delete the SQLite file:
$ sudo rm /var/lib/ironic-inspector/inspector.sqlite
And recreate it:
$ sudo ironic-inspector-dbsync --config-file /etc/ironic-inspector/inspector.conf upgrade
$ sudo systemctl restart openstack-ironic-inspector
Stopping the Discovery Process
Currently ironic-inspector does not provide a direct means of stopping discovery. The recommended path is to wait until the process times out. If necessary, change the timeout setting in /etc/ironic-inspector/inspector.conf to set a different timeout period, in seconds.
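Before editing the file, it can be useful to see which value is currently in effect. The following is a minimal sketch that assumes the timeout setting lives in the [DEFAULT] section of the file, which the text above does not state; the function name inspector_timeout is hypothetical.

```shell
# Hypothetical helper: print the uncommented "timeout" value from the
# [DEFAULT] section of an inspector.conf-style file.
inspector_timeout() {
    conf="${1:-/etc/ironic-inspector/inspector.conf}"
    awk '
        # Track whether the current INI section is [DEFAULT].
        /^[[:space:]]*\[.*\]/ { in_default = ($0 ~ /\[DEFAULT\]/); next }
        # First uncommented "timeout = N" line in [DEFAULT]: print N.
        in_default && /^[[:space:]]*timeout[[:space:]]*=/ {
            sub(/^[^=]*=[[:space:]]*/, "")
            print
            exit
        }
    ' "$conf"
}
```

If the function prints nothing, the setting is commented out or absent, and the default applies.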
In a worst-case scenario, you can stop discovery for all nodes using the following process:
Procedure 11.3. Stopping the Discovery Process
- Change the power state of each node to off:
$ ironic node-set-power-state [NODE UUID] off
- Remove the ironic-inspector cache and restart the service:
$ sudo rm /var/lib/ironic-inspector/inspector.sqlite
$ sudo systemctl restart openstack-ironic-inspector
- Resynchronize the ironic-inspector cache:
$ sudo ironic-inspector-dbsync --config-file /etc/ironic-inspector/inspector.conf upgrade
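Because this procedure powers off nodes and deletes the cache, it can be worth previewing the commands first. The following is a minimal sketch of the three steps with a dry-run default. The names run, stop_discovery, and DRY_RUN are hypothetical, and sudo is used for the cache removal as in the earlier cache fix.

```shell
# DRY_RUN=1 (the default here) only prints each command.
# Set DRY_RUN=0 to execute the commands for real.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Usage: stop_discovery NODE_UUID [NODE_UUID...]
stop_discovery() {
    # Step 1: power off every node passed as an argument.
    for node_uuid in "$@"; do
        run ironic node-set-power-state "$node_uuid" off || return 1
    done
    # Step 2: remove the ironic-inspector cache and restart the service.
    run sudo rm /var/lib/ironic-inspector/inspector.sqlite || return 1
    run sudo systemctl restart openstack-ironic-inspector || return 1
    # Step 3: resynchronize the cache.
    run sudo ironic-inspector-dbsync \
        --config-file /etc/ironic-inspector/inspector.conf upgrade
}
```

With the default setting, the function prints one line per command, prefixed with "+", and changes nothing on the system.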
Accessing the Introspection Ramdisk
The introspection ramdisk uses a dynamic login element. This means you can provide either a temporary password or an SSH key to access the node during introspection debugging. Use the following process to set up ramdisk access:
- Provide a temporary password to the openssl passwd -1 command to generate an MD5 hash. For example:
$ openssl passwd -1 mytestpassword
$1$enjRSyIw$/fYUpJwr6abFy/d.koRgQ/
- Edit the /httpboot/inspector.ipxe file, find the line starting with kernel, and append the rootpwd parameter and the MD5 hash. For example:
kernel http://192.2.0.1:8088/agent.kernel ipa-inspection-callback-url=http://192.168.0.1:5050/v1/continue ipa-inspection-collectors=default,extra-hardware,logs systemd.journald.forward_to_console=yes BOOTIF=${mac} ipa-debug=1 ipa-inspection-benchmarks=cpu,mem,disk rootpwd="$1$enjRSyIw$/fYUpJwr6abFy/d.koRgQ/" selinux=0
Alternatively, you can append the sshkey parameter with your public SSH key.
Note
Quotation marks are required for both the rootpwd and sshkey parameters.
- Start the introspection and find the IP address from either the arp command or the DHCP logs:
$ arp
$ sudo journalctl -u openstack-ironic-inspector-dnsmasq
- SSH as a root user with the temporary password or the SSH key.
$ ssh root@192.0.2.105
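The edit to the kernel line can also be previewed from the command line. The following is a minimal sketch that prints the file with the quoted rootpwd parameter appended to the kernel line and leaves the file itself untouched. The function name add_rootpwd is hypothetical, and the sketch handles only MD5 hashes, not the sshkey parameter.

```shell
# Hypothetical helper: print FILE with rootpwd="HASH" appended to the
# line starting with "kernel ". The file itself is not modified.
# Usage: add_rootpwd FILE HASH
add_rootpwd() {
    ipxe_file="$1"
    hash="$2"
    # "|" is the sed delimiter because MD5 crypt hashes contain "/".
    sed "/^kernel /s|\$| rootpwd=\"${hash}\"|" "$ipxe_file"
}

# Example (not run here); single quotes keep the shell from expanding "$":
#   add_rootpwd /httpboot/inspector.ipxe '$1$enjRSyIw$/fYUpJwr6abFy/d.koRgQ/'
```

Review the printed result before copying it over the original file.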
Checking the Introspection Storage
The director uses OpenStack Object Storage (swift) to save the hardware data obtained during the introspection process. If this service is not running, the introspection can fail. Check all services related to OpenStack Object Storage to ensure the service is running:
$ sudo systemctl list-units 'openstack-swift*'
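To pick out only the units that need attention, the listing can be filtered. The following is a minimal sketch; the function name inactive_units is hypothetical, and it assumes the listing is produced with the --plain and --no-legend options so that each line starts with the unit name.

```shell
# Hypothetical helper: read "systemctl list-units" output from standard
# input and print the name of every unit whose ACTIVE column is not
# "active". Columns are: UNIT LOAD ACTIVE SUB DESCRIPTION.
inactive_units() {
    awk 'NF >= 3 && $3 != "active" { print $1 }'
}

# Example (not run here):
#   sudo systemctl list-units --all --plain --no-legend 'openstack-swift*' \
#       | inactive_units
```

No output means that every listed unit reports an active state.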