Chapter 8. Troubleshooting the Bare Metal Service

The following sections contain information and steps that may be useful for diagnosing issues in a setup with the Bare Metal service enabled.

8.1. PXE Boot Errors

Permission Denied Errors

If you are getting a permission denied error on the console of your Bare Metal service node, make sure you have applied the appropriate SELinux context to the /httpboot and /tftpboot directories as follows:

# semanage fcontext -a -t httpd_sys_content_t "/httpboot(/.*)?"
# restorecon -r -v /httpboot
# semanage fcontext -a -t tftpdir_t "/tftpboot(/.*)?"
# restorecon -r -v /tftpboot
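
To confirm that the contexts were applied, you can compare the SELinux type on each directory with the expected type. The following is a minimal sketch, not part of the Bare Metal service: selinux_type and has_selinux_type are hypothetical helper names, and the sketch assumes that GNU stat is available so that stat -c %C prints the context of a path.

```shell
# Hypothetical helpers for checking the SELinux type of a directory.

# selinux_type CONTEXT: print the type field (third field) of a context
# string such as system_u:object_r:httpd_sys_content_t:s0.
selinux_type() {
    printf '%s\n' "$1" | cut -d: -f3
}

# has_selinux_type CONTEXT EXPECTED_TYPE: succeed if the types match.
has_selinux_type() {
    [ "$(selinux_type "$1")" = "$2" ]
}

# Example usage on the Bare Metal service node (assumes GNU stat):
#   has_selinux_type "$(stat -c %C /httpboot)" httpd_sys_content_t || echo "relabel /httpboot"
#   has_selinux_type "$(stat -c %C /tftpboot)" tftpdir_t || echo "relabel /tftpboot"
```

If a check reports a mismatch, run the semanage and restorecon commands above again for that directory.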

Boot Process Freezes at /pxelinux.cfg/XX-XX-XX-XX-XX-XX

If the console of your node shows that the node obtains an IP address and then the boot process stops, as in the following figure, you might be using the wrong PXE boot template in your ironic.conf file.

Figure: PXE Process Freezes During Boot

Check that the pxe_config_template parameter points to ipxe_config.template:

$ grep ^pxe_config_template ironic.conf
pxe_config_template=$pybasedir/drivers/modules/ipxe_config.template

The default template is pxe_config.template, so it is easy to overlook the leading i that is required in ipxe_config.template.
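
The template check can also be scripted. The following is a minimal sketch under stated assumptions: pxe_template_name and check_pxe_template are hypothetical helper names, the configuration file is passed as an argument, and the sketch does not parse INI sections, it only reads the last uncommented pxe_config_template line.

```shell
# Hypothetical helpers for inspecting pxe_config_template in ironic.conf.

# pxe_template_name CONF_FILE: print the file name of the configured
# template, or nothing if the parameter is not set.
pxe_template_name() {
    sed -n 's/^pxe_config_template[[:space:]]*=[[:space:]]*//p' "$1" |
        tail -n 1 |
        sed 's|.*/||'
}

# check_pxe_template CONF_FILE: report whether the iPXE template is in use.
check_pxe_template() {
    template_name=$(pxe_template_name "$1")
    case $template_name in
        ipxe_config.template) echo "ok: $template_name" ;;
        "") echo "not set: the default pxe_config.template applies" ;;
        *) echo "check: $template_name" ;;
    esac
}

# Example usage:
#   check_pxe_template /etc/ironic/ironic.conf
```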

8.2. Login Errors After the Bare Metal Node Boots

If you cannot log in at the login prompt on the console of the node with the root password that you set in the configuration steps, the node has not booted into the deployed image. It is probably stuck in the deploy-kernel/deploy-ramdisk image and has not yet received the correct image.

To fix this issue, check the PXE boot configuration file, /httpboot/pxelinux.cfg/MAC_ADDRESS, on the Compute or Bare Metal service node and ensure that all the IP addresses listed in this file correspond to IP addresses on the Bare Metal network.

Note

The only network the Bare Metal service node knows about is the Bare Metal network. If one of the endpoints is not on the network, the endpoint will not be able to reach the Bare Metal service node as a part of the boot process.

For example, the kernel line in your file is as follows:

kernel http://192.168.200.2:8088/5a6cdbe3-2c90-4a90-b3c6-85b449b30512/deploy_kernel selinux=0 disk=cciss/c0d0,sda,hda,vda iscsi_target_iqn=iqn.2008-10.org.openstack:5a6cdbe3-2c90-4a90-b3c6-85b449b30512 deployment_id=5a6cdbe3-2c90-4a90-b3c6-85b449b30512 deployment_key=VWDYDVVEFCQJNOSTO9R67HKUXUGP77CK ironic_api_url=http://192.168.200.2:6385 troubleshoot=0 text nofb nomodeset vga=normal boot_option=netboot ip=${ip}:${next-server}:${gateway}:${netmask} BOOTIF=${mac}  ipa-api-url=http://192.168.200.2:6385 ipa-driver-name=pxe_ipmitool boot_mode=bios initrd=deploy_ramdisk coreos.configdrive=0 || goto deploy

Value in the above example kernel line
    Corresponding information

http://192.168.200.2:8088
    Parameter http_url in the /etc/ironic/ironic.conf file. This IP address must be on the Bare Metal network.

5a6cdbe3-2c90-4a90-b3c6-85b449b30512
    UUID of the Bare Metal node, as listed by openstack baremetal node list.

deploy_kernel
    The deploy kernel image in the Image service that is copied down as /httpboot/<NODE_UUID>/deploy_kernel.

http://192.168.200.2:6385
    Parameter api_url in the /etc/ironic/ironic.conf file. This IP address must be on the Bare Metal network.

pxe_ipmitool
    The IPMI driver in use by the Bare Metal service for this node.

deploy_ramdisk
    The deploy ramdisk image in the Image service that is copied down as /httpboot/<NODE_UUID>/deploy_ramdisk.

If a value in the /httpboot/pxelinux.cfg/MAC_ADDRESS file does not match the corresponding value in the ironic.conf file:

  1. Update the value in the ironic.conf file.
  2. Restart the Bare Metal service.
  3. Re-deploy the Bare Metal instance.
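
The comparison between the PXE boot configuration file and ironic.conf can be sketched as a small script. This is an illustration only: the function names are hypothetical, the sketch ignores INI sections and reads the last uncommented http_url and api_url lines wherever they appear, and it assumes that the kernel line has the same layout as the example above.

```shell
# Hypothetical helpers for comparing a PXE config file with ironic.conf.

# pxe_http_url PXE_FILE: print scheme://host:port of the kernel URL.
pxe_http_url() {
    sed -n 's|^[[:space:]]*kernel[[:space:]]\{1,\}\(https\{0,1\}://[^/]*\)/.*|\1|p' "$1" |
        head -n 1
}

# pxe_api_url PXE_FILE: print the value of the ipa-api-url kernel argument.
pxe_api_url() {
    sed -n 's|.*[[:space:]]ipa-api-url=\([^[:space:]]*\).*|\1|p' "$1" | head -n 1
}

# conf_value CONF_FILE KEY: print the value of the last uncommented KEY line.
conf_value() {
    sed -n "s|^$2[[:space:]]*=[[:space:]]*||p" "$1" | tail -n 1
}

# compare_pxe_to_conf PXE_FILE CONF_FILE: report mismatched values and
# return a non-zero status if any value differs.
compare_pxe_to_conf() {
    rc=0
    if [ "$(pxe_http_url "$1")" != "$(conf_value "$2" http_url)" ]; then
        echo "http_url mismatch"
        rc=1
    fi
    if [ "$(pxe_api_url "$1")" != "$(conf_value "$2" api_url)" ]; then
        echo "api_url mismatch"
        rc=1
    fi
    return $rc
}

# Example usage:
#   compare_pxe_to_conf /httpboot/pxelinux.cfg/MAC_ADDRESS /etc/ironic/ironic.conf
```

A reported mismatch only tells you which value to inspect. Whether the address is on the Bare Metal network still needs to be checked against your network configuration.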

8.3. Boot-to-Disk Errors on Deployed Nodes

With certain hardware, you might experience a problem with deployed nodes where the nodes cannot boot from disk during successive boot operations as part of a deployment. This usually happens because the BMC does not honor the persistent boot settings that director requests on the nodes. Instead, the nodes boot from a PXE target.

In this case, you must update the boot order in the BIOS of the nodes. Set the HDD to be the first boot device, and then PXE as a later option, so that the nodes boot from disk by default, but can boot from the network during introspection or deployment as necessary.

Note

This error mostly applies to nodes that use Legacy BIOS firmware.

8.4. The Bare Metal Service Is Not Getting the Right Hostname

If the Bare Metal service is not getting the right hostname, cloud-init is failing. To fix this, connect the Bare Metal subnet to a router in the OpenStack Networking service so that requests to the metadata agent are routed correctly.

8.5. Invalid OpenStack Identity Service Credentials When Executing Bare Metal Service Commands

If you cannot authenticate to the Identity service, check the identity_uri parameter in the ironic.conf file and make sure that it does not include the /v2.0 suffix from the keystone AdminURL. For example, identity_uri should be set to http://IP:PORT.
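
As an illustration of the expected value, the following sketch removes a trailing /v2.0 from a URL. The function name strip_identity_version is hypothetical, and the address and port in the usage comment are placeholders, not values from your deployment.

```shell
# Hypothetical helper: derive an identity_uri value from an admin URL.

# strip_identity_version URL: remove a trailing /v2.0 or /v2.0/ from URL
# and print the result. URLs without that suffix are printed unchanged.
strip_identity_version() {
    printf '%s\n' "$1" | sed 's|/v2\.0/\{0,1\}$||'
}

# Example usage (placeholder address and port):
#   strip_identity_version http://192.0.2.10:35357/v2.0
```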

8.6. Hardware Enrollment

Issues with enrolled hardware can be caused by incorrect node registration details. Ensure that property names and values have been entered correctly. Incorrect or mistyped property names are added to the node’s details without error, but are ignored.

To correct a node’s details, update the affected property. This example sets the amount of memory the node is registered to use to 2 GB:

$ openstack baremetal node set --property memory_mb=2048 NODE_UUID
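
Because mistyped property names are accepted silently, it can help to check the names before you set them. The following is a minimal sketch: check_property_names is a hypothetical helper, and the list of names it accepts (memory_mb, cpus, cpu_arch, local_gb, capabilities) is an assumption about the properties used in your environment, so adjust it as needed.

```shell
# Hypothetical helper: flag property names that are not in an expected list.
# The accepted names below are an assumption, not an authoritative list.

# check_property_names NAME...: print each unrecognized name and return a
# non-zero status if any name is unrecognized.
check_property_names() {
    unknown=0
    for property_name in "$@"; do
        case $property_name in
            memory_mb|cpus|cpu_arch|local_gb|capabilities) ;;
            *)
                echo "unrecognized property name: $property_name"
                unknown=1
                ;;
        esac
    done
    return $unknown
}

# Example usage:
#   check_property_names memory_mb cpus && echo "names look correct"
```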

8.7. No Valid Host Errors

If the Compute scheduler cannot find a suitable Bare Metal node on which to boot an instance, a NoValidHost error appears in /var/log/nova/nova-conductor.log or immediately upon launch failure in the dashboard. This is usually caused by a mismatch between the resources Compute expects and the resources the Bare Metal node provides.

  1. Check the hypervisor resources that are available:

    $ openstack hypervisor stats show

    The resources reported here should match the resources that the Bare Metal nodes provide.

  2. Check that Compute recognizes the Bare Metal nodes as hypervisors:

    $ openstack hypervisor list

    The nodes, identified by UUID, should appear in the list.

  3. Check the details for a Bare Metal node:

    $ openstack baremetal node list
    $ openstack baremetal node show NODE_UUID

    Verify that the node’s details match those reported by Compute.

  4. Check that the selected flavor does not exceed the available resources of the Bare Metal nodes:

    $ openstack flavor show FLAVOR_NAME

  5. Check the output of openstack baremetal node list to ensure that Bare Metal nodes are not in maintenance mode. Remove maintenance mode if necessary:

    $ openstack baremetal node maintenance unset NODE_UUID

  6. Check the output of openstack baremetal node list to ensure that Bare Metal nodes are in an available state. Move the node to available if necessary:

    $ openstack baremetal node provide NODE_UUID
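
The flavor check in step 4 amounts to three numeric comparisons. The following sketch shows them, assuming that you have already read the ram, vcpus, and disk values from openstack flavor show and the memory_mb, cpus, and local_gb properties from openstack baremetal node show. The function name flavor_fits_node is hypothetical, and all values are passed in by hand.

```shell
# Hypothetical helper: check that a flavor does not exceed a node's resources.

# flavor_fits_node FLAVOR_RAM_MB FLAVOR_VCPUS FLAVOR_DISK_GB \
#                  NODE_MEMORY_MB NODE_CPUS NODE_LOCAL_GB
# Succeeds only if every flavor value is less than or equal to the
# corresponding node value.
flavor_fits_node() {
    [ "$1" -le "$4" ] && [ "$2" -le "$5" ] && [ "$3" -le "$6" ]
}

# Example usage with placeholder values:
#   flavor_fits_node 2048 2 20 4096 4 40 || echo "flavor exceeds node resources"
```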