B.8. PXE Boot (or DHCP) on Guest Failed
- Symptom
- A guest virtual machine starts successfully, but then cannot acquire an IP address from DHCP, cannot boot using the PXE protocol, or both. There are two common causes of this error: a long forward delay time set for the bridge, and an iptables package and kernel that do not support checksum mangling rules.
- Long forward delay time on bridge
- Investigation
- This is the most common cause of this error. If the guest network interface is connecting to a bridge device that has STP (Spanning Tree Protocol) enabled, as well as a long forward delay set, the bridge will not forward network packets from the guest virtual machine onto the bridge until at least the configured forward delay, in seconds, has elapsed since the guest connected to the bridge. This delay allows the bridge time to watch traffic from the interface and determine the MAC addresses behind it, and prevent forwarding loops in the network topology.
If the forward delay is longer than the timeout of the guest's PXE or DHCP client, then the client's operation will fail, and the guest will either fail to boot (in the case of PXE) or fail to acquire an IP address (in the case of DHCP).
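The comparison described above can be sketched in shell. This is a minimal sketch, assuming a Linux host that exposes bridge settings under /sys/class/net/BRIDGE/bridge/, with forward_delay reported in hundredths of a second and stp_state set to 0 when STP is off. The function name bridge_delay_risk, the SYSFS_NET override, and the 10-second default client timeout are illustrative assumptions and are not part of libvirt.

```shell
#!/bin/sh
# Hypothetical helper (not part of libvirt): flag a bridge whose forward delay
# could outlast a guest's PXE or DHCP client timeout.
# Usage: bridge_delay_risk BRIDGE [CLIENT_TIMEOUT_SECONDS]
# Assumptions: sysfs layout /sys/class/net/BRIDGE/bridge/{stp_state,forward_delay},
# forward_delay in hundredths of a second, stp_state 0 meaning STP is off.
bridge_delay_risk() {
    br=$1
    timeout_s=${2:-10}                              # assumed client timeout, in seconds
    dir="${SYSFS_NET:-/sys/class/net}/$br/bridge"   # SYSFS_NET: override for testing

    if [ ! -r "$dir/stp_state" ] || [ ! -r "$dir/forward_delay" ]; then
        echo "unknown: $br does not look like a bridge device"
        return 2
    fi

    stp=$(cat "$dir/stp_state")
    delay_cs=$(cat "$dir/forward_delay")            # hundredths of a second

    if [ "$stp" -eq 0 ]; then
        echo "ok: STP is disabled on $br"
    elif [ "$delay_cs" -gt $((timeout_s * 100)) ]; then
        echo "risk: forward delay of $((delay_cs / 100))s exceeds ${timeout_s}s client timeout"
        return 1
    else
        echo "ok: forward delay of $((delay_cs / 100))s is within ${timeout_s}s client timeout"
    fi
}
```

For example, bridge_delay_risk virbr0 30 would compare the delay on virbr0 against a 30-second client timeout. The actual timeout depends on the guest's PXE firmware or DHCP client, so the second argument should be set accordingly.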
- Solution
- If this is the case, change the forward delay on the bridge to 0, or disable STP on the bridge.
Note
This solution applies only if the bridge is not used to connect multiple networks, but just to connect multiple endpoints to a single network (the most common use case for bridges used by libvirt).
If the guest has interfaces connecting to a libvirt-managed virtual network, edit the definition for the network, and restart it. For example, edit the default network with the following command:
# virsh net-edit default
Add the following attributes to the <bridge> element:
<bridge name='virbr0' delay='0' stp='on'/>
Note
delay='0' and stp='on' are the default settings for virtual networks, so this step is only necessary if the configuration has been modified from the default.
If the guest interface is connected to a host bridge that was configured outside of libvirt, change the delay setting. Add or edit the following lines in the /etc/sysconfig/network-scripts/ifcfg-name_of_bridge file to turn STP on with a 0 second delay:
STP=on
DELAY=0
After changing the configuration file, restart the bridge device:
/sbin/ifdown name_of_bridge
/sbin/ifup name_of_bridge
Note
If name_of_bridge is not the root bridge in the network, that bridge's delay will eventually reset to the delay time configured for the root bridge. In this case, the only solution is to disable STP completely on name_of_bridge.
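For a libvirt-managed network, the resulting settings can be confirmed by reading the bridge attributes back from the network definition. The following is a minimal sketch, assuming that the network XML is piped in on standard input (for example from virsh net-dumpxml), that the bridge element sits on a single line, and that its attributes are single-quoted. The helper names bridge_attr and bridge_delay_is_zero are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print one attribute of the <bridge> element from
# libvirt network XML read on stdin.
# Usage: virsh net-dumpxml default | bridge_attr delay
# Assumption: the element is on one line and its attributes are single-quoted.
bridge_attr() {
    sed -n "/<bridge[[:space:]]/s/.*[[:space:]]$1='\([^']*\)'.*/\1/p"
}

# Succeeds if the network XML on stdin has delay='0' on its <bridge> element.
bridge_delay_is_zero() {
    [ "$(bridge_attr delay)" = "0" ]
}
```

A possible use is virsh net-dumpxml default | bridge_delay_is_zero || echo "forward delay is not 0". Text matching with sed is only a rough check; an XML-aware tool would be more robust if the definition is formatted differently.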
- The iptables package and kernel do not support checksum mangling rules
- Investigation
- The warning message quoted in the last item below is only a problem if all four of the following conditions are true:
- The guest is using virtio network devices. If so, the configuration file will contain model type='virtio'.
- The host has the vhost-net module loaded. This is true if ls /dev/vhost-net does not return an empty result.
- The guest is attempting to get an IP address from a DHCP server that is running directly on the host.
- The iptables version on the host is older than 1.4.10. iptables 1.4.10 was the first version to add the libxt_CHECKSUM extension. This is the case if the following message appears in the libvirtd logs:
warning: Could not add rule to fixup DHCP response checksums on network default
warning: May need to update iptables package and kernel to support CHECKSUM rule.
Important
Unless all of the other three conditions in this list are also true, the above warning message can be disregarded, and is not an indicator of any other problems.
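Three of the four conditions can be checked from the host with a few small shell helpers. This is a minimal sketch: the function names are hypothetical, version_lt assumes a sort that supports the -V option (as GNU coreutils does), and the grep for model type='virtio' is a coarse text match on the guest XML. Whether the DHCP server runs directly on the host must still be confirmed by hand.

```shell
#!/bin/sh
# Hypothetical helpers for checking the conditions above; names are illustrative.

# True if version $1 is strictly older than version $2.
# Assumption: sort supports -V (version sort), as GNU coreutils does.
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Extract the version number from `iptables --version` output on stdin,
# for example "iptables v1.4.7" gives "1.4.7".
iptables_version() {
    sed -n 's/^iptables v\([0-9][0-9.]*\).*/\1/p'
}

# True if the guest XML on stdin contains model type='virtio'.
# Coarse check: it also matches virtio models on non-network devices.
uses_virtio() {
    grep -q "<model type='virtio'"
}

# Example wiring (needs virsh and iptables on the host; name_of_guest is a
# placeholder):
#   virsh dumpxml name_of_guest | uses_virtio && echo "virtio model in use"
#   [ -e /dev/vhost-net ] && echo "vhost-net available"
#   version_lt "$(iptables --version | iptables_version)" 1.4.10 &&
#       echo "iptables predates libxt_CHECKSUM"
```

The helpers read their input from standard input or arguments so that each check can be run separately, and so that the parsing can be tried on saved output before it is run against a live host.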
When these conditions occur, UDP packets sent from the host to the guest have uncomputed checksums. This makes the host's UDP packets seem invalid to the guest's network stack.
- Solution
- To solve this problem, make any one of the four conditions above false. The best solution is to update the host iptables and kernel to iptables-1.4.10 or later where possible. Otherwise, the most specific fix is to disable the vhost-net driver for this particular guest. To do this, edit the guest configuration with this command:
virsh edit name_of_guest
Change or add a <driver> line to the <interface> section:
<interface type='network'>
  <model type='virtio'/>
  <driver name='qemu'/>
  ...
</interface>
Save the changes, shut down the guest, and then restart it.
If this problem is still not resolved, the issue may be due to a conflict between firewalld and the default libvirt network. To fix this, stop firewalld with the service firewalld stop command, then restart libvirt with the service libvirtd restart command.