Chapter 8. Configuring PCI passthrough
You can use PCI passthrough to attach a physical PCI device, such as a graphics card or a network device, to an instance. If you use PCI passthrough for a device, the instance reserves exclusive access to the device for performing tasks, and the device is not available to the host.
Using PCI passthrough with routed provider networks
The Compute service does not support single networks that span multiple provider networks. When a network contains multiple physical networks, the Compute service only uses the first physical network. Therefore, if you are using routed provider networks, you must use the same physical_network name across all the Compute nodes.
If you use routed provider networks with VLAN or flat networks, you must use the same physical_network name for all segments. You then create multiple segments for the network and map the segments to the appropriate subnets.
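For illustration, creating an additional segment and mapping it to a subnet might look like the following sketch. The network, segment, and subnet names and the VLAN and subnet values are hypothetical examples, not values from this chapter:

```shell
$ openstack network segment create --network multisegment-net \
    --network-type vlan --segment 100 \
    --physical-network physnet1 segment-a
$ openstack subnet create --network multisegment-net \
    --network-segment segment-a \
    --subnet-range 192.0.2.0/24 subnet-a
```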
To enable your cloud users to create instances with PCI devices attached, you must complete the following:
- Designate Compute nodes for PCI passthrough.
- Configure the Compute nodes that have the required PCI devices for PCI passthrough.
- Deploy the overcloud.
- Create a flavor for launching instances with PCI devices attached.
Prerequisites
- The Compute nodes have the required PCI devices.
8.1. Designating Compute nodes for PCI passthrough
To designate Compute nodes for instances with physical PCI devices attached, you must create a new role file to configure the PCI passthrough role, and configure the bare metal nodes with a PCI passthrough resource class that you use to tag the Compute nodes for PCI passthrough.
The following procedure applies to new overcloud nodes that have not yet been provisioned. To assign a resource class to an existing overcloud node that has already been provisioned, you must use the scale down procedure to unprovision the node, then use the scale up procedure to reprovision the node with the new resource class assignment. For more information, see Scaling overcloud nodes.
Procedure
- Log in to the undercloud as the stack user.
- Source the stackrc file:

```shell
[stack@director ~]$ source ~/stackrc
```

- Generate a new roles data file named roles_data_pci_passthrough.yaml that includes the Controller, Compute, and ComputePCI roles, along with any other roles that you need for the overcloud:

```shell
(undercloud)$ openstack overcloud roles generate \
 -o /home/stack/templates/roles_data_pci_passthrough.yaml \
 Compute:ComputePCI Compute Controller
```

- Open
roles_data_pci_passthrough.yaml and edit or add the following parameters and sections:

| Section/Parameter | Current value | New value |
| --- | --- | --- |
| Role comment | Role: Compute | Role: ComputePCI |
| Role name | name: Compute | name: ComputePCI |
| description | Basic Compute Node role | PCI Passthrough Compute Node role |
| HostnameFormatDefault | %stackname%-novacompute-%index% | %stackname%-novacomputepci-%index% |
| deprecated_nic_config_name | compute.yaml | compute-pci-passthrough.yaml |

- Register the PCI passthrough Compute nodes for the overcloud by adding them to your node definition template, node.json or node.yaml. For more information, see Registering nodes for the overcloud in the Installing and managing Red Hat OpenStack Platform with director guide.
- Inspect the node hardware:
```shell
(undercloud)$ openstack overcloud node introspect \
 --all-manageable --provide
```

For more information, see Creating an inventory of the bare-metal node hardware in the Installing and managing Red Hat OpenStack Platform with director guide.
- Tag each bare metal node that you want to designate for PCI passthrough with a custom PCI passthrough resource class:

```shell
(undercloud)$ openstack baremetal node set \
 --resource-class baremetal.PCI-PASSTHROUGH <node>
```

Replace <node> with the ID of the bare metal node.
- Add the ComputePCI role to your node definition file, overcloud-baremetal-deploy.yaml, and define any predictive node placements, resource classes, network topologies, or other attributes that you want to assign to your nodes.
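A node definition entry for the new role might look like the following sketch. The node counts and the NIC template path are placeholder assumptions for illustration; only the role name and resource class come from this procedure:

```yaml
- name: Controller
  count: 3
- name: Compute
  count: 0
- name: ComputePCI
  count: 1
  defaults:
    resource_class: baremetal.PCI-PASSTHROUGH
    network_config:
      template: /home/stack/templates/nic-configs/compute-pci-passthrough.j2
```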
- You can reuse an existing network topology or create a new custom network interface template for the role. For more information, see Custom network interface templates in the Installing and managing Red Hat OpenStack Platform with director guide. If you do not define the network definitions by using the
network_config property, then the default network definitions are used.
For more information about the properties you can use to configure node attributes in your node definition file, see Bare metal node provisioning attributes. For an example node definition file, see Example node definition file.
Run the provisioning command to provision the new nodes for your role:
```shell
(undercloud)$ openstack overcloud node provision \
 --stack <stack> \
 [--network-config \]
 --output /home/stack/templates/overcloud-baremetal-deployed.yaml \
 /home/stack/templates/overcloud-baremetal-deploy.yaml
```
- Replace <stack> with the name of the stack for which the bare-metal nodes are provisioned. If not specified, the default is overcloud.
- Include the --network-config optional argument to provide the network definitions to the cli-overcloud-node-network-config.yaml Ansible playbook. If you do not define the network definitions by using the network_config property, then the default network definitions are used.
- Monitor the provisioning progress in a separate terminal. When provisioning is successful, the node state changes from available to active:

```shell
(undercloud)$ watch openstack baremetal node list
```

- If you did not run the provisioning command with the --network-config option, then configure the <Role>NetworkConfigTemplate parameters in your network-environment.yaml file to point to your NIC template files:

```yaml
parameter_defaults:
  ComputeNetworkConfigTemplate: /home/stack/templates/nic-configs/compute.j2
  ComputePCINetworkConfigTemplate: /home/stack/templates/nic-configs/<pci_passthrough_net_top>.j2
  ControllerNetworkConfigTemplate: /home/stack/templates/nic-configs/controller.j2
```

Replace <pci_passthrough_net_top> with the name of the file that contains the network topology of the ComputePCI role, for example, compute.yaml to use the default network topology.
8.2. Configuring a PCI passthrough Compute node
To enable your cloud users to create instances with PCI devices attached, you must configure both the Compute nodes that have the PCI devices and the Controller nodes.
Procedure
- Create an environment file to configure the Controller node on the overcloud for PCI passthrough, for example, pci_passthrough_controller.yaml.
- Add PciPassthroughFilter to the NovaSchedulerEnabledFilters parameter in pci_passthrough_controller.yaml.
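The NovaSchedulerEnabledFilters setting described above might look like the following sketch. The filters listed alongside PciPassthroughFilter are common scheduler defaults and are an assumption; your deployment's filter list may differ:

```yaml
parameter_defaults:
  NovaSchedulerEnabledFilters:
    - AvailabilityZoneFilter
    - ComputeFilter
    - ComputeCapabilitiesFilter
    - ImagePropertiesFilter
    - ServerGroupAntiAffinityFilter
    - ServerGroupAffinityFilter
    - PciPassthroughFilter
    - NUMATopologyFilter
```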
- To specify the PCI alias for the devices on the Controller node, add a nova::pci::aliases configuration to pci_passthrough_controller.yaml. For more information about configuring the device_type field, see PCI passthrough device type field.
Note: If the nova-api service is running in a role different from the Controller role, replace ControllerExtraConfig with the user role in the format <Role>ExtraConfig.
- Optional: To set a default NUMA affinity policy for PCI passthrough devices, add numa_policy to the nova::pci::aliases configuration from the previous step.
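Taken together, the Controller alias configuration with the optional numa_policy key might look like the following sketch. The alias name a1 and the vendor and product IDs 8086 and 1572 match the Intel X710 example used later in this chapter; the device_type and numa_policy values shown are illustrative assumptions:

```yaml
parameter_defaults:
  ControllerExtraConfig:
    nova::pci::aliases:
      - name: "a1"
        vendor_id: "8086"
        product_id: "1572"
        device_type: "type-PF"
        numa_policy: "preferred"
```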
- To configure the Compute node on the overcloud for PCI passthrough, create an environment file, for example, pci_passthrough_compute.yaml.
- To specify the available PCI devices on the Compute node, use the vendor_id and product_id options to add all matching PCI devices to the pool of PCI devices available for passthrough to instances. For example, you can add all Intel® Ethernet Controller X710 devices to the pool of PCI devices available for passthrough to instances in pci_passthrough_compute.yaml.
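The NovaPCIPassthrough configuration for the X710 example above might look like the following sketch; 8086:1572 is the vendor:product pair for this device, and the surrounding structure is an assumption based on the ComputePCI role used in this chapter:

```yaml
parameter_defaults:
  ComputePCIParameters:
    NovaPCIPassthrough:
      - vendor_id: "8086"
        product_id: "1572"
```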
For more information about how to configure NovaPCIPassthrough, see Guidelines for configuring NovaPCIPassthrough.
- You must create a copy of the PCI alias on the Compute node for instance migration and resize operations. To specify the PCI alias for the devices on the PCI passthrough Compute node, add the alias configuration to pci_passthrough_compute.yaml.
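The Compute-side copy of the alias might look like the following sketch. It must mirror the Controller alias exactly; the a1/8086/1572 values match the example used in this chapter, and the ComputePCIExtraConfig wrapper is an assumption based on the ComputePCI role name:

```yaml
parameter_defaults:
  ComputePCIExtraConfig:
    nova::pci::aliases:
      - name: "a1"
        vendor_id: "8086"
        product_id: "1572"
        device_type: "type-PF"
```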
Note: The Compute node aliases must be identical to the aliases on the Controller node. Therefore, if you added numa_policy to nova::pci::aliases in pci_passthrough_controller.yaml, you must also add it to nova::pci::aliases in pci_passthrough_compute.yaml.
- To enable IOMMU in the server BIOS of the Compute nodes to support PCI passthrough, add the KernelArgs parameter to pci_passthrough_compute.yaml. Use the following KernelArgs settings to enable an Intel IOMMU:
```yaml
parameter_defaults:
  ...
  ComputePCIParameters:
    KernelArgs: "intel_iommu=on iommu=pt vfio-pci.ids=<pci_device_id> rd.driver.pre=vfio-pci"
```

Replace <pci_device_id> with the PCI device ID, for example, 10de:1eb8.
Use the following KernelArgs settings to enable an AMD IOMMU:

```yaml
parameter_defaults:
  ...
  ComputePCIParameters:
    KernelArgs: "amd_iommu=on iommu=pt"
```

Note: When you first add the KernelArgs parameter to the configuration of a role, the overcloud nodes are automatically rebooted. If required, you can disable the automatic rebooting of nodes and instead perform node reboots manually after each deployment. For more information, see Configuring manual node reboot to define KernelArgs.
Add your custom environment files to the stack with your other environment files and deploy the overcloud:
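The deployment command might look like the following sketch. The roles data file and the two environment file names match the examples in this chapter; the placeholder stands in for any other environment files in your deployment:

```shell
(undercloud)$ openstack overcloud deploy --templates \
  -r /home/stack/templates/roles_data_pci_passthrough.yaml \
  -e [your environment files] \
  -e /home/stack/templates/pci_passthrough_controller.yaml \
  -e /home/stack/templates/pci_passthrough_compute.yaml
```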
- Create and configure the flavors that your cloud users can use to request the PCI devices. The following example requests two devices, each with a vendor ID of 8086 and a product ID of 1572, using the alias defined earlier in this procedure:

```shell
(overcloud)$ openstack flavor set \
 --property "pci_passthrough:alias"="a1:2" device_passthrough
```

- Optional: To override the default NUMA affinity policy for PCI passthrough devices, you can add the NUMA affinity policy property key to the flavor or the image:
To override the default NUMA affinity policy by using the flavor, add the hw:pci_numa_affinity_policy property key:

```shell
(overcloud)$ openstack flavor set \
 --property "hw:pci_numa_affinity_policy"="required" \
 device_passthrough
```

For more information about the valid values for hw:pci_numa_affinity_policy, see Flavor metadata.
To override the default NUMA affinity policy by using the image, add the hw_pci_numa_affinity_policy property key:

```shell
(overcloud)$ openstack image set \
 --property hw_pci_numa_affinity_policy=required \
 device_passthrough_image
```

Note: If you set the NUMA affinity policy on both the image and the flavor, the property values must match. The flavor setting takes precedence over the image and default settings. Therefore, the configuration of the NUMA affinity policy on the image only takes effect if the property is not set on the flavor.
Verification
- Create an instance with a PCI passthrough device:

```shell
$ openstack server create --flavor device_passthrough \
 --image <image> --wait test-pci
```

- Log in to the instance as a cloud user. For more information, see Connecting to an instance.
- To verify that the PCI device is accessible from the instance, enter the following command from the instance:

```shell
$ lspci -nn | grep <device_name>
```
8.3. PCI passthrough device type field
The Compute service categorizes PCI devices into one of three types, depending on the capabilities the devices report. The following lists the valid values that you can set the device_type field to:
- type-PF: The device supports SR-IOV and is the parent or root device. Specify this device type to pass through a device that supports SR-IOV in its entirety.
- type-VF: The device is a child device of a device that supports SR-IOV.
- type-PCI: The device does not support SR-IOV. This is the default device type if the device_type field is not set.
You must configure the Compute and Controller nodes with the same device_type.
8.4. Guidelines for configuring NovaPCIPassthrough
- Do not use the devname parameter when configuring PCI passthrough, as the device name of a NIC can change. Instead, use vendor_id and product_id because they are more stable, or use the address of the NIC.
- To pass through a specific Physical Function (PF), you can use the address parameter because the PCI address is unique to each device. Alternatively, you can use the product_id parameter to pass through a PF, but you must also specify the address of the PF if you have multiple PFs of the same type.
- To pass through all the Virtual Functions (VFs), specify only the product_id and vendor_id of the VFs that you want to use for PCI passthrough. You must also specify the address of the VF if you are using SR-IOV for NIC partitioning and you are running OVS on a VF.
- To pass through only the VFs for a PF but not the PF itself, you can use the address parameter to specify the PCI address of the PF and product_id to specify the product ID of the VF.
Configuring the address parameter
The address parameter specifies the PCI address of the device. You can set the value of the address parameter by using either a string or a dictionary mapping.
- String format
If you specify the address using a string, you can include wildcards (*), as shown in the following example:

```yaml
NovaPCIPassthrough:
  - address: "*:0a:00.*"
    physical_network: physnet1
```

- Dictionary format
If you specify the address using the dictionary format, you can include regular expression syntax.
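A dictionary-format address might look like the following sketch. The domain, bus, slot, and function values are illustrative regular expressions, and the physical_network name reuses the physnet1 example from the string format above:

```yaml
NovaPCIPassthrough:
  - address:
      domain: ".*"
      bus: "02"
      slot: "01"
      function: "[0-2]"
    physical_network: physnet1
```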
The Compute service restricts the configuration of address fields to the following maximum values:
- domain - 0xFFFF
- bus - 0xFF
- slot - 0x1F
- function - 0x7
The Compute service supports PCI devices with a 16-bit address domain. The Compute service ignores PCI devices with a 32-bit address domain.