11.4. Enabling PCI passthrough for a GPU device
You can use PCI passthrough to attach a physical PCI device, such as a graphics card, to an instance. If you use PCI passthrough for a device, the instance reserves exclusive access to the device for performing tasks, and the device is not available to the host.
Prerequisites
- The pciutils package is installed on the physical servers that have the PCI cards.
- The GPU driver is available to install on the GPU instances. For more information, see "Building a custom GPU overcloud image".
Procedure
To determine the vendor ID and product ID for each passthrough device type, run the following command on the physical server that has the PCI cards:
    # lspci -nn | grep -i <gpu_name>

For example, to determine the vendor and product ID for an NVIDIA GPU, run the following command:

    # lspci -nn | grep -i nvidia
    3b:00.0 3D controller [0302]: NVIDIA Corporation TU104GL [Tesla T4] [10de:1eb8] (rev a1)
    d8:00.0 3D controller [0302]: NVIDIA Corporation TU104GL [Tesla T4] [10de:1db4] (rev a1)

In each output line, the final bracketed pair, for example [10de:1eb8], is the vendor ID followed by the product ID.
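The vendor and product IDs can also be pulled out of the lspci output mechanically, which helps when several device types are present. The following is a minimal sketch, not part of the documented procedure: pci_ids is a hypothetical helper name, and the demo input is the sample output from this step rather than live data.

```shell
#!/bin/sh
# Hypothetical helper (not part of pciutils): read `lspci -nn` output on
# stdin and print "<slot> <vendor_id> <product_id>" for each device line.
pci_ids() {
  # The [xxxx:xxxx] pair on each line is vendor:product. The class code,
  # for example [0302], contains no colon, so it is not matched.
  sed -n -E 's/^([0-9a-fA-F:.]+) .*\[([0-9a-fA-F]{4}):([0-9a-fA-F]{4})\].*$/\1 \2 \3/p'
}

# Demo on the sample output from this step. On a real host you would run:
#   lspci -nn | grep -i nvidia | pci_ids
pci_ids <<'EOF'
3b:00.0 3D controller [0302]: NVIDIA Corporation TU104GL [Tesla T4] [10de:1eb8] (rev a1)
d8:00.0 3D controller [0302]: NVIDIA Corporation TU104GL [Tesla T4] [10de:1db4] (rev a1)
EOF
```

The second and third columns of each printed line are the values to use for vendor_id and product_id in the environment files.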
To configure the Controller node on the overcloud for PCI passthrough, create an environment file, for example, pci_passthru_controller.yaml.

Add PciPassthroughFilter to the NovaSchedulerDefaultFilters parameter in pci_passthru_controller.yaml:

    parameter_defaults:
      NovaSchedulerDefaultFilters: ['RetryFilter','AvailabilityZoneFilter','ComputeFilter','ComputeCapabilitiesFilter','ImagePropertiesFilter','ServerGroupAntiAffinityFilter','ServerGroupAffinityFilter','PciPassthroughFilter','NUMATopologyFilter']

To specify the PCI alias for the devices on the Controller node, add the following to pci_passthru_controller.yaml:

      ControllerExtraConfig:
        nova::pci::aliases:
          - name: "t4"
            product_id: "1eb8"
            vendor_id: "10de"
          - name: "v100"
            product_id: "1db4"
            vendor_id: "10de"

Note: If the nova-api service is running in a role other than the Controller, replace ControllerExtraConfig with the user role, in the format <Role>ExtraConfig.
To configure the Compute node on the overcloud for PCI passthrough, create an environment file, for example, pci_passthru_compute.yaml.

To specify the PCI devices that are available for passthrough on the Compute node, add the following to pci_passthru_compute.yaml:

    parameter_defaults:
      NovaPCIPassthrough:
        - vendor_id: "10de"
          product_id: "1eb8"

To enable IOMMU in the kernel of the Compute nodes to support PCI passthrough, add the KernelArgs parameter to pci_passthru_compute.yaml:

    parameter_defaults:
      ...
      ComputeParameters:
        KernelArgs: "intel_iommu=on iommu=pt"

Deploy the overcloud, adding your custom environment files to the stack along with your other environment files:

    (undercloud) $ openstack overcloud deploy --templates \
      -e [your environment files] \
      -e /home/stack/templates/pci_passthru_controller.yaml \
      -e /home/stack/templates/pci_passthru_compute.yaml

Configure a flavor to request the PCI devices. The following example uses the t4 alias to request two devices, each with a vendor ID of 10de and a product ID of 1eb8:

    # openstack flavor set m1.large --property "pci_passthrough:alias"="t4:2"
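After the deployment completes, it can be useful to confirm that the KernelArgs values reached the running kernel on a Compute node. The following is a minimal sketch under that assumption: cmdline_has_args is a hypothetical helper name, and the demo uses an illustrative command line string instead of reading a real host.

```shell
#!/bin/sh
# Hypothetical helper: return 0 if every expected argument appears as a
# whole word in the kernel command line passed as the first parameter.
cmdline_has_args() {
  cmdline=$1
  shift
  for arg in "$@"; do
    case " $cmdline " in
      *" $arg "*) ;;  # argument present, check the next one
      *) echo "missing kernel argument: $arg" >&2; return 1 ;;
    esac
  done
  return 0
}

# Demo with an illustrative command line. On a Compute node you would run:
#   cmdline_has_args "$(cat /proc/cmdline)" intel_iommu=on iommu=pt
sample="BOOT_IMAGE=/vmlinuz root=/dev/sda1 ro intel_iommu=on iommu=pt"
if cmdline_has_args "$sample" intel_iommu=on iommu=pt; then
  echo "IOMMU kernel arguments are present"
fi
```

The check matches whole words only, so a different value such as intel_iommu=off is reported as missing rather than accepted.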
Verification
Create an instance with a PCI passthrough device:
    # openstack server create --flavor m1.large --image rhelgpu --wait test-pci

Log in to the instance as a cloud user.
Install the GPU driver on the instance. For example, run the following script to install an NVIDIA driver:
    $ sh NVIDIA-Linux-x86_64-430.24-grid.run

To verify that the GPU is accessible from the instance, enter the following command from the instance:

    $ lspci -nn | grep <gpu_name>

To check the NVIDIA System Management Interface status, run the following command from the instance:

    $ nvidia-smi

Example output:
    +-----------------------------------------------------------------------------+
    | NVIDIA-SMI 440.33.01    Driver Version: 440.33.01    CUDA Version: 10.2     |
    |-------------------------------+----------------------+----------------------+
    | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |===============================+======================+======================|
    |   0  Tesla T4            Off  | 00000000:01:00.0 Off |                    0 |
    | N/A   43C    P0    20W /  70W |      0MiB / 15109MiB |      0%      Default |
    +-------------------------------+----------------------+----------------------+

    +-----------------------------------------------------------------------------+
    | Processes:                                                       GPU Memory |
    |  GPU       PID   Type   Process name                             Usage      |
    |=============================================================================|
    |  No running processes found                                                 |
    +-----------------------------------------------------------------------------+
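Because the example flavor requests two devices with t4:2, a quick extra check inside the instance is to count how many devices with the expected vendor and product ID are visible. The following is a minimal sketch: count_pci_devices is a hypothetical helper name, and the demo input lines, including their bus addresses, are illustrative rather than captured from a real instance.

```shell
#!/bin/sh
# Hypothetical helper: count devices with a given vendor and product ID
# in `lspci -nn` output read from stdin.
count_pci_devices() {
  vendor=$1
  product=$2
  # grep -c prints the number of matching lines. It exits non-zero when
  # the count is 0, so "|| true" keeps the function usable under `set -e`.
  grep -c -i -F "[$vendor:$product]" || true
}

# Demo with illustrative input. Inside the instance you would run:
#   lspci -nn | count_pci_devices 10de 1eb8
count_pci_devices 10de 1eb8 <<'EOF'
00:05.0 3D controller [0302]: NVIDIA Corporation TU104GL [Tesla T4] [10de:1eb8] (rev a1)
00:06.0 3D controller [0302]: NVIDIA Corporation TU104GL [Tesla T4] [10de:1eb8] (rev a1)
EOF
```

A count that matches the number requested in the flavor suggests that every requested device was attached to the instance.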