11.4. Enabling PCI passthrough for a GPU device
You can use PCI passthrough to attach a physical PCI device, such as a graphics card, to an instance. If you use PCI passthrough for a device, the instance reserves exclusive access to the device for performing tasks, and the device is not available to the host.
Prerequisites
- The pciutils package is installed on the physical servers that have the PCI cards.
- The GPU driver is available to install on the GPU instances. For more information, see "Building a custom GPU overcloud image".
Procedure
To determine the vendor ID and product ID for each passthrough device type, run the following command on the physical server that has the PCI cards:
# lspci -nn | grep -i <gpu_name>
For example, to determine the vendor and product ID for an NVIDIA GPU, run the following command:
# lspci -nn | grep -i nvidia
3b:00.0 3D controller [0302]: NVIDIA Corporation TU104GL [Tesla T4] [10de:1eb8] (rev a1)
d8:00.0 3D controller [0302]: NVIDIA Corporation TU104GL [Tesla T4] [10de:1db4] (rev a1)
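In the lspci -nn output, the final bracketed pair on each line has the form [vendor_id:product_id]. As a minimal sketch, the IDs can be pulled out of that output with sed. The function name pci_ids is hypothetical and is not part of pciutils; the sample line is copied from the example output.

```shell
# Print "vendor_id product_id" for each `lspci -nn` style line read from stdin.
# The class code (for example [0302]) has no colon, so only the final
# [xxxx:yyyy] pair on each line matches.
pci_ids() {
  sed -n 's/.*\[\([0-9a-f]\{4\}\):\([0-9a-f]\{4\}\)\].*/\1 \2/p'
}

# Demonstration with a sample line from the example output:
printf '%s\n' '3b:00.0 3D controller [0302]: NVIDIA Corporation TU104GL [Tesla T4] [10de:1eb8] (rev a1)' | pci_ids
# prints: 10de 1eb8

# On the physical server, the same function could be fed live output:
#   lspci -nn | grep -i nvidia | pci_ids
```

The two values map directly to the vendor_id and product_id fields used in the environment files later in this procedure.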
To configure the Controller node on the overcloud for PCI passthrough, create an environment file, for example, pci_passthru_controller.yaml.
Add PciPassthroughFilter to the NovaSchedulerDefaultFilters parameter in pci_passthru_controller.yaml:
parameter_defaults:
  NovaSchedulerDefaultFilters: ['RetryFilter','AvailabilityZoneFilter','ComputeFilter','ComputeCapabilitiesFilter','ImagePropertiesFilter','ServerGroupAntiAffinityFilter','ServerGroupAffinityFilter','PciPassthroughFilter','NUMATopologyFilter']
To specify the PCI alias for the devices on the Controller node, add the following under parameter_defaults in pci_passthru_controller.yaml:
parameter_defaults:
  ...
  ControllerExtraConfig:
    nova::pci::aliases:
      - name: "t4"
        product_id: "1eb8"
        vendor_id: "10de"
      - name: "v100"
        product_id: "1db4"
        vendor_id: "10de"
Note: If the nova-api service is running in a role other than the Controller, replace ControllerExtraConfig with the parameter for that role, in the format <Role>ExtraConfig.
To configure the Compute node on the overcloud for PCI passthrough, create an environment file, for example, pci_passthru_compute.yaml.
To specify which PCI devices on the Compute node are available for passthrough, add the following to pci_passthru_compute.yaml:
parameter_defaults:
  NovaPCIPassthrough:
    - vendor_id: "10de"
      product_id: "1eb8"
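To enable IOMMU in the kernel of the Compute nodes to support PCI passthrough, add the KernelArgs parameter to pci_passthru_compute.yaml. IOMMU support must also be enabled in the server BIOS. The following example uses the arguments for Intel-based hosts:
parameter_defaults:
  ...
  ComputeParameters:
    KernelArgs: "intel_iommu=on iommu=pt"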
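After the Compute nodes are deployed and rebooted, one way to confirm that the kernel arguments took effect is to inspect the running kernel command line. The following is a sketch only; has_kernel_args is a hypothetical helper name, and the check assumes the Intel arguments shown in the KernelArgs example.

```shell
# Return 0 if the given kernel command line contains every expected argument,
# otherwise report the first missing argument on stderr and return 1.
has_kernel_args() {
  cmdline=$1
  shift
  for arg in "$@"; do
    case " $cmdline " in
      *" $arg "*) ;;   # argument present as a whole word
      *) echo "missing kernel argument: $arg" >&2; return 1 ;;
    esac
  done
  return 0
}

# On a Compute node (assumes the node has been rebooted with the new arguments):
#   has_kernel_args "$(cat /proc/cmdline)" intel_iommu=on iommu=pt \
#     && echo "IOMMU kernel arguments are present"
```

Passing the command line as a string, rather than reading /proc/cmdline inside the function, keeps the check easy to try against saved output from several nodes.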
Deploy the overcloud, adding your custom environment files to the stack along with your other environment files:
(undercloud) $ openstack overcloud deploy --templates \
  -e [your environment files] \
  -e /home/stack/templates/pci_passthru_controller.yaml \
  -e /home/stack/templates/pci_passthru_compute.yaml
Configure a flavor to request the PCI devices. The following example uses the t4 alias to request two devices, each with a vendor ID of 10de and a product ID of 1eb8:
# openstack flavor set m1.large --property "pci_passthrough:alias"="t4:2"
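The property value has the form <alias>:<count>, where the alias is a name defined in nova::pci::aliases and the count is the number of devices to request. As an illustrative sketch, the property argument can be assembled from those two parts with a basic check on the count. The helper name pci_alias_property is hypothetical and is not part of the openstack client.

```shell
# Build the flavor property argument for one alias in <alias>:<count> form.
# Rejects an empty, non-numeric, or zero count.
pci_alias_property() {
  alias_name=$1
  count=$2
  case $count in
    ''|*[!0-9]*|0)
      echo "count must be a positive integer" >&2
      return 1
      ;;
  esac
  printf 'pci_passthrough:alias=%s:%s\n' "$alias_name" "$count"
}

# Usage (assumes the m1.large flavor and the "t4" alias from this procedure):
#   openstack flavor set m1.large --property "$(pci_alias_property t4 2)"
```

After shell quoting is removed, the generated argument is the same string as the one in the flavor command above.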
Verification
Create an instance with a PCI passthrough device:
# openstack server create --flavor m1.large --image rhelgpu --wait test-pci
Log in to the instance as a cloud user.
Install the GPU driver on the instance. For example, run the following script to install an NVIDIA driver:
$ sh NVIDIA-Linux-x86_64-430.24-grid.run
To verify that the GPU is accessible from the instance, enter the following command from the instance:
$ lspci -nn | grep <gpu_name>
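Because the flavor requests a specific number of devices, it can help to count the matching devices rather than read the list by eye. The sketch below counts lspci -nn lines that carry a given vendor and product ID. The function name count_pci_devices is hypothetical, and the IDs in the usage line assume the t4 alias (10de:1eb8) from this procedure.

```shell
# Count `lspci -nn` style lines on stdin that contain [vendor_id:product_id].
# grep -F treats the bracketed pattern as a fixed string, not a regex.
count_pci_devices() {
  grep -c -F "[$1:$2]"
}

# From inside the instance (assumes the flavor requested two t4 devices):
#   lspci -nn | count_pci_devices 10de 1eb8
```

If the printed count is lower than the number requested in the flavor, the instance did not receive all of the passthrough devices.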
To check the NVIDIA System Management Interface status, run the following command from the instance:
$ nvidia-smi
Example output:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.33.01    Driver Version: 440.33.01    CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla T4            Off  | 00000000:01:00.0 Off |                    0 |
| N/A   43C    P0    20W /  70W |      0MiB / 15109MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+