主页
产品
Red Hat OpenStack Platform
13
Instances and Images Guide
Chapter 11. Configuring virtual GPUs for instances

此内容没有您所选择的语言版本。

Chapter 11. Configuring virtual GPUs for instances

To support GPU-based rendering on your instances, you can define and manage virtual GPU (vGPU) resources according to your available physical GPU devices and your hypervisor type. You can use this configuration to divide the rendering workloads between all your physical GPU devices more effectively, and to have more control over scheduling your vGPU-enabled instances.

To enable vGPU in OpenStack Compute, create flavors that your cloud users can use to create Red Hat Enterprise Linux (RHEL) instances with vGPU devices. Each instance can then support GPU workloads with virtual GPU devices that correspond to the physical GPU devices.

The OpenStack Compute service tracks the number of vGPU devices that are available for each GPU profile you define on each host. The Compute service schedules instances to these hosts based on the flavor, attaches the devices, and monitors usage on an ongoing basis. When an instance is deleted, the Compute service adds the vGPU devices back to the available pool.

11.1. Supported configurations and limitations
复制链接

Supported GPU cards

For a list of supported NVIDIA GPU cards, see Virtual GPU Software Supported Products on the NVIDIA website.

Limitations when using vGPU devices

You can enable only one vGPU type on each Compute node.
Each instance can use only one vGPU resource.
Live migration of vGPU between hosts is not supported.
Suspend operations on a vGPU-enabled instance is not supported due to a libvirt limitation. Instead, you can snapshot or shelve the instance.
Resize and cold migration operations on an instance with a vGPU flavor does not automatically re-allocate the vGPU resources to the instance. After you resize or migrate the instance, you must rebuild it manually to re-allocate the vGPU resources.
By default, vGPU types on Compute hosts are not exposed to API users. To grant access, add the hosts to a host aggregate. For more information, see Section 4.4, “Managing host aggregates”.
If you use NVIDIA accelerator hardware, you must comply with the NVIDIA licensing requirements. For example, NVIDIA vGPU GRID requires a licensing server. For more information about the NVIDIA licensing requirements, see NVIDIA License Server Release Notes on the NVIDIA website.

11.2. Configuring vGPU on the Compute nodes
复制链接

To enable your cloud users to create instances that use a virtual GPU (vGPU), you must configure the Compute nodes that have the physical GPUs:

Build a custom GPU-enabled overcloud image.
Prepare the GPU role, profile, and flavor for designating Compute nodes for vGPU.
Configure the Compute node for vGPU.
Deploy the overcloud.

Note

To use an NVIDIA GRID vGPU, you must comply with the NVIDIA GRID licensing requirements and you must have the URL of your self-hosted license server. For more information, see the NVIDIA License Server Release Notes web page.

11.2.1. Building a custom GPU overcloud image
复制链接

Perform the following steps on the director node to install the NVIDIA GRID host driver on an overcloud Compute image and upload the image to the OpenStack Image service (glance).

Procedure

Copy the overcloud image and add the gpu suffix to the copied image.
```
cp overcloud-full.qcow2 overcloud-full-gpu.qcow2
```
```
$ cp overcloud-full.qcow2 overcloud-full-gpu.qcow2
```
Copy to Clipboard Toggle word wrap
Install an ISO image generator tool from YUM.
```
sudo yum install genisoimage -y
```
```
$ sudo yum install genisoimage -y
```
Copy to Clipboard Toggle word wrap
Download the NVIDIA GRID host driver RPM package that corresponds to your GPU device from the NVIDIA website. To determine which driver you need, see the NVIDIA Driver Downloads Portal.
Note
You must be a registered NVIDIA customer to download the drivers from the portal.

Create an ISO image from the driver RPM package and save the image in the nvidia-host directory.

genisoimage -o nvidia-host.iso -R -J -V NVIDIA nvidia-host/

$ genisoimage -o nvidia-host.iso -R -J -V NVIDIA nvidia-host/
I: -input-charset not specified, using utf-8 (detected in locale settings)
  9.06% done, estimate finish Wed Oct 31 11:24:46 2018
 18.08% done, estimate finish Wed Oct 31 11:24:46 2018
 27.14% done, estimate finish Wed Oct 31 11:24:46 2018
 36.17% done, estimate finish Wed Oct 31 11:24:46 2018
 45.22% done, estimate finish Wed Oct 31 11:24:46 2018
 54.25% done, estimate finish Wed Oct 31 11:24:46 2018
 63.31% done, estimate finish Wed Oct 31 11:24:46 2018
 72.34% done, estimate finish Wed Oct 31 11:24:46 2018
 81.39% done, estimate finish Wed Oct 31 11:24:46 2018
 90.42% done, estimate finish Wed Oct 31 11:24:46 2018
 99.48% done, estimate finish Wed Oct 31 11:24:46 2018
Total translation table size: 0
Total rockridge attributes bytes: 358
Total directory bytes: 0
Path table size(bytes): 10
Max brk space used 0
55297 extents written (108 MB)

Copy to Clipboard

Toggle word wrap

Create a driver installation script that also disables the nouveau driver and generates a new initramfs. The following example script, install_nvidia.sh, disables the nouveau driver, generates a new initramfs, and installs the NVIDIA GRID host driver on the overcloud image:
```
NVIDIA GRID package
```
```
#/bin/bash

cat <<EOF >/etc/modprobe.d/disable-nouveau.conf
blacklist nouveau
options nouveau modeset=0
EOF
echo 'omit_drivers+=" nouveau "' > /etc/dracut.conf.d/disable-nouveau.conf
dracut -f

# NVIDIA GRID package
mkdir /tmp/mount
mount LABEL=NVIDIA /tmp/mount
rpm -ivh /tmp/mount/<host_driver>.rpm
```
Copy to Clipboard Toggle word wrap
- Replace <host_driver> with the host driver downloaded in step 3.

Customize the overcloud image by attaching the ISO image that you generated in step 4, and running the driver installation script that you created in step 5:

virt-customize --attach nvidia-packages.iso -a overcloud-full-gpu.qcow2  -v --run install_nvidia.sh

$ virt-customize --attach nvidia-packages.iso -a overcloud-full-gpu.qcow2  -v --run install_nvidia.sh
[   0.0] Examining the guest ...
libguestfs: launch: program=virt-customize
libguestfs: launch: version=1.36.10rhel=8,release=6.el8_5.2,libvirt
libguestfs: launch: backend registered: unix
libguestfs: launch: backend registered: uml
libguestfs: launch: backend registered: libvirt

Copy to Clipboard

Toggle word wrap

Relabel the customized image with SELinux:

virt-customize -a overcloud-full-gpu.qcow2 --selinux-relabel

$ virt-customize -a overcloud-full-gpu.qcow2 --selinux-relabel
[   0.0] Examining the guest ...
[   2.2] Setting a random seed
[   2.2] SELinux relabelling
[  27.4] Finishing off

Copy to Clipboard

Toggle word wrap

Prepare the custom image files for upload to the OpenStack Image Service:

mkdir /var/image/x86_64/image
guestmount -a overcloud-full-gpu.qcow2 -i --ro image
cp image/boot/vmlinuz-3.10.0-862.14.4.el8.x86_64 ./overcloud-full-gpu.vmlinuz
cp image/boot/initramfs-3.10.0-862.14.4.el8.x86_64.img ./overcloud-full-gpu.initrd

$ mkdir /var/image/x86_64/image
$ guestmount -a overcloud-full-gpu.qcow2 -i --ro image
$ cp image/boot/vmlinuz-3.10.0-862.14.4.el8.x86_64 ./overcloud-full-gpu.vmlinuz
$ cp image/boot/initramfs-3.10.0-862.14.4.el8.x86_64.img ./overcloud-full-gpu.initrd

Copy to Clipboard

Toggle word wrap

From the undercloud, upload the custom image to the OpenStack Image Service:

(undercloud) $ openstack overcloud image upload --update-existing --os-image-name overcloud-full-gpu.qcow2

(undercloud) $ openstack overcloud image upload --update-existing --os-image-name overcloud-full-gpu.qcow2

Copy to Clipboard

Toggle word wrap

11.2.2. Designating Compute nodes for vGPU
复制链接

To designate Compute nodes for vGPU workloads, you must create a new role file to configure the vGPU role, and configure a new flavor to use to tag the GPU-enabled Compute nodes.

Procedure

To create the new ComputeGpu role file, copy the file /usr/share/openstack-tripleo-heat-templates/roles/Compute.yaml to /usr/share/openstack-tripleo-heat-templates/roles/ComputeGpu.yaml and edit or add the following file sections:

Expand

Table 11.1. ComputeGpu role file edits
Section/Parameter	Current value	New value
Role comment	`Role: Compute`	`Role: ComputeGpu`
Role name	`name: Compute`	`name: ComputeGpu`
`description`	`Basic Compute Node role`	`GPU Compute Node role`
`ImageDefault`	n/a	`overcloud-full-gpu`
`HostnameFormatDefault`	`-compute-`	`-computegpu-`
`deprecated_nic_config_name`	`compute.yaml`	`compute-gpu.yaml`

The following example shows the ComputeGpu role details:

Role: ComputeGpu                                                  #

#####################################################################
# Role: ComputeGpu                                                  #
#####################################################################
- name: ComputeGpu
  description: |
    GPU Compute Node role
  CountDefault: 1
  ImageDefault: overcloud-full-gpu
  networks:
    - InternalApi
    - Tenant
    - Storage
  HostnameFormatDefault: '%stackname%-computegpu-%index%'
  RoleParametersDefault:
    TunedProfileName: "virtual-host"
  # Deprecated & backward-compatible values (FIXME: Make parameters consistent)
  # Set uses_deprecated_params to True if any deprecated params are used.
  uses_deprecated_params: True
  deprecated_param_image: 'NovaImage'
  deprecated_param_extraconfig: 'NovaComputeExtraConfig'
  deprecated_param_metadata: 'NovaComputeServerMetadata'
  deprecated_param_scheduler_hints: 'NovaComputeSchedulerHints'
  deprecated_param_ips: 'NovaComputeIPs'
  deprecated_server_resource_name: 'NovaCompute'
  deprecated_nic_config_name: 'compute-gpu.yaml'
  ServicesDefault:
    - OS::TripleO::Services::Aide
    - OS::TripleO::Services::AuditD
    - OS::TripleO::Services::CACerts
    - OS::TripleO::Services::CephClient
    - OS::TripleO::Services::CephExternal
    - OS::TripleO::Services::CertmongerUser
    - OS::TripleO::Services::Collectd
    - OS::TripleO::Services::ComputeCeilometerAgent
    - OS::TripleO::Services::ComputeNeutronCorePlugin
    - OS::TripleO::Services::ComputeNeutronL3Agent
    - OS::TripleO::Services::ComputeNeutronMetadataAgent
    - OS::TripleO::Services::ComputeNeutronOvsAgent
    - OS::TripleO::Services::Docker
    - OS::TripleO::Services::Fluentd
    - OS::TripleO::Services::Ipsec
    - OS::TripleO::Services::Iscsid
    - OS::TripleO::Services::Kernel
    - OS::TripleO::Services::LoginDefs
    - OS::TripleO::Services::MetricsQdr
    - OS::TripleO::Services::MySQLClient
    - OS::TripleO::Services::NeutronBgpVpnBagpipe
    - OS::TripleO::Services::NeutronLinuxbridgeAgent
    - OS::TripleO::Services::NeutronVppAgent
    - OS::TripleO::Services::NovaCompute
    - OS::TripleO::Services::NovaLibvirt
    - OS::TripleO::Services::NovaLibvirtGuests
    - OS::TripleO::Services::NovaMigrationTarget
    - OS::TripleO::Services::Ntp
    - OS::TripleO::Services::ContainersLogrotateCrond
    - OS::TripleO::Services::OpenDaylightOvs
    - OS::TripleO::Services::Rhsm
    - OS::TripleO::Services::RsyslogSidecar
    - OS::TripleO::Services::Securetty
    - OS::TripleO::Services::SensuClient
    - OS::TripleO::Services::SkydiveAgent
    - OS::TripleO::Services::Snmp
    - OS::TripleO::Services::Sshd
    - OS::TripleO::Services::Timezone
    - OS::TripleO::Services::TripleoFirewall
    - OS::TripleO::Services::TripleoPackages
    - OS::TripleO::Services::Tuned
    - OS::TripleO::Services::Vpp
    - OS::TripleO::Services::OVNController
    - OS::TripleO::Services::OVNMetadataAgent
    - OS::TripleO::Services::Ptp

Copy to Clipboard

Toggle word wrap

Generate a new roles data file named roles_data_gpu.yaml that includes the Controller, Compute, and ComputeGpu roles:

openstack overcloud roles \
  generate -o /home/stack/templates/roles_data_gpu.yaml \
  ComputeGpu Compute Controller

(undercloud) [stack@director templates]$ openstack overcloud roles \
  generate -o /home/stack/templates/roles_data_gpu.yaml \
  ComputeGpu Compute Controller

Copy to Clipboard

Toggle word wrap

Register the node for the overcloud. For more information, see Registering nodes for the overcloud in the Director Installation and Usage guide.
Inspect the node hardware. For more information, see Inspecting the hardware of nodes in the Director Installation and Usage guide.

Create the compute-vgpu-nvidia flavor to use to tag nodes that you want to designate for vGPU workloads:

openstack flavor create --id auto --ram 6144 --disk 40 --vcpus 4 compute-vgpu-nvidia

(undercloud) [stack@director templates]$ openstack flavor create --id auto --ram 6144 --disk 40 --vcpus 4 compute-vgpu-nvidia
+----------------------------+--------------------------------------+
| Field                      | Value                                |
+----------------------------+--------------------------------------+
| OS-FLV-DISABLED:disabled   | False                                |
| OS-FLV-EXT-DATA:ephemeral  | 0                                    |
| disk                       | 40                                   |
| id                         | 9cb47954-be00-47c6-a57f-44db35be3e69 |
| name                       | compute-vgpu-nvidia                  |
| os-flavor-access:is_public | True                                 |
| properties                 |                                      |
| ram                        | 6144                                 |
| rxtx_factor                | 1.0                                  |
| swap                       |                                      |
| vcpus                      | 4                                    |
+----------------------------+--------------------------------------+

Copy to Clipboard

Toggle word wrap

Tag each node that you want to designate for GPU workloads with the compute-vgpu-nvidia profile.

openstack baremetal node set --property capabilities='profile:compute-vgpu-nvidia,boot_option:local' <node>

(undercloud) [stack@director templates]$ openstack baremetal node set --property capabilities='profile:compute-vgpu-nvidia,boot_option:local' <node>

Copy to Clipboard

Toggle word wrap

Replace <node> with the ID of the baremetal node.

To verify the role is created, enter the following command:
```
openstack overcloud profiles list
```
```
(undercloud) [stack@director templates]$ openstack overcloud profiles list
```
Copy to Clipboard Toggle word wrap

11.2.3. Configuring the Compute node for vGPU and deploying the overcloud
复制链接

You need to retrieve and assign the vGPU type that corresponds to the physical GPU device in your environment, and prepare the environment files to configure the Compute node for vGPU.

Procedure

Install Red Hat Enterprise Linux and the NVIDIA GRID driver on a temporary Compute node and launch the node. For more information about installing the NVIDIA GRID driver, see Section 11.2.1, “Building a custom GPU overcloud image”.

On the Compute node, locate the vGPU type of the physical GPU device that you want to enable. For libvirt, virtual GPUs are mediated devices, or mdev type devices. To discover the supported mdev devices, enter the following command:

ls /sys/class/mdev_bus/0000\:06\:00.0/mdev_supported_types/
cat /sys/class/mdev_bus/0000\:06\:00.0/mdev_supported_types/nvidia-18/description

[root@overcloud-computegpu-0 ~]# ls /sys/class/mdev_bus/0000\:06\:00.0/mdev_supported_types/
nvidia-11  nvidia-12  nvidia-13  nvidia-14  nvidia-15  nvidia-16  nvidia-17  nvidia-18  nvidia-19  nvidia-20  nvidia-21  nvidia-210  nvidia-22

[root@overcloud-computegpu-0 ~]# cat /sys/class/mdev_bus/0000\:06\:00.0/mdev_supported_types/nvidia-18/description
num_heads=4, frl_config=60, framebuffer=2048M, max_resolution=4096x2160, max_instance=4

Copy to Clipboard

Toggle word wrap

Add the compute-gpu.yaml file to the network-environment.yaml file:

resource_registry:
  OS::TripleO::Compute::Net::SoftwareConfig: /home/stack/templates/nic-configs/compute.yaml
  OS::TripleO::ComputeGpu::Net::SoftwareConfig: /home/stack/templates/nic-configs/compute-gpu.yaml
  OS::TripleO::Controller::Net::SoftwareConfig: /home/stack/templates/nic-configs/controller.yaml
  #OS::TripleO::AllNodes::Validation: OS::Heat::None

resource_registry:
  OS::TripleO::Compute::Net::SoftwareConfig: /home/stack/templates/nic-configs/compute.yaml
  OS::TripleO::ComputeGpu::Net::SoftwareConfig: /home/stack/templates/nic-configs/compute-gpu.yaml
  OS::TripleO::Controller::Net::SoftwareConfig: /home/stack/templates/nic-configs/controller.yaml
  #OS::TripleO::AllNodes::Validation: OS::Heat::None

Copy to Clipboard

Toggle word wrap

Add the following parameters to the node-info.yaml file to specify the number of GPU-enabled Compute nodes, and the flavor to use for the vGPU-designated Compute nodes:

parameter_defaults:
  OvercloudControllerFlavor: control
  OvercloudComputeFlavor: compute
  OvercloudComputeGpuFlavor: compute-vgpu-nvidia
  ControllerCount: 1
  ComputeCount: 0
  ComputeGpuCount: 3 #set to the no of GPU nodes you have

parameter_defaults:
  OvercloudControllerFlavor: control
  OvercloudComputeFlavor: compute
  OvercloudComputeGpuFlavor: compute-vgpu-nvidia
  ControllerCount: 1
  ComputeCount: 0
  ComputeGpuCount: 3 #set to the no of GPU nodes you have

Copy to Clipboard

Toggle word wrap

Create a gpu.yaml file to specify the vGPU type of your GPU device:
```
parameter_defaults:
  ComputeGpuExtraConfig:
    nova::compute::vgpu::enabled_vgpu_types:
      - nvidia-18
```
```
parameter_defaults:
  ComputeGpuExtraConfig:
    nova::compute::vgpu::enabled_vgpu_types:
      - nvidia-18
```
Copy to Clipboard Toggle word wrap
Note
Each physical GPU supports only one virtual GPU type. If you specify multiple vGPU types in this property, only the first type is used.

Deploy the overcloud, adding your new role and environment files to the stack along with your other environment files:

(undercloud) $ openstack overcloud deploy --templates \
  -r /home/stack/templates/roles_data_gpu.yaml
  -e /home/stack/templates/node-info.yaml
  -e /home/stack/templates/network-environment.yaml
  -e [your environment files]
  -e /home/stack/templates/gpu.yaml

(undercloud) $ openstack overcloud deploy --templates \
  -r /home/stack/templates/roles_data_gpu.yaml
  -e /home/stack/templates/node-info.yaml
  -e /home/stack/templates/network-environment.yaml
  -e [your environment files]
  -e /home/stack/templates/gpu.yaml

Copy to Clipboard

Toggle word wrap

11.3. Creating the vGPU image and flavor
复制链接

To enable your cloud users to create instances that use a virtual GPU (vGPU), you can define a custom vGPU-enabled image, and you can create a vGPU flavor.

11.3.1. Creating a custom GPU instance image
复制链接

After you deploy the overcloud with GPU-enabled Compute nodes, you can create a custom vGPU-enabled instance image with the NVIDIA GRID guest driver and license file.

Procedure

Create an instance with the hardware and software profile that your vGPU instances require:
```
openstack server create --flavor <flavor> --image <image> temp_vgpu_instance
```
```
(overcloud) [stack@director ~]$ openstack server create --flavor <flavor> --image <image> temp_vgpu_instance
```
Copy to Clipboard Toggle word wrap
- Replace <flavor> with the name or ID of the flavor that has the hardware profile that your vGPU instances require. For information on default flavors, see Manage flavors.
- Replace <image> with the name or ID of the image that has the software profile that your vGPU instances require. For information on downloading RHEL cloud images, see Image service.
Log in to the instance as a cloud-user. For more information, see Log in to an Instance.
Create the gridd.conf NVIDIA GRID license file on the instance, following the NVIDIA guidance: Licensing an NVIDIA vGPU on Linux by Using a Configuration File.
Install the GPU driver on the instance. For more information about installing an NVIDIA driver, see Installing the NVIDIA vGPU Software Graphics Driver on Linux.
Note
Use the hw_video_model image property to define the GPU driver type. You can choose none if you want to disable the emulated GPUs for your vGPU instances. For more information about supported drivers, see Appendix A, Image configuration parameters.

Create an image snapshot of the instance:

openstack server image create --name vgpu_image temp_vgpu_instance

(overcloud) [stack@director ~]$ openstack server image create --name vgpu_image temp_vgpu_instance

Copy to Clipboard

Toggle word wrap

Optional: Delete the instance.

11.3.2. Creating a vGPU flavor for instances
复制链接

After you deploy the overcloud with GPU-enabled Compute nodes, you can create a custom flavor that your cloud users can use to launch instances for GPU workloads.

Procedure

Create an NVIDIA GPU flavor. For example:

openstack flavor create --vcpus 6 --ram 8192 --disk 100 m1.small-gpu

(overcloud) [stack@virtlab-director2 ~]$ openstack flavor create --vcpus 6 --ram 8192 --disk 100 m1.small-gpu
+----------------------------+--------------------------------------+
| Field                      | Value                                |
+----------------------------+--------------------------------------+
| OS-FLV-DISABLED:disabled   | False                                |
| OS-FLV-EXT-DATA:ephemeral  | 0                                    |
| disk                       | 100                                  |
| id                         | a27b14dd-c42d-4084-9b6a-225555876f68 |
| name                       | m1.small-gpu                         |
| os-flavor-access:is_public | True                                 |
| properties                 |                                      |
| ram                        | 8192                                 |
| rxtx_factor                | 1.0                                  |
| swap                       |                                      |
| vcpus                      | 6                                    |
+----------------------------+--------------------------------------+

Copy to Clipboard

Toggle word wrap

Assign a vGPU resource to the flavor that you created. You can assign only one vGPU for each instance.

openstack flavor set m1.small-gpu --property "resources:VGPU=1"
openstack flavor show m1.small-gpu

(overcloud) [stack@virtlab-director2 ~]$ openstack flavor set m1.small-gpu --property "resources:VGPU=1"

(overcloud) [stack@virtlab-director2 ~]$ openstack flavor show m1.small-gpu
+----------------------------+--------------------------------------+
| Field                      | Value                                |
+----------------------------+--------------------------------------+
| OS-FLV-DISABLED:disabled   | False                                |
| OS-FLV-EXT-DATA:ephemeral  | 0                                    |
| access_project_ids         | None                                 |
| disk                       | 100                                  |
| id                         | a27b14dd-c42d-4084-9b6a-225555876f68 |
| name                       | m1.small-gpu                         |
| os-flavor-access:is_public | True                                 |
| properties                 | resources:VGPU='1'                   |
| ram                        | 8192                                 |
| rxtx_factor                | 1.0                                  |
| swap                       |                                      |
| vcpus                      | 6                                    |
+----------------------------+--------------------------------------+

Copy to Clipboard

Toggle word wrap

11.3.3. Launching a vGPU instance
复制链接

You can create a GPU-enabled instance for GPU workloads.

Procedure

Create an instance using a GPU flavor and image. For example:

openstack server create --flavor m1.small-gpu --image vgpu_image --security-group web --nic net-id=internal0 --key-name lambda vgpu-instance

(overcloud) [stack@virtlab-director2 ~]$ openstack server create --flavor m1.small-gpu --image vgpu_image --security-group web --nic net-id=internal0 --key-name lambda vgpu-instance

Copy to Clipboard

Toggle word wrap

Log in to the instance as a cloud-user. For more information, see Log in to an Instance.
To verify that the GPU is accessible from the instance, run the following command from the instance:
```
lspci -nn | grep <gpu_name>
```
```
$ lspci -nn | grep <gpu_name>
```
Copy to Clipboard Toggle word wrap

11.4. Enabling PCI passthrough for a GPU device
复制链接

You can use PCI passthrough to attach a physical PCI device, such as a graphics card, to an instance. If you use PCI passthrough for a device, the instance reserves exclusive access to the device for performing tasks, and the device is not available to the host.

Prerequisites

The pciutils package is installed on the physical servers that have the PCI cards.
The GPU driver is available to install on the GPU instances. For more information, see Section 11.2.1, “Building a custom GPU overcloud image”.

Procedure

To determine the vendor ID and product ID for each passthrough device type, run the following command on the physical server that has the PCI cards:
```
lspci -nn | grep -i <gpu_name>
```
```
# lspci -nn | grep -i <gpu_name>
```
Copy to Clipboard Toggle word wrap
For example, to determine the vendor and product ID for an NVIDIA GPU, run the following command:
```
lspci -nn | grep -i nvidia
```
```
# lspci -nn | grep -i nvidia
3b:00.0 3D controller [0302]: NVIDIA Corporation TU104GL [Tesla T4] [10de:1eb8] (rev a1)
d8:00.0 3D controller [0302]: NVIDIA Corporation TU104GL [Tesla T4] [10de:1db4] (rev a1)
```
Copy to Clipboard Toggle word wrap
To configure the Controller node on the overcloud for PCI passthrough, create an environment file, for example, pci_passthru_controller.yaml.

Add PciPassthroughFilter to the NovaSchedulerDefaultFilters parameter in pci_passthru_controller.yaml:

parameter_defaults:
  NovaSchedulerDefaultFilters: ['RetryFilter','AvailabilityZoneFilter','ComputeFilter','ComputeCapabilitiesFilter','ImagePropertiesFilter','ServerGroupAntiAffinityFilter','ServerGroupAffinityFilter','PciPassthroughFilter','NUMATopologyFilter']

parameter_defaults:
  NovaSchedulerDefaultFilters: ['RetryFilter','AvailabilityZoneFilter','ComputeFilter','ComputeCapabilitiesFilter','ImagePropertiesFilter','ServerGroupAntiAffinityFilter','ServerGroupAffinityFilter','PciPassthroughFilter','NUMATopologyFilter']

Copy to Clipboard

Toggle word wrap

To specify the PCI alias for the devices on the Controller node, add the following to pci_passthru_controller.yaml:

ControllerExtraConfig:
    nova::pci::aliases:
      -  name: "t4"
         product_id: "1eb8"
         vendor_id: "10de"
      -  name: "v100"
         product_id: "1db4"
         vendor_id: "10de"

ControllerExtraConfig:
    nova::pci::aliases:
      -  name: "t4"
         product_id: "1eb8"
         vendor_id: "10de"
      -  name: "v100"
         product_id: "1db4"
         vendor_id: "10de"

Copy to Clipboard

Toggle word wrap

Note

If the nova-api service is running in a role other than the Controller, then replace ControllerExtraConfig with the user role, in the format <Role>ExtraConfig.

To configure the Compute node on the overcloud for PCI passthrough, create an environment file, for example, pci_passthru_compute.yaml.

To specify the available PCIs for the devices on the Compute node, add the following to pci_passthru_compute.yaml:

parameter_defaults:
  NovaPCIPassthrough:
    - vendor_id: "10de"
      product_id: "1eb8"

parameter_defaults:
  NovaPCIPassthrough:
    - vendor_id: "10de"
      product_id: "1eb8"

Copy to Clipboard

Toggle word wrap

To enable IOMMU in the server BIOS of the Compute nodes to support PCI passthrough, add the KernelArgs parameter to pci_passthru_compute.yaml:

   parameter_defaults:
      ...
      ComputeParameters:
        KernelArgs: "intel_iommu=on iommu=pt"

   parameter_defaults:
      ...
      ComputeParameters:
        KernelArgs: "intel_iommu=on iommu=pt"

Copy to Clipboard

Toggle word wrap

Deploy the overcloud, adding your custom environment files to the stack along with your other environment files:

(undercloud) $ openstack overcloud deploy --templates \
  -e [your environment files]
  -e /home/stack/templates/pci_passthru_controller.yaml
  -e /home/stack/templates/pci_passthru_compute.yaml

(undercloud) $ openstack overcloud deploy --templates \
  -e [your environment files]
  -e /home/stack/templates/pci_passthru_controller.yaml
  -e /home/stack/templates/pci_passthru_compute.yaml

Copy to Clipboard

Toggle word wrap

Configure a flavor to request the PCI devices. The following example requests two devices, each with a vendor ID of 10de and a product ID of 13f2:
```
openstack flavor set m1.large --property "pci_passthrough:alias"="t4:2"
```
```
# openstack flavor set m1.large --property "pci_passthrough:alias"="t4:2"
```
Copy to Clipboard Toggle word wrap

Verification

Create an instance with a PCI passthrough device:

openstack server create --flavor m1.large --image rhelgpu --wait test-pci

# openstack server create --flavor m1.large --image rhelgpu --wait test-pci

Copy to Clipboard

Toggle word wrap

Log in to the instance as a cloud user.
Install the GPU driver on the instance. For example, run the following script to install an NVIDIA driver:
```
sh NVIDIA-Linux-x86_64-430.24-grid.run
```
```
$ sh NVIDIA-Linux-x86_64-430.24-grid.run
```
Copy to Clipboard Toggle word wrap
To verify that the GPU is accessible from the instance, enter the following command from the instance:
```
lspci -nn | grep <gpu_name>
```
```
$ lspci -nn | grep <gpu_name>
```
Copy to Clipboard Toggle word wrap

To check the NVIDIA System Management Interface status, run the following command from the instance:

nvidia-smi

$ nvidia-smi

Copy to Clipboard

Toggle word wrap

Example output:

-----------------------------------------------------------------------------
| NVIDIA-SMI 440.33.01    Driver Version: 440.33.01    CUDA Version: 10.2     |
|---------------------------------------------------------------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===========================================================================|
|   0  Tesla T4            Off  | 00000000:01:00.0 Off |                    0 |
| N/A   43C    P0    20W /  70W |      0MiB / 15109MiB |      0%      Default |
---------------------------------------------------------------------------

-----------------------------------------------------------------------------
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
-----------------------------------------------------------------------------

-----------------------------------------------------------------------------
| NVIDIA-SMI 440.33.01    Driver Version: 440.33.01    CUDA Version: 10.2     |
|---------------------------------------------------------------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===========================================================================|
|   0  Tesla T4            Off  | 00000000:01:00.0 Off |                    0 |
| N/A   43C    P0    20W /  70W |      0MiB / 15109MiB |      0%      Default |
---------------------------------------------------------------------------

-----------------------------------------------------------------------------
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
-----------------------------------------------------------------------------

Copy to Clipboard

Toggle word wrap

返回顶部

此内容没有您所选择的语言版本。

Chapter 11. Configuring virtual GPUs for instances

11.1. Supported configurations and limitations
复制链接

11.2. Configuring vGPU on the Compute nodes
复制链接

11.2.1. Building a custom GPU overcloud image
复制链接

11.2.2. Designating Compute nodes for vGPU
复制链接

11.2.3. Configuring the Compute node for vGPU and deploying the overcloud
复制链接

11.3. Creating the vGPU image and flavor
复制链接

11.3.1. Creating a custom GPU instance image
复制链接

11.3.2. Creating a vGPU flavor for instances
复制链接

11.3.3. Launching a vGPU instance
复制链接

11.4. Enabling PCI passthrough for a GPU device
复制链接

学习

尝试、购买和销售

社区

关于红帽文档

让开源更具包容性

關於紅帽

Theme

Red Hat legal and privacy links

Red Hat legal and privacy links

此内容没有您所选择的语言版本。

Chapter 11. Configuring virtual GPUs for instances

11.1. Supported configurations and limitations复制链接链接已复制到粘贴板!

11.2. Configuring vGPU on the Compute nodes复制链接链接已复制到粘贴板!

11.2.1. Building a custom GPU overcloud image复制链接链接已复制到粘贴板!

11.2.2. Designating Compute nodes for vGPU复制链接链接已复制到粘贴板!

11.2.3. Configuring the Compute node for vGPU and deploying the overcloud复制链接链接已复制到粘贴板!

11.3. Creating the vGPU image and flavor复制链接链接已复制到粘贴板!

11.3.1. Creating a custom GPU instance image复制链接链接已复制到粘贴板!

11.3.2. Creating a vGPU flavor for instances复制链接链接已复制到粘贴板!

11.3.3. Launching a vGPU instance复制链接链接已复制到粘贴板!

11.4. Enabling PCI passthrough for a GPU device复制链接链接已复制到粘贴板!

学习

尝试、购买和销售

社区

关于红帽文档

让开源更具包容性

關於紅帽

Theme

Red Hat legal and privacy links

Red Hat legal and privacy links

11.1. Supported configurations and limitations
复制链接

11.2. Configuring vGPU on the Compute nodes
复制链接

11.2.1. Building a custom GPU overcloud image
复制链接

11.2.2. Designating Compute nodes for vGPU
复制链接

11.2.3. Configuring the Compute node for vGPU and deploying the overcloud
复制链接

11.3. Creating the vGPU image and flavor
复制链接

11.3.1. Creating a custom GPU instance image
复制链接

11.3.2. Creating a vGPU flavor for instances
复制链接

11.3.3. Launching a vGPU instance
复制链接

11.4. Enabling PCI passthrough for a GPU device
复制链接