Setting up an NVIDIA GPU for a virtual machine in Red Hat Virtualization
How to configure a virtual machine in Red Hat Virtualization to use a dedicated GPU or vGPU.
Preface
You can use a host with a compatible graphics processing unit (GPU) to run virtual machines in Red Hat Virtualization that are suited for graphics-intensive tasks and for running software that cannot run without a GPU, such as CAD.
You can assign a GPU to a virtual machine in one of the following ways:
- GPU passthrough: You can assign a host GPU to a single virtual machine, so the virtual machine, instead of the host, uses the GPU.
- Virtual GPU (vGPU): You can divide a physical GPU device into one or more virtual devices, referred to as mediated devices. You can then assign these mediated devices to one or more virtual machines as virtual GPUs. These virtual machines share the performance of a single physical GPU. For some GPUs, only one mediated device can be assigned to a single guest. vGPU support is only available on selected NVIDIA GPUs.
Example:
A host has four GPUs. Each GPU can support up to 16 vGPUs, for a total of 64 vGPUs. Some possible vGPU assignments are:
- one virtual machine with 64 vGPUs
- 64 virtual machines, each with one vGPU
- 32 virtual machines, each with one vGPU; eight virtual machines, each with two vGPUs; four virtual machines, each with four vGPUs
Chapter 1. GPU device passthrough: Assigning a host GPU to a single virtual machine
Red Hat Virtualization supports PCI VFIO, also called device passthrough, for some NVIDIA PCIe-based GPU devices as non-VGA graphics devices.
You can attach one or more host GPUs to a single virtual machine by passing through the host GPU to the virtual machine, in addition to one of the standard emulated graphics interfaces. The virtual machine uses the emulated graphics device for pre-boot and installation, and the GPU takes control when its graphics drivers are loaded.
For information on the exact number of host GPUs that you can pass through to a single virtual machine, see the NVIDIA website.
To assign a GPU to a virtual machine, follow the procedures in the sections below.
Prerequisites
- Your GPU device supports GPU passthrough mode.
- Your system is listed as a validated server hardware platform.
- Your host chipset supports Intel VT-d or AMD-Vi.
For more information about supported hardware and software, see Validated Platforms in the NVIDIA GPU Software Release Notes.
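Optionally, before you begin, you can check the host from a shell. The following is an informal sketch, not part of the documented procedure, and assumes the libvirt client tools are installed on the host:

# virt-host-validate qemu

In the output, look for the checks that report whether hardware virtualization is available and whether the IOMMU is enabled by the kernel.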
1.1. Enabling host IOMMU support and blacklisting nouveau
I/O Memory Management Unit (IOMMU) support on the host machine is necessary to use a GPU on a virtual machine.
Procedure
- In the Administration Portal, click Compute → Hosts. Select a host and click Edit. The Edit Hosts pane appears.
- Click the Kernel tab.
- Check the Hostdev Passthrough & SR-IOV checkbox. This checkbox enables IOMMU support for a host with Intel VT-d or AMD-Vi by adding intel_iommu=on or amd_iommu=on to the kernel command line.
- Check the Blacklist Nouveau checkbox.
- Click OK.
- Select the host and click Management → Maintenance, then click OK.
- Click Installation → Reinstall.
- After the reinstallation is finished, reboot the host machine.
- When the host machine has rebooted, click Management → Activate.
To enable IOMMU support using the command line, edit the grub.conf file on the host (./entries/rhvh-4.4.<machine id>.conf) to include the option intel_iommu=on (or amd_iommu=on on AMD hosts).
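On RHEL 8-based hosts, an alternative way to append the IOMMU option to every boot entry is the grubby tool. The following is a hedged sketch of that alternative, not the documented procedure; use amd_iommu=on instead on AMD hosts:

# grubby --update-kernel=ALL --args="intel_iommu=on"

After rebooting, you can confirm that the option took effect and that the kernel initialized the IOMMU:

# cat /proc/cmdline
# dmesg | grep -i -e DMAR -e IOMMU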
1.2. Detaching the GPU from the host
You cannot add the GPU to the virtual machine if the GPU is bound to the host kernel driver, so you must unbind the GPU device from the host before you can add it to the virtual machine. Host drivers often do not support dynamic unbinding of the GPU, so it is recommended to manually exclude the device from binding to the host drivers.
Procedure
On the host, identify the device slot name and IDs of the device by running the lspci command. In the following example, a graphics controller such as an NVIDIA Quadro or GRID card is used:

# lspci -Dnn | grep -i NVIDIA
0000:03:00.0 VGA compatible controller [0300]: NVIDIA Corporation GK104GL [Quadro K4200] [10de:11b4] (rev a1)
0000:03:00.1 Audio device [0403]: NVIDIA Corporation GK104 HDMI Audio Controller [10de:0e0a] (rev a1)

The output shows that the NVIDIA GK104 device is installed. It has a graphics controller and an audio controller with the following properties:
- The device slot name of the graphics controller is 0000:03:00.0, and the vendor-id:device-id of the graphics controller is 10de:11b4.
- The device slot name of the audio controller is 0000:03:00.1, and the vendor-id:device-id of the audio controller is 10de:0e0a.
Prevent the host machine driver from using the GPU device. You can use a vendor-id:device-id with the pci-stub driver. To do this, append the pci-stub.ids option, with the vendor-id:device-id as its value, to the GRUB_CMDLINE_LINUX environment variable located in the /etc/sysconfig/grub configuration file, for example:

GRUB_CMDLINE_LINUX="crashkernel=auto resume=/dev/mapper/vg0-lv_swap rd.lvm.lv=vg0/lv_root rd.lvm.lv=vg0/lv_swap rhgb quiet intel_iommu=on pci-stub.ids=10de:11b4,10de:0e0a"

When adding additional vendor IDs and device IDs for pci-stub, separate them with a comma.
Regenerate the boot loader configuration using grub2-mkconfig to include this option:

# grub2-mkconfig -o /etc/grub2.cfg

Note: When using a UEFI-based host, the target file should be /etc/grub2-efi.cfg.
- Reboot the host machine.
Confirm that IOMMU is enabled, the host device is added to the list of pci-stub.ids, and Nouveau is blacklisted:

# cat /proc/cmdline
BOOT_IMAGE=(hd0,msdos1)/vmlinuz-4.18.0-147.el8.x86_64 root=/dev/mapper/vg0-lv_root ro crashkernel=auto resume=/dev/mapper/vg0-lv_swap rd.lvm.lv=vg0/lv_root rd.lvm.lv=vg0/lv_swap rhgb quiet intel_iommu=on pci-stub.ids=10de:11b4,10de:0e0a rdblacklist=nouveau
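As an additional, optional check (a sketch using the IDs from the example above; substitute your own vendor-id:device-id), you can confirm that the GPU is now claimed by the pci-stub driver rather than nouveau or nvidia:

# lspci -nnk -d 10de:11b4

The Kernel driver in use line in the output should show pci-stub.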
1.3. Attaching the GPU to a Virtual Machine
After unbinding the GPU from the host kernel driver, you can add it to the virtual machine and enable the correct driver.
Procedure
- Follow the steps in Adding a Host Device to a Virtual Machine in the Virtual Machine Management Guide.
- Run the virtual machine and log in to it.
- Install the NVIDIA GPU driver on the virtual machine.
Verify that the correct kernel driver is in use for the GPU with the lspci -nnk command. For example:

# lspci -nnk
00:07.0 VGA compatible controller [0300]: NVIDIA Corporation GK104GL [Quadro K4200] [10de:11b4] (rev a1)
        Subsystem: Hewlett-Packard Company Device [103c:1096]
        Kernel driver in use: nvidia
        Kernel modules: nouveau, nvidia_drm, nvidia
1.4. Installing the GPU driver on the virtual machine
Procedure
- Run the virtual machine and connect to it using the VNC or SPICE console.
- Download the driver to the virtual machine. For information on getting the driver, see the Drivers page on the NVIDIA website.
- Install the GPU driver.
Important (Linux only): When installing the driver on a Linux guest operating system, you are prompted to update xorg.conf. If you do not update xorg.conf during the installation, you need to update it manually.
- After the driver finishes installing, reboot the machine. For Windows virtual machines, fully power off the guest from the Administration portal or the VM portal, not from within the guest operating system.
Important (Windows only): Powering off the virtual machine from within the Windows guest operating system sometimes sends the virtual machine into hibernate mode, which does not completely clear the memory, possibly leading to subsequent problems. Using the Administration portal or the VM portal to power off the virtual machine forces it to fully clear the memory.
- Connect a monitor to the host GPU output interface and run the virtual machine.
- Set up NVIDIA vGPU guest software licensing for each vGPU and add the license credentials in the NVIDIA control panel. For more information, see How NVIDIA vGPU Software Licensing Is Enforced in the NVIDIA Virtual GPU Software Documentation.
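As an optional verification on Linux guests (a hedged sketch, not part of the documented procedure), you can confirm that the installed driver can communicate with the passed-through GPU by running the NVIDIA System Management Interface:

# nvidia-smi

The command should list the GPU model and driver version; an error here usually means the driver did not load correctly.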
1.5. Updating and Enabling xorg (Linux Virtual Machines)
Before you can use the GPU on the virtual machine, you need to update and enable xorg on the virtual machine. The NVIDIA driver installation should do this automatically. Check whether xorg is updated and enabled by viewing /etc/X11/xorg.conf:

# cat /etc/X11/xorg.conf

The first two lines indicate whether the file was generated by NVIDIA. For example:

# cat /etc/X11/xorg.conf
# nvidia-xconfig: X configuration file generated by nvidia-xconfig
# nvidia-xconfig: version 390.87 (buildmeister@swio-display-x64-rhel04-14) Tue Aug 21 17:33:38 PDT 2018
Procedure
On the virtual machine, generate the xorg.conf file using the following command:

# X -configure

Copy the xorg.conf file to /etc/X11/xorg.conf using the following command:

# cp /root/xorg.conf.new /etc/X11/xorg.conf

- Reboot the virtual machine.
Verify that xorg is updated and enabled by viewing /etc/X11/xorg.conf:

# cat /etc/X11/xorg.conf

Search for the Device section. You should see an entry similar to the following:

Section "Device"
    Identifier  "Device0"
    Driver      "nvidia"
    VendorName  "NVIDIA Corporation"
EndSection
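If the Device section is missing or X fails to start, one additional hedged check is to confirm that the nvidia kernel module is actually loaded in the guest:

# lsmod | grep nvidia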
The GPU is now assigned to the virtual machine.
Chapter 2. Assigning virtual GPUs
To set up NVIDIA vGPU devices, you need to:
- Obtain and install the correct NVIDIA vGPU driver for your GPU device
- Create mediated devices
- Assign each mediated device to a virtual machine
- Install guest drivers on each virtual machine
The following procedures explain this process.
2.1. Setting up NVIDIA vGPU devices on the host
Before installing the NVIDIA vGPU driver on the guest operating system, you need to understand the licensing requirements and obtain the correct license credentials.
Prerequisites
- Your GPU device supports virtual GPU (vGPU) functionality.
- Your system is listed as a validated server hardware platform.
For more information about supported GPUs and validated platforms, see NVIDIA vGPU CERTIFIED SERVERS on www.nvidia.com.
Procedure
- Download and install the NVIDIA vGPU driver. For information on getting the driver, see the vGPU drivers page on the NVIDIA website. An NVIDIA enterprise account is required to download the drivers. Contact the hardware vendor if this is not available.
- Unzip the downloaded file from the NVIDIA website and copy it to the host to install the driver.
- If the NVIDIA software installer did not create the /etc/modprobe.d/nvidia-installer-disable-nouveau.conf file, create it manually.
- Open the /etc/modprobe.d/nvidia-installer-disable-nouveau.conf file in a text editor and add the following lines to the end of the file:

blacklist nouveau
options nouveau modeset=0
Regenerate the initial ramdisk for the current kernel, then reboot:

# dracut --force
# reboot
Alternatively, if you need to use a prior supported kernel version with mediated devices, regenerate the initial ramdisk for all installed kernel versions:

# dracut --regenerate-all --force
# reboot
Check that the kernel loaded the nvidia_vgpu_vfio module:

# lsmod | grep nvidia_vgpu_vfio
Check that the nvidia-vgpu-mgr.service service is running:

# systemctl status nvidia-vgpu-mgr.service
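Optionally, you can confirm that the driver registered vGPU (mediated device) types for the GPU by listing the mdev_supported_types directory in sysfs. This is an informal check; the PCI address shown is only an example, so use the address of your own GPU:

# ls /sys/class/mdev_bus/0000:03:00.0/mdev_supported_types/

Each entry corresponds to a vGPU type that can be selected for virtual machines.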
- In the Administration Portal, click Compute → Virtual Machines.
- Click the name of the virtual machine to go to the details view.
- Click the Host Devices tab.
- Click Manage vGPU. The Manage vGPU dialog box opens.
- Select a vGPU type and the number of instances that you would like to use with this virtual machine.
- Select On for Secondary display adapter for VNC to add a second emulated QXL or VGA graphics adapter as the primary graphics adapter for the console in addition to the vGPU.
Note: On cluster levels 4.5 and later, when a vGPU is used and the Secondary display adapter for VNC is set to On, an additional framebuffer display device is automatically added to the virtual machine. This allows the virtual machine console to be displayed before the vGPU is initialized, instead of a blank screen.
- Click SAVE.
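For reference, after you save the dialog and start the virtual machine, the mediated devices that back the vGPUs exist on the host and can be inspected from a shell. This is a hedged illustration, not a required step, and assumes the mdevctl tool is installed:

# ls /sys/bus/mdev/devices/
# mdevctl list

Each entry is a mediated device (identified by a UUID) created for a vGPU instance.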
2.2. Installing the vGPU driver on the virtual machine
Procedure
- Run the virtual machine and connect to it using the VNC console.
Note: SPICE is not supported on vGPU.
- Download the driver to the virtual machine. For information on getting the driver, see the Drivers page on the NVIDIA website.
- Install the vGPU driver, following the instructions in Installing the NVIDIA vGPU Software Graphics Driver in the NVIDIA Virtual GPU software documentation.
Important (Linux only): When installing the driver on a Linux guest operating system, you are prompted to update xorg.conf. If you do not update xorg.conf during the installation, you need to update it manually.
- After the driver finishes installing, reboot the machine. For Windows virtual machines, fully power off the guest from the Administration portal or the VM portal, not from within the guest operating system.
Important (Windows only): Powering off the virtual machine from within the Windows guest operating system sometimes sends the virtual machine into hibernate mode, which does not completely clear the memory, possibly leading to subsequent problems. Using the Administration portal or the VM portal to power off the virtual machine forces it to fully clear the memory.
- Run the virtual machine and connect to it using one of the supported remote desktop protocols, such as Mechdyne TGX, and verify that the vGPU is recognized by opening the NVIDIA Control Panel. On Windows, you can alternatively open the Windows Device Manager. The vGPU should appear under Display adapters. For more information, see the NVIDIA vGPU Software Graphics Driver in the NVIDIA Virtual GPU software documentation.
- Set up NVIDIA vGPU guest software licensing for each vGPU and add the license credentials in the NVIDIA control panel. For more information, see How NVIDIA vGPU Software Licensing Is Enforced in the NVIDIA Virtual GPU Software Documentation.
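As an optional check on Linux guests (a hedged sketch; field names can vary between driver releases), you can confirm that the vGPU is visible to the driver and query its licensing state:

# nvidia-smi
# nvidia-smi -q | grep -i license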
2.3. Removing NVIDIA vGPU devices
To change the configuration of assigned vGPU mediated devices, you must first remove the existing devices from the assigned guests.
Procedure
- In the Administration Portal, click Compute → Virtual Machines.
- Click the name of the virtual machine to go to the details view.
- Click the Host Devices tab.
- Click Manage vGPU. The Manage vGPU dialog box opens.
- Click the button next to Selected vGPU Type Instances to detach the vGPU from the virtual machine.
- Click SAVE.
2.4. Monitoring NVIDIA vGPUs
For NVIDIA vGPUs, you can get information about the physical GPU and the vGPUs by using the NVIDIA System Management Interface: enter the nvidia-smi command on the host. For more information, see NVIDIA System Management Interface nvidia-smi in the NVIDIA Virtual GPU Software Documentation.
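For illustration, the following commands are a minimal sketch (the vgpu subcommand assumes the NVIDIA vGPU manager software is installed on the host); they show the physical GPUs and the vGPUs currently running on them:

# nvidia-smi
# nvidia-smi vgpu

Adding -q to nvidia-smi vgpu prints more detailed per-vGPU information.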
2.5. Remote desktop streaming services for NVIDIA vGPU
The following remote desktop streaming services have been successfully tested for use with the NVIDIA vGPU feature in RHEL 8:
- HP-RGS
- Mechdyne TGX - It is currently not possible to use Mechdyne TGX with Windows Server 2016 guests.
- NICE DCV - When using this streaming service, use fixed resolution settings, because using dynamic resolution in some cases results in a black screen.
Appendix A. Legal notice
Copyright © 2022 Red Hat, Inc.
Licensed under the Creative Commons Attribution-ShareAlike 4.0 International License. Derived from documentation for the oVirt Project. If you distribute this document or an adaptation of it, you must provide the URL for the original version.
Modified versions must remove all Red Hat trademarks.
Red Hat, Red Hat Enterprise Linux, the Red Hat logo, the Shadowman logo, JBoss, OpenShift, Fedora, the Infinity logo, and RHCE are trademarks of Red Hat, Inc., registered in the United States and other countries.
Linux® is the registered trademark of Linus Torvalds in the United States and other countries.
Java® is a registered trademark of Oracle and/or its affiliates.
XFS® is a trademark of Silicon Graphics International Corp. or its subsidiaries in the United States and/or other countries.
MySQL® is a registered trademark of MySQL AB in the United States, the European Union and other countries.
Node.js® is an official trademark of Joyent. Red Hat Software Collections is not formally related to or endorsed by the official Joyent Node.js open source or commercial project.
The OpenStack® Word Mark and OpenStack logo are either registered trademarks/service marks or trademarks/service marks of the OpenStack Foundation, in the United States and other countries and are used with the OpenStack Foundation’s permission. We are not affiliated with, endorsed or sponsored by the OpenStack Foundation, or the OpenStack community.
All other trademarks are the property of their respective owners.