Chapter 8. Enabling accelerators
Before you can use an accelerator in OpenShift AI, you must install the relevant software components. The installation process varies based on the accelerator type.
Prerequisites
- You have logged in to your OpenShift cluster.
- You have the cluster-admin role in your OpenShift cluster.
- You have installed an accelerator and confirmed that it is detected in your environment.
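The first two prerequisites can be checked from a terminal. The following is a minimal sketch, not an official procedure: `check_prereqs` is a hypothetical helper name, and it assumes that being allowed every verb on every resource in all namespaces is an acceptable stand-in for holding the cluster-admin role.

```shell
#!/usr/bin/env bash
# Sketch only: check_prereqs is a hypothetical helper, not part of the oc CLI.
check_prereqs() {
  # `oc whoami` fails when there is no active login session.
  if ! oc whoami >/dev/null 2>&1; then
    echo "not logged in"
    return 1
  fi
  # Assumption: permission for any verb on any resource in all namespaces
  # approximates the cluster-admin role. `oc auth can-i` prints yes or no.
  if [ "$(oc auth can-i '*' '*' --all-namespaces 2>/dev/null)" != "yes" ]; then
    echo "missing cluster-admin-level access"
    return 1
  fi
  echo "prerequisites look satisfied"
}
```

After sourcing the snippet, run `check_prereqs` in a shell where you have already logged in with `oc login`.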
Procedure
Follow the appropriate documentation to enable your accelerator:
- NVIDIA GPUs: See Enabling NVIDIA GPUs.
- Intel Gaudi AI accelerators: See Enabling Intel Gaudi AI accelerators.
- AMD GPUs: See Enabling AMD GPUs.
After installing your accelerator, create a hardware profile as described in Working with hardware profiles.
Verification
From the Administrator perspective, go to the Operators → Installed Operators page. Confirm that the following Operators appear:
- The Operator for your accelerator
- Node Feature Discovery (NFD)
- Kernel Module Management (KMM)
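The same Operator check can be approximated from the CLI by listing ClusterServiceVersion (CSV) objects and their phase. This is a sketch under stated assumptions: `list_operator_csvs` and `missing_operators` are hypothetical helper names, and the name patterns in the usage note (`nfd`, `kernel-module-management`, `gpu-operator`) are guesses at how the CSVs are named in your cluster, so adjust them to what `oc get csv` actually reports.

```shell
#!/usr/bin/env bash
# Sketch only: both function names are hypothetical helpers.

# Print one "NAME PHASE" line per installed Operator CSV.
list_operator_csvs() {
  oc get csv --all-namespaces --no-headers \
    -o custom-columns=NAME:.metadata.name,PHASE:.status.phase
}

# Read "NAME PHASE" lines on stdin. For each pattern given as an argument,
# print the pattern if no matching CSV is in the Succeeded phase.
missing_operators() {
  local csvs pattern
  csvs="$(cat)"
  for pattern in "$@"; do
    if ! printf '%s\n' "$csvs" | grep -i -- "$pattern" | grep -qw 'Succeeded'; then
      echo "$pattern"
    fi
  done
}
```

For example, `list_operator_csvs | missing_operators nfd kernel-module-management gpu-operator` prints nothing when every pattern matches a CSV in the Succeeded phase, and otherwise prints the patterns that still need attention.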
The accelerator is detected a few minutes after the Node Feature Discovery (NFD) Operator and the relevant accelerator Operator are fully installed. Running oc describe node against the GPU worker node from the OpenShift command line interface (CLI) then lists the accelerator under both Capacity and Allocatable. For example, the following output confirms that an NVIDIA GPU is detected:
Expected output when the accelerator is detected correctly

    oc describe node <node name>
    ...
    Capacity:
      cpu:                4
      ephemeral-storage:  313981932Ki
      hugepages-1Gi:      0
      hugepages-2Mi:      0
      memory:             16076568Ki
      nvidia.com/gpu:     1
      pods:               250
    Allocatable:
      cpu:                3920m
      ephemeral-storage:  288292006229
      hugepages-1Gi:      0
      hugepages-2Mi:      0
      memory:             12828440Ki
      nvidia.com/gpu:     1
      pods:               250
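To check the node output without reading it by eye, the Allocatable value for the accelerator resource can be extracted with a small filter. This is a minimal sketch: `accelerator_count` is a hypothetical helper name, and it assumes the layout shown above, where `Capacity:` and `Allocatable:` start at the beginning of a line and the resource entries beneath them are indented.

```shell
#!/usr/bin/env bash
# Sketch only: accelerator_count is a hypothetical helper.
# Reads `oc describe node` output on stdin and prints the Allocatable value
# of the resource named in $1 (for example nvidia.com/gpu), or 0 if absent.
accelerator_count() {
  awk -v res="$1:" '
    /^Allocatable:/ { in_alloc = 1; next }   # enter the Allocatable section
    /^[^ \t]/       { in_alloc = 0 }         # any other top-level line ends it
    in_alloc && $1 == res { print $2; found = 1 }
    END { if (!found) print 0 }
  '
}
```

For example, `oc describe node <node name> | accelerator_count nvidia.com/gpu` prints the number of allocatable NVIDIA GPUs that the node reports.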