Chapter 7. Enabling accelerators
Before you can use an accelerator in OpenShift AI, you must install the relevant software components. The installation process varies based on the accelerator type.
Prerequisites
- You have logged in to your OpenShift cluster.
- You have the cluster-admin role in your OpenShift cluster.
- You have installed an accelerator and confirmed that it is detected in your environment.
Procedure
Follow the appropriate documentation to enable your accelerator:
- NVIDIA GPUs: See Enabling NVIDIA GPUs.
- Intel Gaudi AI accelerators: See Enabling Intel Gaudi AI accelerators.
- AMD GPUs: See Enabling AMD GPUs.
- IBM Spyre: See Enabling IBM Spyre accelerators.
After installing your accelerator, create a hardware profile as described in Working with hardware profiles.
Verification
From the Administrator perspective, go to the Ecosystem → Installed Operators page. Confirm that the following Operators appear:
- The Operator for your accelerator
- Node Feature Discovery (NFD)
- Kernel Module Management (KMM)
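The same check can also be scripted against the CLI instead of the web console. The following is a minimal sketch, not an official procedure: it assumes you have saved the output of `oc get csv -A -o json` and that each Operator's ClusterServiceVersion carries a `spec.displayName` containing the names listed above. The function name `missing_operators` and the sample data are hypothetical and for illustration only.

```python
import json


def missing_operators(csv_list_json, required_names):
    """Return the required names that have no matching CSV in phase Succeeded.

    csv_list_json: text produced by `oc get csv -A -o json` (assumed input).
    required_names: display-name fragments to look for (case-insensitive).
    """
    doc = json.loads(csv_list_json)
    # Keep only the display names of Operators that finished installing.
    installed = [
        item.get("spec", {}).get("displayName", "").lower()
        for item in doc.get("items", [])
        if item.get("status", {}).get("phase") == "Succeeded"
    ]
    return [
        name for name in required_names
        if not any(name.lower() in shown for shown in installed)
    ]


# Hypothetical, hand-written sample; real output has many more fields.
SAMPLE = json.dumps({
    "items": [
        {"spec": {"displayName": "Node Feature Discovery Operator"},
         "status": {"phase": "Succeeded"}},
        {"spec": {"displayName": "Kernel Module Management"},
         "status": {"phase": "Installing"}},
    ]
})

if __name__ == "__main__":
    wanted = ["Node Feature Discovery", "Kernel Module Management"]
    print(missing_operators(SAMPLE, wanted))
```

An empty list means every required Operator was found in the Succeeded phase. Add the display name of your accelerator Operator to the list of required names; the exact name depends on the vendor.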
The accelerator is correctly detected a few minutes after the Node Feature Discovery (NFD) Operator and the relevant accelerator Operator are fully installed. The OpenShift CLI (oc) then displays the accelerator as a resource on the worker nodes that have accelerator cards. For example, the following output confirms that an NVIDIA GPU is detected:

    # Expected output when the accelerator is detected correctly
    oc describe node <node name>
    ...
    Capacity:
      cpu:                4
      ephemeral-storage:  313981932Ki
      hugepages-1Gi:      0
      hugepages-2Mi:      0
      memory:             16076568Ki
      nvidia.com/gpu:     1
      pods:               250
    Allocatable:
      cpu:                3920m
      ephemeral-storage:  288292006229
      hugepages-1Gi:      0
      hugepages-2Mi:      0
      memory:             12828440Ki
      nvidia.com/gpu:     1
      pods:               250
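If you need to check many nodes, the node output above can be parsed rather than read by eye. The following sketch assumes the text comes from `oc describe node` and that resource entries are indented under the `Capacity:` and `Allocatable:` headings, as in the example. The helper names `parse_resources` and `accelerator_count` are hypothetical, and the sample text reuses the values shown above.

```python
def parse_resources(describe_text, section):
    """Collect 'name: value' pairs listed under a heading such as 'Capacity'."""
    resources = {}
    in_section = False
    for line in describe_text.splitlines():
        stripped = line.strip()
        if not in_section:
            # Look for the heading line, for example "Allocatable:".
            in_section = stripped == section + ":"
            continue
        # The section ends at the first blank or non-indented line.
        if not stripped or not line.startswith((" ", "\t")):
            break
        name, _, value = stripped.partition(":")
        resources[name.strip()] = value.strip()
    return resources


def accelerator_count(describe_text, resource="nvidia.com/gpu"):
    """Return the allocatable count for the resource, or 0 if it is absent."""
    value = parse_resources(describe_text, "Allocatable").get(resource, "0")
    return int(value)


# Sample text copied from the expected output in this chapter.
SAMPLE = """\
Capacity:
  cpu:                4
  memory:             16076568Ki
  nvidia.com/gpu:     1
  pods:               250
Allocatable:
  cpu:                3920m
  memory:             12828440Ki
  nvidia.com/gpu:     1
  pods:               250
"""

if __name__ == "__main__":
    print(accelerator_count(SAMPLE))
```

A count of 0 suggests that the accelerator is not yet exposed on that node. The default resource name `nvidia.com/gpu` is taken from the example; other accelerators use different resource names, so pass the one that applies to your hardware.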