このコンテンツは選択した言語では利用できません。
Chapter 7. Creating a hardware profile for LAB-tuning
Configure a GPU hardware profile in OpenShift AI that users can select when launching a LAB-tuning run.
A GPU hardware profile is required to run LAB-tuning workloads in OpenShift AI. LAB-tuning uses distributed training that must be scheduled on nodes with GPU resources. A GPU hardware profile allows users to target specific GPU-enabled worker nodes when launching pipelines, ensuring that training workloads run on compatible hardware.
Prerequisites
- You are logged in to OpenShift AI as a user with administrator privileges.
- The relevant hardware is installed and you have confirmed that it is detected in your environment.
Procedure
Follow the steps described in Creating a hardware profile to create a
LAB-tuning
hardware profile, adapting the following configurations to your specific cluster setup:Expand Setting Value CPU: Default
4 Cores
CPU: Minimum allowed
2 Cores
CPU: Maximum allowed
4 Cores
Memory: Maximum allowed
250 GiB or more
Resource label
nvidia.com/gpu
Resource identifier
nvidia.com/gpu
Resource type
Accelerator
Node selector key (optional)
node.kubernetes.io/instance-type
Node selector value
a2-ultragpu-2g
Toleration operator (optional)
Exists
Toleration key
nvidia.com/gpu
Toleration effect
NoSchedule
- Ensure that the new hardware profile is available for use with a checkmark in the Enable column.