8.2. Example Kueue resource configurations
These examples show how to configure Kueue resource flavors and cluster queues.
Note
In OpenShift AI 2.17, Red Hat does not support shared cohorts.
8.2.1. NVIDIA GPUs without shared cohorts
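8.2.1.1. NVIDIA RTX A400 GPU resource flavor
apiVersion: kueue.x-k8s.io/v1beta1
kind: ResourceFlavor
metadata:
  name: "A400-node"
spec:
  # Nodes that carry this label belong to the flavor.
  nodeLabels:
    instance-type: nvidia-a400-node
  # Tolerates nodes tainted with the HasGPU key and the NoSchedule effect.
  tolerations:
  - key: "HasGPU"
    operator: "Exists"
    effect: "NoSchedule"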
8.2.1.2. NVIDIA RTX A1000 GPU resource flavor
apiVersion: kueue.x-k8s.io/v1beta1
kind: ResourceFlavor
metadata:
  name: "A1000-node"
spec:
  nodeLabels:
    instance-type: nvidia-a1000-node
  tolerations:
  - key: "HasGPU"
    operator: "Exists"
    effect: "NoSchedule"
8.2.1.3. NVIDIA RTX A400 GPU cluster queue
apiVersion: kueue.x-k8s.io/v1beta1
kind: ClusterQueue
metadata:
  name: "A400-queue"
spec:
  namespaceSelector: {} # match all.
  resourceGroups:
  - coveredResources: ["cpu", "memory", "nvidia.com/gpu"]
    # Each flavor must set a quota for every covered resource.
    flavors:
    - name: "A400-node"
      resources:
      - name: "cpu"
        nominalQuota: 16
      - name: "memory"
        nominalQuota: 64Gi
      - name: "nvidia.com/gpu"
        nominalQuota: 2
8.2.1.4. NVIDIA RTX A1000 GPU cluster queue
apiVersion: kueue.x-k8s.io/v1beta1
kind: ClusterQueue
metadata:
  name: "A1000-queue"
spec:
  namespaceSelector: {} # match all.
  resourceGroups:
  - coveredResources: ["cpu", "memory", "nvidia.com/gpu"]
    flavors:
    - name: "A1000-node"
      resources:
      - name: "cpu"
        nominalQuota: 16
      - name: "memory"
        nominalQuota: 64Gi
      - name: "nvidia.com/gpu"
        nominalQuota: 2
8.2.2. NVIDIA GPUs and AMD GPUs without shared cohorts
8.2.2.1. AMD GPU resource flavor
apiVersion: kueue.x-k8s.io/v1beta1
kind: ResourceFlavor
metadata:
  name: "amd-node"
spec:
  nodeLabels:
    instance-type: amd-node
  tolerations:
  - key: "HasGPU"
    operator: "Exists"
    effect: "NoSchedule"
8.2.2.2. NVIDIA GPU resource flavor
apiVersion: kueue.x-k8s.io/v1beta1
kind: ResourceFlavor
metadata:
  name: "nvidia-node"
spec:
  nodeLabels:
    instance-type: nvidia-node
  tolerations:
  - key: "HasGPU"
    operator: "Exists"
    effect: "NoSchedule"
8.2.2.3. AMD GPU cluster queue
apiVersion: kueue.x-k8s.io/v1beta1
kind: ClusterQueue
metadata:
  name: "team-a-amd-queue"
spec:
  namespaceSelector: {} # match all.
  resourceGroups:
  - coveredResources: ["cpu", "memory", "amd.com/gpu"]
    flavors:
    - name: "amd-node"
      resources:
      - name: "cpu"
        nominalQuota: 16
      - name: "memory"
        nominalQuota: 64Gi
      - name: "amd.com/gpu"
        nominalQuota: 2 # assumed value, matching the NVIDIA GPU cluster queue
8.2.2.4. NVIDIA GPU cluster queue
apiVersion: kueue.x-k8s.io/v1beta1
kind: ClusterQueue
metadata:
  name: "team-a-nvidia-queue"
spec:
  namespaceSelector: {} # match all.
  resourceGroups:
  - coveredResources: ["cpu", "memory", "nvidia.com/gpu"]
    flavors:
    - name: "nvidia-node"
      resources:
      - name: "cpu"
        nominalQuota: 16
      - name: "memory"
        nominalQuota: 64Gi
      - name: "nvidia.com/gpu"
        nominalQuota: 2
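Workloads typically reach a cluster queue through a namespaced LocalQueue. As a minimal sketch, the following manifests point a local queue at the team-a-nvidia-queue cluster queue and submit a batch Job to it by using the kueue.x-k8s.io/queue-name label. The team-a namespace, the object names team-a-nvidia-local and sample-gpu-job, the container image, and the resource requests are all illustrative assumptions and do not come from this guide.

```yaml
# Hypothetical LocalQueue; the namespace and name are illustrative assumptions.
apiVersion: kueue.x-k8s.io/v1beta1
kind: LocalQueue
metadata:
  namespace: team-a
  name: team-a-nvidia-local
spec:
  clusterQueue: team-a-nvidia-queue # the NVIDIA GPU cluster queue in 8.2.2.4
---
# Hypothetical Job that requests one GPU through the local queue.
apiVersion: batch/v1
kind: Job
metadata:
  namespace: team-a
  name: sample-gpu-job
  labels:
    kueue.x-k8s.io/queue-name: team-a-nvidia-local
spec:
  suspend: true # created suspended; Kueue starts the Job once it is admitted
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: main
        image: registry.example.com/team-a/gpu-sample:latest # placeholder image
        command: ["nvidia-smi"]
        resources:
          requests:
            cpu: "1"
            memory: 2Gi
            nvidia.com/gpu: 1
          limits:
            nvidia.com/gpu: 1
```

With these assumed values, the Job's requests (1 CPU, 2Gi of memory, 1 GPU) fit within the nominal quota of the cluster queue (16 CPUs, 64Gi of memory, 2 GPUs), so the Job should be admitted as long as other workloads in the queue have not already used up that quota.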