3.3. Testing the AMD GPU Operator
Use the following procedure to test the ROCmInfo installation and view the logs for the AMD MI210 GPU.
Procedure
Create a YAML file that tests ROCmInfo:
$ cat << EOF > rocminfo.yaml
apiVersion: v1
kind: Pod
metadata:
  name: rocminfo
spec:
  containers:
  - image: docker.io/rocm/pytorch:latest
    name: rocminfo
    command: ["/bin/sh","-c"]
    args: ["rocminfo"]
    resources:
      limits:
        amd.com/gpu: 1
      requests:
        amd.com/gpu: 1
  restartPolicy: Never
EOF
Create the rocminfo pod:
$ oc create -f rocminfo.yaml
Example output
pod/rocminfo created
Check the rocminfo logs for the pod, which requests one MI210 GPU:
$ oc logs rocminfo | grep -A5 "Agent"
Example output
HSA Agents
==========
*******
Agent 1
*******
  Name:                    Intel(R) Xeon(R) Gold 6330 CPU @ 2.00GHz
  Uuid:                    CPU-XX
  Marketing Name:          Intel(R) Xeon(R) Gold 6330 CPU @ 2.00GHz
  Vendor Name:             CPU
--
Agent 2
*******
  Name:                    Intel(R) Xeon(R) Gold 6330 CPU @ 2.00GHz
  Uuid:                    CPU-XX
  Marketing Name:          Intel(R) Xeon(R) Gold 6330 CPU @ 2.00GHz
  Vendor Name:             CPU
--
Agent 3
*******
  Name:                    gfx90a
  Uuid:                    GPU-024b776f768a638b
  Marketing Name:          AMD Instinct MI210
  Vendor Name:             AMD
Delete the pod:
$ oc delete -f rocminfo.yaml
Example output
pod "rocminfo" deleted
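The log check in this procedure is done by eye. For a scripted check, the following is a minimal sketch rather than part of the documented procedure. It assumes that GPU agents report a Uuid value beginning with GPU-, as in the example output above. The function names count_gpu_agents and check_gpu_agents are hypothetical helpers introduced for illustration. Run it while the rocminfo pod still exists, that is, before the delete step.

```shell
#!/bin/sh
# Sketch: count GPU agents in rocminfo output read from standard input.
# Assumption: a GPU agent has a "Uuid:" line whose value starts with "GPU-",
# as in the example output of this procedure.
count_gpu_agents() {
    # grep -c prints the number of matching lines. It exits non-zero when the
    # count is 0, so "|| true" keeps that case from aborting under "set -e".
    grep -c -E '^[[:space:]]*Uuid:[[:space:]]+GPU-' || true
}

# Hypothetical helper: succeed only if at least $1 GPU agents appear on stdin.
check_gpu_agents() {
    expected="$1"
    found="$(count_gpu_agents)"
    if [ "$found" -ge "$expected" ]; then
        echo "OK: found $found GPU agent(s)"
    else
        echo "FAIL: found $found GPU agent(s), expected at least $expected" >&2
        return 1
    fi
}

# Usage (before deleting the pod):
#   oc logs rocminfo | check_gpu_agents 1
```

Because the pod requests a single amd.com/gpu resource, a threshold of 1 matches this test. Matching on the Uuid prefix is a convenience tied to the output shown here, and other rocminfo versions may need a different pattern.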