Chapter 3. Installing the Node Feature Discovery Operator and NVIDIA GPU Operator


Install the Node Feature Discovery Operator and the NVIDIA GPU Operator so that workloads can use the AI accelerators on the underlying hosts.

Prerequisites

  • You have installed the OpenShift CLI (oc).
  • You have logged in as a user with cluster-admin privileges.
  • You have successfully mirrored the required Operator images in the disconnected environment.

Procedure

  1. Disable the default OperatorHub sources by running the following command:

    $ oc patch OperatorHub cluster --type json \
        -p '[{"op": "add", "path": "/spec/disableAllDefaultSources", "value": true}]'
  2. Apply the Namespace, OperatorGroup, and Subscription CRs for the Node Feature Discovery Operator and the NVIDIA GPU Operator.

    1. Create the Namespace CRs:

      $ oc apply -f - <<EOF
      apiVersion: v1
      kind: Namespace
      metadata:
        name: nvidia-gpu-operator
      ---
      apiVersion: v1
      kind: Namespace
      metadata:
        name: openshift-nfd
        labels:
          name: openshift-nfd
          openshift.io/cluster-monitoring: "true"
      EOF
    2. Create the OperatorGroup CRs:

      $ oc apply -f - <<EOF
      apiVersion: operators.coreos.com/v1
      kind: OperatorGroup
      metadata:
        name: gpu-operator-certified
        namespace: nvidia-gpu-operator
      spec:
        targetNamespaces:
        - nvidia-gpu-operator
      ---
      apiVersion: operators.coreos.com/v1
      kind: OperatorGroup
      metadata:
        name: openshift-nfd
        namespace: openshift-nfd
      spec:
        targetNamespaces:
        - openshift-nfd
      EOF
    3. Create the Subscription CRs:

      $ oc apply -f - <<EOF
      apiVersion: operators.coreos.com/v1alpha1
      kind: Subscription
      metadata:
        name: gpu-operator-certified
        namespace: nvidia-gpu-operator
      spec:
        channel: "stable"
        installPlanApproval: Manual
        name: gpu-operator-certified
        source: certified-operators
        sourceNamespace: openshift-marketplace
      ---
      apiVersion: operators.coreos.com/v1alpha1
      kind: Subscription
      metadata:
        name: nfd
        namespace: openshift-nfd
      spec:
        channel: "stable"
        installPlanApproval: Automatic
        name: nfd
        source: redhat-operators
        sourceNamespace: openshift-marketplace
      EOF
  3. Create a Secret for the Hugging Face token.

    1. Set the HF_TOKEN variable to the access token that you created in Hugging Face:

      $ HF_TOKEN=<your_huggingface_token>
    2. Set the NAMESPACE variable to the namespace where you deploy the Red Hat AI Inference Server image, for example:

      $ NAMESPACE=rhaiis-namespace
    3. Create the Secret CR in the cluster:

      $ oc create secret generic hf-secret --from-literal=HF_TOKEN="$HF_TOKEN" -n "$NAMESPACE"
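
The NVIDIA GPU Operator Subscription in step 2 sets installPlanApproval: Manual, so Operator Lifecycle Manager waits until the generated InstallPlan is approved before it installs the Operator. The following is a minimal sketch of one way to approve pending InstallPlans from the CLI. The function name approve_pending_installplans and its default namespace argument are illustrative assumptions, not part of the original procedure. Review each InstallPlan before you approve it in a production cluster.

```shell
# Sketch: approve every unapproved InstallPlan in one namespace.
# Assumption: the function name and the default namespace are illustrative
# and are not part of the documented procedure.
approve_pending_installplans() {
  ns="${1:-nvidia-gpu-operator}"
  # Print "<name> <approved>" for each InstallPlan in the namespace.
  oc get installplan -n "$ns" \
    -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.spec.approved}{"\n"}{end}' |
  while read -r name approved; do
    # Skip blank lines and InstallPlans that are already approved.
    [ -n "$name" ] || continue
    [ "$approved" = "true" ] && continue
    echo "Approving InstallPlan $name in $ns"
    # Redirect stdin so the patch command cannot consume the loop input.
    oc patch installplan "$name" -n "$ns" --type merge \
      -p '{"spec":{"approved":true}}' < /dev/null
  done
}
```

After you define the function in your shell, you might call it as approve_pending_installplans nvidia-gpu-operator. The Node Feature Discovery Operator Subscription uses Automatic approval, so it needs no equivalent step.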

Verification

Verify that the Operator deployments are successful by listing the pods in both Operator namespaces:

$ oc get pods -n openshift-nfd
$ oc get pods -n nvidia-gpu-operator

Example output (both namespaces combined)

NAME                                                  READY   STATUS     RESTARTS   AGE
nfd-controller-manager-7f86ccfb58-vgr4x               2/2     Running    0          10m
gpu-feature-discovery-c2rfm                           1/1     Running    0          6m28s
gpu-operator-84b7f5bcb9-vqds7                         1/1     Running    0          39m
nvidia-container-toolkit-daemonset-pgcrf              1/1     Running    0          6m28s
nvidia-cuda-validator-p8gv2                           0/1     Completed  0          99s
nvidia-dcgm-exporter-kv6k8                            1/1     Running    0          6m28s
nvidia-dcgm-tpsps                                     1/1     Running    0          6m28s
nvidia-device-plugin-daemonset-gbn55                  1/1     Running    0          6m28s
nvidia-device-plugin-validator-z7ltr                  0/1     Completed  0          82s
nvidia-driver-daemonset-410.84.202203290245-0-xxgdv   2/2     Running    0          6m28s
nvidia-node-status-exporter-snmsm                     1/1     Running    0          6m28s
nvidia-operator-validator-6pfk6                       1/1     Running    0          6m28s
...
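
As a scripted alternative to reading the pod list by eye, the following sketch flags pods whose STATUS column is neither Running nor Completed. The function name check_operator_pods is a hypothetical helper. The sketch assumes the default column layout of oc get pods --no-headers (NAME, READY, STATUS, and so on) and the openshift-nfd and nvidia-gpu-operator namespaces created in the procedure.

```shell
# Sketch: report pods that are neither Running nor Completed.
# Assumption: check_operator_pods is an illustrative helper name; it relies on
# STATUS being the third column of `oc get pods --no-headers`.
check_operator_pods() {
  rc=0
  for ns in "$@"; do
    # Errors such as "No resources found" go to stderr and are discarded here.
    pods="$(oc get pods -n "$ns" --no-headers 2>/dev/null)"
    if [ -z "$pods" ]; then
      echo "No pods found in $ns"
      rc=1
      continue
    fi
    # Keep only the rows whose STATUS is not Running or Completed.
    not_ready="$(printf '%s\n' "$pods" |
      awk '$3 != "Running" && $3 != "Completed" {print $1 " (" $3 ")"}')"
    if [ -n "$not_ready" ]; then
      echo "Pods not ready in $ns:"
      echo "$not_ready"
      rc=1
    else
      echo "All pods in $ns are Running or Completed"
    fi
  done
  return "$rc"
}
```

For example, check_operator_pods openshift-nfd nvidia-gpu-operator returns a nonzero exit status if either namespace has no pods or has a pod in another state, which makes it usable in a retry loop or a CI check.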
