Chapter 3. Installing the Node Feature Discovery Operator and NVIDIA GPU Operator


Install the Node Feature Discovery Operator and the NVIDIA GPU Operator so that workloads on the cluster can use the AI accelerators on the underlying hosts.

Prerequisites

  • You have installed the OpenShift CLI (oc).
  • You have logged in as a user with cluster-admin privileges.
  • You have successfully mirrored the required Operator images in the disconnected environment.

Procedure

  1. Disable the default OperatorHub sources. Run the following command:

    $ oc patch OperatorHub cluster --type json \
        -p '[{"op": "add", "path": "/spec/disableAllDefaultSources", "value": true}]'
  2. Apply the Namespace, OperatorGroup, and Subscription CRs for the Node Feature Discovery Operator and the NVIDIA GPU Operator.

    1. Create the Namespace CRs:

      $ oc apply -f - <<EOF
      apiVersion: v1
      kind: Namespace
      metadata:
        name: nvidia-gpu-operator
      ---
      apiVersion: v1
      kind: Namespace
      metadata:
        name: openshift-nfd
        labels:
          name: openshift-nfd
          openshift.io/cluster-monitoring: "true"
      EOF
    2. Create the OperatorGroup CRs:

      $ oc apply -f - <<EOF
      apiVersion: operators.coreos.com/v1
      kind: OperatorGroup
      metadata:
        name: gpu-operator-certified
        namespace: nvidia-gpu-operator
      spec:
        targetNamespaces:
        - nvidia-gpu-operator
      ---
      apiVersion: operators.coreos.com/v1
      kind: OperatorGroup
      metadata:
        name: openshift-nfd
        namespace: openshift-nfd
      spec:
        targetNamespaces:
        - openshift-nfd
      EOF
    3. Create the Subscription CRs:

      $ oc apply -f - <<EOF
      apiVersion: operators.coreos.com/v1alpha1
      kind: Subscription
      metadata:
        name: gpu-operator-certified
        namespace: nvidia-gpu-operator
      spec:
        channel: "stable"
        installPlanApproval: Manual
        name: gpu-operator-certified
        source: certified-operators
        sourceNamespace: openshift-marketplace
      ---
      apiVersion: operators.coreos.com/v1alpha1
      kind: Subscription
      metadata:
        name: nfd
        namespace: openshift-nfd
      spec:
        channel: "stable"
        installPlanApproval: Automatic
        name: nfd
        source: redhat-operators
        sourceNamespace: openshift-marketplace
      EOF
  3. Create a Secret for the Hugging Face token.

    1. Set the HF_TOKEN variable to the access token that you created in Hugging Face:

      $ HF_TOKEN=<your_huggingface_token>
    2. Set the NAMESPACE variable to the namespace where you deploy the Red Hat AI Inference Server image, for example:

      $ NAMESPACE=rhaiis-namespace
    3. Create the Secret in the cluster:

      $ oc create secret generic hf-secret --from-literal=HF_TOKEN="$HF_TOKEN" -n "$NAMESPACE"
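
The NVIDIA GPU Operator Subscription in step 2 sets installPlanApproval: Manual, so Operator Lifecycle Manager waits until the generated InstallPlan is approved before it installs the Operator. The following is a minimal sketch of one way to approve pending InstallPlans from the CLI. The function name approve_pending_installplans is hypothetical and is not part of the original procedure. The sketch assumes that you are logged in with oc as a user with cluster-admin privileges.

```shell
# Hypothetical helper: approve every InstallPlan in a namespace that is
# still waiting for manual approval.
approve_pending_installplans() {
  local ns="$1"
  local plan approved
  # Print "<name> <approved>" for each InstallPlan, one per line.
  oc get installplan -n "$ns" \
    -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.spec.approved}{"\n"}{end}' |
  while read -r plan approved; do
    if [ -n "$plan" ] && [ "$approved" = "false" ]; then
      # Setting spec.approved to true lets OLM proceed with the installation.
      oc patch installplan "$plan" -n "$ns" --type merge \
        -p '{"spec":{"approved":true}}' < /dev/null
    fi
  done
}

# Example invocation (uncomment to run against your cluster):
# approve_pending_installplans nvidia-gpu-operator
```

Review each InstallPlan before you approve it if your change process requires it. The Node Feature Discovery Operator Subscription uses Automatic approval and does not need this step.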

Verification

Verify that the Operator deployments are successful by running the following command:

$ oc get pods

Example output

NAME                                                  READY   STATUS     RESTARTS   AGE
nfd-controller-manager-7f86ccfb58-vgr4x               2/2     Running    0          10m
gpu-feature-discovery-c2rfm                           1/1     Running    0          6m28s
gpu-operator-84b7f5bcb9-vqds7                         1/1     Running    0          39m
nvidia-container-toolkit-daemonset-pgcrf              1/1     Running    0          6m28s
nvidia-cuda-validator-p8gv2                           0/1     Completed  0          99s
nvidia-dcgm-exporter-kv6k8                            1/1     Running    0          6m28s
nvidia-dcgm-tpsps                                     1/1     Running    0          6m28s
nvidia-device-plugin-daemonset-gbn55                  1/1     Running    0          6m28s
nvidia-device-plugin-validator-z7ltr                  0/1     Completed  0          82s
nvidia-driver-daemonset-410.84.202203290245-0-xxgdv   2/2     Running    0          6m28s
nvidia-node-status-exporter-snmsm                     1/1     Running    0          6m28s
nvidia-operator-validator-6pfk6                       1/1     Running    0          6m28s
...
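
The Namespace CRs in the procedure place the Operators in the nvidia-gpu-operator and openshift-nfd namespaces, so oc get pods lists them only if your current project is one of those namespaces. As a hedged sketch, the following helper queries both namespaces explicitly and also lists the ClusterServiceVersion objects that Operator Lifecycle Manager creates for installed Operators. The function name list_operator_status is hypothetical.

```shell
# Hypothetical helper: show ClusterServiceVersions and pods for both
# Operator namespaces created in the procedure.
list_operator_status() {
  local ns
  for ns in nvidia-gpu-operator openshift-nfd; do
    echo "== $ns =="
    # A CSV in the Succeeded phase indicates that the Operator installed.
    oc get csv -n "$ns"
    oc get pods -n "$ns"
  done
}

# Example invocation (uncomment to run against your cluster):
# list_operator_status
```

Which pods appear depends on the state of your cluster, so the output might not match the example output exactly.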
