Chapter 3. Managing and monitoring models on the NVIDIA NIM model serving platform


As a cluster administrator, you can manage and monitor models on the NVIDIA NIM model serving platform. You can customize your NVIDIA NIM model selection options and enable metrics for a NIM model, among other tasks.

The NVIDIA NIM model serving platform provides access to all available NVIDIA NIM models from the NVIDIA GPU Cloud (NGC). You can deploy a NIM model by selecting it from the NVIDIA NIM list in the Deploy model dialog. To customize the models that appear in the list, you can create a ConfigMap object specifying your preferred models.

Prerequisites

  • You have cluster administrator privileges for your OpenShift cluster.
  • You have an NVIDIA Cloud Account (NCA) and can access the NVIDIA GPU Cloud (NGC) portal.
  • You know the IDs of the NVIDIA NIM models that you want to make available for selection on the NVIDIA NIM model serving platform.

    Note
    • You can find the model ID from the NGC Catalog. The ID is usually part of the URL path.
    • You can also find the model ID by using the NGC CLI. For more information, see NGC CLI reference.
  • You know the name and namespace of your Account custom resource (CR).

Procedure

  1. In a terminal window, log in to the OpenShift CLI as a cluster administrator as shown in the following example:

    oc login <openshift_cluster_url> -u <admin_username> -p <password>
  2. Define a ConfigMap object in a YAML file, similar to the one in the following example, containing the model IDs that you want to make available for selection on the NVIDIA NIM model serving platform:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: nvidia-nim-enabled-models
    data:
      models: |-
        [
          "mistral-nemo-12b-instruct",
          "llama3-70b-instruct",
          "phind-codellama-34b-v2-instruct",
          "deepseek-r1",
          "qwen-2.5-72b-instruct"
        ]
  3. Confirm the name and namespace of your Account CR:

    oc get account -A

    You see output similar to the following example:

    NAMESPACE                 NAME              TEMPLATE   CONFIGMAP   SECRET
    redhat-ods-applications   odh-nim-account
  4. Deploy the ConfigMap object in the same namespace as your Account CR:

    oc apply -f <configmap-file> -n <namespace>

    Replace <configmap-file> with the name of the YAML file that contains your ConfigMap object, and <namespace> with the namespace of your Account CR.

  5. Add the ConfigMap object that you previously created to the spec.modelListConfig section of your Account CR:

    oc patch account <account-name> \
      --type='merge' \
      -p '{"spec": {"modelListConfig": {"name": "<configmap-name>"}}}'

    Replace <account-name> with the name of your Account CR, and <configmap-name> with the name of your ConfigMap object.

  6. Confirm that the ConfigMap object is added to your Account CR:

    oc get account <account-name> -o yaml

    You see the ConfigMap object in the spec.modelListConfig section of your Account CR, similar to the following output:

    spec:
      enabledModelsConfig:
      modelListConfig:
        name: <configmap-name>
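The ConfigMap contents and the Account patch used in the steps above can be checked locally before you run oc apply and oc patch. The following Python sketch (using the example model IDs and the example ConfigMap name nvidia-nim-enabled-models from this procedure) validates that the models field parses as a JSON array of non-empty model ID strings, and builds the merge-patch body that is passed to the -p flag:

```python
import json

# The "models" field of the ConfigMap must hold a JSON array of
# NIM model ID strings, exactly as in the example above.
models_field = """
[
  "mistral-nemo-12b-instruct",
  "llama3-70b-instruct",
  "phind-codellama-34b-v2-instruct",
  "deepseek-r1",
  "qwen-2.5-72b-instruct"
]
"""

def validate_models(raw: str) -> list:
    """Parse the models field and confirm it is a list of non-empty strings."""
    parsed = json.loads(raw)
    if not isinstance(parsed, list):
        raise ValueError("models must be a JSON array")
    for entry in parsed:
        if not isinstance(entry, str) or not entry:
            raise ValueError("invalid model ID: %r" % (entry,))
    return parsed

def model_list_patch(configmap_name: str) -> str:
    """Build the merge-patch body for spec.modelListConfig."""
    return json.dumps({"spec": {"modelListConfig": {"name": configmap_name}}})

print(validate_models(models_field))
print(model_list_patch("nvidia-nim-enabled-models"))
```

A JSON parse error here points to a stray comma or quote in the models list that is easier to catch locally than after the ConfigMap is deployed.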

Verification

  • In the OpenShift AI dashboard, begin deploying a model on the NVIDIA NIM model serving platform and confirm that the NVIDIA NIM list in the Deploy model dialog shows only the models that you specified in the ConfigMap object.

Enabling metrics and graph generation for an existing NIM deployment

If you previously deployed a NIM model in OpenShift AI and then upgraded, you must manually enable NIM metrics for the existing deployment by adding annotations that enable metrics collection and graph generation.

Note

NIM metrics and graphs are automatically enabled for new deployments in 2.17.

The following procedure describes how to enable graph generation for an existing NIM deployment.

Prerequisites

  • You have cluster administrator privileges for your OpenShift cluster.
  • You have installed the OpenShift CLI (oc).
  • You have an existing NIM deployment in OpenShift AI.

Procedure

  1. In a terminal window, if you are not already logged in to your OpenShift cluster as a cluster administrator, log in to the OpenShift CLI (oc).
  2. Confirm the name of the ServingRuntime associated with your NIM deployment:

    oc get servingruntime -n <namespace>

    Replace <namespace> with the namespace of the project where your NIM model is deployed.

  3. Check for an existing metadata.annotations section in the ServingRuntime configuration:

    oc get servingruntime -n <namespace> <servingruntime-name> -o json | jq '.metadata.annotations'

    Replace <servingruntime-name> with the name of the ServingRuntime from the previous step.

  4. Perform one of the following actions:

    1. If the metadata.annotations section is not present in the configuration, add the section with the required annotations:

      oc patch servingruntime -n <namespace> <servingruntime-name> --type json --patch \
       '[{"op": "add", "path": "/metadata/annotations", "value": {"runtimes.opendatahub.io/nvidia-nim": "true"}}]'

      You see output similar to the following:

      servingruntime.serving.kserve.io/nim-serving-runtime patched
    2. If there is an existing metadata.annotations section, add the required annotations to the section:

      oc patch servingruntime -n <namespace> <servingruntime-name> --type json --patch \
       '[{"op": "add", "path": "/metadata/annotations/runtimes.opendatahub.io~1nvidia-nim", "value": "true"}]'

      You see output similar to the following:

      servingruntime.serving.kserve.io/nim-serving-runtime patched
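The ~1 in the patch path above is not part of the annotation name: JSON Patch paths are JSON Pointers (RFC 6901), in which ~ is escaped as ~0 and / is escaped as ~1. The annotation key runtimes.opendatahub.io/nvidia-nim contains a slash, so it must be escaped inside the path field. A minimal Python sketch of the escaping rule:

```python
def escape_json_pointer_token(token: str) -> str:
    """Escape one JSON Pointer reference token per RFC 6901:
    '~' becomes '~0' first, then '/' becomes '~1'."""
    return token.replace("~", "~0").replace("/", "~1")

# The annotation key contains a slash, so it must be escaped when
# used inside the JSON Patch "path" field.
annotation = "runtimes.opendatahub.io/nvidia-nim"
path = "/metadata/annotations/" + escape_json_pointer_token(annotation)
print(path)  # /metadata/annotations/runtimes.opendatahub.io~1nvidia-nim
```

The replacement order matters: escaping ~ after / would corrupt the ~1 sequences that the first replacement produced.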

Verification

  • Confirm that the annotation has been added to the ServingRuntime of your existing NIM deployment.

    oc get servingruntime -n <namespace> <servingruntime-name> -o json | jq '.metadata.annotations'

    The annotation that you added is displayed in the output:

    ...
    "runtimes.opendatahub.io/nvidia-nim": "true"
    Note

    For metrics to be available for graph generation, you must also enable metrics collection for your deployment. For more information, see Enabling metrics collection for an existing NIM deployment.

Enabling metrics collection for an existing NIM deployment

To enable metrics collection for your existing NIM deployment, you must manually add the Prometheus endpoint and port annotations to the InferenceService of your deployment.

The following procedure describes how to add the required Prometheus annotations to the InferenceService of your NIM deployment.

Prerequisites

  • You have cluster administrator privileges for your OpenShift cluster.
  • You have installed the OpenShift CLI (oc).
  • You have an existing NIM deployment in OpenShift AI.

Procedure

  1. In a terminal window, if you are not already logged in to your OpenShift cluster as a cluster administrator, log in to the OpenShift CLI (oc).
  2. Confirm the name of the InferenceService associated with your NIM deployment:

    oc get inferenceservice -n <namespace>

    Replace <namespace> with the namespace of the project where your NIM model is deployed.

  3. Check if there is an existing spec.predictor.annotations section in the InferenceService configuration:

    oc get inferenceservice -n <namespace> <inferenceservice-name> -o json | jq '.spec.predictor.annotations'

    Replace <inferenceservice-name> with the name of the InferenceService from the previous step.

  4. Perform one of the following actions:

    1. If the spec.predictor.annotations section does not exist in the configuration, add the section and required annotations:

      oc patch inferenceservice -n <namespace> <inferenceservice-name> --type json --patch \
       '[{"op": "add", "path": "/spec/predictor/annotations", "value": {"prometheus.io/path": "/metrics", "prometheus.io/port": "8000"}}]'

      You see output similar to the following:

      inferenceservice.serving.kserve.io/nim-serving-runtime patched
    2. If there is an existing spec.predictor.annotations section, add the Prometheus annotations to the section:

      oc patch inferenceservice -n <namespace> <inferenceservice-name> --type json --patch \
       '[{"op": "add", "path": "/spec/predictor/annotations/prometheus.io~1path", "value": "/metrics"},
       {"op": "add", "path": "/spec/predictor/annotations/prometheus.io~1port", "value": "8000"}]'

      You see output similar to the following:

      inferenceservice.serving.kserve.io/nim-serving-runtime patched
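As with the ServingRuntime patch, prometheus.io~1path and prometheus.io~1port are JSON Pointer escapes (RFC 6901) of the annotation keys prometheus.io/path and prometheus.io/port. The following sketch builds the two-operation patch list used in the second action above; port 8000 is the value used in this procedure, so adjust it if your deployment exposes metrics on a different port:

```python
import json

def prometheus_annotation_patch(metrics_path: str = "/metrics",
                                port: str = "8000") -> list:
    """Build the JSON Patch operations that add the Prometheus
    scrape annotations to spec.predictor.annotations."""
    return [
        {"op": "add",
         "path": "/spec/predictor/annotations/prometheus.io~1path",
         "value": metrics_path},
        {"op": "add",
         "path": "/spec/predictor/annotations/prometheus.io~1port",
         "value": port},
    ]

# The serialized form is the string passed to "oc patch ... --patch".
print(json.dumps(prometheus_annotation_patch()))
```

Note that the port value is a string, not a number: annotation values in Kubernetes metadata are always strings.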

Verification

  • Confirm that the annotations have been added to the InferenceService.

    oc get inferenceservice -n <namespace> <inferenceservice-name> -o json | jq '.spec.predictor.annotations'

    You see the annotations that you added in the output:

    {
      "prometheus.io/path": "/metrics",
      "prometheus.io/port": "8000"
    }