Chapter 3. Managing and monitoring models on the NVIDIA NIM model serving platform
As a cluster administrator, you can manage and monitor models on the NVIDIA NIM model serving platform. You can customize your NVIDIA NIM model selection options and enable metrics for a NIM model, among other tasks.
3.1. Customizing model selection options for the NVIDIA NIM model serving platform
The NVIDIA NIM model serving platform provides access to all available NVIDIA NIM models from the NVIDIA GPU Cloud (NGC). You can deploy a NIM model by selecting it from the NVIDIA NIM list in the Deploy model dialog. To customize the models that appear in the list, you can create a ConfigMap object specifying your preferred models.
Prerequisites
- You have cluster administrator privileges for your OpenShift cluster.
- You have an NVIDIA Cloud Account (NCA) and can access the NVIDIA GPU Cloud (NGC) portal.
- You know the IDs of the NVIDIA NIM models that you want to make available for selection on the NVIDIA NIM model serving platform.

  Note:
  - You can find the model ID in the NGC Catalog. The ID is usually part of the URL path.
  - You can also find the model ID by using the NGC CLI. For more information, see the NGC CLI reference.
- You know the name and namespace of your Account custom resource (CR).
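The note above says that the model ID is usually part of the NGC Catalog URL path. As a minimal sketch (the URL below is illustrative, not a real catalog entry), the ID can be read off as the last path segment:

```python
from urllib.parse import urlparse

# Hypothetical NGC Catalog page URL; the model ID is usually the last
# segment of the URL path.
url = "https://catalog.ngc.nvidia.com/orgs/nim/teams/meta/containers/llama3-70b-instruct"

model_id = urlparse(url).path.rstrip("/").split("/")[-1]
print(model_id)  # llama3-70b-instruct
```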
Procedure
1. In a terminal window, log in to the OpenShift CLI as a cluster administrator, as shown in the following example:

   ```
   oc login <openshift_cluster_url> -u <admin_username> -p <password>
   ```

2. Define a ConfigMap object in a YAML file, similar to the following example, containing the model IDs that you want to make available for selection on the NVIDIA NIM model serving platform:

   ```yaml
   apiVersion: v1
   kind: ConfigMap
   metadata:
     name: nvidia-nim-enabled-models
   data:
     models: |-
       [
         "mistral-nemo-12b-instruct",
         "llama3-70b-instruct",
         "phind-codellama-34b-v2-instruct",
         "deepseek-r1",
         "qwen-2.5-72b-instruct"
       ]
   ```

3. Confirm the name and namespace of your Account CR:

   ```
   oc get account -A
   ```

   You see output similar to the following example:

   ```
   NAMESPACE                 NAME              TEMPLATE   CONFIGMAP   SECRET
   redhat-ods-applications   odh-nim-account
   ```

4. Deploy the ConfigMap object in the same namespace as your Account CR:

   ```
   oc apply -f <configmap-name> -n <namespace>
   ```

   Replace <configmap-name> with the name of your YAML file, and <namespace> with the namespace of your Account CR.

5. Add the ConfigMap object that you previously created to the spec.modelListConfig section of your Account CR:

   ```
   oc patch account <account-name> \
     --type='merge' \
     -p '{"spec": {"modelListConfig": {"name": "<configmap-name>"}}}'
   ```

   Replace <account-name> with the name of your Account CR, and <configmap-name> with the name of your ConfigMap object.

6. Confirm that the ConfigMap object is added to your Account CR:

   ```
   oc get account <account-name> -o yaml
   ```

   You see the ConfigMap object in the spec.modelListConfig section of your Account CR, similar to the following output:

   ```yaml
   spec:
     enabledModelsConfig:
       modelListConfig:
         name: <configmap-name>
   ```
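Because the models value inside the ConfigMap is a JSON array embedded in a YAML string, a typo there only surfaces after deployment. As a minimal local sketch (it does not contact the cluster), you can validate the array before running `oc apply`:

```python
import json

# Paste the value of the ConfigMap's data.models key here. It must be a
# JSON array of model ID strings; json.loads raises ValueError otherwise.
models_value = '''
[
  "mistral-nemo-12b-instruct",
  "llama3-70b-instruct",
  "phind-codellama-34b-v2-instruct",
  "deepseek-r1",
  "qwen-2.5-72b-instruct"
]
'''

models = json.loads(models_value)
assert isinstance(models, list) and all(isinstance(m, str) for m in models)
print(f"{len(models)} model IDs parsed")
```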
Verification
- Deploy a NIM model as described in Deploying models on the NVIDIA NIM model serving platform. The NVIDIA NIM list in the Deploy model dialog displays your preferred list of models instead of all the models available in the NGC Catalog.
3.2. Enabling NVIDIA NIM metrics for an existing NIM deployment
If you previously deployed a NIM model in OpenShift AI and then upgraded to 2.25, you must manually enable NIM metrics for your existing deployment by adding annotations that enable metrics collection and graph generation.

NIM metrics and graphs are automatically enabled for new deployments in 2.17.
3.2.1. Enabling graph generation for an existing NIM deployment
The following procedure describes how to enable graph generation for an existing NIM deployment.
Prerequisites
- You have cluster administrator privileges for your OpenShift cluster.
- You have installed the OpenShift CLI (oc) as described in the appropriate documentation for your cluster:
  - Installing the OpenShift CLI for OpenShift Container Platform
  - Installing the OpenShift CLI for Red Hat OpenShift Service on AWS
- You have an existing NIM deployment in OpenShift AI.
Procedure
1. In a terminal window, if you are not already logged in to your OpenShift cluster as a cluster administrator, log in to the OpenShift CLI (oc).

2. Confirm the name of the ServingRuntime associated with your NIM deployment:

   ```
   oc get servingruntime -n <namespace>
   ```

   Replace <namespace> with the namespace of the project where your NIM model is deployed.

3. Check for an existing metadata.annotations section in the ServingRuntime configuration:

   ```
   oc get servingruntime -n <namespace> <servingruntime-name> -o json | jq '.metadata.annotations'
   ```

   Replace <servingruntime-name> with the name of the ServingRuntime from the previous step.

4. Perform one of the following actions:

   - If the metadata.annotations section is not present in the configuration, add the section with the required annotation:

     ```
     oc patch servingruntime -n <namespace> <servingruntime-name> --type json --patch \
       '[{"op": "add", "path": "/metadata/annotations", "value": {"runtimes.opendatahub.io/nvidia-nim": "true"}}]'
     ```

   - If there is an existing metadata.annotations section, add the required annotation to the section:

     ```
     oc patch servingruntime -n <namespace> <servingruntime-name> --type json --patch \
       '[{"op": "add", "path": "/metadata/annotations/runtimes.opendatahub.io~1nvidia-nim", "value": "true"}]'
     ```

   In either case, you see output similar to the following:

   ```
   servingruntime.serving.kserve.io/nim-serving-runtime patched
   ```
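The `~1` in the patch path is not part of the annotation name: `oc patch --type json` uses RFC 6901 JSON Pointers, in which `~` and `/` inside a key must be escaped as `~0` and `~1`. A minimal sketch of that escaping (the helper function name is my own, not an oc or Kubernetes API):

```python
# RFC 6901 JSON Pointer escaping: "~" -> "~0" first, then "/" -> "~1".
# The order matters; escaping "/" first would corrupt keys containing "~1".
def escape_pointer_token(key: str) -> str:
    return key.replace("~", "~0").replace("/", "~1")

# The annotation key contains a "/", so it must be escaped in the path.
path = "/metadata/annotations/" + escape_pointer_token("runtimes.opendatahub.io/nvidia-nim")
print(path)  # /metadata/annotations/runtimes.opendatahub.io~1nvidia-nim
```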
Verification
- Confirm that the annotation has been added to the ServingRuntime of your existing NIM deployment:

  ```
  oc get servingruntime -n <namespace> <servingruntime-name> -o json | jq '.metadata.annotations'
  ```

  The annotation that you added is displayed in the output:

  ```
  ...
  "runtimes.opendatahub.io/nvidia-nim": "true"
  ```

Note: For metrics to be available for graph generation, you must also enable metrics collection for your deployment. See Enabling metrics collection for an existing NIM deployment.
3.2.2. Enabling metrics collection for an existing NIM deployment
To enable metrics collection for your existing NIM deployment, you must manually add the Prometheus endpoint and port annotations to the InferenceService of your deployment.
The following procedure describes how to add the required Prometheus annotations to the InferenceService of your NIM deployment.
Prerequisites
- You have cluster administrator privileges for your OpenShift cluster.
- You have installed the OpenShift CLI (oc) as described in the appropriate documentation for your cluster:
  - Installing the OpenShift CLI for OpenShift Container Platform
  - Installing the OpenShift CLI for Red Hat OpenShift Service on AWS
- You have an existing NIM deployment in OpenShift AI.
Procedure
1. In a terminal window, if you are not already logged in to your OpenShift cluster as a cluster administrator, log in to the OpenShift CLI (oc).

2. Confirm the name of the InferenceService associated with your NIM deployment:

   ```
   oc get inferenceservice -n <namespace>
   ```

   Replace <namespace> with the namespace of the project where your NIM model is deployed.

3. Check for an existing spec.predictor.annotations section in the InferenceService configuration:

   ```
   oc get inferenceservice -n <namespace> <inferenceservice-name> -o json | jq '.spec.predictor.annotations'
   ```

   Replace <inferenceservice-name> with the name of the InferenceService from the previous step.

4. Perform one of the following actions:

   - If the spec.predictor.annotations section does not exist in the configuration, add the section with the required annotations:

     ```
     oc patch inferenceservice -n <namespace> <inferenceservice-name> --type json --patch \
       '[{"op": "add", "path": "/spec/predictor/annotations", "value": {"prometheus.io/path": "/metrics", "prometheus.io/port": "8000"}}]'
     ```

   - If there is an existing spec.predictor.annotations section, add the Prometheus annotations to the section:

     ```
     oc patch inferenceservice -n <namespace> <inferenceservice-name> --type json --patch \
       '[{"op": "add", "path": "/spec/predictor/annotations/prometheus.io~1path", "value": "/metrics"}, {"op": "add", "path": "/spec/predictor/annotations/prometheus.io~1port", "value": "8000"}]'
     ```

   In either case, you see output similar to the following:

   ```
   inferenceservice.serving.kserve.io/nim-serving-runtime patched
   ```
Verification
- Confirm that the annotations have been added to the InferenceService:

  ```
  oc get inferenceservice -n <namespace> <inferenceservice-name> -o json | jq '.spec.predictor.annotations'
  ```

  You see the annotations that you added in the output:

  ```json
  {
    "prometheus.io/path": "/metrics",
    "prometheus.io/port": "8000"
  }
  ```
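If you prefer to script the verification step, a minimal local sketch (it parses JSON you have already captured from `oc get inferenceservice ... -o json`; it does not contact the cluster itself) looks like this:

```python
import json

# Stand-in for the JSON printed by `oc get inferenceservice ... -o json`,
# e.g. captured to a file or a shell variable. Only the fields checked
# below are shown here.
inferenceservice_json = '''
{"spec": {"predictor": {"annotations": {
  "prometheus.io/path": "/metrics",
  "prometheus.io/port": "8000"
}}}}
'''

annotations = json.loads(inferenceservice_json)["spec"]["predictor"]["annotations"]

# Both Prometheus annotations must be present with these exact values.
assert annotations.get("prometheus.io/path") == "/metrics"
assert annotations.get("prometheus.io/port") == "8000"
print("Prometheus annotations present")
```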