Chapter 3. Managing and monitoring models on the NVIDIA NIM model serving platform
As a cluster administrator, you can manage and monitor models on the NVIDIA NIM model serving platform. You can customize your NVIDIA NIM model selection options and enable metrics for a NIM model, among other tasks.
3.1. Customizing model selection options for the NVIDIA NIM model serving platform
The NVIDIA NIM model serving platform provides access to all available NVIDIA NIM models from the NVIDIA GPU Cloud (NGC). You can deploy a NIM model by selecting it from the NVIDIA NIM list in the Deploy model dialog. To customize the models that appear in the list, you can create a ConfigMap object specifying your preferred models.
Prerequisites
- You have cluster administrator privileges for your OpenShift cluster.
- You have an NVIDIA Cloud Account (NCA) and can access the NVIDIA GPU Cloud (NGC) portal.
- You know the IDs of the NVIDIA NIM models that you want to make available for selection on the NVIDIA NIM model serving platform.

  Note:
  - You can find the model ID in the NGC Catalog. The ID is usually part of the URL path.
  - You can also find the model ID by using the NGC CLI. For more information, see the NGC CLI reference.
- You know the name and namespace of your Account custom resource (CR).
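The note above says that the model ID is usually part of the NGC Catalog URL path. As a minimal sketch (the URL below is illustrative, not a real catalog entry), the ID can be read off as the last path segment:

```python
from urllib.parse import urlparse

# Hypothetical NGC Catalog page URL; the model ID is usually the last
# segment of the URL path.
url = "https://catalog.ngc.nvidia.com/orgs/nim/teams/meta/containers/llama3-70b-instruct"

model_id = urlparse(url).path.rstrip("/").split("/")[-1]
print(model_id)  # llama3-70b-instruct
```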
Procedure
1. In a terminal window, log in to the OpenShift CLI as a cluster administrator, as shown in the following example:

   ```
   oc login <openshift_cluster_url> -u <admin_username> -p <password>
   ```

2. Define a ConfigMap object in a YAML file, similar to the following example, containing the model IDs that you want to make available for selection on the NVIDIA NIM model serving platform:

   ```yaml
   apiVersion: v1
   kind: ConfigMap
   metadata:
     name: nvidia-nim-enabled-models
   data:
     models: |-
       [
         "mistral-nemo-12b-instruct",
         "llama3-70b-instruct",
         "phind-codellama-34b-v2-instruct",
         "deepseek-r1",
         "qwen-2.5-72b-instruct"
       ]
   ```

3. Confirm the name and namespace of your Account CR:

   ```
   oc get account -A
   ```

   You see output similar to the following example:

   ```
   NAMESPACE                 NAME              TEMPLATE   CONFIGMAP   SECRET
   redhat-ods-applications   odh-nim-account
   ```

4. Deploy the ConfigMap object in the same namespace as your Account CR:

   ```
   oc apply -f <configmap-name> -n <namespace>
   ```

   Replace <configmap-name> with the name of your YAML file, and <namespace> with the namespace of your Account CR.

5. Add the ConfigMap object that you previously created to the spec.modelListConfig section of your Account CR:

   ```
   oc patch account <account-name> \
     --type='merge' \
     -p '{"spec": {"modelListConfig": {"name": "<configmap-name>"}}}'
   ```

   Replace <account-name> with the name of your Account CR, and <configmap-name> with the name of your ConfigMap object.

6. Confirm that the ConfigMap object is added to your Account CR:

   ```
   oc get account <account-name> -o yaml
   ```

   You see the ConfigMap object in the spec.modelListConfig section of your Account CR, similar to the following output:

   ```yaml
   spec:
     enabledModelsConfig:
       modelListConfig:
         name: <configmap-name>
   ```
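Because the models value inside the ConfigMap is a JSON array embedded in a YAML string, a typo there only surfaces after deployment. As a minimal local sketch (it does not contact the cluster), you can validate the array before running `oc apply`:

```python
import json

# Paste the value of the ConfigMap's data.models key here. It must be a
# JSON array of model ID strings; json.loads raises ValueError otherwise.
models_value = '''
[
  "mistral-nemo-12b-instruct",
  "llama3-70b-instruct",
  "phind-codellama-34b-v2-instruct",
  "deepseek-r1",
  "qwen-2.5-72b-instruct"
]
'''

models = json.loads(models_value)
assert isinstance(models, list) and all(isinstance(m, str) for m in models)
print(f"{len(models)} model IDs parsed")
```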
Verification
- Deploy a NIM model as described in Deploying models on the NVIDIA NIM model serving platform. The NVIDIA NIM list in the Deploy model dialog displays your preferred list of models instead of all the models available in the NGC Catalog.
3.2. Enabling NVIDIA NIM metrics for an existing NIM deployment
If you previously deployed a NIM model in OpenShift AI and then upgraded to 2.25, you must manually enable NIM metrics for your existing deployment by adding annotations that enable metrics collection and graph generation.

NIM metrics and graphs are automatically enabled for new deployments in 2.17.
3.2.1. Enabling graph generation for an existing NIM deployment
The following procedure describes how to enable graph generation for an existing NIM deployment.
Prerequisites
- You have cluster administrator privileges for your OpenShift cluster.
- You have installed the OpenShift CLI (oc) as described in the appropriate documentation for your cluster:
  - Installing the OpenShift CLI for OpenShift Container Platform
  - Installing the OpenShift CLI for Red Hat OpenShift Service on AWS
- You have an existing NIM deployment in OpenShift AI.
Procedure
1. In a terminal window, if you are not already logged in to your OpenShift cluster as a cluster administrator, log in to the OpenShift CLI (oc).

2. Confirm the name of the ServingRuntime associated with your NIM deployment:

   ```
   oc get servingruntime -n <namespace>
   ```

   Replace <namespace> with the namespace of the project where your NIM model is deployed.

3. Check for an existing metadata.annotations section in the ServingRuntime configuration:

   ```
   oc get servingruntime -n <namespace> <servingruntime-name> -o json | jq '.metadata.annotations'
   ```

   Replace <servingruntime-name> with the name of the ServingRuntime from the previous step.

4. Perform one of the following actions:

   - If the metadata.annotations section is not present in the configuration, add the section with the required annotation:

     ```
     oc patch servingruntime -n <namespace> <servingruntime-name> --type json --patch \
       '[{"op": "add", "path": "/metadata/annotations", "value": {"runtimes.opendatahub.io/nvidia-nim": "true"}}]'
     ```

   - If there is an existing metadata.annotations section, add the required annotation to the section:

     ```
     oc patch servingruntime -n <namespace> <servingruntime-name> --type json --patch \
       '[{"op": "add", "path": "/metadata/annotations/runtimes.opendatahub.io~1nvidia-nim", "value": "true"}]'
     ```

   In either case, you see output similar to the following:

   ```
   servingruntime.serving.kserve.io/nim-serving-runtime patched
   ```
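The `~1` in the patch path is not part of the annotation name: `oc patch --type json` uses RFC 6901 JSON Pointers, in which `~` and `/` inside a key must be escaped as `~0` and `~1`. A minimal sketch of that escaping (the helper function name is my own, not an oc or Kubernetes API):

```python
# RFC 6901 JSON Pointer escaping: "~" -> "~0" first, then "/" -> "~1".
# The order matters; escaping "/" first would corrupt keys containing "~1".
def escape_pointer_token(key: str) -> str:
    return key.replace("~", "~0").replace("/", "~1")

# The annotation key contains a "/", so it must be escaped in the path.
path = "/metadata/annotations/" + escape_pointer_token("runtimes.opendatahub.io/nvidia-nim")
print(path)  # /metadata/annotations/runtimes.opendatahub.io~1nvidia-nim
```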
Verification
- Confirm that the annotation has been added to the ServingRuntime of your existing NIM deployment:

  ```
  oc get servingruntime -n <namespace> <servingruntime-name> -o json | jq '.metadata.annotations'
  ```

  The annotation that you added is displayed in the output:

  ```
  ...
  "runtimes.opendatahub.io/nvidia-nim": "true"
  ```

Note: For metrics to be available for graph generation, you must also enable metrics collection for your deployment. See Enabling metrics collection for an existing NIM deployment.
3.2.2. Enabling metrics collection for an existing NIM deployment
To enable metrics collection for your existing NIM deployment, you must manually add the Prometheus endpoint and port annotations to the InferenceService of your deployment.
The following procedure describes how to add the required Prometheus annotations to the InferenceService of your NIM deployment.
Prerequisites
- You have cluster administrator privileges for your OpenShift cluster.
- You have installed the OpenShift CLI (oc) as described in the appropriate documentation for your cluster:
  - Installing the OpenShift CLI for OpenShift Container Platform
  - Installing the OpenShift CLI for Red Hat OpenShift Service on AWS
- You have an existing NIM deployment in OpenShift AI.
Procedure
1. In a terminal window, if you are not already logged in to your OpenShift cluster as a cluster administrator, log in to the OpenShift CLI (oc).

2. Confirm the name of the InferenceService associated with your NIM deployment:

   ```
   oc get inferenceservice -n <namespace>
   ```

   Replace <namespace> with the namespace of the project where your NIM model is deployed.

3. Check for an existing spec.predictor.annotations section in the InferenceService configuration:

   ```
   oc get inferenceservice -n <namespace> <inferenceservice-name> -o json | jq '.spec.predictor.annotations'
   ```

   Replace <inferenceservice-name> with the name of the InferenceService from the previous step.

4. Perform one of the following actions:

   - If the spec.predictor.annotations section does not exist in the configuration, add the section with the required annotations:

     ```
     oc patch inferenceservice -n <namespace> <inferenceservice-name> --type json --patch \
       '[{"op": "add", "path": "/spec/predictor/annotations", "value": {"prometheus.io/path": "/metrics", "prometheus.io/port": "8000"}}]'
     ```

   - If there is an existing spec.predictor.annotations section, add the Prometheus annotations to the section:

     ```
     oc patch inferenceservice -n <namespace> <inferenceservice-name> --type json --patch \
       '[{"op": "add", "path": "/spec/predictor/annotations/prometheus.io~1path", "value": "/metrics"}, {"op": "add", "path": "/spec/predictor/annotations/prometheus.io~1port", "value": "8000"}]'
     ```

   In either case, you see output similar to the following:

   ```
   inferenceservice.serving.kserve.io/nim-serving-runtime patched
   ```
Verification
- Confirm that the annotations have been added to the InferenceService:

  ```
  oc get inferenceservice -n <namespace> <inferenceservice-name> -o json | jq '.spec.predictor.annotations'
  ```

  You see the annotations that you added in the output:

  ```json
  {
    "prometheus.io/path": "/metrics",
    "prometheus.io/port": "8000"
  }
  ```
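If you prefer to script the verification step, a minimal local sketch (it parses JSON you have already captured from `oc get inferenceservice ... -o json`; it does not contact the cluster itself) looks like this:

```python
import json

# Stand-in for the JSON printed by `oc get inferenceservice ... -o json`,
# e.g. captured to a file or a shell variable. Only the fields checked
# below are shown here.
inferenceservice_json = '''
{"spec": {"predictor": {"annotations": {
  "prometheus.io/path": "/metrics",
  "prometheus.io/port": "8000"
}}}}
'''

annotations = json.loads(inferenceservice_json)["spec"]["predictor"]["annotations"]

# Both Prometheus annotations must be present with these exact values.
assert annotations.get("prometheus.io/path") == "/metrics"
assert annotations.get("prometheus.io/port") == "8000"
print("Prometheus annotations present")
```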