Chapter 3. Managing and monitoring models on the NVIDIA NIM model serving platform
As a cluster administrator, you can manage and monitor models on the NVIDIA NIM model serving platform. You can customize your NVIDIA NIM model selection options and enable metrics for a NIM model, among other tasks.
3.1. Customizing model selection options for the NVIDIA NIM model serving platform
The NVIDIA NIM model serving platform provides access to all available NVIDIA NIM models from the NVIDIA GPU Cloud (NGC). You can deploy a NIM model by selecting it from the NVIDIA NIM list in the Deploy model dialog. To customize the models that appear in the list, you can create a ConfigMap object specifying your preferred models.
Prerequisites
- You have cluster administrator privileges for your OpenShift cluster.
- You have an NVIDIA Cloud Account (NCA) and can access the NVIDIA GPU Cloud (NGC) portal.
- You know the IDs of the NVIDIA NIM models that you want to make available for selection on the NVIDIA NIM model serving platform.

  Note
  - You can find the model ID in the NGC Catalog. The ID is usually part of the URL path.
  - You can also find the model ID by using the NGC CLI. For more information, see the NGC CLI reference.
- You know the name and namespace of your Account custom resource (CR).
Procedure
1. In a terminal window, log in to the OpenShift CLI as a cluster administrator, as shown in the following example:

   ```
   oc login <openshift_cluster_url> -u <admin_username> -p <password>
   ```

2. Define a ConfigMap object in a YAML file containing the model IDs that you want to make available for selection on the NVIDIA NIM model serving platform.

3. Confirm the name and namespace of your Account CR:

   ```
   oc get account -A
   ```

   You see output similar to the following example:

   ```
   NAMESPACE                 NAME              TEMPLATE   CONFIGMAP   SECRET
   redhat-ods-applications   odh-nim-account
   ```

4. Deploy the ConfigMap object in the same namespace as your Account CR:

   ```
   oc apply -f <configmap-name> -n <namespace>
   ```

   Replace <configmap-name> with the name of your YAML file, and <namespace> with the namespace of your Account CR.

5. Add the ConfigMap object that you previously created to the spec.modelListConfig section of your Account CR:

   ```
   oc patch account <account-name> \
     --type='merge' \
     -p '{"spec": {"modelListConfig": {"name": "<configmap-name>"}}}'
   ```

   Replace <account-name> with the name of your Account CR, and <configmap-name> with the name of your ConfigMap object.

6. Confirm that the ConfigMap object is added to your Account CR:

   ```
   oc get account <account-name> -o yaml
   ```

   You see the ConfigMap object in the spec.modelListConfig section of your Account CR, similar to the following output:

   ```
   spec:
     enabledModelsConfig:
     modelListConfig:
       name: <configmap-name>
   ```
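The example ConfigMap referenced in step 2 is not shown above. The following is a minimal sketch of what such an object might look like; the metadata.name, the `models` data key, its JSON-list format, and the example model IDs are all assumptions, so check the schema expected by your OpenShift AI version before using it:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  # Hypothetical name; pass this name to the oc apply and oc patch steps.
  name: nvidia-nim-models
data:
  # Assumed key and format: a JSON list of NGC model IDs to show in the
  # NVIDIA NIM list of the Deploy model dialog. The IDs below are examples.
  models: |-
    [
      "llama-3.1-8b-instruct",
      "mistral-7b-instruct-v03"
    ]
```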
Verification
- Follow the steps to deploy a model as described in Deploying models on the NVIDIA NIM model serving platform to deploy a NIM model. You see that the NVIDIA NIM list in the Deploy model dialog displays your preferred list of models instead of all the models available in the NGC catalog.
3.2. Enabling NVIDIA NIM metrics for an existing NIM deployment
If you previously deployed a NIM model in OpenShift AI and then upgraded to 3.2, you must manually enable NIM metrics for your existing deployment by adding the annotations that enable metrics collection and graph generation.
NIM metrics and graphs are automatically enabled for new deployments in 2.17.
3.2.1. Enabling graph generation for an existing NIM deployment
The following procedure describes how to enable graph generation for an existing NIM deployment.
Prerequisites
- You have cluster administrator privileges for your OpenShift cluster.
- You have installed the OpenShift CLI (oc) as described in the appropriate documentation for your cluster:
  - Installing the OpenShift CLI for OpenShift Container Platform
  - Installing the OpenShift CLI for Red Hat OpenShift Service on AWS
- You have an existing NIM deployment in OpenShift AI.
Procedure
1. In a terminal window, if you are not already logged in to your OpenShift cluster as a cluster administrator, log in to the OpenShift CLI (oc).

2. Confirm the name of the ServingRuntime associated with your NIM deployment:

   ```
   oc get servingruntime -n <namespace>
   ```

   Replace <namespace> with the namespace of the project where your NIM model is deployed.

3. Check for an existing metadata.annotations section in the ServingRuntime configuration:

   ```
   oc get servingruntime -n <namespace> <servingruntime-name> -o json | jq '.metadata.annotations'
   ```

   Replace <servingruntime-name> with the name of the ServingRuntime from the previous step.

4. Perform one of the following actions:

   - If the metadata.annotations section is not present in the configuration, add the section with the required annotation:

     ```
     oc patch servingruntime -n <namespace> <servingruntime-name> --type json --patch \
       '[{"op": "add", "path": "/metadata/annotations", "value": {"runtimes.opendatahub.io/nvidia-nim": "true"}}]'
     ```

   - If there is an existing metadata.annotations section, add the required annotation to the section:

     ```
     oc patch servingruntime -n <namespace> <servingruntime-name> --type json --patch \
       '[{"op": "add", "path": "/metadata/annotations/runtimes.opendatahub.io~1nvidia-nim", "value": "true"}]'
     ```

   In either case, you see output similar to the following:

   ```
   servingruntime.serving.kserve.io/nim-serving-runtime patched
   ```
Verification
- Confirm that the annotation has been added to the ServingRuntime of your existing NIM deployment:

  ```
  oc get servingruntime -n <namespace> <servingruntime-name> -o json | jq '.metadata.annotations'
  ```

  The annotation that you added is displayed in the output:

  ```
  ...
  "runtimes.opendatahub.io/nvidia-nim": "true"
  ```

Note
For metrics to be available for graph generation, you must also enable metrics collection for your deployment. For more information, see Enabling metrics collection for an existing NIM deployment.
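The `~1` in the second patch path is JSON Pointer escaping (RFC 6901): a `/` inside a key is written as `~1`, and `~` as `~0`, which is why the annotation key `runtimes.opendatahub.io/nvidia-nim` appears as `runtimes.opendatahub.io~1nvidia-nim`. The following is an illustrative sketch, not part of the procedure, of how the two `oc patch` variants resolve against a manifest:

```python
# Sketch: minimal RFC 6902 "add" handling with RFC 6901 path unescaping,
# mirroring the two oc patch variants above. Illustrative only.

def unescape(token: str) -> str:
    # RFC 6901: transform "~1" to "/" first, then "~0" to "~".
    return token.replace("~1", "/").replace("~0", "~")

def json_add(doc: dict, path: str, value) -> dict:
    """Apply a single JSON Patch "add" operation to a nested dict."""
    parts = [unescape(p) for p in path.lstrip("/").split("/")]
    node = doc
    for key in parts[:-1]:
        node = node[key]  # parent paths must already exist
    node[parts[-1]] = value
    return doc

# Variant 1: annotations section absent -> add the whole section.
sr = {"metadata": {}}
json_add(sr, "/metadata/annotations",
         {"runtimes.opendatahub.io/nvidia-nim": "true"})

# Variant 2: annotations section present -> add one escaped key,
# leaving existing annotations untouched.
sr2 = {"metadata": {"annotations": {"existing": "x"}}}
json_add(sr2, "/metadata/annotations/runtimes.opendatahub.io~1nvidia-nim", "true")
```

This is also why variant 1 would overwrite an existing annotations section: its "add" replaces the whole object at `/metadata/annotations`, while variant 2 only adds one key inside it.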
3.2.2. Enabling metrics collection for an existing NIM deployment
To enable metrics collection for your existing NIM deployment, you must manually add the Prometheus endpoint and port annotations to the InferenceService of your deployment.
The following procedure describes how to add the required Prometheus annotations to the InferenceService of your NIM deployment.
Prerequisites
- You have cluster administrator privileges for your OpenShift cluster.
- You have installed the OpenShift CLI (oc) as described in the appropriate documentation for your cluster:
  - Installing the OpenShift CLI for OpenShift Container Platform
  - Installing the OpenShift CLI for Red Hat OpenShift Service on AWS
- You have an existing NIM deployment in OpenShift AI.
Procedure
1. In a terminal window, if you are not already logged in to your OpenShift cluster as a cluster administrator, log in to the OpenShift CLI (oc).

2. Confirm the name of the InferenceService associated with your NIM deployment:

   ```
   oc get inferenceservice -n <namespace>
   ```

   Replace <namespace> with the namespace of the project where your NIM model is deployed.

3. Check for an existing spec.predictor.annotations section in the InferenceService configuration:

   ```
   oc get inferenceservice -n <namespace> <inferenceservice-name> -o json | jq '.spec.predictor.annotations'
   ```

   Replace <inferenceservice-name> with the name of the InferenceService from the previous step.

4. Perform one of the following actions:

   - If the spec.predictor.annotations section does not exist in the configuration, add the section with the required annotations:

     ```
     oc patch inferenceservice -n <namespace> <inferenceservice-name> --type json --patch \
       '[{"op": "add", "path": "/spec/predictor/annotations", "value": {"prometheus.io/path": "/metrics", "prometheus.io/port": "8000"}}]'
     ```

   - If there is an existing spec.predictor.annotations section, add the Prometheus annotations to the section:

     ```
     oc patch inferenceservice -n <namespace> <inferenceservice-name> --type json --patch \
       '[{"op": "add", "path": "/spec/predictor/annotations/prometheus.io~1path", "value": "/metrics"}, {"op": "add", "path": "/spec/predictor/annotations/prometheus.io~1port", "value": "8000"}]'
     ```

   In either case, you see output similar to the following:

   ```
   inferenceservice.serving.kserve.io/nim-serving-runtime patched
   ```
Verification
- Confirm that the annotations have been added to the InferenceService:

  ```
  oc get inferenceservice -n <namespace> <inferenceservice-name> -o json | jq '.spec.predictor.annotations'
  ```

  You see the annotations that you added in the output:

  ```
  {
    "prometheus.io/path": "/metrics",
    "prometheus.io/port": "8000"
  }
  ```
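These annotations follow the common `prometheus.io/*` convention: a metrics scraper discovers the endpoint from the annotations rather than from a hard-coded URL. The following is a hypothetical sketch of how a scrape target is assembled from them; the pod IP is a made-up placeholder, not something this procedure produces:

```python
# Sketch: assembling a Prometheus scrape URL from the prometheus.io/*
# annotations added in the procedure above. The pod IP is illustrative.
annotations = {
    "prometheus.io/path": "/metrics",
    "prometheus.io/port": "8000",
}
pod_ip = "10.128.2.15"  # hypothetical pod IP

scrape_url = "http://{}:{}{}".format(
    pod_ip,
    annotations["prometheus.io/port"],
    annotations["prometheus.io/path"],
)
# The scraper then pulls metrics from http://<pod-ip>:8000/metrics.
```

Note that annotation values are always strings in Kubernetes, which is why the port is quoted as "8000" in the patch rather than written as a number.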