Chapter 4. Configuring persistent storage and inferencing the model
Configure persistent storage for AI Inference Server to store the model images before you inference the model. This step is optional, but recommended.
Prerequisites
- You have installed a mirror registry on the bastion host.
- You have installed the Node Feature Discovery Operator and NVIDIA GPU Operator in the disconnected cluster.
Procedure
- In the disconnected OpenShift Container Platform cluster, configure persistent storage using Network File System (NFS).
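For example, a static PersistentVolume backed by an NFS export, and a PersistentVolumeClaim that binds to it, might look like the following sketch. The NFS server address, export path, storage size, and resource names are placeholders; replace them with the values for your environment.

apiVersion: v1
kind: PersistentVolume
metadata:
  name: granite-model-pv
spec:
  capacity:
    storage: 100Gi
  accessModes:
  - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  nfs:
    # Placeholder NFS server and export path; replace with your NFS share.
    server: nfs.example.com
    path: /exports/models
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: granite-model-pvc
  namespace: rhaiis-namespace
spec:
  accessModes:
  - ReadWriteMany
  storageClassName: ""          # bind directly to the pre-provisioned PV
  volumeName: granite-model-pv
  resources:
    requests:
      storage: 100Gi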
- Create a Deployment custom resource (CR). For example, the following Deployment CR uses AI Inference Server to serve a Granite model on a CUDA accelerator.
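A minimal sketch of such a Deployment follows. The namespace rhaiis-namespace and the name granite match the route used later in this procedure; the container image, serving arguments, model path, and claim name granite-model-pvc are assumptions that you must replace with the values for your environment, for example the AI Inference Server image that you mirrored to the bastion host registry.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: granite
  namespace: rhaiis-namespace
  labels:
    app: granite
spec:
  replicas: 1
  selector:
    matchLabels:
      app: granite
  template:
    metadata:
      labels:
        app: granite
    spec:
      containers:
      - name: inference-server
        # Assumed image reference: use the AI Inference Server (CUDA) image
        # mirrored to your disconnected registry.
        image: <mirror_registry>/rhaiis/vllm-cuda-rhel9:<tag>
        # Illustrative serving arguments: load the Granite model files from the
        # NFS-backed volume and serve them under the name "granite". Adjust
        # these to match the entrypoint of your image.
        args:
        - /models/granite
        - --served-model-name
        - granite
        - --port
        - "8000"
        ports:
        - containerPort: 8000
          protocol: TCP
        resources:
          limits:
            nvidia.com/gpu: "1"   # one CUDA accelerator
        volumeMounts:
        - name: model-storage
          mountPath: /models
      volumes:
      - name: model-storage
        persistentVolumeClaim:
          claimName: granite-model-pvc   # PVC from the persistent storage step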
- Create a Service CR for the model inference. For example:
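The following is a minimal sketch, assuming the Deployment above exposes the server on port 8000 and carries the label app: granite.

apiVersion: v1
kind: Service
metadata:
  name: granite
  namespace: rhaiis-namespace
spec:
  selector:
    app: granite
  ports:
  - name: http
    protocol: TCP
    port: 8000
    targetPort: 8000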
- Optional: Create a Route CR to enable public access to the model. For example:
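The following is a minimal sketch that exposes the granite Service. The edge TLS termination is an assumption; remove or change it to match your cluster's TLS policy.

apiVersion: route.openshift.io/v1
kind: Route
metadata:
  name: granite
  namespace: rhaiis-namespace
spec:
  to:
    kind: Service
    name: granite
  port:
    targetPort: http
  tls:
    termination: edge   # assumed; adjust to your TLS setup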
- Get the URL for the exposed route:

$ oc get route granite -n rhaiis-namespace -o jsonpath='{.spec.host}'

Example output
granite-rhaiis-namespace.apps.example.com
- Query the model by running the following command:
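The following is a sketch of a query against the exposed route, assuming the server provides an OpenAI-compatible completions endpoint and that the served model name is granite, as in the example Deployment above. Use http instead of https if your Route does not terminate TLS, and substitute the host returned by the previous step.

$ curl -X POST https://granite-rhaiis-namespace.apps.example.com/v1/completions \
    -H "Content-Type: application/json" \
    -d '{
          "model": "granite",
          "prompt": "What is Red Hat OpenShift Container Platform?",
          "max_tokens": 100
        }'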