Chapter 4. Deploying and testing a model
4.1. Preparing the model for deployment
After you train a model, you can deploy it by using the OpenShift AI model serving capabilities. Model serving in OpenShift AI requires that you store models in object storage so that the model server pods can access them.
To prepare a model for deployment, you must move the model from your workbench to your S3-compatible object storage. Use the connection that you created in the Storing data with connections section and upload the model from a notebook.
Prerequisites
- You created the My Storage connection and added it to your workbench.
Procedure
- In your JupyterLab environment, open the 2_save_model.ipynb file.
- Follow the instructions in the notebook to make the model accessible in storage.
Verification
When you have completed the notebook instructions, the models/fraud/1/model.onnx file is in your object storage and it is ready for your model server to use.
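The upload that the notebook performs can be sketched as follows. This is an illustrative sketch, not the notebook's exact code: it assumes the boto3 package is available in the workbench, that the My Storage connection injects the usual AWS_* environment variables, and the helper names (object_key, upload_model) are hypothetical.

```python
import os

def object_key(model_name: str, version: int, filename: str = "model.onnx") -> str:
    """Build the object-storage key in the <model>/<version>/<file> layout
    that the model server expects, for example models/fraud/1/model.onnx."""
    return f"models/{model_name}/{version}/{filename}"

def upload_model(local_path: str, model_name: str, version: int = 1) -> str:
    """Upload the trained model to the bucket from the My Storage connection.
    Sketch only: assumes boto3 and the AWS_* variables the connection injects."""
    import boto3  # imported lazily so object_key stays usable without boto3

    s3 = boto3.client(
        "s3",
        endpoint_url=os.environ["AWS_S3_ENDPOINT"],
        aws_access_key_id=os.environ["AWS_ACCESS_KEY_ID"],
        aws_secret_access_key=os.environ["AWS_SECRET_ACCESS_KEY"],
        region_name=os.environ.get("AWS_DEFAULT_REGION"),
    )
    key = object_key(model_name, version)
    s3.upload_file(local_path, os.environ["AWS_S3_BUCKET"], key)
    return key
```

The versioned directory layout matters: the model server looks for numbered version subdirectories under the model path, which is why the file lands at models/fraud/1/model.onnx rather than directly under models/fraud.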
Next step
Deploying the model
4.2. Deploying the model
You can use an OpenShift AI model server to deploy the model as an API.
Prerequisites
- You have saved the model as described in Preparing the model for deployment.
- You have installed KServe and enabled the model serving platform.
- You have enabled a preinstalled or custom model-serving runtime.
- You have obtained values for the following MinIO storage parameters:
  - Access Key
  - Secret Key
  - Endpoint
  - Region
  - Bucket

To obtain these values, navigate to your project’s Connections tab. For the My Storage connection, click the action menu (⋮) and then click Edit.
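Inside a workbench, a connection attached to the workbench typically exposes these same values as injected environment variables, so you can read them from a notebook instead of copying them by hand. The sketch below uses the common default variable names; verify them against your own connection.

```python
import os

def load_storage_config() -> dict:
    """Collect the MinIO connection parameters from the environment.
    The AWS_* names below are the defaults an OpenShift AI connection
    typically injects; adjust them if your connection differs."""
    return {
        "access_key": os.environ.get("AWS_ACCESS_KEY_ID"),
        "secret_key": os.environ.get("AWS_SECRET_ACCESS_KEY"),
        "endpoint": os.environ.get("AWS_S3_ENDPOINT"),
        "region": os.environ.get("AWS_DEFAULT_REGION"),
        "bucket": os.environ.get("AWS_S3_BUCKET"),
    }
```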
Procedure
- In the OpenShift AI dashboard, navigate to the project details page and click the Deployments tab.
- Click Deploy model.
  The Deploy a model wizard opens.
- In the Model details section, provide information about the model:
  - For Model location, select Existing connection and then select My Storage.
  - Enter the following values from your MinIO storage connection:
    - Access Key
    - Secret Key
    - Endpoint
    - Region
    - Bucket
  - For Path, enter models/fraud.
  - For Model type, select Predictive model.
  - Click Next.
- In the Model deployment section, configure the deployment:
  - For Model deployment name, enter fraud.
  - For Description, enter a description of your deployment.
  - For the hardware profile, keep the default value.
  - For Model framework (name - version), select onnx-1.
  - For the Serving runtime field, accept the auto-selected runtime, OpenVINO Model Server.
  - Click Next.
- In the Advanced settings section, accept the defaults by clicking Next.
- In the Review section, click Deploy model.
Verification
Confirm that the deployed model is shown on the Deployments tab for the project, and on the Deployments page of the dashboard with a Started status.
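As an optional extra check from a workbench, you can probe the model server's readiness endpoint directly. This is a hedged sketch: it assumes the serving runtime exposes the KServe v2 REST API, where GET <endpoint>/v2/health/ready reports server readiness, and the helper names are hypothetical.

```python
from urllib.request import urlopen

def ready_url(base_url: str) -> str:
    """Build the KServe v2 server-readiness URL for an inference endpoint."""
    return base_url.rstrip("/") + "/v2/health/ready"

def is_ready(base_url: str, timeout: float = 5.0) -> bool:
    """Return True if the server answers the readiness probe with HTTP 200."""
    try:
        with urlopen(ready_url(base_url), timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        return False
```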
Next step
Testing the model API
4.3. Testing the model API
After you deploy the model, you can test its API endpoints.
Procedure
- In the OpenShift AI dashboard, navigate to the project details page and click the Deployments tab.
- Take note of the model’s Inference endpoint URL. You need this information when you test the model API.
  If the Inference endpoint field has an Internal endpoint details link, click the link to open a text box that shows the URL details, and then take note of the restUrl value.
  NOTE: When you test the model API from inside a workbench, you must edit the endpoint to specify 8888 for the port. For example: http://fraud-predictor.fraud-detection.svc.cluster.local:8888
- Return to the JupyterLab environment and try out your new endpoint.
- Follow the directions in 3_rest_requests.ipynb to try a REST API call.
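The REST call in the notebook follows the KServe v2 inference protocol, which OpenVINO Model Server implements. The sketch below is illustrative rather than the notebook's exact code: the input tensor name (dense_input), its shape, and the sample values are assumptions about the fraud model, so match them to your own model's signature.

```python
import json
from urllib.request import Request, urlopen

def build_infer_payload(data, input_name="dense_input"):
    """Build a KServe v2 inference request body for a single FP32 row."""
    return {
        "inputs": [
            {
                "name": input_name,        # assumed tensor name; check your model
                "shape": [1, len(data)],   # one row of len(data) features
                "datatype": "FP32",
                "data": data,
            }
        ]
    }

def infer(rest_url: str, model_name: str, data) -> dict:
    """POST the request to <rest_url>/v2/models/<model_name>/infer."""
    url = f"{rest_url}/v2/models/{model_name}/infer"
    body = json.dumps(build_infer_payload(data)).encode()
    req = Request(url, data=body, headers={"Content-Type": "application/json"})
    with urlopen(req) as resp:
        return json.loads(resp.read())

# Example call from inside a workbench (note port 8888), with made-up feature
# values; replace them with a real sample from your dataset:
# infer("http://fraud-predictor.fraud-detection.svc.cluster.local:8888",
#       "fraud", [0.3, 1.9, 1.0, 1.0, 0.0])
```

The response contains an "outputs" list in the same v2 format; the notebook shows how to interpret the returned score for the fraud model.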
Next step
(Optional) Automating workflows with AI pipelines