Chapter 4. Deploying and testing a model


4.1. Preparing the model for deployment

After you train a model, you can deploy it by using the OpenShift AI model serving capabilities. Model serving in OpenShift AI requires that you store models in object storage so that the model server pods can access them.

To prepare a model for deployment, you must move the model from your workbench to your S3-compatible object storage. Use the connection that you created in the Storing data with connections section and upload the model from a notebook.

Prerequisites

  • You created the My Storage connection and have added it to your workbench.

    Figure: Data storage in workbench

Procedure

  1. In your JupyterLab environment, open the 2_save_model.ipynb file.
  2. Follow the instructions in the notebook to make the model accessible in storage.
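The upload step that the notebook performs can be sketched roughly as follows. This is an illustration, not the notebook's exact code: it assumes the boto3 library is available in the workbench image and that the connection injects the usual OpenShift AI environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_S3_ENDPOINT, AWS_DEFAULT_REGION, AWS_S3_BUCKET); verify the variable names in your own workbench.

```python
import os

def model_object_key(model_name="fraud", version="1", filename="model.onnx"):
    # Object key layout expected by the model server: models/<name>/<version>/model.onnx
    return f"models/{model_name}/{version}/{filename}"

def upload_model(local_path="models/fraud/1/model.onnx"):
    """Upload the trained ONNX model to the bucket behind the My Storage connection.

    Assumes boto3 is installed and the connection's credentials are present
    as environment variables (names are assumptions; check your workbench).
    """
    import boto3  # imported here so the sketch is readable without boto3 installed

    s3 = boto3.client(
        "s3",
        endpoint_url=os.environ["AWS_S3_ENDPOINT"],
        aws_access_key_id=os.environ["AWS_ACCESS_KEY_ID"],
        aws_secret_access_key=os.environ["AWS_SECRET_ACCESS_KEY"],
        region_name=os.environ.get("AWS_DEFAULT_REGION", "us-east-1"),
    )
    bucket = os.environ["AWS_S3_BUCKET"]
    s3.upload_file(local_path, bucket, model_object_key())
```

After a successful upload, the object appears in the bucket at the key returned by `model_object_key()`, which matches the path checked in the Verification step below.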

Verification

When you have completed the notebook instructions, the models/fraud/1/model.onnx file is in your object storage and is ready for your model server to use.

4.2. Deploying the model

You can use an OpenShift AI model server to deploy the model as an API.

Prerequisites

  • You have saved the model as described in Preparing the model for deployment.
  • You have installed KServe and enabled the model serving platform.
  • You have enabled a preinstalled or custom model-serving runtime.
  • You have obtained values for the following MinIO storage parameters:

    • Access Key
    • Secret Key
    • Endpoint
    • Region
    • Bucket

      To obtain these values, navigate to your project’s Connections tab. For the My Storage connection, click the action menu (⋮) and then click Edit.

Procedure

  1. In the OpenShift AI dashboard, navigate to the project details page and click the Deployments tab.
  2. Click Deploy model.

    The Deploy a model wizard opens.

  3. In the Model details section, provide information about the model:

    1. For Model location, select Existing connection and then select My Storage.
    2. Enter the following values from your MinIO storage connection:

      • Access Key
      • Secret Key
      • Endpoint
      • Region
      • Bucket
    3. For Path, enter models/fraud.
    4. For Model type, select Predictive model.
    5. Click Next.
  4. In the Model deployment section, configure the deployment:

    1. For Model deployment name, enter fraud.
    2. For Description, enter a description of your deployment.
    3. For the hardware profile, keep the default value.
    4. For Model framework (name - version), select onnx-1.
    5. For the Serving runtime field, accept the auto-selected runtime, OpenVINO Model Server.
    6. Click Next.
  5. In the Advanced settings section, accept the defaults by clicking Next.
  6. In the Review section, click Deploy model.

Verification

  • Confirm that the deployed model is shown on the Deployments tab for the project, and on the Deployments page of the dashboard with a Started status.

    Figure: Deployed model status
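As an additional check, KServe-style runtimes such as OpenVINO Model Server typically expose a per-model readiness endpoint under the v2 inference protocol. The following is a minimal sketch using only the Python standard library; the base URL is your own inference endpoint (the exact path support depends on your runtime version, so treat this as an assumption to verify):

```python
from urllib.request import urlopen

def ready_url(base_url, model_name="fraud"):
    # v2 inference protocol readiness path for a specific model
    return f"{base_url.rstrip('/')}/v2/models/{model_name}/ready"

def check_ready(base_url):
    """Return True if the deployed model reports ready over HTTP."""
    with urlopen(ready_url(base_url), timeout=10) as resp:
        return resp.status == 200
```

For example, `check_ready("http://fraud-predictor.fraud-detection.svc.cluster.local:8888")` from inside a workbench should return True once the deployment shows a Started status.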

4.3. Testing the model API

After you deploy the model, you can test its API endpoints.

Procedure

  1. In the OpenShift AI dashboard, navigate to the project details page and click the Deployments tab.
  2. Take note of the model’s Inference endpoint URL. You need this information when you test the model API.

    If the Inference endpoint field has an Internal endpoint details link, click the link to open a text box that shows the URL details, and then take note of the restUrl value.

    Figure: Model inference endpoint

    NOTE: When you test the model API from inside a workbench, you must edit the endpoint to specify 8888 for the port. For example:

    http://fraud-predictor.fraud-detection.svc.cluster.local:8888

  3. Return to the JupyterLab environment and try out your new endpoint.

    Follow the directions in 3_rest_requests.ipynb to try a REST API call.
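The REST call in the notebook follows the KServe v2 inference protocol, which OpenVINO Model Server implements. A rough standard-library sketch is shown below; the input tensor name (`dense_input`) and the feature values are assumptions for illustration, so take the real input name and data from the notebook:

```python
import json
from urllib.request import Request, urlopen

def build_infer_payload(row):
    """Build a KServe v2 REST inference request body for a single input row."""
    return {
        "inputs": [
            {
                "name": "dense_input",  # assumption: check your model's actual input name
                "shape": [1, len(row)],
                "datatype": "FP32",
                "data": row,
            }
        ]
    }

def infer(base_url, row, model_name="fraud"):
    """POST one row to the model's v2 infer endpoint and return the parsed response."""
    url = f"{base_url.rstrip('/')}/v2/models/{model_name}/infer"
    req = Request(
        url,
        data=json.dumps(build_infer_payload(row)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urlopen(req, timeout=10) as resp:
        return json.loads(resp.read())
```

From inside a workbench you would call something like `infer("http://fraud-predictor.fraud-detection.svc.cluster.local:8888", row)`, where `row` is a preprocessed feature list for one transaction.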

