Chapter 4. Deploying and testing a model

4.1. Preparing a model for deployment

After you train a model, you can deploy it by using the OpenShift AI model serving capabilities.

To prepare a model for deployment, you must move the model from your workbench to your S3-compatible object storage. You use the data connection that you created in the Storing data with data connections section and upload the model from a notebook. You also convert the model to the portable ONNX format. ONNX (Open Neural Network Exchange) lets you transfer models between frameworks with minimal preparation and without rewriting them.
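The 2_save_model.ipynb notebook walks you through these steps. As a rough orientation only, the upload part of that work can be sketched as follows. The sketch assumes that the My Storage data connection injects the standard AWS_* environment variables into the workbench and that the model has already been converted to ONNX and saved locally as models/fraud/1/model.onnx; the notebook's actual code may differ.

  # Illustrative sketch only -- follow the notebook for the supported steps.
  # Assumes the data connection provides the AWS_* environment variables and
  # that the trained model was already converted to ONNX (for example with
  # tf2onnx or skl2onnx, depending on the framework).
  import os
  import boto3

  s3 = boto3.client(
      "s3",
      endpoint_url=os.environ["AWS_S3_ENDPOINT"],
      aws_access_key_id=os.environ["AWS_ACCESS_KEY_ID"],
      aws_secret_access_key=os.environ["AWS_SECRET_ACCESS_KEY"],
  )

  # Upload the model under the key layout that the model server expects:
  # <model name>/<version>/<model file>
  s3.upload_file(
      Filename="models/fraud/1/model.onnx",
      Bucket=os.environ["AWS_S3_BUCKET"],
      Key="models/fraud/1/model.onnx",
  )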

Prerequisites

  • You created the data connection My Storage.

    Workbench data connection form
  • You added the My Storage data connection to your workbench.

    Data storage in workbench

Procedure

  1. In your Jupyter environment, open the 2_save_model.ipynb file.
  2. Follow the instructions in the notebook to make the model accessible in storage and save it in the portable ONNX format.

Verification

When you have completed the notebook instructions, the models/fraud/1/model.onnx file is in your object storage and it is ready for your model server to use.
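If you want to confirm this from the notebook as well, a quick listing of the bucket contents should show the uploaded object. This is a sketch under the same assumption that the data connection provides the AWS_* environment variables:

  # Sketch: list objects under the models/fraud prefix to confirm the upload.
  import os
  import boto3

  s3 = boto3.client(
      "s3",
      endpoint_url=os.environ["AWS_S3_ENDPOINT"],
      aws_access_key_id=os.environ["AWS_ACCESS_KEY_ID"],
      aws_secret_access_key=os.environ["AWS_SECRET_ACCESS_KEY"],
  )

  response = s3.list_objects_v2(
      Bucket=os.environ["AWS_S3_BUCKET"], Prefix="models/fraud"
  )
  for obj in response.get("Contents", []):
      print(obj["Key"])  # expect models/fraud/1/model.onnx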

4.2. Deploying a model

Now that the model is accessible in storage and saved in the portable ONNX format, you can use an OpenShift AI model server to deploy it as an API.

OpenShift AI offers two options for model serving:

  • Single-model serving - Each model in the project is deployed on its own model server. This platform is suitable for large models or models that need dedicated resources.
  • Multi-model serving - All models in the project are deployed on the same model server. This platform is suitable for sharing resources amongst deployed models. Multi-model serving is the only option offered in the Red Hat Developer Sandbox environment.

Note: For each project, you can specify only one model serving platform. If you want to change to the other model serving platform, you must create a new project.

For this tutorial, because you are deploying only one model, you can select either serving type. The steps for deploying the fraud detection model depend on the model serving platform that you select:

4.2.1. Deploying a model on a single-model server

OpenShift AI single-model servers host only one model. You create a new model server and deploy your model to it.

Note: If you are using the Red Hat Developer Sandbox environment, you must use multi-model serving.

Prerequisites

  • A user with admin privileges has enabled the single-model serving platform on your OpenShift cluster.

Procedure

  1. In the OpenShift AI dashboard, navigate to the project details page and click the Models tab.

    Models

    Note: Depending on how model serving has been configured on your cluster, you might see only one model serving platform option.

  2. In the Single-model serving platform tile, click Deploy model.
  3. In the form, provide the following values:

    1. For Model Name, type fraud.
    2. For Serving runtime, select OpenVINO Model Server.
    3. For Model framework, select onnx-1.
    4. For Existing data connection, select My Storage.
    5. Type the path to the folder that contains your model's version folder: models/fraud
    6. Leave the other fields with the default settings.

      Deploy model form for single-model serving
  4. Click Deploy.

Verification

Wait for the model to deploy and for the Status to show a green checkmark.

Deployed model status
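You can also confirm readiness programmatically. The following is a minimal sketch that assumes the serving runtime exposes the KServe v2 REST protocol; replace the placeholder with the inference endpoint shown for your deployment on the Models tab (see Section 4.3):

  # Sketch: poll the KServe v2 readiness endpoint for the deployed model.
  # The endpoint URL below is a hypothetical placeholder.
  import requests

  infer_endpoint = "https://<your-inference-endpoint>"
  response = requests.get(f"{infer_endpoint}/v2/models/fraud/ready")
  print(response.status_code)  # 200 means the model is loaded and ready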

4.2.2. Deploying a model on a multi-model server

OpenShift AI multi-model servers can host several models at once. You create a new model server and deploy your model to it.

Prerequisites

  • A user with admin privileges has enabled the multi-model serving platform on your OpenShift cluster.

Procedure

  1. In the OpenShift AI dashboard, navigate to the project details page and click the Models tab.

    Models

    Note: Depending on how model serving has been configured on your cluster, you might see only one model serving platform option.

  2. In the Multi-model serving platform tile, click Add model server.
  3. In the form, provide the following values:

    1. For Model server name, type a name, for example Model Server.
    2. For Serving runtime, select OpenVINO Model Server.
    3. Leave the other fields with the default settings.

      Create model server form
  4. Click Add.
  5. In the Models and model servers list, next to the new model server, click Deploy model.

    Create model server form
  6. In the form, provide the following values:

    1. For Model Name, type fraud.
    2. For Model framework, select onnx-1.
    3. For Existing data connection, select My Storage.
    4. Type the path to the folder that contains your model's version folder: models/fraud
    5. Leave the other fields with the default settings.

      Deploy model form for multi-model serving
  7. Click Deploy.

Verification

Wait for the model to deploy and for the Status to show a green checkmark.

Deployed model status

4.3. Testing the model API

Now that you’ve deployed the model, you can test its API endpoints.

Procedure

  1. In the OpenShift AI dashboard, navigate to the project details page and click the Models tab.
  2. Take note of the model’s Inference endpoint. You need this information when you test the model API.

    Model inference endpoint
  3. Return to the Jupyter environment and try out your new endpoint.

    If you deployed your model with multi-model serving, follow the directions in 3_rest_requests_multi_model.ipynb to try a REST API call and 4_grpc_requests_multi_model.ipynb to try a gRPC API call.

    If you deployed your model with single-model serving, follow the directions in 5_rest_requests_single_model.ipynb to try a REST API call.
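The notebooks contain the supported request code. For orientation, a REST call against the KServe v2 inference protocol (which the OpenVINO Model Server runtime implements) can be sketched as follows; the endpoint URL, input tensor name, shape, and feature values are illustrative placeholders, so use the values from your deployment and the notebook instead:

  # Sketch: send a REST inference request using the KServe v2 protocol.
  # Endpoint, input name, shape, and data values are placeholders only.
  import requests

  infer_endpoint = "https://<your-inference-endpoint>"
  payload = {
      "inputs": [
          {
              "name": "dense_input",   # assumed input tensor name
              "shape": [1, 5],         # assumed input shape
              "datatype": "FP32",
              "data": [0.3, 1.2, 0.0, 1.0, 0.5],
          }
      ]
  }

  response = requests.post(f"{infer_endpoint}/v2/models/fraud/infer", json=payload)
  response.raise_for_status()
  print(response.json()["outputs"])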
