Chapter 4. Deploying and testing a model


4.1. Preparing a model for deployment

After you train a model, you can deploy it by using the OpenShift AI model serving capabilities.

To prepare a model for deployment, you must complete the following tasks:

  • Move the model from your workbench to your S3-compatible object storage. Use the connection that you created in the Storing data with connections section and upload the model from a notebook.
  • Convert the model to the portable ONNX format. ONNX allows you to transfer models between frameworks with minimal preparation and without rewriting them. A minimal sketch of both steps follows this list.
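
The 2_save_model.ipynb notebook walks you through both tasks, so treat the following Python sketch as a rough illustration only. It assumes a trained Keras model, the tf2onnx converter, and the S3 credentials that the My Storage connection exposes to the workbench as environment variables; the file paths and variable names here are placeholders, and the notebook remains the authoritative version.

    import os
    import boto3
    import tensorflow as tf
    from tf2onnx import convert

    # Load the trained model. The path is a placeholder for wherever your
    # training notebook saved the model.
    model = tf.keras.models.load_model("models/fraud/model.keras")

    # Convert the model to the portable ONNX format.
    os.makedirs("models/fraud/1", exist_ok=True)
    input_signature = [tf.TensorSpec(model.inputs[0].shape,
                                     model.inputs[0].dtype, name="input")]
    convert.from_keras(model, input_signature=input_signature,
                       output_path="models/fraud/1/model.onnx")

    # Upload the ONNX file with the credentials that the My Storage connection
    # exposes as environment variables (verify the variable names in your
    # workbench environment).
    s3 = boto3.client(
        "s3",
        endpoint_url=os.environ["AWS_S3_ENDPOINT"],
        aws_access_key_id=os.environ["AWS_ACCESS_KEY_ID"],
        aws_secret_access_key=os.environ["AWS_SECRET_ACCESS_KEY"],
    )
    s3.upload_file("models/fraud/1/model.onnx",
                   os.environ["AWS_S3_BUCKET"],
                   "models/fraud/1/model.onnx")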

Prerequisites

  • You created the My Storage connection and have added it to your workbench.

    Data storage in workbench

Procedure

  1. In your JupyterLab environment, open the 2_save_model.ipynb file.
  2. Follow the instructions in the notebook to make the model accessible in storage and save it in the portable ONNX format.

Verification

When you have completed the notebook instructions, the models/fraud/1/model.onnx file is in your object storage and it is ready for your model server to use.
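
If you want to confirm the upload from the notebook itself, a short listing such as the following sketch (assuming the same connection environment variables as in the upload step) should show the models/fraud/1/model.onnx key:

    import os
    import boto3

    s3 = boto3.client(
        "s3",
        endpoint_url=os.environ["AWS_S3_ENDPOINT"],
        aws_access_key_id=os.environ["AWS_ACCESS_KEY_ID"],
        aws_secret_access_key=os.environ["AWS_SECRET_ACCESS_KEY"],
    )

    # List everything under the models/fraud prefix in the connection's bucket.
    response = s3.list_objects_v2(Bucket=os.environ["AWS_S3_BUCKET"],
                                  Prefix="models/fraud")
    for obj in response.get("Contents", []):
        print(obj["Key"])   # expect models/fraud/1/model.onnx in the output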

4.2. Deploying a model

Now that the model is accessible in storage and saved in the portable ONNX format, you can use an OpenShift AI model server to deploy it as an API.

OpenShift AI offers two options for model serving:

  • Single-model serving - Each model in the project is deployed on its own model server. This platform works well for large models or models that need dedicated resources.
  • Multi-model serving - All models in the project are deployed on the same model server. This platform is suitable for sharing resources among deployed models. Multi-model serving is the only option offered in the Red Hat Developer Sandbox environment.

For this tutorial, because you are deploying only one model, you can select either serving type. The steps for deploying the fraud detection model depend on the type of model serving platform you select:

4.2.1. Deploying a model on a single-model server

OpenShift AI single-model servers host only one model. You create a new model server and deploy your model to it.

Prerequisites

  • A user with admin privileges has enabled the single-model serving platform on your OpenShift cluster.

Procedure

  1. In the OpenShift AI dashboard, navigate to the project details page and click the Models tab.

    Models
    Note

    Depending on how model serving has been configured on your cluster, you might see only one model serving platform option.

  2. In the Single-model serving platform tile, click Select single-model.
  3. In the form, provide the following values:

    1. For Model deployment name, type fraud.
    2. For Serving runtime, select OpenVINO Model Server.
    3. For Model framework (name - version), select onnx-1.
    4. For Deployment mode, select Standard.
    5. For Existing connection, select My Storage.
    6. Type the path that leads to the version folder that contains your model file: models/fraud
    7. Leave the other fields with the default settings.

      Deploy model form for single-model serving
  4. Click Deploy.

Verification

Notice the loading symbol under the Status section. The symbol changes to a green checkmark when the deployment completes successfully.

Deployed model status

4.2.2. Deploying a model on a multi-model server

OpenShift AI multi-model servers can host several models at once. You create a new model server and deploy your model to it.

Prerequisites

  • A user with admin privileges has enabled the multi-model serving platform on your OpenShift cluster.

Procedure

  1. In the OpenShift AI dashboard, navigate to the project details page and click the Models tab.

    Models
    Note

    Depending on how model serving has been configured on your cluster, you might see only one model serving platform option.

  2. In the Multi-model serving platform tile, click Select multi-model.
  3. In the form, provide the following values:

    1. For Model server name, type a name, for example Model Server.
    2. For Serving runtime, select OpenVINO Model Server.
    3. Leave the other fields with the default settings.

      Create model server form
  4. Click Add.
  5. In the Models and model servers list, next to the new model server, click Deploy model.

    Create model server form
  6. In the form, provide the following values:

    1. For Model deployment name, type fraud.
    2. For Model framework (name - version), select onnx-1.
    3. For Existing connection, select My Storage.
    4. Type the path that leads to the version folder that contains your model file: models/fraud
    5. Leave the other fields with the default settings.

      Deploy model form for multi-model serving
  7. Click Deploy.

Verification

Notice the loading symbol under the Status section. The symbol changes to a green checkmark when the deployment completes successfully.

Deployed model status

4.3. Testing the model API

Now that you’ve deployed the model, you can test its API endpoints.

Procedure

  1. In the OpenShift AI dashboard, navigate to the project details page and click the Models tab.
  2. Take note of the model’s Inference endpoint URL. You need this information when you test the model API.

    If the Inference endpoint field contains an Internal endpoint details link, click the link to open a text box that shows the URL details, and then take note of the restUrl value.

    Model inference endpoint
  3. Return to the JupyterLab environment and try out your new endpoint.

    If you deployed your model with the multi-model serving platform, follow the directions in 3_rest_requests_multi_model.ipynb to try a REST API call and 4_grpc_requests_multi_model.ipynb to try a gRPC API call.

    If you deployed your model with the single-model serving platform, follow the directions in 5_rest_requests_single_model.ipynb to try a REST API call. A standalone sketch of a REST request also follows this procedure.
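
The notebooks contain the complete request code. As a rough illustration only, the following sketch shows the general shape of a REST call against the OpenVINO Model Server runtime, which serves the KServe v2 inference protocol. The endpoint URL, input tensor name, shape, and feature values below are placeholders; use the values from the Models tab and from the notebook for your deployment.

    import requests

    # Replace with the inference endpoint (or restUrl) that you noted earlier.
    infer_url = "<your-inference-endpoint>/v2/models/fraud/infer"

    payload = {
        "inputs": [
            {
                "name": "dense_input",   # assumed input tensor name; check the notebook
                "shape": [1, 5],         # one transaction with five features (assumption)
                "datatype": "FP32",
                "data": [0.3, 1.9, 0.0, 0.0, 1.0],   # illustrative feature values
            }
        ]
    }

    response = requests.post(infer_url, json=payload, timeout=30)
    response.raise_for_status()
    print(response.json()["outputs"])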
