Chapter 4. Deploying a model from the model catalog
You can deploy models directly from the model catalog.
OpenShift AI model serving deployments use the global cluster pull secret to pull models in ModelCar format from the catalog.
Prerequisites
- You have completed the prerequisites in Deploying models on the single-model serving platform.
Procedure
- From the OpenShift AI dashboard, click Models → Model catalog. From the drop-down list, select one of the available catalog sources that have been configured by your administrator. The Default Catalog is displayed by default.
Note: OpenShift cluster administrators can configure additional model catalog sources. For more details, see the Kubeflow Model Registry community documentation on configuring catalog sources.
- Use the search bar to find a model in the catalog. You can enter text to search by model name, description, or provider.
- Click the name of a model to view the model details page.
- Click Deploy model to display the Deploy model dialog.
From the Project drop-down list, select a project in which to deploy your model.
Note: Models using OCI storage can only be deployed on the single-model serving platform. Projects using the multi-model serving platform do not appear in the project list.
In the Model deployment section:
Optional: In the Model deployment name field, enter a unique name for your model deployment. This field is autofilled with a value that contains the model name by default.
This is the name of the inference service created when the model is deployed.
Optional: Click Edit resource name, and then enter a specific resource name for the model deployment in the Resource name field. By default, the resource name matches the name of the model deployment.
Important: The resource name identifies your resources in OpenShift. The resource name cannot exceed 253 characters, must consist of lowercase alphanumeric characters or hyphens (-), and must start and end with an alphanumeric character. Resource names cannot be edited after creation.
The resource name must not match the name of any other model deployment resource in your OpenShift cluster.
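The naming rules above follow the Kubernetes DNS subdomain convention, so a candidate resource name can be checked before you submit the dialog. A minimal sketch (the regular expression and helper name are illustrative, not part of OpenShift AI):

```python
import re

# Resource-name rules stated above: lowercase alphanumerics or "-",
# starting and ending with an alphanumeric character, at most 253 characters.
RESOURCE_NAME_RE = re.compile(r"^[a-z0-9]([-a-z0-9]*[a-z0-9])?$")

def is_valid_resource_name(name: str) -> bool:
    """Return True if `name` satisfies the resource-name rules above."""
    return len(name) <= 253 and bool(RESOURCE_NAME_RE.match(name))

print(is_valid_resource_name("granite-7b-deployment"))  # True
print(is_valid_resource_name("Granite-7B"))             # False: uppercase characters
print(is_valid_resource_name("-granite"))               # False: leading hyphen
```

A name that fails this check is rejected by OpenShift when the inference service is created, so validating early avoids a failed deployment.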
- From the Serving runtime list, select a model-serving runtime that is installed and enabled in your OpenShift AI deployment. If project-scoped runtimes exist, the Serving runtime list includes subheadings to distinguish between global runtimes and project-scoped runtimes.
From the Model framework list, select a framework for your model.
Note: The Model framework list shows only the frameworks that are supported by the model-serving runtime that you selected.
From the Deployment mode list, select KServe RawDeployment or Knative Serverless. For more information about deployment modes, see About KServe deployment modes.
- In the Number of model server replicas to deploy field, specify a value.
- From the Model server size list, select a value.
If you have created a hardware profile, select a hardware profile from the Hardware profile list. If project-scoped hardware profiles exist, the Hardware profile list includes subheadings to distinguish between global hardware profiles and project-scoped hardware profiles.
Important: By default, hardware profiles are hidden from the dashboard navigation menu and user interface. In addition, user interface components associated with the deprecated accelerator profiles functionality are still displayed. To show the Settings → Hardware profiles option in the dashboard navigation menu and the user interface components associated with hardware profiles, set the disableHardwareProfiles value to false in the OdhDashboardConfig custom resource (CR) in OpenShift. For more information about setting dashboard configuration options, see Customizing the dashboard.
- In the Model route section, select the Make deployed models available through an external route checkbox to make your deployed models available to external clients.
In the Token authentication section, select the Require token authentication checkbox to require token authentication for your model server. To finish configuring token authentication, perform the following actions:
- In the Service account name field, enter the service account name for which the token will be generated. The token is generated and displayed in the Token secret field when the model server is configured.
- To add an additional service account, click Add a service account and enter another service account name.
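When token authentication is enabled, clients must present the generated token as a bearer token on every inference request. A hedged sketch of how a client might attach it, using only the Python standard library (the endpoint URL, token value, and payload shape are placeholders, not values from OpenShift AI):

```python
import json
import urllib.request

# Placeholder values: substitute the external route of your deployment and
# the token displayed in the Token secret field after the model server is configured.
INFERENCE_URL = "https://<model-route>/v1/models/<model-name>:predict"
TOKEN = "<token-from-token-secret-field>"

def build_request(payload: dict) -> urllib.request.Request:
    """Build an inference request that carries the bearer token."""
    return urllib.request.Request(
        INFERENCE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {TOKEN}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_request({"inputs": [{"name": "input-0", "data": [1, 2, 3]}]})
print(req.get_header("Authorization"))
```

Requests that omit the Authorization header, or that present a token for a service account that was not configured here, are rejected by the model server.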
- In the Source model location section, select Current URI to deploy the selected model from the catalog.
Optional: Customize the runtime parameters in the Configuration parameters section:
- Modify the values in Additional serving runtime arguments to define how the deployed model behaves.
- Modify the values in Additional environment variables to define variables in the model’s environment.
- Click Deploy.
Verification
- The model deployment is displayed on the Models → Model deployments page.
- The model deployment is displayed in the Latest deployments section of the model details page.
- The model deployment is displayed on the Deployments tab for the model version.