Chapter 4. Deploying a model from the model catalog
You can deploy models directly from the model catalog.
OpenShift AI model serving deployments use the global cluster pull secret to pull models in ModelCar format from the catalog.
Prerequisites
- You have completed the prerequisites in Deploying models on the single-model serving platform.
Procedure
- From the OpenShift AI dashboard, click Models → Model catalog. From the drop-down list, select one of the available catalog sources that have been configured by your administrator. The Default Catalog is displayed by default.
Note: OpenShift cluster administrators can configure additional model catalog sources. For more details, see the Kubeflow Model Registry community documentation on configuring catalog sources.
- Use the search bar to find a model in the catalog. You can enter text to search by model name, description, or provider.
- Click the name of a model to view the model details page.
- Click Deploy model to display the Deploy model dialog.
From the Project drop-down list, select a project in which to deploy your model.
Note: Models using OCI storage can only be deployed on the single-model serving platform. Projects using the multi-model serving platform do not appear in the project list.
In the Model deployment section:
Optional: In the Model deployment name field, enter a unique name for your model deployment. This field is autofilled with a value that contains the model name by default.
This is the name of the inference service created when the model is deployed.
Optional: Click Edit resource name, and then enter a specific resource name for the model deployment in the Resource name field. By default, the resource name matches the name of the model deployment.
Important: The resource name identifies your resources in OpenShift. The resource name cannot exceed 253 characters, must consist of lowercase alphanumeric characters or hyphens (-), and must start and end with an alphanumeric character. Resource names cannot be edited after creation.
The resource name must not match the name of any other model deployment resource in your OpenShift cluster.
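The naming rules above follow the Kubernetes DNS subdomain convention, so a candidate resource name can be checked before you submit the dialog. A minimal sketch (the regular expression and helper name are illustrative, not part of OpenShift AI):

```python
import re

# Resource-name rules stated above: lowercase alphanumerics or "-",
# starting and ending with an alphanumeric character, at most 253 characters.
RESOURCE_NAME_RE = re.compile(r"^[a-z0-9]([-a-z0-9]*[a-z0-9])?$")

def is_valid_resource_name(name: str) -> bool:
    """Return True if `name` satisfies the resource-name rules above."""
    return len(name) <= 253 and bool(RESOURCE_NAME_RE.match(name))

print(is_valid_resource_name("granite-7b-deployment"))  # True
print(is_valid_resource_name("Granite-7B"))             # False: uppercase characters
print(is_valid_resource_name("-granite"))               # False: leading hyphen
```

A name that fails this check is rejected by OpenShift when the inference service is created, so validating early avoids a failed deployment.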
- From the Serving runtime list, select a model-serving runtime that is installed and enabled in your OpenShift AI deployment. If project-scoped runtimes exist, the Serving runtime list includes subheadings to distinguish between global runtimes and project-scoped runtimes.
From the Model framework list, select a framework for your model.
Note: The Model framework list shows only the frameworks that are supported by the model-serving runtime that you selected.
From the Deployment mode list, select KServe RawDeployment or Knative Serverless. For more information about deployment modes, see About KServe deployment modes.
- In the Number of model server replicas to deploy field, specify a value.
- From the Model server size list, select a value.
If you have created a hardware profile, select a hardware profile from the Hardware profile list. If project-scoped hardware profiles exist, the Hardware profile list includes subheadings to distinguish between global hardware profiles and project-scoped hardware profiles.
Important: By default, hardware profiles are hidden from the dashboard navigation menu and user interface. In addition, user interface components associated with the deprecated accelerator profiles functionality are still displayed. To show the Settings → Hardware profiles option in the dashboard navigation menu and the user interface components associated with hardware profiles, set the disableHardwareProfiles value to false in the OdhDashboardConfig custom resource (CR) in OpenShift. For more information about setting dashboard configuration options, see Customizing the dashboard.
- In the Model route section, select the Make deployed models available through an external route checkbox to make your deployed models available to external clients.
In the Token authentication section, select the Require token authentication checkbox to require token authentication for your model server. To finish configuring token authentication, perform the following actions:
- In the Service account name field, enter the service account name for which the token will be generated. The token is generated and displayed in the Token secret field when the model server is configured.
- To add an additional service account, click Add a service account and enter another service account name.
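When token authentication is enabled, clients must present the generated token as a bearer token on every inference request. A hedged sketch of how a client might attach it, using only the Python standard library (the endpoint URL, token value, and payload shape are placeholders, not values from OpenShift AI):

```python
import json
import urllib.request

# Placeholder values: substitute the external route of your deployment and
# the token displayed in the Token secret field after the model server is configured.
INFERENCE_URL = "https://<model-route>/v1/models/<model-name>:predict"
TOKEN = "<token-from-token-secret-field>"

def build_request(payload: dict) -> urllib.request.Request:
    """Build an inference request that carries the bearer token."""
    return urllib.request.Request(
        INFERENCE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {TOKEN}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_request({"inputs": [{"name": "input-0", "data": [1, 2, 3]}]})
print(req.get_header("Authorization"))
```

Requests that omit the Authorization header, or that present a token for a service account that was not configured here, are rejected by the model server.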
- In the Source model location section, select Current URI to deploy the selected model from the catalog.
Optional: Customize the runtime parameters in the Configuration parameters section:
- Modify the values in Additional serving runtime arguments to define how the deployed model behaves.
- Modify the values in Additional environment variables to define variables in the model’s environment.
- Click Deploy.
Verification
- The model deployment is displayed on the Models → Model deployments page.
- The model deployment is displayed in the Latest deployments section of the model details page.
- The model deployment is displayed on the Deployments tab for the model version.