Chapter 4. Deploying a model from the model catalog


You can deploy models directly from the model catalog.

Note

OpenShift AI model serving deployments use the global cluster pull secret to pull models in ModelCar format from the catalog.
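
A ModelCar model is packaged and pulled as an OCI container image. When such a model is deployed, it is referenced with an oci:// URI; for example (the registry path and tag here are hypothetical):

    oci://registry.example.com/modelcar/granite-7b:1.0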

Prerequisites

Procedure

  1. From the OpenShift AI dashboard, click Models → Model catalog.
  2. From the drop-down list, select one of the catalog sources configured by your administrator. The Default Catalog is displayed by default.

    Note

    OpenShift cluster administrators can configure additional model catalog sources. For more details, see the Kubeflow Model Registry community documentation on configuring catalog sources.

  3. Use the search bar to find a model in the catalog. You can enter text to search by model name, description, or provider.
  4. Click the name of a model to view the model details page.
  5. Click Deploy model to display the Deploy model dialog.
  6. From the Project drop-down list, select a project in which to deploy your model.

    Note

    Models using OCI storage can only be deployed on the single-model serving platform. Projects using the multi-model serving platform do not appear in the project list.

  7. In the Model deployment section:

    1. Optional: In the Model deployment name field, enter a unique name for your model deployment. By default, this field is autofilled with a value based on the model name.

      This is the name of the inference service that is created when the model is deployed. A sketch of this resource is shown after this procedure.

    2. Optional: Click Edit resource name, and then enter a specific resource name for the model deployment in the Resource name field. By default, the resource name matches the name of the model deployment.

      Important

      Resource names identify your resources in OpenShift. A resource name cannot exceed 253 characters, must consist of lowercase alphanumeric characters or hyphens (-), must start and end with an alphanumeric character, and cannot be changed after creation.

      The resource name must not match the name of any other model deployment resource in your OpenShift cluster.

    3. From the Serving runtime list, select a model-serving runtime that is installed and enabled in your OpenShift AI deployment. If project-scoped runtimes exist, the Serving runtime list includes subheadings to distinguish between global runtimes and project-scoped runtimes.
    4. From the Model framework list, select a framework for your model.

      Note

      The Model framework list shows only the frameworks that are supported by the model-serving runtime that you selected in the previous step.

  8. From the Deployment mode list, select KServe RawDeployment or Knative Serverless. For more information about deployment modes, see About KServe deployment modes.

    1. In the Number of model server replicas to deploy field, specify a value.
    2. From the Model server size list, select a value.
    3. If you have created a hardware profile, select a hardware profile from the Hardware profile list. If project-scoped hardware profiles exist, the Hardware profile list includes subheadings to distinguish between global hardware profiles and project-scoped hardware profiles.

      Important

      By default, hardware profiles are hidden from the dashboard navigation menu and user interface, while user interface components associated with the deprecated accelerator profiles functionality are still displayed. To show the Settings → Hardware profiles option in the dashboard navigation menu and the user interface components associated with hardware profiles, set the disableHardwareProfiles value to false in the OdhDashboardConfig custom resource (CR) in OpenShift, as shown in the example patch command after this procedure. For more information about setting dashboard configuration options, see Customizing the dashboard.

    4. In the Model route section, select the Make deployed models available through an external route checkbox to expose your deployed models to external clients.
    5. In the Token authentication section, select the Require token authentication checkbox to require a token for requests to your model server. To finish configuring token authentication, perform the following actions:

      1. In the Service account name field, enter the name of the service account for which the token is generated. The token is created and displayed in the Token secret field when the model server is configured; an example of retrieving it from the command line follows this procedure.
      2. To add an additional service account, click Add a service account and enter another service account name.
  9. In the Source model location section, select Current URI to deploy the selected model from the catalog.
  10. Optional: Customize the runtime parameters in the Configuration parameters section:

    1. Modify the values in Additional serving runtime arguments to define how the deployed model behaves.
    2. Modify the values in Additional environment variables to define variables in the model’s environment.
  11. Click Deploy.
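
For reference, deploying a model from the dashboard creates an InferenceService resource in the selected project. The following sketch shows how the choices in steps 6 to 10 might map onto that resource; all names, the registry path, the runtime, and the argument and variable values are hypothetical, and the exact fields that OpenShift AI sets can vary by version and runtime:

    apiVersion: serving.kserve.io/v1beta1
    kind: InferenceService
    metadata:
      name: granite-7b-deployment             # Resource name (step 7): lowercase alphanumerics and hyphens
      namespace: my-data-science-project      # Project selected in step 6 (hypothetical)
      annotations:
        serving.kserve.io/deploymentMode: RawDeployment   # Deployment mode (step 8): RawDeployment or Serverless
    spec:
      predictor:
        minReplicas: 1                        # Number of model server replicas (step 8)
        model:
          modelFormat:
            name: vLLM                        # Model framework (step 7)
          runtime: vllm-runtime               # Serving runtime (step 7); hypothetical name
          storageUri: oci://registry.example.com/modelcar/granite-7b:1.0   # Current URI (step 9)
          args:                               # Additional serving runtime arguments (step 10)
            - --max-model-len=4096
          env:                                # Additional environment variables (step 10)
            - name: HF_HOME
              value: /tmp/hf_home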
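
To make hardware profiles visible in the dashboard (see the Important box in step 8), a cluster administrator can set disableHardwareProfiles to false. A minimal sketch, assuming the default CR name odh-dashboard-config and the redhat-ods-applications namespace that OpenShift AI typically uses:

    $ oc patch odhdashboardconfig odh-dashboard-config \
        -n redhat-ods-applications \
        --type merge \
        -p '{"spec":{"dashboardConfig":{"disableHardwareProfiles":false}}}'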
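
If you enabled token authentication (step 8), the generated token is stored in a secret in your project and displayed in the Token secret field. As a sketch, assuming a hypothetical secret name, you can also read the token from the command line:

    $ oc get secret granite-7b-deployment-sa-token -n my-data-science-project \
        -o jsonpath='{.data.token}' | base64 -d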

Verification

  • The model deployment is displayed on the Models → Model deployments page.
  • The model deployment is displayed in the Latest deployments section of the model details page.
  • The model deployment is displayed on the Deployments tab for the model version.
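
You can also verify the deployment from the command line. A sketch, using the hypothetical names from the earlier example:

    $ oc get inferenceservice granite-7b-deployment -n my-data-science-project

The deployment is ready when the READY column reports True.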
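
If you enabled the external route and token authentication, you can send an authenticated request to the deployed model. A sketch, assuming a runtime that exposes an OpenAI-compatible API (such as vLLM), a hypothetical endpoint URL, and the token from the earlier retrieval sketch exported in the TOKEN environment variable:

    $ curl -H "Authorization: Bearer $TOKEN" \
        https://granite-7b-deployment-my-data-science-project.apps.example.com/v1/models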