2.3. Deploying models stored in OCI images by using the CLI

You can deploy a model that is stored in an OCI image from the command line interface.

The following procedure uses the example of deploying a MobileNet v2-7 model in ONNX format, stored in an OCI image, on an OpenVINO model server.

Note

By default in KServe, models are exposed publicly outside the cluster and are not protected with authentication.

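Prerequisites

You have stored a model in an OCI image, as described in Storing a model in an OCI image.
If you want to deploy a model that is stored in a private OCI repository, you must configure an image pull secret. For more information about creating an image pull secret, see Using image pull secrets.
You are logged in to your OpenShift cluster.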

Procedure

Create a project to deploy the model:
```
oc new-project oci-model-example
```
Use the OpenShift AI Applications project kserve-ovms template to create a ServingRuntime resource and configure the OpenVINO model server in the new project:
```
oc process -n redhat-ods-applications -o yaml kserve-ovms | oc apply -f -
```

Verify that the ServingRuntime named kserve-ovms is created:

```
oc get servingruntimes
```

The command should return output similar to the following:

```
NAME          DISABLED   MODELTYPE     CONTAINERS         AGE
kserve-ovms              openvino_ir   kserve-container   1m
```
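
The output above lists openvino_ir in the MODELTYPE column, while the example model is in ONNX format. The following is a minimal sketch of how you might confirm that the runtime also accepts onnx. It assumes that the ServingRuntime lists its formats under spec.supportedModelFormats; the helper name runtime_supports_format is hypothetical and not part of the original procedure.

```shell
# Hypothetical helper: succeeds if the given ServingRuntime lists the given model format.
# $1 = ServingRuntime name, $2 = model format name (for example, onnx)
runtime_supports_format() {
  # Assumption: formats are listed under spec.supportedModelFormats[].name.
  # The jsonpath query prints the names separated by spaces, so split them
  # onto separate lines and look for an exact match.
  oc get servingruntime "$1" \
    -o jsonpath='{.spec.supportedModelFormats[*].name}' \
    | tr ' ' '\n' | grep -qx "$2"
}
```

For example, `runtime_supports_format kserve-ovms onnx && echo "onnx is supported"` prints the message only if the format is listed.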

Create an InferenceService YAML resource, depending on whether the model is stored in a private or a public OCI repository:

For a model stored in a public OCI repository, create an InferenceService YAML file with the following values, replacing <user_name>, <repository_name>, and <tag_name> with values that are specific to your environment:

```
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sample-isvc-using-oci
spec:
  predictor:
    model:
      runtime: kserve-ovms # Ensure this matches the name of the ServingRuntime resource
      modelFormat:
        name: onnx
      storageUri: oci://quay.io/<user_name>/<repository_name>:<tag_name>
      resources:
        requests:
          memory: 500Mi
          cpu: 100m
          # nvidia.com/gpu: "1" # Only required if you have GPUs available and the model and runtime will use it
        limits:
          memory: 4Gi
          cpu: 500m
          # nvidia.com/gpu: "1" # Only required if you have GPUs available and the model and runtime will use it
```
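
Because the manifest contains placeholders such as <user_name>, it can be useful to check that none remain before you create the resource. The following is a minimal sketch of such a pre-flight check. The helper name check_placeholders and the file name isvc-oci.yaml in the usage note are hypothetical and not part of the original procedure.

```shell
# Hypothetical pre-flight check: fail if a manifest still contains
# angle-bracket placeholders such as <user_name> or <pull-secret-name>.
# $1 = path to the InferenceService YAML file
check_placeholders() {
  # Print each offending line with its line number to stderr.
  if grep -nE '<[A-Za-z_-]+>' "$1" >&2; then
    echo "error: replace the placeholders listed above in $1" >&2
    return 1
  fi
}
```

Assuming you saved the manifest as isvc-oci.yaml, you might then run `check_placeholders isvc-oci.yaml && oc apply -f isvc-oci.yaml` to create the resource in the current project.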

For a model stored in a private OCI repository, create an InferenceService YAML file that specifies your pull secret in the spec.predictor.imagePullSecrets field, as shown in the following example:

```
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sample-isvc-using-private-oci
spec:
  predictor:
    model:
      runtime: kserve-ovms # Ensure this matches the name of the ServingRuntime resource
      modelFormat:
        name: onnx
      storageUri: oci://quay.io/<user_name>/<repository_name>:<tag_name>
      resources:
        requests:
          memory: 500Mi
          cpu: 100m
          # nvidia.com/gpu: "1" # Only required if you have GPUs available and the model and runtime will use it
        limits:
          memory: 4Gi
          cpu: 500m
          # nvidia.com/gpu: "1" # Only required if you have GPUs available and the model and runtime will use it
    imagePullSecrets: # Specify image pull secrets to use for fetching container images, including OCI model images
    - name: <pull-secret-name>
```
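
The pull secret named in imagePullSecrets must exist in the same project as the InferenceService. The following is a minimal sketch of one way to create it from an existing registry authentication file. It is an illustration only: the helper name create_pull_secret and the file path in the usage note are hypothetical, and the procedure referenced in the prerequisites remains the authoritative source.

```shell
# Hypothetical helper: create an image pull secret from a registry auth file.
# $1 = secret name, $2 = path to a registry auth file in .dockerconfigjson format,
# $3 = project (namespace) where the InferenceService will be created
create_pull_secret() {
  oc create secret docker-registry "$1" \
    --from-file=.dockerconfigjson="$2" \
    -n "$3"
}
```

For example, `create_pull_secret my-pull-secret "$HOME/.docker/config.json" oci-model-example` would create a secret named my-pull-secret in the project created earlier in this procedure.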

After you create the InferenceService resource, KServe deploys the model stored in the OCI image referred to by the storageUri field.

Verification

Check the status of the deployment:

```
oc get inferenceservice
```

The command should return output that includes information such as the URL of the deployed model and its readiness state.
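
If you prefer to block until the deployment is ready rather than polling manually, the following minimal sketch shows one approach. It assumes that the InferenceService reports a Ready condition and publishes its endpoint in status.url; the helper name wait_for_isvc and the default timeout are hypothetical.

```shell
# Hypothetical helper: wait until an InferenceService is Ready, then print its URL.
# $1 = InferenceService name, $2 = optional timeout (defaults to 300s)
wait_for_isvc() {
  # oc wait exits with a non-zero status if the condition is not met in time,
  # in which case the URL lookup is skipped.
  oc wait --for=condition=Ready "inferenceservice/$1" --timeout="${2:-300s}" \
    && oc get inferenceservice "$1" -o jsonpath='{.status.url}{"\n"}'
}
```

For example, `wait_for_isvc sample-isvc-using-oci` waits up to five minutes for the public-repository example and then prints its URL.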
