2.3. CLI を使用して OCI イメージに保存されたモデルをデプロイする

OCI イメージに保存されているモデルをコマンドラインインターフェイスからデプロイできます。

次の手順では、OpenVINO モデルサーバー上の OCI イメージに保存されている ONNX 形式の MobileNet v2-7 モデルをデプロイする例を使用します。

注記

KServe では、デフォルトでモデルはクラスター外部に公開され、認証によって保護されません。

前提条件

OCI イメージへのモデルの保存の説明に従って、モデルを OCI イメージに保存した。
プライベート OCI リポジトリーに保存されているモデルをデプロイする場合は、イメージプルシークレットを設定した。イメージプルシークレットの作成の詳細は、イメージプルシークレットの使用を参照してください。
OpenShift クラスターにログインしている。

手順

モデルをデプロイするためのプロジェクトを作成します。
```
oc new-project oci-model-example
```
```
oc new-project oci-model-example
```
Copy to Clipboard Toggle word wrap
OpenShift AI アプリケーションプロジェクトの kserve-ovms テンプレートを使用して ServingRuntime リソースを作成し、新しいプロジェクトで OpenVINO モデルサーバーを設定します。
```
oc process -n redhat-ods-applications -o yaml kserve-ovms | oc apply -f -
```
```
oc process -n redhat-ods-applications -o yaml kserve-ovms | oc apply -f -
```
Copy to Clipboard Toggle word wrap

kserve-ovms という名前の ServingRuntime が作成されていることを確認します。

oc get servingruntimes

oc get servingruntimes

Copy to Clipboard

Toggle word wrap

このコマンドは、次のような出力を返すはずです。

NAME          DISABLED   MODELTYPE     CONTAINERS         AGE
kserve-ovms              openvino_ir   kserve-container   1m

NAME          DISABLED   MODELTYPE     CONTAINERS         AGE
kserve-ovms              openvino_ir   kserve-container   1m

Copy to Clipboard

Toggle word wrap

モデルがプライベート OCI リポジトリーに保存されているか、パブリック OCI リポジトリーに保存されているかに応じて、InferenceService YAML リソースを作成します。

パブリック OCI リポジトリーに保存されているモデルの場合は、次の値を含む InferenceService YAML ファイルを作成し、<user_name>、<repository_name>、<tag_name> をお客様の環境固有の値に置き換えます。

apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sample-isvc-using-oci
spec:
  predictor:
    model:
      runtime: kserve-ovms # Ensure this matches the name of the ServingRuntime resource
      modelFormat:
        name: onnx
      storageUri: oci://quay.io/<user_name>/<repository_name>:<tag_name>
      resources:
        requests:
          memory: 500Mi
          cpu: 100m
          # nvidia.com/gpu: "1" # Only required if you have GPUs available and the model and runtime will use it
        limits:
          memory: 4Gi
          cpu: 500m
          # nvidia.com/gpu: "1" # Only required if you have GPUs available and the model and runtime will use it

apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sample-isvc-using-oci
spec:
  predictor:
    model:
      runtime: kserve-ovms # Ensure this matches the name of the ServingRuntime resource
      modelFormat:
        name: onnx
      storageUri: oci://quay.io/<user_name>/<repository_name>:<tag_name>
      resources:
        requests:
          memory: 500Mi
          cpu: 100m
          # nvidia.com/gpu: "1" # Only required if you have GPUs available and the model and runtime will use it
        limits:
          memory: 4Gi
          cpu: 500m
          # nvidia.com/gpu: "1" # Only required if you have GPUs available and the model and runtime will use it

Copy to Clipboard

Toggle word wrap

プライベート OCI リポジトリーに保存されているモデルの場合は、次の例に示すように、spec.predictor.imagePullSecrets フィールドにプルシークレットを指定した InferenceService YAML ファイルを作成します。

apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sample-isvc-using-private-oci
spec:
  predictor:
    model:
      runtime: kserve-ovms # Ensure this matches the name of the ServingRuntime resource
      modelFormat:
        name: onnx
      storageUri: oci://quay.io/<user_name>/<repository_name>:<tag_name>
      resources:
        requests:
          memory: 500Mi
          cpu: 100m
          # nvidia.com/gpu: "1" # Only required if you have GPUs available and the model and runtime will use it
        limits:
          memory: 4Gi
          cpu: 500m
          # nvidia.com/gpu: "1" # Only required if you have GPUs available and the model and runtime will use it
    imagePullSecrets: # Specify image pull secrets to use for fetching container images, including OCI model images
    - name: <pull-secret-name>

apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sample-isvc-using-private-oci
spec:
  predictor:
    model:
      runtime: kserve-ovms # Ensure this matches the name of the ServingRuntime resource
      modelFormat:
        name: onnx
      storageUri: oci://quay.io/<user_name>/<repository_name>:<tag_name>
      resources:
        requests:
          memory: 500Mi
          cpu: 100m
          # nvidia.com/gpu: "1" # Only required if you have GPUs available and the model and runtime will use it
        limits:
          memory: 4Gi
          cpu: 500m
          # nvidia.com/gpu: "1" # Only required if you have GPUs available and the model and runtime will use it
    imagePullSecrets: # Specify image pull secrets to use for fetching container images, including OCI model images
    - name: <pull-secret-name>

Copy to Clipboard

Toggle word wrap

InferenceService リソースを作成すると、KServe は storageUri フィールドによって参照される OCI イメージに保存されているモデルをデプロイします。

検証

デプロイメントのステータスを確認します。

oc get inferenceservice

oc get inferenceservice

Copy to Clipboard

Toggle word wrap

このコマンドで、デプロイしたモデルの URL やその準備状態などの情報を含む出力が返されるはずです。

2.3. CLI を使用して OCI イメージに保存されたモデルをデプロイする

詳細情報

試用、購入および販売

コミュニティー

Red Hat ドキュメントについて

多様性を受け入れるオープンソースの強化

会社概要

Theme

Red Hat legal and privacy links

Red Hat legal and privacy links