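2.3. Deploying models stored in OCI images by using the CLI

You can deploy a model that is stored in an OCI image from the command-line interface.

The following procedure uses the example of deploying a MobileNet v2-7 model in ONNX format, stored in an OCI image, on an OpenVINO model server.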

Note

By default in KServe, models are exposed outside the cluster and are not protected with authentication.

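Prerequisites

You have stored a model in an OCI image as described in Storing a model in an OCI image.
If you want to deploy a model that is stored in a private OCI repository, you must configure an image pull secret. For more information about creating an image pull secret, see Using image pull secrets.
You are logged in to your OpenShift cluster.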

Procedure

Create a project to deploy the model:
```
oc new-project oci-model-example
```
Use the kserve-ovms template from the OpenShift AI Applications project to create a ServingRuntime resource and configure the OpenVINO model server in the new project:
```
oc process -n redhat-ods-applications -o yaml kserve-ovms | oc apply -f -
```

Verify that the ServingRuntime named kserve-ovms was created:

```
oc get servingruntimes
```

The command should return output similar to the following:

```
NAME          DISABLED   MODELTYPE     CONTAINERS         AGE
kserve-ovms              openvino_ir   kserve-container   1m
```
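The sample output lists openvino_ir under MODELTYPE, while the example model is in ONNX format. As a rough sketch, assuming the runtime declares its formats under spec.supportedModelFormats (the field that KServe defines for ServingRuntime resources), the following helpers print every declared format so that you can confirm onnx is among them. The function names are hypothetical and are not part of oc.

```shell
# Print the model formats that a ServingRuntime declares as supported.
# Assumes you are logged in and the current project contains the runtime.
# list_runtime_formats is a hypothetical helper name, not an oc subcommand.
list_runtime_formats() {
  runtime_name="$1"
  oc get servingruntime "$runtime_name" \
    -o jsonpath='{.spec.supportedModelFormats[*].name}'
}

# Return success if the wanted format appears in the runtime's list.
runtime_supports_format() {
  runtime_name="$1"
  wanted="$2"
  for fmt in $(list_runtime_formats "$runtime_name"); do
    [ "$fmt" = "$wanted" ] && return 0
  done
  return 1
}
```

For example, `runtime_supports_format kserve-ovms onnx && echo "onnx is supported"` prints a message only when the runtime lists onnx.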

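Create an InferenceService YAML resource, depending on whether the model is stored in a private or a public OCI repository:

For a model stored in a public OCI repository, create an InferenceService YAML file with the following values, replacing <user_name>, <repository_name>, and <tag_name> with values that are specific to your environment: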

```
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sample-isvc-using-oci
spec:
  predictor:
    model:
      runtime: kserve-ovms # Ensure this matches the name of the ServingRuntime resource
      modelFormat:
        name: onnx
      storageUri: oci://quay.io/<user_name>/<repository_name>:<tag_name>
      resources:
        requests:
          memory: 500Mi
          cpu: 100m
          # nvidia.com/gpu: "1" # Only required if you have GPUs available and the model and runtime will use it
        limits:
          memory: 4Gi
          cpu: 500m
          # nvidia.com/gpu: "1" # Only required if you have GPUs available and the model and runtime will use it
```
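If you prefer to generate the file rather than edit the placeholders by hand, the substitution can be sketched as a small shell function. This is only an illustration: render_public_isvc is a hypothetical helper name, the field values mirror the public-repository example, and the commented-out GPU lines are omitted.

```shell
# Render the public-repository InferenceService manifest to stdout.
# render_public_isvc is a hypothetical helper; arguments are the
# user name, repository name, and tag of the OCI model image on quay.io.
render_public_isvc() {
  user_name="$1"; repository_name="$2"; tag_name="$3"
  cat <<EOF
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sample-isvc-using-oci
spec:
  predictor:
    model:
      runtime: kserve-ovms
      modelFormat:
        name: onnx
      storageUri: oci://quay.io/${user_name}/${repository_name}:${tag_name}
      resources:
        requests:
          memory: 500Mi
          cpu: 100m
        limits:
          memory: 4Gi
          cpu: 500m
EOF
}
```

A call such as `render_public_isvc <user_name> <repository_name> <tag_name> > isvc-public.yaml` writes the manifest to a file; isvc-public.yaml is an arbitrary file name chosen for this sketch.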

For a model stored in a private OCI repository, create an InferenceService YAML file that specifies your pull secret in the spec.predictor.imagePullSecrets field, as shown in the following example:

```
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sample-isvc-using-private-oci
spec:
  predictor:
    model:
      runtime: kserve-ovms # Ensure this matches the name of the ServingRuntime resource
      modelFormat:
        name: onnx
      storageUri: oci://quay.io/<user_name>/<repository_name>:<tag_name>
      resources:
        requests:
          memory: 500Mi
          cpu: 100m
          # nvidia.com/gpu: "1" # Only required if you have GPUs available and the model and runtime will use it
        limits:
          memory: 4Gi
          cpu: 500m
          # nvidia.com/gpu: "1" # Only required if you have GPUs available and the model and runtime will use it
    imagePullSecrets: # Specify image pull secrets to use for fetching container images, including OCI model images
    - name: <pull-secret-name>
```
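In the private-repository case, the secret named under imagePullSecrets has to exist in the same project as the InferenceService. One possible way to create it, shown here only as a sketch, is the oc create secret docker-registry command. The helper name create_pull_secret and its argument order are assumptions made for illustration. Note that a password passed on the command line can be visible to other users on the same machine.

```shell
# Create a registry pull secret in the current project.
# create_pull_secret is a hypothetical helper; the secret name must match
# the name listed under spec.predictor.imagePullSecrets.
create_pull_secret() {
  secret_name="$1"; registry="$2"; username="$3"; password="$4"
  oc create secret docker-registry "$secret_name" \
    --docker-server="$registry" \
    --docker-username="$username" \
    --docker-password="$password"
}
```

For example, `create_pull_secret <pull-secret-name> quay.io "$REGISTRY_USER" "$REGISTRY_TOKEN"`, where REGISTRY_USER and REGISTRY_TOKEN are placeholder environment variables that you would set yourself.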

After you create the InferenceService resource, KServe deploys the model stored in the OCI image that the storageUri field refers to.
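The procedure does not show the command that creates the resource from the YAML file. A minimal sketch, assuming the manifest was saved to a local file, is to apply it with oc apply and then wait for the Ready condition with oc wait. The helper name deploy_isvc and the 300s default timeout are illustrative choices, not values from this procedure.

```shell
# Apply an InferenceService manifest, then block until KServe reports Ready.
# deploy_isvc and the 300s default timeout are illustrative assumptions.
deploy_isvc() {
  manifest_file="$1"; isvc_name="$2"; timeout="${3:-300s}"
  # Create or update the resource in the current project.
  oc apply -f "$manifest_file" || return 1
  # Wait for the Ready condition, failing if the timeout elapses first.
  oc wait --for=condition=Ready "inferenceservice/$isvc_name" --timeout="$timeout"
}
```

For example, `deploy_isvc isvc-public.yaml sample-isvc-using-oci` applies a file named isvc-public.yaml (a hypothetical file name) and waits up to five minutes.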

Verification

Check the status of the deployment:

```
oc get inferenceservice
```

The command should return output that includes information such as the URL of the deployed model and its readiness state.
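To read those two values individually, for example in a script, you can query the resource status with a JSONPath expression. The following sketch assumes the standard KServe status fields status.url and the Ready entry in status.conditions. The helper names are hypothetical.

```shell
# Print the URL that KServe reports for an InferenceService.
# isvc_url and isvc_ready_status are hypothetical helper names.
isvc_url() {
  oc get inferenceservice "$1" -o jsonpath='{.status.url}'
}

# Print the status of the Ready condition: True, False, or Unknown.
isvc_ready_status() {
  oc get inferenceservice "$1" \
    -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}'
}
```

For example, `isvc_ready_status sample-isvc-using-oci` prints True once the model is ready to serve requests.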
