Chapter 2. Model serving options for RHEL AI


When deploying models with Red Hat Enterprise Linux AI and Red Hat AI Inference Server, you can choose from different approaches to supply a model for inference serving. Understanding the differences between these approaches helps you select the right one for your deployment scenario.

2.1. Hugging Face models

Hugging Face models are the recommended approach for RHEL AI deployments that use the Red Hat AI Inference Server systemd Quadlet service. With this approach, you can either download models directly from Hugging Face Hub at runtime, or pre-download models to the local file system for offline serving.

Use Hugging Face models in the following scenarios:

  • You are deploying RHEL AI using the Red Hat AI Inference Server systemd Quadlet service.
  • You want to serve models from Hugging Face Hub (online mode) or from a local directory (offline mode).
  • You want to use Red Hat AI Model Optimization Toolkit to quantize and compress models before serving.
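As a sketch of the offline workflow described above, the following commands pre-download a model from Hugging Face Hub and then serve it from the local file system. The model ID and target directory are illustrative assumptions, not values mandated by RHEL AI; adjust them for your deployment.

```shell
# Pre-download the model weights to a local directory
# (requires the huggingface_hub CLI). The model ID below is an example.
huggingface-cli download ibm-granite/granite-3.1-8b-instruct \
  --local-dir /var/lib/models/granite-3.1-8b-instruct

# Serve the local directory in offline mode so the server
# never contacts Hugging Face Hub at runtime.
HF_HUB_OFFLINE=1 vllm serve /var/lib/models/granite-3.1-8b-instruct \
  --port 8000
```

Setting HF_HUB_OFFLINE=1 makes the server fail fast if the weights are missing locally, rather than silently attempting a network download.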

2.2. ModelCar container images

ModelCar container images are OCI-compliant container images that package language models as standard container images. You can pull ModelCar images from registry.redhat.io using podman pull and mount them directly into the Red Hat AI Inference Server vLLM container by using the --mount type=image option.

Note

For a list of available ModelCar container images, see ModelCar container images.

Use ModelCar container images in the following scenarios:

  • You are running Red Hat AI Inference Server with Podman directly, outside of the RHEL AI Quadlet service.
  • You want to package and distribute models as container images.
  • You want a container-native workflow for model distribution and versioning.

For example, to pull and run a ModelCar image with Podman:

  • Pull the ModelCar image:

    $ podman pull registry.redhat.io/rhelai1/modelcar-granite-8b-code-instruct:1.4
  • Serve the ModelCar container image for inferencing:

    $ podman run --rm -it \
      --device nvidia.com/gpu=all \
      --security-opt=label=disable \
      --shm-size=4g \
      --userns=keep-id:uid=1001 \
      -p 8000:8000 \
      -e HF_HUB_OFFLINE=1 \
      -e TRANSFORMERS_OFFLINE=1 \
      --mount type=image,source=registry.redhat.io/rhelai1/modelcar-granite-8b-code-instruct:1.4,destination=/model \
      registry.redhat.io/rhaii-early-access/vllm-cuda-rhel9:3.4.0-ea.1 \
      --model /model/models \
      --port 8000
    Note

    The --device nvidia.com/gpu=all option is specific to NVIDIA GPUs. If you are using AMD ROCm GPUs, use --device /dev/kfd --device /dev/dri instead and set --tensor-parallel-size to match the number of available GPUs. For the complete AMD ROCm deployment procedure, see Serving and inferencing with Podman using AMD ROCm AI accelerators.
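After the container starts, you can verify the deployment by sending a request to the OpenAI-compatible API that vLLM exposes. In this sketch, the model name matches the --model path passed to the container above; the prompt and token limit are arbitrary example values.

```shell
# Send a completion request to the OpenAI-compatible endpoint.
# The "model" field must match the --model path inside the container.
curl -s http://localhost:8000/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "/model/models",
        "prompt": "Write a hello world function in Python.",
        "max_tokens": 100
      }'
```

You can also list the models the server is currently serving with a GET request to http://localhost:8000/v1/models.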

2.3. OCI artifact images

OCI artifact images use the OCI artifact specification to distribute model weights as container registry artifacts rather than as container images. OCI artifact images are distinct from ModelCar container images and require different tooling.

Important

OCI artifact images are designed for use with Red Hat OpenShift AI on OpenShift Container Platform, where the model serving infrastructure handles artifact retrieval natively. If you are using Red Hat AI Inference Server with Podman on RHEL AI, use either Hugging Face models or ModelCar container images instead.

Use OCI artifact images in the following scenarios:

  • You are deploying models on OpenShift Container Platform with Red Hat OpenShift AI.
  • Your deployment platform supports OCI artifact retrieval natively.
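Because OCI artifacts are not container images, podman pull cannot retrieve them; artifact-aware tooling is required. As an illustration only, a generic OCI artifact client such as oras can fetch artifact contents from a registry. The registry path and tag below are hypothetical, not real Red Hat artifact references.

```shell
# Hypothetical artifact reference -- replace with a real one
# published for your platform. oras downloads the artifact's
# files into the given output directory.
oras pull registry.example.com/rhelai/granite-model-artifact:1.0 \
  --output ./model
```

On OpenShift AI, this retrieval step is handled natively by the model serving infrastructure, so manual pulls like this are typically unnecessary there.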