Chapter 1. About OCI-compliant model containers
You can serve OCI-compliant models for inference in Red Hat AI Inference Server. Storing models in OCI-compliant model containers (or modelcars) is an alternative to S3 or URI-based storage for language models. OCI model images let you distribute models through container registries by using the same versioning, caching, security, and distribution infrastructure that you already have for containers.
Modelcar containers start faster because they avoid repeated model downloads. They also use less disk space and perform better when images are pre-fetched. You can store modelcar containers in standard container registries alongside application containers, which enables unified model versioning and distribution workflows.
To deploy a language model in a modelcar container, first package the model in an OCI container image, and then deploy that container image in the cluster.
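The packaging step can be illustrated with a minimal Python sketch that renders the text of a Containerfile for a modelcar image. This is an illustration under stated assumptions, not the documented procedure: the helper name `render_containerfile`, the base image `registry.example.com/base/minimal:latest`, the local directory `./my-model`, the `/models` target path, and the non-root user ID are all hypothetical placeholders.

```python
def render_containerfile(model_dir: str, base_image: str, target_dir: str = "/models") -> str:
    """Return Containerfile text that copies a local model directory into an image.

    All arguments are illustrative; adjust them to match your own
    base image and the path your serving runtime expects.
    """
    lines = [
        # Start from a small base image (hypothetical name passed by the caller).
        f"FROM {base_image}",
        # Copy the model files from the build context into the image.
        f"COPY {model_dir} {target_dir}",
        # Run as a non-root user; the ID 1001 is an assumed convention.
        "USER 1001",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the generated Containerfile so that it can be reviewed or saved.
    print(render_containerfile("./my-model", "registry.example.com/base/minimal:latest"))
```

The generated text could then be saved as a Containerfile, built with a tool such as `podman build`, and pushed to your container registry before you deploy the image in the cluster.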