Chapter 1. About deployments in disconnected environments
You can run inference on large language models with Red Hat AI Inference Server without any connection to the external internet by installing OpenShift Container Platform and configuring a mirrored container image registry in the disconnected environment.
Currently, only NVIDIA CUDA AI accelerators are supported for OpenShift Container Platform in disconnected environments.
Disconnected deployments require setting up a mirror registry to host the container images and Operator catalogs that would normally be pulled from internet-accessible registries. After mirroring the required images, you install the Node Feature Discovery Operator and the NVIDIA GPU Operator from the mirrored sources, and then deploy Red Hat AI Inference Server for inference serving.
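As an illustrative sketch, the mirroring step is commonly driven by an `ImageSetConfiguration` file consumed by the `oc-mirror` OpenShift CLI plugin. The registry hostname, catalog versions, and exact package names below are assumptions for illustration; confirm the package names against the catalogs you mirror.

```yaml
# Illustrative ImageSetConfiguration for the oc-mirror plugin.
# The mirror registry URL and catalog index versions are assumptions;
# verify Operator package names in your catalog before mirroring.
kind: ImageSetConfiguration
apiVersion: mirror.openshift.io/v1alpha2
storageConfig:
  registry:
    imageURL: mirror.example.com:8443/oc-mirror-metadata   # assumed mirror registry
mirror:
  operators:
    - catalog: registry.redhat.io/redhat/redhat-operator-index:v4.16      # assumed version
      packages:
        - name: nfd                        # Node Feature Discovery Operator
    - catalog: registry.redhat.io/redhat/certified-operator-index:v4.16   # assumed version
      packages:
        - name: gpu-operator-certified     # NVIDIA GPU Operator
```

With a file like this in place, a command of the form `oc-mirror --config imageset-config.yaml docker://mirror.example.com:8443` pushes the images to the mirror registry and generates catalog source and image content policy manifests to apply to the disconnected cluster.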