Preface
You can serve large language models with Red Hat AI Inference Server without any connection to the internet by installing OpenShift Container Platform and configuring a mirrored container image registry in the disconnected environment.
Currently, only NVIDIA accelerators are supported in disconnected environments on OpenShift Container Platform.