1.2. Hardware requirements for inference serving Granite models
The following tables list the minimum hardware requirements for inference serving a model on Red Hat Enterprise Linux AI.
1.2.1. Bare metal
| Hardware vendor | Supported accelerators (GPUs) | Minimum Aggregate GPU Memory | Recommended additional disk storage |
|---|---|---|---|
| NVIDIA | A100 | 80 GB | 1 TB |
| NVIDIA | H100 | 80 GB | 1 TB |
| NVIDIA | H200 | 141 GB | 1 TB |
| NVIDIA | GH200 (Technology Preview) | 192 GB | 1 TB |
| NVIDIA | L40S | 48 GB | 1 TB |
| NVIDIA | L4 | 24 GB | 1 TB |
| AMD | MI300X | 192 GB | 1 TB |
| Intel | Gaudi 3 (Technology Preview) | 128 GB | 1 TB |
1.2.2. Amazon Web Services (AWS)
| Hardware vendor | Supported accelerators (GPUs) | Minimum Aggregate GPU Memory | AWS Instance family | Recommended additional disk storage |
|---|---|---|---|---|
| NVIDIA | A100 | 40 GB | P4d series | 1 TB |
| NVIDIA | H100 | 80 GB | P5 series | 1 TB |
| NVIDIA | L40S | 48 GB | G6e series | 1 TB |
| NVIDIA | L4 | 24 GB | G6 series | 1 TB |
1.2.3. IBM Cloud
| Hardware vendor | Supported accelerators (GPUs) | Minimum Aggregate GPU Memory | IBM Cloud Instance family | Recommended additional disk storage |
|---|---|---|---|---|
| NVIDIA | L4 | 24 GB | gx3 series | 1 TB |
| NVIDIA | L40S | 48 GB | gx3 series | 1 TB |
| NVIDIA | A100 | 80 GB | gx3 series | 1 TB |
| NVIDIA | H100 | 80 GB | gx3 series | 1 TB |
| NVIDIA | H200 | 141 GB | gx3 series | 1 TB |
| AMD | MI300X | 192 GB | gx3 series | 1 TB |
| Intel | Gaudi 3 (Technology Preview) | 128 GB | gx3 series | 1 TB |
1.2.4. Azure
| Hardware vendor | Supported accelerators (GPUs) | Minimum Aggregate GPU Memory | Azure Instance family | Recommended additional disk storage |
|---|---|---|---|---|
| NVIDIA | A100 | 80 GB | ND series | 1 TB |
| NVIDIA | H100 | 80 GB | ND series | 1 TB |
| AMD | MI300X | 192 GB | ND series | 1 TB |
1.2.5. Google Cloud Platform (GCP)
| Hardware vendor | Supported accelerators (GPUs) | Minimum Aggregate GPU Memory | GCP Instance family | Recommended additional disk storage |
|---|---|---|---|---|
| NVIDIA | A100 | 40 GB | A2 series | 1 TB |
| NVIDIA | H100 | 80 GB | A3 series | 1 TB |
| NVIDIA | 4xL4 | 96 GB | G2 series | 1 TB |
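
The "4xL4" row suggests that the memory column is an aggregate across all accelerators in a machine: four L4 cards at 24 GB each give 96 GB. The following is a minimal sketch of that arithmetic, not part of any Red Hat tooling. The function names are hypothetical, and the 24 GB per-card figure is assumed from the L4 rows in the other tables.

```python
def aggregate_gpu_memory_gb(gpu_count: int, memory_per_gpu_gb: int) -> int:
    """Return the total accelerator memory across all GPUs in one machine, in GB."""
    if gpu_count < 1 or memory_per_gpu_gb < 1:
        raise ValueError("gpu_count and memory_per_gpu_gb must be positive")
    return gpu_count * memory_per_gpu_gb


def meets_minimum(gpu_count: int, memory_per_gpu_gb: int, minimum_gb: int) -> bool:
    """Check a machine's aggregate GPU memory against a minimum taken from the tables."""
    return aggregate_gpu_memory_gb(gpu_count, memory_per_gpu_gb) >= minimum_gb


if __name__ == "__main__":
    # Four L4 cards (assumed 24 GB each) compared with the 96 GB G2 series row.
    print(aggregate_gpu_memory_gb(4, 24))
    print(meets_minimum(4, 24, 96))
    # A single L4 compared with the same 96 GB figure.
    print(meets_minimum(1, 24, 96))
```

The same check can be applied to any row by substituting the card count, the per-card memory, and the minimum listed for that platform.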