Chapter 4. Configuring multi-architecture compute machines on an OpenShift cluster
4.1. About clusters with multi-architecture compute machines
An OpenShift Container Platform cluster with multi-architecture compute machines is a cluster that supports compute machines with different architectures. Clusters with multi-architecture compute machines are available only on Amazon Web Services (AWS) or Microsoft Azure installer-provisioned infrastructures and bare metal, IBM Power®, and IBM Z® user-provisioned infrastructures with 64-bit x86 control plane machines.
When there are nodes with multiple architectures in your cluster, the architecture of your image must be consistent with the architecture of the node. You need to ensure that the pod is assigned to the node with the appropriate architecture and that it matches the image architecture. For more information on assigning pods to nodes, see Assigning pods to nodes.
The Cluster Samples Operator is not supported on clusters with multi-architecture compute machines. Your cluster can be created without this capability. For more information, see Cluster capabilities.
For information on migrating your single-architecture cluster to a cluster that supports multi-architecture compute machines, see Migrating to a cluster with multi-architecture compute machines.
4.1.1. Configuring your cluster with multi-architecture compute machines
To create a cluster with multi-architecture compute machines with different installation options and platforms, you can use the documentation in the following table:
Documentation section | Platform | User-provisioned installation | Installer-provisioned installation | Control Plane | Compute node |
---|---|---|---|---|---|
Creating a cluster with multi-architecture compute machines on Azure | Microsoft Azure | ✓ |
|
| |
Creating a cluster with multi-architecture compute machines on AWS | Amazon Web Services (AWS) | ✓ |
|
| |
Creating a cluster with multi-architecture compute machines on GCP | Google Cloud Platform (GCP) | ✓ |
|
| |
Creating a cluster with multi-architecture compute machines on bare metal, IBM Power, or IBM Z | Bare metal | ✓ |
|
| |
IBM Power | ✓ |
|
| ||
IBM Z | ✓ |
|
| ||
Creating a cluster with multi-architecture compute machines on IBM Z® and IBM® LinuxONE with z/VM | IBM Z® and IBM® LinuxONE | ✓ |
|
| |
IBM Z® and IBM® LinuxONE | ✓ |
|
| ||
Creating a cluster with multi-architecture compute machines on IBM Power® | IBM Power® | ✓ |
|
|
Autoscaling from zero is currently not supported on Google Cloud Platform (GCP).
4.2. Creating a cluster with multi-architecture compute machine on Azure
To deploy an Azure cluster with multi-architecture compute machines, you must first create a single-architecture Azure installer-provisioned cluster that uses the multi-architecture installer binary. For more information on Azure installations, see Installing a cluster on Azure with customizations. You can then add an ARM64 compute machine set to your cluster to create a cluster with multi-architecture compute machines.
The following procedures explain how to generate an ARM64 boot image and create an Azure compute machine set that uses the ARM64 boot image. This adds ARM64 compute nodes to your cluster and deploys the amount of ARM64 virtual machines (VM) that you need.
4.2.1. Verifying cluster compatibility
Before you can start adding compute nodes of different architectures to your cluster, you must verify that your cluster is multi-architecture compatible.
Prerequisites
-
You installed the OpenShift CLI (
oc
)
Procedure
You can check that your cluster uses the architecture payload by running the following command:
$ oc adm release info -o jsonpath="{ .metadata.metadata}"
Verification
If you see the following output, then your cluster is using the multi-architecture payload:
{ "release.openshift.io/architecture": "multi", "url": "https://access.redhat.com/errata/<errata_version>" }
You can then begin adding multi-arch compute nodes to your cluster.
If you see the following output, then your cluster is not using the multi-architecture payload:
{ "url": "https://access.redhat.com/errata/<errata_version>" }
ImportantTo migrate your cluster so the cluster supports multi-architecture compute machines, follow the procedure in Migrating to a cluster with multi-architecture compute machines.
4.2.2. Creating an ARM64 boot image using the Azure image gallery
The following procedure describes how to manually generate an ARM64 boot image.
Prerequisites
-
You installed the Azure CLI (
az
). - You created a single-architecture Azure installer-provisioned cluster with the multi-architecture installer binary.
Procedure
Log in to your Azure account:
$ az login
Create a storage account and upload the
arm64
virtual hard disk (VHD) to your storage account. The OpenShift Container Platform installation program creates a resource group, however, the boot image can also be uploaded to a custom named resource group:$ az storage account create -n ${STORAGE_ACCOUNT_NAME} -g ${RESOURCE_GROUP} -l westus --sku Standard_LRS 1
- 1
- The
westus
object is an example region.
Create a storage container using the storage account you generated:
$ az storage container create -n ${CONTAINER_NAME} --account-name ${STORAGE_ACCOUNT_NAME}
You must use the OpenShift Container Platform installation program JSON file to extract the URL and
aarch64
VHD name:Extract the
URL
field and set it toRHCOS_VHD_ORIGIN_URL
as the file name by running the following command:$ RHCOS_VHD_ORIGIN_URL=$(oc -n openshift-machine-config-operator get configmap/coreos-bootimages -o jsonpath='{.data.stream}' | jq -r '.architectures.aarch64."rhel-coreos-extensions"."azure-disk".url')
Extract the
aarch64
VHD name and set it toBLOB_NAME
as the file name by running the following command:$ BLOB_NAME=rhcos-$(oc -n openshift-machine-config-operator get configmap/coreos-bootimages -o jsonpath='{.data.stream}' | jq -r '.architectures.aarch64."rhel-coreos-extensions"."azure-disk".release')-azure.aarch64.vhd
Generate a shared access signature (SAS) token. Use this token to upload the RHCOS VHD to your storage container with the following commands:
$ end=`date -u -d "30 minutes" '+%Y-%m-%dT%H:%MZ'`
$ sas=`az storage container generate-sas -n ${CONTAINER_NAME} --account-name ${STORAGE_ACCOUNT_NAME} --https-only --permissions dlrw --expiry $end -o tsv`
Copy the RHCOS VHD into the storage container:
$ az storage blob copy start --account-name ${STORAGE_ACCOUNT_NAME} --sas-token "$sas" \ --source-uri "${RHCOS_VHD_ORIGIN_URL}" \ --destination-blob "${BLOB_NAME}" --destination-container ${CONTAINER_NAME}
You can check the status of the copying process with the following command:
$ az storage blob show -c ${CONTAINER_NAME} -n ${BLOB_NAME} --account-name ${STORAGE_ACCOUNT_NAME} | jq .properties.copy
Example output
{ "completionTime": null, "destinationSnapshot": null, "id": "1fd97630-03ca-489a-8c4e-cfe839c9627d", "incrementalCopy": null, "progress": "17179869696/17179869696", "source": "https://rhcos.blob.core.windows.net/imagebucket/rhcos-411.86.202207130959-0-azure.aarch64.vhd", "status": "success", 1 "statusDescription": null }
- 1
- If the status parameter displays the
success
object, the copying process is complete.
Create an image gallery using the following command:
$ az sig create --resource-group ${RESOURCE_GROUP} --gallery-name ${GALLERY_NAME}
Use the image gallery to create an image definition. In the following example command,
rhcos-arm64
is the name of the image definition.$ az sig image-definition create --resource-group ${RESOURCE_GROUP} --gallery-name ${GALLERY_NAME} --gallery-image-definition rhcos-arm64 --publisher RedHat --offer arm --sku arm64 --os-type linux --architecture Arm64 --hyper-v-generation V2
To get the URL of the VHD and set it to
RHCOS_VHD_URL
as the file name, run the following command:$ RHCOS_VHD_URL=$(az storage blob url --account-name ${STORAGE_ACCOUNT_NAME} -c ${CONTAINER_NAME} -n "${BLOB_NAME}" -o tsv)
Use the
RHCOS_VHD_URL
file, your storage account, resource group, and image gallery to create an image version. In the following example,1.0.0
is the image version.$ az sig image-version create --resource-group ${RESOURCE_GROUP} --gallery-name ${GALLERY_NAME} --gallery-image-definition rhcos-arm64 --gallery-image-version 1.0.0 --os-vhd-storage-account ${STORAGE_ACCOUNT_NAME} --os-vhd-uri ${RHCOS_VHD_URL}
Your
arm64
boot image is now generated. You can access the ID of your image with the following command:$ az sig image-version show -r $GALLERY_NAME -g $RESOURCE_GROUP -i rhcos-arm64 -e 1.0.0
The following example image ID is used in the
recourseID
parameter of the compute machine set:Example
resourceID
/resourceGroups/${RESOURCE_GROUP}/providers/Microsoft.Compute/galleries/${GALLERY_NAME}/images/rhcos-arm64/versions/1.0.0
4.2.3. Adding a multi-architecture compute machine set to your cluster
To add ARM64 compute nodes to your cluster, you must create an Azure compute machine set that uses the ARM64 boot image. To create your own custom compute machine set on Azure, see "Creating a compute machine set on Azure".
Prerequisites
-
You installed the OpenShift CLI (
oc
).
Procedure
Create a compute machine set and modify the
resourceID
andvmSize
parameters with the following command. This compute machine set will control thearm64
worker nodes in your cluster:$ oc create -f arm64-machine-set-0.yaml
Sample YAML compute machine set with
arm64
boot imageapiVersion: machine.openshift.io/v1beta1 kind: MachineSet metadata: labels: machine.openshift.io/cluster-api-cluster: <infrastructure_id> machine.openshift.io/cluster-api-machine-role: worker machine.openshift.io/cluster-api-machine-type: worker name: <infrastructure_id>-arm64-machine-set-0 namespace: openshift-machine-api spec: replicas: 2 selector: matchLabels: machine.openshift.io/cluster-api-cluster: <infrastructure_id> machine.openshift.io/cluster-api-machineset: <infrastructure_id>-arm64-machine-set-0 template: metadata: labels: machine.openshift.io/cluster-api-cluster: <infrastructure_id> machine.openshift.io/cluster-api-machine-role: worker machine.openshift.io/cluster-api-machine-type: worker machine.openshift.io/cluster-api-machineset: <infrastructure_id>-arm64-machine-set-0 spec: lifecycleHooks: {} metadata: {} providerSpec: value: acceleratedNetworking: true apiVersion: machine.openshift.io/v1beta1 credentialsSecret: name: azure-cloud-credentials namespace: openshift-machine-api image: offer: "" publisher: "" resourceID: /resourceGroups/${RESOURCE_GROUP}/providers/Microsoft.Compute/galleries/${GALLERY_NAME}/images/rhcos-arm64/versions/1.0.0 1 sku: "" version: "" kind: AzureMachineProviderSpec location: <region> managedIdentity: <infrastructure_id>-identity networkResourceGroup: <infrastructure_id>-rg osDisk: diskSettings: {} diskSizeGB: 128 managedDisk: storageAccountType: Premium_LRS osType: Linux publicIP: false publicLoadBalancer: <infrastructure_id> resourceGroup: <infrastructure_id>-rg subnet: <infrastructure_id>-worker-subnet userDataSecret: name: worker-user-data vmSize: Standard_D4ps_v5 2 vnet: <infrastructure_id>-vnet zone: "<zone>"
Verification
Verify that the new ARM64 machines are running by entering the following command:
$ oc get machineset -n openshift-machine-api
Example output
NAME DESIRED CURRENT READY AVAILABLE AGE <infrastructure_id>-arm64-machine-set-0 2 2 2 2 10m
You can check that the nodes are ready and scheduable with the following command:
$ oc get nodes
Additional resources
4.3. Creating a cluster with multi-architecture compute machines on AWS
To create an AWS cluster with multi-architecture compute machines, you must first create a single-architecture AWS installer-provisioned cluster with the multi-architecture installer binary. For more information on AWS installations, refer to Installing a cluster on AWS with customizations. You can then add a ARM64 compute machine set to your AWS cluster.
4.3.1. Verifying cluster compatibility
Before you can start adding compute nodes of different architectures to your cluster, you must verify that your cluster is multi-architecture compatible.
Prerequisites
-
You installed the OpenShift CLI (
oc
)
Procedure
You can check that your cluster uses the architecture payload by running the following command:
$ oc adm release info -o jsonpath="{ .metadata.metadata}"
Verification
If you see the following output, then your cluster is using the multi-architecture payload:
{ "release.openshift.io/architecture": "multi", "url": "https://access.redhat.com/errata/<errata_version>" }
You can then begin adding multi-arch compute nodes to your cluster.
If you see the following output, then your cluster is not using the multi-architecture payload:
{ "url": "https://access.redhat.com/errata/<errata_version>" }
ImportantTo migrate your cluster so the cluster supports multi-architecture compute machines, follow the procedure in Migrating to a cluster with multi-architecture compute machines.
4.3.2. Adding an ARM64 compute machine set to your cluster
To configure a cluster with multi-architecture compute machines, you must create a AWS ARM64 compute machine set. This adds ARM64 compute nodes to your cluster so that your cluster has multi-architecture compute machines.
Prerequisites
-
You installed the OpenShift CLI (
oc
). - You used the installation program to create an AMD64 single-architecture AWS cluster with the multi-architecture installer binary.
Procedure
Create and modify a compute machine set, this will control the ARM64 compute nodes in your cluster.
$ oc create -f aws-arm64-machine-set-0.yaml
Sample YAML compute machine set to deploy an ARM64 compute node
apiVersion: machine.openshift.io/v1beta1 kind: MachineSet metadata: labels: machine.openshift.io/cluster-api-cluster: <infrastructure_id> 1 name: <infrastructure_id>-aws-arm64-machine-set-0 2 namespace: openshift-machine-api spec: replicas: 1 selector: matchLabels: machine.openshift.io/cluster-api-cluster: <infrastructure_id> 3 machine.openshift.io/cluster-api-machineset: <infrastructure_id>-<role>-<zone> 4 template: metadata: labels: machine.openshift.io/cluster-api-cluster: <infrastructure_id> machine.openshift.io/cluster-api-machine-role: <role> 5 machine.openshift.io/cluster-api-machine-type: <role> 6 machine.openshift.io/cluster-api-machineset: <infrastructure_id>-<role>-<zone> 7 spec: metadata: labels: node-role.kubernetes.io/<role>: "" providerSpec: value: ami: id: ami-02a574449d4f4d280 8 apiVersion: awsproviderconfig.openshift.io/v1beta1 blockDevices: - ebs: iops: 0 volumeSize: 120 volumeType: gp2 credentialsSecret: name: aws-cloud-credentials deviceIndex: 0 iamInstanceProfile: id: <infrastructure_id>-worker-profile 9 instanceType: m6g.xlarge 10 kind: AWSMachineProviderConfig placement: availabilityZone: us-east-1a 11 region: <region> 12 securityGroups: - filters: - name: tag:Name values: - <infrastructure_id>-worker-sg 13 subnet: filters: - name: tag:Name values: - <infrastructure_id>-private-<zone> tags: - name: kubernetes.io/cluster/<infrastructure_id> 14 value: owned - name: <custom_tag_name> value: <custom_tag_value> userDataSecret: name: worker-user-data
- 1 2 3 9 13 14
- Specify the infrastructure ID that is based on the cluster ID that you set when you provisioned the cluster. If you have the OpenShift CLI installed, you can obtain the infrastructure ID by running the following command:
$ oc get -o jsonpath=‘{.status.infrastructureName}{“\n”}’ infrastructure cluster
- 4 7
- Specify the infrastructure ID, role node label, and zone.
- 5 6
- Specify the role node label to add.
- 8
- Specify an ARM64 supported Red Hat Enterprise Linux CoreOS (RHCOS) Amazon Machine Image (AMI) for your AWS zone for your OpenShift Container Platform nodes.
$ oc get configmap/coreos-bootimages \ -n openshift-machine-config-operator \ -o jsonpath='{.data.stream}' | jq \ -r '.architectures.<arch>.images.aws.regions."<region>".image'
- 10
- Specify an ARM64 supported machine type. For more information, refer to "Tested instance types for AWS 64-bit ARM"
- 11
- Specify the zone, for example
us-east-1a
. Ensure that the zone you select offers 64-bit ARM machines. - 12
- Specify the region, for example,
us-east-1
. Ensure that the zone you select offers 64-bit ARM machines.
Verification
View the list of compute machine sets by entering the following command:
$ oc get machineset -n openshift-machine-api
You can then see your created ARM64 machine set.
Example output
NAME DESIRED CURRENT READY AVAILABLE AGE <infrastructure_id>-aws-arm64-machine-set-0 2 2 2 2 10m
You can check that the nodes are ready and scheduable with the following command:
$ oc get nodes
Additional resources
4.4. Creating a cluster with multi-architecture compute machines on GCP
To create a Google Cloud Platform (GCP) cluster with multi-architecture compute machines, you must first create a single-architecture GCP installer-provisioned cluster with the multi-architecture installer binary. For more information on AWS installations, refer to Installing a cluster on GCP with customizations. You can then add ARM64 compute machines sets to your GCP cluster.
Secure booting is currently not supported on ARM64 machines for GCP
4.4.1. Verifying cluster compatibility
Before you can start adding compute nodes of different architectures to your cluster, you must verify that your cluster is multi-architecture compatible.
Prerequisites
-
You installed the OpenShift CLI (
oc
)
Procedure
You can check that your cluster uses the architecture payload by running the following command:
$ oc adm release info -o jsonpath="{ .metadata.metadata}"
Verification
If you see the following output, then your cluster is using the multi-architecture payload:
{ "release.openshift.io/architecture": "multi", "url": "https://access.redhat.com/errata/<errata_version>" }
You can then begin adding multi-arch compute nodes to your cluster.
If you see the following output, then your cluster is not using the multi-architecture payload:
{ "url": "https://access.redhat.com/errata/<errata_version>" }
ImportantTo migrate your cluster so the cluster supports multi-architecture compute machines, follow the procedure in Migrating to a cluster with multi-architecture compute machines.
4.4.2. Adding an ARM64 compute machine set to your GCP cluster
To configure a cluster with multi-architecture compute machines, you must create a GCP ARM64 compute machine set. This adds ARM64 compute nodes to your cluster.
Prerequisites
-
You installed the OpenShift CLI (
oc
). - You used the installation program to create an AMD64 single-architecture AWS cluster with the multi-architecture installer binary.
Procedure
Create and modify a compute machine set, this controls the ARM64 compute nodes in your cluster:
$ oc create -f gcp-arm64-machine-set-0.yaml
Sample GCP YAML compute machine set to deploy an ARM64 compute node
apiVersion: machine.openshift.io/v1beta1 kind: MachineSet metadata: labels: machine.openshift.io/cluster-api-cluster: <infrastructure_id> 1 name: <infrastructure_id>-w-a namespace: openshift-machine-api spec: replicas: 1 selector: matchLabels: machine.openshift.io/cluster-api-cluster: <infrastructure_id> machine.openshift.io/cluster-api-machineset: <infrastructure_id>-w-a template: metadata: creationTimestamp: null labels: machine.openshift.io/cluster-api-cluster: <infrastructure_id> machine.openshift.io/cluster-api-machine-role: <role> 2 machine.openshift.io/cluster-api-machine-type: <role> machine.openshift.io/cluster-api-machineset: <infrastructure_id>-w-a spec: metadata: labels: node-role.kubernetes.io/<role>: "" providerSpec: value: apiVersion: gcpprovider.openshift.io/v1beta1 canIPForward: false credentialsSecret: name: gcp-cloud-credentials deletionProtection: false disks: - autoDelete: true boot: true image: <path_to_image> 3 labels: null sizeGb: 128 type: pd-ssd gcpMetadata: 4 - key: <custom_metadata_key> value: <custom_metadata_value> kind: GCPMachineProviderSpec machineType: n1-standard-4 5 metadata: creationTimestamp: null networkInterfaces: - network: <infrastructure_id>-network subnetwork: <infrastructure_id>-worker-subnet projectID: <project_name> 6 region: us-central1 7 serviceAccounts: - email: <infrastructure_id>-w@<project_name>.iam.gserviceaccount.com scopes: - https://www.googleapis.com/auth/cloud-platform tags: - <infrastructure_id>-worker userDataSecret: name: worker-user-data zone: us-central1-a
- 1
- Specify the infrastructure ID that is based on the cluster ID that you set when you provisioned the cluster. You can obtain the infrastructure ID by running the following command:
$ oc get -o jsonpath='{.status.infrastructureName}{"\n"}' infrastructure cluster
- 2
- Specify the role node label to add.
- 3
- Specify the path to the image that is used in current compute machine sets. You need the project and image name for your path to image.
To access the project and image name, run the following command:
$ oc get configmap/coreos-bootimages \ -n openshift-machine-config-operator \ -o jsonpath='{.data.stream}' | jq \ -r '.architectures.aarch64.images.gcp'
Example output
"gcp": { "release": "415.92.202309142014-0", "project": "rhcos-cloud", "name": "rhcos-415-92-202309142014-0-gcp-aarch64" }
Use the
project
andname
parameters from the output to create the path to image field in your machine set. The path to the image should follow the following format:$ projects/<project>/global/images/<image_name>
- 4
- Optional: Specify custom metadata in the form of a
key:value
pair. For example use cases, see the GCP documentation for setting custom metadata. - 5
- Specify an ARM64 supported machine type. For more information, refer to Tested instance types for GCP on 64-bit ARM infrastructures in "Additional resources".
- 6
- Specify the name of the GCP project that you use for your cluster.
- 7
- Specify the region, for example,
us-central1
. Ensure that the zone you select offers 64-bit ARM machines.
Verification
View the list of compute machine sets by entering the following command:
$ oc get machineset -n openshift-machine-api
You can then see your created ARM64 machine set.
Example output
NAME DESIRED CURRENT READY AVAILABLE AGE <infrastructure_id>-gcp-arm64-machine-set-0 2 2 2 2 10m
You can check that the nodes are ready and scheduable with the following command:
$ oc get nodes
Additional resources
4.5. Creating a cluster with multi-architecture compute machines on bare metal, IBM Power, or IBM Z
To create a cluster with multi-architecture compute machines on bare metal (x86_64
), IBM Power® (ppc64le
), or IBM Z® (s390x
) you must have an existing single-architecture cluster on one of these platforms. Follow the installations procedures for your platform:
- Installing a user provisioned cluster on bare metal. You can then add 64-bit ARM compute machines to your OpenShift Container Platform cluster on bare metal.
-
Installing a cluster on IBM Power®. You can then add
x86_64
compute machines to your OpenShift Container Platform cluster on IBM Power®. -
Installing a cluster on IBM Z® and IBM® LinuxONE. You can then add
x86_64
compute machines to your OpenShift Container Platform cluster on IBM Z® and IBM® LinuxONE.
Before you can add additional compute nodes to your cluster, you must upgrade your cluster to one that uses the multi-architecture payload. For more information on migrating to the multi-architecture payload, see Migrating to a cluster with multi-architecture compute machines.
The following procedures explain how to create a RHCOS compute machine using an ISO image or network PXE booting. This will allow you to add additional nodes to your cluster and deploy a cluster with multi-architecture compute machines.
4.5.1. Verifying cluster compatibility
Before you can start adding compute nodes of different architectures to your cluster, you must verify that your cluster is multi-architecture compatible.
Prerequisites
-
You installed the OpenShift CLI (
oc
)
Procedure
You can check that your cluster uses the architecture payload by running the following command:
$ oc adm release info -o jsonpath="{ .metadata.metadata}"
Verification
If you see the following output, then your cluster is using the multi-architecture payload:
{ "release.openshift.io/architecture": "multi", "url": "https://access.redhat.com/errata/<errata_version>" }
You can then begin adding multi-arch compute nodes to your cluster.
If you see the following output, then your cluster is not using the multi-architecture payload:
{ "url": "https://access.redhat.com/errata/<errata_version>" }
ImportantTo migrate your cluster so the cluster supports multi-architecture compute machines, follow the procedure in Migrating to a cluster with multi-architecture compute machines.
4.5.2. Creating RHCOS machines using an ISO image
You can create more Red Hat Enterprise Linux CoreOS (RHCOS) compute machines for your bare metal cluster by using an ISO image to create the machines.
Prerequisites
- Obtain the URL of the Ignition config file for the compute machines for your cluster. You uploaded this file to your HTTP server during installation.
-
You must have the OpenShift CLI (
oc
) installed.
Procedure
Extract the Ignition config file from the cluster by running the following command:
$ oc extract -n openshift-machine-api secret/worker-user-data-managed --keys=userData --to=- > worker.ign
-
Upload the
worker.ign
Ignition config file you exported from your cluster to your HTTP server. Note the URLs of these files. You can validate that the ignition files are available on the URLs. The following example gets the Ignition config files for the compute node:
$ curl -k http://<HTTP_server>/worker.ign
You can access the ISO image for booting your new machine by running to following command:
RHCOS_VHD_ORIGIN_URL=$(oc -n openshift-machine-config-operator get configmap/coreos-bootimages -o jsonpath='{.data.stream}' | jq -r '.architectures.<architecture>.artifacts.metal.formats.iso.disk.location')
Use the ISO file to install RHCOS on more compute machines. Use the same method that you used when you created machines before you installed the cluster:
- Burn the ISO image to a disk and boot it directly.
- Use ISO redirection with a LOM interface.
Boot the RHCOS ISO image without specifying any options, or interrupting the live boot sequence. Wait for the installer to boot into a shell prompt in the RHCOS live environment.
NoteYou can interrupt the RHCOS installation boot process to add kernel arguments. However, for this ISO procedure you must use the
coreos-installer
command as outlined in the following steps, instead of adding kernel arguments.Run the
coreos-installer
command and specify the options that meet your installation requirements. At a minimum, you must specify the URL that points to the Ignition config file for the node type, and the device that you are installing to:$ sudo coreos-installer install --ignition-url=http://<HTTP_server>/<node_type>.ign <device> --ignition-hash=sha512-<digest> 12
- 1
- You must run the
coreos-installer
command by usingsudo
, because thecore
user does not have the required root privileges to perform the installation. - 2
- The
--ignition-hash
option is required when the Ignition config file is obtained through an HTTP URL to validate the authenticity of the Ignition config file on the cluster node.<digest>
is the Ignition config file SHA512 digest obtained in a preceding step.
NoteIf you want to provide your Ignition config files through an HTTPS server that uses TLS, you can add the internal certificate authority (CA) to the system trust store before running
coreos-installer
.The following example initializes a bootstrap node installation to the
/dev/sda
device. The Ignition config file for the bootstrap node is obtained from an HTTP web server with the IP address 192.168.1.2:$ sudo coreos-installer install --ignition-url=http://192.168.1.2:80/installation_directory/bootstrap.ign /dev/sda --ignition-hash=sha512-a5a2d43879223273c9b60af66b44202a1d1248fc01cf156c46d4a79f552b6bad47bc8cc78ddf0116e80c59d2ea9e32ba53bc807afbca581aa059311def2c3e3b
Monitor the progress of the RHCOS installation on the console of the machine.
ImportantEnsure that the installation is successful on each node before commencing with the OpenShift Container Platform installation. Observing the installation process can also help to determine the cause of RHCOS installation issues that might arise.
- Continue to create more compute machines for your cluster.
4.5.3. Creating RHCOS machines by PXE or iPXE booting
You can create more Red Hat Enterprise Linux CoreOS (RHCOS) compute machines for your bare metal cluster by using PXE or iPXE booting.
Prerequisites
- Obtain the URL of the Ignition config file for the compute machines for your cluster. You uploaded this file to your HTTP server during installation.
-
Obtain the URLs of the RHCOS ISO image, compressed metal BIOS,
kernel
, andinitramfs
files that you uploaded to your HTTP server during cluster installation. - You have access to the PXE booting infrastructure that you used to create the machines for your OpenShift Container Platform cluster during installation. The machines must boot from their local disks after RHCOS is installed on them.
-
If you use UEFI, you have access to the
grub.conf
file that you modified during OpenShift Container Platform installation.
Procedure
Confirm that your PXE or iPXE installation for the RHCOS images is correct.
For PXE:
DEFAULT pxeboot TIMEOUT 20 PROMPT 0 LABEL pxeboot KERNEL http://<HTTP_server>/rhcos-<version>-live-kernel-<architecture> 1 APPEND initrd=http://<HTTP_server>/rhcos-<version>-live-initramfs.<architecture>.img coreos.inst.install_dev=/dev/sda coreos.inst.ignition_url=http://<HTTP_server>/worker.ign coreos.live.rootfs_url=http://<HTTP_server>/rhcos-<version>-live-rootfs.<architecture>.img 2
- 1
- Specify the location of the live
kernel
file that you uploaded to your HTTP server. - 2
- Specify locations of the RHCOS files that you uploaded to your HTTP server. The
initrd
parameter value is the location of the liveinitramfs
file, thecoreos.inst.ignition_url
parameter value is the location of the worker Ignition config file, and thecoreos.live.rootfs_url
parameter value is the location of the liverootfs
file. Thecoreos.inst.ignition_url
andcoreos.live.rootfs_url
parameters only support HTTP and HTTPS.
NoteThis configuration does not enable serial console access on machines with a graphical console. To configure a different console, add one or more
console=
arguments to theAPPEND
line. For example, addconsole=tty0 console=ttyS0
to set the first PC serial port as the primary console and the graphical console as a secondary console. For more information, see How does one set up a serial terminal and/or console in Red Hat Enterprise Linux?.For iPXE (
x86_64
+aarch64
):kernel http://<HTTP_server>/rhcos-<version>-live-kernel-<architecture> initrd=main coreos.live.rootfs_url=http://<HTTP_server>/rhcos-<version>-live-rootfs.<architecture>.img coreos.inst.install_dev=/dev/sda coreos.inst.ignition_url=http://<HTTP_server>/worker.ign 1 2 initrd --name main http://<HTTP_server>/rhcos-<version>-live-initramfs.<architecture>.img 3 boot
- 1
- Specify the locations of the RHCOS files that you uploaded to your HTTP server. The
kernel
parameter value is the location of thekernel
file, theinitrd=main
argument is needed for booting on UEFI systems, thecoreos.live.rootfs_url
parameter value is the location of therootfs
file, and thecoreos.inst.ignition_url
parameter value is the location of the worker Ignition config file. - 2
- If you use multiple NICs, specify a single interface in the
ip
option. For example, to use DHCP on a NIC that is namedeno1
, setip=eno1:dhcp
. - 3
- Specify the location of the
initramfs
file that you uploaded to your HTTP server.
NoteThis configuration does not enable serial console access on machines with a graphical console To configure a different console, add one or more
console=
arguments to thekernel
line. For example, addconsole=tty0 console=ttyS0
to set the first PC serial port as the primary console and the graphical console as a secondary console. For more information, see How does one set up a serial terminal and/or console in Red Hat Enterprise Linux? and "Enabling the serial console for PXE and ISO installation" in the "Advanced RHCOS installation configuration" section.NoteTo network boot the CoreOS
kernel
onaarch64
architecture, you need to use a version of iPXE build with theIMAGE_GZIP
option enabled. SeeIMAGE_GZIP
option in iPXE.For PXE (with UEFI and GRUB as second stage) on
aarch64
:menuentry 'Install CoreOS' { linux rhcos-<version>-live-kernel-<architecture> coreos.live.rootfs_url=http://<HTTP_server>/rhcos-<version>-live-rootfs.<architecture>.img coreos.inst.install_dev=/dev/sda coreos.inst.ignition_url=http://<HTTP_server>/worker.ign 1 2 initrd rhcos-<version>-live-initramfs.<architecture>.img 3 }
- 1
- Specify the locations of the RHCOS files that you uploaded to your HTTP/TFTP server. The
kernel
parameter value is the location of thekernel
file on your TFTP server. Thecoreos.live.rootfs_url
parameter value is the location of therootfs
file, and thecoreos.inst.ignition_url
parameter value is the location of the worker Ignition config file on your HTTP Server. - 2
- If you use multiple NICs, specify a single interface in the
ip
option. For example, to use DHCP on a NIC that is namedeno1
, setip=eno1:dhcp
. - 3
- Specify the location of the
initramfs
file that you uploaded to your TFTP server.
- Use the PXE or iPXE infrastructure to create the required compute machines for your cluster.
4.5.4. Approving the certificate signing requests for your machines
When you add machines to a cluster, two pending certificate signing requests (CSRs) are generated for each machine that you added. You must confirm that these CSRs are approved or, if necessary, approve them yourself. The client requests must be approved first, followed by the server requests.
Prerequisites
- You added machines to your cluster.
Procedure
Confirm that the cluster recognizes the machines:
$ oc get nodes
Example output
NAME STATUS ROLES AGE VERSION master-0 Ready master 63m v1.28.5 master-1 Ready master 63m v1.28.5 master-2 Ready master 64m v1.28.5
The output lists all of the machines that you created.
NoteThe preceding output might not include the compute nodes, also known as worker nodes, until some CSRs are approved.
Review the pending CSRs and ensure that you see the client requests with the
Pending
orApproved
status for each machine that you added to the cluster:$ oc get csr
Example output
NAME AGE REQUESTOR CONDITION csr-8b2br 15m system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending csr-8vnps 15m system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending ...
In this example, two machines are joining the cluster. You might see more approved CSRs in the list.
If the CSRs were not approved, after all of the pending CSRs for the machines you added are in
Pending
status, approve the CSRs for your cluster machines:NoteBecause the CSRs rotate automatically, approve your CSRs within an hour of adding the machines to the cluster. If you do not approve them within an hour, the certificates will rotate, and more than two certificates will be present for each node. You must approve all of these certificates. After the client CSR is approved, the Kubelet creates a secondary CSR for the serving certificate, which requires manual approval. Then, subsequent serving certificate renewal requests are automatically approved by the
machine-approver
if the Kubelet requests a new certificate with identical parameters.NoteFor clusters running on platforms that are not machine API enabled, such as bare metal and other user-provisioned infrastructure, you must implement a method of automatically approving the kubelet serving certificate requests (CSRs). If a request is not approved, then the
oc exec
,oc rsh
, andoc logs
commands cannot succeed, because a serving certificate is required when the API server connects to the kubelet. Any operation that contacts the Kubelet endpoint requires this certificate approval to be in place. The method must watch for new CSRs, confirm that the CSR was submitted by thenode-bootstrapper
service account in thesystem:node
orsystem:admin
groups, and confirm the identity of the node.To approve them individually, run the following command for each valid CSR:
$ oc adm certificate approve <csr_name> 1
- 1
<csr_name>
is the name of a CSR from the list of current CSRs.
To approve all pending CSRs, run the following command:
$ oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs --no-run-if-empty oc adm certificate approve
NoteSome Operators might not become available until some CSRs are approved.
Now that your client requests are approved, you must review the server requests for each machine that you added to the cluster:
$ oc get csr
Example output
NAME AGE REQUESTOR CONDITION csr-bfd72 5m26s system:node:ip-10-0-50-126.us-east-2.compute.internal Pending csr-c57lv 5m26s system:node:ip-10-0-95-157.us-east-2.compute.internal Pending ...
If the remaining CSRs are not approved, and are in the
Pending
status, approve the CSRs for your cluster machines:To approve them individually, run the following command for each valid CSR:
$ oc adm certificate approve <csr_name> 1
- 1
<csr_name>
is the name of a CSR from the list of current CSRs.
To approve all pending CSRs, run the following command:
$ oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs oc adm certificate approve
After all client and server CSRs have been approved, the machines have the
Ready
status. Verify this by running the following command:$ oc get nodes
Example output
NAME STATUS ROLES AGE VERSION master-0 Ready master 73m v1.28.5 master-1 Ready master 73m v1.28.5 master-2 Ready master 74m v1.28.5 worker-0 Ready worker 11m v1.28.5 worker-1 Ready worker 11m v1.28.5
NoteIt can take a few minutes after approval of the server CSRs for the machines to transition to the
Ready
status.
Additional information
- For more information on CSRs, see Certificate Signing Requests.
4.6. Creating a cluster with multi-architecture compute machines on IBM Z and IBM LinuxONE with z/VM
To create a cluster with multi-architecture compute machines on IBM Z® and IBM® LinuxONE (s390x
) with z/VM, you must have an existing single-architecture x86_64
cluster. You can then add s390x
compute machines to your OpenShift Container Platform cluster.
Before you can add s390x
nodes to your cluster, you must upgrade your cluster to one that uses the multi-architecture payload. For more information on migrating to the multi-architecture payload, see Migrating to a cluster with multi-architecture compute machines.
The following procedures explain how to create a RHCOS compute machine using a z/VM instance. This will allow you to add s390x
nodes to your cluster and deploy a cluster with multi-architecture compute machines.
To create an IBM Z® or IBM® LinuxONE (s390x
) cluster with multi-architecture compute machines on x86_64
, follow the instructions for Installing a cluster on IBM Z® and IBM® LinuxONE. You can then add x86_64
compute machines as described in Creating a cluster with multi-architecture compute machines on bare metal, IBM Power, or IBM Z.
4.6.1. Verifying cluster compatibility
Before you can start adding compute nodes of different architectures to your cluster, you must verify that your cluster is multi-architecture compatible.
Prerequisites
-
You installed the OpenShift CLI (
oc
)
Procedure
You can check that your cluster uses the architecture payload by running the following command:
$ oc adm release info -o jsonpath="{ .metadata.metadata}"
Verification
If you see the following output, then your cluster is using the multi-architecture payload:
{ "release.openshift.io/architecture": "multi", "url": "https://access.redhat.com/errata/<errata_version>" }
You can then begin adding multi-arch compute nodes to your cluster.
If you see the following output, then your cluster is not using the multi-architecture payload:
{ "url": "https://access.redhat.com/errata/<errata_version>" }
ImportantTo migrate your cluster so the cluster supports multi-architecture compute machines, follow the procedure in Migrating to a cluster with multi-architecture compute machines.
4.6.2. Creating RHCOS machines on IBM Z with z/VM
You can create more Red Hat Enterprise Linux CoreOS (RHCOS) compute machines running on IBM Z® with z/VM and attach them to your existing cluster.
Prerequisites
- You have a domain name server (DNS) that can perform hostname and reverse lookup for the nodes.
- You have an HTTP or HTTPS server running on your provisioning machine that is accessible to the machines you create.
Procedure
Disable UDP aggregation.
Currently, UDP aggregation is not supported on IBM Z® and is not automatically deactivated on multi-architecture compute clusters with an
x86_64
control plane and additionals390x
compute machines. To ensure that the addtional compute nodes are added to the cluster correctly, you must manually disable UDP aggregation.Create a YAML file
udp-aggregation-config.yaml
with the following content:apiVersion: v1 kind: ConfigMap data: disable-udp-aggregation: "true" metadata: name: udp-aggregation-config namespace: openshift-network-operator
Create the ConfigMap resource by running the following command:
$ oc create -f udp-aggregation-config.yaml
Extract the Ignition config file from the cluster by running the following command:
$ oc extract -n openshift-machine-api secret/worker-user-data-managed --keys=userData --to=- > worker.ign
-
Upload the
worker.ign
Ignition config file you exported from your cluster to your HTTP server. Note the URL of this file. You can validate that the Ignition file is available on the URL. The following example gets the Ignition config file for the compute node:
$ curl -k http://<HTTP_server>/worker.ign
Download the RHEL live
kernel
,initramfs
, androotfs
files by running the following commands:$ curl -LO $(oc -n openshift-machine-config-operator get configmap/coreos-bootimages -o jsonpath='{.data.stream}' \ | jq -r '.architectures.s390x.artifacts.metal.formats.pxe.kernel.location')
$ curl -LO $(oc -n openshift-machine-config-operator get configmap/coreos-bootimages -o jsonpath='{.data.stream}' \ | jq -r '.architectures.s390x.artifacts.metal.formats.pxe.initramfs.location')
$ curl -LO $(oc -n openshift-machine-config-operator get configmap/coreos-bootimages -o jsonpath='{.data.stream}' \ | jq -r '.architectures.s390x.artifacts.metal.formats.pxe.rootfs.location')
-
Move the downloaded RHEL live
kernel
,initramfs
, androotfs
files to an HTTP or HTTPS server that is accessible from the z/VM guest you want to add. Create a parameter file for the z/VM guest. The following parameters are specific for the virtual machine:
Optional: To specify a static IP address, add an
ip=
parameter with the following entries, with each separated by a colon:- The IP address for the machine.
- An empty string.
- The gateway.
- The netmask.
-
The machine host and domain name in the form
hostname.domainname
. Omit this value to let RHCOS decide. - The network interface name. Omit this value to let RHCOS decide.
-
The value
none
.
-
For
coreos.inst.ignition_url=
, specify the URL to theworker.ign
file. Only HTTP and HTTPS protocols are supported. -
For
coreos.live.rootfs_url=
, specify the matching rootfs artifact for thekernel
andinitramfs
you are booting. Only HTTP and HTTPS protocols are supported. For installations on DASD-type disks, complete the following tasks:
-
For
coreos.inst.install_dev=
, specify/dev/dasda
. -
Use
rd.dasd=
to specify the DASD where RHCOS is to be installed. Leave all other parameters unchanged.
The following is an example parameter file,
additional-worker-dasd.parm
:rd.neednet=1 \ console=ttysclp0 \ coreos.inst.install_dev=/dev/dasda \ coreos.live.rootfs_url=http://cl1.provide.example.com:8080/assets/rhcos-live-rootfs.s390x.img \ coreos.inst.ignition_url=http://cl1.provide.example.com:8080/ignition/worker.ign \ ip=172.18.78.2::172.18.78.1:255.255.255.0:::none nameserver=172.18.78.1 \ rd.znet=qeth,0.0.bdf0,0.0.bdf1,0.0.bdf2,layer2=1,portno=0 \ zfcp.allow_lun_scan=0 \ rd.dasd=0.0.3490
Write all options in the parameter file as a single line and make sure that you have no newline characters.
-
For
For installations on FCP-type disks, complete the following tasks:
Use
rd.zfcp=<adapter>,<wwpn>,<lun>
to specify the FCP disk where RHCOS is to be installed. For multipathing, repeat this step for each additional path.NoteWhen you install with multiple paths, you must enable multipathing directly after the installation, not at a later point in time, as this can cause problems.
Set the install device as:
coreos.inst.install_dev=/dev/sda
.NoteIf additional LUNs are configured with NPIV, FCP requires
zfcp.allow_lun_scan=0
. If you must enablezfcp.allow_lun_scan=1
because you use a CSI driver, for example, you must configure your NPIV so that each node cannot access the boot partition of another node.Leave all other parameters unchanged.
ImportantAdditional postinstallation steps are required to fully enable multipathing. For more information, see “Enabling multipathing with kernel arguments on RHCOS" in Postinstallation machine configuration tasks.
The following is an example parameter file,
additional-worker-fcp.parm
for a worker node with multipathing:rd.neednet=1 \ console=ttysclp0 \ coreos.inst.install_dev=/dev/sda \ coreos.live.rootfs_url=http://cl1.provide.example.com:8080/assets/rhcos-live-rootfs.s390x.img \ coreos.inst.ignition_url=http://cl1.provide.example.com:8080/ignition/worker.ign \ ip=172.18.78.2::172.18.78.1:255.255.255.0:::none nameserver=172.18.78.1 \ rd.znet=qeth,0.0.bdf0,0.0.bdf1,0.0.bdf2,layer2=1,portno=0 \ zfcp.allow_lun_scan=0 \ rd.zfcp=0.0.1987,0x50050763070bc5e3,0x4008400B00000000 \ rd.zfcp=0.0.19C7,0x50050763070bc5e3,0x4008400B00000000 \ rd.zfcp=0.0.1987,0x50050763071bc5e3,0x4008400B00000000 \ rd.zfcp=0.0.19C7,0x50050763071bc5e3,0x4008400B00000000
Write all options in the parameter file as a single line and make sure that you have no newline characters.
-
Transfer the
initramfs
,kernel
, parameter files, and RHCOS images to z/VM, for example, by using FTP. For details about how to transfer the files with FTP and boot from the virtual reader, see Installing under Z/VM. Punch the files to the virtual reader of the z/VM guest virtual machine.
See PUNCH in IBM® Documentation.
TipYou can use the CP PUNCH command or, if you use Linux, the vmur command to transfer files between two z/VM guest virtual machines.
- Log in to CMS on the bootstrap machine.
IPL the bootstrap machine from the reader by running the following command:
$ ipl c
See IPL in IBM® Documentation.
4.6.3. Approving the certificate signing requests for your machines
When you add machines to a cluster, two pending certificate signing requests (CSRs) are generated for each machine that you added. You must confirm that these CSRs are approved or, if necessary, approve them yourself. The client requests must be approved first, followed by the server requests.
Prerequisites
- You added machines to your cluster.
Procedure
Confirm that the cluster recognizes the machines:
$ oc get nodes
Example output
NAME STATUS ROLES AGE VERSION master-0 Ready master 63m v1.28.5 master-1 Ready master 63m v1.28.5 master-2 Ready master 64m v1.28.5
The output lists all of the machines that you created.
NoteThe preceding output might not include the compute nodes, also known as worker nodes, until some CSRs are approved.
Review the pending CSRs and ensure that you see the client requests with the
Pending
orApproved
status for each machine that you added to the cluster:$ oc get csr
Example output
NAME AGE REQUESTOR CONDITION csr-8b2br 15m system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending csr-8vnps 15m system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending ...
In this example, two machines are joining the cluster. You might see more approved CSRs in the list.
If the CSRs were not approved, after all of the pending CSRs for the machines you added are in
Pending
status, approve the CSRs for your cluster machines:NoteBecause the CSRs rotate automatically, approve your CSRs within an hour of adding the machines to the cluster. If you do not approve them within an hour, the certificates will rotate, and more than two certificates will be present for each node. You must approve all of these certificates. After the client CSR is approved, the Kubelet creates a secondary CSR for the serving certificate, which requires manual approval. Then, subsequent serving certificate renewal requests are automatically approved by the
machine-approver
if the Kubelet requests a new certificate with identical parameters.NoteFor clusters running on platforms that are not machine API enabled, such as bare metal and other user-provisioned infrastructure, you must implement a method of automatically approving the kubelet serving certificate requests (CSRs). If a request is not approved, then the
oc exec
,oc rsh
, andoc logs
commands cannot succeed, because a serving certificate is required when the API server connects to the kubelet. Any operation that contacts the Kubelet endpoint requires this certificate approval to be in place. The method must watch for new CSRs, confirm that the CSR was submitted by thenode-bootstrapper
service account in thesystem:node
orsystem:admin
groups, and confirm the identity of the node.To approve them individually, run the following command for each valid CSR:
$ oc adm certificate approve <csr_name> 1
- 1
<csr_name>
is the name of a CSR from the list of current CSRs.
To approve all pending CSRs, run the following command:
$ oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs --no-run-if-empty oc adm certificate approve
NoteSome Operators might not become available until some CSRs are approved.
Now that your client requests are approved, you must review the server requests for each machine that you added to the cluster:
$ oc get csr
Example output
NAME AGE REQUESTOR CONDITION csr-bfd72 5m26s system:node:ip-10-0-50-126.us-east-2.compute.internal Pending csr-c57lv 5m26s system:node:ip-10-0-95-157.us-east-2.compute.internal Pending ...
If the remaining CSRs are not approved, and are in the
Pending
status, approve the CSRs for your cluster machines:To approve them individually, run the following command for each valid CSR:
$ oc adm certificate approve <csr_name> 1
- 1
<csr_name>
is the name of a CSR from the list of current CSRs.
To approve all pending CSRs, run the following command:
$ oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs oc adm certificate approve
After all client and server CSRs have been approved, the machines have the
Ready
status. Verify this by running the following command:$ oc get nodes
Example output
NAME STATUS ROLES AGE VERSION master-0 Ready master 73m v1.28.5 master-1 Ready master 73m v1.28.5 master-2 Ready master 74m v1.28.5 worker-0 Ready worker 11m v1.28.5 worker-1 Ready worker 11m v1.28.5
NoteIt can take a few minutes after approval of the server CSRs for the machines to transition to the
Ready
status.
Additional information
- For more information on CSRs, see Certificate Signing Requests.
4.7. Creating a cluster with multi-architecture compute machines on IBM Z and IBM LinuxONE with RHEL KVM
To create a cluster with multi-architecture compute machines on IBM Z® and IBM® LinuxONE (s390x
) with RHEL KVM, you must have an existing single-architecture x86_64
cluster. You can then add s390x
compute machines to your OpenShift Container Platform cluster.
Before you can add s390x
nodes to your cluster, you must upgrade your cluster to one that uses the multi-architecture payload. For more information on migrating to the multi-architecture payload, see Migrating to a cluster with multi-architecture compute machines.
The following procedures explain how to create a RHCOS compute machine using a RHEL KVM instance. This will allow you to add s390x
nodes to your cluster and deploy a cluster with multi-architecture compute machines.
To create an IBM Z® or IBM® LinuxONE (s390x
) cluster with multi-architecture compute machines on x86_64
, follow the instructions for Installing a cluster on IBM Z® and IBM® LinuxONE. You can then add x86_64
compute machines as described in Creating a cluster with multi-architecture compute machines on bare metal, IBM Power, or IBM Z.
4.7.1. Verifying cluster compatibility
Before you can start adding compute nodes of different architectures to your cluster, you must verify that your cluster is multi-architecture compatible.
Prerequisites
-
You installed the OpenShift CLI (
oc
)
Procedure
You can check that your cluster uses the architecture payload by running the following command:
$ oc adm release info -o jsonpath="{ .metadata.metadata}"
Verification
If you see the following output, then your cluster is using the multi-architecture payload:
{ "release.openshift.io/architecture": "multi", "url": "https://access.redhat.com/errata/<errata_version>" }
You can then begin adding multi-arch compute nodes to your cluster.
If you see the following output, then your cluster is not using the multi-architecture payload:
{ "url": "https://access.redhat.com/errata/<errata_version>" }
ImportantTo migrate your cluster so the cluster supports multi-architecture compute machines, follow the procedure in Migrating to a cluster with multi-architecture compute machines.
4.7.2. Creating RHCOS machines using virt-install
You can create more Red Hat Enterprise Linux CoreOS (RHCOS) compute machines for your cluster by using virt-install
.
Prerequisites
- You have at least one LPAR running on RHEL 8.7 or later with KVM, referred to as RHEL KVM host in this procedure.
- The KVM/QEMU hypervisor is installed on the RHEL KVM host.
- You have a domain name server (DNS) that can perform hostname and reverse lookup for the nodes.
- An HTTP or HTTPS server is set up.
Procedure
Disable UDP aggregation.
Currently, UDP aggregation is not supported on IBM Z® and is not automatically deactivated on multi-architecture compute clusters with an
x86_64
control plane and additionals390x
compute machines. To ensure that the addtional compute nodes are added to the cluster correctly, you must manually disable UDP aggregation.Create a YAML file
udp-aggregation-config.yaml
with the following content:apiVersion: v1 kind: ConfigMap data: disable-udp-aggregation: "true" metadata: name: udp-aggregation-config namespace: openshift-network-operator
Create the ConfigMap resource by running the following command:
$ oc create -f udp-aggregation-config.yaml
Extract the Ignition config file from the cluster by running the following command:
$ oc extract -n openshift-machine-api secret/worker-user-data-managed --keys=userData --to=- > worker.ign
-
Upload the
worker.ign
Ignition config file you exported from your cluster to your HTTP server. Note the URL of this file. You can validate that the Ignition file is available on the URL. The following example gets the Ignition config file for the compute node:
$ curl -k http://<HTTP_server>/worker.ign
Download the RHEL live
kernel
,initramfs
, androotfs
files by running the following commands:$ curl -LO $(oc -n openshift-machine-config-operator get configmap/coreos-bootimages -o jsonpath='{.data.stream}' \ | jq -r '.architectures.s390x.artifacts.metal.formats.pxe.kernel.location')
$ curl -LO $(oc -n openshift-machine-config-operator get configmap/coreos-bootimages -o jsonpath='{.data.stream}' \ | jq -r '.architectures.s390x.artifacts.metal.formats.pxe.initramfs.location')
$ curl -LO $(oc -n openshift-machine-config-operator get configmap/coreos-bootimages -o jsonpath='{.data.stream}' \ | jq -r '.architectures.s390x.artifacts.metal.formats.pxe.rootfs.location')
-
Move the downloaded RHEL live
kernel
,initramfs
androotfs
files to an HTTP or HTTPS server before you launchvirt-install
. Create the new KVM guest nodes using the RHEL
kernel
,initramfs
, and Ignition files; the new disk image; and adjusted parm line arguments.$ virt-install \ --connect qemu:///system \ --name <vm_name> \ --autostart \ --os-variant rhel9.2 \ 1 --cpu host \ --vcpus <vcpus> \ --memory <memory_mb> \ --disk <vm_name>.qcow2,size=<image_size> \ --network network=<virt_network_parm> \ --location <media_location>,kernel=<rhcos_kernel>,initrd=<rhcos_initrd> \ 2 --extra-args "rd.neednet=1" \ --extra-args "coreos.inst.install_dev=/dev/vda" \ --extra-args "coreos.inst.ignition_url=<worker_ign>" \ 3 --extra-args "coreos.live.rootfs_url=<rhcos_rootfs>" \ 4 --extra-args "ip=<ip>::<default_gateway>:<subnet_mask_length>:<hostname>::none:<MTU>" \ 5 --extra-args "nameserver=<dns>" \ --extra-args "console=ttysclp0" \ --noautoconsole \ --wait
- 1
- For
os-variant
, specify the RHEL version for the RHCOS compute machine.rhel9.2
is the recommended version. To query the supported RHEL version of your operating system, run the following command:$ osinfo-query os -f short-id
NoteThe
os-variant
is case sensitive. - 2
- For
--location
, specify the location of the kernel/initrd on the HTTP or HTTPS server. - 3
- For
coreos.inst.ignition_url=
, specify theworker.ign
Ignition file for the machine role. Only HTTP and HTTPS protocols are supported. - 4
- For
coreos.live.rootfs_url=
, specify the matching rootfs artifact for thekernel
andinitramfs
you are booting. Only HTTP and HTTPS protocols are supported. - 5
- Optional: For
hostname
, specify the fully qualified hostname of the client machine.
NoteIf you are using HAProxy as a load balancer, update your HAProxy rules for
ingress-router-443
andingress-router-80
in the/etc/haproxy/haproxy.cfg
configuration file.- Continue to create more compute machines for your cluster.
4.7.3. Approving the certificate signing requests for your machines
When you add machines to a cluster, two pending certificate signing requests (CSRs) are generated for each machine that you added. You must confirm that these CSRs are approved or, if necessary, approve them yourself. The client requests must be approved first, followed by the server requests.
Prerequisites
- You added machines to your cluster.
Procedure
Confirm that the cluster recognizes the machines:
$ oc get nodes
Example output
NAME STATUS ROLES AGE VERSION master-0 Ready master 63m v1.28.5 master-1 Ready master 63m v1.28.5 master-2 Ready master 64m v1.28.5
The output lists all of the machines that you created.
NoteThe preceding output might not include the compute nodes, also known as worker nodes, until some CSRs are approved.
Review the pending CSRs and ensure that you see the client requests with the
Pending
orApproved
status for each machine that you added to the cluster:$ oc get csr
Example output
NAME AGE REQUESTOR CONDITION csr-8b2br 15m system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending csr-8vnps 15m system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending ...
In this example, two machines are joining the cluster. You might see more approved CSRs in the list.
If the CSRs were not approved, after all of the pending CSRs for the machines you added are in
Pending
status, approve the CSRs for your cluster machines:NoteBecause the CSRs rotate automatically, approve your CSRs within an hour of adding the machines to the cluster. If you do not approve them within an hour, the certificates will rotate, and more than two certificates will be present for each node. You must approve all of these certificates. After the client CSR is approved, the Kubelet creates a secondary CSR for the serving certificate, which requires manual approval. Then, subsequent serving certificate renewal requests are automatically approved by the
machine-approver
if the Kubelet requests a new certificate with identical parameters.NoteFor clusters running on platforms that are not machine API enabled, such as bare metal and other user-provisioned infrastructure, you must implement a method of automatically approving the kubelet serving certificate requests (CSRs). If a request is not approved, then the
oc exec
,oc rsh
, andoc logs
commands cannot succeed, because a serving certificate is required when the API server connects to the kubelet. Any operation that contacts the Kubelet endpoint requires this certificate approval to be in place. The method must watch for new CSRs, confirm that the CSR was submitted by thenode-bootstrapper
service account in thesystem:node
orsystem:admin
groups, and confirm the identity of the node.To approve them individually, run the following command for each valid CSR:
$ oc adm certificate approve <csr_name> 1
- 1
<csr_name>
is the name of a CSR from the list of current CSRs.
To approve all pending CSRs, run the following command:
$ oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs --no-run-if-empty oc adm certificate approve
NoteSome Operators might not become available until some CSRs are approved.
Now that your client requests are approved, you must review the server requests for each machine that you added to the cluster:
$ oc get csr
Example output
NAME AGE REQUESTOR CONDITION csr-bfd72 5m26s system:node:ip-10-0-50-126.us-east-2.compute.internal Pending csr-c57lv 5m26s system:node:ip-10-0-95-157.us-east-2.compute.internal Pending ...
If the remaining CSRs are not approved, and are in the
Pending
status, approve the CSRs for your cluster machines:To approve them individually, run the following command for each valid CSR:
$ oc adm certificate approve <csr_name> 1
- 1
<csr_name>
is the name of a CSR from the list of current CSRs.
To approve all pending CSRs, run the following command:
$ oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs oc adm certificate approve
After all client and server CSRs have been approved, the machines have the
Ready
status. Verify this by running the following command:$ oc get nodes
Example output
NAME STATUS ROLES AGE VERSION master-0 Ready master 73m v1.28.5 master-1 Ready master 73m v1.28.5 master-2 Ready master 74m v1.28.5 worker-0 Ready worker 11m v1.28.5 worker-1 Ready worker 11m v1.28.5
NoteIt can take a few minutes after approval of the server CSRs for the machines to transition to the
Ready
status.
Additional information
- For more information on CSRs, see Certificate Signing Requests.
4.8. Creating a cluster with multi-architecture compute machines on IBM Power
To create a cluster with multi-architecture compute machines on IBM Power® (ppc64le
), you must have an existing single-architecture (x86_64
) cluster. You can then add ppc64le
compute machines to your OpenShift Container Platform cluster.
Before you can add ppc64le
nodes to your cluster, you must upgrade your cluster to one that uses the multi-architecture payload. For more information on migrating to the multi-architecture payload, see Migrating to a cluster with multi-architecture compute machines.
The following procedures explain how to create a RHCOS compute machine using an ISO image or network PXE booting. This will allow you to add ppc64le
nodes to your cluster and deploy a cluster with multi-architecture compute machines.
To create an IBM Power® (ppc64le
) cluster with multi-architecture compute machines on x86_64
, follow the instructions for Installing a cluster on IBM Power®. You can then add x86_64
compute machines as described in Creating a cluster with multi-architecture compute machines on bare metal, IBM Power, or IBM Z.
4.8.1. Verifying cluster compatibility
Before you can start adding compute nodes of different architectures to your cluster, you must verify that your cluster is multi-architecture compatible.
Prerequisites
-
You installed the OpenShift CLI (
oc
)
When using multiple architectures, hosts for OpenShift Container Platform nodes must share the same storage layer. If they do not have the same storage layer, use a storage provider such as nfs-provisioner
.
You should limit the number of network hops between the compute and control plane as much as possible.
Procedure
You can check that your cluster uses the architecture payload by running the following command:
$ oc adm release info -o jsonpath="{ .metadata.metadata}"
Verification
If you see the following output, then your cluster is using the multi-architecture payload:
{ "release.openshift.io/architecture": "multi", "url": "https://access.redhat.com/errata/<errata_version>" }
You can then begin adding multi-arch compute nodes to your cluster.
If you see the following output, then your cluster is not using the multi-architecture payload:
{ "url": "https://access.redhat.com/errata/<errata_version>" }
ImportantTo migrate your cluster so the cluster supports multi-architecture compute machines, follow the procedure in Migrating to a cluster with multi-architecture compute machines.
4.8.2. Creating RHCOS machines using an ISO image
You can create more Red Hat Enterprise Linux CoreOS (RHCOS) compute machines for your cluster by using an ISO image to create the machines.
Prerequisites
- Obtain the URL of the Ignition config file for the compute machines for your cluster. You uploaded this file to your HTTP server during installation.
-
You must have the OpenShift CLI (
oc
) installed.
Procedure
Extract the Ignition config file from the cluster by running the following command:
$ oc extract -n openshift-machine-api secret/worker-user-data-managed --keys=userData --to=- > worker.ign
-
Upload the
worker.ign
Ignition config file you exported from your cluster to your HTTP server. Note the URLs of these files. You can validate that the ignition files are available on the URLs. The following example gets the Ignition config files for the compute node:
$ curl -k http://<HTTP_server>/worker.ign
You can access the ISO image for booting your new machine by running to following command:
RHCOS_VHD_ORIGIN_URL=$(oc -n openshift-machine-config-operator get configmap/coreos-bootimages -o jsonpath='{.data.stream}' | jq -r '.architectures.<architecture>.artifacts.metal.formats.iso.disk.location')
Use the ISO file to install RHCOS on more compute machines. Use the same method that you used when you created machines before you installed the cluster:
- Burn the ISO image to a disk and boot it directly.
- Use ISO redirection with a LOM interface.
Boot the RHCOS ISO image without specifying any options, or interrupting the live boot sequence. Wait for the installer to boot into a shell prompt in the RHCOS live environment.
NoteYou can interrupt the RHCOS installation boot process to add kernel arguments. However, for this ISO procedure you must use the
coreos-installer
command as outlined in the following steps, instead of adding kernel arguments.Run the
coreos-installer
command and specify the options that meet your installation requirements. At a minimum, you must specify the URL that points to the Ignition config file for the node type, and the device that you are installing to:$ sudo coreos-installer install --ignition-url=http://<HTTP_server>/<node_type>.ign <device> --ignition-hash=sha512-<digest> 12
- 1
- You must run the
coreos-installer
command by usingsudo
, because thecore
user does not have the required root privileges to perform the installation. - 2
- The
--ignition-hash
option is required when the Ignition config file is obtained through an HTTP URL to validate the authenticity of the Ignition config file on the cluster node.<digest>
is the Ignition config file SHA512 digest obtained in a preceding step.
NoteIf you want to provide your Ignition config files through an HTTPS server that uses TLS, you can add the internal certificate authority (CA) to the system trust store before running
coreos-installer
.The following example initializes a bootstrap node installation to the
/dev/sda
device. The Ignition config file for the bootstrap node is obtained from an HTTP web server with the IP address 192.168.1.2:$ sudo coreos-installer install --ignition-url=http://192.168.1.2:80/installation_directory/bootstrap.ign /dev/sda --ignition-hash=sha512-a5a2d43879223273c9b60af66b44202a1d1248fc01cf156c46d4a79f552b6bad47bc8cc78ddf0116e80c59d2ea9e32ba53bc807afbca581aa059311def2c3e3b
Monitor the progress of the RHCOS installation on the console of the machine.
ImportantEnsure that the installation is successful on each node before commencing with the OpenShift Container Platform installation. Observing the installation process can also help to determine the cause of RHCOS installation issues that might arise.
- Continue to create more compute machines for your cluster.
4.8.3. Creating RHCOS machines by PXE or iPXE booting
You can create more Red Hat Enterprise Linux CoreOS (RHCOS) compute machines for your bare metal cluster by using PXE or iPXE booting.
Prerequisites
- Obtain the URL of the Ignition config file for the compute machines for your cluster. You uploaded this file to your HTTP server during installation.
-
Obtain the URLs of the RHCOS ISO image, compressed metal BIOS,
kernel
, andinitramfs
files that you uploaded to your HTTP server during cluster installation. - You have access to the PXE booting infrastructure that you used to create the machines for your OpenShift Container Platform cluster during installation. The machines must boot from their local disks after RHCOS is installed on them.
-
If you use UEFI, you have access to the
grub.conf
file that you modified during OpenShift Container Platform installation.
Procedure
Confirm that your PXE or iPXE installation for the RHCOS images is correct.
For PXE:
DEFAULT pxeboot TIMEOUT 20 PROMPT 0 LABEL pxeboot KERNEL http://<HTTP_server>/rhcos-<version>-live-kernel-<architecture> 1 APPEND initrd=http://<HTTP_server>/rhcos-<version>-live-initramfs.<architecture>.img coreos.inst.install_dev=/dev/sda coreos.inst.ignition_url=http://<HTTP_server>/worker.ign coreos.live.rootfs_url=http://<HTTP_server>/rhcos-<version>-live-rootfs.<architecture>.img 2
- 1
- Specify the location of the live
kernel
file that you uploaded to your HTTP server. - 2
- Specify locations of the RHCOS files that you uploaded to your HTTP server. The
initrd
parameter value is the location of the liveinitramfs
file, thecoreos.inst.ignition_url
parameter value is the location of the worker Ignition config file, and thecoreos.live.rootfs_url
parameter value is the location of the liverootfs
file. Thecoreos.inst.ignition_url
andcoreos.live.rootfs_url
parameters only support HTTP and HTTPS.
NoteThis configuration does not enable serial console access on machines with a graphical console. To configure a different console, add one or more
console=
arguments to theAPPEND
line. For example, addconsole=tty0 console=ttyS0
to set the first PC serial port as the primary console and the graphical console as a secondary console. For more information, see How does one set up a serial terminal and/or console in Red Hat Enterprise Linux?.For iPXE (
x86_64
+ppc64le
):kernel http://<HTTP_server>/rhcos-<version>-live-kernel-<architecture> initrd=main coreos.live.rootfs_url=http://<HTTP_server>/rhcos-<version>-live-rootfs.<architecture>.img coreos.inst.install_dev=/dev/sda coreos.inst.ignition_url=http://<HTTP_server>/worker.ign 1 2 initrd --name main http://<HTTP_server>/rhcos-<version>-live-initramfs.<architecture>.img 3 boot
- 1
- Specify the locations of the RHCOS files that you uploaded to your HTTP server. The
kernel
parameter value is the location of thekernel
file, theinitrd=main
argument is needed for booting on UEFI systems, thecoreos.live.rootfs_url
parameter value is the location of therootfs
file, and thecoreos.inst.ignition_url
parameter value is the location of the worker Ignition config file. - 2
- If you use multiple NICs, specify a single interface in the
ip
option. For example, to use DHCP on a NIC that is namedeno1
, setip=eno1:dhcp
. - 3
- Specify the location of the
initramfs
file that you uploaded to your HTTP server.
NoteThis configuration does not enable serial console access on machines with a graphical console To configure a different console, add one or more
console=
arguments to thekernel
line. For example, addconsole=tty0 console=ttyS0
to set the first PC serial port as the primary console and the graphical console as a secondary console. For more information, see How does one set up a serial terminal and/or console in Red Hat Enterprise Linux? and "Enabling the serial console for PXE and ISO installation" in the "Advanced RHCOS installation configuration" section.NoteTo network boot the CoreOS
kernel
onppc64le
architecture, you need to use a version of iPXE build with theIMAGE_GZIP
option enabled. SeeIMAGE_GZIP
option in iPXE.For PXE (with UEFI and GRUB as second stage) on
ppc64le
:menuentry 'Install CoreOS' { linux rhcos-<version>-live-kernel-<architecture> coreos.live.rootfs_url=http://<HTTP_server>/rhcos-<version>-live-rootfs.<architecture>.img coreos.inst.install_dev=/dev/sda coreos.inst.ignition_url=http://<HTTP_server>/worker.ign 1 2 initrd rhcos-<version>-live-initramfs.<architecture>.img 3 }
- 1
- Specify the locations of the RHCOS files that you uploaded to your HTTP/TFTP server. The
kernel
parameter value is the location of thekernel
file on your TFTP server. Thecoreos.live.rootfs_url
parameter value is the location of therootfs
file, and thecoreos.inst.ignition_url
parameter value is the location of the worker Ignition config file on your HTTP Server. - 2
- If you use multiple NICs, specify a single interface in the
ip
option. For example, to use DHCP on a NIC that is namedeno1
, setip=eno1:dhcp
. - 3
- Specify the location of the
initramfs
file that you uploaded to your TFTP server.
- Use the PXE or iPXE infrastructure to create the required compute machines for your cluster.
4.8.4. Approving the certificate signing requests for your machines
When you add machines to a cluster, two pending certificate signing requests (CSRs) are generated for each machine that you added. You must confirm that these CSRs are approved or, if necessary, approve them yourself. The client requests must be approved first, followed by the server requests.
Prerequisites
- You added machines to your cluster.
Procedure
Confirm that the cluster recognizes the machines:
$ oc get nodes
Example output
NAME STATUS ROLES AGE VERSION master-0 Ready master 63m v1.28.5 master-1 Ready master 63m v1.28.5 master-2 Ready master 64m v1.28.5
The output lists all of the machines that you created.
NoteThe preceding output might not include the compute nodes, also known as worker nodes, until some CSRs are approved.
Review the pending CSRs and ensure that you see the client requests with the
Pending
orApproved
status for each machine that you added to the cluster:$ oc get csr
Example output
NAME AGE REQUESTOR CONDITION csr-8b2br 15m system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending csr-8vnps 15m system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending ...
In this example, two machines are joining the cluster. You might see more approved CSRs in the list.
If the CSRs were not approved, after all of the pending CSRs for the machines you added are in
Pending
status, approve the CSRs for your cluster machines:NoteBecause the CSRs rotate automatically, approve your CSRs within an hour of adding the machines to the cluster. If you do not approve them within an hour, the certificates will rotate, and more than two certificates will be present for each node. You must approve all of these certificates. After the client CSR is approved, the Kubelet creates a secondary CSR for the serving certificate, which requires manual approval. Then, subsequent serving certificate renewal requests are automatically approved by the
machine-approver
if the Kubelet requests a new certificate with identical parameters.NoteFor clusters running on platforms that are not machine API enabled, such as bare metal and other user-provisioned infrastructure, you must implement a method of automatically approving the kubelet serving certificate requests (CSRs). If a request is not approved, then the
oc exec
,oc rsh
, andoc logs
commands cannot succeed, because a serving certificate is required when the API server connects to the kubelet. Any operation that contacts the Kubelet endpoint requires this certificate approval to be in place. The method must watch for new CSRs, confirm that the CSR was submitted by thenode-bootstrapper
service account in thesystem:node
orsystem:admin
groups, and confirm the identity of the node.To approve them individually, run the following command for each valid CSR:
$ oc adm certificate approve <csr_name> 1
- 1
<csr_name>
is the name of a CSR from the list of current CSRs.
To approve all pending CSRs, run the following command:
$ oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs --no-run-if-empty oc adm certificate approve
NoteSome Operators might not become available until some CSRs are approved.
Now that your client requests are approved, you must review the server requests for each machine that you added to the cluster:
$ oc get csr
Example output
NAME AGE REQUESTOR CONDITION csr-bfd72 5m26s system:node:ip-10-0-50-126.us-east-2.compute.internal Pending csr-c57lv 5m26s system:node:ip-10-0-95-157.us-east-2.compute.internal Pending ...
If the remaining CSRs are not approved, and are in the
Pending
status, approve the CSRs for your cluster machines:To approve them individually, run the following command for each valid CSR:
$ oc adm certificate approve <csr_name> 1
- 1
<csr_name>
is the name of a CSR from the list of current CSRs.
To approve all pending CSRs, run the following command:
$ oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs oc adm certificate approve
After all client and server CSRs have been approved, the machines have the
Ready
status. Verify this by running the following command:$ oc get nodes -o wide
Example output
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME worker-0-ppc64le Ready worker 42d v1.28.2+e3ba6d9 192.168.200.21 <none> Red Hat Enterprise Linux CoreOS 415.92.202309261919-0 (Plow) 5.14.0-284.34.1.el9_2.ppc64le cri-o://1.28.1-3.rhaos4.15.gitb36169e.el9 worker-1-ppc64le Ready worker 42d v1.28.2+e3ba6d9 192.168.200.20 <none> Red Hat Enterprise Linux CoreOS 415.92.202309261919-0 (Plow) 5.14.0-284.34.1.el9_2.ppc64le cri-o://1.28.1-3.rhaos4.15.gitb36169e.el9 master-0-x86 Ready control-plane,master 75d v1.28.2+e3ba6d9 10.248.0.38 10.248.0.38 Red Hat Enterprise Linux CoreOS 415.92.202309261919-0 (Plow) 5.14.0-284.34.1.el9_2.x86_64 cri-o://1.28.1-3.rhaos4.15.gitb36169e.el9 master-1-x86 Ready control-plane,master 75d v1.28.2+e3ba6d9 10.248.0.39 10.248.0.39 Red Hat Enterprise Linux CoreOS 415.92.202309261919-0 (Plow) 5.14.0-284.34.1.el9_2.x86_64 cri-o://1.28.1-3.rhaos4.15.gitb36169e.el9 master-2-x86 Ready control-plane,master 75d v1.28.2+e3ba6d9 10.248.0.40 10.248.0.40 Red Hat Enterprise Linux CoreOS 415.92.202309261919-0 (Plow) 5.14.0-284.34.1.el9_2.x86_64 cri-o://1.28.1-3.rhaos4.15.gitb36169e.el9 worker-0-x86 Ready worker 75d v1.28.2+e3ba6d9 10.248.0.43 10.248.0.43 Red Hat Enterprise Linux CoreOS 415.92.202309261919-0 (Plow) 5.14.0-284.34.1.el9_2.x86_64 cri-o://1.28.1-3.rhaos4.15.gitb36169e.el9 worker-1-x86 Ready worker 75d v1.28.2+e3ba6d9 10.248.0.44 10.248.0.44 Red Hat Enterprise Linux CoreOS 415.92.202309261919-0 (Plow) 5.14.0-284.34.1.el9_2.x86_64 cri-o://1.28.1-3.rhaos4.15.gitb36169e.el9
NoteIt can take a few minutes after approval of the server CSRs for the machines to transition to the
Ready
status.
Additional information
- For more information on CSRs, see Certificate Signing Requests.
4.9. Managing your cluster with multi-architecture compute machines
4.9.1. Scheduling workloads on clusters with multi-architecture compute machines
Deploying a workload on a cluster with compute nodes of different architectures requires attention and monitoring of your cluster. There might be further actions you need to take in order to successfully place pods in the nodes of your cluster.
For more detailed information on node affinity, scheduling, taints and tolerlations, see the following documentatinon:
4.9.1.1. Sample multi-architecture node workload deployments
Before you schedule workloads on a cluster with compute nodes of different architectures, consider the following use cases:
- Using node affinity to schedule workloads on a node
You can allow a workload to be scheduled on only a set of nodes with architectures supported by its images, you can set the
spec.affinity.nodeAffinity
field in your pod’s template specification.Example deployment with the
nodeAffinity
set to certain architecturesapiVersion: apps/v1 kind: Deployment metadata: # ... spec: # ... template: # ... spec: affinity: nodeAffinity: requiredDuringSchedulingIgnoredDuringExecution: nodeSelectorTerms: - matchExpressions: - key: kubernetes.io/arch operator: In values: 1 - amd64 - arm64
- 1
- Specify the supported architectures. Valid values include
amd64
,arm64
, or both values.
- Tainting every node for a specific architecture
You can taint a node to avoid workloads that are not compatible with its architecture to be scheduled on that node. In the case where your cluster is using a
MachineSet
object, you can add parameters to the.spec.template.spec.taints
field to avoid workloads being scheduled on nodes with non-supported architectures.Before you can taint a node, you must scale down the
MachineSet
object or remove available machines. You can scale down the machine set by using one of following commands:$ oc scale --replicas=0 machineset <machineset> -n openshift-machine-api
Or:
$ oc edit machineset <machineset> -n openshift-machine-api
For more information on scaling machine sets, see "Modifying a compute machine set".
Example
MachineSet
with a taint setapiVersion: machine.openshift.io/v1beta1 kind: MachineSet metadata: # ... spec: # ... template: # ... spec: # ... taints: - effect: NoSchedule key: multi-arch.openshift.io/arch value: arm64
You can also set a taint on a specific node by running the following command:
$ oc adm taint nodes <node-name> multi-arch.openshift.io/arch=arm64:NoSchedule
- Creating a default toleration
You can annotate a namespace so all of the workloads get the same default toleration by running the following command:
$ oc annotate namespace my-namespace \ 'scheduler.alpha.kubernetes.io/defaultTolerations'='[{"operator": "Exists", "effect": "NoSchedule", "key": "multi-arch.openshift.io/arch"}]'
- Tolerating architecture taints in workloads
On a node with a defined taint, workloads will not be scheduled on that node. However, you can allow them to be scheduled by setting a toleration in the pod’s specification.
Example deployment with a toleration
apiVersion: apps/v1 kind: Deployment metadata: # ... spec: # ... template: # ... spec: tolerations: - key: "multi-arch.openshift.io/arch" value: "arm64" operator: "Equal" effect: "NoSchedule"
This example deployment can also be allowed on nodes with the
multi-arch.openshift.io/arch=arm64
taint specified.- Using node affinity with taints and tolerations
When a scheduler computes the set of nodes to schedule a pod, tolerations can broaden the set while node affinity restricts the set. If you set a taint to the nodes of a specific architecture, the following example toleration is required for scheduling pods.
Example deployment with a node affinity and toleration set.
apiVersion: apps/v1 kind: Deployment metadata: # ... spec: # ... template: # ... spec: affinity: nodeAffinity: requiredDuringSchedulingIgnoredDuringExecution: nodeSelectorTerms: - matchExpressions: - key: kubernetes.io/arch operator: In values: - amd64 - arm64 tolerations: - key: "multi-arch.openshift.io/arch" value: "arm64" operator: "Equal" effect: "NoSchedule"
Additional resources
4.9.2. Enabling 64k pages on the Red Hat Enterprise Linux CoreOS (RHCOS) kernel
You can enable the 64k memory page in the Red Hat Enterprise Linux CoreOS (RHCOS) kernel on the 64-bit ARM compute machines in your cluster. The 64k page size kernel specification can be used for large GPU or high memory workloads. This is done using the Machine Config Operator (MCO) which uses a machine config pool to update the kernel. To enable 64k page sizes, you must dedicate a machine config pool for ARM64 to enable on the kernel.
Using 64k pages is exclusive to 64-bit ARM architecture compute nodes or clusters installed on 64-bit ARM machines. If you configure the 64k pages kernel on a machine config pool using 64-bit x86 machines, the machine config pool and MCO will degrade.
Prerequisites
-
You installed the OpenShift CLI (
oc
). - You created a cluster with compute nodes of different architecture on one of the supported platforms.
Procedure
Label the nodes where you want to run the 64k page size kernel:
$ oc label node <node_name> <label>
Example command
$ oc label node worker-arm64-01 node-role.kubernetes.io/worker-64k-pages=
Create a machine config pool that contains the worker role that uses the ARM64 architecture and the
worker-64k-pages
role:apiVersion: machineconfiguration.openshift.io/v1 kind: MachineConfigPool metadata: name: worker-64k-pages spec: machineConfigSelector: matchExpressions: - key: machineconfiguration.openshift.io/role operator: In values: - worker - worker-64k-pages nodeSelector: matchLabels: node-role.kubernetes.io/worker-64k-pages: "" kubernetes.io/arch: arm64
Create a machine config on your compute node to enable
64k-pages
with the64k-pages
parameter.$ oc create -f <filename>.yaml
Example MachineConfig
apiVersion: machineconfiguration.openshift.io/v1 kind: MachineConfig metadata: labels: machineconfiguration.openshift.io/role: "worker-64k-pages" 1 name: 99-worker-64kpages spec: kernelType: 64k-pages 2
NoteThe
64k-pages
type is supported on only 64-bit ARM architecture based compute nodes. Therealtime
type is supported on only 64-bit x86 architecture based compute nodes.
Verification
To view your new
worker-64k-pages
machine config pool, run the following command:$ oc get mcp
Example output
NAME CONFIG UPDATED UPDATING DEGRADED MACHINECOUNT READYMACHINECOUNT UPDATEDMACHINECOUNT DEGRADEDMACHINECOUNT AGE master rendered-master-9d55ac9a91127c36314e1efe7d77fbf8 True False False 3 3 3 0 361d worker rendered-worker-e7b61751c4a5b7ff995d64b967c421ff True False False 7 7 7 0 361d worker-64k-pages rendered-worker-64k-pages-e7b61751c4a5b7ff995d64b967c421ff True False False 2 2 2 0 35m
4.9.3. Importing manifest lists in image streams on your multi-architecture compute machines
On an OpenShift Container Platform 4.15 cluster with multi-architecture compute machines, the image streams in the cluster do not import manifest lists automatically. You must manually change the default importMode
option to the PreserveOriginal
option in order to import the manifest list.
Prerequisites
-
You installed the OpenShift Container Platform CLI (
oc
).
Procedure
The following example command shows how to patch the
ImageStream
cli-artifacts so that thecli-artifacts:latest
image stream tag is imported as a manifest list.$ oc patch is/cli-artifacts -n openshift -p '{"spec":{"tags":[{"name":"latest","importPolicy":{"importMode":"PreserveOriginal"}}]}}'
Verification
You can check that the manifest lists imported properly by inspecting the image stream tag. The following command will list the individual architecture manifests for a particular tag.
$ oc get istag cli-artifacts:latest -n openshift -oyaml
If the
dockerImageManifests
object is present, then the manifest list import was successful.Example output of the
dockerImageManifests
objectdockerImageManifests: - architecture: amd64 digest: sha256:16d4c96c52923a9968fbfa69425ec703aff711f1db822e4e9788bf5d2bee5d77 manifestSize: 1252 mediaType: application/vnd.docker.distribution.manifest.v2+json os: linux - architecture: arm64 digest: sha256:6ec8ad0d897bcdf727531f7d0b716931728999492709d19d8b09f0d90d57f626 manifestSize: 1252 mediaType: application/vnd.docker.distribution.manifest.v2+json os: linux - architecture: ppc64le digest: sha256:65949e3a80349cdc42acd8c5b34cde6ebc3241eae8daaeea458498fedb359a6a manifestSize: 1252 mediaType: application/vnd.docker.distribution.manifest.v2+json os: linux - architecture: s390x digest: sha256:75f4fa21224b5d5d511bea8f92dfa8e1c00231e5c81ab95e83c3013d245d1719 manifestSize: 1252 mediaType: application/vnd.docker.distribution.manifest.v2+json os: linux