Chapter 3. Configuring multi-architecture compute machines on an OpenShift cluster
3.1. About clusters with multi-architecture compute machines
An OpenShift Container Platform cluster with multi-architecture compute machines is a cluster that supports compute machines with different architectures.
Configuring multi-architecture compute machines involves some additional considerations:
- When there are nodes with multiple architectures in your cluster, the architecture of the container image that you deploy to a node must be consistent with the architecture of that node. Ensure that each pod is scheduled to a node whose architecture matches the architecture of the pod's container images, for example by using a node selector as shown in the example after this list. For more information on assigning pods to nodes, see Assigning pods to nodes.
- In installer-provisioned installations, you are restricted to using the infrastructure provided by a single cloud provider. Adding external nodes, regardless of their architecture, to these clusters is not supported.
- Clusters that are installed with the platform type none are unable to use some features, such as managing compute machines with the Machine API. This limitation applies even if the compute machines that are attached to the cluster are installed on a platform that would normally support the feature. This parameter cannot be changed after installation.
  Important: Review the information in the guidelines for deploying OpenShift Container Platform on non-tested platforms before you attempt to install an OpenShift Container Platform cluster in virtualized or cloud environments.
- The Cluster Samples Operator is not supported on clusters with multi-architecture compute machines. Your cluster can be created without this capability. For more information, see Cluster capabilities.
- For information on migrating your single-architecture cluster to a cluster that supports multi-architecture compute machines, see Migrating to a cluster with multi-architecture compute machines.
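The following is a minimal sketch of the node selector approach mentioned in the first item of this list. The pod name and image are illustrative assumptions; the kubernetes.io/arch label is set automatically on every node, so no extra labeling is required.

apiVersion: v1
kind: Pod
metadata:
  name: arch-pinned-example        # hypothetical name
spec:
  nodeSelector:
    kubernetes.io/arch: arm64      # schedule this pod only to 64-bit ARM nodes
  containers:
  - name: app
    image: registry.access.redhat.com/ubi9/ubi-minimal:latest   # multi-architecture image
    command: ["sleep", "infinity"]

If the image you deploy is single-architecture, set the node selector to the matching architecture value, for example amd64.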
3.1.1. Configuring your cluster with multi-architecture compute machines
To create a cluster with multi-architecture compute machines with different installation options and platforms, you can use the documentation in the following table:
| Documentation section | Platform | User-provisioned installation | Installer-provisioned installation | Control plane | Compute node |
|---|---|---|---|---|---|
| Creating a cluster with multi-architecture compute machines on Azure | Microsoft Azure | ✓ | ✓ | | |
| Creating a cluster with multi-architecture compute machines on AWS | Amazon Web Services (AWS) | ✓ | ✓ | | |
| Creating a cluster with multi-architecture compute machines on Google Cloud | Google Cloud | ✓ | | | |
| Creating a cluster with multi-architecture compute machines on bare metal, IBM Power, or IBM Z | Bare metal | ✓ | | | |
| | IBM Power | ✓ | | | |
| | IBM Z | ✓ | | | |
| Creating a cluster with multi-architecture compute machines on IBM Z® and IBM® LinuxONE with z/VM | IBM Z® and IBM® LinuxONE | ✓ | | | |
| Creating a cluster with multi-architecture compute machines on IBM Z® and IBM® LinuxONE in an LPAR | IBM Z® and IBM® LinuxONE | ✓ | | | |
| Creating a cluster with multi-architecture compute machines on IBM Power® | IBM Power® | ✓ | | | |
Autoscaling from zero is currently not supported on Google Cloud.
3.1.2. Verifying cluster compatibility
Before you can start adding compute nodes of different architectures to your cluster, you must verify that your cluster is multi-architecture compatible.
Prerequisites
- You installed the OpenShift CLI (oc).
- IBM Power only: Ensure that you meet the following prerequisites:
  - When using multiple architectures, hosts for OpenShift Container Platform nodes must share the same storage layer. If they do not have the same storage layer, use a storage provider such as nfs-provisioner.
  - You should limit the number of network hops between the compute and control plane as much as possible.
Procedure
- Log in to the OpenShift CLI (oc).
- Check that your cluster uses the multi-architecture payload by running the following command:
  $ oc adm release info -o jsonpath="{ .metadata.metadata}"
Verification
- If you see the following output, your cluster is using the multi-architecture payload:
  { "release.openshift.io/architecture": "multi", "url": "https://access.redhat.com/errata/<errata_version>" }
  You can then begin adding multi-arch compute nodes to your cluster.
- If you see the following output, your cluster is not using the multi-architecture payload:
  { "url": "https://access.redhat.com/errata/<errata_version>" }
  Important: To migrate your cluster so the cluster supports multi-architecture compute machines, follow the procedure in Migrating to a cluster with multi-architecture compute machines.
3.2. Creating a cluster with multi-architecture compute machines on AWS
To deploy a cluster on Amazon Web Services (AWS) with multi-architecture compute machines, you must first create a single-architecture installer-provisioned cluster that uses the multi-architecture installer binary.
You can also migrate your current cluster with single-architecture compute machines to a cluster with multi-architecture compute machines. After creating a multi-architecture cluster, you can add nodes with different architectures to the cluster.
3.2.1. Adding a multi-architecture compute machine set to your AWS cluster
After creating a multi-architecture cluster, you can add nodes with different architectures.
You can add multi-architecture compute machines to a multi-architecture cluster in the following ways:
- Adding 64-bit x86 compute machines to a cluster that uses 64-bit ARM control plane machines and already includes 64-bit ARM compute machines. In this case, 64-bit x86 is considered the secondary architecture.
- Adding 64-bit ARM compute machines to a cluster that uses 64-bit x86 control plane machines and already includes 64-bit x86 compute machines. In this case, 64-bit ARM is considered the secondary architecture.
Before adding a secondary architecture node to your cluster, it is recommended to install the Multiarch Tuning Operator, and deploy a ClusterPodPlacementConfig custom resource. For more information, see "Managing workloads on multi-architecture clusters by using the Multiarch Tuning Operator".
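The ClusterPodPlacementConfig custom resource mentioned above is typically a small, cluster-scoped object. The following minimal sketch assumes the API version and the required object name (cluster) documented for the Multiarch Tuning Operator; verify both against the Operator version that you install before applying it.

apiVersion: multiarch.openshift.io/v1beta1   # assumed API version; confirm in the Operator documentation
kind: ClusterPodPlacementConfig
metadata:
  name: cluster                              # the Operator expects this singleton name

After the Operator is installed, apply the object with oc create -f <file_name>.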
Prerequisites
- You installed the OpenShift CLI (oc).
- You used the installation program to create a 64-bit ARM or 64-bit x86 single-architecture cluster with the multi-architecture installer binary.
Procedure
- Log in to the OpenShift CLI (oc).
- Create a YAML file and add the configuration to create a compute machine set to control the 64-bit ARM or 64-bit x86 compute nodes in your cluster.
Example MachineSet object for an AWS 64-bit ARM or x86 compute node

apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
metadata:
  labels:
    machine.openshift.io/cluster-api-cluster: <infrastructure_id>
  name: <infrastructure_id>-aws-machine-set-0
  namespace: openshift-machine-api
spec:
  replicas: 1
  selector:
    matchLabels:
      machine.openshift.io/cluster-api-cluster: <infrastructure_id>
      machine.openshift.io/cluster-api-machineset: <infrastructure_id>-<role>-<zone>
  template:
    metadata:
      labels:
        machine.openshift.io/cluster-api-cluster: <infrastructure_id>
        machine.openshift.io/cluster-api-machine-role: <role>
        machine.openshift.io/cluster-api-machine-type: <role>
        machine.openshift.io/cluster-api-machineset: <infrastructure_id>-<role>-<zone>
    spec:
      metadata:
        labels:
          node-role.kubernetes.io/<role>: ""
      providerSpec:
        value:
          ami:
            id: ami-02a574449d4f4d280
          apiVersion: awsproviderconfig.openshift.io/v1beta1
          blockDevices:
            - ebs:
                iops: 0
                volumeSize: 120
                volumeType: gp2
          credentialsSecret:
            name: aws-cloud-credentials
          deviceIndex: 0
          iamInstanceProfile:
            id: <infrastructure_id>-worker-profile
          instanceType: m6g.xlarge
          kind: AWSMachineProviderConfig
          placement:
            availabilityZone: us-east-1a
            region: <region>
          securityGroups:
            - filters:
                - name: tag:Name
                  values:
                    - <infrastructure_id>-node
          subnet:
            filters:
              - name: tag:Name
                values:
                  - <infrastructure_id>-subnet-private-<zone>
          tags:
            - name: kubernetes.io/cluster/<infrastructure_id>
              value: owned
            - name: <custom_tag_name>
              value: <custom_tag_value>
          userDataSecret:
            name: worker-user-data

where:
<infrastructure_id>
  Specifies the infrastructure ID that is based on the cluster ID that you set when you provisioned the cluster. If you have the OpenShift CLI (oc) installed, you can obtain the infrastructure ID by running the following command:
  $ oc get -o jsonpath="{.status.infrastructureName}{'\n'}" infrastructure cluster
<role>-<zone>
  Specifies the infrastructure ID, role node label, and zone.
<role>
  Specifies the role node label to add.
ami.id
  Specifies a Red Hat Enterprise Linux CoreOS (RHCOS) Amazon Machine Image (AMI) for your AWS region for the nodes. The RHCOS AMI must be compatible with the machine architecture. You can obtain the AMI by running the following command:
  $ oc get configmap/coreos-bootimages \
      -n openshift-machine-config-operator \
      -o jsonpath='{.data.stream}' | jq \
      -r '.architectures.<arch>.images.aws.regions."<region>".image'
instanceType
  Specifies a machine type that aligns with the CPU architecture of the chosen AMI. For more information, see "Tested instance types for AWS 64-bit ARM".
availabilityZone
  Specifies the zone. For example, us-east-1a. Ensure that the zone you select has machines with the required architecture.
region
  Specifies the region. For example, us-east-1. Ensure that the region you select has machines with the required architecture.
Create the compute machine set by running the following command:
$ oc create -f <file_name>
Replace <file_name> with the name of the YAML file with the compute machine set configuration. For example: aws-arm64-machine-set-0.yaml or aws-amd64-machine-set-0.yaml.
Verification
Verify that the new machines are running by running the following command:
$ oc get machineset -n openshift-machine-api
The output must include the machine set that you created.
Example output
NAME                                     DESIRED   CURRENT   READY   AVAILABLE   AGE
<infrastructure_id>-aws-machine-set-0    2         2         2       2           10m
You can check if the nodes are ready and schedulable by running the following command:
$ oc get nodes
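To confirm that the new nodes report the expected CPU architecture, you can list the nodes with an extra column showing the value of their kubernetes.io/arch label (for example amd64 or arm64). This is a standard node label, so the command works on any cluster:

$ oc get nodes -L kubernetes.io/arch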
3.3. Creating a cluster with multi-architecture compute machines on Azure
To deploy a cluster on Microsoft Azure with multi-architecture compute machines, you must first create a single-architecture installer-provisioned cluster that uses the multi-architecture installer binary.
You can also migrate your current cluster with single-architecture compute machines to a cluster with multi-architecture compute machines. After creating a multi-architecture cluster, you can add nodes with different architectures to the cluster.
3.3.1. Creating a 64-bit ARM boot image using the Azure image gallery
You can generate a 64-bit x86 boot image or a 64-bit ARM boot image by using the Azure image gallery.
The procedure example describes how to manually generate a 64-bit ARM boot image. If you want to generate a 64-bit x86 boot image, replace aarch64 with x86_64. Additionally, replace any instances of rhcos-arm64 with rhcos-x86_64.
Prerequisites
- You installed the Azure CLI (az).
- You created a single-architecture Azure installer-provisioned cluster with the multi-architecture installer binary.
Procedure
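The commands in this procedure reference several shell variables. A minimal setup sketch follows; every value shown is an assumption that you should replace with names appropriate for your environment.

$ export RESOURCE_GROUP=<resource_group>            # resource group that holds the gallery and storage account
$ export STORAGE_ACCOUNT_NAME=<storage_account>     # storage account that receives the uploaded VHD
$ export CONTAINER_NAME=<container>                 # blob container for the VHD
$ export GALLERY_NAME=<gallery>                     # Azure compute gallery name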
Log in to your Azure account by running the following command:
  $ az login
- Create a storage account and upload the aarch64 virtual hard drive (VHD) to your storage account. The OpenShift Container Platform installation program creates a resource group, however, the boot image can also be uploaded to a custom named resource group:
  $ az storage account create -n ${STORAGE_ACCOUNT_NAME} -g ${RESOURCE_GROUP} -l westus --sku Standard_LRS
  The westus value is an example region.
- Create a storage container using the storage account you generated by entering the following command:
  $ az storage container create -n ${CONTAINER_NAME} --account-name ${STORAGE_ACCOUNT_NAME}
- You must use the OpenShift Container Platform installation program JSON file to extract the URL and the aarch64 VHD name:
  - Extract the URL field and set it to RHCOS_VHD_ORIGIN_URL as the file name by running the following command:
    $ RHCOS_VHD_ORIGIN_URL=$(oc -n openshift-machine-config-operator get configmap/coreos-bootimages -o jsonpath='{.data.stream}' | jq -r '.architectures.aarch64."rhel-coreos-extensions"."azure-disk".url')
    For a 64-bit x86 boot image, replace architectures.aarch64 with architectures.x86_64.
  - Extract the aarch64 VHD name and set it to BLOB_NAME as the file name by running the following command:
    $ BLOB_NAME=rhcos-$(oc -n openshift-machine-config-operator get configmap/coreos-bootimages -o jsonpath='{.data.stream}' | jq -r '.architectures.aarch64."rhel-coreos-extensions"."azure-disk".release')-azure.aarch64.vhd
    For a 64-bit x86 boot image, replace architectures.aarch64 with architectures.x86_64 and replace aarch64.vhd with x86_64.vhd.
- Generate a shared access signature (SAS) token. Use this token to upload the RHCOS VHD to your storage container with the following commands:
  $ end=`date -u -d "30 minutes" '+%Y-%m-%dT%H:%MZ'`
  $ sas=`az storage container generate-sas -n ${CONTAINER_NAME} --account-name ${STORAGE_ACCOUNT_NAME} --https-only --permissions dlrw --expiry $end -o tsv`
- Copy the RHCOS VHD into the storage container:
  $ az storage blob copy start --account-name ${STORAGE_ACCOUNT_NAME} --sas-token "$sas" \
      --source-uri "${RHCOS_VHD_ORIGIN_URL}" \
      --destination-blob "${BLOB_NAME}" --destination-container ${CONTAINER_NAME}
  You can check the status of the copying process with the following command:
  $ az storage blob show -c ${CONTAINER_NAME} -n ${BLOB_NAME} --account-name ${STORAGE_ACCOUNT_NAME} | jq .properties.copy
  Example output
  {
    "completionTime": null,
    "destinationSnapshot": null,
    "id": "1fd97630-03ca-489a-8c4e-cfe839c9627d",
    "incrementalCopy": null,
    "progress": "17179869696/17179869696",
    "source": "https://rhcos.blob.core.windows.net/imagebucket/rhcos-411.86.202207130959-0-azure.aarch64.vhd",
    "status": "success",
    "statusDescription": null
  }
  If the status parameter displays success, the copying process is complete.
- Create an image gallery using the following command:
  $ az sig create --resource-group ${RESOURCE_GROUP} --gallery-name ${GALLERY_NAME}
- Use the image gallery to create an image definition. In the following example command, rhcos-arm64 is the name of the image definition:
  $ az sig image-definition create --resource-group ${RESOURCE_GROUP} --gallery-name ${GALLERY_NAME} --gallery-image-definition rhcos-arm64 --publisher RedHat --offer arm --sku arm64 --os-type linux --architecture Arm64 --hyper-v-generation V2
  For a 64-bit x86 boot image, replace --offer arm --sku arm64 with --offer x86_64 --sku x86_64.
- To get the URL of the VHD and set it to RHCOS_VHD_URL as the file name, run the following command:
  $ RHCOS_VHD_URL=$(az storage blob url --account-name ${STORAGE_ACCOUNT_NAME} -c ${CONTAINER_NAME} -n "${BLOB_NAME}" -o tsv)
- Use the RHCOS_VHD_URL file, your storage account, resource group, and image gallery to create an image version. In the following example, 1.0.0 is the image version:
  $ az sig image-version create --resource-group ${RESOURCE_GROUP} --gallery-name ${GALLERY_NAME} --gallery-image-definition rhcos-arm64 --gallery-image-version 1.0.0 --os-vhd-storage-account ${STORAGE_ACCOUNT_NAME} --os-vhd-uri ${RHCOS_VHD_URL}
- Optional: Now that your arm64 boot image is generated, you can access the ID of your image with the following command:
  $ az sig image-version show -r $GALLERY_NAME -g $RESOURCE_GROUP -i rhcos-arm64 -e 1.0.0
  The following example image ID is used in the resourceID parameter of the compute machine set:
  Example resourceID
  /resourceGroups/${RESOURCE_GROUP}/providers/Microsoft.Compute/galleries/${GALLERY_NAME}/images/rhcos-arm64/versions/1.0.0
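If you plan to create the compute machine set in the next section from a template, you can carry the image ID forward in a shell variable. This is only a sketch: machine-set.yaml.template is a hypothetical file in which the image.resourceID field contains ${RHCOS_IMAGE_RESOURCE_ID}, and envsubst is provided by the gettext package.

$ export RHCOS_IMAGE_RESOURCE_ID="/resourceGroups/${RESOURCE_GROUP}/providers/Microsoft.Compute/galleries/${GALLERY_NAME}/images/rhcos-arm64/versions/1.0.0"
$ envsubst < machine-set.yaml.template | oc create -f -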
3.3.2. Adding a multi-architecture compute machine set to your Azure cluster
After creating a multi-architecture cluster, you can add nodes with different architectures.
You can add multi-architecture compute machines to a multi-architecture cluster in the following ways:
- Adding 64-bit x86 compute machines to a cluster that uses 64-bit ARM control plane machines and already includes 64-bit ARM compute machines. In this case, 64-bit x86 is considered the secondary architecture.
- Adding 64-bit ARM compute machines to a cluster that uses 64-bit x86 control plane machines and already includes 64-bit x86 compute machines. In this case, 64-bit ARM is considered the secondary architecture.
To create a custom compute machine set on Azure, see "Creating a compute machine set on Azure".
Before adding a secondary architecture node to your cluster, it is recommended to install the Multiarch Tuning Operator, and deploy a ClusterPodPlacementConfig custom resource. For more information, see "Managing workloads on multi-architecture clusters by using the Multiarch Tuning Operator".
Prerequisites
- You installed the OpenShift CLI (oc).
- You created a 64-bit ARM or 64-bit x86 boot image.
- You used the installation program to create a 64-bit ARM or 64-bit x86 single-architecture cluster with the multi-architecture installer binary.
Procedure
- Log in to the OpenShift CLI (oc).
- Create a YAML file and add the configuration to create a compute machine set to control the 64-bit ARM or 64-bit x86 compute nodes in your cluster.
Example MachineSet object for an Azure 64-bit ARM or x86 compute node

apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
metadata:
  labels:
    machine.openshift.io/cluster-api-cluster: <infrastructure_id>
    machine.openshift.io/cluster-api-machine-role: worker
    machine.openshift.io/cluster-api-machine-type: worker
  name: <infrastructure_id>-machine-set-0
  namespace: openshift-machine-api
spec:
  replicas: 2
  selector:
    matchLabels:
      machine.openshift.io/cluster-api-cluster: <infrastructure_id>
      machine.openshift.io/cluster-api-machineset: <infrastructure_id>-machine-set-0
  template:
    metadata:
      labels:
        machine.openshift.io/cluster-api-cluster: <infrastructure_id>
        machine.openshift.io/cluster-api-machine-role: worker
        machine.openshift.io/cluster-api-machine-type: worker
        machine.openshift.io/cluster-api-machineset: <infrastructure_id>-machine-set-0
    spec:
      lifecycleHooks: {}
      metadata: {}
      providerSpec:
        value:
          acceleratedNetworking: true
          apiVersion: machine.openshift.io/v1beta1
          credentialsSecret:
            name: azure-cloud-credentials
            namespace: openshift-machine-api
          image:
            offer: ""
            publisher: ""
            resourceID: /resourceGroups/${RESOURCE_GROUP}/providers/Microsoft.Compute/galleries/${GALLERY_NAME}/images/rhcos-arm64/versions/1.0.0
            sku: ""
            version: ""
          kind: AzureMachineProviderSpec
          location: <region>
          managedIdentity: <infrastructure_id>-identity
          networkResourceGroup: <infrastructure_id>-rg
          osDisk:
            diskSettings: {}
            diskSizeGB: 128
            managedDisk:
              storageAccountType: Premium_LRS
            osType: Linux
          publicIP: false
          publicLoadBalancer: <infrastructure_id>
          resourceGroup: <infrastructure_id>-rg
          subnet: <infrastructure_id>-worker-subnet
          userDataSecret:
            name: worker-user-data
          vmSize: Standard_D4ps_v5
          vnet: <infrastructure_id>-vnet
          zone: "<zone>"

where:
image.resourceID
  Specifies the resource ID of the boot image that you created, for example the arm64 or amd64 image.
vmSize
  Specifies the instance type used in your installation. Some example instance types are Standard_D4ps_v5 or D8ps.
Create the compute machine set by running the following command:
$ oc create -f <file_name>
Replace <file_name> with the name of the YAML file with the compute machine set configuration. For example: arm64-machine-set-0.yaml or amd64-machine-set-0.yaml.
Verification
Verify that the new machines are running by running the following command:
$ oc get machineset -n openshift-machine-api
The output must include the machine set that you created.
Example output
NAME                                 DESIRED   CURRENT   READY   AVAILABLE   AGE
<infrastructure_id>-machine-set-0    2         2         2       2           10m
You can check if the nodes are ready and schedulable by running the following command:
$ oc get nodes
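If you later need more or fewer nodes of the secondary architecture, you can scale the compute machine set instead of editing the YAML file. A minimal sketch, assuming the machine set name from the example above:

$ oc scale machineset <infrastructure_id>-machine-set-0 --replicas=3 -n openshift-machine-api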
3.4. Creating a cluster with multi-architecture compute machines on Google Cloud
To deploy a cluster on Google Cloud with multi-architecture compute machines, you must first create a single-architecture installer-provisioned cluster that uses the multi-architecture installer binary.
You can also migrate your current cluster with single-architecture compute machines to a cluster with multi-architecture compute machines. After creating a multi-architecture cluster, you can add nodes with different architectures to the cluster.
3.4.1. Adding a multi-architecture compute machine set to your Google Cloud cluster
After creating a multi-architecture cluster, you can add nodes with different architectures.
You can add multi-architecture compute machines to a multi-architecture cluster in the following ways:
- Adding 64-bit x86 compute machines to a cluster that uses 64-bit ARM control plane machines and already includes 64-bit ARM compute machines. In this case, 64-bit x86 is considered the secondary architecture.
- Adding 64-bit ARM compute machines to a cluster that uses 64-bit x86 control plane machines and already includes 64-bit x86 compute machines. In this case, 64-bit ARM is considered the secondary architecture.
Before adding a secondary architecture node to your cluster, it is recommended to install the Multiarch Tuning Operator, and deploy a ClusterPodPlacementConfig custom resource. For more information, see "Managing workloads on multi-architecture clusters by using the Multiarch Tuning Operator".
Prerequisites
- You installed the OpenShift CLI (oc).
- You used the installation program to create a 64-bit ARM or 64-bit x86 single-architecture cluster with the multi-architecture installer binary.
Procedure
- Log in to the OpenShift CLI (oc).
- Create a YAML file and add the configuration to create a compute machine set to control the 64-bit ARM or 64-bit x86 compute nodes in your cluster.
Example MachineSet object for a Google Cloud 64-bit ARM or 64-bit x86 compute node

apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
metadata:
  labels:
    machine.openshift.io/cluster-api-cluster: <infrastructure_id>
  name: <infrastructure_id>-w-a
  namespace: openshift-machine-api
spec:
  replicas: 1
  selector:
    matchLabels:
      machine.openshift.io/cluster-api-cluster: <infrastructure_id>
      machine.openshift.io/cluster-api-machineset: <infrastructure_id>-w-a
  template:
    metadata:
      creationTimestamp: null
      labels:
        machine.openshift.io/cluster-api-cluster: <infrastructure_id>
        machine.openshift.io/cluster-api-machine-role: <role>
        machine.openshift.io/cluster-api-machine-type: <role>
        machine.openshift.io/cluster-api-machineset: <infrastructure_id>-w-a
    spec:
      metadata:
        labels:
          node-role.kubernetes.io/<role>: ""
      providerSpec:
        value:
          apiVersion: gcpprovider.openshift.io/v1beta1
          canIPForward: false
          credentialsSecret:
            name: gcp-cloud-credentials
          deletionProtection: false
          disks:
            - autoDelete: true
              boot: true
              image: <path_to_image>
              labels: null
              sizeGb: 128
              type: pd-ssd
          gcpMetadata:
            - key: <custom_metadata_key>
              value: <custom_metadata_value>
          kind: GCPMachineProviderSpec
          machineType: n1-standard-4
          metadata:
            creationTimestamp: null
          networkInterfaces:
            - network: <infrastructure_id>-network
              subnetwork: <infrastructure_id>-worker-subnet
          projectID: <project_name>
          region: us-central1
          serviceAccounts:
            - email: <infrastructure_id>-w@<project_name>.iam.gserviceaccount.com
              scopes:
                - https://www.googleapis.com/auth/cloud-platform
          tags:
            - <infrastructure_id>-worker
          userDataSecret:
            name: worker-user-data
          zone: us-central1-a

where:
<infrastructure_id>
  Specifies the infrastructure ID that is based on the cluster ID that you set when you provisioned the cluster. You can obtain the infrastructure ID by running the following command:
  $ oc get -o jsonpath='{.status.infrastructureName}{"\n"}' infrastructure cluster
<role>
  Specifies the role node label to add.
<path_to_image>
  Specifies the path to the image that is used in current compute machine sets. You need the project and image name for your path to image. To access the project and image name, run the following command:
  $ oc get configmap/coreos-bootimages \
      -n openshift-machine-config-operator \
      -o jsonpath='{.data.stream}' | jq \
      -r '.architectures.aarch64.images.gcp'
  Example output
  "gcp": {
    "release": "415.92.202309142014-0",
    "project": "rhcos-cloud",
    "name": "rhcos-415-92-202309142014-0-gcp-aarch64"
  }
  Use the project and name parameters from the output to create the path to the image in your machine set. The path to the image must use the following format:
  projects/<project>/global/images/<image_name>
gcpMetadata
  Optional parameter. Specifies custom metadata in the form of a key:value pair. For example use cases, see "Setting custom metadata".
machineType
  Specifies a machine type that aligns with the CPU architecture of the chosen OS image. For more information, see "Tested instance types for Google Cloud on 64-bit ARM infrastructures".
projectID
  Specifies the name of the Google Cloud project that you use for your cluster.
region
  Specifies the region. For example, us-central1. Ensure that the region you select has machines with the required architecture.
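As a concrete illustration of the <path_to_image> format described above, the project and name values shown in the example output produce the following path:

projects/rhcos-cloud/global/images/rhcos-415-92-202309142014-0-gcp-aarch64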
Create the compute machine set by running the following command:
$ oc create -f <file_name>
Replace <file_name> with the name of the YAML file with the compute machine set configuration. For example: gcp-arm64-machine-set-0.yaml or gcp-amd64-machine-set-0.yaml.
Verification
Verify that the new machines are running by running the following command:
$ oc get machineset -n openshift-machine-api
The output must include the machine set that you created.
Example output
NAME                                     DESIRED   CURRENT   READY   AVAILABLE   AGE
<infrastructure_id>-gcp-machine-set-0    2         2         2       2           10m
You can check if the nodes are ready and schedulable by running the following command:
$ oc get nodes
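In addition to checking the nodes, you can inspect the individual machines that the machine set created, including their phase and the zone in which they were placed:

$ oc get machines -n openshift-machine-api -o wide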
3.5. Creating a cluster with multi-architecture compute machines on bare metal, IBM Power, or IBM Z
To create a cluster with multi-architecture compute machines on bare metal (x86_64 or aarch64), IBM Power® (ppc64le), or IBM Z® (s390x), you must have an existing single-architecture cluster on one of these platforms. Follow the installation procedure for your platform:
- Installing a user provisioned cluster on bare metal. You can then add 64-bit ARM compute machines to your OpenShift Container Platform cluster on bare metal.
- Installing a cluster on IBM Power®. You can then add x86_64 compute machines to your OpenShift Container Platform cluster on IBM Power®.
- Installing a cluster on IBM Z® and IBM® LinuxONE. You can then add x86_64 compute machines to your OpenShift Container Platform cluster on IBM Z® and IBM® LinuxONE.
The bare metal installer-provisioned infrastructure and the Bare Metal Operator do not support adding secondary architecture nodes during the initial cluster setup. You can add secondary architecture nodes manually only after the initial cluster setup.
Before you can add additional compute nodes to your cluster, you must upgrade your cluster to one that uses the multi-architecture payload. For more information on migrating to the multi-architecture payload, see Migrating to a cluster with multi-architecture compute machines.
The following procedures explain how to create a RHCOS compute machine using an ISO image or network PXE booting. This allows you to add additional nodes to your cluster and deploy a cluster with multi-architecture compute machines.
Before adding a secondary architecture node to your cluster, it is recommended to install the Multiarch Tuning Operator, and deploy a ClusterPodPlacementConfig object. For more information, see Managing workloads on multi-architecture clusters by using the Multiarch Tuning Operator.
3.5.1. Creating RHCOS machines using an ISO image
You can create more Red Hat Enterprise Linux CoreOS (RHCOS) compute machines for your bare metal cluster by using an ISO image to create the machines.
Prerequisites
- Obtain the URL of the Ignition config file for the compute machines for your cluster. You uploaded this file to your HTTP server during installation.
- You must have the OpenShift CLI (oc) installed.
Procedure
Extract the Ignition config file from the cluster by running the following command:
$ oc extract -n openshift-machine-api secret/worker-user-data-managed --keys=userData --to=- > worker.ign
- Upload the worker.ign Ignition config file you exported from your cluster to your HTTP server. Note the URL of this file. You can validate that the Ignition config file is available at the URL. The following example gets the Ignition config file for the compute node:
  $ curl -k http://<HTTP_server>/worker.ign
- You can access the ISO image for booting your new machine by running the following command:
  $ RHCOS_VHD_ORIGIN_URL=$(oc -n openshift-machine-config-operator get configmap/coreos-bootimages -o jsonpath='{.data.stream}' | jq -r '.architectures.<architecture>.artifacts.metal.formats.iso.disk.location')
- Use the ISO file to install RHCOS on more compute machines. Use the same method that you used when you created machines before you installed the cluster:
- Burn the ISO image to a disk and boot it directly.
- Use ISO redirection with a LOM interface.
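The coreos-installer step later in this procedure requires the SHA512 digest of the Ignition config file for its --ignition-hash option. A minimal sketch for computing it, assuming the worker.ign file you extracted earlier is in your current directory:

$ sha512sum worker.ign | awk '{print $1}'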
- Boot the RHCOS ISO image without specifying any options or interrupting the live boot sequence. Wait for the installer to boot into a shell prompt in the RHCOS live environment.
  Note: You can interrupt the RHCOS installation boot process to add kernel arguments. However, for this ISO procedure you must use the coreos-installer command as outlined in the following steps, instead of adding kernel arguments.
- Run the coreos-installer command and specify the options that meet your installation requirements. At a minimum, you must specify the URL that points to the Ignition config file for the node type, and the device that you are installing to:
  $ sudo coreos-installer install --ignition-url=http://<HTTP_server>/<node_type>.ign <device> --ignition-hash=sha512-<digest>
  You must run the coreos-installer command by using sudo, because the core user does not have the required root privileges to perform the installation.
  The --ignition-hash option is required when the Ignition config file is obtained through an HTTP URL to validate the authenticity of the Ignition config file on the cluster node. <digest> is the Ignition config file SHA512 digest obtained in a preceding step.
  Note: If you want to provide your Ignition config files through an HTTPS server that uses TLS, you can add the internal certificate authority (CA) to the system trust store before running coreos-installer.
  The following example initializes a compute node installation to the /dev/sda device. The Ignition config file for the compute node is obtained from an HTTP web server with the IP address 192.168.1.2:
  $ sudo coreos-installer install --ignition-url=http://192.168.1.2:80/installation_directory/worker.ign /dev/sda --ignition-hash=sha512-a5a2d43879223273c9b60af66b44202a1d1248fc01cf156c46d4a79f552b6bad47bc8cc78ddf0116e80c59d2ea9e32ba53bc807afbca581aa059311def2c3e3b
- Monitor the progress of the RHCOS installation on the console of the machine.
  Important: Ensure that the installation is successful on each node before commencing with the OpenShift Container Platform installation. Observing the installation process can also help to determine the cause of RHCOS installation issues that might arise.
- Continue to create more compute machines for your cluster.
3.5.2. Creating RHCOS machines by PXE or iPXE booting
You can create more Red Hat Enterprise Linux CoreOS (RHCOS) compute machines for your bare metal cluster by using PXE or iPXE booting.
Prerequisites
- Obtain the URL of the Ignition config file for the compute machines for your cluster. You uploaded this file to your HTTP server during installation.
- Obtain the URLs of the RHCOS ISO image, compressed metal BIOS, kernel, and initramfs files that you uploaded to your HTTP server during cluster installation.
- You have access to the PXE booting infrastructure that you used to create the machines for your OpenShift Container Platform cluster during installation. The machines must boot from their local disks after RHCOS is installed on them.
- If you use UEFI, you have access to the grub.conf file that you modified during OpenShift Container Platform installation.
Procedure
Confirm that your PXE or iPXE installation for the RHCOS images is correct.
For PXE:
DEFAULT pxeboot TIMEOUT 20 PROMPT 0 LABEL pxeboot KERNEL http://<HTTP_server>/rhcos-<version>-live-kernel-<architecture>1 APPEND initrd=http://<HTTP_server>/rhcos-<version>-live-initramfs.<architecture>.img coreos.inst.install_dev=/dev/sda coreos.inst.ignition_url=http://<HTTP_server>/worker.ign coreos.live.rootfs_url=http://<HTTP_server>/rhcos-<version>-live-rootfs.<architecture>.img2 - 1
- Specify the location of the live
kernelfile that you uploaded to your HTTP server. - 2
- Specify locations of the RHCOS files that you uploaded to your HTTP server. The
initrdparameter value is the location of the liveinitramfsfile, thecoreos.inst.ignition_urlparameter value is the location of the worker Ignition config file, and thecoreos.live.rootfs_urlparameter value is the location of the liverootfsfile. Thecoreos.inst.ignition_urlandcoreos.live.rootfs_urlparameters only support HTTP and HTTPS.
NoteThis configuration does not enable serial console access on machines with a graphical console. To configure a different console, add one or more
console=arguments to theAPPENDline. For example, addconsole=tty0 console=ttyS0to set the first PC serial port as the primary console and the graphical console as a secondary console. For more information, see How does one set up a serial terminal and/or console in Red Hat Enterprise Linux?.For iPXE (
x86_64+aarch64):kernel http://<HTTP_server>/rhcos-<version>-live-kernel-<architecture> initrd=main coreos.live.rootfs_url=http://<HTTP_server>/rhcos-<version>-live-rootfs.<architecture>.img coreos.inst.install_dev=/dev/sda coreos.inst.ignition_url=http://<HTTP_server>/worker.ign1 2 initrd --name main http://<HTTP_server>/rhcos-<version>-live-initramfs.<architecture>.img3 boot- 1
- Specify the locations of the RHCOS files that you uploaded to your HTTP server. The
kernelparameter value is the location of thekernelfile, theinitrd=mainargument is needed for booting on UEFI systems, thecoreos.live.rootfs_urlparameter value is the location of therootfsfile, and thecoreos.inst.ignition_urlparameter value is the location of the worker Ignition config file. - 2
- If you use multiple NICs, specify a single interface in the
ipoption. For example, to use DHCP on a NIC that is namedeno1, setip=eno1:dhcp. - 3
- Specify the location of the
initramfsfile that you uploaded to your HTTP server.
NoteThis configuration does not enable serial console access on machines with a graphical console To configure a different console, add one or more
console=arguments to thekernelline. For example, addconsole=tty0 console=ttyS0to set the first PC serial port as the primary console and the graphical console as a secondary console. For more information, see How does one set up a serial terminal and/or console in Red Hat Enterprise Linux? and "Enabling the serial console for PXE and ISO installation" in the "Advanced RHCOS installation configuration" section.NoteTo network boot the CoreOS
kernelonaarch64architecture, you need to use a version of iPXE build with theIMAGE_GZIPoption enabled. SeeIMAGE_GZIPoption in iPXE.For PXE (with UEFI and GRUB as second stage) on
aarch64:menuentry 'Install CoreOS' { linux rhcos-<version>-live-kernel-<architecture> coreos.live.rootfs_url=http://<HTTP_server>/rhcos-<version>-live-rootfs.<architecture>.img coreos.inst.install_dev=/dev/sda coreos.inst.ignition_url=http://<HTTP_server>/worker.ign1 2 initrd rhcos-<version>-live-initramfs.<architecture>.img3 }- 1
- Specify the locations of the RHCOS files that you uploaded to your HTTP/TFTP server. The
kernelparameter value is the location of thekernelfile on your TFTP server. Thecoreos.live.rootfs_urlparameter value is the location of therootfsfile, and thecoreos.inst.ignition_urlparameter value is the location of the worker Ignition config file on your HTTP Server. - 2
- If you use multiple NICs, specify a single interface in the
ipoption. For example, to use DHCP on a NIC that is namedeno1, setip=eno1:dhcp. - 3
- Specify the location of the
initramfsfile that you uploaded to your TFTP server.
- Use the PXE or iPXE infrastructure to create the required compute machines for your cluster.
3.5.3. Approving the certificate signing requests for your machines
To add machines to a cluster, verify the status of the certificate signing requests (CSRs) generated for each machine. If manual approval is required, approve the client requests first, followed by the server requests.
Prerequisites
- You added machines to your cluster.
Procedure
Confirm that the cluster recognizes the machines:
$ oc get nodes
Example output
NAME       STATUS   ROLES    AGE   VERSION
master-0   Ready    master   63m   v1.32.3
master-1   Ready    master   63m   v1.32.3
master-2   Ready    master   64m   v1.32.3
The output lists all of the machines that you created.
Note: The preceding output might not include the compute nodes, also known as worker nodes, until some CSRs are approved.
Review the pending CSRs and ensure that you see the client requests with the Pending or Approved status for each machine that you added to the cluster:
$ oc get csr
Example output
NAME        AGE   REQUESTOR                                                                   CONDITION
csr-8b2br   15m   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Pending
csr-8vnps   15m   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Pending
...
In this example, two machines are joining the cluster. You might see more approved CSRs in the list.
If the CSRs were not approved, and all of the pending CSRs for the machines you added are in Pending status, approve the CSRs for your cluster machines:
Note: Because the CSRs rotate automatically, approve your CSRs within an hour of adding the machines to the cluster. If you do not approve them within an hour, the certificates will rotate, and more than two certificates will be present for each node. You must approve all of these certificates. After the client CSR is approved, the Kubelet creates a secondary CSR for the serving certificate, which requires manual approval. Then, subsequent serving certificate renewal requests are automatically approved by the machine-approver if the Kubelet requests a new certificate with identical parameters.
Note: For clusters running on platforms that are not machine API enabled, such as bare metal and other user-provisioned infrastructure, you must implement a method of automatically approving the kubelet serving certificate requests (CSRs). If a request is not approved, then the oc exec, oc rsh, and oc logs commands cannot succeed, because a serving certificate is required when the API server connects to the kubelet. Any operation that contacts the Kubelet endpoint requires this certificate approval to be in place. The method must watch for new CSRs, confirm that the CSR was submitted by the node-bootstrapper service account in the system:node or system:admin groups, and confirm the identity of the node.
To approve them individually, run the following command for each valid CSR:
$ oc adm certificate approve <csr_name>
where:
<csr_name>
  Specifies the name of a CSR from the list of current CSRs.
To approve all pending CSRs, run the following command:
$ oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs --no-run-if-empty oc adm certificate approve
Note: Some Operators might not become available until some CSRs are approved.
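Because new CSRs can keep appearing while several machines join, you might prefer to run the approve-all command in a loop until no requests remain pending. The following is only a sketch that reuses the command above; stop it once all of the expected nodes reach the Ready status.

$ while oc get csr 2>/dev/null | grep -q Pending; do
    oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs --no-run-if-empty oc adm certificate approve
    sleep 10
  done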
Now that your client requests are approved, you must review the server requests for each machine that you added to the cluster:
$ oc get csr
Example output
NAME        AGE     REQUESTOR                                                CONDITION
csr-bfd72   5m26s   system:node:ip-10-0-50-126.us-east-2.compute.internal   Pending
csr-c57lv   5m26s   system:node:ip-10-0-95-157.us-east-2.compute.internal   Pending
...
If the remaining CSRs are not approved, and are in the Pending status, approve the CSRs for your cluster machines:
To approve them individually, run the following command for each valid CSR:
$ oc adm certificate approve <csr_name>
where:
<csr_name>
  Specifies the name of a CSR from the list of current CSRs.
To approve all pending CSRs, run the following command:
$ oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs oc adm certificate approve
After all client and server CSRs have been approved, the machines have the Ready status. Verify this by running the following command:
$ oc get nodes
Example output
NAME       STATUS   ROLES    AGE   VERSION
master-0   Ready    master   73m   v1.32.3
master-1   Ready    master   73m   v1.32.3
master-2   Ready    master   74m   v1.32.3
worker-0   Ready    worker   11m   v1.32.3
worker-1   Ready    worker   11m   v1.32.3
Note: It can take a few minutes after approval of the server CSRs for the machines to transition to the Ready status.
3.6. Creating a cluster with multi-architecture compute machines on IBM Z and IBM LinuxONE with z/VM
To create a cluster with multi-architecture compute machines on IBM Z® and IBM® LinuxONE (s390x) with z/VM, you must have an existing single-architecture x86_64 cluster. You can then add s390x compute machines to your OpenShift Container Platform cluster.
Before you can add s390x nodes to your cluster, you must upgrade your cluster to one that uses the multi-architecture payload. For more information on migrating to the multi-architecture payload, see Migrating to a cluster with multi-architecture compute machines.
The following procedures explain how to create a RHCOS compute machine using a z/VM instance. This will allow you to add s390x nodes to your cluster and deploy a cluster with multi-architecture compute machines.
To create an IBM Z® or IBM® LinuxONE (s390x) cluster with multi-architecture compute machines on x86_64, follow the instructions for Installing a cluster on IBM Z® and IBM® LinuxONE. You can then add x86_64 compute machines as described in Creating a cluster with multi-architecture compute machines on bare metal, IBM Power, or IBM Z.
Before adding a secondary architecture node to your cluster, it is recommended to install the Multiarch Tuning Operator, and deploy a ClusterPodPlacementConfig object. For more information, see Managing workloads on multi-architecture clusters by using the Multiarch Tuning Operator.
3.6.1. Creating RHCOS machines on IBM Z with z/VM
You can create more Red Hat Enterprise Linux CoreOS (RHCOS) compute machines running on IBM Z® with z/VM and attach them to your existing cluster.
Prerequisites
- You have a domain name server (DNS) that can perform hostname and reverse lookup for the nodes.
- You have an HTTP or HTTPS server running on your provisioning machine that is accessible to the machines you create.
Procedure
Extract the Ignition config file from the cluster by running the following command:
$ oc extract -n openshift-machine-api secret/worker-user-data-managed --keys=userData --to=- > worker.ign
- Upload the worker.ign Ignition config file you exported from your cluster to your HTTP server. Note the URL of this file. You can validate that the Ignition file is available on the URL. The following example gets the Ignition config file for the compute node:
  $ curl -k http://<http_server>/worker.ign
- Download the RHEL live kernel, initramfs, and rootfs files by running the following commands:
  $ curl -LO $(oc -n openshift-machine-config-operator get configmap/coreos-bootimages -o jsonpath='{.data.stream}' \
    | jq -r '.architectures.s390x.artifacts.metal.formats.pxe.kernel.location')
  $ curl -LO $(oc -n openshift-machine-config-operator get configmap/coreos-bootimages -o jsonpath='{.data.stream}' \
    | jq -r '.architectures.s390x.artifacts.metal.formats.pxe.initramfs.location')
  $ curl -LO $(oc -n openshift-machine-config-operator get configmap/coreos-bootimages -o jsonpath='{.data.stream}' \
    | jq -r '.architectures.s390x.artifacts.metal.formats.pxe.rootfs.location')
- Move the downloaded RHEL live kernel, initramfs, and rootfs files to an HTTP or HTTPS server that is accessible from the RHCOS guest you want to add.
- Create a parameter file for the guest. The following parameters are specific for the virtual machine:
Optional: To specify a static IP address, add an
ip=parameter with the following entries, with each separated by a colon:- The IP address for the machine.
- An empty string.
- The gateway.
- The netmask.
-
The machine host and domain name in the form
hostname.domainname. If you omit this value, RHCOS obtains the hostname through a reverse DNS lookup. - The network interface name. If you omit this value, RHCOS applies the IP configuration to all available interfaces.
-
The value
none.
-
For
coreos.inst.ignition_url=, specify the URL to theworker.ignfile. Only HTTP and HTTPS protocols are supported. -
For
coreos.live.rootfs_url=, specify the matching rootfs artifact for thekernelandinitramfsyou are booting. Only HTTP and HTTPS protocols are supported. For installations on DASD-type disks, complete the following tasks:
-
For
coreos.inst.install_dev=, specify/dev/dasda. -
Use
rd.dasd=to specify the DASD where RHCOS is to be installed. You can adjust further parameters if required.
The following is an example parameter file,
additional-worker-dasd.parm:cio_ignore=all,!condev rd.neednet=1 \ console=ttysclp0 \ coreos.inst.install_dev=/dev/dasda \ coreos.inst.ignition_url=http://<http_server>/worker.ign \ coreos.live.rootfs_url=http://<http_server>/rhcos-<version>-live-rootfs.<architecture>.img \ ip=<ip>::<gateway>:<netmask>:<hostname>::none nameserver=<dns> \ rd.znet=qeth,0.0.bdf0,0.0.bdf1,0.0.bdf2,layer2=1,portno=0 \ rd.dasd=0.0.3490 \ zfcp.allow_lun_scan=0Write all options in the parameter file as a single line and make sure that you have no newline characters.
-
For
For installations on FCP-type disks, complete the following tasks:
Use
rd.zfcp=<adapter>,<wwpn>,<lun>to specify the FCP disk where RHCOS is to be installed. For multipathing, repeat this step for each additional path.NoteWhen you install with multiple paths, you must enable multipathing directly after the installation, not at a later point in time, as this can cause problems.
Set the install device as:
coreos.inst.install_dev=/dev/sda.NoteIf additional LUNs are configured with NPIV, FCP requires
zfcp.allow_lun_scan=0. If you must enablezfcp.allow_lun_scan=1because you use a CSI driver, for example, you must configure your NPIV so that each node cannot access the boot partition of another node.You can adjust further parameters if required.
ImportantAdditional postinstallation steps are required to fully enable multipathing. For more information, see “Enabling multipathing with kernel arguments on RHCOS" in Machine configuration.
The following is an example parameter file,
additional-worker-fcp.parmfor a worker node with multipathing:cio_ignore=all,!condev rd.neednet=1 \ console=ttysclp0 \ coreos.inst.install_dev=/dev/sda \ coreos.live.rootfs_url=http://<http_server>/rhcos-<version>-live-rootfs.<architecture>.img \ coreos.inst.ignition_url=http://<http_server>/worker.ign \ ip=<ip>::<gateway>:<netmask>:<hostname>::none nameserver=<dns> \ rd.znet=qeth,0.0.bdf0,0.0.bdf1,0.0.bdf2,layer2=1,portno=0 \ zfcp.allow_lun_scan=0 \ rd.zfcp=0.0.1987,0x50050763070bc5e3,0x4008400B00000000 \ rd.zfcp=0.0.19C7,0x50050763070bc5e3,0x4008400B00000000 \ rd.zfcp=0.0.1987,0x50050763071bc5e3,0x4008400B00000000 \ rd.zfcp=0.0.19C7,0x50050763071bc5e3,0x4008400B00000000Write all options in the parameter file as a single line and make sure that you have no newline characters.
-
Transfer the
initramfs,kernel, parameter files, and RHCOS images to z/VM, for example, by using FTP. For details about how to transfer the files with FTP and boot from the virtual reader, see Booting the installation on IBM Z® to install RHEL in z/VM. Punch the files to the virtual reader of the z/VM guest virtual machine.
See PUNCH in IBM® Documentation.
TipYou can use the CP PUNCH command or, if you use Linux, the vmur command to transfer files between two z/VM guest virtual machines.
- Log in to CMS on the bootstrap machine.
IPL the bootstrap machine from the reader by running the following command:
$ ipl cSee IPL in IBM® Documentation.
3.6.2. Approving the certificate signing requests for your machines
To add machines to a cluster, verify the status of the certificate signing requests (CSRs) generated for each machine. If manual approval is required, approve the client requests first, followed by the server requests.
Prerequisites
- You added machines to your cluster.
Procedure
Confirm that the cluster recognizes the machines:
$ oc get nodes
Example output
NAME       STATUS   ROLES    AGE   VERSION
master-0   Ready    master   63m   v1.32.3
master-1   Ready    master   63m   v1.32.3
master-2   Ready    master   64m   v1.32.3
The output lists all of the machines that you created.
Note: The preceding output might not include the compute nodes, also known as worker nodes, until some CSRs are approved.
Review the pending CSRs and ensure that you see the client requests with the Pending or Approved status for each machine that you added to the cluster:
$ oc get csr
Example output
NAME        AGE   REQUESTOR                                                                   CONDITION
csr-8b2br   15m   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Pending
csr-8vnps   15m   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Pending
...
In this example, two machines are joining the cluster. You might see more approved CSRs in the list.
If the CSRs were not approved, and all of the pending CSRs for the machines you added are in Pending status, approve the CSRs for your cluster machines:
Note: Because the CSRs rotate automatically, approve your CSRs within an hour of adding the machines to the cluster. If you do not approve them within an hour, the certificates will rotate, and more than two certificates will be present for each node. You must approve all of these certificates. After the client CSR is approved, the Kubelet creates a secondary CSR for the serving certificate, which requires manual approval. Then, subsequent serving certificate renewal requests are automatically approved by the machine-approver if the Kubelet requests a new certificate with identical parameters.
Note: For clusters running on platforms that are not machine API enabled, such as bare metal and other user-provisioned infrastructure, you must implement a method of automatically approving the kubelet serving certificate requests (CSRs). If a request is not approved, then the oc exec, oc rsh, and oc logs commands cannot succeed, because a serving certificate is required when the API server connects to the kubelet. Any operation that contacts the Kubelet endpoint requires this certificate approval to be in place. The method must watch for new CSRs, confirm that the CSR was submitted by the node-bootstrapper service account in the system:node or system:admin groups, and confirm the identity of the node.
To approve them individually, run the following command for each valid CSR:
$ oc adm certificate approve <csr_name>
where:
<csr_name>
  Specifies the name of a CSR from the list of current CSRs.
To approve all pending CSRs, run the following command:
$ oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs --no-run-if-empty oc adm certificate approve
Note: Some Operators might not become available until some CSRs are approved.
Now that your client requests are approved, you must review the server requests for each machine that you added to the cluster:
$ oc get csr
Example output
NAME        AGE     REQUESTOR                                                CONDITION
csr-bfd72   5m26s   system:node:ip-10-0-50-126.us-east-2.compute.internal   Pending
csr-c57lv   5m26s   system:node:ip-10-0-95-157.us-east-2.compute.internal   Pending
...
If the remaining CSRs are not approved, and are in the Pending status, approve the CSRs for your cluster machines:
To approve them individually, run the following command for each valid CSR:
$ oc adm certificate approve <csr_name>
where:
<csr_name>
  Specifies the name of a CSR from the list of current CSRs.
To approve all pending CSRs, run the following command:
$ oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs oc adm certificate approve
After all client and server CSRs have been approved, the machines have the Ready status. Verify this by running the following command:
$ oc get nodes
Example output
NAME       STATUS   ROLES    AGE   VERSION
master-0   Ready    master   73m   v1.32.3
master-1   Ready    master   73m   v1.32.3
master-2   Ready    master   74m   v1.32.3
worker-0   Ready    worker   11m   v1.32.3
worker-1   Ready    worker   11m   v1.32.3
Note: It can take a few minutes after approval of the server CSRs for the machines to transition to the Ready status.
3.7. Creating a cluster with multi-architecture compute machines on IBM Z and IBM LinuxONE in an LPAR
To create a cluster with multi-architecture compute machines on IBM Z® and IBM® LinuxONE (s390x) in an LPAR, you must have an existing single-architecture x86_64 cluster. You can then add s390x compute machines to your OpenShift Container Platform cluster.
Before you can add s390x nodes to your cluster, you must upgrade your cluster to one that uses the multi-architecture payload. For more information on migrating to the multi-architecture payload, see Migrating to a cluster with multi-architecture compute machines.
The following procedures explain how to create a RHCOS compute machine using an LPAR instance. This will allow you to add s390x nodes to your cluster and deploy a cluster with multi-architecture compute machines.
To create an IBM Z® or IBM® LinuxONE (s390x) cluster with multi-architecture compute machines on x86_64, follow the instructions for Installing a cluster on IBM Z® and IBM® LinuxONE. You can then add x86_64 compute machines as described in Creating a cluster with multi-architecture compute machines on bare metal, IBM Power, or IBM Z.
3.7.1. Creating RHCOS machines on IBM Z in an LPAR
You can create more Red Hat Enterprise Linux CoreOS (RHCOS) compute machines running on IBM Z® in a logical partition (LPAR) and attach them to your existing cluster.
Prerequisites
- You have a domain name server (DNS) that can perform hostname and reverse lookup for the nodes.
- You have an HTTP or HTTPS server running on your provisioning machine that is accessible to the machines you create.
Procedure
Extract the Ignition config file from the cluster by running the following command:
$ oc extract -n openshift-machine-api secret/worker-user-data-managed --keys=userData --to=- > worker.ign
- Upload the worker.ign Ignition config file you exported from your cluster to your HTTP server. Note the URL of this file. You can validate that the Ignition file is available on the URL. The following example gets the Ignition config file for the compute node:
  $ curl -k http://<http_server>/worker.ign
- Download the RHEL live kernel, initramfs, and rootfs files by running the following commands:
  $ curl -LO $(oc -n openshift-machine-config-operator get configmap/coreos-bootimages -o jsonpath='{.data.stream}' \
    | jq -r '.architectures.s390x.artifacts.metal.formats.pxe.kernel.location')
  $ curl -LO $(oc -n openshift-machine-config-operator get configmap/coreos-bootimages -o jsonpath='{.data.stream}' \
    | jq -r '.architectures.s390x.artifacts.metal.formats.pxe.initramfs.location')
  $ curl -LO $(oc -n openshift-machine-config-operator get configmap/coreos-bootimages -o jsonpath='{.data.stream}' \
    | jq -r '.architectures.s390x.artifacts.metal.formats.pxe.rootfs.location')
- Move the downloaded RHEL live kernel, initramfs, and rootfs files to an HTTP or HTTPS server that is accessible from the RHCOS guest you want to add.
- Create a parameter file for the guest. The following parameters are specific for the virtual machine:
Optional: To specify a static IP address, add an
ip=parameter with the following entries, with each separated by a colon:- The IP address for the machine.
- An empty string.
- The gateway.
- The netmask.
-
The machine host and domain name in the form
hostname.domainname. If you omit this value, RHCOS obtains the hostname through a reverse DNS lookup. - The network interface name. If you omit this value, RHCOS applies the IP configuration to all available interfaces.
-
The value
none.
-
For
coreos.inst.ignition_url=, specify the URL to theworker.ignfile. Only HTTP and HTTPS protocols are supported. -
For
coreos.live.rootfs_url=, specify the matching rootfs artifact for thekernelandinitramfsyou are booting. Only HTTP and HTTPS protocols are supported. For installations on DASD-type disks, complete the following tasks:
-
For
coreos.inst.install_dev=, specify/dev/dasda. -
Use
rd.dasd=to specify the DASD where RHCOS is to be installed. You can adjust further parameters if required.
The following is an example parameter file, additional-worker-dasd.parm:
cio_ignore=all,!condev rd.neednet=1 \
console=ttysclp0 \
coreos.inst.install_dev=/dev/dasda \
coreos.inst.ignition_url=http://<http_server>/worker.ign \
coreos.live.rootfs_url=http://<http_server>/rhcos-<version>-live-rootfs.<architecture>.img \
ip=<ip>::<gateway>:<netmask>:<hostname>::none nameserver=<dns> \
rd.znet=qeth,0.0.bdf0,0.0.bdf1,0.0.bdf2,layer2=1,portno=0 \
rd.dasd=0.0.3490 \
zfcp.allow_lun_scan=0
Write all options in the parameter file as a single line and make sure that you have no newline characters.
- For installations on FCP-type disks, complete the following tasks:
- Use rd.zfcp=<adapter>,<wwpn>,<lun> to specify the FCP disk where RHCOS is to be installed. For multipathing, repeat this step for each additional path.
Note: When you install with multiple paths, you must enable multipathing directly after the installation, not at a later point in time, as this can cause problems.
- Set the install device as coreos.inst.install_dev=/dev/sda.
Note: If additional LUNs are configured with NPIV, FCP requires zfcp.allow_lun_scan=0. If you must enable zfcp.allow_lun_scan=1 because you use a CSI driver, for example, you must configure your NPIV so that each node cannot access the boot partition of another node.
- You can adjust further parameters if required.
Important: Additional postinstallation steps are required to fully enable multipathing. For more information, see "Enabling multipathing with kernel arguments on RHCOS" in Machine configuration.
The following is an example parameter file, additional-worker-fcp.parm, for a worker node with multipathing:
cio_ignore=all,!condev rd.neednet=1 \
console=ttysclp0 \
coreos.inst.install_dev=/dev/sda \
coreos.live.rootfs_url=http://<http_server>/rhcos-<version>-live-rootfs.<architecture>.img \
coreos.inst.ignition_url=http://<http_server>/worker.ign \
ip=<ip>::<gateway>:<netmask>:<hostname>::none nameserver=<dns> \
rd.znet=qeth,0.0.bdf0,0.0.bdf1,0.0.bdf2,layer2=1,portno=0 \
zfcp.allow_lun_scan=0 \
rd.zfcp=0.0.1987,0x50050763070bc5e3,0x4008400B00000000 \
rd.zfcp=0.0.19C7,0x50050763070bc5e3,0x4008400B00000000 \
rd.zfcp=0.0.1987,0x50050763071bc5e3,0x4008400B00000000 \
rd.zfcp=0.0.19C7,0x50050763071bc5e3,0x4008400B00000000
Write all options in the parameter file as a single line and make sure that you have no newline characters.
- Transfer the initramfs, kernel, parameter files, and RHCOS images to the LPAR, for example with FTP. For details about how to transfer the files with FTP and boot, see Booting the installation on IBM Z® to install RHEL in an LPAR.
- Boot the machine.
3.7.2. Approving the certificate signing requests for your machines
To add machines to a cluster, verify the status of the certificate signing requests (CSRs) generated for each machine. If manual approval is required, approve the client requests first, followed by the server requests.
Prerequisites
- You added machines to your cluster.
Procedure
Confirm that the cluster recognizes the machines:
$ oc get nodes
Example output
NAME       STATUS   ROLES    AGE   VERSION
master-0   Ready    master   63m   v1.32.3
master-1   Ready    master   63m   v1.32.3
master-2   Ready    master   64m   v1.32.3
The output lists all of the machines that you created.
Note: The preceding output might not include the compute nodes, also known as worker nodes, until some CSRs are approved.
Review the pending CSRs and ensure that you see the client requests with the Pending or Approved status for each machine that you added to the cluster:
$ oc get csr
Example output
NAME        AGE   REQUESTOR                                                                    CONDITION
csr-8b2br   15m   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Pending
csr-8vnps   15m   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Pending
...
In this example, two machines are joining the cluster. You might see more approved CSRs in the list.
If the CSRs were not approved, after all of the pending CSRs for the machines you added are in Pending status, approve the CSRs for your cluster machines:
Note: Because the CSRs rotate automatically, approve your CSRs within an hour of adding the machines to the cluster. If you do not approve them within an hour, the certificates will rotate, and more than two certificates will be present for each node. You must approve all of these certificates. After the client CSR is approved, the Kubelet creates a secondary CSR for the serving certificate, which requires manual approval. Then, subsequent serving certificate renewal requests are automatically approved by the machine-approver if the Kubelet requests a new certificate with identical parameters.
Note: For clusters running on platforms that are not machine API enabled, such as bare metal and other user-provisioned infrastructure, you must implement a method of automatically approving the kubelet serving certificate requests (CSRs). If a request is not approved, then the oc exec, oc rsh, and oc logs commands cannot succeed, because a serving certificate is required when the API server connects to the kubelet. Any operation that contacts the Kubelet endpoint requires this certificate approval to be in place. The method must watch for new CSRs, confirm that the CSR was submitted by the node-bootstrapper service account in the system:node or system:admin groups, and confirm the identity of the node.
To approve them individually, run the following command for each valid CSR:
$ oc adm certificate approve <csr_name>where:
<csr_name>- Specifies the name of a CSR from the list of current CSRs.
To approve all pending CSRs, run the following command:
$ oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs --no-run-if-empty oc adm certificate approveNoteSome Operators might not become available until some CSRs are approved.
Now that your client requests are approved, you must review the server requests for each machine that you added to the cluster:
$ oc get csr
Example output
NAME        AGE     REQUESTOR                                                 CONDITION
csr-bfd72   5m26s   system:node:ip-10-0-50-126.us-east-2.compute.internal    Pending
csr-c57lv   5m26s   system:node:ip-10-0-95-157.us-east-2.compute.internal    Pending
...
If the remaining CSRs are not approved, and are in the Pending status, approve the CSRs for your cluster machines:
To approve them individually, run the following command for each valid CSR:
$ oc adm certificate approve <csr_name>where:
<csr_name>- Specifies the name of a CSR from the list of current CSRs.
To approve all pending CSRs, run the following command:
$ oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs oc adm certificate approve
After all client and server CSRs have been approved, the machines have the Ready status. Verify this by running the following command:
$ oc get nodes
Example output
NAME       STATUS   ROLES    AGE   VERSION
master-0   Ready    master   73m   v1.32.3
master-1   Ready    master   73m   v1.32.3
master-2   Ready    master   74m   v1.32.3
worker-0   Ready    worker   11m   v1.32.3
worker-1   Ready    worker   11m   v1.32.3
Note: It can take a few minutes after approval of the server CSRs for the machines to transition to the Ready status.
3.8. Creating a cluster with multi-architecture compute machines on IBM Z and IBM LinuxONE with RHEL KVM
To create a cluster with multi-architecture compute machines on IBM Z® and IBM® LinuxONE (s390x) with RHEL KVM, you must have an existing single-architecture x86_64 cluster. You can then add s390x compute machines to your OpenShift Container Platform cluster.
Before you can add s390x nodes to your cluster, you must upgrade your cluster to one that uses the multi-architecture payload. For more information on migrating to the multi-architecture payload, see Migrating to a cluster with multi-architecture compute machines.
The following procedures explain how to create a RHCOS compute machine using a RHEL KVM instance. This will allow you to add s390x nodes to your cluster and deploy a cluster with multi-architecture compute machines.
To create an IBM Z® or IBM® LinuxONE (s390x) cluster with multi-architecture compute machines on x86_64, follow the instructions for Installing a cluster on IBM Z® and IBM® LinuxONE. You can then add x86_64 compute machines as described in Creating a cluster with multi-architecture compute machines on bare metal, IBM Power, or IBM Z.
Before adding a secondary architecture node to your cluster, it is recommended to install the Multiarch Tuning Operator, and deploy a ClusterPodPlacementConfig object. For more information, see Managing workloads on multi-architecture clusters by using the Multiarch Tuning Operator.
3.8.1. Creating RHCOS machines using virt-install
You can create more Red Hat Enterprise Linux CoreOS (RHCOS) compute machines for your cluster by using virt-install.
Prerequisites
- You have at least one LPAR running on RHEL 8.7 or later with KVM, referred to as RHEL KVM host in this procedure.
- The KVM/QEMU hypervisor is installed on the RHEL KVM host.
- You have a domain name server (DNS) that can perform hostname and reverse lookup for the nodes.
- An HTTP or HTTPS server is set up.
Procedure
Extract the Ignition config file from the cluster by running the following command:
$ oc extract -n openshift-machine-api secret/worker-user-data-managed --keys=userData --to=- > worker.ign
- Upload the worker.ign Ignition config file you exported from your cluster to your HTTP server. Note the URL of this file. You can validate that the Ignition file is available on the URL. The following example gets the Ignition config file for the compute node:
$ curl -k http://<HTTP_server>/worker.ign
Download the RHEL live kernel, initramfs, and rootfs files by running the following commands:
$ curl -LO $(oc -n openshift-machine-config-operator get configmap/coreos-bootimages -o jsonpath='{.data.stream}' | jq -r '.architectures.s390x.artifacts.metal.formats.pxe.kernel.location')
$ curl -LO $(oc -n openshift-machine-config-operator get configmap/coreos-bootimages -o jsonpath='{.data.stream}' | jq -r '.architectures.s390x.artifacts.metal.formats.pxe.initramfs.location')
$ curl -LO $(oc -n openshift-machine-config-operator get configmap/coreos-bootimages -o jsonpath='{.data.stream}' | jq -r '.architectures.s390x.artifacts.metal.formats.pxe.rootfs.location')
- Move the downloaded RHEL live kernel, initramfs, and rootfs files to an HTTP or HTTPS server before you launch virt-install.
Create the new KVM guest nodes using the RHEL
kernel,initramfs, and Ignition files; the new disk image; and adjusted parm line arguments.$ virt-install \ --connect qemu:///system \ --name <vm_name> \ --autostart \ --os-variant rhel9.4 \1 --cpu host \ --vcpus <vcpus> \ --memory <memory_mb> \ --disk <vm_name>.qcow2,size=<image_size> \ --network network=<virt_network_parm> \ --location <media_location>,kernel=<rhcos_kernel>,initrd=<rhcos_initrd> \2 --extra-args "rd.neednet=1" \ --extra-args "coreos.inst.install_dev=/dev/vda" \ --extra-args "coreos.inst.ignition_url=http://<http_server>/worker.ign " \3 --extra-args "coreos.live.rootfs_url=http://<http_server>/rhcos-<version>-live-rootfs.<architecture>.img" \4 --extra-args "ip=<ip>::<gateway>:<netmask>:<hostname>::none" \5 --extra-args "nameserver=<dns>" \ --extra-args "console=ttysclp0" \ --noautoconsole \ --wait- 1
- For
os-variant, specify the RHEL version for the RHCOS compute machine. rhel9.4 is the recommended version. To query the supported RHEL version of your operating system, run the following command:
$ osinfo-query os -f short-id
Note: The os-variant is case sensitive.
- For
--location, specify the location of the kernel/initrd on the HTTP or HTTPS server. - 3
- Specify the location of the
worker.ignconfig file. Only HTTP and HTTPS protocols are supported. - 4
- Specify the location of the
rootfs artifact for the kernel and initramfs you are booting. Only HTTP and HTTPS protocols are supported.
- Optional: For
hostname, specify the fully qualified hostname of the client machine.
Note: If you are using HAProxy as a load balancer, update your HAProxy rules for ingress-router-443 and ingress-router-80 in the /etc/haproxy/haproxy.cfg configuration file.
- Continue to create more compute machines for your cluster.
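If you use HAProxy as described in the preceding note, the following is only a minimal sketch of the kind of entries to add for the new compute machine; the backend names, server name, and IP address are placeholders and must match your existing configuration:
backend ingress-router-443
    mode tcp
    balance source
    server worker-s390x-0 <compute_node_ip>:443 check
backend ingress-router-80
    mode tcp
    balance source
    server worker-s390x-0 <compute_node_ip>:80 check
After you edit the file, reload HAProxy, for example with systemctl reload haproxy.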
3.8.2. Approving the certificate signing requests for your machines
To add machines to a cluster, verify the status of the certificate signing requests (CSRs) generated for each machine. If manual approval is required, approve the client requests first, followed by the server requests.
Prerequisites
- You added machines to your cluster.
Procedure
Confirm that the cluster recognizes the machines:
$ oc get nodesExample output
NAME STATUS ROLES AGE VERSION master-0 Ready master 63m v1.32.3 master-1 Ready master 63m v1.32.3 master-2 Ready master 64m v1.32.3The output lists all of the machines that you created.
NoteThe preceding output might not include the compute nodes, also known as worker nodes, until some CSRs are approved.
Review the pending CSRs and ensure that you see the client requests with the
PendingorApprovedstatus for each machine that you added to the cluster:$ oc get csrExample output
NAME AGE REQUESTOR CONDITION csr-8b2br 15m system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending csr-8vnps 15m system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending ...In this example, two machines are joining the cluster. You might see more approved CSRs in the list.
If the CSRs were not approved, after all of the pending CSRs for the machines you added are in
Pendingstatus, approve the CSRs for your cluster machines:NoteBecause the CSRs rotate automatically, approve your CSRs within an hour of adding the machines to the cluster. If you do not approve them within an hour, the certificates will rotate, and more than two certificates will be present for each node. You must approve all of these certificates. After the client CSR is approved, the Kubelet creates a secondary CSR for the serving certificate, which requires manual approval. Then, subsequent serving certificate renewal requests are automatically approved by the
machine-approverif the Kubelet requests a new certificate with identical parameters.NoteFor clusters running on platforms that are not machine API enabled, such as bare metal and other user-provisioned infrastructure, you must implement a method of automatically approving the kubelet serving certificate requests (CSRs). If a request is not approved, then the
oc exec,oc rsh, andoc logscommands cannot succeed, because a serving certificate is required when the API server connects to the kubelet. Any operation that contacts the Kubelet endpoint requires this certificate approval to be in place. The method must watch for new CSRs, confirm that the CSR was submitted by thenode-bootstrapperservice account in thesystem:nodeorsystem:admingroups, and confirm the identity of the node.To approve them individually, run the following command for each valid CSR:
$ oc adm certificate approve <csr_name>where:
<csr_name>- Specifies the name of a CSR from the list of current CSRs.
To approve all pending CSRs, run the following command:
$ oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs --no-run-if-empty oc adm certificate approveNoteSome Operators might not become available until some CSRs are approved.
Now that your client requests are approved, you must review the server requests for each machine that you added to the cluster:
$ oc get csrExample output
NAME AGE REQUESTOR CONDITION csr-bfd72 5m26s system:node:ip-10-0-50-126.us-east-2.compute.internal Pending csr-c57lv 5m26s system:node:ip-10-0-95-157.us-east-2.compute.internal Pending ...If the remaining CSRs are not approved, and are in the
Pendingstatus, approve the CSRs for your cluster machines:To approve them individually, run the following command for each valid CSR:
$ oc adm certificate approve <csr_name>where:
<csr_name>- Specifies the name of a CSR from the list of current CSRs.
To approve all pending CSRs, run the following command:
$ oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs oc adm certificate approve
After all client and server CSRs have been approved, the machines have the
Readystatus. Verify this by running the following command:$ oc get nodesExample output
NAME STATUS ROLES AGE VERSION master-0 Ready master 73m v1.32.3 master-1 Ready master 73m v1.32.3 master-2 Ready master 74m v1.32.3 worker-0 Ready worker 11m v1.32.3 worker-1 Ready worker 11m v1.32.3NoteIt can take a few minutes after approval of the server CSRs for the machines to transition to the
Readystatus.
3.9. Creating a cluster with multi-architecture compute machines on IBM Power
To create a cluster with multi-architecture compute machines on IBM Power® (ppc64le), you must have an existing single-architecture (x86_64) cluster. You can then add ppc64le compute machines to your OpenShift Container Platform cluster.
Before you can add ppc64le nodes to your cluster, you must upgrade your cluster to one that uses the multi-architecture payload. For more information on migrating to the multi-architecture payload, see Migrating to a cluster with multi-architecture compute machines.
The following procedures explain how to create a RHCOS compute machine using an ISO image or network PXE booting. This will allow you to add ppc64le nodes to your cluster and deploy a cluster with multi-architecture compute machines.
To create an IBM Power® (ppc64le) cluster with multi-architecture compute machines on x86_64, follow the instructions for Installing a cluster on IBM Power®. You can then add x86_64 compute machines as described in Creating a cluster with multi-architecture compute machines on bare metal, IBM Power, or IBM Z.
Before adding a secondary architecture node to your cluster, it is recommended to install the Multiarch Tuning Operator, and deploy a ClusterPodPlacementConfig object. For more information, see Managing workloads on multi-architecture clusters by using the Multiarch Tuning Operator.
3.9.1. Creating RHCOS machines using an ISO image
You can create more Red Hat Enterprise Linux CoreOS (RHCOS) compute machines for your cluster by using an ISO image to create the machines.
Prerequisites
- Obtain the URL of the Ignition config file for the compute machines for your cluster. You uploaded this file to your HTTP server during installation.
-
You must have the OpenShift CLI (
oc) installed.
Procedure
Extract the Ignition config file from the cluster by running the following command:
$ oc extract -n openshift-machine-api secret/worker-user-data-managed --keys=userData --to=- > worker.ign
- Upload the worker.ign Ignition config file you exported from your cluster to your HTTP server. Note the URL of this file. You can validate that the Ignition file is available on the URL. The following example gets the Ignition config file for the compute node:
$ curl -k http://<HTTP_server>/worker.ign
You can access the ISO image for booting your new machine by running the following command:
RHCOS_VHD_ORIGIN_URL=$(oc -n openshift-machine-config-operator get configmap/coreos-bootimages -o jsonpath='{.data.stream}' | jq -r '.architectures.<architecture>.artifacts.metal.formats.iso.disk.location')
Use the ISO file to install RHCOS on more compute machines. Use the same method that you used when you created machines before you installed the cluster:
- Burn the ISO image to a disk and boot it directly.
- Use ISO redirection with a LOM interface.
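With the URL captured in the RHCOS_VHD_ORIGIN_URL variable from the preceding step, you can download the ISO to your provisioning host; a minimal sketch:
$ curl -LO "${RHCOS_VHD_ORIGIN_URL}"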
Boot the RHCOS ISO image without specifying any options or interrupting the live boot sequence. Wait for the installer to boot into a shell prompt in the RHCOS live environment.
NoteYou can interrupt the RHCOS installation boot process to add kernel arguments. However, for this ISO procedure you must use the
coreos-installercommand as outlined in the following steps, instead of adding kernel arguments.Run the
coreos-installercommand and specify the options that meet your installation requirements. At a minimum, you must specify the URL that points to the Ignition config file for the node type, and the device that you are installing to:$ sudo coreos-installer install --ignition-url=http://<HTTP_server>/<node_type>.ign <device> --ignition-hash=sha512-<digest>1 2 - 1
- You must run the
coreos-installercommand by usingsudo, because thecoreuser does not have the required root privileges to perform the installation. - 2
- The
--ignition-hashoption is required when the Ignition config file is obtained through an HTTP URL to validate the authenticity of the Ignition config file on the cluster node.<digest>is the Ignition config file SHA512 digest obtained in a preceding step.
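If you need to compute the digest yourself, a minimal sketch using the coreutils sha512sum tool against the worker.ign file that you uploaded earlier:
$ sha512sum worker.ign
Prefix the resulting hash with sha512- when you pass it to the --ignition-hash option.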
NoteIf you want to provide your Ignition config files through an HTTPS server that uses TLS, you can add the internal certificate authority (CA) to the system trust store before running
coreos-installer.The following example initializes a compute node installation to the
/dev/sdadevice. The Ignition config file for the compute node is obtained from an HTTP web server with the IP address 192.168.1.2:$ sudo coreos-installer install --ignition-url=http://192.168.1.2:80/installation_directory/worker.ign /dev/sda --ignition-hash=sha512-a5a2d43879223273c9b60af66b44202a1d1248fc01cf156c46d4a79f552b6bad47bc8cc78ddf0116e80c59d2ea9e32ba53bc807afbca581aa059311def2c3e3bMonitor the progress of the RHCOS installation on the console of the machine.
ImportantEnsure that the installation is successful on each node before commencing with the OpenShift Container Platform installation. Observing the installation process can also help to determine the cause of RHCOS installation issues that might arise.
- Continue to create more compute machines for your cluster.
3.9.2. Creating RHCOS machines by PXE or iPXE booting
You can create more Red Hat Enterprise Linux CoreOS (RHCOS) compute machines for your bare metal cluster by using PXE or iPXE booting.
Prerequisites
- Obtain the URL of the Ignition config file for the compute machines for your cluster. You uploaded this file to your HTTP server during installation.
-
Obtain the URLs of the RHCOS ISO image, compressed metal BIOS,
kernel, andinitramfsfiles that you uploaded to your HTTP server during cluster installation. - You have access to the PXE booting infrastructure that you used to create the machines for your OpenShift Container Platform cluster during installation. The machines must boot from their local disks after RHCOS is installed on them.
-
If you use UEFI, you have access to the
grub.conffile that you modified during OpenShift Container Platform installation.
Procedure
Confirm that your PXE or iPXE installation for the RHCOS images is correct.
For PXE:
DEFAULT pxeboot TIMEOUT 20 PROMPT 0 LABEL pxeboot KERNEL http://<HTTP_server>/rhcos-<version>-live-kernel-<architecture>1 APPEND initrd=http://<HTTP_server>/rhcos-<version>-live-initramfs.<architecture>.img coreos.inst.install_dev=/dev/sda coreos.inst.ignition_url=http://<HTTP_server>/worker.ign coreos.live.rootfs_url=http://<HTTP_server>/rhcos-<version>-live-rootfs.<architecture>.img2 - 1
- Specify the location of the live
kernelfile that you uploaded to your HTTP server. - 2
- Specify locations of the RHCOS files that you uploaded to your HTTP server. The
initrdparameter value is the location of the liveinitramfsfile, thecoreos.inst.ignition_urlparameter value is the location of the worker Ignition config file, and thecoreos.live.rootfs_urlparameter value is the location of the liverootfsfile. Thecoreos.inst.ignition_urlandcoreos.live.rootfs_urlparameters only support HTTP and HTTPS.
NoteThis configuration does not enable serial console access on machines with a graphical console. To configure a different console, add one or more
console=arguments to theAPPENDline. For example, addconsole=tty0 console=ttyS0to set the first PC serial port as the primary console and the graphical console as a secondary console. For more information, see How does one set up a serial terminal and/or console in Red Hat Enterprise Linux?.For iPXE (
x86_64+ppc64le):kernel http://<HTTP_server>/rhcos-<version>-live-kernel-<architecture> initrd=main coreos.live.rootfs_url=http://<HTTP_server>/rhcos-<version>-live-rootfs.<architecture>.img coreos.inst.install_dev=/dev/sda coreos.inst.ignition_url=http://<HTTP_server>/worker.ign1 2 initrd --name main http://<HTTP_server>/rhcos-<version>-live-initramfs.<architecture>.img3 boot- 1
- Specify the locations of the RHCOS files that you uploaded to your HTTP server. The
kernelparameter value is the location of thekernelfile, theinitrd=mainargument is needed for booting on UEFI systems, thecoreos.live.rootfs_urlparameter value is the location of therootfsfile, and thecoreos.inst.ignition_urlparameter value is the location of the worker Ignition config file. - 2
- If you use multiple NICs, specify a single interface in the
ipoption. For example, to use DHCP on a NIC that is namedeno1, setip=eno1:dhcp. - 3
- Specify the location of the
initramfsfile that you uploaded to your HTTP server.
Note: This configuration does not enable serial console access on machines with a graphical console. To configure a different console, add one or more console= arguments to the kernel line. For example, add console=tty0 console=ttyS0 to set the first PC serial port as the primary console and the graphical console as a secondary console. For more information, see How does one set up a serial terminal and/or console in Red Hat Enterprise Linux? and "Enabling the serial console for PXE and ISO installation" in the "Advanced RHCOS installation configuration" section.
Note: To network boot the CoreOS kernel on ppc64le architecture, you need to use a version of iPXE built with the IMAGE_GZIP option enabled. See IMAGE_GZIP option in iPXE.
For PXE (with UEFI and GRUB as second stage) on
ppc64le:menuentry 'Install CoreOS' { linux rhcos-<version>-live-kernel-<architecture> coreos.live.rootfs_url=http://<HTTP_server>/rhcos-<version>-live-rootfs.<architecture>.img coreos.inst.install_dev=/dev/sda coreos.inst.ignition_url=http://<HTTP_server>/worker.ign1 2 initrd rhcos-<version>-live-initramfs.<architecture>.img3 }- 1
- Specify the locations of the RHCOS files that you uploaded to your HTTP/TFTP server. The
kernelparameter value is the location of thekernelfile on your TFTP server. Thecoreos.live.rootfs_urlparameter value is the location of therootfsfile, and thecoreos.inst.ignition_urlparameter value is the location of the worker Ignition config file on your HTTP Server. - 2
- If you use multiple NICs, specify a single interface in the
ipoption. For example, to use DHCP on a NIC that is namedeno1, setip=eno1:dhcp. - 3
- Specify the location of the
initramfsfile that you uploaded to your TFTP server.
- Use the PXE or iPXE infrastructure to create the required compute machines for your cluster.
3.9.3. Approving the certificate signing requests for your machines
To add machines to a cluster, verify the status of the certificate signing requests (CSRs) generated for each machine. If manual approval is required, approve the client requests first, followed by the server requests.
Prerequisites
- You added machines to your cluster.
Procedure
Confirm that the cluster recognizes the machines:
$ oc get nodesExample output
NAME STATUS ROLES AGE VERSION master-0 Ready master 63m v1.32.3 master-1 Ready master 63m v1.32.3 master-2 Ready master 64m v1.32.3The output lists all of the machines that you created.
NoteThe preceding output might not include the compute nodes, also known as worker nodes, until some CSRs are approved.
Review the pending CSRs and ensure that you see the client requests with the
PendingorApprovedstatus for each machine that you added to the cluster:$ oc get csrExample output
NAME AGE REQUESTOR CONDITION csr-8b2br 15m system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending csr-8vnps 15m system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending ...In this example, two machines are joining the cluster. You might see more approved CSRs in the list.
If the CSRs were not approved, after all of the pending CSRs for the machines you added are in
Pendingstatus, approve the CSRs for your cluster machines:NoteBecause the CSRs rotate automatically, approve your CSRs within an hour of adding the machines to the cluster. If you do not approve them within an hour, the certificates will rotate, and more than two certificates will be present for each node. You must approve all of these certificates. After the client CSR is approved, the Kubelet creates a secondary CSR for the serving certificate, which requires manual approval. Then, subsequent serving certificate renewal requests are automatically approved by the
machine-approverif the Kubelet requests a new certificate with identical parameters.NoteFor clusters running on platforms that are not machine API enabled, such as bare metal and other user-provisioned infrastructure, you must implement a method of automatically approving the kubelet serving certificate requests (CSRs). If a request is not approved, then the
oc exec,oc rsh, andoc logscommands cannot succeed, because a serving certificate is required when the API server connects to the kubelet. Any operation that contacts the Kubelet endpoint requires this certificate approval to be in place. The method must watch for new CSRs, confirm that the CSR was submitted by thenode-bootstrapperservice account in thesystem:nodeorsystem:admingroups, and confirm the identity of the node.To approve them individually, run the following command for each valid CSR:
$ oc adm certificate approve <csr_name>where:
<csr_name>- Specifies the name of a CSR from the list of current CSRs.
To approve all pending CSRs, run the following command:
$ oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs --no-run-if-empty oc adm certificate approveNoteSome Operators might not become available until some CSRs are approved.
Now that your client requests are approved, you must review the server requests for each machine that you added to the cluster:
$ oc get csrExample output
NAME AGE REQUESTOR CONDITION csr-bfd72 5m26s system:node:ip-10-0-50-126.us-east-2.compute.internal Pending csr-c57lv 5m26s system:node:ip-10-0-95-157.us-east-2.compute.internal Pending ...If the remaining CSRs are not approved, and are in the
Pendingstatus, approve the CSRs for your cluster machines:To approve them individually, run the following command for each valid CSR:
$ oc adm certificate approve <csr_name>where:
<csr_name>- Specifies the name of a CSR from the list of current CSRs.
To approve all pending CSRs, run the following command:
$ oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs oc adm certificate approve
After all client and server CSRs have been approved, the machines have the
Readystatus. Verify this by running the following command:$ oc get nodes -o wideExample output
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME worker-0-ppc64le Ready worker 42d v1.32.3 192.168.200.21 <none> Red Hat Enterprise Linux CoreOS 415.92.202309261919-0 (Plow) 5.14.0-284.34.1.el9_2.ppc64le cri-o://1.32.3-3.rhaos4.15.gitb36169e.el9 worker-1-ppc64le Ready worker 42d v1.32.3 192.168.200.20 <none> Red Hat Enterprise Linux CoreOS 415.92.202309261919-0 (Plow) 5.14.0-284.34.1.el9_2.ppc64le cri-o://1.32.3-3.rhaos4.15.gitb36169e.el9 master-0-x86 Ready control-plane,master 75d v1.32.3 10.248.0.38 10.248.0.38 Red Hat Enterprise Linux CoreOS 415.92.202309261919-0 (Plow) 5.14.0-284.34.1.el9_2.x86_64 cri-o://1.32.3-3.rhaos4.15.gitb36169e.el9 master-1-x86 Ready control-plane,master 75d v1.32.3 10.248.0.39 10.248.0.39 Red Hat Enterprise Linux CoreOS 415.92.202309261919-0 (Plow) 5.14.0-284.34.1.el9_2.x86_64 cri-o://1.32.3-3.rhaos4.15.gitb36169e.el9 master-2-x86 Ready control-plane,master 75d v1.32.3 10.248.0.40 10.248.0.40 Red Hat Enterprise Linux CoreOS 415.92.202309261919-0 (Plow) 5.14.0-284.34.1.el9_2.x86_64 cri-o://1.32.3-3.rhaos4.15.gitb36169e.el9 worker-0-x86 Ready worker 75d v1.32.3 10.248.0.43 10.248.0.43 Red Hat Enterprise Linux CoreOS 415.92.202309261919-0 (Plow) 5.14.0-284.34.1.el9_2.x86_64 cri-o://1.32.3-3.rhaos4.15.gitb36169e.el9 worker-1-x86 Ready worker 75d v1.32.3 10.248.0.44 10.248.0.44 Red Hat Enterprise Linux CoreOS 415.92.202309261919-0 (Plow) 5.14.0-284.34.1.el9_2.x86_64 cri-o://1.32.3-3.rhaos4.15.gitb36169e.el9NoteIt can take a few minutes after approval of the server CSRs for the machines to transition to the
Readystatus.
3.10. Managing a cluster with multi-architecture compute machines
Managing a cluster that has nodes with multiple architectures requires you to consider node architecture as you monitor the cluster and manage your workloads. You must take additional considerations into account when you configure cluster resource requirements and behavior, or when you schedule workloads in a multi-architecture cluster.
3.10.1. Scheduling workloads on clusters with multi-architecture compute machines
When you deploy workloads on a cluster with compute nodes that use different architectures, you must align pod architecture with the architecture of the underlying node. Your workload might also require additional configuration for particular resources, depending on the underlying node architecture.
You can use the Multiarch Tuning Operator to enable architecture-aware scheduling of workloads on clusters with multi-architecture compute machines. The Multiarch Tuning Operator implements additional scheduler predicates in the pod specifications based on the architectures that the pods can support at creation time.
3.10.1.1. Sample multi-architecture node workload deployments
Scheduling a workload to an appropriate node based on architecture works in the same way as scheduling based on any other node characteristic. Consider the following options when determining how to schedule your workloads.
- Using
nodeAffinity to schedule nodes with specific architectures
To allow a workload to be scheduled on only a set of nodes with architectures supported by its images, you can set the spec.affinity.nodeAffinity field in your pod's template specification.
Example deployment with node affinity set
apiVersion: apps/v1 kind: Deployment metadata: # ... spec: # ... template: # ... spec: affinity: nodeAffinity: requiredDuringSchedulingIgnoredDuringExecution: nodeSelectorTerms: - matchExpressions: - key: kubernetes.io/arch operator: In values:1 - amd64 - arm64- 1
- Specify the supported architectures. Valid values include
amd64,arm64, or both values.
- Tainting each node for a specific architecture
You can taint a node to avoid the node scheduling workloads that are incompatible with its architecture. When your cluster uses a
MachineSetobject, you can add parameters to the.spec.template.spec.taintsfield to avoid workloads being scheduled on nodes with non-supported architectures.Before you add a taint to a node, you must scale down the
MachineSetobject or remove existing available machines. For more information, see Modifying a compute machine set.Example machine set with taint set
apiVersion: machine.openshift.io/v1beta1 kind: MachineSet metadata: # ... spec: # ... template: # ... spec: # ... taints: - effect: NoSchedule key: multiarch.openshift.io/arch value: arm64You can also set a taint on a specific node by running the following command:
$ oc adm taint nodes <node-name> multiarch.openshift.io/arch=arm64:NoSchedule
- Creating a default toleration in a namespace
When a node or machine set has a taint, only workloads that tolerate that taint can be scheduled. You can annotate a namespace so all of the workloads get the same default toleration by running the following command:
Example default toleration set on a namespace
$ oc annotate namespace my-namespace \ 'scheduler.alpha.kubernetes.io/defaultTolerations'='[{"operator": "Exists", "effect": "NoSchedule", "key": "multiarch.openshift.io/arch"}]'
- Tolerating architecture taints in workloads
When a node or machine set has a taint, only workloads that tolerate that taint can be scheduled. You can configure your workload with a
tolerationso that it is scheduled on nodes with specific architecture taints.Example deployment with toleration set
apiVersion: apps/v1 kind: Deployment metadata: # ... spec: # ... template: # ... spec: tolerations: - key: "multiarch.openshift.io/arch" value: "arm64" operator: "Equal" effect: "NoSchedule"This example deployment can be scheduled on nodes and machine sets that have the
multiarch.openshift.io/arch=arm64taint specified.
- Using node affinity with taints and tolerations
When a scheduler computes the set of nodes to schedule a pod, tolerations can broaden the set while node affinity restricts the set. If you set a taint on nodes that have a specific architecture, you must also add a toleration to workloads that you want to be scheduled there.
Example deployment with node affinity and toleration set
apiVersion: apps/v1 kind: Deployment metadata: # ... spec: # ... template: # ... spec: affinity: nodeAffinity: requiredDuringSchedulingIgnoredDuringExecution: nodeSelectorTerms: - matchExpressions: - key: kubernetes.io/arch operator: In values: - amd64 - arm64 tolerations: - key: "multiarch.openshift.io/arch" value: "arm64" operator: "Equal" effect: "NoSchedule"
3.10.2. Enabling 64k pages on the Red Hat Enterprise Linux CoreOS (RHCOS) kernel
You can enable the 64k memory page in the Red Hat Enterprise Linux CoreOS (RHCOS) kernel on the 64-bit ARM compute machines in your cluster. The 64k page size kernel specification can be used for large GPU or high memory workloads. This is done by using the Machine Config Operator (MCO), which uses a machine config pool to update the kernel. To enable 64k page sizes, you must dedicate a machine config pool for the 64-bit ARM nodes on which you enable the 64k-pages kernel type.
Using 64k pages is exclusive to 64-bit ARM architecture compute nodes or clusters installed on 64-bit ARM machines. If you configure the 64k pages kernel on a machine config pool using 64-bit x86 machines, the machine config pool and MCO will degrade.
Prerequisites
-
You installed the OpenShift CLI (
oc). - You created a cluster with compute nodes of different architecture on one of the supported platforms.
Procedure
Label the nodes where you want to run the 64k page size kernel:
$ oc label node <node_name> <label>Example command
$ oc label node worker-arm64-01 node-role.kubernetes.io/worker-64k-pages=Create a machine config pool that contains the worker role that uses the ARM64 architecture and the
worker-64k-pagesrole:apiVersion: machineconfiguration.openshift.io/v1 kind: MachineConfigPool metadata: name: worker-64k-pages spec: machineConfigSelector: matchExpressions: - key: machineconfiguration.openshift.io/role operator: In values: - worker - worker-64k-pages nodeSelector: matchLabels: node-role.kubernetes.io/worker-64k-pages: "" kubernetes.io/arch: arm64Create a machine config on your compute node to enable
64k-pageswith the64k-pagesparameter.$ oc create -f <filename>.yamlExample MachineConfig
apiVersion: machineconfiguration.openshift.io/v1 kind: MachineConfig metadata: labels: machineconfiguration.openshift.io/role: "worker-64k-pages"1 name: 99-worker-64kpages spec: kernelType: 64k-pages2 NoteThe
64k-pagestype is supported on only 64-bit ARM architecture based compute nodes. Therealtimetype is supported on only 64-bit x86 architecture based compute nodes.
Verification
To view your new
worker-64k-pagesmachine config pool, run the following command:$ oc get mcpExample output
NAME CONFIG UPDATED UPDATING DEGRADED MACHINECOUNT READYMACHINECOUNT UPDATEDMACHINECOUNT DEGRADEDMACHINECOUNT AGE master rendered-master-9d55ac9a91127c36314e1efe7d77fbf8 True False False 3 3 3 0 361d worker rendered-worker-e7b61751c4a5b7ff995d64b967c421ff True False False 7 7 7 0 361d worker-64k-pages rendered-worker-64k-pages-e7b61751c4a5b7ff995d64b967c421ff True False False 2 2 2 0 35m
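To confirm that a node in the pool is running with 64k pages after the update completes, one possible spot check, shown as a sketch with a placeholder node name, is to query the kernel page size from a debug pod:
$ oc debug node/<node_name> -- chroot /host getconf PAGESIZE
A value of 65536 indicates that the 64k-pages kernel is active; 4096 indicates the default page size.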
3.10.3. Importing manifest lists in image streams on your multi-architecture compute machines
On an OpenShift Container Platform 4.19 cluster with multi-architecture compute machines, the image streams in the cluster do not import manifest lists automatically. You must manually change the default importMode option to the PreserveOriginal option in order to import the manifest list.
Prerequisites
-
You installed the OpenShift Container Platform CLI (
oc).
Procedure
The following example command shows how to patch the ImageStream cli-artifacts so that the cli-artifacts:latest image stream tag is imported as a manifest list.
$ oc patch is/cli-artifacts -n openshift -p '{"spec":{"tags":[{"name":"latest","importPolicy":{"importMode":"PreserveOriginal"}}]}}'
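As an alternative to patching an existing image stream, newer oc releases also accept an import mode when importing a tag directly; the following is a sketch with placeholder names that you should verify against your oc version:
$ oc import-image <image_stream>:latest --from=<registry>/<repository>:latest --import-mode=PreserveOriginal --confirm -n <namespace>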
Verification
You can check that the manifest lists imported properly by inspecting the image stream tag. The following command will list the individual architecture manifests for a particular tag.
$ oc get istag cli-artifacts:latest -n openshift -oyamlIf the
dockerImageManifestsobject is present, then the manifest list import was successful.Example output of the
dockerImageManifestsobjectdockerImageManifests: - architecture: amd64 digest: sha256:16d4c96c52923a9968fbfa69425ec703aff711f1db822e4e9788bf5d2bee5d77 manifestSize: 1252 mediaType: application/vnd.docker.distribution.manifest.v2+json os: linux - architecture: arm64 digest: sha256:6ec8ad0d897bcdf727531f7d0b716931728999492709d19d8b09f0d90d57f626 manifestSize: 1252 mediaType: application/vnd.docker.distribution.manifest.v2+json os: linux - architecture: ppc64le digest: sha256:65949e3a80349cdc42acd8c5b34cde6ebc3241eae8daaeea458498fedb359a6a manifestSize: 1252 mediaType: application/vnd.docker.distribution.manifest.v2+json os: linux - architecture: s390x digest: sha256:75f4fa21224b5d5d511bea8f92dfa8e1c00231e5c81ab95e83c3013d245d1719 manifestSize: 1252 mediaType: application/vnd.docker.distribution.manifest.v2+json os: linux
3.11. Managing workloads on multi-architecture clusters by using the Multiarch Tuning Operator
The Multiarch Tuning Operator optimizes workload management within multi-architecture clusters and in single-architecture clusters transitioning to multi-architecture environments.
Architecture-aware workload scheduling allows the scheduler to place pods onto nodes that match the architecture of the pod images.
By default, the scheduler does not consider the architecture of a pod’s container images when determining the placement of new pods onto nodes.
To enable architecture-aware workload scheduling, you must create the ClusterPodPlacementConfig object. When you create the ClusterPodPlacementConfig object, the Multiarch Tuning Operator deploys the necessary operands to support architecture-aware workload scheduling. You can also use the nodeAffinityScoring plugin in the ClusterPodPlacementConfig object to set cluster-wide scores for node architectures. If you enable the nodeAffinityScoring plugin, the scheduler first filters nodes with compatible architectures and then places the pod on the node with the highest score.
When a pod is created, the operands perform the following actions:
-
Add the
multiarch.openshift.io/scheduling-gatescheduling gate that prevents the scheduling of the pod. -
Compute a scheduling predicate that includes the supported architecture values for the
kubernetes.io/archlabel. -
Integrate the scheduling predicate as a
nodeAffinityrequirement in the pod specification. - Remove the scheduling gate from the pod.
Note the following operand behaviors:
-
If the
nodeSelectorfield is already configured with thekubernetes.io/archlabel for a workload, the operand does not update thenodeAffinityfield for that workload. -
If the
nodeSelectorfield is not configured with thekubernetes.io/archlabel for a workload, the operand updates thenodeAffinityfield for that workload. However, in thatnodeAffinityfield, the operand updates only the node selector terms that are not configured with thekubernetes.io/archlabel. -
If the
nodeNamefield is already set, the Multiarch Tuning Operator does not process the pod. -
If the pod is owned by a DaemonSet, the operand does not update the
nodeAffinityfield. -
If both
nodeSelectorornodeAffinityandpreferredAffinityfields are set for thekubernetes.io/archlabel, the operand does not update thenodeAffinityfield. -
If only
nodeSelectorornodeAffinityfield is set for thekubernetes.io/archlabel and thenodeAffinityScoringplugin is disabled, the operand does not update thenodeAffinityfield. -
If the
nodeAffinity.preferredDuringSchedulingIgnoredDuringExecutionfield already contains terms that score nodes based on thekubernetes.io/archlabel, the operand ignores the configuration in thenodeAffinityScoringplugin.
3.11.1. Installing the Multiarch Tuning Operator by using the CLI
You can install the Multiarch Tuning Operator by using the OpenShift CLI (oc).
Prerequisites
-
You have installed
oc. -
You have logged in to
ocas a user withcluster-adminprivileges.
Procedure
Create a new project named
openshift-multiarch-tuning-operatorby running the following command:$ oc create ns openshift-multiarch-tuning-operatorCreate an
OperatorGroupobject:Create a YAML file with the configuration for creating an
OperatorGroupobject.Example YAML configuration for creating an
OperatorGroupobjectapiVersion: operators.coreos.com/v1 kind: OperatorGroup metadata: name: openshift-multiarch-tuning-operator namespace: openshift-multiarch-tuning-operator spec: {}Create the
OperatorGroupobject by running the following command:$ oc create -f <file_name>1 - 1
- Replace
<file_name>with the name of the YAML file that contains theOperatorGroupobject configuration.
Create a
Subscriptionobject:Create a YAML file with the configuration for creating a
Subscriptionobject.Example YAML configuration for creating a
SubscriptionobjectapiVersion: operators.coreos.com/v1alpha1 kind: Subscription metadata: name: openshift-multiarch-tuning-operator namespace: openshift-multiarch-tuning-operator spec: channel: stable name: multiarch-tuning-operator source: redhat-operators sourceNamespace: openshift-marketplace installPlanApproval: Automatic startingCSV: multiarch-tuning-operator.<version>Create the
Subscriptionobject by running the following command:$ oc create -f <file_name>1 - 1
- Replace
<file_name>with the name of the YAML file that contains theSubscriptionobject configuration.
For more details about configuring the Subscription object and OperatorGroup object, see "Installing from OperatorHub by using the CLI".
Verification
To verify that the Multiarch Tuning Operator is installed, run the following command:
$ oc get csv -n openshift-multiarch-tuning-operatorExample output
NAME DISPLAY VERSION REPLACES PHASE multiarch-tuning-operator.<version> Multiarch Tuning Operator <version> multiarch-tuning-operator.1.0.0 SucceededThe installation is successful if the Operator is in
Succeededphase.Optional: To verify that the
OperatorGroupobject is created, run the following command:$ oc get operatorgroup -n openshift-multiarch-tuning-operatorExample output
NAME AGE openshift-multiarch-tuning-operator-q8zbb 133mOptional: To verify that the
Subscriptionobject is created, run the following command:$ oc get subscription -n openshift-multiarch-tuning-operatorExample output
NAME PACKAGE SOURCE CHANNEL multiarch-tuning-operator multiarch-tuning-operator redhat-operators stable
3.11.2. Installing the Multiarch Tuning Operator by using the web console
You can install the Multiarch Tuning Operator by using the OpenShift Container Platform web console.
Prerequisites
-
You have access to the cluster with
cluster-adminprivileges. - You have access to the OpenShift Container Platform web console.
Procedure
- Log in to the OpenShift Container Platform web console.
-
Navigate to Operators
OperatorHub. - Enter Multiarch Tuning Operator in the search field.
- Click Multiarch Tuning Operator.
- Select the Multiarch Tuning Operator version from the Version list.
- Click Install
Set the following options on the Operator Installation page:
- Set Update Channel to stable.
- Set Installation Mode to All namespaces on the cluster.
Set Installed Namespace to Operator recommended Namespace or Select a Namespace.
The recommended Operator namespace is
openshift-multiarch-tuning-operator. If theopenshift-multiarch-tuning-operatornamespace does not exist, it is created during the operator installation.If you select Select a namespace, you must select a namespace for the Operator from the Select Project list.
Update approval as Automatic or Manual.
If you select Automatic updates, Operator Lifecycle Manager (OLM) automatically updates the running instance of the Multiarch Tuning Operator without any intervention.
If you select Manual updates, OLM creates an update request. As a cluster administrator, you must manually approve the update request to update the Multiarch Tuning Operator to a newer version.
- Optional: Select the Enable Operator recommended cluster monitoring on this Namespace checkbox.
- Click Install.
Verification
-
Navigate to Operators
Installed Operators. -
Verify that the Multiarch Tuning Operator is listed with the Status field as Succeeded in the
openshift-multiarch-tuning-operatornamespace.
3.11.3. Multiarch Tuning Operator pod labels and architecture support overview
After installing the Multiarch Tuning Operator, you can verify the multi-architecture support for workloads in your cluster. You can identify and manage pods based on their architecture compatibility by using the pod labels. These labels are automatically set on the newly created pods to provide insights into their architecture support.
The following table describes the labels that the Multiarch Tuning Operator adds when you create a pod:
| Label | Description |
|---|---|
|
| The pod supports multiple architectures. |
|
| The pod supports only a single architecture. |
|
|
The pod supports the |
|
|
The pod supports the |
|
|
The pod supports the |
|
|
The pod supports the |
|
| The Operator has set the node affinity requirement for the architecture. |
|
| The Operator did not set the node affinity requirement. For example, when the pod already has a node affinity for the architecture, the Multiarch Tuning Operator adds this label to the pod. |
|
| The pod is gated. |
|
| The pod gate has been removed. |
|
| An error has occurred while building the node affinity requirements. |
|
| The Operator has set the architecture preferences in the pod. |
|
|
The Operator did not set the architecture preferences in the pod because the user had already set them in the |
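To review the labels that the Operator applied to a particular pod, a minimal sketch with placeholder names:
$ oc get pod <pod_name> -n <namespace> -o jsonpath='{.metadata.labels}'
You can also filter pods by any of the label keys in the preceding table by using the -l option of oc get pods.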
3.11.4. Creating the ClusterPodPlacementConfig object
After installing the Multiarch Tuning Operator, you must create the ClusterPodPlacementConfig object. When you create this object, the Multiarch Tuning Operator deploys an operand that enables architecture-aware workload scheduling.
You can create only one instance of the ClusterPodPlacementConfig object.
Example ClusterPodPlacementConfig object configuration
apiVersion: multiarch.openshift.io/v1beta1
kind: ClusterPodPlacementConfig
metadata:
name: cluster
spec:
logVerbosityLevel: Normal
namespaceSelector:
matchExpressions:
- key: multiarch.openshift.io/exclude-pod-placement
operator: DoesNotExist
plugins:
nodeAffinityScoring:
enabled: true
platforms:
- architecture: amd64
weight: 100
- architecture: arm64
weight: 50
- 1
- You must set this field value to
cluster. - 2
- Optional: You can set the field value to
Normal,Debug,Trace, orTraceAll. The value is set toNormalby default. - 3
- Optional: You can configure the
namespaceSelectorto select the namespaces in which the Multiarch Tuning Operator’s pod placement operand must process thenodeAffinityof the pods. All namespaces are considered by default. - 4
- Optional: Includes a list of plugins for architecture-aware workload scheduling.
- 5
- Optional: You can use this plugin to set architecture preferences for pod placement. When enabled, the scheduler first filters out nodes that do not meet the pod’s requirements. Then, it prioritizes the remaining nodes based on the architecture scores defined in the
nodeAffinityScoring.platformsfield. - 6
- Optional: Set this field to
trueto enable thenodeAffinityScoringplugin. The default value isfalse. - 7
- Optional: Defines a list of architectures and their corresponding scores.
- 8
- Specify the node architecture to score. The scheduler prioritizes nodes for pod placement based on the architecture scores that you set and the scheduling requirements defined in the pod specification. Accepted values are
arm64,amd64,ppc64le, ors390x. - 9
- Assign a score to the architecture. The value for this field must be configured in the range of
1(lowest priority) to100(highest priority). The scheduler uses this score to prioritize nodes for pod placement, favoring nodes with architectures that have higher scores.
In this example, the operator field value is set to DoesNotExist. Therefore, if the key field value (multiarch.openshift.io/exclude-pod-placement) is set as a label in a namespace, the operand does not process the nodeAffinity of the pods in that namespace. Instead, the operand processes the nodeAffinity of the pods in namespaces that do not contain the label.
If you want the operand to process the nodeAffinity of the pods only in specific namespaces, you can configure the namespaceSelector as follows:
namespaceSelector:
matchExpressions:
- key: multiarch.openshift.io/include-pod-placement
operator: Exists
In this example, the operator field value is set to Exists. Therefore, the operand processes the nodeAffinity of the pods only in namespaces that contain the multiarch.openshift.io/include-pod-placement label.
This Operator excludes pods in namespaces starting with kube-. It also excludes pods that are expected to be scheduled on control plane nodes.
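For example, with the default DoesNotExist selector shown above, you can opt a namespace out of pod placement processing by adding the key to it; the following is a sketch with a placeholder namespace name, and because only the presence of the key is checked, the label value can be empty:
$ oc label namespace <namespace> multiarch.openshift.io/exclude-pod-placement=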
3.11.4.1. Creating the ClusterPodPlacementConfig object by using the CLI
To deploy the pod placement operand that enables architecture-aware workload scheduling, you can create the ClusterPodPlacementConfig object by using the OpenShift CLI (oc).
Prerequisites
-
You have installed
oc. -
You have logged in to
ocas a user withcluster-adminprivileges. - You have installed the Multiarch Tuning Operator.
Procedure
Create a
ClusterPodPlacementConfigobject YAML file:Example
ClusterPodPlacementConfigobject configurationapiVersion: multiarch.openshift.io/v1beta1 kind: ClusterPodPlacementConfig metadata: name: cluster spec: logVerbosityLevel: Normal namespaceSelector: matchExpressions: - key: multiarch.openshift.io/exclude-pod-placement operator: DoesNotExist plugins: nodeAffinityScoring: enabled: true platforms: - architecture: amd64 weight: 100 - architecture: arm64 weight: 50Create the
ClusterPodPlacementConfigobject by running the following command:$ oc create -f <file_name>1 - 1
- Replace
<file_name>with the name of theClusterPodPlacementConfigobject YAML file.
Verification
To check that the
ClusterPodPlacementConfigobject is created, run the following command:$ oc get clusterpodplacementconfigExample output
NAME AGE cluster 29s
3.11.4.2. Creating the ClusterPodPlacementConfig object by using the web console
To deploy the pod placement operand that enables architecture-aware workload scheduling, you can create the ClusterPodPlacementConfig object by using the OpenShift Container Platform web console.
Prerequisites
-
You have access to the cluster with
cluster-adminprivileges. - You have access to the OpenShift Container Platform web console.
- You have installed the Multiarch Tuning Operator.
Procedure
- Log in to the OpenShift Container Platform web console.
- Navigate to Operators → Installed Operators.
- On the Installed Operators page, click Multiarch Tuning Operator.
- Click the Cluster Pod Placement Config tab.
- Select either Form view or YAML view.
- Configure the ClusterPodPlacementConfig object parameters.
- Click Create.
- Optional: If you want to edit the ClusterPodPlacementConfig object, perform the following actions:
  - Click the Cluster Pod Placement Config tab.
  - Select Edit ClusterPodPlacementConfig from the options menu.
  - Click YAML and edit the ClusterPodPlacementConfig object parameters.
  - Click Save.
Verification
- On the Cluster Pod Placement Config page, check that the ClusterPodPlacementConfig object is in the Ready state.
3.11.5. Deleting the ClusterPodPlacementConfig object by using the CLI
You can create only one instance of the ClusterPodPlacementConfig object. If you want to re-create this object, you must first delete the existing instance.
You can delete this object by using the OpenShift CLI (oc).
Prerequisites
- You have installed oc.
- You have logged in to oc as a user with cluster-admin privileges.
Procedure
- Log in to the OpenShift CLI (oc).
- Delete the ClusterPodPlacementConfig object by running the following command:

      $ oc delete clusterpodplacementconfig cluster
Verification
- To check that the ClusterPodPlacementConfig object is deleted, run the following command:

      $ oc get clusterpodplacementconfig

  Example output

      No resources found
3.11.6. Deleting the ClusterPodPlacementConfig object by using the web console
You can create only one instance of the ClusterPodPlacementConfig object. If you want to re-create this object, you must first delete the existing instance.
You can delete this object by using the OpenShift Container Platform web console.
Prerequisites
- You have access to the cluster with cluster-admin privileges.
- You have access to the OpenShift Container Platform web console.
- You have created the ClusterPodPlacementConfig object.
Procedure
- Log in to the OpenShift Container Platform web console.
- Navigate to Operators → Installed Operators.
- On the Installed Operators page, click Multiarch Tuning Operator.
- Click the Cluster Pod Placement Config tab.
- Select Delete ClusterPodPlacementConfig from the options menu.
- Click Delete.
Verification
- On the Cluster Pod Placement Config page, check that the ClusterPodPlacementConfig object has been deleted.
3.11.7. Uninstalling the Multiarch Tuning Operator by using the CLI
You can uninstall the Multiarch Tuning Operator by using the OpenShift CLI (oc).
Prerequisites
- You have installed oc.
- You have logged in to oc as a user with cluster-admin privileges.
- You deleted the ClusterPodPlacementConfig object.

  Important
  You must delete the ClusterPodPlacementConfig object before uninstalling the Multiarch Tuning Operator. Uninstalling the Operator without deleting the ClusterPodPlacementConfig object can lead to unexpected behavior.
Procedure
- Get the Subscription object name for the Multiarch Tuning Operator by running the following command:

      $ oc get subscription.operators.coreos.com -n <namespace>

  Replace <namespace> with the name of the namespace where you want to uninstall the Multiarch Tuning Operator.

  Example output

      NAME                                   PACKAGE                     SOURCE             CHANNEL
      openshift-multiarch-tuning-operator    multiarch-tuning-operator   redhat-operators   stable

- Get the currentCSV value for the Multiarch Tuning Operator by running the following command:

      $ oc get subscription.operators.coreos.com <subscription_name> -n <namespace> -o yaml | grep currentCSV

  Replace <subscription_name> with the Subscription object name, for example: openshift-multiarch-tuning-operator. Replace <namespace> with the name of the namespace where you want to uninstall the Multiarch Tuning Operator.

  Example output

      currentCSV: multiarch-tuning-operator.<version>

- Delete the Subscription object by running the following command:

      $ oc delete subscription.operators.coreos.com <subscription_name> -n <namespace>

  Replace <subscription_name> with the Subscription object name. Replace <namespace> with the name of the namespace where you want to uninstall the Multiarch Tuning Operator.

  Example output

      subscription.operators.coreos.com "openshift-multiarch-tuning-operator" deleted

- Delete the CSV for the Multiarch Tuning Operator in the target namespace by using the currentCSV value. Run the following command:

      $ oc delete clusterserviceversion <currentCSV_value> -n <namespace>

  Replace <currentCSV_value> with the currentCSV value for the Multiarch Tuning Operator, for example: multiarch-tuning-operator.<version>. Replace <namespace> with the name of the namespace where you want to uninstall the Multiarch Tuning Operator.

  Example output

      clusterserviceversion.operators.coreos.com "multiarch-tuning-operator.<version>" deleted

  The same sequence, combined into a single shell session, is sketched after this procedure.
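The following is a hedged convenience sketch that chains the same steps in one shell session. It assumes the Operator is installed in the openshift-multiarch-tuning-operator namespace and that its Subscription is the only one in that namespace; adjust the NAMESPACE variable for your environment.

    # Assumed namespace; change it if you installed the Operator elsewhere.
    NAMESPACE=openshift-multiarch-tuning-operator

    # Look up the Subscription and its currentCSV, then delete both.
    SUB=$(oc get subscription.operators.coreos.com -n "$NAMESPACE" -o name | head -n 1)
    CSV=$(oc get "$SUB" -n "$NAMESPACE" -o jsonpath='{.status.currentCSV}')
    oc delete "$SUB" -n "$NAMESPACE"
    oc delete clusterserviceversion "$CSV" -n "$NAMESPACE"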
Verification
- To verify that the Multiarch Tuning Operator is uninstalled, run the following command:

      $ oc get csv -n <namespace>

  Replace <namespace> with the name of the namespace where you have uninstalled the Multiarch Tuning Operator.

  Example output

      No resources found in openshift-multiarch-tuning-operator namespace.
3.11.8. Uninstalling the Multiarch Tuning Operator by using the web console
You can uninstall the Multiarch Tuning Operator by using the OpenShift Container Platform web console.
Prerequisites
- You have access to the cluster with cluster-admin permissions.
- You deleted the ClusterPodPlacementConfig object.

  Important
  You must delete the ClusterPodPlacementConfig object before uninstalling the Multiarch Tuning Operator. Uninstalling the Operator without deleting the ClusterPodPlacementConfig object can lead to unexpected behavior.
Procedure
- Log in to the OpenShift Container Platform web console.
- Navigate to Operators → OperatorHub.
- Enter Multiarch Tuning Operator in the search field.
- Click Multiarch Tuning Operator.
- Click the Details tab.
- From the Actions menu, select Uninstall Operator.
- When prompted, click Uninstall.
Verification
- Navigate to Operators → Installed Operators.
- On the Installed Operators page, verify that the Multiarch Tuning Operator is not listed.
3.12. Multiarch Tuning Operator release notes
The Multiarch Tuning Operator (MTO) optimizes workload management within multi-architecture clusters and in single-architecture clusters transitioning to multi-architecture environments.
These release notes track the development of the Multiarch Tuning Operator.
For more information, see Managing workloads on multi-architecture clusters by using the Multiarch Tuning Operator.
3.12.1. Release notes for the Multiarch Tuning Operator 1.2.1
Issued: 15 December 2025
3.12.1.1. Bug fixes
- Previously, the Multiarch Tuning Operator image inspector incorrectly processed images whose registry address included a digest, tag, and port number. The port portion of the registry was incorrectly interpreted as an image tag and was trimmed, causing the inspector to construct an invalid image reference. With this update, image references that contain a digest, tag, and registry port are now correctly parsed and handled. (MULTIARCH-5767)
3.12.2. Release notes for the Multiarch Tuning Operator 1.2.0
Issued: 22 October 2025
3.12.2.1. New features and enhancements
- With this release, you can enable the exec format error monitor plugin for the Multiarch Tuning Operator. This plugin detects ENOEXEC errors, which occur when a pod attempts to execute a binary incompatible with the node's architecture. You enable this plugin by setting the plugins.execFormatErrorMonitor.enabled parameter to true in the ClusterPodPlacementConfig object, as shown in the sketch after this note. For more information, see Creating the ClusterPodPlacementConfig object.
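The following minimal sketch shows where the parameter sits in the object. It is based only on the parameter path named in this release note; all other fields are omitted for brevity.

    apiVersion: multiarch.openshift.io/v1beta1
    kind: ClusterPodPlacementConfig
    metadata:
      name: cluster
    spec:
      plugins:
        execFormatErrorMonitor:
          enabled: true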
3.12.2.2. Bug fixes
- Previously, the Multiarch Tuning Operator incorrectly handled the Operator bundle image inspector, restricting it to a single architecture, which could cause OLM to fail when installing Operators. With this update, MTO now sets the bundle image to support all architectures, allowing Operators to be successfully installed on single-architecture clusters when the Multiarch Tuning Operator is deployed. (MULTIARCH-5546)
- Previously, when a cluster global pull secret was changed, stale authentication information could remain in the Multiarch Tuning Operator cache. With this update, the cache is cleared whenever a cluster global pull secret is changed. (MULTIARCH-5538)
- Previously, the Multiarch Tuning Operator failed to process pods if an image reference contained both a tag and a digest. With this update, the image inspector prioritizes the digest if both are present. (MULTIARCH-5584)
- Previously, the Multiarch Tuning Operator did not respect the .spec.registrySources.containerRuntimeSearchRegistries field in the config.openshift.io/Image custom resource when a workload image did not specify a registry URL. With this update, the Operator can now handle this case, allowing workload images without an explicit registry URL to be pulled successfully. (MULTIARCH-5611)
- Previously, if the ClusterPodPlacementConfig object was deleted less than 1 second after its creation, some finalizers were not removed in time, causing certain resources to remain. With this update, all finalizers are properly deleted when the ClusterPodPlacementConfig object is deleted. (MULTIARCH-5372)
3.12.3. Release notes for the Multiarch Tuning Operator 1.1.1
Issued: 27 May 2025
3.12.3.1. Bug fixes
- Previously, the pod placement operand did not support authenticating to registries by using wildcard entries in the hostname of their pull secret. This caused inconsistent behavior with the kubelet when pulling images, because the kubelet supported wildcard entries while the operand required exact hostname matches. As a result, image pulls could fail unexpectedly when registries used wildcard hostnames. With this release, the pod placement operand supports pull secrets that include wildcard hostnames, ensuring consistent and reliable image authentication and pulling.
- Previously, when image inspection failed after all retries and the nodeAffinityScoring plugin was enabled, the pod placement operand applied incorrect nodeAffinityScoring labels. With this release, the operand sets nodeAffinityScoring labels correctly, even when image inspection fails. It now applies these labels independently of the required affinity process to ensure accurate and consistent scheduling.
3.12.4. Release notes for the Multiarch Tuning Operator 1.1.0
Issued: 18 March 2025
3.12.4.1. New features and enhancements
- The Multiarch Tuning Operator is now supported on managed offerings, including ROSA with Hosted Control Planes (HCP) and other HCP environments.
- With this release, you can configure architecture-aware workload scheduling by using the new plugins field in the ClusterPodPlacementConfig object. You can use the plugins.nodeAffinityScoring field to set architecture preferences for pod placement. If you enable the nodeAffinityScoring plugin, the scheduler first filters out nodes that do not meet the pod requirements. Then, the scheduler prioritizes the remaining nodes based on the architecture scores defined in the nodeAffinityScoring.platforms field. A condensed configuration sketch follows this note.
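The following condensed sketch is derived from the full example in "Creating the ClusterPodPlacementConfig object by using the CLI". It shows only the fields named in this note, and the weights are illustrative values.

    spec:
      plugins:
        nodeAffinityScoring:
          enabled: true
          platforms:
            - architecture: amd64
              weight: 100
            - architecture: arm64
              weight: 50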
3.12.4.2. Bug fixes
- With this release, the Multiarch Tuning Operator does not update the nodeAffinity field for pods that are managed by a daemon set. (OCPBUGS-45885)
3.12.5. Release notes for the Multiarch Tuning Operator 1.0.0
Issued: 31 October 2024
3.12.5.1. New features and enhancements
- With this release, the Multiarch Tuning Operator supports custom network scenarios and cluster-wide custom registries configurations.
- With this release, you can identify pods based on their architecture compatibility by using the pod labels that the Multiarch Tuning Operator adds to newly created pods.
- With this release, you can monitor the behavior of the Multiarch Tuning Operator by using the metrics and alerts that are registered in the Cluster Monitoring Operator.