Chapter 3. Configuring multi-architecture compute machines on an OpenShift cluster


An OpenShift Container Platform cluster with multi-architecture compute machines is a cluster that supports compute machines with different architectures.

Note

When there are nodes with multiple architectures in your cluster, the architecture of your image must be consistent with the architecture of the node. You need to ensure that the pod is assigned to the node with the appropriate architecture and that it matches the image architecture. For more information on assigning pods to nodes, see Assigning pods to nodes.

Important

The Cluster Samples Operator is not supported on clusters with multi-architecture compute machines. Your cluster can be created without this capability. For more information, see Cluster capabilities.

For information on migrating your single-architecture cluster to a cluster that supports multi-architecture compute machines, see Migrating to a cluster with multi-architecture compute machines.

To create a cluster with multi-architecture compute machines with different installation options and platforms, you can use the documentation in the following table:

Expand
Table 3.1. Cluster with multi-architecture compute machine installation options
Documentation sectionPlatformUser-provisioned installationInstaller-provisioned installationControl PlaneCompute node

Creating a cluster with multi-architecture compute machines on Azure

Microsoft Azure

 

aarch64 or x86_64

aarch64, x86_64

Creating a cluster with multi-architecture compute machines on AWS

Amazon Web Services (AWS)

 

aarch64 or x86_64

aarch64, x86_64

Creating a cluster with multi-architecture compute machines on GCP

Google Cloud Platform (GCP)

 

aarch64 or x86_64

aarch64, x86_64

Creating a cluster with multi-architecture compute machines on bare metal, IBM Power, or IBM Z

Bare metal

aarch64 or x86_64

aarch64, x86_64

IBM Power

 

x86_64 or ppc64le

x86_64, ppc64le

IBM Z

 

x86_64 or s390x

x86_64, s390x

Creating a cluster with multi-architecture compute machines on IBM Z® and IBM® LinuxONE with z/VM

IBM Z® and IBM® LinuxONE

 

x86_64

x86_64, s390x

Creating a cluster with multi-architecture compute machines on IBM Z® and IBM® LinuxONE with RHEL KVM

IBM Z® and IBM® LinuxONE

 

x86_64

x86_64, s390x

Creating a cluster with multi-architecture compute machines on IBM Power®

IBM Power®

 

x86_64

x86_64, ppc64le

Important

Autoscaling from zero is currently not supported on Google Cloud Platform (GCP).

To deploy an Azure cluster with multi-architecture compute machines, you must first create a single-architecture Azure installer-provisioned cluster that uses the multi-architecture installer binary. For more information on Azure installations, see Installing a cluster on Azure with customizations.

You can also migrate your current cluster with single-architecture compute machines to a cluster with multi-architecture compute machines. For more information, see Migrating to a cluster with multi-architecture compute machines.

After creating a multi-architecture cluster, you can add nodes with different architectures to the cluster.

3.2.1. Verifying cluster compatibility

Before you can start adding compute nodes of different architectures to your cluster, you must verify that your cluster is multi-architecture compatible.

Prerequisites

  • You installed the OpenShift CLI (oc).

Procedure

  1. Log in to the OpenShift CLI (oc).
  2. You can check that your cluster uses the architecture payload by running the following command:

    $ oc adm release info -o jsonpath="{ .metadata.metadata}"
    Copy to Clipboard Toggle word wrap

Verification

  • If you see the following output, your cluster is using the multi-architecture payload:

    {
     "release.openshift.io/architecture": "multi",
     "url": "https://access.redhat.com/errata/<errata_version>"
    }
    Copy to Clipboard Toggle word wrap

    You can then begin adding multi-arch compute nodes to your cluster.

  • If you see the following output, your cluster is not using the multi-architecture payload:

    {
     "url": "https://access.redhat.com/errata/<errata_version>"
    }
    Copy to Clipboard Toggle word wrap
    Important

    To migrate your cluster so the cluster supports multi-architecture compute machines, follow the procedure in Migrating to a cluster with multi-architecture compute machines.

The following procedure describes how to manually generate a 64-bit ARM boot image.

Prerequisites

  • You installed the Azure CLI (az).
  • You created a single-architecture Azure installer-provisioned cluster with the multi-architecture installer binary.

Procedure

  1. Log in to your Azure account:

    $ az login
    Copy to Clipboard Toggle word wrap
  2. Create a storage account and upload the aarch64 virtual hard disk (VHD) to your storage account. The OpenShift Container Platform installation program creates a resource group, however, the boot image can also be uploaded to a custom named resource group:

    $ az storage account create -n ${STORAGE_ACCOUNT_NAME} -g ${RESOURCE_GROUP} -l westus --sku Standard_LRS 
    1
    Copy to Clipboard Toggle word wrap
    1
    The westus object is an example region.
  3. Create a storage container using the storage account you generated:

    $ az storage container create -n ${CONTAINER_NAME} --account-name ${STORAGE_ACCOUNT_NAME}
    Copy to Clipboard Toggle word wrap
  4. You must use the OpenShift Container Platform installation program JSON file to extract the URL and aarch64 VHD name:

    1. Extract the URL field and set it to RHCOS_VHD_ORIGIN_URL as the file name by running the following command:

      $ RHCOS_VHD_ORIGIN_URL=$(oc -n openshift-machine-config-operator get configmap/coreos-bootimages -o jsonpath='{.data.stream}' | jq -r '.architectures.aarch64."rhel-coreos-extensions"."azure-disk".url')
      Copy to Clipboard Toggle word wrap
    2. Extract the aarch64 VHD name and set it to BLOB_NAME as the file name by running the following command:

      $ BLOB_NAME=rhcos-$(oc -n openshift-machine-config-operator get configmap/coreos-bootimages -o jsonpath='{.data.stream}' | jq -r '.architectures.aarch64."rhel-coreos-extensions"."azure-disk".release')-azure.aarch64.vhd
      Copy to Clipboard Toggle word wrap
  5. Generate a shared access signature (SAS) token. Use this token to upload the RHCOS VHD to your storage container with the following commands:

    $ end=`date -u -d "30 minutes" '+%Y-%m-%dT%H:%MZ'`
    Copy to Clipboard Toggle word wrap
    $ sas=`az storage container generate-sas -n ${CONTAINER_NAME} --account-name ${STORAGE_ACCOUNT_NAME} --https-only --permissions dlrw --expiry $end -o tsv`
    Copy to Clipboard Toggle word wrap
  6. Copy the RHCOS VHD into the storage container:

    $ az storage blob copy start --account-name ${STORAGE_ACCOUNT_NAME} --sas-token "$sas" \
     --source-uri "${RHCOS_VHD_ORIGIN_URL}" \
     --destination-blob "${BLOB_NAME}" --destination-container ${CONTAINER_NAME}
    Copy to Clipboard Toggle word wrap

    You can check the status of the copying process with the following command:

    $ az storage blob show -c ${CONTAINER_NAME} -n ${BLOB_NAME} --account-name ${STORAGE_ACCOUNT_NAME} | jq .properties.copy
    Copy to Clipboard Toggle word wrap

    Example output

    {
     "completionTime": null,
     "destinationSnapshot": null,
     "id": "1fd97630-03ca-489a-8c4e-cfe839c9627d",
     "incrementalCopy": null,
     "progress": "17179869696/17179869696",
     "source": "https://rhcos.blob.core.windows.net/imagebucket/rhcos-411.86.202207130959-0-azure.aarch64.vhd",
     "status": "success", 
    1
    
     "statusDescription": null
    }
    Copy to Clipboard Toggle word wrap

    1
    If the status parameter displays the success object, the copying process is complete.
  7. Create an image gallery using the following command:

    $ az sig create --resource-group ${RESOURCE_GROUP} --gallery-name ${GALLERY_NAME}
    Copy to Clipboard Toggle word wrap

    Use the image gallery to create an image definition. In the following example command, rhcos-arm64 is the name of the image definition.

    $ az sig image-definition create --resource-group ${RESOURCE_GROUP} --gallery-name ${GALLERY_NAME} --gallery-image-definition rhcos-arm64 --publisher RedHat --offer arm --sku arm64 --os-type linux --architecture Arm64 --hyper-v-generation V2
    Copy to Clipboard Toggle word wrap
  8. To get the URL of the VHD and set it to RHCOS_VHD_URL as the file name, run the following command:

    $ RHCOS_VHD_URL=$(az storage blob url --account-name ${STORAGE_ACCOUNT_NAME} -c ${CONTAINER_NAME} -n "${BLOB_NAME}" -o tsv)
    Copy to Clipboard Toggle word wrap
  9. Use the RHCOS_VHD_URL file, your storage account, resource group, and image gallery to create an image version. In the following example, 1.0.0 is the image version.

    $ az sig image-version create --resource-group ${RESOURCE_GROUP} --gallery-name ${GALLERY_NAME} --gallery-image-definition rhcos-arm64 --gallery-image-version 1.0.0 --os-vhd-storage-account ${STORAGE_ACCOUNT_NAME} --os-vhd-uri ${RHCOS_VHD_URL}
    Copy to Clipboard Toggle word wrap
  10. Your arm64 boot image is now generated. You can access the ID of your image with the following command:

    $ az sig image-version show -r $GALLERY_NAME -g $RESOURCE_GROUP -i rhcos-arm64 -e 1.0.0
    Copy to Clipboard Toggle word wrap

    The following example image ID is used in the recourseID parameter of the compute machine set:

    Example resourceID

    /resourceGroups/${RESOURCE_GROUP}/providers/Microsoft.Compute/galleries/${GALLERY_NAME}/images/rhcos-arm64/versions/1.0.0
    Copy to Clipboard Toggle word wrap

The following procedure describes how to manually generate a 64-bit x86 boot image.

Prerequisites

  • You installed the Azure CLI (az).
  • You created a single-architecture Azure installer-provisioned cluster with the multi-architecture installer binary.

Procedure

  1. Log in to your Azure account by running the following command:

    $ az login
    Copy to Clipboard Toggle word wrap
  2. Create a storage account and upload the x86_64 virtual hard disk (VHD) to your storage account by running the following command. The OpenShift Container Platform installation program creates a resource group. However, the boot image can also be uploaded to a custom named resource group:

    $ az storage account create -n ${STORAGE_ACCOUNT_NAME} -g ${RESOURCE_GROUP} -l westus --sku Standard_LRS 
    1
    Copy to Clipboard Toggle word wrap
    1
    The westus object is an example region.
  3. Create a storage container using the storage account you generated by running the following command:

    $ az storage container create -n ${CONTAINER_NAME} --account-name ${STORAGE_ACCOUNT_NAME}
    Copy to Clipboard Toggle word wrap
  4. Use the OpenShift Container Platform installation program JSON file to extract the URL and x86_64 VHD name:

    1. Extract the URL field and set it to RHCOS_VHD_ORIGIN_URL as the file name by running the following command:

      $ RHCOS_VHD_ORIGIN_URL=$(oc -n openshift-machine-config-operator get configmap/coreos-bootimages -o jsonpath='{.data.stream}' | jq -r '.architectures.x86_64."rhel-coreos-extensions"."azure-disk".url')
      Copy to Clipboard Toggle word wrap
    2. Extract the x86_64 VHD name and set it to BLOB_NAME as the file name by running the following command:

      $ BLOB_NAME=rhcos-$(oc -n openshift-machine-config-operator get configmap/coreos-bootimages -o jsonpath='{.data.stream}' | jq -r '.architectures.x86_64."rhel-coreos-extensions"."azure-disk".release')-azure.x86_64.vhd
      Copy to Clipboard Toggle word wrap
  5. Generate a shared access signature (SAS) token. Use this token to upload the RHCOS VHD to your storage container by running the following commands:

    $ end=`date -u -d "30 minutes" '+%Y-%m-%dT%H:%MZ'`
    Copy to Clipboard Toggle word wrap
    $ sas=`az storage container generate-sas -n ${CONTAINER_NAME} --account-name ${STORAGE_ACCOUNT_NAME} --https-only --permissions dlrw --expiry $end -o tsv`
    Copy to Clipboard Toggle word wrap
  6. Copy the RHCOS VHD into the storage container by running the following command:

    $ az storage blob copy start --account-name ${STORAGE_ACCOUNT_NAME} --sas-token "$sas" \
     --source-uri "${RHCOS_VHD_ORIGIN_URL}" \
     --destination-blob "${BLOB_NAME}" --destination-container ${CONTAINER_NAME}
    Copy to Clipboard Toggle word wrap

    You can check the status of the copying process by running the following command:

    $ az storage blob show -c ${CONTAINER_NAME} -n ${BLOB_NAME} --account-name ${STORAGE_ACCOUNT_NAME} | jq .properties.copy
    Copy to Clipboard Toggle word wrap

    Example output

    {
     "completionTime": null,
     "destinationSnapshot": null,
     "id": "1fd97630-03ca-489a-8c4e-cfe839c9627d",
     "incrementalCopy": null,
     "progress": "17179869696/17179869696",
     "source": "https://rhcos.blob.core.windows.net/imagebucket/rhcos-411.86.202207130959-0-azure.aarch64.vhd",
     "status": "success", 
    1
    
     "statusDescription": null
    }
    Copy to Clipboard Toggle word wrap

    1
    If the status parameter displays the success object, the copying process is complete.
  7. Create an image gallery by running the following command:

    $ az sig create --resource-group ${RESOURCE_GROUP} --gallery-name ${GALLERY_NAME}
    Copy to Clipboard Toggle word wrap
  8. Use the image gallery to create an image definition by running the following command:

    $ az sig image-definition create --resource-group ${RESOURCE_GROUP} --gallery-name ${GALLERY_NAME} --gallery-image-definition rhcos-x86_64 --publisher RedHat --offer x86_64 --sku x86_64 --os-type linux --architecture x64 --hyper-v-generation V2
    Copy to Clipboard Toggle word wrap

    In this example command, rhcos-x86_64 is the name of the image definition.

  9. To get the URL of the VHD and set it to RHCOS_VHD_URL as the file name, run the following command:

    $ RHCOS_VHD_URL=$(az storage blob url --account-name ${STORAGE_ACCOUNT_NAME} -c ${CONTAINER_NAME} -n "${BLOB_NAME}" -o tsv)
    Copy to Clipboard Toggle word wrap
  10. Use the RHCOS_VHD_URL file, your storage account, resource group, and image gallery to create an image version by running the following command:

    $ az sig image-version create --resource-group ${RESOURCE_GROUP} --gallery-name ${GALLERY_NAME} --gallery-image-definition rhcos-arm64 --gallery-image-version 1.0.0 --os-vhd-storage-account ${STORAGE_ACCOUNT_NAME} --os-vhd-uri ${RHCOS_VHD_URL}
    Copy to Clipboard Toggle word wrap

    In this example, 1.0.0 is the image version.

  11. Optional: Access the ID of the generated x86_64 boot image by running the following command:

    $ az sig image-version show -r $GALLERY_NAME -g $RESOURCE_GROUP -i rhcos-x86_64 -e 1.0.0
    Copy to Clipboard Toggle word wrap

    The following example image ID is used in the recourseID parameter of the compute machine set:

    Example resourceID

    /resourceGroups/${RESOURCE_GROUP}/providers/Microsoft.Compute/galleries/${GALLERY_NAME}/images/rhcos-x86_64/versions/1.0.0
    Copy to Clipboard Toggle word wrap

After creating a multi-architecture cluster, you can add nodes with different architectures.

You can add multi-architecture compute machines to a multi-architecture cluster in the following ways:

  • Adding 64-bit x86 compute machines to a cluster that uses 64-bit ARM control plane machines and already includes 64-bit ARM compute machines. In this case, 64-bit x86 is considered the secondary architecture.
  • Adding 64-bit ARM compute machines to a cluster that uses 64-bit x86 control plane machines and already includes 64-bit x86 compute machines. In this case, 64-bit ARM is considered the secondary architecture.

To create a custom compute machine set on Azure, see "Creating a compute machine set on Azure".

Note

Before adding a secondary architecture node to your cluster, it is recommended to install the Multiarch Tuning Operator, and deploy a ClusterPodPlacementConfig custom resource. For more information, see "Managing workloads on multi-architecture clusters by using the Multiarch Tuning Operator".

Prerequisites

  • You installed the OpenShift CLI (oc).
  • You created a 64-bit ARM or 64-bit x86 boot image.
  • You used the installation program to create a 64-bit ARM or 64-bit x86 single-architecture Azure cluster with the multi-architecture installer binary.

Procedure

  1. Log in to the OpenShift CLI (oc).
  2. Create a YAML file, and add the configuration to create a compute machine set to control the 64-bit ARM or 64-bit x86 compute nodes in your cluster.

    Example MachineSet object for an Azure 64-bit ARM or 64-bit x86 compute node

    apiVersion: machine.openshift.io/v1beta1
    kind: MachineSet
    metadata:
      labels:
        machine.openshift.io/cluster-api-cluster: <infrastructure_id>
        machine.openshift.io/cluster-api-machine-role: worker
        machine.openshift.io/cluster-api-machine-type: worker
      name: <infrastructure_id>-machine-set-0
      namespace: openshift-machine-api
    spec:
      replicas: 2
      selector:
        matchLabels:
          machine.openshift.io/cluster-api-cluster: <infrastructure_id>
          machine.openshift.io/cluster-api-machineset: <infrastructure_id>-machine-set-0
      template:
        metadata:
          labels:
            machine.openshift.io/cluster-api-cluster: <infrastructure_id>
            machine.openshift.io/cluster-api-machine-role: worker
            machine.openshift.io/cluster-api-machine-type: worker
            machine.openshift.io/cluster-api-machineset: <infrastructure_id>-machine-set-0
        spec:
          lifecycleHooks: {}
          metadata: {}
          providerSpec:
            value:
              acceleratedNetworking: true
              apiVersion: machine.openshift.io/v1beta1
              credentialsSecret:
                name: azure-cloud-credentials
                namespace: openshift-machine-api
              image:
                offer: ""
                publisher: ""
                resourceID: /resourceGroups/${RESOURCE_GROUP}/providers/Microsoft.Compute/galleries/${GALLERY_NAME}/images/rhcos-arm64/versions/1.0.0 
    1
    
                sku: ""
                version: ""
              kind: AzureMachineProviderSpec
              location: <region>
              managedIdentity: <infrastructure_id>-identity
              networkResourceGroup: <infrastructure_id>-rg
              osDisk:
                diskSettings: {}
                diskSizeGB: 128
                managedDisk:
                  storageAccountType: Premium_LRS
                osType: Linux
              publicIP: false
              publicLoadBalancer: <infrastructure_id>
              resourceGroup: <infrastructure_id>-rg
              subnet: <infrastructure_id>-worker-subnet
              userDataSecret:
                name: worker-user-data
              vmSize: Standard_D4ps_v5 
    2
    
              vnet: <infrastructure_id>-vnet
              zone: "<zone>"
    Copy to Clipboard Toggle word wrap

    1
    Set the resourceID parameter to either arm64 or amd64 boot image.
    2
    Set the vmSize parameter to the instance type used in your installation. Some example instance types are Standard_D4ps_v5 or D8ps.
  3. Create the compute machine set by running the following command:

    $ oc create -f <file_name> 
    1
    Copy to Clipboard Toggle word wrap
    1
    Replace <file_name> with the name of the YAML file with compute machine set configuration. For example: arm64-machine-set-0.yaml, or amd64-machine-set-0.yaml.

Verification

  1. Verify that the new machines are running by running the following command:

    $ oc get machineset -n openshift-machine-api
    Copy to Clipboard Toggle word wrap

    The output must include the machine set that you created.

    Example output

    NAME                                                DESIRED  CURRENT  READY  AVAILABLE  AGE
    <infrastructure_id>-machine-set-0                   2        2      2          2  10m
    Copy to Clipboard Toggle word wrap

  2. You can check if the nodes are ready and schedulable by running the following command:

    $ oc get nodes
    Copy to Clipboard Toggle word wrap

To create an AWS cluster with multi-architecture compute machines, you must first create a single-architecture AWS installer-provisioned cluster with the multi-architecture installer binary. For more information on AWS installations, see Installing a cluster on AWS with customizations.

You can also migrate your current cluster with single-architecture compute machines to a cluster with multi-architecture compute machines. For more information, see Migrating to a cluster with multi-architecture compute machines.

After creating a multi-architecture cluster, you can add nodes with different architectures to the cluster.

3.3.1. Verifying cluster compatibility

Before you can start adding compute nodes of different architectures to your cluster, you must verify that your cluster is multi-architecture compatible.

Prerequisites

  • You installed the OpenShift CLI (oc).

Procedure

  1. Log in to the OpenShift CLI (oc).
  2. You can check that your cluster uses the architecture payload by running the following command:

    $ oc adm release info -o jsonpath="{ .metadata.metadata}"
    Copy to Clipboard Toggle word wrap

Verification

  • If you see the following output, your cluster is using the multi-architecture payload:

    {
     "release.openshift.io/architecture": "multi",
     "url": "https://access.redhat.com/errata/<errata_version>"
    }
    Copy to Clipboard Toggle word wrap

    You can then begin adding multi-arch compute nodes to your cluster.

  • If you see the following output, your cluster is not using the multi-architecture payload:

    {
     "url": "https://access.redhat.com/errata/<errata_version>"
    }
    Copy to Clipboard Toggle word wrap
    Important

    To migrate your cluster so the cluster supports multi-architecture compute machines, follow the procedure in Migrating to a cluster with multi-architecture compute machines.

After creating a multi-architecture cluster, you can add nodes with different architectures.

You can add multi-architecture compute machines to a multi-architecture cluster in the following ways:

  • Adding 64-bit x86 compute machines to a cluster that uses 64-bit ARM control plane machines and already includes 64-bit ARM compute machines. In this case, 64-bit x86 is considered the secondary architecture.
  • Adding 64-bit ARM compute machines to a cluster that uses 64-bit x86 control plane machines and already includes 64-bit x86 compute machines. In this case, 64-bit ARM is considered the secondary architecture.
Note

Before adding a secondary architecture node to your cluster, it is recommended to install the Multiarch Tuning Operator, and deploy a ClusterPodPlacementConfig custom resource. For more information, see "Managing workloads on multi-architecture clusters by using the Multiarch Tuning Operator".

Prerequisites

  • You installed the OpenShift CLI (oc).
  • You used the installation program to create an 64-bit ARM or 64-bit x86 single-architecture AWS cluster with the multi-architecture installer binary.

Procedure

  1. Log in to the OpenShift CLI (oc).
  2. Create a YAML file, and add the configuration to create a compute machine set to control the 64-bit ARM or 64-bit x86 compute nodes in your cluster.

    Example MachineSet object for an AWS 64-bit ARM or x86 compute node

    apiVersion: machine.openshift.io/v1beta1
    kind: MachineSet
    metadata:
      labels:
        machine.openshift.io/cluster-api-cluster: <infrastructure_id> 
    1
    
      name: <infrastructure_id>-aws-machine-set-0 
    2
    
      namespace: openshift-machine-api
    spec:
      replicas: 1
      selector:
        matchLabels:
          machine.openshift.io/cluster-api-cluster: <infrastructure_id> 
    3
    
          machine.openshift.io/cluster-api-machineset: <infrastructure_id>-<role>-<zone> 
    4
    
      template:
        metadata:
          labels:
            machine.openshift.io/cluster-api-cluster: <infrastructure_id>
            machine.openshift.io/cluster-api-machine-role: <role> 
    5
    
            machine.openshift.io/cluster-api-machine-type: <role> 
    6
    
            machine.openshift.io/cluster-api-machineset: <infrastructure_id>-<role>-<zone> 
    7
    
        spec:
          metadata:
            labels:
              node-role.kubernetes.io/<role>: ""
          providerSpec:
            value:
              ami:
                id: ami-02a574449d4f4d280 
    8
    
              apiVersion: awsproviderconfig.openshift.io/v1beta1
              blockDevices:
                - ebs:
                    iops: 0
                    volumeSize: 120
                    volumeType: gp2
              credentialsSecret:
                name: aws-cloud-credentials
              deviceIndex: 0
              iamInstanceProfile:
                id: <infrastructure_id>-worker-profile 
    9
    
              instanceType: m6g.xlarge 
    10
    
              kind: AWSMachineProviderConfig
              placement:
                availabilityZone: us-east-1a 
    11
    
                region: <region> 
    12
    
              securityGroups:
                - filters:
                    - name: tag:Name
                      values:
                        - <infrastructure_id>-node 
    13
    
              subnet:
                filters:
                  - name: tag:Name
                    values:
                      - <infrastructure_id>-subnet-private-<zone>
              tags:
                - name: kubernetes.io/cluster/<infrastructure_id> 
    14
    
                  value: owned
                - name: <custom_tag_name>
                  value: <custom_tag_value>
              userDataSecret:
                name: worker-user-data
    Copy to Clipboard Toggle word wrap

    1 2 3 9 13 14
    Specify the infrastructure ID that is based on the cluster ID that you set when you provisioned the cluster. If you have the OpenShift CLI (oc) installed, you can obtain the infrastructure ID by running the following command:
    $ oc get -o jsonpath="{.status.infrastructureName}{'\n'}" infrastructure cluster
    Copy to Clipboard Toggle word wrap
    4 7
    Specify the infrastructure ID, role node label, and zone.
    5 6
    Specify the role node label to add.
    8
    Specify a Red Hat Enterprise Linux CoreOS (RHCOS) Amazon Machine Image (AMI) for your AWS region for the nodes. The RHCOS AMI must be compatible with the machine architecture.
    $ oc get configmap/coreos-bootimages \
    	  -n openshift-machine-config-operator \
    	  -o jsonpath='{.data.stream}' | jq \
    	  -r '.architectures.<arch>.images.aws.regions."<region>".image'
    Copy to Clipboard Toggle word wrap
    10
    Specify a machine type that aligns with the CPU architecture of the chosen AMI. For more information, see "Tested instance types for AWS 64-bit ARM"
    11
    Specify the zone. For example, us-east-1a. Ensure that the zone you select has machines with the required architecture.
    12
    Specify the region. For example, us-east-1. Ensure that the zone you select has machines with the required architecture.
  3. Create the compute machine set by running the following command:

    $ oc create -f <file_name> 
    1
    Copy to Clipboard Toggle word wrap
    1
    Replace <file_name> with the name of the YAML file with compute machine set configuration. For example: aws-arm64-machine-set-0.yaml, or aws-amd64-machine-set-0.yaml.

Verification

  1. View the list of compute machine sets by running the following command:

    $ oc get machineset -n openshift-machine-api
    Copy to Clipboard Toggle word wrap

    The output must include the machine set that you created.

    Example output

    NAME                                                DESIRED  CURRENT  READY  AVAILABLE  AGE
    <infrastructure_id>-aws-machine-set-0                   2        2      2          2  10m
    Copy to Clipboard Toggle word wrap

  2. You can check if the nodes are ready and schedulable by running the following command:

    $ oc get nodes
    Copy to Clipboard Toggle word wrap

To create a Google Cloud Platform (GCP) cluster with multi-architecture compute machines, you must first create a single-architecture GCP installer-provisioned cluster with the multi-architecture installer binary. For more information on AWS installations, see Installing a cluster on GCP with customizations.

You can also migrate your current cluster with single-architecture compute machines to a cluster with multi-architecture compute machines. For more information, see Migrating to a cluster with multi-architecture compute machines.

After creating a multi-architecture cluster, you can add nodes with different architectures to the cluster.

Note

Secure booting is currently not supported on 64-bit ARM machines for GCP

3.4.1. Verifying cluster compatibility

Before you can start adding compute nodes of different architectures to your cluster, you must verify that your cluster is multi-architecture compatible.

Prerequisites

  • You installed the OpenShift CLI (oc).

Procedure

  1. Log in to the OpenShift CLI (oc).
  2. You can check that your cluster uses the architecture payload by running the following command:

    $ oc adm release info -o jsonpath="{ .metadata.metadata}"
    Copy to Clipboard Toggle word wrap

Verification

  • If you see the following output, your cluster is using the multi-architecture payload:

    {
     "release.openshift.io/architecture": "multi",
     "url": "https://access.redhat.com/errata/<errata_version>"
    }
    Copy to Clipboard Toggle word wrap

    You can then begin adding multi-arch compute nodes to your cluster.

  • If you see the following output, your cluster is not using the multi-architecture payload:

    {
     "url": "https://access.redhat.com/errata/<errata_version>"
    }
    Copy to Clipboard Toggle word wrap
    Important

    To migrate your cluster so the cluster supports multi-architecture compute machines, follow the procedure in Migrating to a cluster with multi-architecture compute machines.

After creating a multi-architecture cluster, you can add nodes with different architectures.

You can add multi-architecture compute machines to a multi-architecture cluster in the following ways:

  • Adding 64-bit x86 compute machines to a cluster that uses 64-bit ARM control plane machines and already includes 64-bit ARM compute machines. In this case, 64-bit x86 is considered the secondary architecture.
  • Adding 64-bit ARM compute machines to a cluster that uses 64-bit x86 control plane machines and already includes 64-bit x86 compute machines. In this case, 64-bit ARM is considered the secondary architecture.
Note

Before adding a secondary architecture node to your cluster, it is recommended to install the Multiarch Tuning Operator, and deploy a ClusterPodPlacementConfig custom resource. For more information, see "Managing workloads on multi-architecture clusters by using the Multiarch Tuning Operator".

Prerequisites

  • You installed the OpenShift CLI (oc).
  • You used the installation program to create a 64-bit x86 or 64-bit ARM single-architecture GCP cluster with the multi-architecture installer binary.

Procedure

  1. Log in to the OpenShift CLI (oc).
  2. Create a YAML file, and add the configuration to create a compute machine set to control the 64-bit ARM or 64-bit x86 compute nodes in your cluster.

    Example MachineSet object for a GCP 64-bit ARM or 64-bit x86 compute node

    apiVersion: machine.openshift.io/v1beta1
    kind: MachineSet
    metadata:
      labels:
        machine.openshift.io/cluster-api-cluster: <infrastructure_id> 
    1
    
      name: <infrastructure_id>-w-a
      namespace: openshift-machine-api
    spec:
      replicas: 1
      selector:
        matchLabels:
          machine.openshift.io/cluster-api-cluster: <infrastructure_id>
          machine.openshift.io/cluster-api-machineset: <infrastructure_id>-w-a
      template:
        metadata:
          creationTimestamp: null
          labels:
            machine.openshift.io/cluster-api-cluster: <infrastructure_id>
            machine.openshift.io/cluster-api-machine-role: <role> 
    2
    
            machine.openshift.io/cluster-api-machine-type: <role>
            machine.openshift.io/cluster-api-machineset: <infrastructure_id>-w-a
        spec:
          metadata:
            labels:
              node-role.kubernetes.io/<role>: ""
          providerSpec:
            value:
              apiVersion: gcpprovider.openshift.io/v1beta1
              canIPForward: false
              credentialsSecret:
                name: gcp-cloud-credentials
              deletionProtection: false
              disks:
              - autoDelete: true
                boot: true
                image: <path_to_image> 
    3
    
                labels: null
                sizeGb: 128
                type: pd-ssd
              gcpMetadata: 
    4
    
              - key: <custom_metadata_key>
                value: <custom_metadata_value>
              kind: GCPMachineProviderSpec
              machineType: n1-standard-4 
    5
    
              metadata:
                creationTimestamp: null
              networkInterfaces:
              - network: <infrastructure_id>-network
                subnetwork: <infrastructure_id>-worker-subnet
              projectID: <project_name> 
    6
    
              region: us-central1 
    7
    
              serviceAccounts:
              - email: <infrastructure_id>-w@<project_name>.iam.gserviceaccount.com
                scopes:
                - https://www.googleapis.com/auth/cloud-platform
              tags:
                - <infrastructure_id>-worker
              userDataSecret:
                name: worker-user-data
              zone: us-central1-a
    Copy to Clipboard Toggle word wrap

    1
    Specify the infrastructure ID that is based on the cluster ID that you set when you provisioned the cluster. You can obtain the infrastructure ID by running the following command:
    $ oc get -o jsonpath='{.status.infrastructureName}{"\n"}' infrastructure cluster
    Copy to Clipboard Toggle word wrap
    2
    Specify the role node label to add.
    3
    Specify the path to the image that is used in current compute machine sets. You need the project and image name for your path to image.

    To access the project and image name, run the following command:

    $ oc get configmap/coreos-bootimages \
      -n openshift-machine-config-operator \
      -o jsonpath='{.data.stream}' | jq \
      -r '.architectures.aarch64.images.gcp'
    Copy to Clipboard Toggle word wrap

    Example output

      "gcp": {
        "release": "415.92.202309142014-0",
        "project": "rhcos-cloud",
        "name": "rhcos-415-92-202309142014-0-gcp-aarch64"
      }
    Copy to Clipboard Toggle word wrap

    Use the project and name parameters from the output to create the path to image field in your machine set. The path to the image should follow the following format:

    $ projects/<project>/global/images/<image_name>
    Copy to Clipboard Toggle word wrap
    4
    Optional: Specify custom metadata in the form of a key:value pair. For example use cases, see the GCP documentation for setting custom metadata.
    5
    Specify a machine type that aligns with the CPU architecture of the chosen OS image. For more information, see "Tested instance types for GCP on 64-bit ARM infrastructures".
    6
    Specify the name of the GCP project that you use for your cluster.
    7
    Specify the region. For example, us-central1. Ensure that the zone you select has machines with the required architecture.
  3. Create the compute machine set by running the following command:

    $ oc create -f <file_name> 
    1
    Copy to Clipboard Toggle word wrap
    1
    Replace <file_name> with the name of the YAML file with compute machine set configuration. For example: gcp-arm64-machine-set-0.yaml, or gcp-amd64-machine-set-0.yaml.

Verification

  1. View the list of compute machine sets by running the following command:

    $ oc get machineset -n openshift-machine-api
    Copy to Clipboard Toggle word wrap

    The output must include the machine set that you created.

    Example output

    NAME                                                DESIRED  CURRENT  READY  AVAILABLE  AGE
    <infrastructure_id>-gcp-machine-set-0                   2        2      2          2  10m
    Copy to Clipboard Toggle word wrap

  2. You can check if the nodes are ready and schedulable by running the following command:

    $ oc get nodes
    Copy to Clipboard Toggle word wrap

To create a cluster with multi-architecture compute machines on bare metal (x86_64 or aarch64), IBM Power® (ppc64le), or IBM Z® (s390x) you must have an existing single-architecture cluster on one of these platforms. Follow the installations procedures for your platform:

Important

The bare metal installer-provisioned infrastructure and the Bare Metal Operator do not support adding secondary architecture nodes during the initial cluster setup. You can add secondary architecture nodes manually only after the initial cluster setup.

Before you can add additional compute nodes to your cluster, you must upgrade your cluster to one that uses the multi-architecture payload. For more information on migrating to the multi-architecture payload, see Migrating to a cluster with multi-architecture compute machines.

The following procedures explain how to create a RHCOS compute machine using an ISO image or network PXE booting. This allows you to add additional nodes to your cluster and deploy a cluster with multi-architecture compute machines.

Note

Before adding a secondary architecture node to your cluster, it is recommended to install the Multiarch Tuning Operator, and deploy a ClusterPodPlacementConfig object. For more information, see Managing workloads on multi-architecture clusters by using the Multiarch Tuning Operator.

3.5.1. Verifying cluster compatibility

Before you can start adding compute nodes of different architectures to your cluster, you must verify that your cluster is multi-architecture compatible.

Prerequisites

  • You installed the OpenShift CLI (oc).

Procedure

  1. Log in to the OpenShift CLI (oc).
  2. You can check that your cluster uses the architecture payload by running the following command:

    $ oc adm release info -o jsonpath="{ .metadata.metadata}"
    Copy to Clipboard Toggle word wrap

Verification

  • If you see the following output, your cluster is using the multi-architecture payload:

    {
     "release.openshift.io/architecture": "multi",
     "url": "https://access.redhat.com/errata/<errata_version>"
    }
    Copy to Clipboard Toggle word wrap

    You can then begin adding multi-arch compute nodes to your cluster.

  • If you see the following output, your cluster is not using the multi-architecture payload:

    {
     "url": "https://access.redhat.com/errata/<errata_version>"
    }
    Copy to Clipboard Toggle word wrap
    Important

    To migrate your cluster so the cluster supports multi-architecture compute machines, follow the procedure in Migrating to a cluster with multi-architecture compute machines.

3.5.2. Creating RHCOS machines using an ISO image

You can create more Red Hat Enterprise Linux CoreOS (RHCOS) compute machines for your bare metal cluster by using an ISO image to create the machines.

Prerequisites

  • Obtain the URL of the Ignition config file for the compute machines for your cluster. You uploaded this file to your HTTP server during installation.
  • You must have the OpenShift CLI (oc) installed.

Procedure

  1. Extract the Ignition config file from the cluster by running the following command:

    $ oc extract -n openshift-machine-api secret/worker-user-data-managed --keys=userData --to=- > worker.ign
    Copy to Clipboard Toggle word wrap
  2. Upload the worker.ign Ignition config file you exported from your cluster to your HTTP server. Note the URLs of these files.
  3. You can validate that the ignition files are available on the URLs. The following example gets the Ignition config files for the compute node:

    $ curl -k http://<HTTP_server>/worker.ign
    Copy to Clipboard Toggle word wrap
  4. You can access the ISO image for booting your new machine by running to following command:

    RHCOS_VHD_ORIGIN_URL=$(oc -n openshift-machine-config-operator get configmap/coreos-bootimages -o jsonpath='{.data.stream}' | jq -r '.architectures.<architecture>.artifacts.metal.formats.iso.disk.location')
    Copy to Clipboard Toggle word wrap
  5. Use the ISO file to install RHCOS on more compute machines. Use the same method that you used when you created machines before you installed the cluster:

    • Burn the ISO image to a disk and boot it directly.
    • Use ISO redirection with a LOM interface.
  6. Boot the RHCOS ISO image without specifying any options, or interrupting the live boot sequence. Wait for the installer to boot into a shell prompt in the RHCOS live environment.

    Note

    You can interrupt the RHCOS installation boot process to add kernel arguments. However, for this ISO procedure you must use the coreos-installer command as outlined in the following steps, instead of adding kernel arguments.

  7. Run the coreos-installer command and specify the options that meet your installation requirements. At a minimum, you must specify the URL that points to the Ignition config file for the node type, and the device that you are installing to:

    $ sudo coreos-installer install --ignition-url=http://<HTTP_server>/<node_type>.ign <device> --ignition-hash=sha512-<digest> 
    1
    2
    Copy to Clipboard Toggle word wrap
    1
    You must run the coreos-installer command by using sudo, because the core user does not have the required root privileges to perform the installation.
    2
    The --ignition-hash option is required when the Ignition config file is obtained through an HTTP URL to validate the authenticity of the Ignition config file on the cluster node. <digest> is the Ignition config file SHA512 digest obtained in a preceding step.
    Note

    If you want to provide your Ignition config files through an HTTPS server that uses TLS, you can add the internal certificate authority (CA) to the system trust store before running coreos-installer.

    The following example initializes a bootstrap node installation to the /dev/sda device. The Ignition config file for the bootstrap node is obtained from an HTTP web server with the IP address 192.168.1.2:

    $ sudo coreos-installer install --ignition-url=http://192.168.1.2:80/installation_directory/bootstrap.ign /dev/sda --ignition-hash=sha512-a5a2d43879223273c9b60af66b44202a1d1248fc01cf156c46d4a79f552b6bad47bc8cc78ddf0116e80c59d2ea9e32ba53bc807afbca581aa059311def2c3e3b
    Copy to Clipboard Toggle word wrap
  8. Monitor the progress of the RHCOS installation on the console of the machine.

    Important

    Ensure that the installation is successful on each node before commencing with the OpenShift Container Platform installation. Observing the installation process can also help to determine the cause of RHCOS installation issues that might arise.

  9. Continue to create more compute machines for your cluster.

You can create more Red Hat Enterprise Linux CoreOS (RHCOS) compute machines for your bare metal cluster by using PXE or iPXE booting.

Prerequisites

  • Obtain the URL of the Ignition config file for the compute machines for your cluster. You uploaded this file to your HTTP server during installation.
  • Obtain the URLs of the RHCOS ISO image, compressed metal BIOS, kernel, and initramfs files that you uploaded to your HTTP server during cluster installation.
  • You have access to the PXE booting infrastructure that you used to create the machines for your OpenShift Container Platform cluster during installation. The machines must boot from their local disks after RHCOS is installed on them.
  • If you use UEFI, you have access to the grub.conf file that you modified during OpenShift Container Platform installation.

Procedure

  1. Confirm that your PXE or iPXE installation for the RHCOS images is correct.

    • For PXE:

      DEFAULT pxeboot
      TIMEOUT 20
      PROMPT 0
      LABEL pxeboot
          KERNEL http://<HTTP_server>/rhcos-<version>-live-kernel-<architecture> 
      1
      
          APPEND initrd=http://<HTTP_server>/rhcos-<version>-live-initramfs.<architecture>.img coreos.inst.install_dev=/dev/sda coreos.inst.ignition_url=http://<HTTP_server>/worker.ign coreos.live.rootfs_url=http://<HTTP_server>/rhcos-<version>-live-rootfs.<architecture>.img 
      2
      Copy to Clipboard Toggle word wrap
      1
      Specify the location of the live kernel file that you uploaded to your HTTP server.
      2
      Specify locations of the RHCOS files that you uploaded to your HTTP server. The initrd parameter value is the location of the live initramfs file, the coreos.inst.ignition_url parameter value is the location of the worker Ignition config file, and the coreos.live.rootfs_url parameter value is the location of the live rootfs file. The coreos.inst.ignition_url and coreos.live.rootfs_url parameters only support HTTP and HTTPS.
      Note

      This configuration does not enable serial console access on machines with a graphical console. To configure a different console, add one or more console= arguments to the APPEND line. For example, add console=tty0 console=ttyS0 to set the first PC serial port as the primary console and the graphical console as a secondary console. For more information, see How does one set up a serial terminal and/or console in Red Hat Enterprise Linux?.

    • For iPXE (x86_64 + aarch64):

      kernel http://<HTTP_server>/rhcos-<version>-live-kernel-<architecture> initrd=main coreos.live.rootfs_url=http://<HTTP_server>/rhcos-<version>-live-rootfs.<architecture>.img coreos.inst.install_dev=/dev/sda coreos.inst.ignition_url=http://<HTTP_server>/worker.ign 
      1
       
      2
      
      initrd --name main http://<HTTP_server>/rhcos-<version>-live-initramfs.<architecture>.img 
      3
      
      boot
      Copy to Clipboard Toggle word wrap
      1
      Specify the locations of the RHCOS files that you uploaded to your HTTP server. The kernel parameter value is the location of the kernel file, the initrd=main argument is needed for booting on UEFI systems, the coreos.live.rootfs_url parameter value is the location of the rootfs file, and the coreos.inst.ignition_url parameter value is the location of the worker Ignition config file.
      2
      If you use multiple NICs, specify a single interface in the ip option. For example, to use DHCP on a NIC that is named eno1, set ip=eno1:dhcp.
      3
      Specify the location of the initramfs file that you uploaded to your HTTP server.
      Note

      This configuration does not enable serial console access on machines with a graphical console To configure a different console, add one or more console= arguments to the kernel line. For example, add console=tty0 console=ttyS0 to set the first PC serial port as the primary console and the graphical console as a secondary console. For more information, see How does one set up a serial terminal and/or console in Red Hat Enterprise Linux? and "Enabling the serial console for PXE and ISO installation" in the "Advanced RHCOS installation configuration" section.

      Note

      To network boot the CoreOS kernel on aarch64 architecture, you need to use a version of iPXE build with the IMAGE_GZIP option enabled. See IMAGE_GZIP option in iPXE.

    • For PXE (with UEFI and GRUB as second stage) on aarch64:

      menuentry 'Install CoreOS' {
          linux rhcos-<version>-live-kernel-<architecture>  coreos.live.rootfs_url=http://<HTTP_server>/rhcos-<version>-live-rootfs.<architecture>.img coreos.inst.install_dev=/dev/sda coreos.inst.ignition_url=http://<HTTP_server>/worker.ign 
      1
       
      2
      
          initrd rhcos-<version>-live-initramfs.<architecture>.img 
      3
      
      }
      Copy to Clipboard Toggle word wrap
      1
      Specify the locations of the RHCOS files that you uploaded to your HTTP/TFTP server. The kernel parameter value is the location of the kernel file on your TFTP server. The coreos.live.rootfs_url parameter value is the location of the rootfs file, and the coreos.inst.ignition_url parameter value is the location of the worker Ignition config file on your HTTP Server.
      2
      If you use multiple NICs, specify a single interface in the ip option. For example, to use DHCP on a NIC that is named eno1, set ip=eno1:dhcp.
      3
      Specify the location of the initramfs file that you uploaded to your TFTP server.
  2. Use the PXE or iPXE infrastructure to create the required compute machines for your cluster.

When you add machines to a cluster, two pending certificate signing requests (CSRs) are generated for each machine that you added. You must confirm that these CSRs are approved or, if necessary, approve them yourself. The client requests must be approved first, followed by the server requests.

Prerequisites

  • You added machines to your cluster.

Procedure

  1. Confirm that the cluster recognizes the machines:

    $ oc get nodes
    Copy to Clipboard Toggle word wrap

    Example output

    NAME      STATUS    ROLES   AGE  VERSION
    master-0  Ready     master  63m  v1.33.4
    master-1  Ready     master  63m  v1.33.4
    master-2  Ready     master  64m  v1.33.4
    Copy to Clipboard Toggle word wrap

    The output lists all of the machines that you created.

    Note

    The preceding output might not include the compute nodes, also known as worker nodes, until some CSRs are approved.

  2. Review the pending CSRs and ensure that you see the client requests with the Pending or Approved status for each machine that you added to the cluster:

    $ oc get csr
    Copy to Clipboard Toggle word wrap

    Example output

    NAME        AGE     REQUESTOR                                                                   CONDITION
    csr-8b2br   15m     system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Pending
    csr-8vnps   15m     system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Pending
    ...
    Copy to Clipboard Toggle word wrap

    In this example, two machines are joining the cluster. You might see more approved CSRs in the list.

  3. If the CSRs were not approved, after all of the pending CSRs for the machines you added are in Pending status, approve the CSRs for your cluster machines:

    Note

    Because the CSRs rotate automatically, approve your CSRs within an hour of adding the machines to the cluster. If you do not approve them within an hour, the certificates will rotate, and more than two certificates will be present for each node. You must approve all of these certificates. After the client CSR is approved, the Kubelet creates a secondary CSR for the serving certificate, which requires manual approval. Then, subsequent serving certificate renewal requests are automatically approved by the machine-approver if the Kubelet requests a new certificate with identical parameters.

    Note

    For clusters running on platforms that are not machine API enabled, such as bare metal and other user-provisioned infrastructure, you must implement a method of automatically approving the kubelet serving certificate requests (CSRs). If a request is not approved, then the oc exec, oc rsh, and oc logs commands cannot succeed, because a serving certificate is required when the API server connects to the kubelet. Any operation that contacts the Kubelet endpoint requires this certificate approval to be in place. The method must watch for new CSRs, confirm that the CSR was submitted by the node-bootstrapper service account in the system:node or system:admin groups, and confirm the identity of the node.

    • To approve them individually, run the following command for each valid CSR:

      $ oc adm certificate approve <csr_name> 
      1
      Copy to Clipboard Toggle word wrap
      1
      <csr_name> is the name of a CSR from the list of current CSRs.
    • To approve all pending CSRs, run the following command:

      $ oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs --no-run-if-empty oc adm certificate approve
      Copy to Clipboard Toggle word wrap
      Note

      Some Operators might not become available until some CSRs are approved.

  4. Now that your client requests are approved, you must review the server requests for each machine that you added to the cluster:

    $ oc get csr
    Copy to Clipboard Toggle word wrap

    Example output

    NAME        AGE     REQUESTOR                                                                   CONDITION
    csr-bfd72   5m26s   system:node:ip-10-0-50-126.us-east-2.compute.internal                       Pending
    csr-c57lv   5m26s   system:node:ip-10-0-95-157.us-east-2.compute.internal                       Pending
    ...
    Copy to Clipboard Toggle word wrap

  5. If the remaining CSRs are not approved, and are in the Pending status, approve the CSRs for your cluster machines:

    • To approve them individually, run the following command for each valid CSR:

      $ oc adm certificate approve <csr_name> 
      1
      Copy to Clipboard Toggle word wrap
      1
      <csr_name> is the name of a CSR from the list of current CSRs.
    • To approve all pending CSRs, run the following command:

      $ oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs oc adm certificate approve
      Copy to Clipboard Toggle word wrap
  6. After all client and server CSRs have been approved, the machines have the Ready status. Verify this by running the following command:

    $ oc get nodes
    Copy to Clipboard Toggle word wrap

    Example output

    NAME      STATUS    ROLES   AGE  VERSION
    master-0  Ready     master  73m  v1.33.4
    master-1  Ready     master  73m  v1.33.4
    master-2  Ready     master  74m  v1.33.4
    worker-0  Ready     worker  11m  v1.33.4
    worker-1  Ready     worker  11m  v1.33.4
    Copy to Clipboard Toggle word wrap

    Note

    It can take a few minutes after approval of the server CSRs for the machines to transition to the Ready status.

Additional information

To create a cluster with multi-architecture compute machines on IBM Z® and IBM® LinuxONE (s390x) with z/VM, you must have an existing single-architecture x86_64 cluster. You can then add s390x compute machines to your OpenShift Container Platform cluster.

Before you can add s390x nodes to your cluster, you must upgrade your cluster to one that uses the multi-architecture payload. For more information on migrating to the multi-architecture payload, see Migrating to a cluster with multi-architecture compute machines.

The following procedures explain how to create a RHCOS compute machine using a z/VM instance. This will allow you to add s390x nodes to your cluster and deploy a cluster with multi-architecture compute machines.

To create an IBM Z® or IBM® LinuxONE (s390x) cluster with multi-architecture compute machines on x86_64, follow the instructions for Installing a cluster on IBM Z® and IBM® LinuxONE. You can then add x86_64 compute machines as described in Creating a cluster with multi-architecture compute machines on bare metal, IBM Power, or IBM Z.

Note

Before adding a secondary architecture node to your cluster, it is recommended to install the Multiarch Tuning Operator, and deploy a ClusterPodPlacementConfig object. For more information, see Managing workloads on multi-architecture clusters by using the Multiarch Tuning Operator.

3.6.1. Verifying cluster compatibility

Before you can start adding compute nodes of different architectures to your cluster, you must verify that your cluster is multi-architecture compatible.

Prerequisites

  • You installed the OpenShift CLI (oc).

Procedure

  1. Log in to the OpenShift CLI (oc).
  2. You can check that your cluster uses the architecture payload by running the following command:

    $ oc adm release info -o jsonpath="{ .metadata.metadata}"
    Copy to Clipboard Toggle word wrap

Verification

  • If you see the following output, your cluster is using the multi-architecture payload:

    {
     "release.openshift.io/architecture": "multi",
     "url": "https://access.redhat.com/errata/<errata_version>"
    }
    Copy to Clipboard Toggle word wrap

    You can then begin adding multi-arch compute nodes to your cluster.

  • If you see the following output, your cluster is not using the multi-architecture payload:

    {
     "url": "https://access.redhat.com/errata/<errata_version>"
    }
    Copy to Clipboard Toggle word wrap
    Important

    To migrate your cluster so the cluster supports multi-architecture compute machines, follow the procedure in Migrating to a cluster with multi-architecture compute machines.

3.6.2. Creating RHCOS machines on IBM Z with z/VM

You can create more Red Hat Enterprise Linux CoreOS (RHCOS) compute machines running on IBM Z® with z/VM and attach them to your existing cluster.

Prerequisites

  • You have a domain name server (DNS) that can perform hostname and reverse lookup for the nodes.
  • You have an HTTP or HTTPS server running on your provisioning machine that is accessible to the machines you create.

Procedure

  1. Extract the Ignition config file from the cluster by running the following command:

    $ oc extract -n openshift-machine-api secret/worker-user-data-managed --keys=userData --to=- > worker.ign
    Copy to Clipboard Toggle word wrap
  2. Upload the worker.ign Ignition config file you exported from your cluster to your HTTP server. Note the URL of this file.
  3. You can validate that the Ignition file is available on the URL. The following example gets the Ignition config file for the compute node:

    $ curl -k http://<http_server>/worker.ign
    Copy to Clipboard Toggle word wrap
  4. Download the RHEL live kernel, initramfs, and rootfs files by running the following commands:

    $ curl -LO $(oc -n openshift-machine-config-operator get configmap/coreos-bootimages -o jsonpath='{.data.stream}' \
    | jq -r '.architectures.s390x.artifacts.metal.formats.pxe.kernel.location')
    Copy to Clipboard Toggle word wrap
    $ curl -LO $(oc -n openshift-machine-config-operator get configmap/coreos-bootimages -o jsonpath='{.data.stream}' \
    | jq -r '.architectures.s390x.artifacts.metal.formats.pxe.initramfs.location')
    Copy to Clipboard Toggle word wrap
    $ curl -LO $(oc -n openshift-machine-config-operator get configmap/coreos-bootimages -o jsonpath='{.data.stream}' \
    | jq -r '.architectures.s390x.artifacts.metal.formats.pxe.rootfs.location')
    Copy to Clipboard Toggle word wrap
  5. Move the downloaded RHEL live kernel, initramfs, and rootfs files to an HTTP or HTTPS server that is accessible from the RHCOS guest you want to add.
  6. Create a parameter file for the guest. The following parameters are specific for the virtual machine:

    • Optional: To specify a static IP address, add an ip= parameter with the following entries, with each separated by a colon:

      1. The IP address for the machine.
      2. An empty string.
      3. The gateway.
      4. The netmask.
      5. The machine host and domain name in the form hostname.domainname. If you omit this value, RHCOS obtains the hostname through a reverse DNS lookup.
      6. The network interface name. If you omit this value, RHCOS applies the IP configuration to all available interfaces.
      7. The value none.
    • For coreos.inst.ignition_url=, specify the URL to the worker.ign file. Only HTTP and HTTPS protocols are supported.
    • For coreos.live.rootfs_url=, specify the matching rootfs artifact for the kernel and initramfs you are booting. Only HTTP and HTTPS protocols are supported.
    • For installations on DASD-type disks, complete the following tasks:

      1. For coreos.inst.install_dev=, specify /dev/dasda.
      2. Use rd.dasd= to specify the DASD where RHCOS is to be installed.
      3. You can adjust further parameters if required.

        The following is an example parameter file, additional-worker-dasd.parm:

        cio_ignore=all,!condev rd.neednet=1 \
        console=ttysclp0 \
        coreos.inst.install_dev=/dev/dasda \
        coreos.inst.ignition_url=http://<http_server>/worker.ign \
        coreos.live.rootfs_url=http://<http_server>/rhcos-<version>-live-rootfs.<architecture>.img \
        ip=<ip>::<gateway>:<netmask>:<hostname>::none nameserver=<dns> \
        rd.znet=qeth,0.0.bdf0,0.0.bdf1,0.0.bdf2,layer2=1,portno=0 \
        rd.dasd=0.0.3490 \
        zfcp.allow_lun_scan=0
        Copy to Clipboard Toggle word wrap

        Write all options in the parameter file as a single line and make sure that you have no newline characters.

    • For installations on FCP-type disks, complete the following tasks:

      1. Use rd.zfcp=<adapter>,<wwpn>,<lun> to specify the FCP disk where RHCOS is to be installed. For multipathing, repeat this step for each additional path.

        Note

        When you install with multiple paths, you must enable multipathing directly after the installation, not at a later point in time, as this can cause problems.

      2. Set the install device as: coreos.inst.install_dev=/dev/sda.

        Note

        If additional LUNs are configured with NPIV, FCP requires zfcp.allow_lun_scan=0. If you must enable zfcp.allow_lun_scan=1 because you use a CSI driver, for example, you must configure your NPIV so that each node cannot access the boot partition of another node.

      3. You can adjust further parameters if required.

        Important

        Additional postinstallation steps are required to fully enable multipathing. For more information, see “Enabling multipathing with kernel arguments on RHCOS" in Machine configuration.

        The following is an example parameter file, additional-worker-fcp.parm for a worker node with multipathing:

        cio_ignore=all,!condev rd.neednet=1 \
        console=ttysclp0 \
        coreos.inst.install_dev=/dev/sda \
        coreos.live.rootfs_url=http://<http_server>/rhcos-<version>-live-rootfs.<architecture>.img \
        coreos.inst.ignition_url=http://<http_server>/worker.ign \
        ip=<ip>::<gateway>:<netmask>:<hostname>::none nameserver=<dns> \
        rd.znet=qeth,0.0.bdf0,0.0.bdf1,0.0.bdf2,layer2=1,portno=0 \
        zfcp.allow_lun_scan=0 \
        rd.zfcp=0.0.1987,0x50050763070bc5e3,0x4008400B00000000 \
        rd.zfcp=0.0.19C7,0x50050763070bc5e3,0x4008400B00000000 \
        rd.zfcp=0.0.1987,0x50050763071bc5e3,0x4008400B00000000 \
        rd.zfcp=0.0.19C7,0x50050763071bc5e3,0x4008400B00000000
        Copy to Clipboard Toggle word wrap

        Write all options in the parameter file as a single line and make sure that you have no newline characters.

  7. Transfer the initramfs, kernel, parameter files, and RHCOS images to z/VM, for example, by using FTP. For details about how to transfer the files with FTP and boot from the virtual reader, see Booting the installation on IBM Z® to install RHEL in z/VM.
  8. Punch the files to the virtual reader of the z/VM guest virtual machine.

    See PUNCH in IBM® Documentation.

    Tip

    You can use the CP PUNCH command or, if you use Linux, the vmur command to transfer files between two z/VM guest virtual machines.

  9. Log in to CMS on the bootstrap machine.
  10. IPL the bootstrap machine from the reader by running the following command:

    $ ipl c
    Copy to Clipboard Toggle word wrap

    See IPL in IBM® Documentation.

When you add machines to a cluster, two pending certificate signing requests (CSRs) are generated for each machine that you added. You must confirm that these CSRs are approved or, if necessary, approve them yourself. The client requests must be approved first, followed by the server requests.

Prerequisites

  • You added machines to your cluster.

Procedure

  1. Confirm that the cluster recognizes the machines:

    $ oc get nodes
    Copy to Clipboard Toggle word wrap

    Example output

    NAME      STATUS    ROLES   AGE  VERSION
    master-0  Ready     master  63m  v1.33.4
    master-1  Ready     master  63m  v1.33.4
    master-2  Ready     master  64m  v1.33.4
    Copy to Clipboard Toggle word wrap

    The output lists all of the machines that you created.

    Note

    The preceding output might not include the compute nodes, also known as worker nodes, until some CSRs are approved.

  2. Review the pending CSRs and ensure that you see the client requests with the Pending or Approved status for each machine that you added to the cluster:

    $ oc get csr
    Copy to Clipboard Toggle word wrap

    Example output

    NAME        AGE     REQUESTOR                                                                   CONDITION
    csr-8b2br   15m     system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Pending
    csr-8vnps   15m     system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Pending
    ...
    Copy to Clipboard Toggle word wrap

    In this example, two machines are joining the cluster. You might see more approved CSRs in the list.

  3. If the CSRs were not approved, after all of the pending CSRs for the machines you added are in Pending status, approve the CSRs for your cluster machines:

    Note

    Because the CSRs rotate automatically, approve your CSRs within an hour of adding the machines to the cluster. If you do not approve them within an hour, the certificates will rotate, and more than two certificates will be present for each node. You must approve all of these certificates. After the client CSR is approved, the Kubelet creates a secondary CSR for the serving certificate, which requires manual approval. Then, subsequent serving certificate renewal requests are automatically approved by the machine-approver if the Kubelet requests a new certificate with identical parameters.

    Note

    For clusters running on platforms that are not machine API enabled, such as bare metal and other user-provisioned infrastructure, you must implement a method of automatically approving the kubelet serving certificate requests (CSRs). If a request is not approved, then the oc exec, oc rsh, and oc logs commands cannot succeed, because a serving certificate is required when the API server connects to the kubelet. Any operation that contacts the Kubelet endpoint requires this certificate approval to be in place. The method must watch for new CSRs, confirm that the CSR was submitted by the node-bootstrapper service account in the system:node or system:admin groups, and confirm the identity of the node.

    • To approve them individually, run the following command for each valid CSR:

      $ oc adm certificate approve <csr_name> 
      1
      Copy to Clipboard Toggle word wrap
      1
      <csr_name> is the name of a CSR from the list of current CSRs.
    • To approve all pending CSRs, run the following command:

      $ oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs --no-run-if-empty oc adm certificate approve
      Copy to Clipboard Toggle word wrap
      Note

      Some Operators might not become available until some CSRs are approved.

  4. Now that your client requests are approved, you must review the server requests for each machine that you added to the cluster:

    $ oc get csr
    Copy to Clipboard Toggle word wrap

    Example output

    NAME        AGE     REQUESTOR                                                                   CONDITION
    csr-bfd72   5m26s   system:node:ip-10-0-50-126.us-east-2.compute.internal                       Pending
    csr-c57lv   5m26s   system:node:ip-10-0-95-157.us-east-2.compute.internal                       Pending
    ...
    Copy to Clipboard Toggle word wrap

  5. If the remaining CSRs are not approved, and are in the Pending status, approve the CSRs for your cluster machines:

    • To approve them individually, run the following command for each valid CSR:

      $ oc adm certificate approve <csr_name> 
      1
      Copy to Clipboard Toggle word wrap
      1
      <csr_name> is the name of a CSR from the list of current CSRs.
    • To approve all pending CSRs, run the following command:

      $ oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs oc adm certificate approve
      Copy to Clipboard Toggle word wrap
  6. After all client and server CSRs have been approved, the machines have the Ready status. Verify this by running the following command:

    $ oc get nodes
    Copy to Clipboard Toggle word wrap

    Example output

    NAME      STATUS    ROLES   AGE  VERSION
    master-0  Ready     master  73m  v1.33.4
    master-1  Ready     master  73m  v1.33.4
    master-2  Ready     master  74m  v1.33.4
    worker-0  Ready     worker  11m  v1.33.4
    worker-1  Ready     worker  11m  v1.33.4
    Copy to Clipboard Toggle word wrap

    Note

    It can take a few minutes after approval of the server CSRs for the machines to transition to the Ready status.

Additional information

To create a cluster with multi-architecture compute machines on IBM Z® and IBM® LinuxONE (s390x) in an LPAR, you must have an existing single-architecture x86_64 cluster. You can then add s390x compute machines to your OpenShift Container Platform cluster.

Before you can add s390x nodes to your cluster, you must upgrade your cluster to one that uses the multi-architecture payload. For more information on migrating to the multi-architecture payload, see Migrating to a cluster with multi-architecture compute machines.

The following procedures explain how to create a RHCOS compute machine using an LPAR instance. This will allow you to add s390x nodes to your cluster and deploy a cluster with multi-architecture compute machines.

Note

To create an IBM Z® or IBM® LinuxONE (s390x) cluster with multi-architecture compute machines on x86_64, follow the instructions for Installing a cluster on IBM Z® and IBM® LinuxONE. You can then add x86_64 compute machines as described in Creating a cluster with multi-architecture compute machines on bare metal, IBM Power, or IBM Z.

3.7.1. Verifying cluster compatibility

Before you can start adding compute nodes of different architectures to your cluster, you must verify that your cluster is multi-architecture compatible.

Prerequisites

  • You installed the OpenShift CLI (oc).

Procedure

  1. Log in to the OpenShift CLI (oc).
  2. You can check that your cluster uses the architecture payload by running the following command:

    $ oc adm release info -o jsonpath="{ .metadata.metadata}"
    Copy to Clipboard Toggle word wrap

Verification

  • If you see the following output, your cluster is using the multi-architecture payload:

    {
     "release.openshift.io/architecture": "multi",
     "url": "https://access.redhat.com/errata/<errata_version>"
    }
    Copy to Clipboard Toggle word wrap

    You can then begin adding multi-arch compute nodes to your cluster.

  • If you see the following output, your cluster is not using the multi-architecture payload:

    {
     "url": "https://access.redhat.com/errata/<errata_version>"
    }
    Copy to Clipboard Toggle word wrap
    Important

    To migrate your cluster so the cluster supports multi-architecture compute machines, follow the procedure in Migrating to a cluster with multi-architecture compute machines.

3.7.2. Creating RHCOS machines on IBM Z in an LPAR

You can create more Red Hat Enterprise Linux CoreOS (RHCOS) compute machines running on IBM Z® in a logical partition (LPAR) and attach them to your existing cluster.

Prerequisites

  • You have a domain name server (DNS) that can perform hostname and reverse lookup for the nodes.
  • You have an HTTP or HTTPS server running on your provisioning machine that is accessible to the machines you create.

Procedure

  1. Extract the Ignition config file from the cluster by running the following command:

    $ oc extract -n openshift-machine-api secret/worker-user-data-managed --keys=userData --to=- > worker.ign
    Copy to Clipboard Toggle word wrap
  2. Upload the worker.ign Ignition config file you exported from your cluster to your HTTP server. Note the URL of this file.
  3. You can validate that the Ignition file is available on the URL. The following example gets the Ignition config file for the compute node:

    $ curl -k http://<http_server>/worker.ign
    Copy to Clipboard Toggle word wrap
  4. Download the RHEL live kernel, initramfs, and rootfs files by running the following commands:

    $ curl -LO $(oc -n openshift-machine-config-operator get configmap/coreos-bootimages -o jsonpath='{.data.stream}' \
    | jq -r '.architectures.s390x.artifacts.metal.formats.pxe.kernel.location')
    Copy to Clipboard Toggle word wrap
    $ curl -LO $(oc -n openshift-machine-config-operator get configmap/coreos-bootimages -o jsonpath='{.data.stream}' \
    | jq -r '.architectures.s390x.artifacts.metal.formats.pxe.initramfs.location')
    Copy to Clipboard Toggle word wrap
    $ curl -LO $(oc -n openshift-machine-config-operator get configmap/coreos-bootimages -o jsonpath='{.data.stream}' \
    | jq -r '.architectures.s390x.artifacts.metal.formats.pxe.rootfs.location')
    Copy to Clipboard Toggle word wrap
  5. Move the downloaded RHEL live kernel, initramfs, and rootfs files to an HTTP or HTTPS server that is accessible from the RHCOS guest you want to add.
  6. Create a parameter file for the guest. The following parameters are specific for the virtual machine:

    • Optional: To specify a static IP address, add an ip= parameter with the following entries, with each separated by a colon:

      1. The IP address for the machine.
      2. An empty string.
      3. The gateway.
      4. The netmask.
      5. The machine host and domain name in the form hostname.domainname. If you omit this value, RHCOS obtains the hostname through a reverse DNS lookup.
      6. The network interface name. If you omit this value, RHCOS applies the IP configuration to all available interfaces.
      7. The value none.
    • For coreos.inst.ignition_url=, specify the URL to the worker.ign file. Only HTTP and HTTPS protocols are supported.
    • For coreos.live.rootfs_url=, specify the matching rootfs artifact for the kernel and initramfs you are booting. Only HTTP and HTTPS protocols are supported.
    • For installations on DASD-type disks, complete the following tasks:

      1. For coreos.inst.install_dev=, specify /dev/dasda.
      2. Use rd.dasd= to specify the DASD where RHCOS is to be installed.
      3. You can adjust further parameters if required.

        The following is an example parameter file, additional-worker-dasd.parm:

        cio_ignore=all,!condev rd.neednet=1 \
        console=ttysclp0 \
        coreos.inst.install_dev=/dev/dasda \
        coreos.inst.ignition_url=http://<http_server>/worker.ign \
        coreos.live.rootfs_url=http://<http_server>/rhcos-<version>-live-rootfs.<architecture>.img \
        ip=<ip>::<gateway>:<netmask>:<hostname>::none nameserver=<dns> \
        rd.znet=qeth,0.0.bdf0,0.0.bdf1,0.0.bdf2,layer2=1,portno=0 \
        rd.dasd=0.0.3490 \
        zfcp.allow_lun_scan=0
        Copy to Clipboard Toggle word wrap

        Write all options in the parameter file as a single line and make sure that you have no newline characters.

    • For installations on FCP-type disks, complete the following tasks:

      1. Use rd.zfcp=<adapter>,<wwpn>,<lun> to specify the FCP disk where RHCOS is to be installed. For multipathing, repeat this step for each additional path.

        Note

        When you install with multiple paths, you must enable multipathing directly after the installation, not at a later point in time, as this can cause problems.

      2. Set the install device as: coreos.inst.install_dev=/dev/sda.

        Note

        If additional LUNs are configured with NPIV, FCP requires zfcp.allow_lun_scan=0. If you must enable zfcp.allow_lun_scan=1 because you use a CSI driver, for example, you must configure your NPIV so that each node cannot access the boot partition of another node.

      3. You can adjust further parameters if required.

        Important

        Additional postinstallation steps are required to fully enable multipathing. For more information, see “Enabling multipathing with kernel arguments on RHCOS" in Machine configuration.

        The following is an example parameter file, additional-worker-fcp.parm for a worker node with multipathing:

        cio_ignore=all,!condev rd.neednet=1 \
        console=ttysclp0 \
        coreos.inst.install_dev=/dev/sda \
        coreos.live.rootfs_url=http://<http_server>/rhcos-<version>-live-rootfs.<architecture>.img \
        coreos.inst.ignition_url=http://<http_server>/worker.ign \
        ip=<ip>::<gateway>:<netmask>:<hostname>::none nameserver=<dns> \
        rd.znet=qeth,0.0.bdf0,0.0.bdf1,0.0.bdf2,layer2=1,portno=0 \
        zfcp.allow_lun_scan=0 \
        rd.zfcp=0.0.1987,0x50050763070bc5e3,0x4008400B00000000 \
        rd.zfcp=0.0.19C7,0x50050763070bc5e3,0x4008400B00000000 \
        rd.zfcp=0.0.1987,0x50050763071bc5e3,0x4008400B00000000 \
        rd.zfcp=0.0.19C7,0x50050763071bc5e3,0x4008400B00000000
        Copy to Clipboard Toggle word wrap

        Write all options in the parameter file as a single line and make sure that you have no newline characters.

  7. Transfer the initramfs, kernel, parameter files, and RHCOS images to the LPAR, for example with FTP. For details about how to transfer the files with FTP and boot, see Booting the installation on IBM Z® to install RHEL in an LPAR.
  8. Boot the machine

When you add machines to a cluster, two pending certificate signing requests (CSRs) are generated for each machine that you added. You must confirm that these CSRs are approved or, if necessary, approve them yourself. The client requests must be approved first, followed by the server requests.

Prerequisites

  • You added machines to your cluster.

Procedure

  1. Confirm that the cluster recognizes the machines:

    $ oc get nodes
    Copy to Clipboard Toggle word wrap

    Example output

    NAME      STATUS    ROLES   AGE  VERSION
    master-0  Ready     master  63m  v1.33.4
    master-1  Ready     master  63m  v1.33.4
    master-2  Ready     master  64m  v1.33.4
    Copy to Clipboard Toggle word wrap

    The output lists all of the machines that you created.

    Note

    The preceding output might not include the compute nodes, also known as worker nodes, until some CSRs are approved.

  2. Review the pending CSRs and ensure that you see the client requests with the Pending or Approved status for each machine that you added to the cluster:

    $ oc get csr
    Copy to Clipboard Toggle word wrap

    Example output

    NAME        AGE     REQUESTOR                                                                   CONDITION
    csr-8b2br   15m     system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Pending
    csr-8vnps   15m     system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Pending
    ...
    Copy to Clipboard Toggle word wrap

    In this example, two machines are joining the cluster. You might see more approved CSRs in the list.

  3. If the CSRs were not approved, after all of the pending CSRs for the machines you added are in Pending status, approve the CSRs for your cluster machines:

    Note

    Because the CSRs rotate automatically, approve your CSRs within an hour of adding the machines to the cluster. If you do not approve them within an hour, the certificates will rotate, and more than two certificates will be present for each node. You must approve all of these certificates. After the client CSR is approved, the Kubelet creates a secondary CSR for the serving certificate, which requires manual approval. Then, subsequent serving certificate renewal requests are automatically approved by the machine-approver if the Kubelet requests a new certificate with identical parameters.

    Note

    For clusters running on platforms that are not machine API enabled, such as bare metal and other user-provisioned infrastructure, you must implement a method of automatically approving the kubelet serving certificate requests (CSRs). If a request is not approved, then the oc exec, oc rsh, and oc logs commands cannot succeed, because a serving certificate is required when the API server connects to the kubelet. Any operation that contacts the Kubelet endpoint requires this certificate approval to be in place. The method must watch for new CSRs, confirm that the CSR was submitted by the node-bootstrapper service account in the system:node or system:admin groups, and confirm the identity of the node.

    • To approve them individually, run the following command for each valid CSR:

      $ oc adm certificate approve <csr_name> 
      1
      Copy to Clipboard Toggle word wrap
      1
      <csr_name> is the name of a CSR from the list of current CSRs.
    • To approve all pending CSRs, run the following command:

      $ oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs --no-run-if-empty oc adm certificate approve
      Copy to Clipboard Toggle word wrap
      Note

      Some Operators might not become available until some CSRs are approved.

  4. Now that your client requests are approved, you must review the server requests for each machine that you added to the cluster:

    $ oc get csr
    Copy to Clipboard Toggle word wrap

    Example output

    NAME        AGE     REQUESTOR                                                                   CONDITION
    csr-bfd72   5m26s   system:node:ip-10-0-50-126.us-east-2.compute.internal                       Pending
    csr-c57lv   5m26s   system:node:ip-10-0-95-157.us-east-2.compute.internal                       Pending
    ...
    Copy to Clipboard Toggle word wrap

  5. If the remaining CSRs are not approved, and are in the Pending status, approve the CSRs for your cluster machines:

    • To approve them individually, run the following command for each valid CSR:

      $ oc adm certificate approve <csr_name> 
      1
      Copy to Clipboard Toggle word wrap
      1
      <csr_name> is the name of a CSR from the list of current CSRs.
    • To approve all pending CSRs, run the following command:

      $ oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs oc adm certificate approve
      Copy to Clipboard Toggle word wrap
  6. After all client and server CSRs have been approved, the machines have the Ready status. Verify this by running the following command:

    $ oc get nodes
    Copy to Clipboard Toggle word wrap

    Example output

    NAME      STATUS    ROLES   AGE  VERSION
    master-0  Ready     master  73m  v1.33.4
    master-1  Ready     master  73m  v1.33.4
    master-2  Ready     master  74m  v1.33.4
    worker-0  Ready     worker  11m  v1.33.4
    worker-1  Ready     worker  11m  v1.33.4
    Copy to Clipboard Toggle word wrap

    Note

    It can take a few minutes after approval of the server CSRs for the machines to transition to the Ready status.

Additional information

To create a cluster with multi-architecture compute machines on IBM Z® and IBM® LinuxONE (s390x) with RHEL KVM, you must have an existing single-architecture x86_64 cluster. You can then add s390x compute machines to your OpenShift Container Platform cluster.

Before you can add s390x nodes to your cluster, you must upgrade your cluster to one that uses the multi-architecture payload. For more information on migrating to the multi-architecture payload, see Migrating to a cluster with multi-architecture compute machines.

The following procedures explain how to create a RHCOS compute machine using a RHEL KVM instance. This will allow you to add s390x nodes to your cluster and deploy a cluster with multi-architecture compute machines.

To create an IBM Z® or IBM® LinuxONE (s390x) cluster with multi-architecture compute machines on x86_64, follow the instructions for Installing a cluster on IBM Z® and IBM® LinuxONE. You can then add x86_64 compute machines as described in Creating a cluster with multi-architecture compute machines on bare metal, IBM Power, or IBM Z.

Note

Before adding a secondary architecture node to your cluster, it is recommended to install the Multiarch Tuning Operator, and deploy a ClusterPodPlacementConfig object. For more information, see Managing workloads on multi-architecture clusters by using the Multiarch Tuning Operator.

3.8.1. Verifying cluster compatibility

Before you can start adding compute nodes of different architectures to your cluster, you must verify that your cluster is multi-architecture compatible.

Prerequisites

  • You installed the OpenShift CLI (oc).

Procedure

  1. Log in to the OpenShift CLI (oc).
  2. You can check that your cluster uses the architecture payload by running the following command:

    $ oc adm release info -o jsonpath="{ .metadata.metadata}"
    Copy to Clipboard Toggle word wrap

Verification

  • If you see the following output, your cluster is using the multi-architecture payload:

    {
     "release.openshift.io/architecture": "multi",
     "url": "https://access.redhat.com/errata/<errata_version>"
    }
    Copy to Clipboard Toggle word wrap

    You can then begin adding multi-arch compute nodes to your cluster.

  • If you see the following output, your cluster is not using the multi-architecture payload:

    {
     "url": "https://access.redhat.com/errata/<errata_version>"
    }
    Copy to Clipboard Toggle word wrap
    Important

    To migrate your cluster so the cluster supports multi-architecture compute machines, follow the procedure in Migrating to a cluster with multi-architecture compute machines.

3.8.2. Creating RHCOS machines using virt-install

You can create more Red Hat Enterprise Linux CoreOS (RHCOS) compute machines for your cluster by using virt-install.

Prerequisites

  • You have at least one LPAR running on RHEL 8.7 or later with KVM, referred to as RHEL KVM host in this procedure.
  • The KVM/QEMU hypervisor is installed on the RHEL KVM host.
  • You have a domain name server (DNS) that can perform hostname and reverse lookup for the nodes.
  • An HTTP or HTTPS server is set up.

Procedure

  1. Extract the Ignition config file from the cluster by running the following command:

    $ oc extract -n openshift-machine-api secret/worker-user-data-managed --keys=userData --to=- > worker.ign
    Copy to Clipboard Toggle word wrap
  2. Upload the worker.ign Ignition config file you exported from your cluster to your HTTP server. Note the URL of this file.
  3. You can validate that the Ignition file is available on the URL. The following example gets the Ignition config file for the compute node:

    $ curl -k http://<HTTP_server>/worker.ign
    Copy to Clipboard Toggle word wrap
  4. Download the RHEL live kernel, initramfs, and rootfs files by running the following commands:

     $ curl -LO $(oc -n openshift-machine-config-operator get configmap/coreos-bootimages -o jsonpath='{.data.stream}' \
    | jq -r '.architectures.s390x.artifacts.metal.formats.pxe.kernel.location')
    Copy to Clipboard Toggle word wrap
    $ curl -LO $(oc -n openshift-machine-config-operator get configmap/coreos-bootimages -o jsonpath='{.data.stream}' \
    | jq -r '.architectures.s390x.artifacts.metal.formats.pxe.initramfs.location')
    Copy to Clipboard Toggle word wrap
    $ curl -LO $(oc -n openshift-machine-config-operator get configmap/coreos-bootimages -o jsonpath='{.data.stream}' \
    | jq -r '.architectures.s390x.artifacts.metal.formats.pxe.rootfs.location')
    Copy to Clipboard Toggle word wrap
  5. Move the downloaded RHEL live kernel, initramfs, and rootfs files to an HTTP or HTTPS server before you launch virt-install.
  6. Create the new KVM guest nodes using the RHEL kernel, initramfs, and Ignition files; the new disk image; and adjusted parm line arguments.

    $ virt-install \
       --connect qemu:///system \
       --name <vm_name> \
       --autostart \
       --os-variant rhel9.4 \ 
    1
    
       --cpu host \
       --vcpus <vcpus> \
       --memory <memory_mb> \
       --disk <vm_name>.qcow2,size=<image_size> \
       --network network=<virt_network_parm> \
       --location <media_location>,kernel=<rhcos_kernel>,initrd=<rhcos_initrd> \ 
    2
    
       --extra-args "rd.neednet=1" \
       --extra-args "coreos.inst.install_dev=/dev/vda" \
       --extra-args "coreos.inst.ignition_url=http://<http_server>/worker.ign " \ 
    3
    
       --extra-args "coreos.live.rootfs_url=http://<http_server>/rhcos-<version>-live-rootfs.<architecture>.img" \ 
    4
    
       --extra-args "ip=<ip>::<gateway>:<netmask>:<hostname>::none" \ 
    5
    
       --extra-args "nameserver=<dns>" \
       --extra-args "console=ttysclp0" \
       --noautoconsole \
       --wait
    Copy to Clipboard Toggle word wrap
    1
    For os-variant, specify the RHEL version for the RHCOS compute machine. rhel9.4 is the recommended version. To query the supported RHEL version of your operating system, run the following command:
    $ osinfo-query os -f short-id
    Copy to Clipboard Toggle word wrap
    Note

    The os-variant is case sensitive.

    2
    For --location, specify the location of the kernel/initrd on the HTTP or HTTPS server.
    3
    Specify the location of the worker.ign config file. Only HTTP and HTTPS protocols are supported.
    4
    Specify the location of the rootfs artifact for the kernel and initramfs you are booting. Only HTTP and HTTPS protocols are supported
    5
    Optional: For hostname, specify the fully qualified hostname of the client machine.
    Note

    If you are using HAProxy as a load balancer, update your HAProxy rules for ingress-router-443 and ingress-router-80 in the /etc/haproxy/haproxy.cfg configuration file.

  7. Continue to create more compute machines for your cluster.

When you add machines to a cluster, two pending certificate signing requests (CSRs) are generated for each machine that you added. You must confirm that these CSRs are approved or, if necessary, approve them yourself. The client requests must be approved first, followed by the server requests.

Prerequisites

  • You added machines to your cluster.

Procedure

  1. Confirm that the cluster recognizes the machines:

    $ oc get nodes
    Copy to Clipboard Toggle word wrap

    Example output

    NAME      STATUS    ROLES   AGE  VERSION
    master-0  Ready     master  63m  v1.33.4
    master-1  Ready     master  63m  v1.33.4
    master-2  Ready     master  64m  v1.33.4
    Copy to Clipboard Toggle word wrap

    The output lists all of the machines that you created.

    Note

    The preceding output might not include the compute nodes, also known as worker nodes, until some CSRs are approved.

  2. Review the pending CSRs and ensure that you see the client requests with the Pending or Approved status for each machine that you added to the cluster:

    $ oc get csr
    Copy to Clipboard Toggle word wrap

    Example output

    NAME        AGE     REQUESTOR                                                                   CONDITION
    csr-8b2br   15m     system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Pending
    csr-8vnps   15m     system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Pending
    ...
    Copy to Clipboard Toggle word wrap

    In this example, two machines are joining the cluster. You might see more approved CSRs in the list.

  3. If the CSRs were not approved, after all of the pending CSRs for the machines you added are in Pending status, approve the CSRs for your cluster machines:

    Note

    Because the CSRs rotate automatically, approve your CSRs within an hour of adding the machines to the cluster. If you do not approve them within an hour, the certificates will rotate, and more than two certificates will be present for each node. You must approve all of these certificates. After the client CSR is approved, the Kubelet creates a secondary CSR for the serving certificate, which requires manual approval. Then, subsequent serving certificate renewal requests are automatically approved by the machine-approver if the Kubelet requests a new certificate with identical parameters.

    Note

    For clusters running on platforms that are not machine API enabled, such as bare metal and other user-provisioned infrastructure, you must implement a method of automatically approving the kubelet serving certificate requests (CSRs). If a request is not approved, then the oc exec, oc rsh, and oc logs commands cannot succeed, because a serving certificate is required when the API server connects to the kubelet. Any operation that contacts the Kubelet endpoint requires this certificate approval to be in place. The method must watch for new CSRs, confirm that the CSR was submitted by the node-bootstrapper service account in the system:node or system:admin groups, and confirm the identity of the node.

    • To approve them individually, run the following command for each valid CSR:

      $ oc adm certificate approve <csr_name> 
      1
      Copy to Clipboard Toggle word wrap
      1
      <csr_name> is the name of a CSR from the list of current CSRs.
    • To approve all pending CSRs, run the following command:

      $ oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs --no-run-if-empty oc adm certificate approve
      Copy to Clipboard Toggle word wrap
      Note

      Some Operators might not become available until some CSRs are approved.

  4. Now that your client requests are approved, you must review the server requests for each machine that you added to the cluster:

    $ oc get csr
    Copy to Clipboard Toggle word wrap

    Example output

    NAME        AGE     REQUESTOR                                                                   CONDITION
    csr-bfd72   5m26s   system:node:ip-10-0-50-126.us-east-2.compute.internal                       Pending
    csr-c57lv   5m26s   system:node:ip-10-0-95-157.us-east-2.compute.internal                       Pending
    ...
    Copy to Clipboard Toggle word wrap

  5. If the remaining CSRs are not approved, and are in the Pending status, approve the CSRs for your cluster machines:

    • To approve them individually, run the following command for each valid CSR:

      $ oc adm certificate approve <csr_name> 
      1
      Copy to Clipboard Toggle word wrap
      1
      <csr_name> is the name of a CSR from the list of current CSRs.
    • To approve all pending CSRs, run the following command:

      $ oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs oc adm certificate approve
      Copy to Clipboard Toggle word wrap
  6. After all client and server CSRs have been approved, the machines have the Ready status. Verify this by running the following command:

    $ oc get nodes
    Copy to Clipboard Toggle word wrap

    Example output

    NAME      STATUS    ROLES   AGE  VERSION
    master-0  Ready     master  73m  v1.33.4
    master-1  Ready     master  73m  v1.33.4
    master-2  Ready     master  74m  v1.33.4
    worker-0  Ready     worker  11m  v1.33.4
    worker-1  Ready     worker  11m  v1.33.4
    Copy to Clipboard Toggle word wrap

    Note

    It can take a few minutes after approval of the server CSRs for the machines to transition to the Ready status.

Additional information

To create a cluster with multi-architecture compute machines on IBM Power® (ppc64le), you must have an existing single-architecture (x86_64) cluster. You can then add ppc64le compute machines to your OpenShift Container Platform cluster.

Important

Before you can add ppc64le nodes to your cluster, you must upgrade your cluster to one that uses the multi-architecture payload. For more information on migrating to the multi-architecture payload, see Migrating to a cluster with multi-architecture compute machines.

The following procedures explain how to create a RHCOS compute machine using an ISO image or network PXE booting. This will allow you to add ppc64le nodes to your cluster and deploy a cluster with multi-architecture compute machines.

To create an IBM Power® (ppc64le) cluster with multi-architecture compute machines on x86_64, follow the instructions for Installing a cluster on IBM Power®. You can then add x86_64 compute machines as described in Creating a cluster with multi-architecture compute machines on bare metal, IBM Power, or IBM Z.

Note

Before adding a secondary architecture node to your cluster, it is recommended to install the Multiarch Tuning Operator, and deploy a ClusterPodPlacementConfig object. For more information, see Managing workloads on multi-architecture clusters by using the Multiarch Tuning Operator.

3.9.1. Verifying cluster compatibility

Before you can start adding compute nodes of different architectures to your cluster, you must verify that your cluster is multi-architecture compatible.

Prerequisites

  • You installed the OpenShift CLI (oc).
Note

When using multiple architectures, hosts for OpenShift Container Platform nodes must share the same storage layer. If they do not have the same storage layer, use a storage provider such as nfs-provisioner.

Note

You should limit the number of network hops between the compute and control plane as much as possible.

Procedure

  1. Log in to the OpenShift CLI (oc).
  2. You can check that your cluster uses the architecture payload by running the following command:

    $ oc adm release info -o jsonpath="{ .metadata.metadata}"
    Copy to Clipboard Toggle word wrap

Verification

  • If you see the following output, your cluster is using the multi-architecture payload:

    {
     "release.openshift.io/architecture": "multi",
     "url": "https://access.redhat.com/errata/<errata_version>"
    }
    Copy to Clipboard Toggle word wrap

    You can then begin adding multi-arch compute nodes to your cluster.

  • If you see the following output, your cluster is not using the multi-architecture payload:

    {
     "url": "https://access.redhat.com/errata/<errata_version>"
    }
    Copy to Clipboard Toggle word wrap
    Important

    To migrate your cluster so the cluster supports multi-architecture compute machines, follow the procedure in Migrating to a cluster with multi-architecture compute machines.

3.9.2. Creating RHCOS machines using an ISO image

You can create more Red Hat Enterprise Linux CoreOS (RHCOS) compute machines for your cluster by using an ISO image to create the machines.

Prerequisites

  • Obtain the URL of the Ignition config file for the compute machines for your cluster. You uploaded this file to your HTTP server during installation.
  • You must have the OpenShift CLI (oc) installed.

Procedure

  1. Extract the Ignition config file from the cluster by running the following command:

    $ oc extract -n openshift-machine-api secret/worker-user-data-managed --keys=userData --to=- > worker.ign
    Copy to Clipboard Toggle word wrap
  2. Upload the worker.ign Ignition config file you exported from your cluster to your HTTP server. Note the URLs of these files.
  3. You can validate that the ignition files are available on the URLs. The following example gets the Ignition config files for the compute node:

    $ curl -k http://<HTTP_server>/worker.ign
    Copy to Clipboard Toggle word wrap
  4. You can access the ISO image for booting your new machine by running to following command:

    RHCOS_VHD_ORIGIN_URL=$(oc -n openshift-machine-config-operator get configmap/coreos-bootimages -o jsonpath='{.data.stream}' | jq -r '.architectures.<architecture>.artifacts.metal.formats.iso.disk.location')
    Copy to Clipboard Toggle word wrap
  5. Use the ISO file to install RHCOS on more compute machines. Use the same method that you used when you created machines before you installed the cluster:

    • Burn the ISO image to a disk and boot it directly.
    • Use ISO redirection with a LOM interface.
  6. Boot the RHCOS ISO image without specifying any options, or interrupting the live boot sequence. Wait for the installer to boot into a shell prompt in the RHCOS live environment.

    Note

    You can interrupt the RHCOS installation boot process to add kernel arguments. However, for this ISO procedure you must use the coreos-installer command as outlined in the following steps, instead of adding kernel arguments.

  7. Run the coreos-installer command and specify the options that meet your installation requirements. At a minimum, you must specify the URL that points to the Ignition config file for the node type, and the device that you are installing to:

    $ sudo coreos-installer install --ignition-url=http://<HTTP_server>/<node_type>.ign <device> --ignition-hash=sha512-<digest> 
    1
    2
    Copy to Clipboard Toggle word wrap
    1
    You must run the coreos-installer command by using sudo, because the core user does not have the required root privileges to perform the installation.
    2
    The --ignition-hash option is required when the Ignition config file is obtained through an HTTP URL to validate the authenticity of the Ignition config file on the cluster node. <digest> is the Ignition config file SHA512 digest obtained in a preceding step.
    Note

    If you want to provide your Ignition config files through an HTTPS server that uses TLS, you can add the internal certificate authority (CA) to the system trust store before running coreos-installer.

    The following example initializes a bootstrap node installation to the /dev/sda device. The Ignition config file for the bootstrap node is obtained from an HTTP web server with the IP address 192.168.1.2:

    $ sudo coreos-installer install --ignition-url=http://192.168.1.2:80/installation_directory/bootstrap.ign /dev/sda --ignition-hash=sha512-a5a2d43879223273c9b60af66b44202a1d1248fc01cf156c46d4a79f552b6bad47bc8cc78ddf0116e80c59d2ea9e32ba53bc807afbca581aa059311def2c3e3b
    Copy to Clipboard Toggle word wrap
  8. Monitor the progress of the RHCOS installation on the console of the machine.

    Important

    Ensure that the installation is successful on each node before commencing with the OpenShift Container Platform installation. Observing the installation process can also help to determine the cause of RHCOS installation issues that might arise.

  9. Continue to create more compute machines for your cluster.

You can create more Red Hat Enterprise Linux CoreOS (RHCOS) compute machines for your bare metal cluster by using PXE or iPXE booting.

Prerequisites

  • Obtain the URL of the Ignition config file for the compute machines for your cluster. You uploaded this file to your HTTP server during installation.
  • Obtain the URLs of the RHCOS ISO image, compressed metal BIOS, kernel, and initramfs files that you uploaded to your HTTP server during cluster installation.
  • You have access to the PXE booting infrastructure that you used to create the machines for your OpenShift Container Platform cluster during installation. The machines must boot from their local disks after RHCOS is installed on them.
  • If you use UEFI, you have access to the grub.conf file that you modified during OpenShift Container Platform installation.

Procedure

  1. Confirm that your PXE or iPXE installation for the RHCOS images is correct.

    • For PXE:

      DEFAULT pxeboot
      TIMEOUT 20
      PROMPT 0
      LABEL pxeboot
          KERNEL http://<HTTP_server>/rhcos-<version>-live-kernel-<architecture> 
      1
      
          APPEND initrd=http://<HTTP_server>/rhcos-<version>-live-initramfs.<architecture>.img coreos.inst.install_dev=/dev/sda coreos.inst.ignition_url=http://<HTTP_server>/worker.ign coreos.live.rootfs_url=http://<HTTP_server>/rhcos-<version>-live-rootfs.<architecture>.img 
      2
      Copy to Clipboard Toggle word wrap
      1
      Specify the location of the live kernel file that you uploaded to your HTTP server.
      2
      Specify locations of the RHCOS files that you uploaded to your HTTP server. The initrd parameter value is the location of the live initramfs file, the coreos.inst.ignition_url parameter value is the location of the worker Ignition config file, and the coreos.live.rootfs_url parameter value is the location of the live rootfs file. The coreos.inst.ignition_url and coreos.live.rootfs_url parameters only support HTTP and HTTPS.
      Note

      This configuration does not enable serial console access on machines with a graphical console. To configure a different console, add one or more console= arguments to the APPEND line. For example, add console=tty0 console=ttyS0 to set the first PC serial port as the primary console and the graphical console as a secondary console. For more information, see How does one set up a serial terminal and/or console in Red Hat Enterprise Linux?.

    • For iPXE (x86_64 + ppc64le):

      kernel http://<HTTP_server>/rhcos-<version>-live-kernel-<architecture> initrd=main coreos.live.rootfs_url=http://<HTTP_server>/rhcos-<version>-live-rootfs.<architecture>.img coreos.inst.install_dev=/dev/sda coreos.inst.ignition_url=http://<HTTP_server>/worker.ign 
      1
       
      2
      
      initrd --name main http://<HTTP_server>/rhcos-<version>-live-initramfs.<architecture>.img 
      3
      
      boot
      Copy to Clipboard Toggle word wrap
      1
      Specify the locations of the RHCOS files that you uploaded to your HTTP server. The kernel parameter value is the location of the kernel file, the initrd=main argument is needed for booting on UEFI systems, the coreos.live.rootfs_url parameter value is the location of the rootfs file, and the coreos.inst.ignition_url parameter value is the location of the worker Ignition config file.
      2
      If you use multiple NICs, specify a single interface in the ip option. For example, to use DHCP on a NIC that is named eno1, set ip=eno1:dhcp.
      3
      Specify the location of the initramfs file that you uploaded to your HTTP server.
      Note

      This configuration does not enable serial console access on machines with a graphical console To configure a different console, add one or more console= arguments to the kernel line. For example, add console=tty0 console=ttyS0 to set the first PC serial port as the primary console and the graphical console as a secondary console. For more information, see How does one set up a serial terminal and/or console in Red Hat Enterprise Linux? and "Enabling the serial console for PXE and ISO installation" in the "Advanced RHCOS installation configuration" section.

      Note

      To network boot the CoreOS kernel on ppc64le architecture, you need to use a version of iPXE build with the IMAGE_GZIP option enabled. See IMAGE_GZIP option in iPXE.

    • For PXE (with UEFI and GRUB as second stage) on ppc64le:

      menuentry 'Install CoreOS' {
          linux rhcos-<version>-live-kernel-<architecture>  coreos.live.rootfs_url=http://<HTTP_server>/rhcos-<version>-live-rootfs.<architecture>.img coreos.inst.install_dev=/dev/sda coreos.inst.ignition_url=http://<HTTP_server>/worker.ign 
      1
       
      2
      
          initrd rhcos-<version>-live-initramfs.<architecture>.img 
      3
      
      }
      Copy to Clipboard Toggle word wrap
      1
      Specify the locations of the RHCOS files that you uploaded to your HTTP/TFTP server. The kernel parameter value is the location of the kernel file on your TFTP server. The coreos.live.rootfs_url parameter value is the location of the rootfs file, and the coreos.inst.ignition_url parameter value is the location of the worker Ignition config file on your HTTP Server.
      2
      If you use multiple NICs, specify a single interface in the ip option. For example, to use DHCP on a NIC that is named eno1, set ip=eno1:dhcp.
      3
      Specify the location of the initramfs file that you uploaded to your TFTP server.
  2. Use the PXE or iPXE infrastructure to create the required compute machines for your cluster.

When you add machines to a cluster, two pending certificate signing requests (CSRs) are generated for each machine that you added. You must confirm that these CSRs are approved or, if necessary, approve them yourself. The client requests must be approved first, followed by the server requests.

Prerequisites

  • You added machines to your cluster.

Procedure

  1. Confirm that the cluster recognizes the machines:

    $ oc get nodes
    Copy to Clipboard Toggle word wrap

    Example output

    NAME      STATUS    ROLES   AGE  VERSION
    master-0  Ready     master  63m  v1.33.4
    master-1  Ready     master  63m  v1.33.4
    master-2  Ready     master  64m  v1.33.4
    Copy to Clipboard Toggle word wrap

    The output lists all of the machines that you created.

    Note

    The preceding output might not include the compute nodes, also known as worker nodes, until some CSRs are approved.

  2. Review the pending CSRs and ensure that you see the client requests with the Pending or Approved status for each machine that you added to the cluster:

    $ oc get csr
    Copy to Clipboard Toggle word wrap

    Example output

    NAME        AGE     REQUESTOR                                                                   CONDITION
    csr-8b2br   15m     system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Pending
    csr-8vnps   15m     system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Pending
    ...
    Copy to Clipboard Toggle word wrap

    In this example, two machines are joining the cluster. You might see more approved CSRs in the list.

  3. If the CSRs were not approved, after all of the pending CSRs for the machines you added are in Pending status, approve the CSRs for your cluster machines:

    Note

    Because the CSRs rotate automatically, approve your CSRs within an hour of adding the machines to the cluster. If you do not approve them within an hour, the certificates will rotate, and more than two certificates will be present for each node. You must approve all of these certificates. After the client CSR is approved, the Kubelet creates a secondary CSR for the serving certificate, which requires manual approval. Then, subsequent serving certificate renewal requests are automatically approved by the machine-approver if the Kubelet requests a new certificate with identical parameters.

    Note

    For clusters running on platforms that are not machine API enabled, such as bare metal and other user-provisioned infrastructure, you must implement a method of automatically approving the kubelet serving certificate requests (CSRs). If a request is not approved, then the oc exec, oc rsh, and oc logs commands cannot succeed, because a serving certificate is required when the API server connects to the kubelet. Any operation that contacts the Kubelet endpoint requires this certificate approval to be in place. The method must watch for new CSRs, confirm that the CSR was submitted by the node-bootstrapper service account in the system:node or system:admin groups, and confirm the identity of the node.

    • To approve them individually, run the following command for each valid CSR:

      $ oc adm certificate approve <csr_name> 
      1
      Copy to Clipboard Toggle word wrap
      1
      <csr_name> is the name of a CSR from the list of current CSRs.
    • To approve all pending CSRs, run the following command:

      $ oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs --no-run-if-empty oc adm certificate approve
      Copy to Clipboard Toggle word wrap
      Note

      Some Operators might not become available until some CSRs are approved.

  4. Now that your client requests are approved, you must review the server requests for each machine that you added to the cluster:

    $ oc get csr
    Copy to Clipboard Toggle word wrap

    Example output

    NAME        AGE     REQUESTOR                                                                   CONDITION
    csr-bfd72   5m26s   system:node:ip-10-0-50-126.us-east-2.compute.internal                       Pending
    csr-c57lv   5m26s   system:node:ip-10-0-95-157.us-east-2.compute.internal                       Pending
    ...
    Copy to Clipboard Toggle word wrap

  5. If the remaining CSRs are not approved, and are in the Pending status, approve the CSRs for your cluster machines:

    • To approve them individually, run the following command for each valid CSR:

      $ oc adm certificate approve <csr_name> 
      1
      Copy to Clipboard Toggle word wrap
      1
      <csr_name> is the name of a CSR from the list of current CSRs.
    • To approve all pending CSRs, run the following command:

      $ oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs oc adm certificate approve
      Copy to Clipboard Toggle word wrap
  6. After all client and server CSRs have been approved, the machines have the Ready status. Verify this by running the following command:

    $ oc get nodes -o wide
    Copy to Clipboard Toggle word wrap

    Example output

    NAME               STATUS   ROLES                  AGE   VERSION   INTERNAL-IP      EXTERNAL-IP   OS-IMAGE                                                       KERNEL-VERSION                  CONTAINER-RUNTIME
    worker-0-ppc64le   Ready    worker                 42d   v1.33.4   192.168.200.21   <none>        Red Hat Enterprise Linux CoreOS 415.92.202309261919-0 (Plow)   5.14.0-284.34.1.el9_2.ppc64le   cri-o://1.33.4-3.rhaos4.15.gitb36169e.el9
    worker-1-ppc64le   Ready    worker                 42d   v1.33.4   192.168.200.20   <none>        Red Hat Enterprise Linux CoreOS 415.92.202309261919-0 (Plow)   5.14.0-284.34.1.el9_2.ppc64le   cri-o://1.33.4-3.rhaos4.15.gitb36169e.el9
    master-0-x86       Ready    control-plane,master   75d   v1.33.4   10.248.0.38      10.248.0.38   Red Hat Enterprise Linux CoreOS 415.92.202309261919-0 (Plow)   5.14.0-284.34.1.el9_2.x86_64    cri-o://1.33.4-3.rhaos4.15.gitb36169e.el9
    master-1-x86       Ready    control-plane,master   75d   v1.33.4   10.248.0.39      10.248.0.39   Red Hat Enterprise Linux CoreOS 415.92.202309261919-0 (Plow)   5.14.0-284.34.1.el9_2.x86_64    cri-o://1.33.4-3.rhaos4.15.gitb36169e.el9
    master-2-x86       Ready    control-plane,master   75d   v1.33.4   10.248.0.40      10.248.0.40   Red Hat Enterprise Linux CoreOS 415.92.202309261919-0 (Plow)   5.14.0-284.34.1.el9_2.x86_64    cri-o://1.33.4-3.rhaos4.15.gitb36169e.el9
    worker-0-x86       Ready    worker                 75d   v1.33.4   10.248.0.43      10.248.0.43   Red Hat Enterprise Linux CoreOS 415.92.202309261919-0 (Plow)   5.14.0-284.34.1.el9_2.x86_64    cri-o://1.33.4-3.rhaos4.15.gitb36169e.el9
    worker-1-x86       Ready    worker                 75d   v1.33.4   10.248.0.44      10.248.0.44   Red Hat Enterprise Linux CoreOS 415.92.202309261919-0 (Plow)   5.14.0-284.34.1.el9_2.x86_64    cri-o://1.33.4-3.rhaos4.15.gitb36169e.el9
    Copy to Clipboard Toggle word wrap

    Note

    It can take a few minutes after approval of the server CSRs for the machines to transition to the Ready status.

Additional information

Managing a cluster that has nodes with multiple architectures requires you to consider node architecture as you monitor the cluster and manage your workloads. This requires you to take additional considerations into account when you configure cluster resource requirements and behavior, or schedule workloads in a multi-architecture cluster.

When you deploy workloads on a cluster with compute nodes that use different architectures, you must align pod architecture with the architecture of the underlying node. Your workload may also require additional configuration to particular resources depending on the underlying node architecture.

You can use the Multiarch Tuning Operator to enable architecture-aware scheduling of workloads on clusters with multi-architecture compute machines. The Multiarch Tuning Operator implements additional scheduler predicates in the pods specifications based on the architectures that the pods can support at creation time.

Scheduling a workload to an appropriate node based on architecture works in the same way as scheduling based on any other node characteristic. Consider the following options when determining how to schedule your workloads.

Using nodeAffinity to schedule nodes with specific architectures

You can allow a workload to be scheduled on only a set of nodes with architectures supported by its images, you can set the spec.affinity.nodeAffinity field in your pod’s template specification.

Example deployment with node affinity set

apiVersion: apps/v1
kind: Deployment
metadata: # ...
spec:
   # ...
  template:
     # ...
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.io/arch
                operator: In
                values: 
1

                - amd64
                - arm64
Copy to Clipboard Toggle word wrap
1
Specify the supported architectures. Valid values include amd64,arm64, or both values.

Tainting each node for a specific architecture

You can taint a node to avoid the node scheduling workloads that are incompatible with its architecture. When your cluster uses a MachineSet object, you can add parameters to the .spec.template.spec.taints field to avoid workloads being scheduled on nodes with non-supported architectures.

Before you add a taint to a node, you must scale down the MachineSet object or remove existing available machines. For more information, see Modifying a compute machine set.

Example machine set with taint set

apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
metadata: # ...
spec:
  # ...
  template:
    # ...
    spec:
      # ...
      taints:
      - effect: NoSchedule
        key: multiarch.openshift.io/arch
        value: arm64
Copy to Clipboard Toggle word wrap

You can also set a taint on a specific node by running the following command:

$ oc adm taint nodes <node-name> multiarch.openshift.io/arch=arm64:NoSchedule
Copy to Clipboard Toggle word wrap
Creating a default toleration in a namespace

When a node or machine set has a taint, only workloads that tolerate that taint can be scheduled. You can annotate a namespace so all of the workloads get the same default toleration by running the following command:

Example default toleration set on a namespace

$ oc annotate namespace my-namespace \
  'scheduler.alpha.kubernetes.io/defaultTolerations'='[{"operator": "Exists", "effect": "NoSchedule", "key": "multiarch.openshift.io/arch"}]'
Copy to Clipboard Toggle word wrap

Tolerating architecture taints in workloads

When a node or machine set has a taint, only workloads that tolerate that taint can be scheduled. You can configure your workload with a toleration so that it is scheduled on nodes with specific architecture taints.

Example deployment with toleration set

apiVersion: apps/v1
kind: Deployment
metadata: # ...
spec:
  # ...
  template:
    # ...
    spec:
      tolerations:
      - key: "multiarch.openshift.io/arch"
        value: "arm64"
        operator: "Equal"
        effect: "NoSchedule"
Copy to Clipboard Toggle word wrap

This example deployment can be scheduled on nodes and machine sets that have the multiarch.openshift.io/arch=arm64 taint specified.

Using node affinity with taints and tolerations

When a scheduler computes the set of nodes to schedule a pod, tolerations can broaden the set while node affinity restricts the set. If you set a taint on nodes that have a specific architecture, you must also add a toleration to workloads that you want to be scheduled there.

Example deployment with node affinity and toleration set

apiVersion: apps/v1
kind: Deployment
metadata: # ...
spec:
  # ...
  template:
    # ...
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.io/arch
                operator: In
                values:
                - amd64
                - arm64
      tolerations:
      - key: "multiarch.openshift.io/arch"
        value: "arm64"
        operator: "Equal"
        effect: "NoSchedule"
Copy to Clipboard Toggle word wrap

You can enable the 64k memory page in the Red Hat Enterprise Linux CoreOS (RHCOS) kernel on the 64-bit ARM compute machines in your cluster. The 64k page size kernel specification can be used for large GPU or high memory workloads. This is done using the Machine Config Operator (MCO) which uses a machine config pool to update the kernel. To enable 64k page sizes, you must dedicate a machine config pool for ARM64 to enable on the kernel.

Important

Using 64k pages is exclusive to 64-bit ARM architecture compute nodes or clusters installed on 64-bit ARM machines. If you configure the 64k pages kernel on a machine config pool using 64-bit x86 machines, the machine config pool and MCO will degrade.

Prerequisites

  • You installed the OpenShift CLI (oc).
  • You created a cluster with compute nodes of different architecture on one of the supported platforms.

Procedure

  1. Label the nodes where you want to run the 64k page size kernel:

    $ oc label node <node_name> <label>
    Copy to Clipboard Toggle word wrap

    Example command

    $ oc label node worker-arm64-01 node-role.kubernetes.io/worker-64k-pages=
    Copy to Clipboard Toggle word wrap

  2. Create a machine config pool that contains the worker role that uses the ARM64 architecture and the worker-64k-pages role:

    apiVersion: machineconfiguration.openshift.io/v1
    kind: MachineConfigPool
    metadata:
      name: worker-64k-pages
    spec:
      machineConfigSelector:
        matchExpressions:
          - key: machineconfiguration.openshift.io/role
            operator: In
            values:
            - worker
            - worker-64k-pages
      nodeSelector:
        matchLabels:
          node-role.kubernetes.io/worker-64k-pages: ""
          kubernetes.io/arch: arm64
    Copy to Clipboard Toggle word wrap
  3. Create a machine config on your compute node to enable 64k-pages with the 64k-pages parameter.

    $ oc create -f <filename>.yaml
    Copy to Clipboard Toggle word wrap

    Example MachineConfig

    apiVersion: machineconfiguration.openshift.io/v1
    kind: MachineConfig
    metadata:
      labels:
        machineconfiguration.openshift.io/role: "worker-64k-pages" 
    1
    
      name: 99-worker-64kpages
    spec:
      kernelType: 64k-pages 
    2
    Copy to Clipboard Toggle word wrap

    1
    Specify the value of the machineconfiguration.openshift.io/role label in the custom machine config pool. The example MachineConfig uses the worker-64k-pages label to enable 64k pages in the worker-64k-pages pool.
    2
    Specify your desired kernel type. Valid values are 64k-pages and default
    Note

    The 64k-pages type is supported on only 64-bit ARM architecture based compute nodes. The realtime type is supported on only 64-bit x86 architecture based compute nodes.

Verification

  • To view your new worker-64k-pages machine config pool, run the following command:

    $ oc get mcp
    Copy to Clipboard Toggle word wrap

    Example output

    NAME     CONFIG                                                                UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
    master   rendered-master-9d55ac9a91127c36314e1efe7d77fbf8                      True      False      False      3              3                   3                     0                      361d
    worker   rendered-worker-e7b61751c4a5b7ff995d64b967c421ff                      True      False      False      7              7                   7                     0                      361d
    worker-64k-pages  rendered-worker-64k-pages-e7b61751c4a5b7ff995d64b967c421ff   True      False      False      2              2                   2                     0                      35m
    Copy to Clipboard Toggle word wrap

On an OpenShift Container Platform 4.20 cluster with multi-architecture compute machines, the image streams in the cluster do not import manifest lists automatically. You must manually change the default importMode option to the PreserveOriginal option in order to import the manifest list.

Prerequisites

  • You installed the OpenShift Container Platform CLI (oc).

Procedure

  • The following example command shows how to patch the ImageStream cli-artifacts so that the cli-artifacts:latest image stream tag is imported as a manifest list.

    $ oc patch is/cli-artifacts -n openshift -p '{"spec":{"tags":[{"name":"latest","importPolicy":{"importMode":"PreserveOriginal"}}]}}'
    Copy to Clipboard Toggle word wrap

Verification

  • You can check that the manifest lists imported properly by inspecting the image stream tag. The following command will list the individual architecture manifests for a particular tag.

    $ oc get istag cli-artifacts:latest -n openshift -oyaml
    Copy to Clipboard Toggle word wrap

    If the dockerImageManifests object is present, then the manifest list import was successful.

    Example output of the dockerImageManifests object

    dockerImageManifests:
      - architecture: amd64
        digest: sha256:16d4c96c52923a9968fbfa69425ec703aff711f1db822e4e9788bf5d2bee5d77
        manifestSize: 1252
        mediaType: application/vnd.docker.distribution.manifest.v2+json
        os: linux
      - architecture: arm64
        digest: sha256:6ec8ad0d897bcdf727531f7d0b716931728999492709d19d8b09f0d90d57f626
        manifestSize: 1252
        mediaType: application/vnd.docker.distribution.manifest.v2+json
        os: linux
      - architecture: ppc64le
        digest: sha256:65949e3a80349cdc42acd8c5b34cde6ebc3241eae8daaeea458498fedb359a6a
        manifestSize: 1252
        mediaType: application/vnd.docker.distribution.manifest.v2+json
        os: linux
      - architecture: s390x
        digest: sha256:75f4fa21224b5d5d511bea8f92dfa8e1c00231e5c81ab95e83c3013d245d1719
        manifestSize: 1252
        mediaType: application/vnd.docker.distribution.manifest.v2+json
        os: linux
    Copy to Clipboard Toggle word wrap

The Multiarch Tuning Operator optimizes workload management within multi-architecture clusters and in single-architecture clusters transitioning to multi-architecture environments.

Architecture-aware workload scheduling allows the scheduler to place pods onto nodes that match the architecture of the pod images.

By default, the scheduler does not consider the architecture of a pod’s container images when determining the placement of new pods onto nodes.

To enable architecture-aware workload scheduling, you must create the ClusterPodPlacementConfig object. When you create the ClusterPodPlacementConfig object, the Multiarch Tuning Operator deploys the necessary operands to support architecture-aware workload scheduling. You can also use the nodeAffinityScoring plugin in the ClusterPodPlacementConfig object to set cluster-wide scores for node architectures. If you enable the nodeAffinityScoring plugin, the scheduler first filters nodes with compatible architectures and then places the pod on the node with the highest score.

When a pod is created, the operands perform the following actions:

  1. Add the multiarch.openshift.io/scheduling-gate scheduling gate that prevents the scheduling of the pod.
  2. Compute a scheduling predicate that includes the supported architecture values for the kubernetes.io/arch label.
  3. Integrate the scheduling predicate as a nodeAffinity requirement in the pod specification.
  4. Remove the scheduling gate from the pod.
Important

Note the following operand behaviors:

  • If the nodeSelector field is already configured with the kubernetes.io/arch label for a workload, the operand does not update the nodeAffinity field for that workload.
  • If the nodeSelector field is not configured with the kubernetes.io/arch label for a workload, the operand updates the nodeAffinity field for that workload. However, in that nodeAffinity field, the operand updates only the node selector terms that are not configured with the kubernetes.io/arch label.
  • If the nodeName field is already set, the Multiarch Tuning Operator does not process the pod.
  • If the pod is owned by a DaemonSet, the operand does not update the nodeAffinity field.
  • If both nodeSelector or nodeAffinity and preferredAffinity fields are set for the kubernetes.io/arch label, the operand does not update the nodeAffinity field.
  • If only nodeSelector or nodeAffinity field is set for the kubernetes.io/arch label and the nodeAffinityScoring plugin is disabled, the operand does not update the nodeAffinity field.
  • If the nodeAffinity.preferredDuringSchedulingIgnoredDuringExecution field already contains terms that score nodes based on the kubernetes.io/arch label, the operand ignores the configuration in the nodeAffinityScoring plugin.

You can install the Multiarch Tuning Operator by using the OpenShift CLI (oc).

Prerequisites

  • You have installed oc.
  • You have logged in to oc as a user with cluster-admin privileges.

Procedure

  1. Create a new project named openshift-multiarch-tuning-operator by running the following command:

    $ oc create ns openshift-multiarch-tuning-operator
    Copy to Clipboard Toggle word wrap
  2. Create an OperatorGroup object:

    1. Create a YAML file with the configuration for creating an OperatorGroup object.

      Example YAML configuration for creating an OperatorGroup object

      apiVersion: operators.coreos.com/v1
      kind: OperatorGroup
      metadata:
        name: openshift-multiarch-tuning-operator
        namespace: openshift-multiarch-tuning-operator
      spec: {}
      Copy to Clipboard Toggle word wrap

    2. Create the OperatorGroup object by running the following command:

      $ oc create -f <file_name> 
      1
      Copy to Clipboard Toggle word wrap
      1
      Replace <file_name> with the name of the YAML file that contains the OperatorGroup object configuration.
  3. Create a Subscription object:

    1. Create a YAML file with the configuration for creating a Subscription object.

      Example YAML configuration for creating a Subscription object

      apiVersion: operators.coreos.com/v1alpha1
      kind: Subscription
      metadata:
        name: openshift-multiarch-tuning-operator
        namespace: openshift-multiarch-tuning-operator
      spec:
        channel: stable
        name: multiarch-tuning-operator
        source: redhat-operators
        sourceNamespace: openshift-marketplace
        installPlanApproval: Automatic
        startingCSV: multiarch-tuning-operator.<version>
      Copy to Clipboard Toggle word wrap

    2. Create the Subscription object by running the following command:

      $ oc create -f <file_name> 
      1
      Copy to Clipboard Toggle word wrap
      1
      Replace <file_name> with the name of the YAML file that contains the Subscription object configuration.
Note

For more details about configuring the Subscription object and OperatorGroup object, see "Installing from the software catalog by using the CLI".

Verification

  1. To verify that the Multiarch Tuning Operator is installed, run the following command:

    $ oc get csv -n openshift-multiarch-tuning-operator
    Copy to Clipboard Toggle word wrap

    Example output

    NAME                                   DISPLAY                     VERSION       REPLACES                            PHASE
    multiarch-tuning-operator.<version>   Multiarch Tuning Operator   <version>     multiarch-tuning-operator.1.0.0      Succeeded
    Copy to Clipboard Toggle word wrap

    The installation is successful if the Operator is in Succeeded phase.

  2. Optional: To verify that the OperatorGroup object is created, run the following command:

    $ oc get operatorgroup -n openshift-multiarch-tuning-operator
    Copy to Clipboard Toggle word wrap

    Example output

    NAME                                        AGE
    openshift-multiarch-tuning-operator-q8zbb   133m
    Copy to Clipboard Toggle word wrap

  3. Optional: To verify that the Subscription object is created, run the following command:

    $ oc get subscription -n openshift-multiarch-tuning-operator
    Copy to Clipboard Toggle word wrap

    Example output

    NAME                        PACKAGE                     SOURCE                  CHANNEL
    multiarch-tuning-operator   multiarch-tuning-operator   redhat-operators        stable
    Copy to Clipboard Toggle word wrap

You can install the Multiarch Tuning Operator by using the OpenShift Container Platform web console.

Prerequisites

  • You have access to the cluster with cluster-admin privileges.
  • You have access to the OpenShift Container Platform web console.

Procedure

  1. Log in to the OpenShift Container Platform web console.
  2. Navigate to Ecosystem Software Catalog.
  3. Enter Multiarch Tuning Operator in the search field.
  4. Click Multiarch Tuning Operator.
  5. Select the Multiarch Tuning Operator version from the Version list.
  6. Click Install
  7. Set the following options on the Operator Installation page:

    1. Set Update Channel to stable.
    2. Set Installation Mode to All namespaces on the cluster.
    3. Set Installed Namespace to Operator recommended Namespace or Select a Namespace.

      The recommended Operator namespace is openshift-multiarch-tuning-operator. If the openshift-multiarch-tuning-operator namespace does not exist, it is created during the operator installation.

      If you select Select a namespace, you must select a namespace for the Operator from the Select Project list.

    4. Update approval as Automatic or Manual.

      If you select Automatic updates, Operator Lifecycle Manager (OLM) automatically updates the running instance of the Multiarch Tuning Operator without any intervention.

      If you select Manual updates, OLM creates an update request. As a cluster administrator, you must manually approve the update request to update the Multiarch Tuning Operator to a newer version.

  8. Optional: Select the Enable Operator recommended cluster monitoring on this Namespace checkbox.
  9. Click Install.

Verification

  1. Navigate to Ecosystem Installed Operators.
  2. Verify that the Multiarch Tuning Operator is listed with the Status field as Succeeded in the openshift-multiarch-tuning-operator namespace.

After installing the Multiarch Tuning Operator, you can verify the multi-architecture support for workloads in your cluster. You can identify and manage pods based on their architecture compatibility by using the pod labels. These labels are automatically set on the newly created pods to provide insights into their architecture support.

The following table describes the labels that the Multiarch Tuning Operator adds when you create a pod:

Expand
Table 3.2. Pod labels that the Multiarch Tuning Operator adds when you create a pod
LabelDescription

multiarch.openshift.io/multi-arch: ""

The pod supports multiple architectures.

multiarch.openshift.io/single-arch: ""

The pod supports only a single architecture.

multiarch.openshift.io/arm64: ""

The pod supports the arm64 architecture.

multiarch.openshift.io/amd64: ""

The pod supports the amd64 architecture.

multiarch.openshift.io/ppc64le: ""

The pod supports the ppc64le architecture.

multiarch.openshift.io/s390x: ""

The pod supports the s390x architecture.

multirach.openshift.io/node-affinity: set

The Operator has set the node affinity requirement for the architecture.

multirach.openshift.io/node-affinity: not-set

The Operator did not set the node affinity requirement. For example, when the pod already has a node affinity for the architecture, the Multiarch Tuning Operator adds this label to the pod.

multiarch.openshift.io/scheduling-gate: gated

The pod is gated.

multiarch.openshift.io/scheduling-gate: removed

The pod gate has been removed.

multiarch.openshift.io/inspection-error: ""

An error has occurred while building the node affinity requirements.

multiarch.openshift.io/preferred-node-affinity: set

The Operator has set the architecture preferences in the pod.

multiarch.openshift.io/preferred-node-affinity: not-set

The Operator did not set the architecture preferences in the pod because the user had already set them in the preferredDuringSchedulingIgnoredDuringExecution node affinity.

After installing the Multiarch Tuning Operator, you must create a ClusterPodPlacementConfig object. This object instructs the operator to deploy its operand, which enables architecture-aware workload scheduling across your cluster.

The ClusterPodPlacementConfig object supports two optional plugins:

  • The node affinity scoring plugin patches pods to set soft preferences, using weighted affinities, for the architectures specified by the user. Pods are more likely to be scheduled on nodes running architectures with higher weights.
  • The exec format error monitor plugin detects ENOEXEC errors, which occur when a pod attempts to execute a binary incompatible with the node’s architecture. When enabled, this plugin generates events in the affected pod’s event stream. It triggers a ExecFormatErrorsDetected Prometheus alert if one or more ENOEXEC errors are detected within the last six hours. These errors can result from incorrect architecture node selectors, invalid image metadata that affects architecture-aware workload scheduling, an incorrect binary in an image, or an incompatible binary injected at runtime.
Note

You can create only one instance of the ClusterPodPlacementConfig object.

Example ClusterPodPlacementConfig object configuration

apiVersion: multiarch.openshift.io/v1beta1
kind: ClusterPodPlacementConfig
metadata:
  name: cluster 
1

spec:
  logVerbosityLevel: Normal 
2

  namespaceSelector: 
3

    matchExpressions:
      - key: multiarch.openshift.io/exclude-pod-placement
        operator: DoesNotExist
  plugins: 
4

    nodeAffinityScoring: 
5

      enabled: true 
6

      platforms: 
7

        - architecture: amd64 
8

          weight: 100 
9

        - architecture: arm64
          weight: 50
    execFormatErrorMonitor:
      enabled: true 
10
Copy to Clipboard Toggle word wrap

1
You must set this field value to cluster.
2
Optional: You can set the field value to Normal, Debug, Trace, or TraceAll. The value is set to Normal by default.
3
Optional: You can configure the namespaceSelector to select the namespaces in which the Multiarch Tuning Operator’s pod placement operand must process the nodeAffinity of the pods. All namespaces are considered by default.
4
Optional: Includes a list of plugins for architecture-aware workload scheduling.
5
Optional: You can use this plugin to set architecture preferences for pod placement. When enabled, the scheduler first filters out nodes that do not meet the pod’s requirements. Then, it prioritizes the remaining nodes based on the architecture scores defined in the nodeAffinityScoring.platforms field.
6
Optional: Set this field to true to enable the nodeAffinityScoring plugin. The default value is false.
7
Optional: Defines a list of architectures and their corresponding scores.
8
Specify the node architecture to score. The scheduler prioritizes nodes for pod placement based on the architecture scores that you set and the scheduling requirements defined in the pod specification. Accepted values are arm64, amd64, ppc64le, or s390x.
9
Assign a score to the architecture. The value for this field must be configured in the range of 1 (lowest priority) to 100 (highest priority). The scheduler uses this score to prioritize nodes for pod placement, favoring nodes with architectures that have higher scores.
10
Optional: Set this field to true to enable the execFormatErrorMonitor plugin. When enabled, the plugin detects ENOEXEC errors, caused when a pod executes a binary incompatible with the node’s architecture. The plugin generates events in the affected pods, and triggers the ExecFormatErrorsDetected Prometheus alert if one or more errors are found in the last six hours.

In this example, the operator field value is set to DoesNotExist. Therefore, if the key field value (multiarch.openshift.io/exclude-pod-placement) is set as a label in a namespace, the operand does not process the nodeAffinity of the pods in that namespace. Instead, the operand processes the nodeAffinity of the pods in namespaces that do not contain the label.

If you want the operand to process the nodeAffinity of the pods only in specific namespaces, you can configure the namespaceSelector as follows:

namespaceSelector:
  matchExpressions:
    - key: multiarch.openshift.io/include-pod-placement
      operator: Exists
Copy to Clipboard Toggle word wrap

In this example, the operator field value is set to Exists. Therefore, the operand processes the nodeAffinity of the pods only in namespaces that contain the multiarch.openshift.io/include-pod-placement label.

Important

This Operator excludes pods in namespaces starting with kube-. It also excludes pods that are expected to be scheduled on control plane nodes.

To deploy the pod placement operand that enables architecture-aware workload scheduling, you can create the ClusterPodPlacementConfig object by using the OpenShift CLI (oc).

Prerequisites

  • You have installed oc.
  • You have logged in to oc as a user with cluster-admin privileges.
  • You have installed the Multiarch Tuning Operator.

Procedure

  1. Create a ClusterPodPlacementConfig object YAML file:

    Example ClusterPodPlacementConfig object configuration

    apiVersion: multiarch.openshift.io/v1beta1
    kind: ClusterPodPlacementConfig
    metadata:
      name: cluster
    spec:
      logVerbosityLevel: Normal
      namespaceSelector:
        matchExpressions:
          - key: multiarch.openshift.io/exclude-pod-placement
            operator: DoesNotExist
      plugins:
        nodeAffinityScoring:
          enabled: true
          platforms:
            - architecture: amd64
              weight: 100
            - architecture: arm64
              weight: 50
    Copy to Clipboard Toggle word wrap

  2. Create the ClusterPodPlacementConfig object by running the following command:

    $ oc create -f <file_name> 
    1
    Copy to Clipboard Toggle word wrap
    1
    Replace <file_name> with the name of the ClusterPodPlacementConfig object YAML file.

Verification

  • To check that the ClusterPodPlacementConfig object is created, run the following command:

    $ oc get clusterpodplacementconfig
    Copy to Clipboard Toggle word wrap

    Example output

    NAME      AGE
    cluster   29s
    Copy to Clipboard Toggle word wrap

To deploy the pod placement operand that enables architecture-aware workload scheduling, you can create the ClusterPodPlacementConfig object by using the OpenShift Container Platform web console.

Prerequisites

  • You have access to the cluster with cluster-admin privileges.
  • You have access to the OpenShift Container Platform web console.
  • You have installed the Multiarch Tuning Operator.

Procedure

  1. Log in to the OpenShift Container Platform web console.
  2. Navigate to Ecosystem Installed Operators.
  3. On the Installed Operators page, click Multiarch Tuning Operator.
  4. Click the Cluster Pod Placement Config tab.
  5. Select either Form view or YAML view.
  6. Configure the ClusterPodPlacementConfig object parameters.
  7. Click Create.
  8. Optional: If you want to edit the ClusterPodPlacementConfig object, perform the following actions:

    1. Click the Cluster Pod Placement Config tab.
    2. Select Edit ClusterPodPlacementConfig from the options menu.
    3. Click YAML and edit the ClusterPodPlacementConfig object parameters.
    4. Click Save.

Verification

  • On the Cluster Pod Placement Config page, check that the ClusterPodPlacementConfig object is in the Ready state.

You can create only one instance of the ClusterPodPlacementConfig object. If you want to re-create this object, you must first delete the existing instance.

You can delete this object by using the OpenShift CLI (oc).

Prerequisites

  • You have installed oc.
  • You have logged in to oc as a user with cluster-admin privileges.

Procedure

  1. Log in to the OpenShift CLI (oc).
  2. Delete the ClusterPodPlacementConfig object by running the following command:

    $ oc delete clusterpodplacementconfig cluster
    Copy to Clipboard Toggle word wrap

Verification

  • To check that the ClusterPodPlacementConfig object is deleted, run the following command:

    $ oc get clusterpodplacementconfig
    Copy to Clipboard Toggle word wrap

    Example output

    No resources found
    Copy to Clipboard Toggle word wrap

You can create only one instance of the ClusterPodPlacementConfig object. If you want to re-create this object, you must first delete the existing instance.

You can delete this object by using the OpenShift Container Platform web console.

Prerequisites

  • You have access to the cluster with cluster-admin privileges.
  • You have access to the OpenShift Container Platform web console.
  • You have created the ClusterPodPlacementConfig object.

Procedure

  1. Log in to the OpenShift Container Platform web console.
  2. Navigate to Ecosystem Installed Operators.
  3. On the Installed Operators page, click Multiarch Tuning Operator.
  4. Click the Cluster Pod Placement Config tab.
  5. Select Delete ClusterPodPlacementConfig from the options menu.
  6. Click Delete.

Verification

  • On the Cluster Pod Placement Config page, check that the ClusterPodPlacementConfig object has been deleted.

You can uninstall the Multiarch Tuning Operator by using the OpenShift CLI (oc).

Prerequisites

  • You have installed oc.
  • You have logged in to oc as a user with cluster-admin privileges.
  • You deleted the ClusterPodPlacementConfig object.

    Important

    You must delete the ClusterPodPlacementConfig object before uninstalling the Multiarch Tuning Operator. Uninstalling the Operator without deleting the ClusterPodPlacementConfig object can lead to unexpected behavior.

Procedure

  1. Get the Subscription object name for the Multiarch Tuning Operator by running the following command:

    $ oc get subscription.operators.coreos.com -n <namespace> 
    1
    Copy to Clipboard Toggle word wrap
    1
    Replace <namespace> with the name of the namespace where you want to uninstall the Multiarch Tuning Operator.

    Example output

    NAME                                  PACKAGE                     SOURCE             CHANNEL
    openshift-multiarch-tuning-operator   multiarch-tuning-operator   redhat-operators   stable
    Copy to Clipboard Toggle word wrap

  2. Get the currentCSV value for the Multiarch Tuning Operator by running the following command:

    $ oc get subscription.operators.coreos.com <subscription_name> -n <namespace> -o yaml | grep currentCSV 
    1
    Copy to Clipboard Toggle word wrap
    1
    Replace <subscription_name> with the Subscription object name. For example: openshift-multiarch-tuning-operator. Replace <namespace> with the name of the namespace where you want to uninstall the Multiarch Tuning Operator.

    Example output

    currentCSV: multiarch-tuning-operator.<version>
    Copy to Clipboard Toggle word wrap

  3. Delete the Subscription object by running the following command:

    $ oc delete subscription.operators.coreos.com <subscription_name> -n <namespace> 
    1
    Copy to Clipboard Toggle word wrap
    1
    Replace <subscription_name> with the Subscription object name. Replace <namespace> with the name of the namespace where you want to uninstall the Multiarch Tuning Operator.

    Example output

    subscription.operators.coreos.com "openshift-multiarch-tuning-operator" deleted
    Copy to Clipboard Toggle word wrap

  4. Delete the CSV for the Multiarch Tuning Operator in the target namespace using the currentCSV value by running the following command:

    $ oc delete clusterserviceversion <currentCSV_value> -n <namespace> 
    1
    Copy to Clipboard Toggle word wrap
    1
    Replace <currentCSV> with the currentCSV value for the Multiarch Tuning Operator. For example: multiarch-tuning-operator.<version>. Replace <namespace> with the name of the namespace where you want to uninstall the Multiarch Tuning Operator.

    Example output

    clusterserviceversion.operators.coreos.com "multiarch-tuning-operator.<version>" deleted
    Copy to Clipboard Toggle word wrap

Verification

  • To verify that the Multiarch Tuning Operator is uninstalled, run the following command:

    $ oc get csv -n <namespace> 
    1
    Copy to Clipboard Toggle word wrap
    1
    Replace <namespace> with the name of the namespace where you have uninstalled the Multiarch Tuning Operator.

    Example output

    No resources found in openshift-multiarch-tuning-operator namespace.
    Copy to Clipboard Toggle word wrap

You can uninstall the Multiarch Tuning Operator by using the OpenShift Container Platform web console.

Prerequisites

  • You have access to the cluster with cluster-admin permissions.
  • You deleted the ClusterPodPlacementConfig object.

    Important

    You must delete the ClusterPodPlacementConfig object before uninstalling the Multiarch Tuning Operator. Uninstalling the Operator without deleting the ClusterPodPlacementConfig object can lead to unexpected behavior.

Procedure

  1. Log in to the OpenShift Container Platform web console.
  2. Navigate to Ecosystem Software Catalog.
  3. Enter Multiarch Tuning Operator in the search field.
  4. Click Multiarch Tuning Operator.
  5. Click the Details tab.
  6. From the Actions menu, select Uninstall Operator.
  7. When prompted, click Uninstall.

Verification

  1. Navigate to Ecosystem Installed Operators.
  2. On the Installed Operators page, verify that the Multiarch Tuning Operator is not listed.

3.12. Multiarch Tuning Operator release notes

The Multiarch Tuning Operator optimizes workload management within multi-architecture clusters and in single-architecture clusters transitioning to multi-architecture environments.

These release notes track the development of the Multiarch Tuning Operator.

For more information, see Managing workloads on multi-architecture clusters by using the Multiarch Tuning Operator.

Issued: 22 October 2025

3.12.1.1. New features and enhancements

  • With this release, you can enable the exec format error monitor plugin for the Multiarch Tuning Operator. This plugin detects ENOEXEC errors, which occur when a pod attempts to execute a binary incompatible with the node’s architecture. You enable this plugin by setting the plugins.execFormatErrorMonitor.enabled parameter to true in the ClusterPodPlacementConfig object. For more information, see Creating the ClusterPodPlacementConfig object.

3.12.1.2. Bug fixes

  • Previously, the Multiarch Tuning Operator incorrectly handled the Operator bundle image inspector, restricting it to a single architecture, which could cause OLM to fail when installing Operators. With this update, MTO now sets the bundle image to support all architectures, allowing Operators to be successfully installed on single-architecture clusters when the Multiarch Tuning Operator is deployed. (MULTIARCH-5546)
  • Previously, when a cluster global pull secret was changed, stale authentication information could remain in the Multiarch Tuning Operator cache. With this update, the cache is cleared whenever a cluster global pull secret is changed. (MULTIARCH-5538)
  • Previously, the Multiarch Tuning Operator failed to process pods if an image reference contained both a tag and a digest. With this update, the image inspector prioritizes the digest if both are present. (MULTIARCH-5584)
  • Previously, the Multiarch Tuning Operator did not respect the .spec.registrySources.containerRuntimeSearchRegistries field in the config.openshift.io/Image custom resource when a workload image did not specify a registry URL. With this update, the Operator can now handle this case, allowing workload images without an explicit registry URL to be pulled successfully. (MULTIARCH-5611)
  • Previously, if the ClusterPodPlacementConfig object was deleted less than 1 second after its creation, some finalizers were not removed in time, causing certain resources to remain. With this update, all finalizers are properly deleted when the ClusterPodPlacementConfig object is deleted. (MULTIARCH-5372)

Issued: 27 May 2025

3.12.2.1. Bug fixes

  • Previously, the pod placement operand did not support authenticating registries using wildcard entries in the hostname of their pull secret. This caused inconsistent behavior with Kubelet when pulling images, because Kubelet supported wildcard entries while the operand required exact hostname matches. As a result, image pulls could fail unexpectedly when registries used wildcard hostnames.

    With this release, the pod placement operand supports pull secrets that include wildcard hostnames, ensuring consistent and reliable image authentication and pulling.

  • Previously, when image inspection failed after all retries and the nodeAffinityScoring plugin was enabled, the pod placement operand applied incorrect nodeAffinityScoring labels.

    With this release, the operand sets nodeAffinityScoring labels correctly, even when image inspection fails. It now applies these labels independently of the required affinity process to ensure accurate and consistent scheduling.

Issued: 18 March 2024

3.12.3.1. New features and enhancements

  • The Multiarch Tuning Operator is now supported on managed offerings, including ROSA with Hosted Control Planes (HCP) and other HCP environments.
  • With this release, you can configure architecture-aware workload scheduling by using the new plugins field in the ClusterPodPlacementConfig object. You can use the plugins.nodeAffinityScoring field to set architecture preferences for pod placement. If you enable the nodeAffinityScoring plugin, the scheduler first filters out nodes that do not meet the pod requirements. Then, the scheduler prioritizes the remaining nodes based on the architecture scores defined in the nodeAffinityScoring.platforms field.
3.12.3.1.1. Bug fixes
  • With this release, the Multiarch Tuning Operator does not update the nodeAffinity field for pods that are managed by a daemon set. (OCPBUGS-45885)

Issued: 31 October 2024

3.12.4.1. New features and enhancements

  • With this release, the Multiarch Tuning Operator supports custom network scenarios and cluster-wide custom registries configurations.
  • With this release, you can identify pods based on their architecture compatibility by using the pod labels that the Multiarch Tuning Operator adds to newly created pods.
  • With this release, you can monitor the behavior of the Multiarch Tuning Operator by using the metrics and alerts that are registered in the Cluster Monitoring Operator.
Back to top
Red Hat logoGithubredditYoutubeTwitter

Learn

Try, buy, & sell

Communities

About Red Hat Documentation

We help Red Hat users innovate and achieve their goals with our products and services with content they can trust. Explore our recent updates.

Making open source more inclusive

Red Hat is committed to replacing problematic language in our code, documentation, and web properties. For more details, see the Red Hat Blog.

About Red Hat

We deliver hardened solutions that make it easier for enterprises to work across platforms and environments, from the core datacenter to the network edge.

Theme

© 2025 Red Hat