Chapter 4. Configuring multi-architecture compute machines on an OpenShift cluster


4.1. About clusters with multi-architecture compute machines

An OpenShift Container Platform cluster with multi-architecture compute machines is a cluster that supports compute machines with different architectures. Clusters with multi-architecture compute machines are available only on Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP) installer-provisioned infrastructures and on bare metal, IBM Power®, and IBM Z® user-provisioned infrastructures with 64-bit x86 control plane machines.

Note

When there are nodes with multiple architectures in your cluster, the architecture of your image must be consistent with the architecture of the node. You need to ensure that the pod is assigned to the node with the appropriate architecture and that it matches the image architecture. For more information on assigning pods to nodes, see Assigning pods to nodes.
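
For example, the following is a minimal sketch of pinning a workload to arm64 nodes with a node selector; the deployment name and image are placeholder values, and the image must be built for arm64 or published as a multi-architecture manifest:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: sample-arm64-app
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: sample-arm64-app
      template:
        metadata:
          labels:
            app: sample-arm64-app
        spec:
          nodeSelector:
            kubernetes.io/arch: arm64 # schedule pods only on arm64 nodes
          containers:
          - name: app
            image: registry.example.com/sample/app:latest # placeholder image; must provide an arm64 build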

Important

The Cluster Samples Operator is not supported on clusters with multi-architecture compute machines. Your cluster can be created without this capability. For more information, see Cluster capabilities.

For information on migrating your single-architecture cluster to a cluster that supports multi-architecture compute machines, see Migrating to a cluster with multi-architecture compute machines.

4.1.1. Configuring your cluster with multi-architecture compute machines

To create a cluster with multi-architecture compute machines, use the documentation in the following table for your platform and installation option:

Table 4.1. Cluster with multi-architecture compute machine installation options

| Documentation section                                                                                  | Platform                    | Installation type     | Control plane      | Compute node     |
| ------------------------------------------------------------------------------------------------------ | --------------------------- | --------------------- | ------------------ | ---------------- |
| Creating a cluster with multi-architecture compute machines on Azure                                    | Microsoft Azure             | Installer-provisioned | x86_64             | aarch64          |
| Creating a cluster with multi-architecture compute machines on AWS                                      | Amazon Web Services (AWS)   | Installer-provisioned | x86_64             | aarch64          |
| Creating a cluster with multi-architecture compute machines on GCP                                      | Google Cloud Platform (GCP) | Installer-provisioned | x86_64             | aarch64          |
| Creating a cluster with multi-architecture compute machines on bare metal, IBM Power, or IBM Z          | Bare metal                  | User-provisioned      | x86_64             | aarch64          |
| Creating a cluster with multi-architecture compute machines on bare metal, IBM Power, or IBM Z          | IBM Power                   | User-provisioned      | x86_64 or ppc64le  | x86_64, ppc64le  |
| Creating a cluster with multi-architecture compute machines on bare metal, IBM Power, or IBM Z          | IBM Z                       | User-provisioned      | x86_64 or s390x    | x86_64, s390x    |
| Creating a cluster with multi-architecture compute machines on IBM Z® and IBM® LinuxONE with z/VM       | IBM Z® and IBM® LinuxONE    | User-provisioned      | x86_64             | x86_64, s390x    |
| Creating a cluster with multi-architecture compute machines on IBM Z® and IBM® LinuxONE with RHEL KVM   | IBM Z® and IBM® LinuxONE    | User-provisioned      | x86_64             | x86_64, s390x    |
| Creating a cluster with multi-architecture compute machines on IBM Power®                               | IBM Power®                  | User-provisioned      | x86_64             | x86_64, ppc64le  |

Important

Autoscaling from zero is currently not supported on Google Cloud Platform (GCP).

4.2. Creating a cluster with multi-architecture compute machines on Azure

To deploy an Azure cluster with multi-architecture compute machines, you must first create a single-architecture Azure installer-provisioned cluster that uses the multi-architecture installer binary. For more information on Azure installations, see Installing a cluster on Azure with customizations. You can then add an ARM64 compute machine set to your cluster to create a cluster with multi-architecture compute machines.

The following procedures explain how to generate an ARM64 boot image and create an Azure compute machine set that uses the ARM64 boot image. This adds ARM64 compute nodes to your cluster and deploys the number of ARM64 virtual machines (VMs) that you need.

4.2.1. Verifying cluster compatibility

Before you can start adding compute nodes of different architectures to your cluster, you must verify that your cluster is multi-architecture compatible.

Prerequisites

  • You installed the OpenShift CLI (oc).

Procedure

  • You can check whether your cluster uses the multi-architecture payload by running the following command:

    $ oc adm release info -o jsonpath="{ .metadata.metadata}"

Verification

  1. If you see the following output, then your cluster is using the multi-architecture payload:

    {
     "release.openshift.io/architecture": "multi",
     "url": "https://access.redhat.com/errata/<errata_version>"
    }

    You can then begin adding multi-arch compute nodes to your cluster.

  2. If you see the following output, then your cluster is not using the multi-architecture payload:

    {
     "url": "https://access.redhat.com/errata/<errata_version>"
    }
    Important

    To migrate your cluster so the cluster supports multi-architecture compute machines, follow the procedure in Migrating to a cluster with multi-architecture compute machines.

4.2.2. Creating an ARM64 boot image using the Azure image gallery

The following procedure describes how to manually generate an ARM64 boot image.
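
The commands in this procedure use shell variables for the Azure resource group, storage account, blob container, and image gallery names. The following is a minimal sketch of exporting them up front; all values are placeholders that you replace with names for your environment:

    $ export RESOURCE_GROUP=<resource_group>        # resource group that holds the boot image
    $ export STORAGE_ACCOUNT_NAME=<storage_account> # storage account for the VHD
    $ export CONTAINER_NAME=<container>             # blob container for the VHD
    $ export GALLERY_NAME=<gallery>                 # Azure compute gallery name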

Prerequisites

  • You installed the Azure CLI (az).
  • You created a single-architecture Azure installer-provisioned cluster with the multi-architecture installer binary.

Procedure

  1. Log in to your Azure account:

    $ az login
  2. Create a storage account and upload the arm64 virtual hard disk (VHD) to your storage account. The OpenShift Container Platform installation program creates a resource group; however, you can also upload the boot image to a custom-named resource group:

    $ az storage account create -n ${STORAGE_ACCOUNT_NAME} -g ${RESOURCE_GROUP} -l westus --sku Standard_LRS 1
    1
    The westus value is an example region; replace it with the region for your cluster.
  3. Create a storage container using the storage account you generated:

    $ az storage container create -n ${CONTAINER_NAME} --account-name ${STORAGE_ACCOUNT_NAME}
  4. Extract the URL and the aarch64 VHD name from the RHCOS stream metadata JSON that is stored in the coreos-bootimages config map:

    1. Extract the URL field and set it as the value of the RHCOS_VHD_ORIGIN_URL variable by running the following command:

      $ RHCOS_VHD_ORIGIN_URL=$(oc -n openshift-machine-config-operator get configmap/coreos-bootimages -o jsonpath='{.data.stream}' | jq -r '.architectures.aarch64."rhel-coreos-extensions"."azure-disk".url')
    2. Extract the aarch64 VHD name and set it as the value of the BLOB_NAME variable by running the following command:

      $ BLOB_NAME=rhcos-$(oc -n openshift-machine-config-operator get configmap/coreos-bootimages -o jsonpath='{.data.stream}' | jq -r '.architectures.aarch64."rhel-coreos-extensions"."azure-disk".release')-azure.aarch64.vhd
  5. Generate a shared access signature (SAS) token. You use this token in the next step to copy the RHCOS VHD into your storage container. Generate the token by running the following commands:

    $ end=`date -u -d "30 minutes" '+%Y-%m-%dT%H:%MZ'`
    $ sas=`az storage container generate-sas -n ${CONTAINER_NAME} --account-name ${STORAGE_ACCOUNT_NAME} --https-only --permissions dlrw --expiry $end -o tsv`
  6. Copy the RHCOS VHD into the storage container:

    $ az storage blob copy start --account-name ${STORAGE_ACCOUNT_NAME} --sas-token "$sas" \
     --source-uri "${RHCOS_VHD_ORIGIN_URL}" \
     --destination-blob "${BLOB_NAME}" --destination-container ${CONTAINER_NAME}

    You can check the status of the copying process with the following command:

    $ az storage blob show -c ${CONTAINER_NAME} -n ${BLOB_NAME} --account-name ${STORAGE_ACCOUNT_NAME} | jq .properties.copy

    Example output

    {
     "completionTime": null,
     "destinationSnapshot": null,
     "id": "1fd97630-03ca-489a-8c4e-cfe839c9627d",
     "incrementalCopy": null,
     "progress": "17179869696/17179869696",
     "source": "https://rhcos.blob.core.windows.net/imagebucket/rhcos-411.86.202207130959-0-azure.aarch64.vhd",
     "status": "success", 1
     "statusDescription": null
    }

    1
    If the status parameter displays success, the copy operation is complete.
  7. Create an image gallery using the following command:

    $ az sig create --resource-group ${RESOURCE_GROUP} --gallery-name ${GALLERY_NAME}

    Use the image gallery to create an image definition. In the following example command, rhcos-arm64 is the name of the image definition.

    $ az sig image-definition create --resource-group ${RESOURCE_GROUP} --gallery-name ${GALLERY_NAME} --gallery-image-definition rhcos-arm64 --publisher RedHat --offer arm --sku arm64 --os-type linux --architecture Arm64 --hyper-v-generation V2
  8. To get the URL of the VHD and set it as the value of the RHCOS_VHD_URL variable, run the following command:

    $ RHCOS_VHD_URL=$(az storage blob url --account-name ${STORAGE_ACCOUNT_NAME} -c ${CONTAINER_NAME} -n "${BLOB_NAME}" -o tsv)
  9. Use the RHCOS_VHD_URL value, your storage account, resource group, and image gallery to create an image version. In the following example, 1.0.0 is the image version.

    $ az sig image-version create --resource-group ${RESOURCE_GROUP} --gallery-name ${GALLERY_NAME} --gallery-image-definition rhcos-arm64 --gallery-image-version 1.0.0 --os-vhd-storage-account ${STORAGE_ACCOUNT_NAME} --os-vhd-uri ${RHCOS_VHD_URL}
  10. Your arm64 boot image is now generated. You can access the ID of your image with the following command:

    $ az sig image-version show -r $GALLERY_NAME -g $RESOURCE_GROUP -i rhcos-arm64 -e 1.0.0

    The following example image ID is used in the resourceID parameter of the compute machine set:

    Example resourceID

    /resourceGroups/${RESOURCE_GROUP}/providers/Microsoft.Compute/galleries/${GALLERY_NAME}/images/rhcos-arm64/versions/1.0.0

4.2.3. Adding a multi-architecture compute machine set to your cluster

To add ARM64 compute nodes to your cluster, you must create an Azure compute machine set that uses the ARM64 boot image. To create your own custom compute machine set on Azure, see "Creating a compute machine set on Azure".
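
The sample compute machine set in this procedure uses <infrastructure_id> placeholders. If you have the OpenShift CLI installed, you can obtain the infrastructure ID by running the following command:

    $ oc get -o jsonpath='{.status.infrastructureName}{"\n"}' infrastructure cluster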

Prerequisites

  • You installed the OpenShift CLI (oc).

Procedure

  • Create a compute machine set YAML file and modify the resourceID and vmSize parameters, then apply it by running the following command. This compute machine set controls the arm64 worker nodes in your cluster:

    $ oc create -f arm64-machine-set-0.yaml

    Sample YAML compute machine set with arm64 boot image

    apiVersion: machine.openshift.io/v1beta1
    kind: MachineSet
    metadata:
      labels:
        machine.openshift.io/cluster-api-cluster: <infrastructure_id>
        machine.openshift.io/cluster-api-machine-role: worker
        machine.openshift.io/cluster-api-machine-type: worker
      name: <infrastructure_id>-arm64-machine-set-0
      namespace: openshift-machine-api
    spec:
      replicas: 2
      selector:
        matchLabels:
          machine.openshift.io/cluster-api-cluster: <infrastructure_id>
          machine.openshift.io/cluster-api-machineset: <infrastructure_id>-arm64-machine-set-0
      template:
        metadata:
          labels:
            machine.openshift.io/cluster-api-cluster: <infrastructure_id>
            machine.openshift.io/cluster-api-machine-role: worker
            machine.openshift.io/cluster-api-machine-type: worker
            machine.openshift.io/cluster-api-machineset: <infrastructure_id>-arm64-machine-set-0
        spec:
          lifecycleHooks: {}
          metadata: {}
          providerSpec:
            value:
              acceleratedNetworking: true
              apiVersion: machine.openshift.io/v1beta1
              credentialsSecret:
                name: azure-cloud-credentials
                namespace: openshift-machine-api
              image:
                offer: ""
                publisher: ""
                resourceID: /resourceGroups/${RESOURCE_GROUP}/providers/Microsoft.Compute/galleries/${GALLERY_NAME}/images/rhcos-arm64/versions/1.0.0 1
                sku: ""
                version: ""
              kind: AzureMachineProviderSpec
              location: <region>
              managedIdentity: <infrastructure_id>-identity
              networkResourceGroup: <infrastructure_id>-rg
              osDisk:
                diskSettings: {}
                diskSizeGB: 128
                managedDisk:
                  storageAccountType: Premium_LRS
                osType: Linux
              publicIP: false
              publicLoadBalancer: <infrastructure_id>
              resourceGroup: <infrastructure_id>-rg
              subnet: <infrastructure_id>-worker-subnet
              userDataSecret:
                name: worker-user-data
              vmSize: Standard_D4ps_v5 2
              vnet: <infrastructure_id>-vnet
              zone: "<zone>"

    1
    Set the resourceID parameter to the arm64 boot image.
    2
    Set the vmSize parameter to the instance type used in your installation. Some example instance types are Standard_D4ps_v5 and Standard_D8ps_v5.
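
    If you are not sure which ARM64 sizes are available in your region, you can query them with the Azure CLI. The following sketch filters for sizes whose names start with Standard_D2ps; adjust the filter and region for your environment:

    $ az vm list-skus --location <region> --size Standard_D2ps --output table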

Verification

  1. Verify that the new ARM64 machines are running by entering the following command:

    $ oc get machineset -n openshift-machine-api

    Example output

    NAME                                                DESIRED  CURRENT  READY  AVAILABLE  AGE
    <infrastructure_id>-arm64-machine-set-0                   2        2      2          2  10m

  2. You can check that the nodes are ready and schedulable with the following command:

    $ oc get nodes
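
    You can also display the architecture that each node reports by adding the architecture label as a column, for example:

    $ oc get nodes -L kubernetes.io/arch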

4.3. Creating a cluster with multi-architecture compute machines on AWS

To create an AWS cluster with multi-architecture compute machines, you must first create a single-architecture AWS installer-provisioned cluster with the multi-architecture installer binary. For more information on AWS installations, refer to Installing a cluster on AWS with customizations. You can then add an ARM64 compute machine set to your AWS cluster.

4.3.1. Verifying cluster compatibility

Before you can start adding compute nodes of different architectures to your cluster, you must verify that your cluster is multi-architecture compatible.

Prerequisites

  • You installed the OpenShift CLI (oc).

Procedure

  • You can check whether your cluster uses the multi-architecture payload by running the following command:

    $ oc adm release info -o jsonpath="{ .metadata.metadata}"

Verification

  1. If you see the following output, then your cluster is using the multi-architecture payload:

    {
     "release.openshift.io/architecture": "multi",
     "url": "https://access.redhat.com/errata/<errata_version>"
    }

    You can then begin adding multi-arch compute nodes to your cluster.

  2. If you see the following output, then your cluster is not using the multi-architecture payload:

    {
     "url": "https://access.redhat.com/errata/<errata_version>"
    }
    Important

    To migrate your cluster so the cluster supports multi-architecture compute machines, follow the procedure in Migrating to a cluster with multi-architecture compute machines.

4.3.2. Adding an ARM64 compute machine set to your cluster

To configure a cluster with multi-architecture compute machines, you must create an AWS ARM64 compute machine set. This adds ARM64 compute nodes to your cluster so that your cluster has multi-architecture compute machines.

Prerequisites

  • You installed the OpenShift CLI (oc).
  • You used the installation program to create an AMD64 single-architecture AWS cluster with the multi-architecture installer binary.

Procedure

  • Create and modify a compute machine set; this compute machine set controls the ARM64 compute nodes in your cluster:

    $ oc create -f aws-arm64-machine-set-0.yaml

    Sample YAML compute machine set to deploy an ARM64 compute node

    apiVersion: machine.openshift.io/v1beta1
    kind: MachineSet
    metadata:
      labels:
        machine.openshift.io/cluster-api-cluster: <infrastructure_id> 1
      name: <infrastructure_id>-aws-arm64-machine-set-0 2
      namespace: openshift-machine-api
    spec:
      replicas: 1
      selector:
        matchLabels:
          machine.openshift.io/cluster-api-cluster: <infrastructure_id> 3
          machine.openshift.io/cluster-api-machineset: <infrastructure_id>-<role>-<zone> 4
      template:
        metadata:
          labels:
            machine.openshift.io/cluster-api-cluster: <infrastructure_id>
            machine.openshift.io/cluster-api-machine-role: <role> 5
            machine.openshift.io/cluster-api-machine-type: <role> 6
            machine.openshift.io/cluster-api-machineset: <infrastructure_id>-<role>-<zone> 7
        spec:
          metadata:
            labels:
              node-role.kubernetes.io/<role>: ""
          providerSpec:
            value:
              ami:
                id: ami-02a574449d4f4d280 8
              apiVersion: awsproviderconfig.openshift.io/v1beta1
              blockDevices:
                - ebs:
                    iops: 0
                    volumeSize: 120
                    volumeType: gp2
              credentialsSecret:
                name: aws-cloud-credentials
              deviceIndex: 0
              iamInstanceProfile:
                id: <infrastructure_id>-worker-profile 9
              instanceType: m6g.xlarge 10
              kind: AWSMachineProviderConfig
              placement:
                availabilityZone: us-east-1a 11
                region: <region> 12
              securityGroups:
                - filters:
                    - name: tag:Name
                      values:
                        - <infrastructure_id>-worker-sg 13
              subnet:
                filters:
                  - name: tag:Name
                    values:
                      - <infrastructure_id>-private-<zone>
              tags:
                - name: kubernetes.io/cluster/<infrastructure_id> 14
                  value: owned
                - name: <custom_tag_name>
                  value: <custom_tag_value>
              userDataSecret:
                name: worker-user-data

    1 2 3 9 13 14
    Specify the infrastructure ID that is based on the cluster ID that you set when you provisioned the cluster. If you have the OpenShift CLI installed, you can obtain the infrastructure ID by running the following command:
    $ oc get -o jsonpath='{.status.infrastructureName}{"\n"}' infrastructure cluster
    4 7
    Specify the infrastructure ID, role node label, and zone.
    5 6
    Specify the role node label to add.
    8
    Specify an ARM64 supported Red Hat Enterprise Linux CoreOS (RHCOS) Amazon Machine Image (AMI) for your AWS zone for your OpenShift Container Platform nodes.
    $ oc get configmap/coreos-bootimages \
    	  -n openshift-machine-config-operator \
    	  -o jsonpath='{.data.stream}' | jq \
    	  -r '.architectures.<arch>.images.aws.regions."<region>".image'
    10
    Specify an ARM64 supported machine type. For more information, refer to "Tested instance types for AWS 64-bit ARM".
    11
    Specify the zone, for example us-east-1a. Ensure that the zone you select offers 64-bit ARM machines.
    12
    Specify the region, for example, us-east-1. Ensure that the region you select offers 64-bit ARM machines.
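
    For example, running the command from callout 8 with concrete values prints the aarch64 RHCOS AMI for the us-east-1 region; substitute your own region:

    $ oc get configmap/coreos-bootimages \
          -n openshift-machine-config-operator \
          -o jsonpath='{.data.stream}' | jq \
          -r '.architectures.aarch64.images.aws.regions."us-east-1".image'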

Verification

  1. View the list of compute machine sets by entering the following command:

    $ oc get machineset -n openshift-machine-api

    You can then see your created ARM64 machine set.

    Example output

    NAME                                                DESIRED  CURRENT  READY  AVAILABLE  AGE
    <infrastructure_id>-aws-arm64-machine-set-0                   2        2      2          2  10m

  2. You can check that the nodes are ready and schedulable with the following command:

    $ oc get nodes
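
    If you need more ARM64 compute nodes later, you can scale the compute machine set. For example, the following command sets two replicas for the machine set created in this procedure:

    $ oc scale machineset <infrastructure_id>-aws-arm64-machine-set-0 --replicas=2 -n openshift-machine-api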

4.4. Creating a cluster with multi-architecture compute machines on GCP

To create a Google Cloud Platform (GCP) cluster with multi-architecture compute machines, you must first create a single-architecture GCP installer-provisioned cluster with the multi-architecture installer binary. For more information on GCP installations, refer to Installing a cluster on GCP with customizations. You can then add ARM64 compute machine sets to your GCP cluster.

Note

Secure Boot is currently not supported on ARM64 machines for GCP.

4.4.1. Verifying cluster compatibility

Before you can start adding compute nodes of different architectures to your cluster, you must verify that your cluster is multi-architecture compatible.

Prerequisites

  • You installed the OpenShift CLI (oc).

Procedure

  • You can check whether your cluster uses the multi-architecture payload by running the following command:

    $ oc adm release info -o jsonpath="{ .metadata.metadata}"

Verification

  1. If you see the following output, then your cluster is using the multi-architecture payload:

    {
     "release.openshift.io/architecture": "multi",
     "url": "https://access.redhat.com/errata/<errata_version>"
    }

    You can then begin adding multi-arch compute nodes to your cluster.

  2. If you see the following output, then your cluster is not using the multi-architecture payload:

    {
     "url": "https://access.redhat.com/errata/<errata_version>"
    }
    Important

    To migrate your cluster so the cluster supports multi-architecture compute machines, follow the procedure in Migrating to a cluster with multi-architecture compute machines.

4.4.2. Adding an ARM64 compute machine set to your GCP cluster

To configure a cluster with multi-architecture compute machines, you must create a GCP ARM64 compute machine set. This adds ARM64 compute nodes to your cluster.

Prerequisites

  • You installed the OpenShift CLI (oc).
  • You used the installation program to create an AMD64 single-architecture GCP cluster with the multi-architecture installer binary.

Procedure

  • Create and modify a compute machine set; this compute machine set controls the ARM64 compute nodes in your cluster:

    $ oc create -f gcp-arm64-machine-set-0.yaml

    Sample GCP YAML compute machine set to deploy an ARM64 compute node

    apiVersion: machine.openshift.io/v1beta1
    kind: MachineSet
    metadata:
      labels:
        machine.openshift.io/cluster-api-cluster: <infrastructure_id> 1
      name: <infrastructure_id>-w-a
      namespace: openshift-machine-api
    spec:
      replicas: 1
      selector:
        matchLabels:
          machine.openshift.io/cluster-api-cluster: <infrastructure_id>
          machine.openshift.io/cluster-api-machineset: <infrastructure_id>-w-a
      template:
        metadata:
          creationTimestamp: null
          labels:
            machine.openshift.io/cluster-api-cluster: <infrastructure_id>
            machine.openshift.io/cluster-api-machine-role: <role> 2
            machine.openshift.io/cluster-api-machine-type: <role>
            machine.openshift.io/cluster-api-machineset: <infrastructure_id>-w-a
        spec:
          metadata:
            labels:
              node-role.kubernetes.io/<role>: ""
          providerSpec:
            value:
              apiVersion: gcpprovider.openshift.io/v1beta1
              canIPForward: false
              credentialsSecret:
                name: gcp-cloud-credentials
              deletionProtection: false
              disks:
              - autoDelete: true
                boot: true
                image: <path_to_image> 3
                labels: null
                sizeGb: 128
                type: pd-ssd
              gcpMetadata: 4
              - key: <custom_metadata_key>
                value: <custom_metadata_value>
              kind: GCPMachineProviderSpec
               machineType: t2a-standard-4 5
              metadata:
                creationTimestamp: null
              networkInterfaces:
              - network: <infrastructure_id>-network
                subnetwork: <infrastructure_id>-worker-subnet
              projectID: <project_name> 6
              region: us-central1 7
              serviceAccounts:
              - email: <infrastructure_id>-w@<project_name>.iam.gserviceaccount.com
                scopes:
                - https://www.googleapis.com/auth/cloud-platform
              tags:
                - <infrastructure_id>-worker
              userDataSecret:
                name: worker-user-data
              zone: us-central1-a

    1
    Specify the infrastructure ID that is based on the cluster ID that you set when you provisioned the cluster. You can obtain the infrastructure ID by running the following command:
    $ oc get -o jsonpath='{.status.infrastructureName}{"\n"}' infrastructure cluster
    2
    Specify the role node label to add.
    3
    Specify the path to the image that is used in the current compute machine sets. You need the project and image name to construct the image path.

    To access the project and image name, run the following command:

    $ oc get configmap/coreos-bootimages \
      -n openshift-machine-config-operator \
      -o jsonpath='{.data.stream}' | jq \
      -r '.architectures.aarch64.images.gcp'

    Example output

      "gcp": {
        "release": "415.92.202309142014-0",
        "project": "rhcos-cloud",
        "name": "rhcos-415-92-202309142014-0-gcp-aarch64"
      }

    Use the project and name parameters from the output to construct the image path for your machine set. The image path has the following format:

    projects/<project>/global/images/<image_name>

    For example, with the values from the preceding output, the image path is projects/rhcos-cloud/global/images/rhcos-415-92-202309142014-0-gcp-aarch64.
    4
    Optional: Specify custom metadata in the form of a key:value pair. For example use cases, see the GCP documentation for setting custom metadata.
    5
    Specify an ARM64 supported machine type. For more information, refer to Tested instance types for GCP on 64-bit ARM infrastructures in "Additional resources".
    6
    Specify the name of the GCP project that you use for your cluster.
    7
    Specify the region, for example, us-central1. Ensure that the region you select offers 64-bit ARM machines.

Verification

  1. View the list of compute machine sets by entering the following command:

    $ oc get machineset -n openshift-machine-api

    You can then see your created ARM64 machine set.

    Example output

    NAME                                                DESIRED  CURRENT  READY  AVAILABLE  AGE
    <infrastructure_id>-gcp-arm64-machine-set-0                   2        2      2          2  10m

  2. You can check that the nodes are ready and schedulable with the following command:

    $ oc get nodes
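
    You can also confirm that the corresponding machines have reached the Running phase, for example:

    $ oc get machines -n openshift-machine-api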

4.5. Creating a cluster with multi-architecture compute machines on bare metal, IBM Power, or IBM Z

To create a cluster with multi-architecture compute machines on bare metal (x86_64), IBM Power® (ppc64le), or IBM Z® (s390x), you must have an existing single-architecture cluster on one of these platforms. Follow the installation procedures for your platform.

Before you can add additional compute nodes to your cluster, you must upgrade your cluster to one that uses the multi-architecture payload. For more information on migrating to the multi-architecture payload, see Migrating to a cluster with multi-architecture compute machines.

The following procedures explain how to create a RHCOS compute machine using an ISO image or network PXE booting. This will allow you to add additional nodes to your cluster and deploy a cluster with multi-architecture compute machines.

4.5.1. Verifying cluster compatibility

Before you can start adding compute nodes of different architectures to your cluster, you must verify that your cluster is multi-architecture compatible.

Prerequisites

  • You installed the OpenShift CLI (oc).

Procedure

  • You can check whether your cluster uses the multi-architecture payload by running the following command:

    $ oc adm release info -o jsonpath="{ .metadata.metadata}"

Verification

  1. If you see the following output, then your cluster is using the multi-architecture payload:

    {
     "release.openshift.io/architecture": "multi",
     "url": "https://access.redhat.com/errata/<errata_version>"
    }

    You can then begin adding multi-arch compute nodes to your cluster.

  2. If you see the following output, then your cluster is not using the multi-architecture payload:

    {
     "url": "https://access.redhat.com/errata/<errata_version>"
    }
    Important

    To migrate your cluster so the cluster supports multi-architecture compute machines, follow the procedure in Migrating to a cluster with multi-architecture compute machines.

4.5.2. Creating RHCOS machines using an ISO image

You can create more Red Hat Enterprise Linux CoreOS (RHCOS) compute machines for your bare metal cluster by using an ISO image to create the machines.

Prerequisites

  • Obtain the URL of the Ignition config file for the compute machines for your cluster. You uploaded this file to your HTTP server during installation.
  • You must have the OpenShift CLI (oc) installed.

Procedure

  1. Extract the Ignition config file from the cluster by running the following command:

    $ oc extract -n openshift-machine-api secret/worker-user-data-managed --keys=userData --to=- > worker.ign
  2. Upload the worker.ign Ignition config file you exported from your cluster to your HTTP server. Note the URLs of these files.
  3. You can validate that the Ignition config file is available at the URL. The following example gets the Ignition config file for the compute node:

    $ curl -k http://<HTTP_server>/worker.ign
  4. You can get the URL of the ISO image for booting your new machine by running the following command:

    $ RHCOS_ISO_URL=$(oc -n openshift-machine-config-operator get configmap/coreos-bootimages -o jsonpath='{.data.stream}' | jq -r '.architectures.<architecture>.artifacts.metal.formats.iso.disk.location')
  5. Use the ISO file to install RHCOS on more compute machines. Use the same method that you used when you created machines before you installed the cluster:

    • Burn the ISO image to a disk and boot it directly.
    • Use ISO redirection with a LOM interface.
  6. Boot the RHCOS ISO image without specifying any options or interrupting the live boot sequence. Wait for the installer to boot into a shell prompt in the RHCOS live environment.

    Note

    You can interrupt the RHCOS installation boot process to add kernel arguments. However, for this ISO procedure you must use the coreos-installer command as outlined in the following steps, instead of adding kernel arguments.

  7. Run the coreos-installer command and specify the options that meet your installation requirements. At a minimum, you must specify the URL that points to the Ignition config file for the node type, and the device that you are installing to:

    $ sudo coreos-installer install --ignition-url=http://<HTTP_server>/<node_type>.ign <device> --ignition-hash=sha512-<digest> 1 2
    1
    You must run the coreos-installer command by using sudo, because the core user does not have the required root privileges to perform the installation.
    2
    The --ignition-hash option is required when the Ignition config file is obtained through an HTTP URL to validate the authenticity of the Ignition config file on the cluster node. <digest> is the Ignition config file SHA512 digest obtained in a preceding step.
    Note

    If you want to provide your Ignition config files through an HTTPS server that uses TLS, you can add the internal certificate authority (CA) to the system trust store before running coreos-installer.

    The following example initializes a compute node installation to the /dev/sda device. The Ignition config file for the compute node is obtained from an HTTP web server with the IP address 192.168.1.2:

    $ sudo coreos-installer install --ignition-url=http://192.168.1.2:80/installation_directory/worker.ign /dev/sda --ignition-hash=sha512-a5a2d43879223273c9b60af66b44202a1d1248fc01cf156c46d4a79f552b6bad47bc8cc78ddf0116e80c59d2ea9e32ba53bc807afbca581aa059311def2c3e3b
  8. Monitor the progress of the RHCOS installation on the console of the machine.

    Important

    Ensure that the installation is successful on each node before commencing with the OpenShift Container Platform installation. Observing the installation process can also help to determine the cause of RHCOS installation issues that might arise.

  9. Continue to create more compute machines for your cluster.
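
The coreos-installer command in step 7 takes the SHA512 digest of the Ignition config file. One way to compute the digest, assuming the worker.ign file is available on the machine where you run the command, is shown in the following sketch; prefix the resulting hexadecimal value with sha512- when you pass it to --ignition-hash:

    $ sha512sum worker.ign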

4.5.3. Creating RHCOS machines by PXE or iPXE booting

You can create more Red Hat Enterprise Linux CoreOS (RHCOS) compute machines for your bare metal cluster by using PXE or iPXE booting.

Prerequisites

  • Obtain the URL of the Ignition config file for the compute machines for your cluster. You uploaded this file to your HTTP server during installation.
  • Obtain the URLs of the RHCOS ISO image, compressed metal BIOS, kernel, and initramfs files that you uploaded to your HTTP server during cluster installation.
  • You have access to the PXE booting infrastructure that you used to create the machines for your OpenShift Container Platform cluster during installation. The machines must boot from their local disks after RHCOS is installed on them.
  • If you use UEFI, you have access to the grub.conf file that you modified during OpenShift Container Platform installation.

Procedure

  1. Confirm that your PXE or iPXE installation for the RHCOS images is correct.

    • For PXE:

      DEFAULT pxeboot
      TIMEOUT 20
      PROMPT 0
      LABEL pxeboot
          KERNEL http://<HTTP_server>/rhcos-<version>-live-kernel-<architecture> 1
          APPEND initrd=http://<HTTP_server>/rhcos-<version>-live-initramfs.<architecture>.img coreos.inst.install_dev=/dev/sda coreos.inst.ignition_url=http://<HTTP_server>/worker.ign coreos.live.rootfs_url=http://<HTTP_server>/rhcos-<version>-live-rootfs.<architecture>.img 2
      1
      Specify the location of the live kernel file that you uploaded to your HTTP server.
      2
      Specify locations of the RHCOS files that you uploaded to your HTTP server. The initrd parameter value is the location of the live initramfs file, the coreos.inst.ignition_url parameter value is the location of the worker Ignition config file, and the coreos.live.rootfs_url parameter value is the location of the live rootfs file. The coreos.inst.ignition_url and coreos.live.rootfs_url parameters only support HTTP and HTTPS.
      Note

      This configuration does not enable serial console access on machines with a graphical console. To configure a different console, add one or more console= arguments to the APPEND line. For example, add console=tty0 console=ttyS0 to set the first PC serial port as the primary console and the graphical console as a secondary console. For more information, see How does one set up a serial terminal and/or console in Red Hat Enterprise Linux?.

    • For iPXE (x86_64 + aarch64):

      kernel http://<HTTP_server>/rhcos-<version>-live-kernel-<architecture> initrd=main coreos.live.rootfs_url=http://<HTTP_server>/rhcos-<version>-live-rootfs.<architecture>.img coreos.inst.install_dev=/dev/sda coreos.inst.ignition_url=http://<HTTP_server>/worker.ign 1 2
      initrd --name main http://<HTTP_server>/rhcos-<version>-live-initramfs.<architecture>.img 3
      boot
      1
      Specify the locations of the RHCOS files that you uploaded to your HTTP server. The kernel parameter value is the location of the kernel file, the initrd=main argument is needed for booting on UEFI systems, the coreos.live.rootfs_url parameter value is the location of the rootfs file, and the coreos.inst.ignition_url parameter value is the location of the worker Ignition config file.
      2
      If you use multiple NICs, specify a single interface in the ip option. For example, to use DHCP on a NIC that is named eno1, set ip=eno1:dhcp.
      3
      Specify the location of the initramfs file that you uploaded to your HTTP server.
      Note

       This configuration does not enable serial console access on machines with a graphical console. To configure a different console, add one or more console= arguments to the kernel line. For example, add console=tty0 console=ttyS0 to set the first PC serial port as the primary console and the graphical console as a secondary console. For more information, see How does one set up a serial terminal and/or console in Red Hat Enterprise Linux? and "Enabling the serial console for PXE and ISO installation" in the "Advanced RHCOS installation configuration" section.

      Note

       To network boot the CoreOS kernel on aarch64 architecture, you need to use a version of iPXE built with the IMAGE_GZIP option enabled. See IMAGE_GZIP option in iPXE.

    • For PXE (with UEFI and GRUB as second stage) on aarch64:

      menuentry 'Install CoreOS' {
          linux rhcos-<version>-live-kernel-<architecture>  coreos.live.rootfs_url=http://<HTTP_server>/rhcos-<version>-live-rootfs.<architecture>.img coreos.inst.install_dev=/dev/sda coreos.inst.ignition_url=http://<HTTP_server>/worker.ign 1 2
          initrd rhcos-<version>-live-initramfs.<architecture>.img 3
      }
      1
      Specify the locations of the RHCOS files that you uploaded to your HTTP/TFTP server. The kernel parameter value is the location of the kernel file on your TFTP server. The coreos.live.rootfs_url parameter value is the location of the rootfs file, and the coreos.inst.ignition_url parameter value is the location of the worker Ignition config file on your HTTP Server.
      2
      If you use multiple NICs, specify a single interface in the ip option. For example, to use DHCP on a NIC that is named eno1, set ip=eno1:dhcp.
      3
      Specify the location of the initramfs file that you uploaded to your TFTP server.
  2. Use the PXE or iPXE infrastructure to create the required compute machines for your cluster.

4.5.4. Approving the certificate signing requests for your machines

When you add machines to a cluster, two pending certificate signing requests (CSRs) are generated for each machine that you added. You must confirm that these CSRs are approved or, if necessary, approve them yourself. The client requests must be approved first, followed by the server requests.

Prerequisites

  • You added machines to your cluster.

Procedure

  1. Confirm that the cluster recognizes the machines:

    $ oc get nodes

    Example output

    NAME      STATUS    ROLES   AGE  VERSION
    master-0  Ready     master  63m  v1.28.5
    master-1  Ready     master  63m  v1.28.5
    master-2  Ready     master  64m  v1.28.5

    The output lists all of the machines that you created.

    Note

    The preceding output might not include the compute nodes, also known as worker nodes, until some CSRs are approved.

  2. Review the pending CSRs and ensure that you see the client requests with the Pending or Approved status for each machine that you added to the cluster:

    $ oc get csr

    Example output

    NAME        AGE     REQUESTOR                                                                   CONDITION
    csr-8b2br   15m     system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Pending
    csr-8vnps   15m     system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Pending
    ...

    In this example, two machines are joining the cluster. You might see more approved CSRs in the list.

  3. If the CSRs were not approved, after all of the pending CSRs for the machines you added are in Pending status, approve the CSRs for your cluster machines:

    Note

    Because the CSRs rotate automatically, approve your CSRs within an hour of adding the machines to the cluster. If you do not approve them within an hour, the certificates will rotate, and more than two certificates will be present for each node. You must approve all of these certificates. After the client CSR is approved, the Kubelet creates a secondary CSR for the serving certificate, which requires manual approval. Then, subsequent serving certificate renewal requests are automatically approved by the machine-approver if the Kubelet requests a new certificate with identical parameters.

    Note

    For clusters running on platforms that are not machine API enabled, such as bare metal and other user-provisioned infrastructure, you must implement a method of automatically approving the kubelet serving certificate requests (CSRs). If a request is not approved, then the oc exec, oc rsh, and oc logs commands cannot succeed, because a serving certificate is required when the API server connects to the kubelet. Any operation that contacts the Kubelet endpoint requires this certificate approval to be in place. The method must watch for new CSRs, confirm that the CSR was submitted by the node-bootstrapper service account in the system:node or system:admin groups, and confirm the identity of the node.

    • To approve them individually, run the following command for each valid CSR:

      $ oc adm certificate approve <csr_name> 1
      1
      <csr_name> is the name of a CSR from the list of current CSRs.
    • To approve all pending CSRs, run the following command:

      $ oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs --no-run-if-empty oc adm certificate approve
      Note

      Some Operators might not become available until some CSRs are approved.

  4. Now that your client requests are approved, you must review the server requests for each machine that you added to the cluster:

    $ oc get csr

    Example output

    NAME        AGE     REQUESTOR                                                                   CONDITION
    csr-bfd72   5m26s   system:node:ip-10-0-50-126.us-east-2.compute.internal                       Pending
    csr-c57lv   5m26s   system:node:ip-10-0-95-157.us-east-2.compute.internal                       Pending
    ...

  5. If the remaining CSRs are not approved, and are in the Pending status, approve the CSRs for your cluster machines:

    • To approve them individually, run the following command for each valid CSR:

      $ oc adm certificate approve <csr_name> 1
      1
      <csr_name> is the name of a CSR from the list of current CSRs.
    • To approve all pending CSRs, run the following command:

      $ oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs oc adm certificate approve
  6. After all client and server CSRs have been approved, the machines have the Ready status. Verify this by running the following command:

    $ oc get nodes

    Example output

    NAME      STATUS    ROLES   AGE  VERSION
    master-0  Ready     master  73m  v1.28.5
    master-1  Ready     master  73m  v1.28.5
    master-2  Ready     master  74m  v1.28.5
    worker-0  Ready     worker  11m  v1.28.5
    worker-1  Ready     worker  11m  v1.28.5

    Note

    It can take a few minutes after approval of the server CSRs for the machines to transition to the Ready status.

Additional information

4.6. Creating a cluster with multi-architecture compute machines on IBM Z and IBM LinuxONE with z/VM

To create a cluster with multi-architecture compute machines on IBM Z® and IBM® LinuxONE (s390x) with z/VM, you must have an existing single-architecture x86_64 cluster. You can then add s390x compute machines to your OpenShift Container Platform cluster.

Before you can add s390x nodes to your cluster, you must upgrade your cluster to one that uses the multi-architecture payload. For more information on migrating to the multi-architecture payload, see Migrating to a cluster with multi-architecture compute machines.

The following procedures explain how to create a RHCOS compute machine using a z/VM instance. This will allow you to add s390x nodes to your cluster and deploy a cluster with multi-architecture compute machines.

Note

To create an IBM Z® or IBM® LinuxONE (s390x) cluster with multi-architecture compute machines on x86_64, follow the instructions for Installing a cluster on IBM Z® and IBM® LinuxONE. You can then add x86_64 compute machines as described in Creating a cluster with multi-architecture compute machines on bare metal, IBM Power, or IBM Z.

4.6.1. Verifying cluster compatibility

Before you can start adding compute nodes of different architectures to your cluster, you must verify that your cluster is multi-architecture compatible.

Prerequisites

  • You installed the OpenShift CLI (oc).

Procedure

  • You can check whether your cluster uses the multi-architecture payload by running the following command:

    $ oc adm release info -o jsonpath="{ .metadata.metadata}"

Verification

  1. If you see the following output, then your cluster is using the multi-architecture payload:

    {
     "release.openshift.io/architecture": "multi",
     "url": "https://access.redhat.com/errata/<errata_version>"
    }

    You can then begin adding multi-arch compute nodes to your cluster.

  2. If you see the following output, then your cluster is not using the multi-architecture payload:

    {
     "url": "https://access.redhat.com/errata/<errata_version>"
    }
    Important

    To migrate your cluster so the cluster supports multi-architecture compute machines, follow the procedure in Migrating to a cluster with multi-architecture compute machines.

4.6.2. Creating RHCOS machines on IBM Z with z/VM

You can create more Red Hat Enterprise Linux CoreOS (RHCOS) compute machines running on IBM Z® with z/VM and attach them to your existing cluster.

Prerequisites

  • You have a domain name server (DNS) that can perform hostname and reverse lookup for the nodes.
  • You have an HTTP or HTTPS server running on your provisioning machine that is accessible to the machines you create.

Procedure

  1. Disable UDP aggregation.

    Currently, UDP aggregation is not supported on IBM Z® and is not automatically deactivated on multi-architecture compute clusters with an x86_64 control plane and additional s390x compute machines. To ensure that the additional compute nodes are added to the cluster correctly, you must manually disable UDP aggregation.

    1. Create a YAML file udp-aggregation-config.yaml with the following content:

      apiVersion: v1
      kind: ConfigMap
      data:
        disable-udp-aggregation: "true"
      metadata:
        name: udp-aggregation-config
        namespace: openshift-network-operator
    2. Create the ConfigMap resource by running the following command:

      $ oc create -f udp-aggregation-config.yaml
  2. Extract the Ignition config file from the cluster by running the following command:

    $ oc extract -n openshift-machine-api secret/worker-user-data-managed --keys=userData --to=- > worker.ign
  3. Upload the worker.ign Ignition config file you exported from your cluster to your HTTP server. Note the URL of this file.
  4. You can validate that the Ignition file is available on the URL. The following example gets the Ignition config file for the compute node:

    $ curl -k http://<HTTP_server>/worker.ign
  5. Download the RHCOS live kernel, initramfs, and rootfs files by running the following commands:

    $ curl -LO $(oc -n openshift-machine-config-operator get configmap/coreos-bootimages -o jsonpath='{.data.stream}' \
    | jq -r '.architectures.s390x.artifacts.metal.formats.pxe.kernel.location')
    $ curl -LO $(oc -n openshift-machine-config-operator get configmap/coreos-bootimages -o jsonpath='{.data.stream}' \
    | jq -r '.architectures.s390x.artifacts.metal.formats.pxe.initramfs.location')
    $ curl -LO $(oc -n openshift-machine-config-operator get configmap/coreos-bootimages -o jsonpath='{.data.stream}' \
    | jq -r '.architectures.s390x.artifacts.metal.formats.pxe.rootfs.location')
  6. Move the downloaded RHCOS live kernel, initramfs, and rootfs files to an HTTP or HTTPS server that is accessible from the z/VM guest you want to add.
  7. Create a parameter file for the z/VM guest. The following parameters are specific for the virtual machine:

    • Optional: To specify a static IP address, add an ip= parameter with the following entries, with each separated by a colon:

      1. The IP address for the machine.
      2. An empty string.
      3. The gateway.
      4. The netmask.
      5. The machine host and domain name in the form hostname.domainname. Omit this value to let RHCOS decide.
      6. The network interface name. Omit this value to let RHCOS decide.
      7. The value none.
    • For coreos.inst.ignition_url=, specify the URL to the worker.ign file. Only HTTP and HTTPS protocols are supported.
    • For coreos.live.rootfs_url=, specify the matching rootfs artifact for the kernel and initramfs you are booting. Only HTTP and HTTPS protocols are supported.
    • For installations on DASD-type disks, complete the following tasks:

      1. For coreos.inst.install_dev=, specify /dev/dasda.
      2. Use rd.dasd= to specify the DASD where RHCOS is to be installed.
      3. Leave all other parameters unchanged.

        The following is an example parameter file, additional-worker-dasd.parm:

        rd.neednet=1 \
        console=ttysclp0 \
        coreos.inst.install_dev=/dev/dasda \
        coreos.live.rootfs_url=http://cl1.provide.example.com:8080/assets/rhcos-live-rootfs.s390x.img \
        coreos.inst.ignition_url=http://cl1.provide.example.com:8080/ignition/worker.ign \
        ip=172.18.78.2::172.18.78.1:255.255.255.0:::none nameserver=172.18.78.1 \
        rd.znet=qeth,0.0.bdf0,0.0.bdf1,0.0.bdf2,layer2=1,portno=0 \
        zfcp.allow_lun_scan=0 \
        rd.dasd=0.0.3490

        Write all options in the parameter file as a single line and make sure that you have no newline characters.

    • For installations on FCP-type disks, complete the following tasks:

      1. Use rd.zfcp=<adapter>,<wwpn>,<lun> to specify the FCP disk where RHCOS is to be installed. For multipathing, repeat this step for each additional path.

        Note

        When you install with multiple paths, you must enable multipathing directly after the installation, not at a later point in time, because enabling it later can cause problems.

      2. Set the install device as: coreos.inst.install_dev=/dev/sda.

        Note

        If additional LUNs are configured with NPIV, FCP requires zfcp.allow_lun_scan=0. If you must enable zfcp.allow_lun_scan=1 because you use a CSI driver, for example, you must configure your NPIV so that each node cannot access the boot partition of another node.

      3. Leave all other parameters unchanged.

        Important

        Additional postinstallation steps are required to fully enable multipathing. For more information, see "Enabling multipathing with kernel arguments on RHCOS" in Postinstallation machine configuration tasks.

        The following is an example parameter file, additional-worker-fcp.parm for a worker node with multipathing:

        rd.neednet=1 \
        console=ttysclp0 \
        coreos.inst.install_dev=/dev/sda \
        coreos.live.rootfs_url=http://cl1.provide.example.com:8080/assets/rhcos-live-rootfs.s390x.img \
        coreos.inst.ignition_url=http://cl1.provide.example.com:8080/ignition/worker.ign \
        ip=172.18.78.2::172.18.78.1:255.255.255.0:::none nameserver=172.18.78.1 \
        rd.znet=qeth,0.0.bdf0,0.0.bdf1,0.0.bdf2,layer2=1,portno=0 \
        zfcp.allow_lun_scan=0 \
        rd.zfcp=0.0.1987,0x50050763070bc5e3,0x4008400B00000000 \
        rd.zfcp=0.0.19C7,0x50050763070bc5e3,0x4008400B00000000 \
        rd.zfcp=0.0.1987,0x50050763071bc5e3,0x4008400B00000000 \
        rd.zfcp=0.0.19C7,0x50050763071bc5e3,0x4008400B00000000

        Write all options in the parameter file as a single line and make sure that you have no newline characters.

  8. Transfer the initramfs, kernel, parameter files, and RHCOS images to z/VM, for example, by using FTP. For details about how to transfer the files with FTP and boot from the virtual reader, see Installing under z/VM.
  9. Punch the files to the virtual reader of the z/VM guest virtual machine.

    See PUNCH in IBM® Documentation.

    Tip

    You can use the CP PUNCH command or, if you use Linux, the vmur command to transfer files between two z/VM guest virtual machines.

  10. Log in to CMS on the z/VM guest virtual machine for the new compute machine.
  11. IPL the machine from the reader by running the following command:

    $ ipl c

    See IPL in IBM® Documentation.
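
After the machine finishes installing RHCOS and reboots from disk, you can watch for the new node to register with the cluster; it does not become Ready until you approve its certificate signing requests, as described in the next section:

    $ oc get nodes --watch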

4.6.3. Approving the certificate signing requests for your machines

When you add machines to a cluster, two pending certificate signing requests (CSRs) are generated for each machine that you added. You must confirm that these CSRs are approved or, if necessary, approve them yourself. The client requests must be approved first, followed by the server requests.

Prerequisites

  • You added machines to your cluster.

Procedure

  1. Confirm that the cluster recognizes the machines:

    $ oc get nodes

    Example output

    NAME      STATUS    ROLES   AGE  VERSION
    master-0  Ready     master  63m  v1.28.5
    master-1  Ready     master  63m  v1.28.5
    master-2  Ready     master  64m  v1.28.5

    The output lists all of the machines that you created.

    Note

    The preceding output might not include the compute nodes, also known as worker nodes, until some CSRs are approved.

  2. Review the pending CSRs and ensure that you see the client requests with the Pending or Approved status for each machine that you added to the cluster:

    $ oc get csr

    Example output

    NAME        AGE     REQUESTOR                                                                   CONDITION
    csr-8b2br   15m     system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Pending
    csr-8vnps   15m     system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Pending
    ...

    In this example, two machines are joining the cluster. You might see more approved CSRs in the list.

  3. If the CSRs were not approved, after all of the pending CSRs for the machines you added are in Pending status, approve the CSRs for your cluster machines:

    Note

    Because the CSRs rotate automatically, approve your CSRs within an hour of adding the machines to the cluster. If you do not approve them within an hour, the certificates will rotate, and more than two certificates will be present for each node. You must approve all of these certificates. After the client CSR is approved, the Kubelet creates a secondary CSR for the serving certificate, which requires manual approval. Then, subsequent serving certificate renewal requests are automatically approved by the machine-approver if the Kubelet requests a new certificate with identical parameters.

    Note

    For clusters running on platforms that are not machine API enabled, such as bare metal and other user-provisioned infrastructure, you must implement a method of automatically approving the kubelet serving certificate requests (CSRs). If a request is not approved, then the oc exec, oc rsh, and oc logs commands cannot succeed, because a serving certificate is required when the API server connects to the kubelet. Any operation that contacts the Kubelet endpoint requires this certificate approval to be in place. The method must watch for new CSRs, confirm that the CSR was submitted by the node-bootstrapper service account in the system:node or system:admin groups, and confirm the identity of the node.

    • To approve them individually, run the following command for each valid CSR:

      $ oc adm certificate approve <csr_name> 1
      1
      <csr_name> is the name of a CSR from the list of current CSRs.
    • To approve all pending CSRs, run the following command:

      $ oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs --no-run-if-empty oc adm certificate approve
      Note

      Some Operators might not become available until some CSRs are approved.

  4. Now that your client requests are approved, you must review the server requests for each machine that you added to the cluster:

    $ oc get csr

    Example output

    NAME        AGE     REQUESTOR                                                                   CONDITION
    csr-bfd72   5m26s   system:node:ip-10-0-50-126.us-east-2.compute.internal                       Pending
    csr-c57lv   5m26s   system:node:ip-10-0-95-157.us-east-2.compute.internal                       Pending
    ...

  5. If the remaining CSRs are not approved, and are in the Pending status, approve the CSRs for your cluster machines:

    • To approve them individually, run the following command for each valid CSR:

      $ oc adm certificate approve <csr_name> 1
      1
      <csr_name> is the name of a CSR from the list of current CSRs.
    • To approve all pending CSRs, run the following command:

      $ oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs oc adm certificate approve
  6. After all client and server CSRs have been approved, the machines have the Ready status. Verify this by running the following command:

    $ oc get nodes

    Example output

    NAME      STATUS    ROLES   AGE  VERSION
    master-0  Ready     master  73m  v1.28.5
    master-1  Ready     master  73m  v1.28.5
    master-2  Ready     master  74m  v1.28.5
    worker-0  Ready     worker  11m  v1.28.5
    worker-1  Ready     worker  11m  v1.28.5

    Note

    It can take a few minutes after approval of the server CSRs for the machines to transition to the Ready status.
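
For clusters that are not machine API enabled, the earlier note in this procedure requires a method that automatically approves kubelet serving CSRs. The following is a minimal sketch of such a loop, assuming that oc is authenticated with cluster-admin privileges. It is illustrative only; a production method should also confirm the identity of each requesting node before approving its serving CSR:

$ while true; do
    # List pending CSRs together with the requesting user, then approve only those
    # submitted by the node-bootstrapper service account (client CSRs) or by a node
    # identity (kubelet serving CSRs).
    oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}} {{.spec.username}}{{"\n"}}{{end}}{{end}}' \
      | awk '$2 == "system:serviceaccount:openshift-machine-config-operator:node-bootstrapper" || $2 ~ /^system:node:/ {print $1}' \
      | xargs --no-run-if-empty oc adm certificate approve
    sleep 60
  done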


4.7. Creating a cluster with multi-architecture compute machines on IBM Z and IBM LinuxONE with RHEL KVM

To create a cluster with multi-architecture compute machines on IBM Z® and IBM® LinuxONE (s390x) with RHEL KVM, you must have an existing single-architecture x86_64 cluster. You can then add s390x compute machines to your OpenShift Container Platform cluster.

Before you can add s390x nodes to your cluster, you must upgrade your cluster to one that uses the multi-architecture payload. For more information on migrating to the multi-architecture payload, see Migrating to a cluster with multi-architecture compute machines.

The following procedures explain how to create a RHCOS compute machine using a RHEL KVM instance. This will allow you to add s390x nodes to your cluster and deploy a cluster with multi-architecture compute machines.

Note

To create an IBM Z® or IBM® LinuxONE (s390x) cluster with multi-architecture compute machines on x86_64, follow the instructions for Installing a cluster on IBM Z® and IBM® LinuxONE. You can then add x86_64 compute machines as described in Creating a cluster with multi-architecture compute machines on bare metal, IBM Power, or IBM Z.

4.7.1. Verifying cluster compatibility

Before you can start adding compute nodes of different architectures to your cluster, you must verify that your cluster is multi-architecture compatible.

Prerequisites

  • You installed the OpenShift CLI (oc).

Procedure

  • You can check whether your cluster uses the multi-architecture payload by running the following command:

    $ oc adm release info -o jsonpath="{ .metadata.metadata}"

Verification

  1. If you see the following output, then your cluster is using the multi-architecture payload:

    {
     "release.openshift.io/architecture": "multi",
     "url": "https://access.redhat.com/errata/<errata_version>"
    }

    You can then begin adding multi-arch compute nodes to your cluster.

  2. If you see the following output, then your cluster is not using the multi-architecture payload:

    {
     "url": "https://access.redhat.com/errata/<errata_version>"
    }
    Important

    To migrate your cluster so the cluster supports multi-architecture compute machines, follow the procedure in Migrating to a cluster with multi-architecture compute machines.
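
If you prefer a scripted check, the following sketch tests the same command output for the multi value; the grep pattern assumes the output format shown above:

$ oc adm release info -o jsonpath="{ .metadata.metadata}" \
  | grep -q '"release.openshift.io/architecture": "multi"' \
  && echo "multi-architecture payload in use" \
  || echo "multi-architecture payload not in use"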

4.7.2. Creating RHCOS machines using virt-install

You can create more Red Hat Enterprise Linux CoreOS (RHCOS) compute machines for your cluster by using virt-install.

Prerequisites

  • You have at least one LPAR running on RHEL 8.7 or later with KVM, referred to as RHEL KVM host in this procedure.
  • The KVM/QEMU hypervisor is installed on the RHEL KVM host.
  • You have a domain name server (DNS) that can perform hostname and reverse lookup for the nodes.
  • An HTTP or HTTPS server is set up.

Procedure

  1. Disable UDP aggregation.

    Currently, UDP aggregation is not supported on IBM Z® and is not automatically deactivated on multi-architecture compute clusters with an x86_64 control plane and additional s390x compute machines. To ensure that the additional compute nodes are added to the cluster correctly, you must manually disable UDP aggregation.

    1. Create a YAML file udp-aggregation-config.yaml with the following content:

      apiVersion: v1
      kind: ConfigMap
      data:
        disable-udp-aggregation: "true"
      metadata:
        name: udp-aggregation-config
        namespace: openshift-network-operator
    2. Create the ConfigMap resource by running the following command:

      $ oc create -f udp-aggregation-config.yaml
  2. Extract the Ignition config file from the cluster by running the following command:

    $ oc extract -n openshift-machine-api secret/worker-user-data-managed --keys=userData --to=- > worker.ign
  3. Upload the worker.ign Ignition config file you exported from your cluster to your HTTP server. Note the URL of this file.
  4. You can validate that the Ignition file is available on the URL. The following example gets the Ignition config file for the compute node:

    $ curl -k http://<HTTP_server>/worker.ign
  5. Download the RHEL live kernel, initramfs, and rootfs files by running the following commands:

    $ curl -LO $(oc -n openshift-machine-config-operator get configmap/coreos-bootimages -o jsonpath='{.data.stream}' \
    | jq -r '.architectures.s390x.artifacts.metal.formats.pxe.kernel.location')
    $ curl -LO $(oc -n openshift-machine-config-operator get configmap/coreos-bootimages -o jsonpath='{.data.stream}' \
    | jq -r '.architectures.s390x.artifacts.metal.formats.pxe.initramfs.location')
    $ curl -LO $(oc -n openshift-machine-config-operator get configmap/coreos-bootimages -o jsonpath='{.data.stream}' \
    | jq -r '.architectures.s390x.artifacts.metal.formats.pxe.rootfs.location')
  6. Move the downloaded RHEL live kernel, initramfs and rootfs files to an HTTP or HTTPS server before you launch virt-install.
  7. Create the new KVM guest nodes using the RHEL kernel, initramfs, and Ignition files; the new disk image; and adjusted parm line arguments.

    $ virt-install \
       --connect qemu:///system \
       --name <vm_name> \
       --autostart \
       --os-variant rhel9.2 \ 1
       --cpu host \
       --vcpus <vcpus> \
       --memory <memory_mb> \
       --disk <vm_name>.qcow2,size=<image_size> \
       --network network=<virt_network_parm> \
       --location <media_location>,kernel=<rhcos_kernel>,initrd=<rhcos_initrd> \ 2
       --extra-args "rd.neednet=1" \
       --extra-args "coreos.inst.install_dev=/dev/vda" \
       --extra-args "coreos.inst.ignition_url=<worker_ign>" \ 3
       --extra-args "coreos.live.rootfs_url=<rhcos_rootfs>" \ 4
       --extra-args "ip=<ip>::<default_gateway>:<subnet_mask_length>:<hostname>::none:<MTU>" \ 5
       --extra-args "nameserver=<dns>" \
       --extra-args "console=ttysclp0" \
       --noautoconsole \
       --wait
    1
    For os-variant, specify the RHEL version for the RHCOS compute machine. rhel9.2 is the recommended version. To query the supported RHEL version of your operating system, run the following command:
    $ osinfo-query os -f short-id
    Note

    The os-variant is case sensitive.

    2
    For --location, specify the location of the kernel/initrd on the HTTP or HTTPS server.
    3
    For coreos.inst.ignition_url=, specify the worker.ign Ignition file for the machine role. Only HTTP and HTTPS protocols are supported.
    4
    For coreos.live.rootfs_url=, specify the matching rootfs artifact for the kernel and initramfs you are booting. Only HTTP and HTTPS protocols are supported.
    5
    Optional: For hostname, specify the fully qualified hostname of the client machine.
    Note

    If you are using HAProxy as a load balancer, update your HAProxy rules for ingress-router-443 and ingress-router-80 in the /etc/haproxy/haproxy.cfg configuration file.

  8. Continue to create more compute machines for your cluster.
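
After you launch a guest with virt-install, you can optionally confirm from the RHEL KVM host that the guest is running and watch the RHCOS installation from its console. The following commands are a minimal sketch; <vm_name> is the name that you passed to virt-install:

$ virsh list --all           # the new guest should be listed as running
$ virsh console <vm_name>    # attach to the guest console; press Ctrl+] to detach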

4.7.3. Approving the certificate signing requests for your machines

When you add machines to a cluster, two pending certificate signing requests (CSRs) are generated for each machine that you added. You must confirm that these CSRs are approved or, if necessary, approve them yourself. The client requests must be approved first, followed by the server requests.

Prerequisites

  • You added machines to your cluster.

Procedure

  1. Confirm that the cluster recognizes the machines:

    $ oc get nodes

    Example output

    NAME      STATUS    ROLES   AGE  VERSION
    master-0  Ready     master  63m  v1.28.5
    master-1  Ready     master  63m  v1.28.5
    master-2  Ready     master  64m  v1.28.5

    The output lists all of the machines that you created.

    Note

    The preceding output might not include the compute nodes, also known as worker nodes, until some CSRs are approved.

  2. Review the pending CSRs and ensure that you see the client requests with the Pending or Approved status for each machine that you added to the cluster:

    $ oc get csr

    Example output

    NAME        AGE     REQUESTOR                                                                   CONDITION
    csr-8b2br   15m     system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Pending
    csr-8vnps   15m     system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Pending
    ...

    In this example, two machines are joining the cluster. You might see more approved CSRs in the list.

  3. If the CSRs were not approved, after all of the pending CSRs for the machines you added are in Pending status, approve the CSRs for your cluster machines:

    Note

    Because the CSRs rotate automatically, approve your CSRs within an hour of adding the machines to the cluster. If you do not approve them within an hour, the certificates will rotate, and more than two certificates will be present for each node. You must approve all of these certificates. After the client CSR is approved, the Kubelet creates a secondary CSR for the serving certificate, which requires manual approval. Then, subsequent serving certificate renewal requests are automatically approved by the machine-approver if the Kubelet requests a new certificate with identical parameters.

    Note

    For clusters running on platforms that are not machine API enabled, such as bare metal and other user-provisioned infrastructure, you must implement a method of automatically approving the kubelet serving certificate requests (CSRs). If a request is not approved, then the oc exec, oc rsh, and oc logs commands cannot succeed, because a serving certificate is required when the API server connects to the kubelet. Any operation that contacts the Kubelet endpoint requires this certificate approval to be in place. The method must watch for new CSRs, confirm that the CSR was submitted by the node-bootstrapper service account in the system:node or system:admin groups, and confirm the identity of the node.

    • To approve them individually, run the following command for each valid CSR:

      $ oc adm certificate approve <csr_name> 1
      1
      <csr_name> is the name of a CSR from the list of current CSRs.
    • To approve all pending CSRs, run the following command:

      $ oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs --no-run-if-empty oc adm certificate approve
      Note

      Some Operators might not become available until some CSRs are approved.

  4. Now that your client requests are approved, you must review the server requests for each machine that you added to the cluster:

    $ oc get csr

    Example output

    NAME        AGE     REQUESTOR                                                                   CONDITION
    csr-bfd72   5m26s   system:node:ip-10-0-50-126.us-east-2.compute.internal                       Pending
    csr-c57lv   5m26s   system:node:ip-10-0-95-157.us-east-2.compute.internal                       Pending
    ...

  5. If the remaining CSRs are not approved, and are in the Pending status, approve the CSRs for your cluster machines:

    • To approve them individually, run the following command for each valid CSR:

      $ oc adm certificate approve <csr_name> 1
      1
      <csr_name> is the name of a CSR from the list of current CSRs.
    • To approve all pending CSRs, run the following command:

      $ oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs oc adm certificate approve
  6. After all client and server CSRs have been approved, the machines have the Ready status. Verify this by running the following command:

    $ oc get nodes

    Example output

    NAME      STATUS    ROLES   AGE  VERSION
    master-0  Ready     master  73m  v1.28.5
    master-1  Ready     master  73m  v1.28.5
    master-2  Ready     master  74m  v1.28.5
    worker-0  Ready     worker  11m  v1.28.5
    worker-1  Ready     worker  11m  v1.28.5

    Note

    It can take a few minutes after approval of the server CSRs for the machines to transition to the Ready status.


4.8. Creating a cluster with multi-architecture compute machines on IBM Power

To create a cluster with multi-architecture compute machines on IBM Power® (ppc64le), you must have an existing single-architecture (x86_64) cluster. You can then add ppc64le compute machines to your OpenShift Container Platform cluster.

Important

Before you can add ppc64le nodes to your cluster, you must upgrade your cluster to one that uses the multi-architecture payload. For more information on migrating to the multi-architecture payload, see Migrating to a cluster with multi-architecture compute machines.

The following procedures explain how to create a RHCOS compute machine using an ISO image or network PXE booting. This will allow you to add ppc64le nodes to your cluster and deploy a cluster with multi-architecture compute machines.

Note

To create an IBM Power® (ppc64le) cluster with multi-architecture compute machines on x86_64, follow the instructions for Installing a cluster on IBM Power®. You can then add x86_64 compute machines as described in Creating a cluster with multi-architecture compute machines on bare metal, IBM Power, or IBM Z.

4.8.1. Verifying cluster compatibility

Before you can start adding compute nodes of different architectures to your cluster, you must verify that your cluster is multi-architecture compatible.

Prerequisites

  • You installed the OpenShift CLI (oc).

Note

When using multiple architectures, hosts for OpenShift Container Platform nodes must share the same storage layer. If they do not have the same storage layer, use a storage provider such as nfs-provisioner.

Note

You should limit the number of network hops between the compute and control plane as much as possible.

Procedure

  • You can check whether your cluster uses the multi-architecture payload by running the following command:

    $ oc adm release info -o jsonpath="{ .metadata.metadata}"

Verification

  1. If you see the following output, then your cluster is using the multi-architecture payload:

    {
     "release.openshift.io/architecture": "multi",
     "url": "https://access.redhat.com/errata/<errata_version>"
    }

    You can then begin adding multi-arch compute nodes to your cluster.

  2. If you see the following output, then your cluster is not using the multi-architecture payload:

    {
     "url": "https://access.redhat.com/errata/<errata_version>"
    }
    Important

    To migrate your cluster so the cluster supports multi-architecture compute machines, follow the procedure in Migrating to a cluster with multi-architecture compute machines.

4.8.2. Creating RHCOS machines using an ISO image

You can create more Red Hat Enterprise Linux CoreOS (RHCOS) compute machines for your cluster by using an ISO image to create the machines.

Prerequisites

  • Obtain the URL of the Ignition config file for the compute machines for your cluster. You uploaded this file to your HTTP server during installation.
  • You must have the OpenShift CLI (oc) installed.

Procedure

  1. Extract the Ignition config file from the cluster by running the following command:

    $ oc extract -n openshift-machine-api secret/worker-user-data-managed --keys=userData --to=- > worker.ign
  2. Upload the worker.ign Ignition config file you exported from your cluster to your HTTP server. Note the URL of this file.
  3. You can validate that the Ignition config file is available on the URL. The following example gets the Ignition config file for the compute node:

    $ curl -k http://<HTTP_server>/worker.ign
  4. You can access the ISO image for booting your new machine by running the following command:

    RHCOS_VHD_ORIGIN_URL=$(oc -n openshift-machine-config-operator get configmap/coreos-bootimages -o jsonpath='{.data.stream}' | jq -r '.architectures.<architecture>.artifacts.metal.formats.iso.disk.location')
  5. Use the ISO file to install RHCOS on more compute machines. Use the same method that you used when you created machines before you installed the cluster:

    • Burn the ISO image to a disk and boot it directly.
    • Use ISO redirection with a LOM interface.
  6. Boot the RHCOS ISO image without specifying any options or interrupting the live boot sequence. Wait for the installer to boot into a shell prompt in the RHCOS live environment.

    Note

    You can interrupt the RHCOS installation boot process to add kernel arguments. However, for this ISO procedure you must use the coreos-installer command as outlined in the following steps, instead of adding kernel arguments.

  7. Run the coreos-installer command and specify the options that meet your installation requirements. At a minimum, you must specify the URL that points to the Ignition config file for the node type, and the device that you are installing to:

    $ sudo coreos-installer install --ignition-url=http://<HTTP_server>/<node_type>.ign <device> --ignition-hash=sha512-<digest> 1 2
    1
    You must run the coreos-installer command by using sudo, because the core user does not have the required root privileges to perform the installation.
    2
    The --ignition-hash option is required when the Ignition config file is obtained through an HTTP URL to validate the authenticity of the Ignition config file on the cluster node. <digest> is the SHA512 digest of the Ignition config file; for one way to compute it, see the sketch after this procedure.
    Note

    If you want to provide your Ignition config files through an HTTPS server that uses TLS, you can add the internal certificate authority (CA) to the system trust store before running coreos-installer.

    The following example initializes a bootstrap node installation to the /dev/sda device. The Ignition config file for the bootstrap node is obtained from an HTTP web server with the IP address 192.168.1.2:

    $ sudo coreos-installer install --ignition-url=http://192.168.1.2:80/installation_directory/bootstrap.ign /dev/sda --ignition-hash=sha512-a5a2d43879223273c9b60af66b44202a1d1248fc01cf156c46d4a79f552b6bad47bc8cc78ddf0116e80c59d2ea9e32ba53bc807afbca581aa059311def2c3e3b
  8. Monitor the progress of the RHCOS installation on the console of the machine.

    Important

    Ensure that the installation is successful on each node before commencing with the OpenShift Container Platform installation. Observing the installation process can also help to determine the cause of RHCOS installation issues that might arise.

  9. Continue to create more compute machines for your cluster.
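
The --ignition-hash value in the preceding procedure is the SHA512 digest of the Ignition config file. This section does not show a step that produces the digest; as a sketch, one way to compute it on the machine that hosts worker.ign is:

$ sha512sum worker.ign

Use the first field of the output as <digest> in --ignition-hash=sha512-<digest>.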

4.8.3. Creating RHCOS machines by PXE or iPXE booting

You can create more Red Hat Enterprise Linux CoreOS (RHCOS) compute machines for your bare metal cluster by using PXE or iPXE booting.

Prerequisites

  • Obtain the URL of the Ignition config file for the compute machines for your cluster. You uploaded this file to your HTTP server during installation.
  • Obtain the URLs of the RHCOS ISO image, compressed metal BIOS, kernel, and initramfs files that you uploaded to your HTTP server during cluster installation.
  • You have access to the PXE booting infrastructure that you used to create the machines for your OpenShift Container Platform cluster during installation. The machines must boot from their local disks after RHCOS is installed on them.
  • If you use UEFI, you have access to the grub.conf file that you modified during OpenShift Container Platform installation.

Procedure

  1. Confirm that your PXE or iPXE installation for the RHCOS images is correct.

    • For PXE:

      DEFAULT pxeboot
      TIMEOUT 20
      PROMPT 0
      LABEL pxeboot
          KERNEL http://<HTTP_server>/rhcos-<version>-live-kernel-<architecture> 1
          APPEND initrd=http://<HTTP_server>/rhcos-<version>-live-initramfs.<architecture>.img coreos.inst.install_dev=/dev/sda coreos.inst.ignition_url=http://<HTTP_server>/worker.ign coreos.live.rootfs_url=http://<HTTP_server>/rhcos-<version>-live-rootfs.<architecture>.img 2
      1
      Specify the location of the live kernel file that you uploaded to your HTTP server.
      2
      Specify locations of the RHCOS files that you uploaded to your HTTP server. The initrd parameter value is the location of the live initramfs file, the coreos.inst.ignition_url parameter value is the location of the worker Ignition config file, and the coreos.live.rootfs_url parameter value is the location of the live rootfs file. The coreos.inst.ignition_url and coreos.live.rootfs_url parameters only support HTTP and HTTPS.
      Note

      This configuration does not enable serial console access on machines with a graphical console. To configure a different console, add one or more console= arguments to the APPEND line. For example, add console=tty0 console=ttyS0 to set the first PC serial port as the primary console and the graphical console as a secondary console. For more information, see How does one set up a serial terminal and/or console in Red Hat Enterprise Linux?.

    • For iPXE (x86_64 + ppc64le):

      kernel http://<HTTP_server>/rhcos-<version>-live-kernel-<architecture> initrd=main coreos.live.rootfs_url=http://<HTTP_server>/rhcos-<version>-live-rootfs.<architecture>.img coreos.inst.install_dev=/dev/sda coreos.inst.ignition_url=http://<HTTP_server>/worker.ign 1 2
      initrd --name main http://<HTTP_server>/rhcos-<version>-live-initramfs.<architecture>.img 3
      boot
      1
      Specify the locations of the RHCOS files that you uploaded to your HTTP server. The kernel parameter value is the location of the kernel file, the initrd=main argument is needed for booting on UEFI systems, the coreos.live.rootfs_url parameter value is the location of the rootfs file, and the coreos.inst.ignition_url parameter value is the location of the worker Ignition config file.
      2
      If you use multiple NICs, specify a single interface in the ip option. For example, to use DHCP on a NIC that is named eno1, set ip=eno1:dhcp.
      3
      Specify the location of the initramfs file that you uploaded to your HTTP server.
      Note

      This configuration does not enable serial console access on machines with a graphical console. To configure a different console, add one or more console= arguments to the kernel line. For example, add console=tty0 console=ttyS0 to set the first PC serial port as the primary console and the graphical console as a secondary console. For more information, see How does one set up a serial terminal and/or console in Red Hat Enterprise Linux? and "Enabling the serial console for PXE and ISO installation" in the "Advanced RHCOS installation configuration" section.

      Note

      To network boot the CoreOS kernel on ppc64le architecture, you need to use a version of iPXE built with the IMAGE_GZIP option enabled. See IMAGE_GZIP option in iPXE.

    • For PXE (with UEFI and GRUB as second stage) on ppc64le:

      menuentry 'Install CoreOS' {
          linux rhcos-<version>-live-kernel-<architecture>  coreos.live.rootfs_url=http://<HTTP_server>/rhcos-<version>-live-rootfs.<architecture>.img coreos.inst.install_dev=/dev/sda coreos.inst.ignition_url=http://<HTTP_server>/worker.ign 1 2
          initrd rhcos-<version>-live-initramfs.<architecture>.img 3
      }
      1
      Specify the locations of the RHCOS files that you uploaded to your HTTP/TFTP server. The kernel parameter value is the location of the kernel file on your TFTP server. The coreos.live.rootfs_url parameter value is the location of the rootfs file, and the coreos.inst.ignition_url parameter value is the location of the worker Ignition config file on your HTTP server.
      2
      If you use multiple NICs, specify a single interface in the ip option. For example, to use DHCP on a NIC that is named eno1, set ip=eno1:dhcp.
      3
      Specify the location of the initramfs file that you uploaded to your TFTP server.
  2. Use the PXE or iPXE infrastructure to create the required compute machines for your cluster.
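
As an optional sanity check before you boot the machines, you can confirm that the HTTP server serves the RHCOS artifacts that the preceding configurations reference. The following sketch assumes the same placeholder file names that are used above:

$ curl -I http://<HTTP_server>/rhcos-<version>-live-kernel-<architecture>
$ curl -I http://<HTTP_server>/rhcos-<version>-live-initramfs.<architecture>.img
$ curl -I http://<HTTP_server>/rhcos-<version>-live-rootfs.<architecture>.img

Each request should return an HTTP 200 OK status.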

4.8.4. Approving the certificate signing requests for your machines

When you add machines to a cluster, two pending certificate signing requests (CSRs) are generated for each machine that you added. You must confirm that these CSRs are approved or, if necessary, approve them yourself. The client requests must be approved first, followed by the server requests.

Prerequisites

  • You added machines to your cluster.

Procedure

  1. Confirm that the cluster recognizes the machines:

    $ oc get nodes

    Example output

    NAME      STATUS    ROLES   AGE  VERSION
    master-0  Ready     master  63m  v1.28.5
    master-1  Ready     master  63m  v1.28.5
    master-2  Ready     master  64m  v1.28.5

    The output lists all of the machines that you created.

    Note

    The preceding output might not include the compute nodes, also known as worker nodes, until some CSRs are approved.

  2. Review the pending CSRs and ensure that you see the client requests with the Pending or Approved status for each machine that you added to the cluster:

    $ oc get csr

    Example output

    NAME        AGE     REQUESTOR                                                                   CONDITION
    csr-8b2br   15m     system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Pending
    csr-8vnps   15m     system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Pending
    ...

    In this example, two machines are joining the cluster. You might see more approved CSRs in the list.

  3. If the CSRs were not approved, after all of the pending CSRs for the machines you added are in Pending status, approve the CSRs for your cluster machines:

    Note

    Because the CSRs rotate automatically, approve your CSRs within an hour of adding the machines to the cluster. If you do not approve them within an hour, the certificates will rotate, and more than two certificates will be present for each node. You must approve all of these certificates. After the client CSR is approved, the Kubelet creates a secondary CSR for the serving certificate, which requires manual approval. Then, subsequent serving certificate renewal requests are automatically approved by the machine-approver if the Kubelet requests a new certificate with identical parameters.

    Note

    For clusters running on platforms that are not machine API enabled, such as bare metal and other user-provisioned infrastructure, you must implement a method of automatically approving the kubelet serving certificate requests (CSRs). If a request is not approved, then the oc exec, oc rsh, and oc logs commands cannot succeed, because a serving certificate is required when the API server connects to the kubelet. Any operation that contacts the Kubelet endpoint requires this certificate approval to be in place. The method must watch for new CSRs, confirm that the CSR was submitted by the node-bootstrapper service account in the system:node or system:admin groups, and confirm the identity of the node.

    • To approve them individually, run the following command for each valid CSR:

      $ oc adm certificate approve <csr_name> 1
      1
      <csr_name> is the name of a CSR from the list of current CSRs.
    • To approve all pending CSRs, run the following command:

      $ oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs --no-run-if-empty oc adm certificate approve
      Note

      Some Operators might not become available until some CSRs are approved.

  4. Now that your client requests are approved, you must review the server requests for each machine that you added to the cluster:

    $ oc get csr

    Example output

    NAME        AGE     REQUESTOR                                                                   CONDITION
    csr-bfd72   5m26s   system:node:ip-10-0-50-126.us-east-2.compute.internal                       Pending
    csr-c57lv   5m26s   system:node:ip-10-0-95-157.us-east-2.compute.internal                       Pending
    ...

  5. If the remaining CSRs are not approved, and are in the Pending status, approve the CSRs for your cluster machines:

    • To approve them individually, run the following command for each valid CSR:

      $ oc adm certificate approve <csr_name> 1
      1
      <csr_name> is the name of a CSR from the list of current CSRs.
    • To approve all pending CSRs, run the following command:

      $ oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs oc adm certificate approve
  6. After all client and server CSRs have been approved, the machines have the Ready status. Verify this by running the following command:

    $ oc get nodes -o wide

    Example output

    NAME               STATUS   ROLES                  AGE   VERSION           INTERNAL-IP      EXTERNAL-IP   OS-IMAGE                                                       KERNEL-VERSION                  CONTAINER-RUNTIME
    worker-0-ppc64le   Ready    worker                 42d   v1.28.2+e3ba6d9   192.168.200.21   <none>        Red Hat Enterprise Linux CoreOS 415.92.202309261919-0 (Plow)   5.14.0-284.34.1.el9_2.ppc64le   cri-o://1.28.1-3.rhaos4.15.gitb36169e.el9
    worker-1-ppc64le   Ready    worker                 42d   v1.28.2+e3ba6d9   192.168.200.20   <none>        Red Hat Enterprise Linux CoreOS 415.92.202309261919-0 (Plow)   5.14.0-284.34.1.el9_2.ppc64le   cri-o://1.28.1-3.rhaos4.15.gitb36169e.el9
    master-0-x86       Ready    control-plane,master   75d   v1.28.2+e3ba6d9   10.248.0.38      10.248.0.38   Red Hat Enterprise Linux CoreOS 415.92.202309261919-0 (Plow)   5.14.0-284.34.1.el9_2.x86_64    cri-o://1.28.1-3.rhaos4.15.gitb36169e.el9
    master-1-x86       Ready    control-plane,master   75d   v1.28.2+e3ba6d9   10.248.0.39      10.248.0.39   Red Hat Enterprise Linux CoreOS 415.92.202309261919-0 (Plow)   5.14.0-284.34.1.el9_2.x86_64    cri-o://1.28.1-3.rhaos4.15.gitb36169e.el9
    master-2-x86       Ready    control-plane,master   75d   v1.28.2+e3ba6d9   10.248.0.40      10.248.0.40   Red Hat Enterprise Linux CoreOS 415.92.202309261919-0 (Plow)   5.14.0-284.34.1.el9_2.x86_64    cri-o://1.28.1-3.rhaos4.15.gitb36169e.el9
    worker-0-x86       Ready    worker                 75d   v1.28.2+e3ba6d9   10.248.0.43      10.248.0.43   Red Hat Enterprise Linux CoreOS 415.92.202309261919-0 (Plow)   5.14.0-284.34.1.el9_2.x86_64    cri-o://1.28.1-3.rhaos4.15.gitb36169e.el9
    worker-1-x86       Ready    worker                 75d   v1.28.2+e3ba6d9   10.248.0.44      10.248.0.44   Red Hat Enterprise Linux CoreOS 415.92.202309261919-0 (Plow)   5.14.0-284.34.1.el9_2.x86_64    cri-o://1.28.1-3.rhaos4.15.gitb36169e.el9

    Note

    It can take a few minutes after approval of the server CSRs for the machines to transition to the Ready status.


4.9. Managing your cluster with multi-architecture compute machines

4.9.1. Scheduling workloads on clusters with multi-architecture compute machines

Deploying a workload on a cluster with compute nodes of different architectures requires attention and monitoring of your cluster. There might be further actions that you need to take to successfully place pods on the nodes of your cluster.

For more detailed information on node affinity, scheduling, taints, and tolerations, see the related OpenShift Container Platform scheduling documentation.

4.9.1.1. Sample multi-architecture node workload deployments

Before you schedule workloads on a cluster with compute nodes of different architectures, consider the following use cases:

Using node affinity to schedule workloads on a node

To allow a workload to be scheduled only on nodes with architectures that are supported by its images, set the spec.affinity.nodeAffinity field in your pod’s template specification.

Example deployment with the nodeAffinity set to certain architectures

apiVersion: apps/v1
kind: Deployment
metadata: # ...
spec:
   # ...
  template:
     # ...
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.io/arch
                operator: In
                values: 1
                - amd64
                - arm64

1
Specify the supported architectures. Valid values include amd64, arm64, or both values.
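
After you apply a deployment with node affinity such as the preceding example, you can confirm where its pods were scheduled and which architecture each node reports. The following commands are a sketch; <namespace> is a placeholder for your own namespace:

$ oc get nodes -L kubernetes.io/arch
$ oc get pods -n <namespace> -o wide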
Tainting every node for a specific architecture

You can taint a node to prevent workloads that are not compatible with its architecture from being scheduled on that node. If your cluster uses a MachineSet object, you can add parameters to the .spec.template.spec.taints field to prevent workloads from being scheduled on nodes with an unsupported architecture.

  • Before you can taint a node, you must scale down the MachineSet object or remove available machines. You can scale down the machine set by using one of the following commands:

    $ oc scale --replicas=0 machineset <machineset> -n openshift-machine-api

    Or:

    $ oc edit machineset <machineset> -n openshift-machine-api

    For more information on scaling machine sets, see "Modifying a compute machine set".

Example MachineSet with a taint set

apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
metadata: # ...
spec:
  # ...
  template:
    # ...
    spec:
      # ...
      taints:
      - effect: NoSchedule
        key: multi-arch.openshift.io/arch
        value: arm64

You can also set a taint on a specific node by running the following command:

$ oc adm taint nodes <node-name> multi-arch.openshift.io/arch=arm64:NoSchedule
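
If you later need to remove a taint that you added this way, append a hyphen to the taint; for example:

$ oc adm taint nodes <node-name> multi-arch.openshift.io/arch=arm64:NoSchedule-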
Creating a default toleration

You can annotate a namespace so all of the workloads get the same default toleration by running the following command:

$ oc annotate namespace my-namespace \
  'scheduler.alpha.kubernetes.io/defaultTolerations'='[{"operator": "Exists", "effect": "NoSchedule", "key": "multi-arch.openshift.io/arch"}]'
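
You can verify that the annotation was applied by inspecting the namespace annotations, for example:

$ oc get namespace my-namespace -o jsonpath='{.metadata.annotations}'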
Tolerating architecture taints in workloads

If a node has a taint defined, workloads are not scheduled on that node. However, you can allow them to be scheduled by setting a toleration in the pod’s specification.

Example deployment with a toleration

apiVersion: apps/v1
kind: Deployment
metadata: # ...
spec:
  # ...
  template:
    # ...
    spec:
      tolerations:
      - key: "multi-arch.openshift.io/arch"
        value: "arm64"
        operator: "Equal"
        effect: "NoSchedule"

This example deployment is also allowed to be scheduled on nodes that have the multi-arch.openshift.io/arch=arm64 taint.

Using node affinity with taints and tolerations

When the scheduler computes the set of nodes to schedule a pod, tolerations can broaden the set while node affinity restricts the set. If you set a taint on the nodes of a specific architecture, a toleration such as the following example is required for scheduling pods on those nodes.

Example deployment with a node affinity and toleration set

apiVersion: apps/v1
kind: Deployment
metadata: # ...
spec:
  # ...
  template:
    # ...
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.io/arch
                operator: In
                values:
                - amd64
                - arm64
      tolerations:
      - key: "multi-arch.openshift.io/arch"
        value: "arm64"
        operator: "Equal"
        effect: "NoSchedule"


4.9.2. Enabling 64k pages on the Red Hat Enterprise Linux CoreOS (RHCOS) kernel

You can enable the 64k memory page size in the Red Hat Enterprise Linux CoreOS (RHCOS) kernel on the 64-bit ARM compute machines in your cluster. The 64k page size kernel setting can be used for large GPU or high-memory workloads. You apply the setting by using the Machine Config Operator (MCO), which uses a machine config pool to update the kernel. To enable 64k page sizes, you must dedicate a machine config pool to the 64-bit ARM compute nodes on which you enable the setting.

Important

Using 64k pages is exclusive to 64-bit ARM architecture compute nodes or clusters installed on 64-bit ARM machines. If you configure the 64k pages kernel on a machine config pool using 64-bit x86 machines, the machine config pool and MCO will degrade.

Prerequisites

  • You installed the OpenShift CLI (oc).
  • You created a cluster with compute nodes of different architecture on one of the supported platforms.

Procedure

  1. Label the nodes where you want to run the 64k page size kernel:

    $ oc label node <node_name> <label>

    Example command

    $ oc label node worker-arm64-01 node-role.kubernetes.io/worker-64k-pages=

  2. Create a machine config pool that contains the worker role that uses the ARM64 architecture and the worker-64k-pages role:

    apiVersion: machineconfiguration.openshift.io/v1
    kind: MachineConfigPool
    metadata:
      name: worker-64k-pages
    spec:
      machineConfigSelector:
        matchExpressions:
          - key: machineconfiguration.openshift.io/role
            operator: In
            values:
            - worker
            - worker-64k-pages
      nodeSelector:
        matchLabels:
          node-role.kubernetes.io/worker-64k-pages: ""
          kubernetes.io/arch: arm64
  3. Create a machine config for your compute nodes to enable 64k-pages with the 64k-pages parameter.

    $ oc create -f <filename>.yaml

    Example MachineConfig

    apiVersion: machineconfiguration.openshift.io/v1
    kind: MachineConfig
    metadata:
      labels:
        machineconfiguration.openshift.io/role: "worker-64k-pages" 1
      name: 99-worker-64kpages
    spec:
      kernelType: 64k-pages 2

    1
    Specify the value of the machineconfiguration.openshift.io/role label in the custom machine config pool. The example MachineConfig uses the worker-64k-pages label to enable 64k pages in the worker-64k-pages pool.
    2
    Specify your desired kernel type. Valid values are 64k-pages and default.
    Note

    The 64k-pages type is supported on only 64-bit ARM architecture based compute nodes. The realtime type is supported on only 64-bit x86 architecture based compute nodes.

Verification

  • To view your new worker-64k-pages machine config pool, run the following command:

    $ oc get mcp

    Example output

    NAME     CONFIG                                                                UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
    master   rendered-master-9d55ac9a91127c36314e1efe7d77fbf8                      True      False      False      3              3                   3                     0                      361d
    worker   rendered-worker-e7b61751c4a5b7ff995d64b967c421ff                      True      False      False      7              7                   7                     0                      361d
    worker-64k-pages  rendered-worker-64k-pages-e7b61751c4a5b7ff995d64b967c421ff   True      False      False      2              2                   2                     0                      35m
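
After the worker-64k-pages pool finishes updating, you can optionally confirm the active page size on one of the labeled nodes. The following sketch assumes that the getconf utility is available in the host file system:

$ oc debug node/<node_name> -- chroot /host getconf PAGESIZE

A value of 65536 indicates that the 64k page size kernel is active.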

4.9.3. Importing manifest lists in image streams on your multi-architecture compute machines

On an OpenShift Container Platform 4.15 cluster with multi-architecture compute machines, the image streams in the cluster do not import manifest lists automatically. You must manually change the default importMode option to the PreserveOriginal option in order to import the manifest list.

Prerequisites

  • You installed the OpenShift Container Platform CLI (oc).

Procedure

  • The following example command shows how to patch the ImageStream cli-artifacts so that the cli-artifacts:latest image stream tag is imported as a manifest list.

    $ oc patch is/cli-artifacts -n openshift -p '{"spec":{"tags":[{"name":"latest","importPolicy":{"importMode":"PreserveOriginal"}}]}}'

Verification

  • You can check that the manifest lists imported properly by inspecting the image stream tag. The following command lists the individual architecture manifests for a particular tag.

    $ oc get istag cli-artifacts:latest -n openshift -oyaml

    If the dockerImageManifests object is present, then the manifest list import was successful.

    Example output of the dockerImageManifests object

    dockerImageManifests:
      - architecture: amd64
        digest: sha256:16d4c96c52923a9968fbfa69425ec703aff711f1db822e4e9788bf5d2bee5d77
        manifestSize: 1252
        mediaType: application/vnd.docker.distribution.manifest.v2+json
        os: linux
      - architecture: arm64
        digest: sha256:6ec8ad0d897bcdf727531f7d0b716931728999492709d19d8b09f0d90d57f626
        manifestSize: 1252
        mediaType: application/vnd.docker.distribution.manifest.v2+json
        os: linux
      - architecture: ppc64le
        digest: sha256:65949e3a80349cdc42acd8c5b34cde6ebc3241eae8daaeea458498fedb359a6a
        manifestSize: 1252
        mediaType: application/vnd.docker.distribution.manifest.v2+json
        os: linux
      - architecture: s390x
        digest: sha256:75f4fa21224b5d5d511bea8f92dfa8e1c00231e5c81ab95e83c3013d245d1719
        manifestSize: 1252
        mediaType: application/vnd.docker.distribution.manifest.v2+json
        os: linux
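
If you are importing a new image stream rather than patching an existing one, recent oc clients can typically set the import mode at import time. The following command is a sketch; the image stream name, registry, and repository are placeholders:

$ oc import-image <image_stream>:<tag> \
  --from=<registry>/<repository>:<tag> \
  --import-mode=PreserveOriginal \
  --confirm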
