Chapter 10. Multiple regions and zones configuration for a cluster on VMware vSphere
As an administrator, you can specify multiple regions and zones for your OpenShift Container Platform cluster that runs on a VMware vSphere instance. This configuration reduces the risk of a hardware failure or network outage causing your cluster to fail.
A failure domain configuration lists parameters that create a topology. The following list states some of these parameters:

- computeCluster
- datacenter
- datastore
- networks
- resourcePool
After you define multiple regions and zones for your OpenShift Container Platform cluster, you can create or migrate nodes to another failure domain.
If you want to migrate pre-existing OpenShift Container Platform cluster compute nodes to a failure domain, you must define a new compute machine set for the compute node. This new machine set can scale up a compute node according to the topology of the failure domain, and scale down the pre-existing compute node.
The cloud provider adds topology.kubernetes.io/zone and topology.kubernetes.io/region labels to any compute node provisioned by a machine set resource.
For more information, see Creating a compute machine set.
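To confirm which region and zone labels your nodes received, you can add label columns to the node listing. This is a quick verification sketch; the -L flag is standard oc behavior and the label keys are the ones named above:

$ oc get nodes -L topology.kubernetes.io/region -L topology.kubernetes.io/zone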
10.1. Specifying multiple regions and zones for your cluster on vSphere
You can configure the infrastructures.config.openshift.io configuration resource to specify multiple regions and zones for your OpenShift Container Platform cluster that runs on a VMware vSphere instance.

Topology-aware features for the cloud controller manager and the vSphere Container Storage Interface (CSI) Driver Operator require information about the vSphere topology where you host your OpenShift Container Platform cluster. This topology information exists in the infrastructures.config.openshift.io configuration resource.
Before you specify regions and zones for your cluster, you must ensure that all data centers and compute clusters contain tags, so that the cloud provider can add labels to your nodes. For example, if data-center-1 represents region-a and compute-cluster-1 represents zone-1, the cloud provider adds an openshift-region category tag with a value of region-a to data-center-1. Additionally, the cloud provider adds an openshift-zone category tag with a value of zone-1 to compute-cluster-1.
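The following govc sketch illustrates this tagging scheme. It assumes the category, tag, and inventory names from the example above; adjust the paths to match your vCenter inventory:

# Create the tag categories, if they do not already exist:
govc tags.category.create openshift-region
govc tags.category.create openshift-zone

# Create one tag per region and zone:
govc tags.create -c openshift-region region-a
govc tags.create -c openshift-zone zone-1

# Attach the tags to the data center and the compute cluster:
govc tags.attach -c openshift-region region-a /data-center-1
govc tags.attach -c openshift-zone zone-1 /data-center-1/host/compute-cluster-1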
You can migrate control plane nodes with vMotion capabilities to a failure domain. After you add these nodes to a failure domain, the cloud provider adds topology.kubernetes.io/zone and topology.kubernetes.io/region labels to these nodes.
Prerequisites
- You created the openshift-region and openshift-zone tag categories on the vCenter server.
- You ensured that each data center and compute cluster contains tags that represent the name of their associated region or zone, or both.
- Optional: If you defined API and Ingress static IP addresses for the installation program, you must ensure that all regions and zones share a common layer 2 network. This configuration ensures that API and Ingress Virtual IP (VIP) addresses can interact with your cluster.
If you do not supply tags to all data centers and compute clusters before you create a node or migrate a node, the cloud provider cannot add the topology.kubernetes.io/zone and topology.kubernetes.io/region labels to the node. This means that services cannot route traffic to your node.
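Before you proceed, you can confirm the tag attachments with govc. This is a hedged check that assumes the tag names from the earlier example:

# List the objects that each tag is attached to:
govc tags.attached.ls region-a
govc tags.attached.ls zone-1

# Alternatively, list the tags attached to a specific object:
govc tags.attached.ls -r /data-center-1/host/compute-cluster-1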
Procedure
Edit the infrastructures.config.openshift.io custom resource (CR) of your cluster to specify multiple regions and zones in the failureDomains section of the resource by running the following command:

$ oc edit infrastructures.config.openshift.io cluster
Example infrastructures.config.openshift.io CR for an instance named cluster with multiple regions and zones defined in its configuration:

spec:
  cloudConfig:
    key: config
    name: cloud-provider-config
  platformSpec:
    type: vSphere
    vsphere:
      vcenters:
      - datacenters:
        - <region_a_data_center>
        - <region_b_data_center>
        port: 443
        server: <your_vcenter_server>
      failureDomains:
      - name: <failure_domain_1>
        region: <region_a>
        zone: <zone_a>
        server: <your_vcenter_server>
        topology:
          datacenter: <region_a_dc>
          computeCluster: "</region_a_dc/host/zone_a_cluster>"
          resourcePool: "</region_a_dc/host/zone_a_cluster/Resources/resource_pool>"
          datastore: "</region_a_dc/datastore/datastore_a>"
          networks:
          - port-group
      - name: <failure_domain_2>
        region: <region_a>
        zone: <zone_b>
        server: <your_vcenter_server>
        topology:
          computeCluster: </region_a_dc/host/zone_b_cluster>
          datacenter: <region_a_dc>
          datastore: </region_a_dc/datastore/datastore_a>
          networks:
          - port-group
      - name: <failure_domain_3>
        region: <region_b>
        zone: <zone_a>
        server: <your_vcenter_server>
        topology:
          computeCluster: </region_b_dc/host/zone_a_cluster>
          datacenter: <region_b_dc>
          datastore: </region_b_dc/datastore/datastore_b>
          networks:
          - port-group
      nodeNetworking:
        external: {}
        internal: {}
Important: After you create a failure domain and you define it in the CR for a VMware vSphere cluster, you must not modify or delete the failure domain. Doing either of these actions with this configuration can impact the availability and fault tolerance of a control plane machine.
- Save the resource file to apply the changes.
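To confirm that the failure domains were applied, you can read them back from the resource. This jsonpath expression is a hedged example that follows the field layout shown above:

$ oc get infrastructures.config.openshift.io cluster -o jsonpath='{.spec.platformSpec.vsphere.failureDomains[*].name}'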
10.2. Enabling a multiple layer 2 network for your cluster
You can configure your cluster to use a multiple layer 2 network configuration so that data transfer among nodes can span multiple networks.
Prerequisites
- You configured network connectivity among machines so that cluster components can communicate with each other.
Procedure
If you installed your cluster with installer-provisioned infrastructure, you must ensure that all control plane nodes share a common layer 2 network. Additionally, ensure that compute nodes that are configured for Ingress pod scheduling share a common layer 2 network.
- If you need compute nodes to span multiple layer 2 networks, you can create infrastructure nodes that can host Ingress pods, as shown in the sketch after this list.
- If you need to provision workloads across additional layer 2 networks, you can create compute machine sets on vSphere and then move these workloads to your target layer 2 networks.
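As a sketch of the infrastructure-node approach, the following command pins the default Ingress Controller to nodes that carry the node-role.kubernetes.io/infra label. This assumes the infrastructure nodes already exist and are labeled; it is one common approach, not the only one:

$ oc patch ingresscontroller/default -n openshift-ingress-operator --type=merge -p '{"spec":{"nodePlacement":{"nodeSelector":{"matchLabels":{"node-role.kubernetes.io/infra":""}}}}}'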
If you installed your cluster on infrastructure that you provided, which is defined as user-provisioned infrastructure, complete the following actions to meet your needs:
- Configure your API load balancer and network so that the load balancer can reach the API and Machine Config Server on the control plane nodes.
- Configure your Ingress load balancer and network so that the load balancer can reach the Ingress pods on the compute or infrastructure nodes (see the verification sketch after this list).
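To see which nodes currently host the Ingress pods, and therefore which backends your Ingress load balancer must reach, you can use the following check:

$ oc get pods -n openshift-ingress -o wide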
10.3. Parameters for the cluster-wide infrastructure CRD
You must set values for specific parameters in the cluster-wide infrastructure Custom Resource Definition (CRD), infrastructures.config.openshift.io, to define multiple regions and zones for your OpenShift Container Platform cluster that runs on a VMware vSphere instance.
The following table lists mandatory parameters for defining multiple regions and zones for your OpenShift Container Platform cluster:
Parameter | Description
---|---
vcenters | The vCenter servers for your OpenShift Container Platform cluster. You can specify either a single vCenter server, or up to 3 vCenter servers.
datacenters | vCenter data centers where VMs associated with the OpenShift Container Platform cluster will be created or presently exist.
port | The TCP port of the vCenter server.
server | The fully qualified domain name (FQDN) of the vCenter server.
failureDomains | The list of failure domains.
name | The name of the failure domain.
region | The value of the openshift-region tag category that is assigned to the failure domain.
zone | The value of the openshift-zone tag category that is assigned to the failure domain.
topology | The vCenter resources associated with the failure domain.
datacenter | The data center associated with the failure domain.
computeCluster | The full path of the compute cluster associated with the failure domain.
resourcePool | The full path of the resource pool associated with the failure domain.
datastore | The full path of the datastore associated with the failure domain.
networks | A list of port groups associated with the failure domain. Only one port group may be defined.
10.4. Specifying multiple host groups for your cluster on vSphere
You can configure the infrastructures.config.openshift.io configuration resource to specify multiple host groups for your OpenShift Container Platform cluster that runs on a VMware vSphere instance. This is necessary if your vSphere instance is in a stretched cluster configuration, with your ESXi hosts and storage distributed across multiple physical data centers. Use this procedure if you did not already configure host groups for your OpenShift Container Platform cluster at installation, or if you need to update your cluster with additional host groups.
OpenShift zones support for vSphere host groups is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.
For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.
Prerequisites
- ESXi hosts are grouped into host groups, which are linked through VM-host affinity rules to corresponding virtual machine (VM) groups. The following example govc commands show the correct configuration for a cluster with two host groups:

# Create host groups:
govc cluster.group.create -name <host_group_1> -host
govc cluster.group.create -name <host_group_2> -host

# Create VM groups:
govc cluster.group.create -name <vm_group_1> -vm
govc cluster.group.create -name <vm_group_2> -vm

# Create VM-host affinity rules:
govc cluster.rule.create -name <rule_1> -enable -vm-host -vm-group <vm_group_1> -host-affine-group <host_group_1>
govc cluster.rule.create -name <rule_2> -enable -vm-host -vm-group <vm_group_2> -host-affine-group <host_group_2>

# Add ESXi hosts to host groups:
govc cluster.group.change -name <host_group_1> <esxi_host_1_ip>
govc cluster.group.change -name <host_group_2> <esxi_host_2_ip>
- The openshift-region and openshift-zone tag categories are created on the vCenter server.
- Compute clusters have tags from the openshift-region tag category.
- ESXi hosts within host groups have tags from the openshift-zone tag category (see the verification commands after this list).
- The Host.Inventory.EditCluster privilege is granted on the vSphere vCenter cluster object.
- The TechPreviewNoUpgrade feature set is enabled. For more information, see "Enabling features using feature gates".
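The tag-related prerequisites can be verified with govc. This is a hedged sketch; the category names come from this procedure, and the exact output format can vary by govc version:

# Confirm that both tag categories exist:
govc tags.category.ls

# List the tags defined in each category:
govc tags.ls -c openshift-region
govc tags.ls -c openshift-zone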
Procedure
Edit the infrastructure settings of your OpenShift Container Platform cluster.

To copy your existing infrastructure settings to a file, run the following command:

$ oc get infrastructures.config.openshift.io cluster -o yaml > <name_of_infrastructure_file>.yaml
Edit your infrastructure file to include a failure domain for each host group in your vSphere cluster. Refer to the following YAML file for an example of this configuration. Ensure that you replace any values wrapped in angle brackets (< >) with your values:

apiVersion: config.openshift.io/v1
kind: Infrastructure
metadata:
  name: cluster
spec:
  cloudConfig:
    key: config
    name: cloud-provider-config
  platformSpec:
    type: VSphere
    vsphere:
      apiServerInternalIPs:
      - <internal_ip_of_api_server>
      failureDomains:
      - name: <unique_name_for_failure_domain_1>
        region: <cluster_1_region_tag>
        server: <vcenter_server_ip_address>
        zoneAffinity:
          type: HostGroup
          hostGroup:
            vmGroup: <name_of_vm_group_1>
            hostGroup: <name_of_host_group_1>
            vmHostRule: <name_of_vm_host_affinity_rule_1>
        regionAffinity:
          type: ComputeCluster
        topology:
          computeCluster: /<data_center_1>/host/<cluster_1>
          datacenter: <data_center_1>
          datastore: /<data_center_1>/datastore/<datastore_1>
          networks:
          - VM Network
          resourcePool: /<data_center_1>/host/<cluster_1>/Resources
          template: /<data_center_1>/vm/<vm_template>
        zone: <host_group_1_tag>
      - name: <unique_name_for_failure_domain_2>
        region: <cluster_1_region_tag>
        server: <vcenter_server_ip_address>
        zoneAffinity:
          type: HostGroup
          hostGroup:
            vmGroup: <name_of_vm_group_2>
            hostGroup: <name_of_host_group_2>
            vmHostRule: <name_of_vm_host_affinity_rule_2>
        regionAffinity:
          type: ComputeCluster
        topology:
          computeCluster: /<data_center_1>/host/<cluster_1>
          datacenter: <data_center_1>
          datastore: /<data_center_1>/datastore/<datastore_1>
          networks:
          - VM Network
          resourcePool: /<data_center_1>/host/<cluster_1>/Resources
          template: /<data_center_1>/vm/<vm_template>
        zone: <host_group_2_tag>
# ...
To update your cluster with these changes, run the following command:

$ oc replace -f <name_of_infrastructure_file>.yaml
Update your ControlPlaneMachineSet custom resource (CR) with the new failure domains by completing the following steps:

Edit the ControlPlaneMachineSet CR by running the following command:

$ oc edit controlplanemachinesets.machine.openshift.io -n openshift-machine-api cluster
Edit the failureDomains parameter as shown in the following example:

spec:
  replicas: 3
  selector:
    matchLabels:
      machine.openshift.io/cluster-api-cluster: jdoe3-whb8l
      machine.openshift.io/cluster-api-machine-role: master
      machine.openshift.io/cluster-api-machine-type: master
  state: Active
  strategy:
    type: RollingUpdate
  template:
    machineType: machines_v1beta1_machine_openshift_io
    machines_v1beta1_machine_openshift_io:
      failureDomains:
        platform: VSphere
        vsphere:
        - name: <failure_domain_1_name>
        - name: <failure_domain_2_name>
# ...
Verify that your control plane nodes have finished updating before proceeding further. To do this, run the following command:

$ oc get controlplanemachinesets.machine.openshift.io -n openshift-machine-api
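The update is complete when the READY and UPDATED counts match DESIRED and no machines are listed as unavailable. The following output is a hedged illustration of what a finished update might look like:

NAME      DESIRED   CURRENT   READY   UPDATED   UNAVAILABLE   AGE
cluster   3         3         3       3                       26h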
Create new MachineSet CRs for your failure domains.

To retrieve the configuration of an existing MachineSet CR for use as a template, run the following command:

$ oc get machinesets.machine.openshift.io -n openshift-machine-api <existing_machine_set> -o yaml > machineset-<failure_domain_name>.yaml
Copy the template as needed to create MachineSet CR files for each failure domain that you defined in your infrastructure file. Refer to the following example:

apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
metadata:
  labels:
    machine.openshift.io/cluster-api-cluster: <infrastructure_id>
  name: <machineset_name>
  namespace: openshift-machine-api
spec:
  replicas: 0
  selector:
    matchLabels:
      machine.openshift.io/cluster-api-cluster: <infrastructure_id>
      machine.openshift.io/cluster-api-machineset: <machineset_name>
  template:
    metadata:
      labels:
        machine.openshift.io/cluster-api-cluster: <infrastructure_id>
        machine.openshift.io/cluster-api-machine-role: worker
        machine.openshift.io/cluster-api-machine-type: worker
        machine.openshift.io/cluster-api-machineset: <machineset_name>
    spec:
      lifecycleHooks: {}
      metadata: {}
      providerSpec:
        value:
          apiVersion: machine.openshift.io/v1beta1
          credentialsSecret:
            name: vsphere-cloud-credentials
          diskGiB: <disk_GiB>
          kind: VSphereMachineProviderSpec
          memoryMiB: <memory_in_MiB>
          metadata:
            creationTimestamp: null
          network:
            devices:
            - networkName: VM Network
          numCPUs: <number_of_cpus>
          numCoresPerSocket: <number_of_cores_per_socket>
          snapshot: ""
          template: <template_name>
          userDataSecret:
            name: worker-user-data
          workspace:
            datacenter: <data_center_1>
            datastore: /<data_center_1>/datastore/<datastore_1>
            folder: /<data_center_1>/vm/<folder>
            resourcePool: /<data_center_1>/host/<cluster_1>/Resources
            server: <server_ip_address>
            vmGroup: <name_of_vm_group_1>
# ...
For each MachineSet CR file, run the following command:

$ oc create -f <name_of_machine_set_file>.yaml
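To bring up a node in a new failure domain and confirm its topology labels, you can scale the machine set and then inspect the node labels. This is a hedged follow-up; the machine set name is the placeholder from the earlier example:

$ oc scale machineset <machineset_name> --replicas=1 -n openshift-machine-api
$ oc get nodes -L topology.kubernetes.io/region -L topology.kubernetes.io/zone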