Chapter 6. Expanding the cluster
After deploying an installer-provisioned OpenShift Container Platform cluster, you can use the following procedures to expand the number of worker nodes. Ensure that each prospective worker node meets the prerequisites.
Expanding the cluster using RedFish Virtual Media involves meeting minimum firmware requirements. See Firmware requirements for installing with virtual media in the Prerequisites section for additional details when expanding the cluster using RedFish Virtual Media.
6.1. Preparing the bare metal node
To expand your cluster, you must provide the node with the relevant IP address. This can be done with a static configuration, or with a DHCP (Dynamic Host Configuration protocol) server. When expanding the cluster using a DHCP server, each node must have a DHCP reservation.
Some administrators prefer to use static IP addresses so that each node’s IP address remains constant in the absence of a DHCP server. To configure static IP addresses with NMState, see "Optional: Configuring host network interfaces in the install-config.yaml
file" in the "Setting up the environment for an OpenShift installation" section for additional details.
Preparing the bare metal node requires executing the following procedure from the provisioner node.
Procedure
Get the
oc
binary:Copy to Clipboard Copied! Toggle word wrap Toggle overflow curl -s https://mirror.openshift.com/pub/openshift-v4/clients/ocp/$VERSION/openshift-client-linux-$VERSION.tar.gz | tar zxvf - oc
$ curl -s https://mirror.openshift.com/pub/openshift-v4/clients/ocp/$VERSION/openshift-client-linux-$VERSION.tar.gz | tar zxvf - oc
Copy to Clipboard Copied! Toggle word wrap Toggle overflow sudo cp oc /usr/local/bin
$ sudo cp oc /usr/local/bin
- Power off the bare metal node by using the baseboard management controller (BMC), and ensure it is off.
Retrieve the user name and password of the bare metal node’s baseboard management controller. Then, create
base64
strings from the user name and password:Copy to Clipboard Copied! Toggle word wrap Toggle overflow echo -ne "root" | base64
$ echo -ne "root" | base64
Copy to Clipboard Copied! Toggle word wrap Toggle overflow echo -ne "password" | base64
$ echo -ne "password" | base64
Create a configuration file for the bare metal node. Depending on whether you are using a static configuration or a DHCP server, use one of the following example
bmh.yaml
files, replacing values in the YAML to match your environment:Copy to Clipboard Copied! Toggle word wrap Toggle overflow vim bmh.yaml
$ vim bmh.yaml
Static configuration
bmh.yaml
:Copy to Clipboard Copied! Toggle word wrap Toggle overflow --- apiVersion: v1 kind: Secret metadata: name: openshift-worker-<num>-network-config-secret namespace: openshift-machine-api type: Opaque stringData: nmstate: | interfaces: - name: <nic1_name> type: ethernet state: up ipv4: address: - ip: <ip_address> prefix-length: 24 enabled: true dns-resolver: config: server: - <dns_ip_address> routes: config: - destination: 0.0.0.0/0 next-hop-address: <next_hop_ip_address> next-hop-interface: <next_hop_nic1_name> --- apiVersion: v1 kind: Secret metadata: name: openshift-worker-<num>-bmc-secret namespace: openshift-machine-api type: Opaque data: username: <base64_of_uid> password: <base64_of_pwd> --- apiVersion: metal3.io/v1alpha1 kind: BareMetalHost metadata: name: openshift-worker-<num> namespace: openshift-machine-api spec: online: True bootMACAddress: <nic1_mac_address> bmc: address: <protocol>://<bmc_url> credentialsName: openshift-worker-<num>-bmc-secret disableCertificateVerification: True username: <bmc_username> password: <bmc_password> rootDeviceHints: deviceName: <root_device_hint> preprovisioningNetworkDataName: openshift-worker-<num>-network-config-secret
--- apiVersion: v1
1 kind: Secret metadata: name: openshift-worker-<num>-network-config-secret
2 namespace: openshift-machine-api type: Opaque stringData: nmstate: |
3 interfaces:
4 - name: <nic1_name>
5 type: ethernet state: up ipv4: address: - ip: <ip_address>
6 prefix-length: 24 enabled: true dns-resolver: config: server: - <dns_ip_address>
7 routes: config: - destination: 0.0.0.0/0 next-hop-address: <next_hop_ip_address>
8 next-hop-interface: <next_hop_nic1_name>
9 --- apiVersion: v1 kind: Secret metadata: name: openshift-worker-<num>-bmc-secret
10 namespace: openshift-machine-api type: Opaque data: username: <base64_of_uid>
11 password: <base64_of_pwd>
12 --- apiVersion: metal3.io/v1alpha1 kind: BareMetalHost metadata: name: openshift-worker-<num>
13 namespace: openshift-machine-api spec: online: True bootMACAddress: <nic1_mac_address>
14 bmc: address: <protocol>://<bmc_url>
15 credentialsName: openshift-worker-<num>-bmc-secret
16 disableCertificateVerification: True
17 username: <bmc_username>
18 password: <bmc_password>
19 rootDeviceHints: deviceName: <root_device_hint>
20 preprovisioningNetworkDataName: openshift-worker-<num>-network-config-secret
21 - 1
- To configure the network interface for a newly created node, specify the name of the secret that contains the network configuration. Follow the
nmstate
syntax to define the network configuration for your node. See "Optional: Configuring host network interfaces in the install-config.yaml file" for details on configuring NMState syntax. - 2 10 13 16
- Replace
<num>
for the worker number of the bare metal node in thename
fields, thecredentialsName
field, and thepreprovisioningNetworkDataName
field. - 3
- Add the NMState YAML syntax to configure the host interfaces.
- 4
- Optional: If you have configured the network interface with
nmstate
, and you want to disable an interface, setstate: up
with the IP addresses set toenabled: false
as shown:Copy to Clipboard Copied! Toggle word wrap Toggle overflow --- interfaces: - name: <nic_name> type: ethernet state: up ipv4: enabled: false ipv6: enabled: false
--- interfaces: - name: <nic_name> type: ethernet state: up ipv4: enabled: false ipv6: enabled: false
- 5 6 7 8 9
- Replace
<nic1_name>
,<ip_address>
,<dns_ip_address>
,<next_hop_ip_address>
and<next_hop_nic1_name>
with appropriate values. - 11 12
- Replace
<base64_of_uid>
and<base64_of_pwd>
with the base64 string of the user name and password. - 14
- Replace
<nic1_mac_address>
with the MAC address of the bare metal node’s first NIC. See the "BMC addressing" section for additional BMC configuration options. - 15
- Replace
<protocol>
with the BMC protocol, such as IPMI, RedFish, or others. Replace<bmc_url>
with the URL of the bare metal node’s baseboard management controller. - 17
- To skip certificate validation, set
disableCertificateVerification
to true. - 18 19
- Replace
<bmc_username>
and<bmc_password>
with the string of the BMC user name and password. - 20
- Optional: Replace
<root_device_hint>
with a device path if you specify a root device hint. - 21
- Optional: If you have configured the network interface for the newly created node, provide the network configuration secret name in the
preprovisioningNetworkDataName
of the BareMetalHost CR.
DHCP configuration
bmh.yaml
:Copy to Clipboard Copied! Toggle word wrap Toggle overflow --- apiVersion: v1 kind: Secret metadata: name: openshift-worker-<num>-bmc-secret namespace: openshift-machine-api type: Opaque data: username: <base64_of_uid> password: <base64_of_pwd> --- apiVersion: metal3.io/v1alpha1 kind: BareMetalHost metadata: name: openshift-worker-<num> namespace: openshift-machine-api spec: online: True bootMACAddress: <nic1_mac_address> bmc: address: <protocol>://<bmc_url> credentialsName: openshift-worker-<num>-bmc-secret disableCertificateVerification: True username: <bmc_username> password: <bmc_password> rootDeviceHints: deviceName: <root_device_hint> preprovisioningNetworkDataName: openshift-worker-<num>-network-config-secret
--- apiVersion: v1 kind: Secret metadata: name: openshift-worker-<num>-bmc-secret
1 namespace: openshift-machine-api type: Opaque data: username: <base64_of_uid>
2 password: <base64_of_pwd>
3 --- apiVersion: metal3.io/v1alpha1 kind: BareMetalHost metadata: name: openshift-worker-<num>
4 namespace: openshift-machine-api spec: online: True bootMACAddress: <nic1_mac_address>
5 bmc: address: <protocol>://<bmc_url>
6 credentialsName: openshift-worker-<num>-bmc-secret
7 disableCertificateVerification: True
8 username: <bmc_username>
9 password: <bmc_password>
10 rootDeviceHints: deviceName: <root_device_hint>
11 preprovisioningNetworkDataName: openshift-worker-<num>-network-config-secret
12 - 1 4 7
- Replace
<num>
for the worker number of the bare metal node in thename
fields, thecredentialsName
field, and thepreprovisioningNetworkDataName
field. - 2 3
- Replace
<base64_of_uid>
and<base64_of_pwd>
with the base64 string of the user name and password. - 5
- Replace
<nic1_mac_address>
with the MAC address of the bare metal node’s first NIC. See the "BMC addressing" section for additional BMC configuration options. - 6
- Replace
<protocol>
with the BMC protocol, such as IPMI, RedFish, or others. Replace<bmc_url>
with the URL of the bare metal node’s baseboard management controller. - 8
- To skip certificate validation, set
disableCertificateVerification
to true. - 9 10
- Replace
<bmc_username>
and<bmc_password>
with the string of the BMC user name and password. - 11
- Optional: Replace
<root_device_hint>
with a device path if you specify a root device hint. - 12
- Optional: If you have configured the network interface for the newly created node, provide the network configuration secret name in the
preprovisioningNetworkDataName
of the BareMetalHost CR.
NoteIf the MAC address of an existing bare metal node matches the MAC address of a bare metal host that you are attempting to provision, then the Ironic installation will fail. If the host enrollment, inspection, cleaning, or other Ironic steps fail, the Bare Metal Operator retries the installation continuously. See "Diagnosing a host duplicate MAC address" for more information.
Create the bare metal node:
Copy to Clipboard Copied! Toggle word wrap Toggle overflow oc -n openshift-machine-api create -f bmh.yaml
$ oc -n openshift-machine-api create -f bmh.yaml
Example output
Copy to Clipboard Copied! Toggle word wrap Toggle overflow secret/openshift-worker-<num>-network-config-secret created secret/openshift-worker-<num>-bmc-secret created baremetalhost.metal3.io/openshift-worker-<num> created
secret/openshift-worker-<num>-network-config-secret created secret/openshift-worker-<num>-bmc-secret created baremetalhost.metal3.io/openshift-worker-<num> created
Where
<num>
will be the worker number.Power up and inspect the bare metal node:
Copy to Clipboard Copied! Toggle word wrap Toggle overflow oc -n openshift-machine-api get bmh openshift-worker-<num>
$ oc -n openshift-machine-api get bmh openshift-worker-<num>
Where
<num>
is the worker node number.Example output
Copy to Clipboard Copied! Toggle word wrap Toggle overflow NAME STATE CONSUMER ONLINE ERROR openshift-worker-<num> available true
NAME STATE CONSUMER ONLINE ERROR openshift-worker-<num> available true
NoteTo allow the worker node to join the cluster, scale the
machineset
object to the number of theBareMetalHost
objects. You can scale nodes either manually or automatically. To scale nodes automatically, use themetal3.io/autoscale-to-hosts
annotation formachineset
.
Additional resources
- See Optional: Configuring host network interfaces in the install-config.yaml file for details on configuring the NMState syntax.
- See Automatically scaling machines to the number of available bare metal hosts for details on automatically scaling machines.
6.2. Replacing a bare-metal control plane node
Use the following procedure to replace an installer-provisioned OpenShift Container Platform control plane node.
If you reuse the BareMetalHost
object definition from an existing control plane host, do not leave the externallyProvisioned
field set to true
.
Existing control plane BareMetalHost
objects may have the externallyProvisioned
flag set to true
if they were provisioned by the OpenShift Container Platform installation program.
Prerequisites
-
You have access to the cluster as a user with the
cluster-admin
role. You have taken an etcd backup.
ImportantTake an etcd backup before performing this procedure so that you can restore your cluster if you encounter any issues. For more information about taking an etcd backup, see the Additional resources section.
Procedure
Ensure that the Bare Metal Operator is available:
Copy to Clipboard Copied! Toggle word wrap Toggle overflow oc get clusteroperator baremetal
$ oc get clusteroperator baremetal
Example output
Copy to Clipboard Copied! Toggle word wrap Toggle overflow NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE baremetal 4.12.0 True False False 3d15h
NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE baremetal 4.12.0 True False False 3d15h
Remove the old
BareMetalHost
andMachine
objects:Copy to Clipboard Copied! Toggle word wrap Toggle overflow oc delete bmh -n openshift-machine-api <host_name> oc delete machine -n openshift-machine-api <machine_name>
$ oc delete bmh -n openshift-machine-api <host_name> $ oc delete machine -n openshift-machine-api <machine_name>
Replace
<host_name>
with the name of the host and<machine_name>
with the name of the machine. The machine name appears under theCONSUMER
field.After you remove the
BareMetalHost
andMachine
objects, then the machine controller automatically deletes theNode
object.Create the new
BareMetalHost
object and the secret to store the BMC credentials:Copy to Clipboard Copied! Toggle word wrap Toggle overflow cat <<EOF | oc apply -f - apiVersion: v1 kind: Secret metadata: name: control-plane-<num>-bmc-secret namespace: openshift-machine-api data: username: <base64_of_uid> password: <base64_of_pwd> type: Opaque --- apiVersion: metal3.io/v1alpha1 kind: BareMetalHost metadata: name: control-plane-<num> namespace: openshift-machine-api spec: automatedCleaningMode: disabled bmc: address: <protocol>://<bmc_ip> credentialsName: control-plane-<num>-bmc-secret bootMACAddress: <NIC1_mac_address> bootMode: UEFI externallyProvisioned: false online: true EOF
$ cat <<EOF | oc apply -f - apiVersion: v1 kind: Secret metadata: name: control-plane-<num>-bmc-secret
1 namespace: openshift-machine-api data: username: <base64_of_uid>
2 password: <base64_of_pwd>
3 type: Opaque --- apiVersion: metal3.io/v1alpha1 kind: BareMetalHost metadata: name: control-plane-<num>
4 namespace: openshift-machine-api spec: automatedCleaningMode: disabled bmc: address: <protocol>://<bmc_ip>
5 credentialsName: control-plane-<num>-bmc-secret
6 bootMACAddress: <NIC1_mac_address>
7 bootMode: UEFI externallyProvisioned: false online: true EOF
- 1 4 6
- Replace
<num>
for the control plane number of the bare metal node in thename
fields and thecredentialsName
field. - 2
- Replace
<base64_of_uid>
with thebase64
string of the user name. - 3
- Replace
<base64_of_pwd>
with thebase64
string of the password. - 5
- Replace
<protocol>
with the BMC protocol, such asredfish
,redfish-virtualmedia
,idrac-virtualmedia
, or others. Replace<bmc_ip>
with the IP address of the bare metal node’s baseboard management controller. For additional BMC configuration options, see "BMC addressing" in the Additional resources section. - 7
- Replace
<NIC1_mac_address>
with the MAC address of the bare metal node’s first NIC.
After the inspection is complete, the
BareMetalHost
object is created and available to be provisioned.View available
BareMetalHost
objects:Copy to Clipboard Copied! Toggle word wrap Toggle overflow oc get bmh -n openshift-machine-api
$ oc get bmh -n openshift-machine-api
Example output
Copy to Clipboard Copied! Toggle word wrap Toggle overflow NAME STATE CONSUMER ONLINE ERROR AGE control-plane-1.example.com available control-plane-1 true 1h10m control-plane-2.example.com externally provisioned control-plane-2 true 4h53m control-plane-3.example.com externally provisioned control-plane-3 true 4h53m compute-1.example.com provisioned compute-1-ktmmx true 4h53m compute-1.example.com provisioned compute-2-l2zmb true 4h53m
NAME STATE CONSUMER ONLINE ERROR AGE control-plane-1.example.com available control-plane-1 true 1h10m control-plane-2.example.com externally provisioned control-plane-2 true 4h53m control-plane-3.example.com externally provisioned control-plane-3 true 4h53m compute-1.example.com provisioned compute-1-ktmmx true 4h53m compute-1.example.com provisioned compute-2-l2zmb true 4h53m
There are no
MachineSet
objects for control plane nodes, so you must create aMachine
object instead. You can copy theproviderSpec
from another control planeMachine
object.Create a
Machine
object:Copy to Clipboard Copied! Toggle word wrap Toggle overflow cat <<EOF | oc apply -f - apiVersion: machine.openshift.io/v1beta1 kind: Machine metadata: annotations: metal3.io/BareMetalHost: openshift-machine-api/control-plane-<num> labels: machine.openshift.io/cluster-api-cluster: control-plane-<num> machine.openshift.io/cluster-api-machine-role: master machine.openshift.io/cluster-api-machine-type: master name: control-plane-<num> namespace: openshift-machine-api spec: metadata: {} providerSpec: value: apiVersion: baremetal.cluster.k8s.io/v1alpha1 customDeploy: method: install_coreos hostSelector: {} image: checksum: "" url: "" kind: BareMetalMachineProviderSpec metadata: creationTimestamp: null userData: name: master-user-data-managed EOF
$ cat <<EOF | oc apply -f - apiVersion: machine.openshift.io/v1beta1 kind: Machine metadata: annotations: metal3.io/BareMetalHost: openshift-machine-api/control-plane-<num>
1 labels: machine.openshift.io/cluster-api-cluster: control-plane-<num>
2 machine.openshift.io/cluster-api-machine-role: master machine.openshift.io/cluster-api-machine-type: master name: control-plane-<num>
3 namespace: openshift-machine-api spec: metadata: {} providerSpec: value: apiVersion: baremetal.cluster.k8s.io/v1alpha1 customDeploy: method: install_coreos hostSelector: {} image: checksum: "" url: "" kind: BareMetalMachineProviderSpec metadata: creationTimestamp: null userData: name: master-user-data-managed EOF
To view the
BareMetalHost
objects, run the following command:Copy to Clipboard Copied! Toggle word wrap Toggle overflow oc get bmh -A
$ oc get bmh -A
Example output
Copy to Clipboard Copied! Toggle word wrap Toggle overflow NAME STATE CONSUMER ONLINE ERROR AGE control-plane-1.example.com provisioned control-plane-1 true 2h53m control-plane-2.example.com externally provisioned control-plane-2 true 5h53m control-plane-3.example.com externally provisioned control-plane-3 true 5h53m compute-1.example.com provisioned compute-1-ktmmx true 5h53m compute-2.example.com provisioned compute-2-l2zmb true 5h53m
NAME STATE CONSUMER ONLINE ERROR AGE control-plane-1.example.com provisioned control-plane-1 true 2h53m control-plane-2.example.com externally provisioned control-plane-2 true 5h53m control-plane-3.example.com externally provisioned control-plane-3 true 5h53m compute-1.example.com provisioned compute-1-ktmmx true 5h53m compute-2.example.com provisioned compute-2-l2zmb true 5h53m
After the RHCOS installation, verify that the
BareMetalHost
is added to the cluster:Copy to Clipboard Copied! Toggle word wrap Toggle overflow oc get nodes
$ oc get nodes
Example output
Copy to Clipboard Copied! Toggle word wrap Toggle overflow NAME STATUS ROLES AGE VERSION control-plane-1.example.com available master 4m2s v1.18.2 control-plane-2.example.com available master 141m v1.18.2 control-plane-3.example.com available master 141m v1.18.2 compute-1.example.com available worker 87m v1.18.2 compute-2.example.com available worker 87m v1.18.2
NAME STATUS ROLES AGE VERSION control-plane-1.example.com available master 4m2s v1.18.2 control-plane-2.example.com available master 141m v1.18.2 control-plane-3.example.com available master 141m v1.18.2 compute-1.example.com available worker 87m v1.18.2 compute-2.example.com available worker 87m v1.18.2
NoteAfter replacement of the new control plane node, the etcd pod running in the new node is in
crashloopback
status. See "Replacing an unhealthy etcd member" in the Additional resources section for more information.
Additional resources
6.3. Preparing to deploy with Virtual Media on the baremetal network
If the provisioning
network is enabled and you want to expand the cluster using Virtual Media on the baremetal
network, use the following procedure.
Prerequisites
-
There is an existing cluster with a
baremetal
network and aprovisioning
network.
Procedure
Edit the
provisioning
custom resource (CR) to enable deploying with Virtual Media on thebaremetal
network:Copy to Clipboard Copied! Toggle word wrap Toggle overflow oc edit provisioning
oc edit provisioning
Copy to Clipboard Copied! Toggle word wrap Toggle overflow apiVersion: metal3.io/v1alpha1 kind: Provisioning metadata: creationTimestamp: "2021-08-05T18:51:50Z" finalizers: - provisioning.metal3.io generation: 8 name: provisioning-configuration resourceVersion: "551591" uid: f76e956f-24c6-4361-aa5b-feaf72c5b526 spec: provisioningDHCPRange: 172.22.0.10,172.22.0.254 provisioningIP: 172.22.0.3 provisioningInterface: enp1s0 provisioningNetwork: Managed provisioningNetworkCIDR: 172.22.0.0/24 virtualMediaViaExternalNetwork: true status: generations: - group: apps hash: "" lastGeneration: 7 name: metal3 namespace: openshift-machine-api resource: deployments - group: apps hash: "" lastGeneration: 1 name: metal3-image-cache namespace: openshift-machine-api resource: daemonsets observedGeneration: 8 readyReplicas: 0
apiVersion: metal3.io/v1alpha1 kind: Provisioning metadata: creationTimestamp: "2021-08-05T18:51:50Z" finalizers: - provisioning.metal3.io generation: 8 name: provisioning-configuration resourceVersion: "551591" uid: f76e956f-24c6-4361-aa5b-feaf72c5b526 spec: provisioningDHCPRange: 172.22.0.10,172.22.0.254 provisioningIP: 172.22.0.3 provisioningInterface: enp1s0 provisioningNetwork: Managed provisioningNetworkCIDR: 172.22.0.0/24 virtualMediaViaExternalNetwork: true
1 status: generations: - group: apps hash: "" lastGeneration: 7 name: metal3 namespace: openshift-machine-api resource: deployments - group: apps hash: "" lastGeneration: 1 name: metal3-image-cache namespace: openshift-machine-api resource: daemonsets observedGeneration: 8 readyReplicas: 0
- 1
- Add
virtualMediaViaExternalNetwork: true
to theprovisioning
CR.
If the image URL exists, edit the
machineset
to use the API VIP address. This step only applies to clusters installed in versions 4.9 or earlier.Copy to Clipboard Copied! Toggle word wrap Toggle overflow oc edit machineset
oc edit machineset
Copy to Clipboard Copied! Toggle word wrap Toggle overflow apiVersion: machine.openshift.io/v1beta1 kind: MachineSet metadata: creationTimestamp: "2021-08-05T18:51:52Z" generation: 11 labels: machine.openshift.io/cluster-api-cluster: ostest-hwmdt machine.openshift.io/cluster-api-machine-role: worker machine.openshift.io/cluster-api-machine-type: worker name: ostest-hwmdt-worker-0 namespace: openshift-machine-api resourceVersion: "551513" uid: fad1c6e0-b9da-4d4a-8d73-286f78788931 spec: replicas: 2 selector: matchLabels: machine.openshift.io/cluster-api-cluster: ostest-hwmdt machine.openshift.io/cluster-api-machineset: ostest-hwmdt-worker-0 template: metadata: labels: machine.openshift.io/cluster-api-cluster: ostest-hwmdt machine.openshift.io/cluster-api-machine-role: worker machine.openshift.io/cluster-api-machine-type: worker machine.openshift.io/cluster-api-machineset: ostest-hwmdt-worker-0 spec: metadata: {} providerSpec: value: apiVersion: baremetal.cluster.k8s.io/v1alpha1 hostSelector: {} image: checksum: http:/172.22.0.3:6181/images/rhcos-<version>.<architecture>.qcow2.<md5sum> url: http://172.22.0.3:6181/images/rhcos-<version>.<architecture>.qcow2 kind: BareMetalMachineProviderSpec metadata: creationTimestamp: null userData: name: worker-user-data status: availableReplicas: 2 fullyLabeledReplicas: 2 observedGeneration: 11 readyReplicas: 2 replicas: 2
apiVersion: machine.openshift.io/v1beta1 kind: MachineSet metadata: creationTimestamp: "2021-08-05T18:51:52Z" generation: 11 labels: machine.openshift.io/cluster-api-cluster: ostest-hwmdt machine.openshift.io/cluster-api-machine-role: worker machine.openshift.io/cluster-api-machine-type: worker name: ostest-hwmdt-worker-0 namespace: openshift-machine-api resourceVersion: "551513" uid: fad1c6e0-b9da-4d4a-8d73-286f78788931 spec: replicas: 2 selector: matchLabels: machine.openshift.io/cluster-api-cluster: ostest-hwmdt machine.openshift.io/cluster-api-machineset: ostest-hwmdt-worker-0 template: metadata: labels: machine.openshift.io/cluster-api-cluster: ostest-hwmdt machine.openshift.io/cluster-api-machine-role: worker machine.openshift.io/cluster-api-machine-type: worker machine.openshift.io/cluster-api-machineset: ostest-hwmdt-worker-0 spec: metadata: {} providerSpec: value: apiVersion: baremetal.cluster.k8s.io/v1alpha1 hostSelector: {} image: checksum: http:/172.22.0.3:6181/images/rhcos-<version>.<architecture>.qcow2.<md5sum>
1 url: http://172.22.0.3:6181/images/rhcos-<version>.<architecture>.qcow2
2 kind: BareMetalMachineProviderSpec metadata: creationTimestamp: null userData: name: worker-user-data status: availableReplicas: 2 fullyLabeledReplicas: 2 observedGeneration: 11 readyReplicas: 2 replicas: 2
6.4. Diagnosing a duplicate MAC address when provisioning a new host in the cluster
If the MAC address of an existing bare-metal node in the cluster matches the MAC address of a bare-metal host you are attempting to add to the cluster, the Bare Metal Operator associates the host with the existing node. If the host enrollment, inspection, cleaning, or other Ironic steps fail, the Bare Metal Operator retries the installation continuously. A registration error is displayed for the failed bare-metal host.
You can diagnose a duplicate MAC address by examining the bare-metal hosts that are running in the openshift-machine-api
namespace.
Prerequisites
- Install an OpenShift Container Platform cluster on bare metal.
-
Install the OpenShift Container Platform CLI
oc
. -
Log in as a user with
cluster-admin
privileges.
Procedure
To determine whether a bare-metal host that fails provisioning has the same MAC address as an existing node, do the following:
Get the bare-metal hosts running in the
openshift-machine-api
namespace:Copy to Clipboard Copied! Toggle word wrap Toggle overflow oc get bmh -n openshift-machine-api
$ oc get bmh -n openshift-machine-api
Example output
Copy to Clipboard Copied! Toggle word wrap Toggle overflow NAME STATUS PROVISIONING STATUS CONSUMER openshift-master-0 OK externally provisioned openshift-zpwpq-master-0 openshift-master-1 OK externally provisioned openshift-zpwpq-master-1 openshift-master-2 OK externally provisioned openshift-zpwpq-master-2 openshift-worker-0 OK provisioned openshift-zpwpq-worker-0-lv84n openshift-worker-1 OK provisioned openshift-zpwpq-worker-0-zd8lm openshift-worker-2 error registering
NAME STATUS PROVISIONING STATUS CONSUMER openshift-master-0 OK externally provisioned openshift-zpwpq-master-0 openshift-master-1 OK externally provisioned openshift-zpwpq-master-1 openshift-master-2 OK externally provisioned openshift-zpwpq-master-2 openshift-worker-0 OK provisioned openshift-zpwpq-worker-0-lv84n openshift-worker-1 OK provisioned openshift-zpwpq-worker-0-zd8lm openshift-worker-2 error registering
To see more detailed information about the status of the failing host, run the following command replacing
<bare_metal_host_name>
with the name of the host:Copy to Clipboard Copied! Toggle word wrap Toggle overflow oc get -n openshift-machine-api bmh <bare_metal_host_name> -o yaml
$ oc get -n openshift-machine-api bmh <bare_metal_host_name> -o yaml
Example output
Copy to Clipboard Copied! Toggle word wrap Toggle overflow ... status: errorCount: 12 errorMessage: MAC address b4:96:91:1d:7c:20 conflicts with existing node openshift-worker-1 errorType: registration error ...
... status: errorCount: 12 errorMessage: MAC address b4:96:91:1d:7c:20 conflicts with existing node openshift-worker-1 errorType: registration error ...
6.5. Provisioning the bare metal node
Provisioning the bare metal node requires executing the following procedure from the provisioner node.
Procedure
Ensure the
STATE
isavailable
before provisioning the bare metal node.Copy to Clipboard Copied! Toggle word wrap Toggle overflow oc -n openshift-machine-api get bmh openshift-worker-<num>
$ oc -n openshift-machine-api get bmh openshift-worker-<num>
Where
<num>
is the worker node number.Copy to Clipboard Copied! Toggle word wrap Toggle overflow NAME STATE ONLINE ERROR AGE openshift-worker available true 34h
NAME STATE ONLINE ERROR AGE openshift-worker available true 34h
Get a count of the number of worker nodes.
Copy to Clipboard Copied! Toggle word wrap Toggle overflow oc get nodes
$ oc get nodes
Copy to Clipboard Copied! Toggle word wrap Toggle overflow NAME STATUS ROLES AGE VERSION openshift-master-1.openshift.example.com Ready master 30h v1.25.0 openshift-master-2.openshift.example.com Ready master 30h v1.25.0 openshift-master-3.openshift.example.com Ready master 30h v1.25.0 openshift-worker-0.openshift.example.com Ready worker 30h v1.25.0 openshift-worker-1.openshift.example.com Ready worker 30h v1.25.0
NAME STATUS ROLES AGE VERSION openshift-master-1.openshift.example.com Ready master 30h v1.25.0 openshift-master-2.openshift.example.com Ready master 30h v1.25.0 openshift-master-3.openshift.example.com Ready master 30h v1.25.0 openshift-worker-0.openshift.example.com Ready worker 30h v1.25.0 openshift-worker-1.openshift.example.com Ready worker 30h v1.25.0
Get the compute machine set.
Copy to Clipboard Copied! Toggle word wrap Toggle overflow oc get machinesets -n openshift-machine-api
$ oc get machinesets -n openshift-machine-api
Copy to Clipboard Copied! Toggle word wrap Toggle overflow NAME DESIRED CURRENT READY AVAILABLE AGE ... openshift-worker-0.example.com 1 1 1 1 55m openshift-worker-1.example.com 1 1 1 1 55m
NAME DESIRED CURRENT READY AVAILABLE AGE ... openshift-worker-0.example.com 1 1 1 1 55m openshift-worker-1.example.com 1 1 1 1 55m
Increase the number of worker nodes by one.
Copy to Clipboard Copied! Toggle word wrap Toggle overflow oc scale --replicas=<num> machineset <machineset> -n openshift-machine-api
$ oc scale --replicas=<num> machineset <machineset> -n openshift-machine-api
Replace
<num>
with the new number of worker nodes. Replace<machineset>
with the name of the compute machine set from the previous step.Check the status of the bare metal node.
Copy to Clipboard Copied! Toggle word wrap Toggle overflow oc -n openshift-machine-api get bmh openshift-worker-<num>
$ oc -n openshift-machine-api get bmh openshift-worker-<num>
Where
<num>
is the worker node number. The STATE changes fromready
toprovisioning
.Copy to Clipboard Copied! Toggle word wrap Toggle overflow NAME STATE CONSUMER ONLINE ERROR openshift-worker-<num> provisioning openshift-worker-<num>-65tjz true
NAME STATE CONSUMER ONLINE ERROR openshift-worker-<num> provisioning openshift-worker-<num>-65tjz true
The
provisioning
status remains until the OpenShift Container Platform cluster provisions the node. This can take 30 minutes or more. After the node is provisioned, the state will change toprovisioned
.Copy to Clipboard Copied! Toggle word wrap Toggle overflow NAME STATE CONSUMER ONLINE ERROR openshift-worker-<num> provisioned openshift-worker-<num>-65tjz true
NAME STATE CONSUMER ONLINE ERROR openshift-worker-<num> provisioned openshift-worker-<num>-65tjz true
After provisioning completes, ensure the bare metal node is ready.
Copy to Clipboard Copied! Toggle word wrap Toggle overflow oc get nodes
$ oc get nodes
Copy to Clipboard Copied! Toggle word wrap Toggle overflow NAME STATUS ROLES AGE VERSION openshift-master-1.openshift.example.com Ready master 30h v1.25.0 openshift-master-2.openshift.example.com Ready master 30h v1.25.0 openshift-master-3.openshift.example.com Ready master 30h v1.25.0 openshift-worker-0.openshift.example.com Ready worker 30h v1.25.0 openshift-worker-1.openshift.example.com Ready worker 30h v1.25.0 openshift-worker-<num>.openshift.example.com Ready worker 3m27s v1.25.0
NAME STATUS ROLES AGE VERSION openshift-master-1.openshift.example.com Ready master 30h v1.25.0 openshift-master-2.openshift.example.com Ready master 30h v1.25.0 openshift-master-3.openshift.example.com Ready master 30h v1.25.0 openshift-worker-0.openshift.example.com Ready worker 30h v1.25.0 openshift-worker-1.openshift.example.com Ready worker 30h v1.25.0 openshift-worker-<num>.openshift.example.com Ready worker 3m27s v1.25.0
You can also check the kubelet.
Copy to Clipboard Copied! Toggle word wrap Toggle overflow ssh openshift-worker-<num>
$ ssh openshift-worker-<num>
Copy to Clipboard Copied! Toggle word wrap Toggle overflow [kni@openshift-worker-<num>]$ journalctl -fu kubelet
[kni@openshift-worker-<num>]$ journalctl -fu kubelet