Chapter 5. Scaling a user-provisioned cluster with the Bare Metal Operator
After deploying a user-provisioned infrastructure cluster, you can use the Bare Metal Operator (BMO) and other metal3 components to scale bare-metal hosts in the cluster. This approach helps you to scale a user-provisioned cluster in a more automated way.
5.1. About scaling a user-provisioned cluster with the Bare Metal Operator
You can scale user-provisioned infrastructure clusters by using the Bare Metal Operator (BMO) and other metal3 components. User-provisioned infrastructure installations do not feature the Machine API Operator. The Machine API Operator typically manages the lifecycle of bare-metal nodes in a cluster. However, it is possible to use the BMO and other metal3 components to scale nodes in user-provisioned clusters without requiring the Machine API Operator.
5.1.1. Prerequisites for scaling a user-provisioned cluster
- You installed a user-provisioned infrastructure cluster on bare metal.
- You have baseboard management controller (BMC) access to the hosts.
5.1.2. Limitations for scaling a user-provisioned cluster
- You cannot use a provisioning network to scale user-provisioned infrastructure clusters by using the Bare Metal Operator (BMO).
  - Consequently, you can only use bare-metal host drivers that support virtual media network booting, for example redfish-virtualmedia and idrac-virtualmedia.
- You cannot scale MachineSet objects in user-provisioned infrastructure clusters by using the BMO.
5.2. Configuring a provisioning resource to scale user-provisioned clusters
Create a Provisioning custom resource (CR) to enable Metal platform components on a user-provisioned infrastructure cluster.
Prerequisites
- You installed a user-provisioned infrastructure cluster on bare metal.
Procedure
Create a Provisioning CR.
Save the following YAML in the provisioning.yaml file:
Note: OpenShift Container Platform 4.13 does not support enabling a provisioning network when you scale a user-provisioned cluster by using the Bare Metal Operator.
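The YAML listing for this step does not appear in the text above. As a hedged sketch only, a Provisioning CR that keeps the provisioning network disabled, consistent with the note, might be written as follows. The resource name provisioning-configuration matches the example output of the create command in this procedure; the provisioningNetwork and watchAllNamespaces values are assumptions and should be checked against the product documentation for your release.

```shell
# Write a minimal Provisioning CR to provisioning.yaml.
# Assumption: the spec values below are illustrative and were not taken
# from the original listing.
cat > provisioning.yaml <<'EOF'
apiVersion: metal3.io/v1alpha1
kind: Provisioning
metadata:
  name: provisioning-configuration
spec:
  provisioningNetwork: "Disabled"
  watchAllNamespaces: false
EOF
```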
Create the Provisioning CR by running the following command:
$ oc create -f provisioning.yaml
Example output
provisioning.metal3.io/provisioning-configuration created
Verification
Verify that the provisioning service is running by running the following command:
$ oc get pods -n openshift-machine-api
5.3. Provisioning new hosts in a user-provisioned cluster by using the BMO
You can use the Bare Metal Operator (BMO) to provision bare-metal hosts in a user-provisioned cluster by creating a BareMetalHost custom resource (CR).
Provisioning bare-metal hosts to the cluster by using the BMO sets the spec.externallyProvisioned specification in the BareMetalHost custom resource to false by default. Do not set the spec.externallyProvisioned specification to true, because this setting results in unexpected behavior.
Prerequisites
- You created a user-provisioned bare-metal cluster.
- You have baseboard management controller (BMC) access to the hosts.
- You deployed a provisioning service in the cluster by creating a Provisioning CR.
Procedure
Create a configuration file for the bare-metal node. Depending on whether you use a static configuration or a DHCP server, choose one of the following example bmh.yaml files and replace the values in the YAML to match your environment:
To deploy with a static configuration, create the following bmh.yaml file:
- 1: Replace all instances of <num> with a unique compute node number for the bare-metal nodes in the name, credentialsName, and preprovisioningNetworkDataName fields.
- 2: Add the NMState YAML syntax to configure the host interfaces. To configure the network interface for a newly created node, specify the name of the secret that has the network configuration. Follow the nmstate syntax to define the network configuration for your node. See "Preparing the bare-metal node" for details on configuring NMState syntax.
- 3: Optional: If you have configured the network interface with nmstate, and you want to disable an interface, set state: up with the IP addresses set to enabled: false.
- 4: Replace <nic1_name> with the name of the bare-metal node's first network interface controller (NIC).
- 5: Replace <ip_address> with the IP address of the bare-metal node's NIC.
- 6: Replace <dns_ip_address> with the IP address of the bare-metal node's DNS resolver.
- 7: Replace <next_hop_ip_address> with the IP address of the bare-metal node's external gateway.
- 8: Replace <next_hop_nic1_name> with the name of the bare-metal node's external gateway.
- 9: Replace <base64_of_uid> and <base64_of_pwd> with the base64-encoded strings of the user name and password.
- 10: Replace <nic1_mac_address> with the MAC address of the bare-metal node's first NIC. See the "BMC addressing" section for additional BMC configuration options.
- 11: Replace <protocol> with the BMC protocol, such as IPMI, Redfish, or others. Replace <bmc_url> with the URL of the bare-metal node's baseboard management controller.
- 12: Optional: Replace <root_device_hint> with a device path when specifying a root device hint. See "Root device hints" for additional details.
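The <base64_of_uid> and <base64_of_pwd> placeholders expect base64-encoded values. A minimal sketch of producing them from a shell, assuming the example user name admin and password s3cret (both hypothetical). The sketch uses printf rather than echo so that no trailing newline is encoded into the credential.

```shell
# Encode BMC credentials for the data fields of the Secret.
# "admin" and "s3cret" are hypothetical example values.
bmc_user_b64=$(printf '%s' 'admin' | base64)
bmc_pass_b64=$(printf '%s' 's3cret' | base64)

# Print the two lines in the form used under "data:" in a Secret.
echo "username: ${bmc_user_b64}"
echo "password: ${bmc_pass_b64}"
```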
When configuring the network interface with a static configuration by using nmstate, set state: up with the IP addresses set to enabled: false:
To deploy with a DHCP configuration, create the following bmh.yaml file:
- 1: Replace <num> with a unique compute node number for the bare-metal nodes in the name and credentialsName fields.
- 2: Replace <base64_of_uid> and <base64_of_pwd> with the base64-encoded strings of the user name and password.
- 3: Replace <nic1_mac_address> with the MAC address of the bare-metal node's first NIC. See the "BMC addressing" section for additional BMC configuration options.
- 4: Replace <protocol> with the BMC protocol, such as IPMI, Redfish, or others. Replace <bmc_url> with the URL of the bare-metal node's baseboard management controller.
- 5: Optional: Replace <root_device_hint> with a device path when specifying a root device hint. See "Root device hints" for additional details.
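The DHCP listing itself does not appear in the text above. The following is a rough sketch, not the original file, of how a Secret and a BareMetalHost for a DHCP-configured host could be generated from the values that the callouts describe. The function name render_bmh_dhcp is hypothetical, the userData secret name is an assumption, and optional fields such as the root device hint are left out. The customDeploy method install_coreos and the object names follow the names used elsewhere in this chapter.

```shell
# Print a Secret and a BareMetalHost manifest for a DHCP-configured host.
# Arguments: <num> <nic1_mac_address> <bmc_address> <base64_of_uid> <base64_of_pwd>
# Assumption: the userData secret name below may differ in your cluster.
render_bmh_dhcp() {
  num=$1
  mac=$2
  bmc=$3
  user_b64=$4
  pass_b64=$5
  cat <<EOF
---
apiVersion: v1
kind: Secret
metadata:
  name: openshift-worker-${num}-bmc-secret
  namespace: openshift-machine-api
type: Opaque
data:
  username: ${user_b64}
  password: ${pass_b64}
---
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
  name: openshift-worker-${num}
  namespace: openshift-machine-api
spec:
  online: true
  bootMACAddress: "${mac}"
  bmc:
    address: "${bmc}"
    credentialsName: openshift-worker-${num}-bmc-secret
  customDeploy:
    method: install_coreos
  userData:
    name: worker-user-data-managed
    namespace: openshift-machine-api
EOF
}

# Example (all values hypothetical):
#   render_bmh_dhcp 0 52:54:00:aa:bb:cc \
#     redfish-virtualmedia://192.0.2.10/redfish/v1/Systems/1 \
#     YWRtaW4= czNjcmV0 > bmh.yaml
```

The MAC address and BMC address are quoted in the output so that a YAML parser always reads them as strings.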
Important: If the MAC address of an existing bare-metal node matches the MAC address of the bare-metal host that you are attempting to provision, then the installation will fail. If the host enrollment, inspection, cleaning, or other steps fail, the Bare Metal Operator retries the installation continuously. See "Diagnosing a duplicate MAC address when provisioning a new host in the cluster" for additional details.
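One way to check for a MAC address collision before creating the host is to list the boot MAC addresses of the existing BareMetalHost objects and compare them with the new one. The sketch below is an illustration under assumptions, not the procedure from the referenced section: the helper name find_mac_owner is hypothetical, and it expects tab-separated "name, MAC" lines on standard input.

```shell
# Read "name<TAB>mac" lines on stdin and print the names whose MAC address
# matches the first argument, ignoring letter case.
find_mac_owner() {
  awk -F '\t' -v mac="$1" 'tolower($2) == tolower(mac) { print $1 }'
}

# Example with cluster access (not run here):
#   oc get bmh -n openshift-machine-api \
#     -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.bootMACAddress}{"\n"}{end}' \
#     | find_mac_owner '52:54:00:AA:BB:CC'
```

If the helper prints a host name, that host already uses the MAC address.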
Create the bare-metal node by running the following command:
$ oc create -f bmh.yaml
Example output
secret/openshift-worker-<num>-network-config-secret created
secret/openshift-worker-<num>-bmc-secret created
baremetalhost.metal3.io/openshift-worker-<num> created
Inspect the bare-metal node by running the following command:
$ oc -n openshift-machine-api get bmh openshift-worker-<num>
where:
- <num>: Specifies the compute node number.
Example output
NAME                     STATE         CONSUMER   ONLINE   ERROR
openshift-worker-<num>   provisioned              true
Approve all certificate signing requests (CSRs).
Get the list of pending CSRs by running the following command:
$ oc get csr
Example output
NAME        AGE   SIGNERNAME                                    REQUESTOR                                                                   REQUESTEDDURATION   CONDITION
csr-gfm9f   33s   kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   <none>              Pending
Approve the CSR by running the following command:
$ oc adm certificate approve <csr_name>
Example output
certificatesigningrequest.certificates.k8s.io/<csr_name> approved
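When several CSRs are pending, approving them one at a time is tedious. The following sketch filters the pending names out of the oc get csr table so that they can be passed to the approve command in one go. The helper name pending_csrs is hypothetical, and it assumes the default table layout shown above, where CONDITION is the last column.

```shell
# Read `oc get csr` table output on stdin and print the names of CSRs
# whose last column (CONDITION) is exactly "Pending".
pending_csrs() {
  awk 'NR > 1 && $NF == "Pending" { print $1 }'
}

# Example with cluster access (not run here); -r stops xargs from running
# the approve command when there is nothing to approve:
#   oc get csr | pending_csrs | xargs -r oc adm certificate approve
```

A new node usually submits a further CSR after its first one is approved, so the pipeline might need to be run more than once until no requests remain pending.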
Verification
Verify that the node is ready by running the following command:
$ oc get nodes
Example output
NAME          STATUS   ROLES           AGE     VERSION
app1          Ready    worker          47s     v1.24.0+dc5a2fd
controller1   Ready    master,worker   2d22h   v1.24.0+dc5a2fd
5.4. Optional: Managing existing hosts in a user-provisioned cluster by using the BMO
Optionally, you can use the Bare Metal Operator (BMO) to manage existing bare-metal controller hosts in a user-provisioned cluster by creating a BareMetalHost object for the existing host. It is not a requirement to manage existing user-provisioned hosts; however, you can enroll them as externally-provisioned hosts for inventory purposes.
To manage existing hosts by using the BMO, you must set the spec.externallyProvisioned specification in the BareMetalHost custom resource to true to prevent the BMO from re-provisioning the host.
Prerequisites
- You created a user-provisioned bare-metal cluster.
- You have baseboard management controller (BMC) access to the hosts.
- You deployed a provisioning service in the cluster by creating a Provisioning CR.
Procedure
Create the Secret CR and the BareMetalHost CR.
Save the following YAML in the controller.yaml file:
Create the bare-metal host object by running the following command:
$ oc create -f controller.yaml
Example output
secret/controller1-bmc created
baremetalhost.metal3.io/controller1 created
Verification
Verify that the BMO created the bare-metal host object by running the following command:
$ oc get bmh -A
Example output
NAMESPACE               NAME          STATE                    CONSUMER   ONLINE   ERROR   AGE
openshift-machine-api   controller1   externally provisioned              true             13s
5.5. Removing hosts from a user-provisioned cluster by using the BMO
You can use the Bare Metal Operator (BMO) to remove bare-metal hosts from a user-provisioned cluster.
Prerequisites
- You created a user-provisioned bare-metal cluster.
- You have baseboard management controller (BMC) access to the hosts.
- You deployed a provisioning service in the cluster by creating a Provisioning CR.
Procedure
Cordon and drain the node by running the following command:
$ oc adm drain app1 --force --ignore-daemonsets=true
Delete the customDeploy specification from the BareMetalHost CR.
Edit the BareMetalHost CR for the host by running the following command:
$ oc edit bmh -n openshift-machine-api <host_name>
Delete the lines spec.customDeploy and spec.customDeploy.method:
...
  customDeploy:
    method: install_coreos
Verify that the provisioning state of the host changes to deprovisioning by running the following command:
$ oc get bmh -A
Example output
NAMESPACE               NAME          STATE                    CONSUMER   ONLINE   ERROR   AGE
openshift-machine-api   controller1   externally provisioned              true             58m
openshift-machine-api   worker1       deprovisioning                      true             57m
Delete the host by running the following command when the BareMetalHost state changes to available:
$ oc delete bmh -n openshift-machine-api <bmh_name>
Note: You can run this step without having to edit the BareMetalHost CR. It might take some time for the BareMetalHost state to change from deprovisioning to available.
Delete the node by running the following command:
$ oc delete node <node_name>
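Because the change from deprovisioning to available can take a while, the wait before deleting the host can be scripted. The following is a minimal polling sketch under stated assumptions: the helper name wait_for_state is hypothetical, and the commented example assumes that the provisioning state is exposed in the status.provisioning.state field of the BareMetalHost object.

```shell
# Poll a command until it prints the wanted state or the attempts run out.
# Usage: wait_for_state <wanted_state> <max_attempts> <sleep_seconds> <command...>
# Returns 0 when the state was seen, 1 otherwise.
wait_for_state() {
  wanted=$1
  attempts=$2
  delay=$3
  shift 3
  i=0
  while [ "$i" -lt "$attempts" ]; do
    current=$("$@" 2>/dev/null)
    if [ "$current" = "$wanted" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example with cluster access (not run here); <bmh_name> is a placeholder:
#   wait_for_state available 60 10 \
#     oc get bmh -n openshift-machine-api <bmh_name> \
#     -o jsonpath='{.status.provisioning.state}' \
#     && oc delete bmh -n openshift-machine-api <bmh_name>
```

Passing the state query as a command keeps the helper independent of oc, so the same loop can wait for other states, such as provisioned when adding a host.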
Verification
Verify that you deleted the node by running the following command:
$ oc get nodes
Example output
NAME          STATUS   ROLES           AGE     VERSION
controller1   Ready    master,worker   2d23h   v1.24.0+dc5a2fd