Chapter 5. Postinstallation machine configuration tasks
There are times when you need to make changes to the operating systems running on OpenShift Container Platform nodes. This can include changing settings for network time service, adding kernel arguments, or configuring journaling in a specific way.
Aside from a few specialized features, most changes to operating systems on OpenShift Container Platform nodes can be done by creating what are referred to as MachineConfig objects that are managed by the Machine Config Operator.
Tasks in this section describe how to use features of the Machine Config Operator to configure operating system features on OpenShift Container Platform nodes.
NetworkManager stores new network configurations in /etc/NetworkManager/system-connections/ in a key file format.
Previously, NetworkManager stored new network configurations in /etc/sysconfig/network-scripts/ in the ifcfg format. Starting with RHEL 9.0, RHEL stores new network configurations in /etc/NetworkManager/system-connections/ in a key file format. Connection configurations stored in /etc/sysconfig/network-scripts/ in the old format still work uninterrupted. Modifications to existing profiles continue to update the older files.
5.1. About the Machine Config Operator
OpenShift Container Platform 4.14 integrates both operating system and cluster management. Because the cluster manages its own updates, including updates to Red Hat Enterprise Linux CoreOS (RHCOS) on cluster nodes, OpenShift Container Platform provides an opinionated lifecycle management experience that simplifies the orchestration of node upgrades.
OpenShift Container Platform employs three daemon sets and controllers to simplify node management. These daemon sets orchestrate operating system updates and configuration changes to the hosts by using standard Kubernetes-style constructs. They include:
- The machine-config-controller, which coordinates machine upgrades from the control plane. It monitors all of the cluster nodes and orchestrates their configuration updates.
- The machine-config-daemon daemon set, which runs on each node in the cluster and updates a machine to the configuration defined by machine config, as instructed by the MachineConfigController. When the node detects a change, it drains off its pods, applies the update, and reboots. These changes come in the form of Ignition configuration files that apply the specified machine configuration and control kubelet configuration. The update itself is delivered in a container. This process is key to the success of managing OpenShift Container Platform and RHCOS updates together.
- The machine-config-server daemon set, which provides the Ignition config files to control plane nodes as they join the cluster.
The machine configuration is a subset of the Ignition configuration. The machine-config-daemon reads the machine configuration to see if it needs to do an OSTree update or if it must apply a series of systemd kubelet file changes, configuration changes, or other changes to the operating system or OpenShift Container Platform configuration.
When you perform node management operations, you create or modify a KubeletConfig custom resource (CR).
When changes are made to a machine configuration, the Machine Config Operator (MCO) automatically reboots all corresponding nodes in order for the changes to take effect.
To prevent the nodes from automatically rebooting after machine configuration changes, before making the changes, you must pause the autoreboot process by setting the spec.paused field to true in the corresponding machine config pool. When paused, machine configuration changes are not applied until you set the spec.paused field to false and the nodes have rebooted into the new configuration.
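For example, you could pause and later resume the worker pool with oc patch commands similar to the following sketch; the pool name is a placeholder for whichever pool you want to hold:
$ oc patch machineconfigpool/worker --type merge --patch '{"spec":{"paused":true}}'
$ oc patch machineconfigpool/worker --type merge --patch '{"spec":{"paused":false}}'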
When the MCO detects any of the following changes, it applies the update without draining or rebooting the node:
- Changes to the SSH key in the spec.config.passwd.users.sshAuthorizedKeys parameter of a machine config.
- Changes to the global pull secret or pull secret in the openshift-config namespace.
- Automatic rotation of the /etc/kubernetes/kubelet-ca.crt certificate authority (CA) by the Kubernetes API Server Operator.
When the MCO detects changes to the /etc/containers/registries.conf file, such as editing an ImageDigestMirrorSet, ImageTagMirrorSet, or ImageContentSourcePolicy object, it drains the corresponding nodes, applies the changes, and uncordons the nodes. The node drain does not happen for the following changes:
- The addition of a registry with the pull-from-mirror = "digest-only" parameter set for each mirror.
- The addition of a mirror with the pull-from-mirror = "digest-only" parameter set in a registry.
- The addition of items to the unqualified-search-registries list.
There might be situations where the configuration on a node does not fully match what the currently-applied machine config specifies. This state is called configuration drift. The Machine Config Daemon (MCD) regularly checks the nodes for configuration drift. If the MCD detects configuration drift, the MCO marks the node degraded until an administrator corrects the node configuration. A degraded node is online and operational, but it cannot be updated.
5.1.1. Machine Config overview
The Machine Config Operator (MCO) manages updates to systemd, CRI-O and Kubelet, the kernel, Network Manager and other system features. It also offers a MachineConfig CRD that can write configuration files onto the host (see machine-config-operator). Understanding what the MCO does and how it interacts with other components is critical to making advanced, system-level changes to an OpenShift Container Platform cluster. Here are some things you should know about the MCO, machine configs, and how they are used:
- Machine configs are processed alphabetically, in lexicographically increasing order of their name. The render controller uses the first machine config in the list as the base and appends the rest to the base machine config.
- A machine config can make a specific change to a file or service on the operating system of each system representing a pool of OpenShift Container Platform nodes.
MCO applies changes to operating systems in pools of machines. All OpenShift Container Platform clusters start with worker and control plane node pools. By adding more role labels, you can configure custom pools of nodes. For example, you can set up a custom pool of worker nodes that includes particular hardware features needed by an application. However, examples in this section focus on changes to the default pool types.
Important
A node can have multiple labels applied that indicate its type, such as master or worker; however, it can be a member of only a single machine config pool.
- After a machine config change, the MCO updates the affected nodes alphabetically by zone, based on the topology.kubernetes.io/zone label. If a zone has more than one node, the oldest nodes are updated first. For nodes that do not use zones, such as in bare metal deployments, the nodes are upgraded by age, with the oldest nodes updated first. The MCO updates the number of nodes as specified by the maxUnavailable field on the machine configuration pool at a time.
- Some machine configuration must be in place before OpenShift Container Platform is installed to disk. In most cases, this can be accomplished by creating a machine config that is injected directly into the OpenShift Container Platform installer process, instead of running as a postinstallation machine config. In other cases, you might need to do bare metal installation where you pass kernel arguments at OpenShift Container Platform installer startup, to do such things as setting per-node individual IP addresses or advanced disk partitioning.
- MCO manages items that are set in machine configs. Manual changes you make to your systems are not overwritten by the MCO, unless the MCO is explicitly told to manage a conflicting file. In other words, the MCO makes only the specific updates you request; it does not claim control over the whole node.
- Manual changes to nodes are strongly discouraged. If you need to decommission a node and start a new one, those direct changes would be lost.
- MCO is supported for writing to files only in the /etc and /var directories, although there are symbolic links to some directories that can be writeable by being symbolically linked to one of those areas. The /opt and /usr/local directories are examples.
- Ignition is the configuration format used in MachineConfigs. See the Ignition Configuration Specification v3.4.0 for details.
- Although Ignition config settings can be delivered directly at OpenShift Container Platform installation time, and are formatted in the same way that MCO delivers Ignition configs, MCO has no way of seeing what those original Ignition configs are. Therefore, you should wrap Ignition config settings into a machine config before deploying them.
- When a file managed by MCO changes outside of MCO, the Machine Config Daemon (MCD) sets the node as degraded. It will not overwrite the offending file, however, and should continue to operate in a degraded state.
- A key reason for using a machine config is that it will be applied when you spin up new nodes for a pool in your OpenShift Container Platform cluster. The machine-api-operator provisions a new machine and MCO configures it.
MCO uses Ignition as the configuration format. OpenShift Container Platform 4.6 moved from Ignition config specification version 2 to version 3.
5.1.1.1. What can you change with machine configs?
The kinds of components that MCO can change include:
- config: Create Ignition config objects (see the Ignition configuration specification) to do things like modify files, systemd services, and other features on OpenShift Container Platform machines, including:
- Configuration files: Create or overwrite files in the /var or /etc directory.
- systemd units: Create and set the status of a systemd service or add to an existing systemd service by dropping in additional settings.
- users and groups: Change SSH keys in the passwd section postinstallation.
Important
- Changing SSH keys by using a machine config is supported only for the core user.
- Adding new users by using a machine config is not supported.
- kernelArguments: Add arguments to the kernel command line when OpenShift Container Platform nodes boot.
- kernelType: Optionally identify a non-standard kernel to use instead of the standard kernel. Use realtime to use the RT kernel (for RAN). This is supported only on select platforms.
- fips: Enable FIPS mode. FIPS should be set at installation time and is not a postinstallation procedure.
To enable FIPS mode for your cluster, you must run the installation program from a Red Hat Enterprise Linux (RHEL) computer configured to operate in FIPS mode. For more information about configuring FIPS mode on RHEL, see Installing the system in FIPS mode. When running Red Hat Enterprise Linux (RHEL) or Red Hat Enterprise Linux CoreOS (RHCOS) booted in FIPS mode, OpenShift Container Platform core components use the RHEL cryptographic libraries that have been submitted to NIST for FIPS 140-2/140-3 Validation on only the x86_64, ppc64le, and s390x architectures.
- extensions: Extend RHCOS features by adding selected pre-packaged software. For this feature, available extensions include usbguard and kernel modules.
- Custom resources (for ContainerRuntime and Kubelet): Outside of machine configs, MCO manages two special custom resources for modifying CRI-O container runtime settings (ContainerRuntime CR) and the Kubelet service (Kubelet CR).
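As a concrete illustration of the config category, a minimal MachineConfig that writes a single file to a worker node might look like the following sketch; the file path and contents here are placeholders, not part of any shipped configuration:
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker  # target the worker pool
  name: 99-worker-example-file
spec:
  config:
    ignition:
      version: 3.4.0
    storage:
      files:
      - path: /etc/example.conf        # hypothetical file to create
        mode: 420                      # decimal for octal 0644
        overwrite: true
        contents:
          source: data:,example%20setting%3Dtrue  # URL-encoded file contents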
The MCO is not the only Operator that can change operating system components on OpenShift Container Platform nodes. Other Operators can modify operating system-level features as well. One example is the Node Tuning Operator, which allows you to do node-level tuning through Tuned daemon profiles.
Tasks for the MCO configuration that can be done postinstallation are included in the following procedures. See descriptions of RHCOS bare metal installation for system configuration tasks that must be done during or before OpenShift Container Platform installation.
There might be situations where the configuration on a node does not fully match what the currently-applied machine config specifies. This state is called configuration drift. The Machine Config Daemon (MCD) regularly checks the nodes for configuration drift. If the MCD detects configuration drift, the MCO marks the node degraded until an administrator corrects the node configuration. A degraded node is online and operational, but it cannot be updated. For more information on configuration drift, see Understanding configuration drift detection.
5.1.1.2. Project
See the openshift-machine-config-operator GitHub site for details.
5.1.2. Understanding the Machine Config Operator node drain behavior
When you use a machine config to change a system feature, such as adding new config files, modifying systemd units or kernel arguments, or updating SSH keys, the Machine Config Operator (MCO) applies those changes and ensures that each node is in the desired configuration state.
After you make the changes, the MCO generates a new rendered machine config. In the majority of cases, when applying the new rendered machine config, the Operator performs the following steps on each affected node until all of the affected nodes have the updated configuration:
- Cordon. The MCO marks the node as not schedulable for additional workloads.
- Drain. The MCO terminates all running workloads on the node, causing the workloads to be rescheduled onto other nodes.
- Apply. The MCO writes the new configuration to the nodes as needed.
- Reboot. The MCO restarts the node.
- Uncordon. The MCO marks the node as schedulable for workloads.
Throughout this process, the MCO maintains the required number of pods based on the MaxUnavailable value set in the machine config pool.
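For example, if your pool can tolerate more simultaneous disruption, you could allow two nodes to update at a time by patching the pool's maxUnavailable field; the value shown is illustrative only:
$ oc patch machineconfigpool/worker --type merge --patch '{"spec":{"maxUnavailable":2}}'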
There are conditions which can prevent the MCO from draining a node. If the MCO fails to drain a node, the Operator will be unable to reboot the node, preventing any changes made to the node through a machine config. For more information and mitigation steps, see the MCCDrainError runbook.
If the MCO drains pods on the master node, note the following conditions:
- In single-node OpenShift clusters, the MCO skips the drain operation.
- The MCO does not drain static pods in order to prevent interference with services, such as etcd.
In certain cases the nodes are not drained. For more information, see "About the Machine Config Operator."
You can mitigate the disruption caused by drain and reboot cycles by disabling control plane reboots. For more information, see "Disabling the Machine Config Operator from automatically rebooting."
5.1.3. Understanding configuration drift detection
There might be situations when the on-disk state of a node differs from what is configured in the machine config. This is known as configuration drift. For example, a cluster admin might manually modify a file, a systemd unit file, or a file permission that was configured through a machine config. This causes configuration drift. Configuration drift can cause problems between nodes in a Machine Config Pool or when the machine configs are updated.
The Machine Config Operator (MCO) uses the Machine Config Daemon (MCD) to check nodes for configuration drift on a regular basis. If detected, the MCO sets the node and the machine config pool (MCP) to Degraded and reports the error. A degraded node is online and operational, but it cannot be updated.
The MCD performs configuration drift detection upon each of the following conditions:
- When a node boots.
- After any of the files (Ignition files and systemd drop-in units) specified in the machine config are modified outside of the machine config.
- Before a new machine config is applied.
Note
If you apply a new machine config to the nodes, the MCD temporarily shuts down configuration drift detection. This shutdown is needed because the new machine config necessarily differs from the machine config on the nodes. After the new machine config is applied, the MCD restarts detecting configuration drift using the new machine config.
When performing configuration drift detection, the MCD validates that the file contents and permissions fully match what the currently-applied machine config specifies. Typically, the MCD detects configuration drift in less than a second after the detection is triggered.
If the MCD detects configuration drift, the MCD performs the following tasks:
- Emits an error to the console logs
- Emits a Kubernetes event
- Stops further detection on the node
- Sets the node and MCP to degraded
You can check if you have a degraded node by listing the MCPs:
$ oc get mcp worker
If you have a degraded MCP, the DEGRADEDMACHINECOUNT field is non-zero, similar to the following output:
Example output
NAME     CONFIG                                           UPDATED  UPDATING  DEGRADED  MACHINECOUNT  READYMACHINECOUNT  UPDATEDMACHINECOUNT  DEGRADEDMACHINECOUNT  AGE
worker   rendered-worker-404caf3180818d8ac1f50c32f14b57c3 False    True      True      2             1                  1                    1                     5h51m
You can determine if the problem is caused by configuration drift by examining the machine config pool:
$ oc describe mcp worker
Example output
Or, if you know which node is degraded, examine that node:
$ oc describe node/ci-ln-j4h8nkb-72292-pxqxz-worker-a-fjks4
Example output
- 1: The error message indicating that configuration drift was detected between the node and the listed machine config. Here the error message indicates that the contents of the /etc/mco-test-file file, which was added by the machine config, changed outside of the machine config.
- 2: The state of the node is Degraded.
You can correct configuration drift and return the node to the Ready state by performing one of the following remediations:
- Ensure that the contents and file permissions of the files on the node match what is configured in the machine config. You can manually rewrite the file contents or change the file permissions.
- Generate a force file on the degraded node. The force file causes the MCD to bypass the usual configuration drift detection and reapply the current machine config.
Note
Generating a force file on a node causes that node to reboot.
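One way to create the force file, assuming you can reach the node with oc debug, is the following sketch; the node name is a placeholder:
$ oc debug node/<node_name>
sh-4.4# chroot /host
sh-4.4# touch /run/machine-config-daemon-force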
5.1.4. Checking machine config pool status
To see the status of the Machine Config Operator (MCO), its sub-components, and the resources it manages, use the following oc commands:
Procedure
To see the number of MCO-managed nodes available on your cluster for each machine config pool (MCP), run the following command:
$ oc get machineconfigpool
Example output
NAME     CONFIG                   UPDATED  UPDATING  DEGRADED  MACHINECOUNT  READYMACHINECOUNT  UPDATEDMACHINECOUNT  DEGRADEDMACHINECOUNT  AGE
master   rendered-master-06c9c4…  True     False     False     3             3                  3                    0                     4h42m
worker   rendered-worker-f4b64…   False    True      False     3             2                  2                    0                     4h42m
where:
- UPDATED: The True status indicates that the MCO has applied the current machine config to the nodes in that MCP. The current machine config is specified in the CONFIG field in the oc get mcp output. The False status indicates a node in the MCP is updating.
- UPDATING: The True status indicates that the MCO is applying the desired machine config, as specified in the MachineConfigPool custom resource, to at least one of the nodes in that MCP. The desired machine config is the new, edited machine config. Nodes that are updating might not be available for scheduling. The False status indicates that all nodes in the MCP are updated.
- DEGRADED: A True status indicates the MCO is blocked from applying the current or desired machine config to at least one of the nodes in that MCP, or the configuration is failing. Nodes that are degraded might not be available for scheduling. A False status indicates that all nodes in the MCP are ready.
- MACHINECOUNT: Indicates the total number of machines in that MCP.
- READYMACHINECOUNT: Indicates the number of machines that are both running the current machine config and ready for scheduling. This count is always less than or equal to the UPDATEDMACHINECOUNT number.
- UPDATEDMACHINECOUNT: Indicates the total number of machines in that MCP that have the current machine config.
- DEGRADEDMACHINECOUNT: Indicates the total number of machines in that MCP that are marked as degraded or unreconcilable.
In the previous output, there are three control plane (master) nodes and three worker nodes. The control plane MCP and the associated nodes are updated to the current machine config. The nodes in the worker MCP are being updated to the desired machine config. Two of the nodes in the worker MCP are updated and one is still updating, as indicated by the UPDATEDMACHINECOUNT being 2. There are no issues, as indicated by the DEGRADEDMACHINECOUNT being 0 and DEGRADED being False.
While the nodes in the MCP are updating, the machine config listed under CONFIG is the current machine config, which the MCP is being updated from. When the update is complete, the listed machine config is the desired machine config, which the MCP was updated to.
Note
If a node is being cordoned, that node is not included in the READYMACHINECOUNT, but is included in the MACHINECOUNT. Also, the MCP status is set to UPDATING. Because the node has the current machine config, it is counted in the UPDATEDMACHINECOUNT total:
Example output
NAME     CONFIG                   UPDATED  UPDATING  DEGRADED  MACHINECOUNT  READYMACHINECOUNT  UPDATEDMACHINECOUNT  DEGRADEDMACHINECOUNT  AGE
master   rendered-master-06c9c4…  True     False     False     3             3                  3                    0                     4h42m
worker   rendered-worker-c1b41a…  False    True      False     3             2                  3                    0                     4h42m
To check the status of the nodes in an MCP by examining the MachineConfigPool custom resource, run the following command:
$ oc describe mcp worker
Example output
Note
If a node is being cordoned, the node is not included in the Ready Machine Count. It is included in the Unavailable Machine Count:
Example output
To see each existing MachineConfig object, run the following command:
$ oc get machineconfigs
Example output
Note that the MachineConfig objects listed as rendered are not meant to be changed or deleted.
To view the contents of a particular machine config (in this case, 01-master-kubelet), run the following command:
$ oc describe machineconfigs 01-master-kubelet
The output from the command shows that this MachineConfig object contains both configuration files (cloud.conf and kubelet.conf) and a systemd service (Kubernetes Kubelet):
Example output
If something goes wrong with a machine config that you apply, you can always back out that change. For example, if you had run oc create -f ./myconfig.yaml to apply a machine config, you could remove that machine config by running the following command:
$ oc delete -f ./myconfig.yaml
If that was the only problem, the nodes in the affected pool should return to a non-degraded state. This actually causes the rendered configuration to roll back to its previously rendered state.
If you add your own machine configs to your cluster, you can use the commands shown in the previous example to check their status and the related status of the pool to which they are applied.
5.1.5. Viewing and interacting with certificates
The following certificates are handled in the cluster by the Machine Config Controller (MCC) and can be found in the ControllerConfig resource:
- /etc/kubernetes/kubelet-ca.crt
- /etc/kubernetes/static-pod-resources/configmaps/cloud-config/ca-bundle.pem
- /etc/pki/ca-trust/source/anchors/openshift-config-user-ca-bundle.crt
The MCC also handles the image registry certificates and its associated user bundle certificate.
You can get information about the listed certificates, including the underlying bundle the certificate comes from, and the signing and subject data.
Prerequisites
- This procedure contains optional steps that require that the python-yq RPM package is installed.
Procedure
Get detailed certificate information by running the following command:
$ oc get controllerconfig/machine-config-controller -o yaml | yq -y '.status.controllerCertificates'
Example output
Get a simpler version of the information found in the ControllerConfig resource by checking the machine config pool status using the following command:
$ oc get mcp master -o yaml | yq -y '.status.certExpirys'
Example output
This method is meant for OpenShift Container Platform applications that already consume machine config pool information.
Check which image registry certificates are on the nodes:
Log in to a node:
$ oc debug node/<node_name>
Set /host as the root directory within the debug shell:
sh-5.1# chroot /host
Look at the contents of the /etc/docker/certs.d directory:
sh-5.1# ls /etc/docker/certs.d
Example output
image-registry.openshift-image-registry.svc.cluster.local:5000
image-registry.openshift-image-registry.svc:5000
5.2. Using MachineConfig objects to configure nodes
You can use the tasks in this section to create MachineConfig objects that modify files, systemd unit files, and other operating system features running on OpenShift Container Platform nodes. For more ideas on working with machine configs, see content related to updating SSH authorized keys, verifying image signatures, enabling SCTP, and configuring iSCSI initiator names for OpenShift Container Platform.
OpenShift Container Platform supports Ignition specification version 3.4. You should base all new machine configs you create going forward on Ignition specification version 3.4. If you are upgrading your OpenShift Container Platform cluster, any existing machine configs with a previous Ignition specification will be translated automatically to specification version 3.4.
There might be situations where the configuration on a node does not fully match what the currently-applied machine config specifies. This state is called configuration drift. The Machine Config Daemon (MCD) regularly checks the nodes for configuration drift. If the MCD detects configuration drift, the MCO marks the node degraded until an administrator corrects the node configuration. A degraded node is online and operational, but it cannot be updated. For more information on configuration drift, see Understanding configuration drift detection.
Use the following "Configuring chrony time service" procedure as a model for how to go about adding other configuration files to OpenShift Container Platform nodes.
5.2.1. Configuring chrony time service
You can set the time server and related settings used by the chrony time service (chronyd) by modifying the contents of the chrony.conf file and passing those contents to your nodes as a machine config.
Procedure
Create a Butane config including the contents of the chrony.conf file. For example, to configure chrony on worker nodes, create a 99-worker-chrony.bu file.
Note
The Butane version you specify in the config file should match the OpenShift Container Platform version and always ends in 0. For example, 4.14.0. See "Creating machine configs with Butane" for information about Butane.
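A Butane config of this shape, shown here as a sketch with the public 0.rhel.pool.ntp.org pool as a sample time source (the callout numbers correspond to the notes that follow), might look like the following:
variant: openshift
version: 4.14.0
metadata:
  name: 99-worker-chrony  # 1
  labels:
    machineconfiguration.openshift.io/role: worker  # 2
storage:
  files:
  - path: /etc/chrony.conf
    mode: 0644  # 3
    overwrite: true
    contents:
      inline: |
        pool 0.rhel.pool.ntp.org iburst  # 4
        driftfile /var/lib/chrony/drift
        makestep 1.0 3
        rtcsync
        logdir /var/log/chrony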
- 1, 2: On control plane nodes, substitute master for worker in both of these locations.
- 3: Specify an octal value mode for the mode field in the machine config file. After creating the file and applying the changes, the mode is converted to a decimal value. You can check the YAML file with the command oc get mc <mc-name> -o yaml.
- 4: Specify any valid, reachable time source, such as the one provided by your DHCP server.
Note
For all-machine to all-machine communication, the Network Time Protocol (NTP) on UDP is port 123. If an external NTP time server is configured, you must open UDP port 123.
Alternately, you can specify any of the following NTP servers: 1.rhel.pool.ntp.org, 2.rhel.pool.ntp.org, or 3.rhel.pool.ntp.org.
Use Butane to generate a MachineConfig object file, 99-worker-chrony.yaml, containing the configuration to be delivered to the nodes:
$ butane 99-worker-chrony.bu -o 99-worker-chrony.yaml
-
If the cluster is not running yet, after you generate manifest files, add the
MachineConfigobject file to the<installation_directory>/openshiftdirectory, and then continue to create the cluster. If the cluster is already running, apply the file:
oc apply -f ./99-worker-chrony.yaml
$ oc apply -f ./99-worker-chrony.yamlCopy to Clipboard Copied! Toggle word wrap Toggle overflow
5.2.2. Disabling the chrony time service
You can disable the chrony time service (chronyd) for nodes with a specific role by using a MachineConfig custom resource (CR).
Prerequisites
- Install the OpenShift CLI (oc).
- Log in as a user with cluster-admin privileges.
Procedure
Create the MachineConfig CR that disables chronyd for the specified node role.
Save the following YAML in the disable-chronyd.yaml file:
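A sketch of the shape this MachineConfig can take, assuming the standard systemd-unit mechanism for disabling a service, is shown below:
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: <node_role>  # 1
  name: disable-chronyd
spec:
  config:
    ignition:
      version: 3.4.0
    systemd:
      units:
      - name: chronyd.service
        enabled: false  # prevent chronyd from starting on the targeted nodes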
- 1: Node role where you want to disable chronyd, for example, master.
Create the MachineConfig CR by running the following command:
$ oc create -f disable-chronyd.yaml
5.2.3. Adding kernel arguments to nodes
In some special cases, you might want to add kernel arguments to a set of nodes in your cluster. This should only be done with caution and clear understanding of the implications of the arguments you set.
Improper use of kernel arguments can result in your systems becoming unbootable.
Examples of kernel arguments you could set include:
- nosmt: Disables symmetric multithreading (SMT) in the kernel. Multithreading allows multiple logical threads for each CPU. You could consider nosmt in multi-tenant environments to reduce risks from potential cross-thread attacks. By disabling SMT, you essentially choose security over performance.
- systemd.unified_cgroup_hierarchy: Enables Linux control group version 2 (cgroup v2). cgroup v2 is the next version of the kernel control group and offers multiple improvements.
- enforcing=0: Configures Security Enhanced Linux (SELinux) to run in permissive mode. In permissive mode, the system acts as if SELinux is enforcing the loaded security policy, including labeling objects and emitting access denial entries in the logs, but it does not actually deny any operations. While not supported for production systems, permissive mode can be helpful for debugging.
Warning
Disabling SELinux on RHCOS in production is not supported. Once SELinux has been disabled on a node, it must be re-provisioned before re-inclusion in a production cluster.
See Kernel.org kernel parameters for a list and descriptions of kernel arguments.
In the following procedure, you create a MachineConfig object that identifies:
- A set of machines to which you want to add the kernel argument. In this case, machines with a worker role.
- Kernel arguments that are appended to the end of the existing kernel arguments.
- A label that indicates where in the list of machine configs the change is applied.
Prerequisites
- Have administrative privilege to a working OpenShift Container Platform cluster.
Procedure
List existing MachineConfig objects for your OpenShift Container Platform cluster to determine how to label your machine config:
$ oc get MachineConfig
Example output
Create a MachineConfig object file that identifies the kernel argument (for example, 05-worker-kernelarg-selinuxpermissive.yaml):
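A minimal sketch of such a file, applying enforcing=0 to the worker pool, might look like the following:
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker  # apply to the worker pool
  name: 05-worker-kernelarg-selinuxpermissive
spec:
  kernelArguments:
    - enforcing=0  # boot with SELinux in permissive mode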
Create the new machine config:
$ oc create -f 05-worker-kernelarg-selinuxpermissive.yaml
Check the machine configs to see that the new one was added:
$ oc get MachineConfig
Example output
Check the nodes:
$ oc get nodes
Example output
You can see that scheduling on each worker node is disabled as the change is being applied.
Check that the kernel argument worked by going to one of the worker nodes and listing the kernel command-line arguments (in /proc/cmdline on the host):
$ oc debug node/ip-10-0-141-105.ec2.internal
Example output
You should see the enforcing=0 argument added to the other kernel arguments.
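From the debug shell, one way to list the arguments, assuming the debug pod mounts the host file system at /host, is:
sh-4.4# cat /host/proc/cmdline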
5.2.4. Enabling multipathing with kernel arguments on RHCOS
Red Hat Enterprise Linux CoreOS (RHCOS) supports multipathing on the primary disk, allowing stronger resilience to hardware failure to achieve higher host availability. Postinstallation support is available by activating multipathing via the machine config.
Enabling multipathing during installation is supported and recommended for nodes provisioned in OpenShift Container Platform. In setups where any I/O to non-optimized paths results in I/O system errors, you must enable multipathing at installation time. For more information about enabling multipathing during installation time, see "Enabling multipathing post installation" in the Installing on bare metal documentation.
On IBM Z® and IBM® LinuxONE, you can enable multipathing only if you configured your cluster for it during installation. For more information, see "Installing RHCOS and starting the OpenShift Container Platform bootstrap process" in Installing a cluster with z/VM on IBM Z® and IBM® LinuxONE.
When an OpenShift Container Platform cluster is installed or configured as a postinstallation activity on a single VIOS host with "vSCSI" storage on IBM Power® with multipath configured, the CoreOS nodes with multipath enabled fail to boot. This behavior is expected, as only one path is available to the node.
Prerequisites
- You have a running OpenShift Container Platform cluster.
- You are logged in to the cluster as a user with administrative privileges.
- You have confirmed that the disk is enabled for multipathing. Multipathing is only supported on hosts that are connected to a SAN via an HBA adapter.
Procedure
To enable multipathing postinstallation on control plane nodes:
Create a machine config file, such as 99-master-kargs-mpath.yaml, that instructs the cluster to add the master label and that identifies the multipath kernel argument, for example:
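A sketch of this machine config, using the documented multipath kernel arguments, might look like the following:
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: master
  name: 99-master-kargs-mpath
spec:
  kernelArguments:
    - 'rd.multipath=default'                    # enable the default multipath configuration
    - 'root=/dev/disk/by-label/dm-mpath-root'   # mount root from the multipath device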
To enable multipathing postinstallation on worker nodes:
Create a machine config file, such as 99-worker-kargs-mpath.yaml, that instructs the cluster to add the worker label and that identifies the multipath kernel argument. The file matches the control plane example, with worker substituted for master in the role label and file name.
Create the new machine config by using either the master or worker YAML file you previously created:
$ oc create -f ./99-worker-kargs-mpath.yaml
Check the machine configs to see that the new one was added:
$ oc get MachineConfig
Example output
Check the nodes:
$ oc get nodes
Example output
You can see that scheduling on each worker node is disabled as the change is being applied.
Check that the kernel argument worked by going to one of the worker nodes and listing the kernel command-line arguments (in /proc/cmdline on the host):
$ oc debug node/ip-10-0-141-105.ec2.internal
Example output
You should see the added kernel arguments.
5.2.5. Adding a real-time kernel to nodes
Some OpenShift Container Platform workloads require a high degree of determinism. While Linux is not a real-time operating system, the Linux real-time kernel includes a preemptive scheduler that provides the operating system with real-time characteristics.
If your OpenShift Container Platform workloads require these real-time characteristics, you can switch your machines to the Linux real-time kernel. For OpenShift Container Platform 4.14, you can make this switch using a MachineConfig object. Although making the change is as simple as changing a machine config kernelType setting to realtime, there are a few other considerations before making the change:
- Currently, real-time kernel is supported only on worker nodes, and only for radio access network (RAN) use.
- The following procedure is fully supported with bare metal installations that use systems that are certified for Red Hat Enterprise Linux for Real Time 8.
- Real-time support in OpenShift Container Platform is limited to specific subscriptions.
- The following procedure is also supported for use with Google Cloud Platform.
Prerequisites
- Have a running OpenShift Container Platform cluster (version 4.4 or later).
- Log in to the cluster as a user with administrative privileges.
Procedure
Create a machine config for the real-time kernel: Create a YAML file (for example, 99-worker-realtime.yaml) that contains a MachineConfig object for the realtime kernel type. This example tells the cluster to use a real-time kernel for all worker nodes:
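A sketch of this MachineConfig might look like the following:
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 99-worker-realtime
spec:
  kernelType: realtime  # switch the pool to the real-time kernel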
Add the machine config to the cluster. Type the following to add the machine config to the cluster:
$ oc create -f 99-worker-realtime.yaml
Check the real-time kernel: Once each impacted node reboots, log in to the cluster and run the following commands to make sure that the real-time kernel has replaced the regular kernel for the set of nodes you configured:
$ oc get nodes
Example output
NAME                                          STATUS  ROLES   AGE   VERSION
ip-10-0-143-147.us-east-2.compute.internal    Ready   worker  103m  v1.27.3
ip-10-0-146-92.us-east-2.compute.internal     Ready   worker  101m  v1.27.3
ip-10-0-169-2.us-east-2.compute.internal      Ready   worker  102m  v1.27.3
$ oc debug node/ip-10-0-143-147.us-east-2.compute.internal
Example output
The kernel name contains rt and the text “PREEMPT RT” indicates that this is a real-time kernel.
To go back to the regular kernel, delete the MachineConfig object:
$ oc delete -f 99-worker-realtime.yaml
5.2.6. Configuring journald settings
If you need to configure settings for the journald service on OpenShift Container Platform nodes, you can do that by modifying the appropriate configuration file and passing the file to the appropriate pool of nodes as a machine config.
This procedure describes how to modify journald rate limiting settings in the /etc/systemd/journald.conf file and apply them to worker nodes. See the journald.conf man page for information on how to use that file.
Prerequisites
- Have a running OpenShift Container Platform cluster.
- Log in to the cluster as a user with administrative privileges.
Procedure
Create a Butane config file, 40-worker-custom-journald.bu, that includes an /etc/systemd/journald.conf file with the required settings.
Note
The Butane version you specify in the config file should match the OpenShift Container Platform version and always ends in 0. For example, 4.14.0. See "Creating machine configs with Butane" for information about Butane.
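A sketch of such a Butane config, with sample rate-limiting and storage values that you would tune for your environment, might look like the following:
variant: openshift
version: 4.14.0
metadata:
  name: 40-worker-custom-journald
  labels:
    machineconfiguration.openshift.io/role: worker
storage:
  files:
  - path: /etc/systemd/journald.conf
    mode: 0644
    overwrite: true
    contents:
      inline: |
        # Sample journald rate-limiting and storage settings
        RateLimitInterval=1s
        RateLimitBurst=10000
        Storage=volatile
        Compress=no
        MaxRetentionSec=30s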
Use Butane to generate a MachineConfig object file, 40-worker-custom-journald.yaml, containing the configuration to be delivered to the worker nodes:
$ butane 40-worker-custom-journald.bu -o 40-worker-custom-journald.yaml
Apply the machine config to the pool:
$ oc apply -f 40-worker-custom-journald.yaml
Check that the new machine config is applied and that the nodes are not in a degraded state. It might take a few minutes. The worker pool will show the updates in progress, as each node successfully has the new machine config applied:
$ oc get machineconfigpool
NAME     CONFIG              UPDATED  UPDATING  DEGRADED  MACHINECOUNT  READYMACHINECOUNT  UPDATEDMACHINECOUNT  DEGRADEDMACHINECOUNT  AGE
master   rendered-master-35  True     False     False     3             3                  3                    0                     34m
worker   rendered-worker-d8  False    True      False     3             1                  1                    0                     34m
To check that the change was applied, you can log in to a worker node:
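A sketch of that check, assuming an oc debug session on one of the workers:
$ oc debug node/<node_name>
sh-4.4# chroot /host
sh-4.4# cat /etc/systemd/journald.conf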
5.2.7. Adding extensions to RHCOS
RHCOS is a minimal container-oriented RHEL operating system, designed to provide a common set of capabilities to OpenShift Container Platform clusters across all platforms. While adding software packages to RHCOS systems is generally discouraged, the MCO provides an extensions feature you can use to add a minimal set of features to RHCOS nodes.
Currently, the following extensions are available:
- usbguard: Adding the usbguard extension protects RHCOS systems from attacks from intrusive USB devices. See USBGuard for details.
- kerberos: Adding the kerberos extension provides a mechanism that allows both users and machines to identify themselves to the network to receive defined, limited access to the areas and services that an administrator has configured. See Using Kerberos for details, including how to set up a Kerberos client and mount a Kerberized NFS share.
The following procedure describes how to use a machine config to add one or more extensions to your RHCOS nodes.
Prerequisites
- Have a running OpenShift Container Platform cluster (version 4.6 or later).
- Log in to the cluster as a user with administrative privileges.
Procedure
Create a machine config for extensions: Create a YAML file (for example, 80-extensions.yaml) that contains a MachineConfig extensions object. This example tells the cluster to add the usbguard extension.
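A sketch of this extensions machine config might look like the following:
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 80-worker-extensions
spec:
  config:
    ignition:
      version: 3.4.0
  extensions:
    - usbguard  # install the usbguard extension on the pool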
Add the machine config to the cluster. Type the following to add the machine config to the cluster:
$ oc create -f 80-extensions.yaml
This sets all worker nodes to have rpm packages for usbguard installed.
Check that the extensions were applied:
$ oc get machineconfig 80-worker-extensions
Example output
NAME                   GENERATEDBYCONTROLLER  IGNITIONVERSION  AGE
80-worker-extensions                          3.4.0            57s
Check that the new machine config is now applied and that the nodes are not in a degraded state. It may take a few minutes. The worker pool will show the updates in progress, as each machine successfully has the new machine config applied:
$ oc get machineconfigpool
Example output
NAME     CONFIG              UPDATED  UPDATING  DEGRADED  MACHINECOUNT  READYMACHINECOUNT  UPDATEDMACHINECOUNT  DEGRADEDMACHINECOUNT  AGE
master   rendered-master-35  True     False     False     3             3                  3                    0                     34m
worker   rendered-worker-d8  False    True      False     3             1                  1                    0                     34m
Check the extensions. To check that the extension was applied, run:
$ oc get node | grep worker
Example output
NAME                                        STATUS  ROLES   AGE   VERSION
ip-10-0-169-2.us-east-2.compute.internal    Ready   worker  102m  v1.27.3
$ oc debug node/ip-10-0-169-2.us-east-2.compute.internal
Example output
...
To use host binaries, run `chroot /host`
sh-4.4# chroot /host
sh-4.4# rpm -q usbguard
usbguard-0.7.4-4.el8.x86_64.rpm
5.2.8. Loading custom firmware blobs in the machine config manifest
Because the default location for firmware blobs, /usr/lib/firmware, is read-only, you can locate a custom firmware blob by updating the search path. This enables you to load local firmware blobs in the machine config manifest when the blobs are not managed by RHCOS.
Procedure
Create a Butane config file, 98-worker-firmware-blob.bu, that updates the search path so that it is root-owned and writable to local storage. The following example places the custom blob file from your local workstation onto nodes under /var/lib/firmware.
Note
The Butane version you specify in the config file should match the OpenShift Container Platform version and always ends in 0. For example, 4.14.0. See "Creating machine configs with Butane" for information about Butane.
Butane config file for custom firmware blob
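A sketch of this config, using <package_name> as a placeholder for the blob file name (the callout numbers correspond to the notes that follow), might look like the following:
variant: openshift
version: 4.14.0
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 98-worker-firmware-blob
storage:
  files:
  - path: /var/lib/firmware/<package_name>  # 1
    contents:
      local: <package_name>  # 2
    mode: 0644  # 3
openshift:
  kernel_arguments:
    - 'firmware_class.path=/var/lib/firmware'  # 4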
- 1: Sets the path on the node where the firmware package is copied to.
- 2: Specifies a file with contents that are read from a local file directory on the system running Butane. The path of the local file is relative to a files-dir directory, which must be specified by using the --files-dir option with Butane in the following step.
- 3: Sets the permissions for the file on the RHCOS node. It is recommended to set 0644 permissions.
- 4: The firmware_class.path parameter customizes the kernel search path of where to look for the custom firmware blob that was copied from your local workstation onto the root file system of the node. This example uses /var/lib/firmware as the customized path.
Run Butane to generate a MachineConfig object file, 98-worker-firmware-blob.yaml, that uses a copy of the firmware blob on your local workstation. The firmware blob contains the configuration to be delivered to the nodes. The following example uses the --files-dir option to specify the directory on your workstation where the local file or files are located:
$ butane 98-worker-firmware-blob.bu -o 98-worker-firmware-blob.yaml --files-dir <directory_including_package_name>
-
If the cluster is not running yet, after you generate manifest files, add the
MachineConfigobject file to the<installation_directory>/openshiftdirectory, and then continue to create the cluster. If the cluster is already running, apply the file:
oc apply -f 98-worker-firmware-blob.yaml
$ oc apply -f 98-worker-firmware-blob.yamlCopy to Clipboard Copied! Toggle word wrap Toggle overflow A
MachineConfigobject YAML file is created for you to finish configuring your machines.
- Save the Butane config in case you need to update the MachineConfig object in the future.
5.2.9. Changing the core user password for node access
By default, Red Hat Enterprise Linux CoreOS (RHCOS) creates a user named core on the nodes in your cluster. You can use the core user to access the node through a cloud provider serial console or a bare metal baseboard management controller (BMC). This can be helpful, for example, if a node is down and you cannot access that node by using SSH or the oc debug node command. However, by default, there is no password for this user, so you cannot log in without creating one.
You can create a password for the core user by using a machine config. The Machine Config Operator (MCO) assigns the password and injects the password into the /etc/shadow file, allowing you to log in with the core user. The MCO does not examine the password hash. As such, the MCO cannot report if there is a problem with the password.
- The password works only through a cloud provider serial console or a BMC. It does not work with SSH.
- If you have a machine config that includes an /etc/shadow file or a systemd unit that sets a password, it takes precedence over the password hash.
You can change the password, if needed, by editing the machine config you used to create the password. Also, you can remove the password by deleting the machine config. Deleting the machine config does not remove the user account.
Procedure
Using a tool that is supported by your operating system, create a hashed password. For example, create a hashed password using mkpasswd by running the following command:
$ mkpasswd -m SHA-512 testpass
Example output
$6$CBZwA6s6AVFOtiZe$aUKDWpthhJEyR3nnhM02NM1sKCpHn9XN.NPrJNQ3HYewioaorpwL3mKGLxvW0AOb4pJxqoqP4nFX77y0p00.8.
Create a machine config file that contains the core username and the hashed password:
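A sketch of this machine config, with <password_hash> standing in for the hash you generated, might look like the following:
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: set-core-user-password
spec:
  config:
    ignition:
      version: 3.4.0
    passwd:
      users:
      - name: core
        passwordHash: <password_hash>  # the hash produced by mkpasswd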
Create the machine config by running the following command:
$ oc create -f <file-name>.yaml
The nodes do not reboot and should become available in a few moments. You can use the oc get mcp command to watch for the machine config pools to be updated, as shown in the following example:
NAME     CONFIG                                            UPDATED  UPDATING  DEGRADED  MACHINECOUNT  READYMACHINECOUNT  UPDATEDMACHINECOUNT  DEGRADEDMACHINECOUNT  AGE
master   rendered-master-d686a3ffc8fdec47280afec446fce8dd  True     False     False     3             3                  3                    0                     64m
worker   rendered-worker-4605605a5b1f9de1d061e9d350f251e5  False    True      False     3             0                  0                    0                     64m
Verification
After the nodes return to the UPDATED=True state, start a debug session for a node by running the following command:
$ oc debug node/<node_name>
Set /host as the root directory within the debug shell by running the following command:
sh-4.4# chroot /host
Check the contents of the /etc/shadow file:
Example output
...
core:$6$2sE/010goDuRSxxv$o18K52wor.wIwZp:19418:0:99999:7:::
...
The hashed password is assigned to the core user.
5.3. Configuring MCO-related custom resources
Besides managing MachineConfig objects, the MCO manages two custom resources (CRs): KubeletConfig and ContainerRuntimeConfig. Those CRs let you change node-level settings impacting how the Kubelet and CRI-O container runtime services behave.
5.3.1. Creating a KubeletConfig CR to edit kubelet parameters
The kubelet configuration is currently serialized as an Ignition configuration, so it can be directly edited. However, there is also a new kubelet-config-controller added to the Machine Config Controller (MCC). This lets you use a KubeletConfig custom resource (CR) to edit the kubelet parameters.
As the fields in the kubeletConfig object are passed directly to the kubelet from upstream Kubernetes, the kubelet validates those values directly. Invalid values in the kubeletConfig object might cause cluster nodes to become unavailable. For valid values, see the Kubernetes documentation.
Consider the following guidance:
- Edit an existing KubeletConfig CR to modify existing settings or add new settings, instead of creating a CR for each change. It is recommended that you create a CR only to modify a different machine config pool, or for changes that are intended to be temporary, so that you can revert the changes.
- Create one KubeletConfig CR for each machine config pool with all the config changes you want for that pool.
- As needed, create multiple KubeletConfig CRs with a limit of 10 per cluster. For the first KubeletConfig CR, the Machine Config Operator (MCO) creates a machine config appended with kubelet. With each subsequent CR, the controller creates another kubelet machine config with a numeric suffix. For example, if you have a kubelet machine config with a -2 suffix, the next kubelet machine config is appended with -3.
If you are applying a kubelet or container runtime config to a custom machine config pool, the custom role in the machineConfigSelector must match the name of the custom machine config pool.
For example, because the following custom machine config pool is named infra, the custom role must also be infra:
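A minimal sketch of such a pool, assuming the conventional role-label selectors:

apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
  name: infra
spec:
  machineConfigSelector:
    matchExpressions:
    # The custom role (infra) matches the pool name.
    - {key: machineconfiguration.openshift.io/role, operator: In, values: [worker, infra]}
  nodeSelector:
    matchLabels:
      node-role.kubernetes.io/infra: ""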
If you want to delete the machine configs, delete them in reverse order to avoid exceeding the limit. For example, you delete the kubelet-3 machine config before deleting the kubelet-2 machine config.
If you have a machine config with a kubelet-9 suffix, and you create another KubeletConfig CR, a new machine config is not created, even if there are fewer than 10 kubelet machine configs.
Example showing a KubeletConfig CR

$ oc get kubeletconfig

NAME                 AGE
set-kubelet-config   15m
Example showing a KubeletConfig machine config

$ oc get mc | grep kubelet

...
99-worker-generated-kubelet-1   b5c5119de007945b6fe6fb215db3b8e2ceb12511   3.4.0   26m
...
The following procedure is an example to show how to configure the maximum number of pods per node, the maximum PIDs per node, and the maximum container log size on the worker nodes.
Prerequisites
Obtain the label associated with the static MachineConfigPool CR for the type of node you want to configure. Perform one of the following steps:

View the machine config pool:

$ oc describe machineconfigpool <name>

For example:

$ oc describe machineconfigpool worker
Example output

If a label has been added, it appears under Labels.
If the label is not present, add a key/value pair:

$ oc label machineconfigpool worker custom-kubelet=set-kubelet-config
Procedure
View the available machine configuration objects that you can select:
$ oc get machineconfig

By default, the two kubelet-related configs are 01-master-kubelet and 01-worker-kubelet.

Check the current value for the maximum pods per node:
$ oc describe node <node_name>

For example:

$ oc describe node ci-ln-5grqprb-f76d1-ncnqq-worker-a-mdv94

Look for value: pods: <value> in the Allocatable stanza:

Example output
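An abridged stanza, with illustrative resource values and the default pods value, is similar to the following:

Allocatable:
  cpu:                3500m
  ephemeral-storage:  123201474766
  memory:             14225400Ki
  pods:               250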
Configure the worker nodes as needed:
Create a YAML file similar to the following that contains the kubelet configuration:
Important

Kubelet configurations that target a specific machine config pool also affect any dependent pools. For example, creating a kubelet configuration for the pool containing worker nodes will also apply to any subset pools, including the pool containing infrastructure nodes. To avoid this, you must create a new machine config pool with a selection expression that only includes worker nodes, and have your kubelet configuration target this new pool.
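A reconstructed sketch of such a file, here named change-maxPods-cr.yaml to match the create step below (the numeric values are illustrative; callout numbers correspond to the notes that follow):

apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
  name: set-kubelet-config
spec:
  machineConfigPoolSelector:
    matchLabels:
      custom-kubelet: set-kubelet-config
  kubeletConfig:
    podPidsLimit: 8192        # 1
    containerLogMaxSize: 50Mi # 2
    maxPods: 500              # 3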
- 1
- Use podPidsLimit to set the maximum number of PIDs in any pod.
- 2
- Use containerLogMaxSize to set the maximum size of the container log file before it is rotated.
- 3
- Use maxPods to set the maximum pods per node.

Note

The rate at which the kubelet talks to the API server depends on queries per second (QPS) and burst values. The default values, 50 for kubeAPIQPS and 100 for kubeAPIBurst, are sufficient if there are limited pods running on each node. It is recommended to update the kubelet QPS and burst rates if there are enough CPU and memory resources on the node.
Update the machine config pool for workers with the label:
$ oc label machineconfigpool worker custom-kubelet=set-kubelet-config

Create the KubeletConfig object:

$ oc create -f change-maxPods-cr.yaml
Verification
Verify that the KubeletConfig object is created:

$ oc get kubeletconfig

Example output

NAME                 AGE
set-kubelet-config   15m

Depending on the number of worker nodes in the cluster, wait for the worker nodes to be rebooted one by one. For a cluster with 3 worker nodes, this could take about 10 to 15 minutes.
Verify that the changes are applied to the node:
Check on a worker node that the maxPods value changed:
$ oc describe node <node_name>

Locate the Allocatable stanza. In this example, the pods parameter should report the value you set in the KubeletConfig object.
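For example, with maxPods set to 500 in the sketch above, the stanza is similar to the following (other values vary by node):

Allocatable:
  cpu:                3500m
  ephemeral-storage:  123201474766
  memory:             14225400Ki
  pods:               500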
Verify the change in the KubeletConfig object:

$ oc get kubeletconfigs set-kubelet-config -o yaml

This should show a status of True and type: Success, as shown in the following example:
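A representative excerpt (the timestamp is illustrative):

spec:
  kubeletConfig:
    containerLogMaxSize: 50Mi
    maxPods: 500
    podPidsLimit: 8192
  machineConfigPoolSelector:
    matchLabels:
      custom-kubelet: set-kubelet-config
status:
  conditions:
  - lastTransitionTime: "2024-01-01T00:00:00Z" # illustrative
    message: Success
    status: "True"
    type: Success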
5.3.2. Creating a ContainerRuntimeConfig CR to edit CRI-O parameters
You can change some of the settings associated with the OpenShift Container Platform CRI-O runtime for the nodes associated with a specific machine config pool (MCP). Using a ContainerRuntimeConfig custom resource (CR), you set the configuration values and add a label to match the MCP. The MCO then rebuilds the crio.conf and storage.conf configuration files on the associated nodes with the updated values.
To revert the changes implemented by using a ContainerRuntimeConfig CR, you must delete the CR. Removing the label from the machine config pool does not revert the changes.
You can modify the following settings by using a ContainerRuntimeConfig CR:
- Log level: The logLevel parameter sets the CRI-O log_level parameter, which is the level of verbosity for log messages. The default is info (log_level = info). Other options include fatal, panic, error, warn, debug, and trace.
- Overlay size: The overlaySize parameter sets the CRI-O Overlay storage driver size parameter, which is the maximum size of a container image.
- Container runtime: The defaultRuntime parameter sets the container runtime to either runc or crun. The default is runc.
You should have one ContainerRuntimeConfig CR for each machine config pool with all the config changes you want for that pool. If you are applying the same content to all the pools, you only need one ContainerRuntimeConfig CR for all the pools.
You should edit an existing ContainerRuntimeConfig CR to modify existing settings or add new settings instead of creating a new CR for each change. It is recommended to create a new ContainerRuntimeConfig CR only to modify a different machine config pool, or for changes that are intended to be temporary so that you can revert the changes.
You can create multiple ContainerRuntimeConfig CRs, as needed, with a limit of 10 per cluster. For the first ContainerRuntimeConfig CR, the MCO creates a machine config appended with containerruntime. With each subsequent CR, the controller creates a new containerruntime machine config with a numeric suffix. For example, if you have a containerruntime machine config with a -2 suffix, the next containerruntime machine config is appended with -3.
If you want to delete the machine configs, you should delete them in reverse order to avoid exceeding the limit. For example, you should delete the containerruntime-3 machine config before deleting the containerruntime-2 machine config.
If you have a machine config with a containerruntime-9 suffix, and you create another ContainerRuntimeConfig CR, a new machine config is not created, even if there are fewer than 10 containerruntime machine configs.
Example showing multiple ContainerRuntimeConfig CRs

$ oc get ctrcfg

Example output

NAME          AGE
ctr-overlay   15m
ctr-level     5m45s
Example showing multiple containerruntime machine configs

$ oc get mc | grep container

Example output
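A representative listing (the hashes and ages are illustrative):

01-master-container-runtime              b5c5119de007945b6fe6fb215db3b8e2ceb12511   3.4.0   57m
01-worker-container-runtime              b5c5119de007945b6fe6fb215db3b8e2ceb12511   3.4.0   57m
99-worker-generated-containerruntime     b5c5119de007945b6fe6fb215db3b8e2ceb12511   3.4.0   26m
99-worker-generated-containerruntime-1   b5c5119de007945b6fe6fb215db3b8e2ceb12511   3.4.0   17m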
The following example sets the log_level field to debug and sets the overlay size to 8 GB:
Example ContainerRuntimeConfig CR
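A reconstructed sketch (the pool label shown assumes the built-in worker pool; the callout numbers correspond to the notes below):

apiVersion: machineconfiguration.openshift.io/v1
kind: ContainerRuntimeConfig
metadata:
  name: overlay-size
spec:
  machineConfigPoolSelector:
    matchLabels:
      pools.operator.machineconfiguration.openshift.io/worker: "" # 1
  containerRuntimeConfig:
    logLevel: debug        # 2
    overlaySize: 8G        # 3
    defaultRuntime: "crun" # 4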
- 1
- Specifies the machine config pool label. For a container runtime config, the role must match the name of the associated machine config pool.
- 2
- Optional: Specifies the level of verbosity for log messages.
- 3
- Optional: Specifies the maximum size of a container image.
- 4
- Optional: Specifies the container runtime to deploy to new containers. The default value is
runc.
Procedure
To change CRI-O settings using the ContainerRuntimeConfig CR:
Create a YAML file for the ContainerRuntimeConfig CR:
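For example, a sketch that matches the CR described above:

apiVersion: machineconfiguration.openshift.io/v1
kind: ContainerRuntimeConfig
metadata:
  name: overlay-size
spec:
  machineConfigPoolSelector:
    matchLabels:
      pools.operator.machineconfiguration.openshift.io/worker: ""
  containerRuntimeConfig:
    logLevel: debug
    overlaySize: 8G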
Create the ContainerRuntimeConfig CR:

$ oc create -f <file_name>.yaml

Verify that the CR is created:
$ oc get ContainerRuntimeConfig

Example output

NAME           AGE
overlay-size   3m19s

Check that a new containerruntime machine config is created:

$ oc get machineconfigs | grep containerrun
Example output
99-worker-generated-containerruntime 2c9371fbb673b97a6fe8b1c52691999ed3a1bfc2 3.4.0 31s
Monitor the machine config pool until all are shown as ready:
$ oc get mcp worker

Example output

NAME     CONFIG                UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
worker   rendered-worker-169   False     True       False      3              1                   1                     0                      9h

Verify that the settings were applied in CRI-O:
Open an oc debug session to a node in the machine config pool and run chroot /host:

$ oc debug node/<node_name>

sh-4.4# chroot /host

Verify the changes in the crio.conf file:

sh-4.4# crio config | grep 'log_level'

Example output
log_level = "debug"
log_level = "debug"Copy to Clipboard Copied! Toggle word wrap Toggle overflow Verify the changes in the `storage.conf`file:
head -n 7 /etc/containers/storage.conf
sh-4.4# head -n 7 /etc/containers/storage.confCopy to Clipboard Copied! Toggle word wrap Toggle overflow Example output
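With the 8 GB overlay size applied, the first lines are similar to the following (paths and defaults may differ by release):

[storage]
  driver = "overlay"
  runroot = "/var/run/containers/storage"
  graphroot = "/var/lib/containers/storage"

  [storage.options]
    additionalimagestores = []
    size = "8G"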
5.3.3. Setting the default maximum container root partition size for Overlay with CRI-O
The root partition of each container shows all of the available disk space of the underlying host. Follow this guidance to set a maximum partition size for the root disk of all containers.
To configure the maximum Overlay size, as well as other CRI-O options like the log level, you can create the following ContainerRuntimeConfig custom resource (CR):
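A reconstructed sketch of overlaysize.yml, assuming a custom-crio: overlay-size label ties the CR to the pool:

apiVersion: machineconfiguration.openshift.io/v1
kind: ContainerRuntimeConfig
metadata:
  name: overlay-size
spec:
  machineConfigPoolSelector:
    matchLabels:
      custom-crio: overlay-size
  containerRuntimeConfig:
    logLevel: debug
    overlaySize: 8G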
Procedure
Create the configuration object:
$ oc apply -f overlaysize.yml

To apply the new CRI-O configuration to your worker nodes, edit the worker machine config pool:
$ oc edit machineconfigpool worker

Add the custom-crio label based on the matchLabels name you set in the ContainerRuntimeConfig CR:
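An abridged sketch of the edited pool, showing only the relevant fields:

metadata:
  name: worker
  labels:
    custom-crio: overlay-size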
Save the changes, then view the machine configs:
$ oc get machineconfigs

New 99-worker-generated-containerruntime and rendered-worker-xyz objects are created:

Example output
99-worker-generated-containerruntime   4173030d89fbf4a7a0976d1665491a4d9a6e54f1   3.4.0   7m42s
rendered-worker-xyz                    4173030d89fbf4a7a0976d1665491a4d9a6e54f1   3.4.0   7m36s

After those objects are created, monitor the machine config pool for the changes to be applied:
$ oc get mcp worker

The worker nodes show UPDATING as True, as well as the number of machines, the number updated, and other details:

Example output

NAME     CONFIG                UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
worker   rendered-worker-xyz   False     True       False      3              2                   2                     0                      20h

When complete, the worker nodes transition back to UPDATING as False, and the UPDATEDMACHINECOUNT number matches the MACHINECOUNT:

Example output

NAME     CONFIG                UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
worker   rendered-worker-xyz   True      False      False      3              3                   3                     0                      20h

Looking at a worker machine, you see that the new 8 GB max size configuration is applied to all of the workers:
Example output
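A representative excerpt from /etc/containers/storage.conf on a worker (abridged; illustrative):

[storage.options]
  additionalimagestores = []
  size = "8G"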
Looking inside a container, you see that the root partition is now 8 GB:
Example output
~ $ df -h
Filesystem   Size   Used   Available   Use%   Mounted on
overlay      8.0G   8.0K   8.0G        0%     /
5.3.4. Creating a drop-in file for the default capabilities of CRI-O
You can change some of the settings associated with the OpenShift Container Platform CRI-O runtime for the nodes associated with a specific machine config pool (MCP). Using a ContainerRuntimeConfig custom resource (CR), you set the configuration values and add a label to match the MCP. The Machine Config Operator (MCO) then rebuilds the crio.conf and default.conf configuration files on the associated nodes with the updated values.
Earlier versions of OpenShift Container Platform included specific machine configs by default. If you updated to a later version of OpenShift Container Platform, those machine configs were retained to ensure that clusters running on the same OpenShift Container Platform version have the same machine configs.
You can create multiple ContainerRuntimeConfig CRs, as needed, with a limit of 10 per cluster. For the first ContainerRuntimeConfig CR, the MCO creates a machine config appended with containerruntime. With each subsequent CR, the controller creates a containerruntime machine config with a numeric suffix. For example, if you have a containerruntime machine config with a -2 suffix, the next containerruntime machine config is appended with -3.
If you want to delete the machine configs, delete them in reverse order to avoid exceeding the limit. For example, delete the containerruntime-3 machine config before you delete the containerruntime-2 machine config.
If you have a machine config with a containerruntime-9 suffix and you create another ContainerRuntimeConfig CR, a new machine config is not created, even if there are fewer than 10 containerruntime machine configs.
Example of multiple ContainerRuntimeConfig CRs

$ oc get ctrcfg

Example output

NAME          AGE
ctr-overlay   15m
ctr-level     5m45s
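As a rough sketch, assuming the drop-in sets CRI-O's default_capabilities through a file in /etc/crio/crio.conf.d/ (the file name, machine config name, and capability list are all illustrative), a MachineConfig similar to the following could deliver it:

apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 99-worker-crio-default-capabilities
spec:
  config:
    ignition:
      version: 3.2.0
    storage:
      files:
      - path: /etc/crio/crio.conf.d/01-capabilities.conf
        mode: 0644
        overwrite: true
        contents:
          # Base64-encoded TOML such as:
          #   [crio.runtime]
          #   default_capabilities = ["CHOWN","DAC_OVERRIDE","FSETID","FOWNER","SETGID","SETUID","SETPCAP","NET_BIND_SERVICE","KILL"]
          source: data:text/plain;charset=utf-8;base64,<base64_encoded_drop_in>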
Example of checking the default capabilities inside a container
$ cat /proc/1/status | grep Cap

$ capsh --decode=<decode_CapBnd_value>
- 1
- Replace <decode_CapBnd_value> with the specific value you want to decode.
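Illustrative output (the exact masks depend on the runtime configuration): the first command prints the capability bit masks of PID 1, and capsh expands the bounding set into capability names:

CapInh:	0000000000000000
CapPrm:	00000000000005fb
CapEff:	00000000000005fb
CapBnd:	00000000000005fb
CapAmb:	0000000000000000

0x00000000000005fb=cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service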