Chapter 2. Deploying workloads on bare metal
You can deploy OpenShift sandboxed containers workloads on on-premise bare-metal servers with Red Hat Enterprise Linux CoreOS (RHCOS) installed on the worker nodes.
- RHEL nodes are not supported.
- Nested virtualization is not supported.
You can use any installation method including user-provisioned, installer-provisioned, or Assisted Installer to deploy your cluster.
You can also install OpenShift sandboxed containers on Amazon Web Services (AWS) bare-metal instances. Bare-metal instances offered by other cloud providers are not supported.
Deployment workflow
You deploy OpenShift sandboxed containers workloads by performing the following steps:
- Prepare your environment.
-
Create a
KataConfigcustom resource. -
Configure your workload objects to use the
kataruntime class.
2.1. Preparing your environment Copy linkLink copied to clipboard!
Perform the following steps to prepare your environment:
- Ensure that your cluster has sufficient resources.
- Install the OpenShift sandboxed containers Operator.
Optional: Configure node eligibility checks to ensure that your worker nodes support OpenShift sandboxed containers:
- Install the Node Feature Discovery (NFD) Operator. See the NFD Operator documentation for details.
-
Create a
NodeFeatureDiscoverycustom resource (CR) to define the node configuration parameters that the NFD Operator checks.
2.1.1. Resource requirements Copy linkLink copied to clipboard!
OpenShift sandboxed containers lets users run workloads on their OpenShift Container Platform clusters inside a sandboxed runtime (Kata). Each pod is represented by a virtual machine (VM). Each VM runs in a QEMU process and hosts a kata-agent process that acts as a supervisor for managing container workloads, and the processes running in those containers. Two additional processes add more overhead:
-
containerd-shim-kata-v2is used to communicate with the pod. -
virtiofsdhandles host file system access on behalf of the guest.
Each VM is configured with a default amount of memory. Additional memory is hot-plugged into the VM for containers that explicitly request memory.
A container running without a memory resource consumes free memory until the total memory used by the VM reaches the default allocation. The guest and its I/O buffers also consume memory.
If a container is given a specific amount of memory, then that memory is hot-plugged into the VM before the container starts.
When a memory limit is specified, the workload is terminated if it consumes more memory than the limit. If no memory limit is specified, the kernel running on the VM might run out of memory. If the kernel runs out of memory, it might terminate other processes on the VM.
Default memory sizes
The following table lists some the default values for resource allocation.
| Resource | Value |
|---|---|
| Memory allocated by default to a virtual machine | 2Gi |
| Guest Linux kernel memory usage at boot | ~110Mi |
| Memory used by the QEMU process (excluding VM memory) | ~30Mi |
|
Memory used by the | ~10Mi |
|
Memory used by the | ~20Mi |
|
File buffer cache data after running | ~300Mi* [1] |
File buffers appear and are accounted for in multiple locations:
- In the guest where it appears as file buffer cache.
-
In the
virtiofsddaemon that maps allowed user-space file I/O operations. - In the QEMU process as guest memory.
Total memory usage is properly accounted for by the memory utilization metrics, which only count that memory once.
Pod overhead describes the amount of system resources that a pod on a node uses. You can get the current pod overhead for the Kata runtime by using oc describe runtimeclass kata as shown below.
Example
$ oc describe runtimeclass kata
Example output
kind: RuntimeClass
apiVersion: node.k8s.io/v1
metadata:
name: kata
overhead:
podFixed:
memory: "500Mi"
cpu: "500m"
You can change the pod overhead by changing the spec.overhead field for a RuntimeClass. For example, if the configuration that you run for your containers consumes more than 350Mi of memory for the QEMU process and guest kernel data, you can alter the RuntimeClass overhead to suit your needs.
The specified default overhead values are supported by Red Hat. Changing default overhead values is not supported and can result in technical issues.
When performing any kind of file system I/O in the guest, file buffers are allocated in the guest kernel. The file buffers are also mapped in the QEMU process on the host, as well as in the virtiofsd process.
For example, if you use 300Mi of file buffer cache in the guest, both QEMU and virtiofsd appear to use 300Mi additional memory. However, the same memory is used in all three cases. Therefore, the total memory usage is only 300Mi, mapped in three different places. This is correctly accounted for when reporting the memory utilization metrics.
2.1.2. Installing the OpenShift sandboxed containers Operator Copy linkLink copied to clipboard!
You can install the OpenShift sandboxed containers Operator by using the OpenShift Container Platform web console or command line interface (CLI).
2.1.2.1. Installing the Operator by using the web console Copy linkLink copied to clipboard!
You can install the OpenShift sandboxed containers Operator by using the Red Hat OpenShift Container Platform web console.
Prerequisites
-
You have access to the cluster as a user with the
cluster-adminrole.
Procedure
-
In the OpenShift Container Platform web console, navigate to Operators
OperatorHub. -
In the Filter by keyword field, type
OpenShift sandboxed containers. - Select the OpenShift sandboxed containers Operator tile and click Install.
- On the Install Operator page, select stable from the list of available Update Channel options.
Verify that Operator recommended Namespace is selected for Installed Namespace. This installs the Operator in the mandatory
openshift-sandboxed-containers-operatornamespace. If this namespace does not yet exist, it is automatically created.NoteAttempting to install the OpenShift sandboxed containers Operator in a namespace other than
openshift-sandboxed-containers-operatorcauses the installation to fail.- Verify that Automatic is selected for Approval Strategy. Automatic is the default value, and enables automatic updates to OpenShift sandboxed containers when a new z-stream release is available.
- Click Install.
The OpenShift sandboxed containers Operator is now installed on your cluster.
Verification
-
Navigate to Operators
Installed Operators. - Verify that the OpenShift sandboxed containers Operator is displayed.
2.1.2.2. Installing the Operator by using the CLI Copy linkLink copied to clipboard!
You can install the OpenShift sandboxed containers Operator by using the CLI.
Prerequisites
-
You have installed the OpenShift CLI (
oc). -
You have access to the cluster as a user with the
cluster-adminrole.
Procedure
Create a
Namespace.yamlmanifest file:apiVersion: v1 kind: Namespace metadata: name: openshift-sandboxed-containers-operatorCreate the namespace by running the following command:
$ oc create -f Namespace.yamlCreate an
OperatorGroup.yamlmanifest file:apiVersion: operators.coreos.com/v1 kind: OperatorGroup metadata: name: openshift-sandboxed-containers-operator namespace: openshift-sandboxed-containers-operator spec: targetNamespaces: - openshift-sandboxed-containers-operatorCreate the operator group by running the following command:
$ oc create -f OperatorGroup.yamlCreate a
Subscription.yamlmanifest file:apiVersion: operators.coreos.com/v1alpha1 kind: Subscription metadata: name: openshift-sandboxed-containers-operator namespace: openshift-sandboxed-containers-operator spec: channel: stable installPlanApproval: Automatic name: sandboxed-containers-operator source: redhat-operators sourceNamespace: openshift-marketplace startingCSV: sandboxed-containers-operator.v1.6.0Create the subscription by running the following command:
$ oc create -f Subscription.yaml
The OpenShift sandboxed containers Operator is now installed on your cluster.
Verification
Ensure that the Operator is correctly installed by running the following command:
$ oc get csv -n openshift-sandboxed-containers-operatorExample output
NAME DISPLAY VERSION REPLACES PHASE openshift-sandboxed-containers openshift-sandboxed-containers-operator 1.6.0 1.5.3 Succeeded
2.1.3. Creating the NodeFeatureDiscovery CR Copy linkLink copied to clipboard!
You create a NodeFeatureDiscovery custom resource (CR) to define the configuration parameters that the Node Feature Discovery (NFD) Operator checks to determine that the worker nodes can support OpenShift sandboxed containers.
To install the kata runtime on only selected worker nodes that you know are eligible, apply the feature.node.kubernetes.io/runtime.kata=true label to the selected nodes and set checkNodeEligibility: true in the KataConfig CR.
To install the kata runtime on all worker nodes, set checkNodeEligibility: false in the KataConfig CR.
In both these scenarios, you do not need to create the NodeFeatureDiscovery CR. You should only apply the feature.node.kubernetes.io/runtime.kata=true label manually if you are sure that the node is eligible to run OpenShift sandboxed containers.
The following procedure applies the feature.node.kubernetes.io/runtime.kata=true label to all eligible nodes and configures the KataConfig resource to check for node eligibility.
Prerequisites
- You have installed the NFD Operator.
Procedure
Create an
nfd.yamlmanifest file according to the following example:apiVersion: nfd.openshift.io/v1 kind: NodeFeatureDiscovery metadata: name: nfd-kata namespace: openshift-nfd spec: workerConfig: configData: | sources: custom: - name: "feature.node.kubernetes.io/runtime.kata" matchOn: - cpuId: ["SSE4", "VMX"] loadedKMod: ["kvm", "kvm_intel"] - cpuId: ["SSE4", "SVM"] loadedKMod: ["kvm", "kvm_amd"] # ...Create the
NodeFeatureDiscoveryCR:$ oc create -f nfd.yamlThe
NodeFeatureDiscoveryCR applies thefeature.node.kubernetes.io/runtime.kata=truelabel to all qualifying worker nodes.
Create a
kata-config.yamlmanifest file according to the following example:apiVersion: kataconfiguration.openshift.io/v1 kind: KataConfig metadata: name: example-kataconfig spec: checkNodeEligibility: trueCreate the
KataConfigCR:$ oc create -f kata-config.yaml
Verification
Verify that qualifying nodes in the cluster have the correct label applied:
$ oc get nodes --selector='feature.node.kubernetes.io/runtime.kata=true'Example output
NAME STATUS ROLES AGE VERSION compute-3.example.com Ready worker 4h38m v1.25.0 compute-2.example.com Ready worker 4h35m v1.25.0
2.2. Deploying workloads by using the web console Copy linkLink copied to clipboard!
You can deploy OpenShift sandboxed containers workloads by using the web console.
2.2.1. Creating a KataConfig custom resource Copy linkLink copied to clipboard!
You must create a KataConfig custom resource (CR) to install kata as a RuntimeClass on your worker nodes.
The kata runtime class is installed on all worker nodes by default. If you want to install kata only on specific nodes, you can add labels to those nodes and then define the label in the KataConfig CR.
OpenShift sandboxed containers installs kata as a secondary, optional runtime on the cluster and not as the primary runtime.
Creating the KataConfig CR automatically reboots the worker nodes. The reboot can take from 10 to more than 60 minutes. The following factors might increase the reboot time:
- A larger OpenShift Container Platform deployment with a greater number of worker nodes.
- Activation of the BIOS and Diagnostics utility.
- Deployment on a hard disk drive rather than an SSD.
- Deployment on physical nodes such as bare metal, rather than on virtual nodes.
- A slow CPU and network.
Prerequisites
-
You have access to the cluster as a user with the
cluster-adminrole. - Optional: You have installed the Node Feature Discovery Operator if you want to enable node eligibility checks.
Procedure
-
In the OpenShift Container Platform web console, navigate to Operators
Installed Operators. - Select the OpenShift sandboxed containers Operator.
- On the KataConfig tab, click Create KataConfig.
Enter the following details:
-
Name: Optional: The default name is
example-kataconfig. -
Labels: Optional: Enter any relevant, identifying attributes to the
KataConfigresource. Each label represents a key-value pair. - checkNodeEligibility: Optional: Select to use the Node Feature Discovery Operator (NFD) to detect node eligibility.
kataConfigPoolSelector. Optional: To install
kataon selected nodes, add a match expression for the labels on the selected nodes:- Expand the kataConfigPoolSelector area.
- In the kataConfigPoolSelector area, expand matchExpressions. This is a list of label selector requirements.
- Click Add matchExpressions.
- In the Key field, enter the label key the selector applies to.
-
In the Operator field, enter the key’s relationship to the label values. Valid operators are
In,NotIn,Exists, andDoesNotExist. - Expand the Values area and then click Add value.
-
In the Value field, enter
trueorfalsefor key label value.
-
logLevel: Define the level of log data retrieved for nodes with the
kataruntime class.
-
Name: Optional: The default name is
Click Create. The
KataConfigCR is created and installs thekataruntime class on the worker nodes.Wait for the
katainstallation to complete and the worker nodes to reboot before verifying the installation.
Verification
-
On the KataConfig tab, click the
KataConfigCR to view its details. Click the YAML tab to view the
statusstanza.The
statusstanza contains theconditionsandkataNodeskeys. The value ofstatus.kataNodesis an array of nodes, each of which lists nodes in a particular state ofkatainstallation. A message appears each time there is an update.Click Reload to refresh the YAML.
When all workers in the
status.kataNodesarray display the valuesinstalledandconditions.InProgress: Falsewith no specified reason, thekatais installed on the cluster.
See KataConfig status messages for details.
2.2.2. Configuring workload objects Copy linkLink copied to clipboard!
You deploy an OpenShift sandboxed containers workload by configuring kata as the runtime class for the following pod-templated objects:
-
Podobjects -
ReplicaSetobjects -
ReplicationControllerobjects -
StatefulSetobjects -
Deploymentobjects -
DeploymentConfigobjects
Do not deploy workloads in the openshift-sandboxed-containers-operator namespace. Create a dedicated namespace for these resources.
Prerequisites
- You have created a secret object for your provider.
- You have created a config map for your provider.
-
You have created a
KataConfigcustom resource (CR).
Procedure
-
In the OpenShift Container Platform web console, navigate to Workloads
workload type, for example, Pods. - On the workload type page, click an object to view its details.
- Click the YAML tab.
Add
spec.runtimeClassName: katato the manifest of each pod-templated workload object as in the following example:apiVersion: v1 kind: <object> # ... spec: runtimeClassName: kata # ...OpenShift Container Platform creates the workload object and begins scheduling it.
Verification
-
Inspect the
spec.runtimeClassNamefield of a pod-templated object. If the value iskata, then the workload is running on OpenShift sandboxed containers, using peer pods.
2.3. Deploying workloads by using the command line Copy linkLink copied to clipboard!
You can deploy OpenShift sandboxed containers workloads by using the command line.
2.3.1. Optional: Provisioning local block volumes by using the Local Storage Operator Copy linkLink copied to clipboard!
Local block volumes for OpenShift sandboxed containers can be provisioned using the Local Storage Operator (LSO). The local volume provisioner looks for any block volume devices at the paths specified in the defined resource.
Prerequisites
- The Local Storage Operator is installed.
You have a local disk that meets the following conditions:
- It is attached to a node.
- It is not mounted.
- It does not contain partitions.
Procedure
Create the local volume resource. This resource must define the nodes and paths to the local volumes.
NoteDo not use different storage class names for the same device. Doing so will create multiple persistent volumes (PVs).
Example: Block
apiVersion: "local.storage.openshift.io/v1" kind: "LocalVolume" metadata: name: "local-disks" namespace: "openshift-local-storage"1 spec: nodeSelector:2 nodeSelectorTerms: - matchExpressions: - key: kubernetes.io/hostname operator: In values: - ip-10-0-136-143 - ip-10-0-140-255 - ip-10-0-144-180 storageClassDevices: - storageClassName: "local-sc"3 forceWipeDevicesAndDestroyAllData: false4 volumeMode: Block devicePaths:5 - /path/to/device6 - 1
- The namespace where the Local Storage Operator is installed.
- 2
- Optional: A node selector containing a list of nodes where the local storage volumes are attached. This example uses the node hostnames, obtained from
oc get node. If a value is not defined, then the Local Storage Operator will attempt to find matching disks on all available nodes. - 3
- The name of the storage class to use when creating persistent volume objects.
- 4
- This setting defines whether or not to call
wipefs, which removes partition table signatures (magic strings) making the disk ready to use for Local Storage Operator provisioning. No other data besides signatures is erased. The default is "false" (wipefsis not invoked). SettingforceWipeDevicesAndDestroyAllDatato "true" can be useful in scenarios where previous data can remain on disks that need to be re-used. In these scenarios, setting this field to true eliminates the need for administrators to erase the disks manually. - 5
- The path containing a list of local storage devices to choose from. You must use this path when deploying sandboxed container nodes on a block device.
- 6
- Replace this value with your actual local disks filepath to the
LocalVolumeresourceby-id, such as/dev/disk/by-id/wwn. PVs are created for these local disks when the provisioner is deployed successfully.
Create the local volume resource in your OpenShift Container Platform cluster. Specify the file you just created:
$ oc create -f <local-volume>.yamlVerify that the provisioner was created and that the corresponding daemon sets were created:
$ oc get all -n openshift-local-storageExample output
NAME READY STATUS RESTARTS AGE pod/diskmaker-manager-9wzms 1/1 Running 0 5m43s pod/diskmaker-manager-jgvjp 1/1 Running 0 5m43s pod/diskmaker-manager-tbdsj 1/1 Running 0 5m43s pod/local-storage-operator-7db4bd9f79-t6k87 1/1 Running 0 14m NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE service/local-storage-operator-metrics ClusterIP 172.30.135.36 <none> 8383/TCP,8686/TCP 14m NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE daemonset.apps/diskmaker-manager 3 3 3 3 3 <none> 5m43s NAME READY UP-TO-DATE AVAILABLE AGE deployment.apps/local-storage-operator 1/1 1 1 14m NAME DESIRED CURRENT READY AGE replicaset.apps/local-storage-operator-7db4bd9f79 1 1 1 14mNote the
desiredandcurrentnumber of daemon set processes. Adesiredcount of0indicates that the label selectors were invalid.Verify that the persistent volumes were created:
$ oc get pvExample output
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE local-pv-1cec77cf 100Gi RWO Delete Available local-sc 88m local-pv-2ef7cd2a 100Gi RWO Delete Available local-sc 82m local-pv-3fa1c73 100Gi RWO Delete Available local-sc 48m
Editing the LocalVolume object does not change existing persistent volumes because doing so might result in a destructive operation.
2.3.2. Optional: Deploying nodes on a block device Copy linkLink copied to clipboard!
If you provisioned local block volumes for OpenShift sandboxed containers, you can choose to deploy nodes on any block device at the paths specified in the defined volume resource.
Prerequisites
- You provisioned a block device using the Local Storage Operator
Procedure
- Run the following command for each node you want to deploy using a block device:
$ oc debug node/worker-0 -- chcon -vt container_file_t /host/path/to/device
+ The /path/to/device must be the same path you defined when creating the local storage resource.
+ .Example output
system_u:object_r:container_file_t:s0 /host/path/to/device
2.3.3. Creating a KataConfig custom resource Copy linkLink copied to clipboard!
You must create a KataConfig custom resource (CR) to install kata as a runtime class on your worker nodes.
Creating the KataConfig CR triggers the OpenShift sandboxed containers Operator to do the following:
-
Install the required RHCOS extensions, such as QEMU and
kata-containers, on your RHCOS node. - Ensure that the CRI-O runtime is configured with the correct runtime handlers.
-
Create a
RuntimeClassCR namedkatawith a default configuration. This enables users to configure workloads to usekataas the runtime by referencing the CR in theRuntimeClassNamefield. This CR also specifies the resource overhead for the runtime.
OpenShift sandboxed containers installs kata as a secondary, optional runtime on the cluster and not as the primary runtime.
Creating the KataConfig CR automatically reboots the worker nodes. The reboot can take from 10 to more than 60 minutes. Factors that impede reboot time are as follows:
- A larger OpenShift Container Platform deployment with a greater number of worker nodes.
- Activation of the BIOS and Diagnostics utility.
- Deployment on a hard disk drive rather than an SSD.
- Deployment on physical nodes such as bare metal, rather than on virtual nodes.
- A slow CPU and network.
Prerequisites
-
You have access to the cluster as a user with the
cluster-adminrole. - Optional: You have installed the Node Feature Discovery Operator if you want to enable node eligibility checks.
Procedure
Create a
cluster-kataconfig.yamlmanifest file according to the following example:apiVersion: kataconfiguration.openshift.io/v1 kind: KataConfig metadata: name: cluster-kataconfig spec: checkNodeEligibility: false1 logLevel: info- 1
- Optional: Set`checkNodeEligibility` to
trueto run node eligibility checks.
Optional: To install
kataon selected nodes, specify the node labels according to the following example:apiVersion: kataconfiguration.openshift.io/v1 kind: KataConfig metadata: name: cluster-kataconfig spec: kataConfigPoolSelector: matchLabels: <label_key>: '<label_value>'1 # ...- 1
- Specify the labels of the selected nodes.
Create the
KataConfigCR:$ oc create -f cluster-kataconfig.yamlThe new
KataConfigCR is created and installskataas a runtime class on the worker nodes.Wait for the
katainstallation to complete and the worker nodes to reboot before verifying the installation.
Verification
Monitor the installation progress by running the following command:
$ watch "oc describe kataconfig | sed -n /^Status:/,/^Events/p"When the status of all workers under
kataNodesisinstalledand the conditionInProgressisFalsewithout specifying a reason, thekatais installed on the cluster.
See KataConfig status messages for details.
2.3.4. Optional: Modifying pod overhead Copy linkLink copied to clipboard!
Pod overhead describes the amount of system resources that a pod on a node uses. You can modify the pod overhead by changing the spec.overhead field for a RuntimeClass custom resource. For example, if the configuration that you run for your containers consumes more than 350Mi of memory for the QEMU process and guest kernel data, you can alter the RuntimeClass overhead to suit your needs.
When performing any kind of file system I/O in the guest, file buffers are allocated in the guest kernel. The file buffers are also mapped in the QEMU process on the host, as well as in the virtiofsd process.
For example, if you use 300Mi of file buffer cache in the guest, both QEMU and virtiofsd appear to use 300Mi additional memory. However, the same memory is being used in all three cases. Therefore, the total memory usage is only 300Mi, mapped in three different places. This is correctly accounted for when reporting the memory utilization metrics.
The default values are supported by Red Hat. Changing default overhead values is not supported and can result in technical issues.
Procedure
Obtain the
RuntimeClassobject by running the following command:$ oc describe runtimeclass kataUpdate the
overhead.podFixed.memoryandcpuvalues and save asRuntimeClass.yaml:kind: RuntimeClass apiVersion: node.k8s.io/v1 metadata: name: kata overhead: podFixed: memory: "500Mi" cpu: "500m"
2.3.5. Configuring workload objects Copy linkLink copied to clipboard!
You deploy an OpenShift sandboxed containers workload by configuring kata as the runtime class for the following pod-templated objects:
-
Podobjects -
ReplicaSetobjects -
ReplicationControllerobjects -
StatefulSetobjects -
Deploymentobjects -
DeploymentConfigobjects
Do not deploy workloads in the openshift-sandboxed-containers-operator namespace. Create a dedicated namespace for these resources.
Prerequisites
- You have created a secret object for your provider.
- You have created a config map for your provider.
-
You have created a
KataConfigcustom resource (CR).
Procedure
Add
spec.runtimeClassName: katato the manifest of each pod-templated workload object as in the following example:apiVersion: v1 kind: <object> # ... spec: runtimeClassName: kata # ...OpenShift Container Platform creates the workload object and begins scheduling it.
Verification
-
Inspect the
spec.runtimeClassNamefield of a pod-templated object. If the value iskata, then the workload is running on OpenShift sandboxed containers, using peer pods.