Home
Products
OpenShift Container Platform
4.17
Introduction to Virtualization
Chapter 5. Postinstallation configuration

Chapter 5. Postinstallation configuration

5.1. Postinstallation configuration
Copy link

The following procedures are typically performed after you install OpenShift Virtualization. You can configure the components that are relevant for your environment:

Node placement rules for OpenShift Virtualization Operators, workloads, and controllers
Network configuration:
- Installing the Kubernetes NMState and SR-IOV Operators
- Configuring a Linux bridge network for external access to virtual machines (VMs)
- Configuring a dedicated secondary network for live migration
- Configuring an SR-IOV network
- Enabling the creation of load balancer services by using the OpenShift Container Platform web console
Storage configuration:
- Defining a default storage class for the Container Storage Interface (CSI)
- Configuring local storage by using the Hostpath Provisioner (HPP)

5.2. Specifying nodes for OpenShift Virtualization components
Copy link

The default scheduling for virtual machines (VMs) on bare-metal nodes is appropriate. Optionally, you can specify the nodes where you want to deploy OpenShift Virtualization Operators, workloads, and controllers by configuring node placement rules.

Note

You can configure node placement rules for some components after installing OpenShift Virtualization, but virtual machines cannot be present if you want to configure node placement rules for workloads.

5.2.1. About node placement rules for OpenShift Virtualization components
Copy link

You can use node placement rules for the following tasks:

Deploy virtual machines only on nodes intended for virtualization workloads.
Deploy Operators only on infrastructure nodes.
Maintain separation between workloads.

Depending on the object, you can use one or more of the following rule types:

nodeSelector: Allows pods to be scheduled on nodes that are labeled with the key-value pair or pairs that you specify in this field. The node must have labels that exactly match all listed pairs.
affinity: Enables you to use more expressive syntax to set rules that match nodes with pods. Affinity also allows for more nuance in how the rules are applied. For example, you can specify that a rule is a preference, not a requirement. If a rule is a preference, pods are still scheduled when the rule is not satisfied.
tolerations: Allows pods to be scheduled on nodes that have matching taints. If a taint is applied to a node, that node only accepts pods that tolerate the taint.

5.2.2. Applying node placement rules
Copy link

You can apply node placement rules by editing a Subscription, HyperConverged, or HostPathProvisioner object using the command line.

Prerequisites

You have installed the OpenShift CLI (oc).
You are logged in with cluster administrator permissions.

Procedure

Edit the object in your default editor by running the following command:
```
$ oc edit <resource_type> <resource_name> -n openshift-cnv
```
Save the file to apply the changes.

5.2.3. Node placement rule examples
Copy link

You can specify node placement rules for a OpenShift Virtualization component by editing a Subscription, HyperConverged, or HostPathProvisioner object.

5.2.3.1. Subscription object node placement rule examples
Copy link

To specify the nodes where OLM deploys the OpenShift Virtualization Operators, edit the Subscription object during OpenShift Virtualization installation.

Currently, you cannot configure node placement rules for the Subscription object by using the web console.

The Subscription object does not support the affinity node placement rule.

Example Subscription object with nodeSelector rule

apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: hco-operatorhub
  namespace: openshift-cnv
spec:
  source: redhat-operators
  sourceNamespace: openshift-marketplace
  name: kubevirt-hyperconverged
  startingCSV: kubevirt-hyperconverged-operator.v4.17.48
  channel: "stable"
  config:
    nodeSelector:
      example.io/example-infra-key: example-infra-value

OLM deploys the OpenShift Virtualization Operators on nodes labeled example.io/example-infra-key = example-infra-value.

Example Subscription object with tolerations rule

apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: hco-operatorhub
  namespace: openshift-cnv
spec:
  source:  redhat-operators
  sourceNamespace: openshift-marketplace
  name: kubevirt-hyperconverged
  startingCSV: kubevirt-hyperconverged-operator.v4.17.48
  channel: "stable"
  config:
    tolerations:
    - key: "key"
      operator: "Equal"
      value: "virtualization"
      effect: "NoSchedule"

OLM deploys OpenShift Virtualization Operators on nodes labeled key = virtualization:NoSchedule taint. Only pods with the matching tolerations are scheduled on these nodes.

5.2.3.2. HyperConverged object node placement rule example
Copy link

To specify the nodes where OpenShift Virtualization deploys its components, you can edit the nodePlacement object in the HyperConverged custom resource (CR) file that you create during OpenShift Virtualization installation.

Example HyperConverged object with nodeSelector rule

apiVersion: hco.kubevirt.io/v1beta1
kind: HyperConverged
metadata:
  name: kubevirt-hyperconverged
  namespace: openshift-cnv
spec:
  infra:
    nodePlacement:
      nodeSelector:
        example.io/example-infra-key: example-infra-value
  workloads:
    nodePlacement:
      nodeSelector:
        example.io/example-workloads-key: example-workloads-value

Infrastructure resources are placed on nodes labeled example.io/example-infra-key = example-infra-value.
Workloads are placed on nodes labeled example.io/example-workloads-key = example-workloads-value.

Example HyperConverged object with affinity rule

apiVersion: hco.kubevirt.io/v1beta1
kind: HyperConverged
metadata:
  name: kubevirt-hyperconverged
  namespace: openshift-cnv
spec:
  infra:
    nodePlacement:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: example.io/example-infra-key
                operator: In
                values:
                - example-infra-value
  workloads:
    nodePlacement:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: example.io/example-workloads-key
                operator: In
                values:
                - example-workloads-value
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 1
            preference:
              matchExpressions:
              - key: example.io/num-cpus
                operator: Gt
                values:
                - 8

Infrastructure resources are placed on nodes labeled example.io/example-infra-key = example-value.
Workloads are placed on nodes labeled example.io/example-workloads-key = example-workloads-value.
Nodes that have more than eight CPUs are preferred for workloads, but if they are not available, pods are still scheduled.

Example HyperConverged object with tolerations rule

apiVersion: hco.kubevirt.io/v1beta1
kind: HyperConverged
metadata:
  name: kubevirt-hyperconverged
  namespace: openshift-cnv
spec:
  workloads:
    nodePlacement:
      tolerations:
      - key: "key"
        operator: "Equal"
        value: "virtualization"
        effect: "NoSchedule"

Nodes reserved for OpenShift Virtualization components are labeled with the key = virtualization:NoSchedule taint. Only pods with matching tolerations are scheduled on reserved nodes.

5.2.3.3. HostPathProvisioner object node placement rule example
Copy link

You can edit the HostPathProvisioner object directly or by using the web console.

Warning

You must schedule the hostpath provisioner (HPP) and the OpenShift Virtualization components on the same nodes. Otherwise, virtualization pods that use the hostpath provisioner cannot run. You cannot run virtual machines.

After you deploy a virtual machine (VM) with the HPP storage class, you can remove the hostpath provisioner pod from the same node by using the node selector. However, you must first revert that change, at least for that specific node, and wait for the pod to run before trying to delete the VM.

You can configure node placement rules by specifying nodeSelector, affinity, or tolerations for the spec.workload field of the HostPathProvisioner object that you create when you install the hostpath provisioner.

Example HostPathProvisioner object with nodeSelector rule

apiVersion: hostpathprovisioner.kubevirt.io/v1beta1
kind: HostPathProvisioner
metadata:
  name: hostpath-provisioner
spec:
  imagePullPolicy: IfNotPresent
  pathConfig:
    path: "</path/to/backing/directory>"
    useNamingPrefix: false
  workload:
    nodeSelector:
      example.io/example-workloads-key: example-workloads-value

Workloads are placed on nodes labeled example.io/example-workloads-key = example-workloads-value.

5.3. Postinstallation network configuration
Copy link

By default, OpenShift Virtualization uses a single internal pod network after installation.

After you install OpenShift Virtualization, you can install networking Operators and configure additional networks.

5.3.1. Installing networking Operators
Copy link

You must install the Kubernetes NMState Operator to configure a Linux bridge network for live migration or external access to virtual machines (VMs). For installation instructions, see Installing the Kubernetes NMState Operator by using the web console.

You can install the SR-IOV Operator to manage SR-IOV network devices and network attachments. For installation instructions, see Installing the SR-IOV Network Operator.

You can add the About MetalLB and the MetalLB Operator to manage the lifecycle for an instance of MetalLB on your cluster. For installation instructions, see Installing the MetalLB Operator from the OperatorHub by using the web console.

5.3.2. Configuring a Linux bridge network
Copy link

After you install the Kubernetes NMState Operator, you can configure a Linux bridge network for live migration or external access to virtual machines (VMs).

5.3.2.1. Creating a Linux bridge NNCP
Copy link

You can create a NodeNetworkConfigurationPolicy (NNCP) manifest for a Linux bridge network.

Prerequisites

You have installed the Kubernetes NMState Operator.

Procedure

Create the NodeNetworkConfigurationPolicy manifest. This example includes sample values that you must replace with your own information.
```
apiVersion: nmstate.io/v1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: br1-eth1-policy
spec:
  desiredState:
    interfaces:
      - name: br1
        description: Linux bridge with eth1 as a port
        type: linux-bridge
        state: up
        ipv4:
          enabled: false
        bridge:
          options:
            stp:
              enabled: false
          port:
            - name: eth1
```
- metadata.name defines the name of the node network configuration policy.
- spec.desiredState.interfaces.name defines the name of the new Linux bridge.
- spec.desiredState.interfaces.description is an optional field that can be used to define a human-readable description for the bridge.
- spec.desiredState.interfaces.type defines the interface type. In this example, the type is a Linux bridge.
- spec.desiredState.interfaces.state defines the requested state for the interface after creation.
- spec.desiredState.interfaces.ipv4.enabled defines whether the ipv4 protocol is active. Setting this to false disables IPv4 addressing on this bridge.
- spec.desiredState.interfaces.bridge.options.stp.enabled defines whether Spanning Tree Protocol (STP) is active. Setting this to false disables STP on this bridge.
- spec.desiredState.interfaces.bridge.port.name defines the node NIC that the bridge is attached to.

5.3.2.2. Creating a Linux bridge NAD by using the web console
Copy link

You can create a network attachment definition (NAD) to provide layer-2 networking to pods and virtual machines by using the OpenShift Container Platform web console.

A Linux bridge network attachment definition is the most efficient method for connecting a virtual machine to a VLAN.

Warning

Configuring IP address management (IPAM) in a network attachment definition for virtual machines is not supported.

Procedure

In the web console, click Networking NetworkAttachmentDefinitions.
Click Create Network Attachment Definition.
Note
The network attachment definition must be in the same namespace as the pod or virtual machine.
Enter a unique Name and optional Description.
Select CNV Linux bridge from the Network Type list.
Enter the name of the bridge in the Bridge Name field.
Optional: If the resource has VLAN IDs configured, enter the ID numbers in the VLAN Tag Number field.
Optional: Select MAC Spoof Check to enable MAC spoof filtering. This feature provides security against a MAC spoofing attack by allowing only a single MAC address to exit the pod.
Click Create.

5.3.3. Next steps
Copy link

Attaching a virtual machine (VM) to a Linux bridge network

5.3.4. Configuring a network for live migration
Copy link

After you have configured a Linux bridge network, you can configure a dedicated network for live migration. A dedicated network minimizes the effects of network saturation on tenant workloads during live migration.

5.3.4.1. Configuring a dedicated secondary network for live migration
Copy link

To configure a dedicated secondary network for live migration, you must first create a bridge network attachment definition (NAD) by using the CLI. You can then add the name of the NetworkAttachmentDefinition object to the HyperConverged custom resource (CR).

Prerequisites

You installed the OpenShift CLI (oc).
You logged in to the cluster as a user with the cluster-admin role.
Each node has at least two Network Interface Cards (NICs).
The NICs for live migration are connected to the same VLAN.

Procedure

Create a NetworkAttachmentDefinition manifest according to the following example:
Example configuration file
```
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: my-secondary-network
  namespace: openshift-cnv
spec:
  config: '{
    "cniVersion": "0.3.1",
    "name": "migration-bridge",
    "type": "macvlan",
    "master": "eth1",
    "mode": "bridge",
    "ipam": {
      "type": "whereabouts",
      "range": "10.200.5.0/24"
    }
  }'
```
- metadata.name defines the name of the NetworkAttachmentDefinition object.
- config.master defines the name of the NIC to be used for live migration.
- config.type defines the name of the CNI plugin that provides the network for the NAD.
- config.range defines an IP address range for the secondary network. This range must not overlap the IP addresses of the main network.
Open the HyperConverged CR in your default editor by running the following command:
```
$ oc edit hyperconverged kubevirt-hyperconverged -n openshift-cnv
```

Add the name of the NetworkAttachmentDefinition object to the spec.liveMigrationConfig stanza of the HyperConverged CR:

Example HyperConverged manifest

apiVersion: hco.kubevirt.io/v1beta1
kind: HyperConverged
metadata:
  name: kubevirt-hyperconverged
  namespace: openshift-cnv
spec:
  liveMigrationConfig:
    completionTimeoutPerGiB: 800
    network: <network>
    parallelMigrationsPerCluster: 5
    parallelOutboundMigrationsPerNode: 2
    progressTimeout: 150
# ...

spec.liveMigrationConfig.network defines the name of the Multus NetworkAttachmentDefinition object to be used for live migrations.

Save your changes and exit the editor. The virt-handler pods restart and connect to the secondary network.

Verification

When the node that the virtual machine runs on is placed into maintenance mode, the VM automatically migrates to another node in the cluster. You can verify that the migration occurred over the secondary network and not the default pod network by checking the target IP address in the virtual machine instance (VMI) metadata.
```
$ oc get vmi <vmi_name> -o jsonpath='{.status.migrationState.targetNodeAddress}'
```

5.3.4.2. Selecting a dedicated network by using the web console
Copy link

You can select a dedicated network for live migration by using the OpenShift Container Platform web console.

Prerequisites

You configured a Multus network for live migration.
You created a network attachment definition for the network.

Procedure

Go to Virtualization > Overview in the OpenShift Container Platform web console.
Click the Settings tab and then click Live migration.
Select the network from the Live migration network list.

5.3.5. Configuring an SR-IOV network
Copy link

After you install the SR-IOV Operator, you can configure an SR-IOV network.

5.3.5.1. Configuring SR-IOV network devices
Copy link

The SR-IOV Network Operator adds the SriovNetworkNodePolicy.sriovnetwork.openshift.io custom resource definition (CRD) to OpenShift Container Platform. You can configure an SR-IOV network device by creating a SriovNetworkNodePolicy custom resource (CR).

Note

When applying the configuration specified in a SriovNetworkNodePolicy CR, the SR-IOV Operator might drain the nodes, and in some cases, reboot nodes. Reboot only happens in the following cases:

With Mellanox NICs (mlx5 driver) a node reboot happens every time the number of virtual functions (VFs) increase on a physical function (PF).
With Intel NICs, a reboot only happens if the kernel parameters do not include intel_iommu=on and iommu=pt.

It might take several minutes for a configuration change to apply.

Prerequisites

You installed the OpenShift CLI (oc).
You have access to the cluster as a user with the cluster-admin role.
You have installed the SR-IOV Network Operator.
You have enough available nodes in your cluster to handle the evicted workload from drained nodes.
You have not selected any control plane nodes for SR-IOV network device configuration.

Procedure

Create an SriovNetworkNodePolicy object, and then save the YAML in the <name>-sriov-node-network.yaml file. Replace <name> with the name for this configuration.
```
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  name: <name>
  namespace: openshift-sriov-network-operator
spec:
  resourceName: <sriov_resource_name>
  nodeSelector:
    feature.node.kubernetes.io/network-sriov.capable: "true"
  priority: <priority>
  mtu: <mtu>
  numVfs: <num>
  nicSelector:
    vendor: "<vendor_code>"
    deviceID: "<device_id>"
    pfNames: ["<pf_name>", ...]
    rootDevices: ["<pci_bus_id>", "..."]
  deviceType: vfio-pci
  isRdma: false
```
- metadata.name defines a name for the SriovNetworkNodePolicy object.
- metadata.namespace defines the namespace where the SR-IOV Network Operator is installed.
- spec.resourceName defines the resource name of the SR-IOV device plugin. You can create multiple SriovNetworkNodePolicy objects for a resource name.
- spec.nodeSelector.feature.node.kubernetes.io/network-sriov.capable defines the node selector to select which nodes are configured. Only SR-IOV network devices on selected nodes are configured. The SR-IOV Container Network Interface (CNI) plugin and device plugin are deployed only on selected nodes.
- spec.priority is an optional field that defines an integer value between 0 and 99. A smaller number gets higher priority, so a priority of 10 is higher than a priority of 99. The default value is 99.
- spec.mtu is an optional field that defines a value for the maximum transmission unit (MTU) of the virtual function. The maximum MTU value can vary for different NIC models.
- spec.numVfs defines the number of the virtual functions (VF) to create for the SR-IOV physical network device. For an Intel network interface controller (NIC), the number of VFs cannot be larger than the total VFs supported by the device. For a Mellanox NIC, the number of VFs cannot be larger than 127.
- spec.nicSelector defines the Ethernet device for the Operator to configure. You do not need to specify values for all the parameters.
  Note
  It is recommended to identify the Ethernet adapter with enough precision to minimize the possibility of selecting an Ethernet device unintentionally. If you specify rootDevices, you must also specify a value for vendor, deviceID, or pfNames.
  If you specify both pfNames and rootDevices at the same time, ensure that they point to an identical device.
- spec.nicSelector.vendor is an optional field that defines the vendor hex code of the SR-IOV network device. The only allowed values are either 8086 or 15b3.
- spec.nicSelector.deviceID is an optional field that defines the device hex code of SR-IOV network device. The only allowed values are 158b, 1015, 1017.
- spec.nicSelector.pfNames is an optional field that defines an array of one or more physical function (PF) names for the Ethernet device.
- spec.nicSelector.rootDevices is an optional field that defines an array of one or more PCI bus addresses for the physical function of the Ethernet device. Provide the address in the following format: 0000:02:00.1.
- spec.deviceType defines the driver type. The vfio-pci driver type is required for virtual functions in OpenShift Virtualization.
- spec.isRdma is an optional field that defines whether to enable remote direct memory access (RDMA) mode. For a Mellanox card, set isRdma to false. The default value is false.
  Note
  If isRDMA flag is set to true, you can continue to use the RDMA enabled VF as a normal network device. A device can be used in either mode.
Optional: Label the SR-IOV capable cluster nodes with SriovNetworkNodePolicy.Spec.NodeSelector if they are not already labeled. For more information about labeling nodes, see "Understanding how to update labels on nodes".
Create the SriovNetworkNodePolicy object. When running the following command, replace <name> with the name for this configuration:
```
$ oc create -f <name>-sriov-node-network.yaml
```
After applying the configuration update, all the pods in sriov-network-operator namespace transition to the Running status.
To verify that the SR-IOV network device is configured, enter the following command. Replace <node_name> with the name of a node with the SR-IOV network device that you just configured.
```
$ oc get sriovnetworknodestates -n openshift-sriov-network-operator <node_name> -o jsonpath='{.status.syncStatus}'
```

5.3.6. Next steps
Copy link

Attaching a virtual machine (VM) to an SR-IOV network

5.3.7. Enabling load balancer service creation by using the web console
Copy link

You can enable the creation of load balancer services for a virtual machine (VM) by using the OpenShift Container Platform web console.

Prerequisites

You have configured a load balancer for the cluster.
You have logged in as a user with the cluster-admin role.
You created a network attachment definition for the network.

Procedure

Go to Virtualization Overview.
On the Settings tab, click Cluster.
Expand General settings and SSH configuration.
Set SSH over LoadBalancer service to on.

5.3.8. Configuring additional routes to the cdi-uploadproxy service
Copy link

As a cluster administrator, you can configure additional routes to the cdi-uploadproxy service, enabling users to upload virtual machine images from outside the cluster.

Prerequisites

You installed the OpenShift CLI (oc).
You logged in to the cluster as a user with the cluster-admin role.

Procedure

Configure the route to the external host by running the following command:
```
$ oc create route reencrypt <route_name> -n openshift-cnv \
    --insecure-policy=Redirect \
    --hostname=<host_name_or_address> \
    --service=cdi-uploadproxy
```
where:
<route_name>
Specifies the name to assign to this custom route.
<host_name_or_address>
Specifies the fully qualified domain name or IP address of the external host providing image upload access.
Run the following command to annotate the route. This ensures that the correct Containerized Data Importer (CDI) CA certificate is injected when certificates are rotated:
```
$ oc annotate route <route_name> -n openshift-cnv \
    operator.cdi.kubevirt.io/injectUploadProxyCert="true"
```
where:
<route_name>
Specifies the name of the route you created.

5.4. Postinstallation storage configuration
Copy link

The following storage configuration tasks are mandatory:

You must configure a default storage class for your cluster. Otherwise, the cluster cannot receive automated boot source updates.
You must configure storage profiles if your storage provider is not recognized by the Containerized Data Importer (CDI). A storage profile provides recommended storage settings based on the associated storage class.

Optional: You can configure local storage by using the hostpath provisioner (HPP).

See the storage configuration overview for more options, including configuring the CDI, data volumes, and automatic boot source updates.

5.4.1. Configuring local storage by using the HPP
Copy link

When you install the OpenShift Virtualization Operator, the Hostpath Provisioner (HPP) Operator is automatically installed. The HPP Operator creates the HPP provisioner.

The HPP is a local storage provisioner designed for OpenShift Virtualization. To use the HPP, you must create an HPP custom resource (CR).

Important

HPP storage pools must not be in the same partition as the operating system. Otherwise, the storage pools might fill the operating system partition. If the operating system partition is full, this might negatively impact performance, or the node can become unstable or unusable.

5.4.1.1. Creating a storage class for the CSI driver with the storagePools stanza
Copy link

To use the hostpath provisioner (HPP) you must create an associated storage class for the Container Storage Interface (CSI) driver.

When you create a storage class, you set parameters that affect the dynamic provisioning of persistent volumes (PVs) that belong to that storage class. You cannot update a StorageClass object’s parameters after you create it.

Note

Virtual machines use data volumes that are based on local PVs. Local PVs are bound to specific nodes. While a disk image is prepared for consumption by the virtual machine, it is possible that the virtual machine cannot be scheduled to the node where the local storage PV was previously pinned.

To solve this problem, use the Kubernetes pod scheduler to bind the persistent volume claim (PVC) to a PV on the correct node. By using the StorageClass value with volumeBindingMode parameter set to WaitForFirstConsumer, the binding and provisioning of the PV is delayed until a pod is created using the PVC.

Procedure

Create a storageclass_csi.yaml file to define the storage class:
```
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: hostpath-csi
provisioner: kubevirt.io.hostpath-provisioner
reclaimPolicy: Delete 
```
1
```
volumeBindingMode: WaitForFirstConsumer 
```
2
```
parameters:
  storagePool: my-storage-pool 
```
3
- reclaimPolicy defines whether the underlying storage is deleted or retained when a user deletes a PVC. The two possible reclaimPolicy values are Delete and Retain. If you do not specify a value, the default value is Delete.
- volumeBindingMode defines the timing of PV creation. The WaitForFirstConsumer configuration in this example means that PV creation is delayed until a pod is scheduled to a specific node.
- parameters.storagePool defines the name of the storage pool defined in the HPP custom resource (CR).
Save the file and exit.
Create the StorageClass object by running the following command:
```
$ oc create -f storageclass_csi.yaml
```

5.5. Configuring higher VM workload density
Copy link

You can increase the number of virtual machines (VMs) on nodes by overcommitting memory (RAM). Increasing VM workload density can be useful in the following situations:

You have many similar workloads.
You have underused workloads.

Note

Memory overcommitment can lower workload performance on a highly utilized system.

5.5.1. Using wasp-agent to increase VM workload density
Copy link

The wasp-agent component facilitates memory overcommitment by assigning swap resources to worker nodes.

Important

Swap resources can be only assigned to virtual machine workloads (VM pods) of the Burstable Quality of Service (QoS) class. VM pods of the Guaranteed QoS class and pods of any QoS class that do not belong to VMs cannot swap resources.

For descriptions of QoS classes, see Configure Quality of Service for Pods (Kubernetes documentation).

Using spec.domain.resources.requests.memory in the VM manifest disables the memory overcommit configuration. Use spec.domain.memory.guest instead.

Prerequisites

You have installed the OpenShift CLI (oc).
You are logged into the cluster with the cluster-admin role.
A memory overcommit ratio is defined.
The node belongs to a worker pool.

Note

The wasp-agent component deploys an Open Container Initiative (OCI) hook to enable swap usage for containers on the node level. The low-level nature requires the DaemonSet object to be privileged.

Procedure

Configure the kubelet service to permit swap usage:

Create or edit a KubeletConfig file with the parameters shown in the following example:

Example of a KubeletConfig file

apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
  name: custom-config
spec:
  machineConfigPoolSelector:
    matchLabels:
      pools.operator.machineconfiguration.openshift.io/worker: ''  # MCP
      #machine.openshift.io/cluster-api-machine-role: worker # machine
      #node-role.kubernetes.io/worker: '' # node
  kubeletConfig:
    failSwapOn: false

Wait for the worker nodes to sync with the new configuration by running the following command:
```
$ oc wait mcp worker --for condition=Updated=True --timeout=-1s
```

Provision swap by creating a MachineConfig object:

Create a MachineConfig file with the parameters shown in the following example:

apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 90-worker-swap
spec:
  config:
    ignition:
      version: 3.5.0
    storage:
      files:
      - contents:
          source: data:text/plain;charset=utf-8;base64,YXBpVmVyc2lvbjoga3ViZWxldC5jb25maWcuazhzLmlvL3YxYmV0YTEKa2luZDogS3ViZWxldENvbmZpZ3VyYXRpb24KZmFpbFN3YXBPbjogZmFsc2UK
        mode: 420
        overwrite: true
        path: /etc/openshift/kubelet.conf.d/90-swap.conf
    systemd:
      units:
        - contents: |
            [Unit]
            Description=Enable swap
            ConditionFirstBoot=no
            ConditionPathExists=/var/tmp/swapfile

            [Service]
            Type=oneshot
            ExecStart=/bin/sh -c "sudo swapon /var/tmp/swapfile"

            [Install]
            RequiredBy=kubelet-dependencies.target
          enabled: true
          name: swap-enable.service
        - contents: |
            [Unit]
            Description=Provision and enable swap
            ConditionFirstBoot=no
            ConditionPathExists=!/var/tmp/swapfile

            [Service]
            Type=oneshot
            Environment=SWAP_SIZE_MB=5000
            ExecStart=/bin/sh -c "sudo fallocate -l ${SWAP_SIZE_MB}M /var/tmp/swapfile && \
            sudo chmod 600 /var/tmp/swapfile && \
            sudo mkswap /var/tmp/swapfile && \
            sudo swapon /var/tmp/swapfile && \
            free -h"

            [Install]
            RequiredBy=kubelet-dependencies.target
          enabled: true
          name: swap-provision.service
        - contents: |
            [Unit]
            Description=Restrict swap for system slice
            ConditionFirstBoot=no

            [Service]
            Type=oneshot
            ExecStart=/bin/sh -c "sudo systemctl set-property --runtime system.slice MemorySwapMax=0 IODeviceLatencyTargetSec=\"/ 50ms\""

            [Install]
            RequiredBy=kubelet-dependencies.target
          enabled: true
          name: cgroup-system-slice-config.service

To have enough swap space for the worst-case scenario, make sure to have at least as much swap space provisioned as overcommitted RAM. Calculate the amount of swap space to be provisioned on a node by using the following formula:

NODE_SWAP_SPACE = NODE_RAM * (MEMORY_OVER_COMMIT_PERCENT / 100% - 1)

Example

NODE_SWAP_SPACE = 16 GB * (150% / 100% - 1)
               = 16 GB * (1.5 - 1)
               = 16 GB * (0.5)
               =  8 GB

Create a privileged service account by running the following commands:

$ oc adm new-project wasp

$ oc create sa -n wasp wasp

$ oc create clusterrolebinding wasp --clusterrole=cluster-admin --serviceaccount=wasp:wasp

$ oc adm policy add-scc-to-user -n wasp privileged -z wasp

Wait for the worker nodes to sync with the new configuration by running the following command:
```
$ oc wait mcp worker --for condition=Updated=True --timeout=-1s
```

Determine the pull URL for the wasp agent image by running the following command:

$ oc get csv -n openshift-cnv -l=operators.coreos.com/kubevirt-hyperconverged.openshift-cnv -ojson | jq '.items[0].spec.relatedImages[] | select(.name|test(".*wasp-agent.*")) | .image'

Deploy wasp-agent by creating a DaemonSet object as shown in the following example:

kind: DaemonSet
apiVersion: apps/v1
metadata:
  name: wasp-agent
  namespace: wasp
  labels:
    app: wasp
    tier: node
spec:
  selector:
    matchLabels:
      name: wasp
  template:
    metadata:
      annotations:
        description: >-
          Configures swap for workloads
      labels:
        name: wasp
    spec:
      containers:
        - env:
            - name: SWAP_UTILIZATION_THRESHOLD_FACTOR
              value: "0.8"
            - name: MAX_AVERAGE_SWAP_IN_PAGES_PER_SECOND
              value: "1000000000"
            - name: MAX_AVERAGE_SWAP_OUT_PAGES_PER_SECOND
              value: "1000000000"
            - name: AVERAGE_WINDOW_SIZE_SECONDS
              value: "30"
            - name: VERBOSITY
              value: "1"
            - name: FSROOT
              value: /host
            - name: NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
          image: >-
            quay.io/openshift-virtualization/wasp-agent:v4.17


          imagePullPolicy: Always
          name: wasp-agent
          resources:
            requests:
              cpu: 100m
              memory: 50M
          securityContext:
            privileged: true
          volumeMounts:
            - mountPath: /host
              name: host
            - mountPath: /rootfs
              name: rootfs
      hostPID: true
      hostUsers: true
      priorityClassName: system-node-critical
      serviceAccountName: wasp
      terminationGracePeriodSeconds: 5
      volumes:
        - hostPath:
            path: /
          name: host
        - hostPath:
            path: /
          name: rootfs
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 10%
      maxSurge: 0

1 1: Replace the image value with the image URL from the previous step.

Deploy alerting rules by creating a PrometheusRule object. For example:

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  labels:
    tier: node
    wasp.io: ""
  name: wasp-rules
  namespace: wasp
spec:
  groups:
    - name: alerts.rules
      rules:
        - alert: NodeHighSwapActivity
          annotations:
            description: High swap activity detected at {{ $labels.instance }}. The rate
              of swap out and swap in exceeds 200 in both operations in the last minute.
              This could indicate memory pressure and may affect system performance.
            runbook_url: https://github.com/openshift-virtualization/wasp-agent/tree/main/docs/runbooks/NodeHighSwapActivity.md
            summary: High swap activity detected at {{ $labels.instance }}.
          expr: rate(node_vmstat_pswpout[1m]) > 200 and rate(node_vmstat_pswpin[1m]) >
            200
          for: 1m
          labels:
            kubernetes_operator_component: kubevirt
            kubernetes_operator_part_of: kubevirt
            operator_health_impact: warning
            severity: warning

Add the cluster-monitoring label to the wasp namespace by running the following command:
```
$ oc label namespace wasp openshift.io/cluster-monitoring="true"
```
Enable memory overcommitment in OpenShift Virtualization by using the web console or the CLI.
- Web console
  1. In the OpenShift Container Platform web console, go to Virtualization Overview Settings General settings Memory density.
  2. Set Enable memory density to on.
- CLI
  - Configure your OpenShift Virtualization to enable higher memory density and set the overcommit rate:
    
    $ oc -n openshift-cnv patch HyperConverged/kubevirt-hyperconverged --type='json' -p='[ \ { \ "op": "replace", \ "path": "/spec/higherWorkloadDensity/memoryOvercommitPercentage", \ "value": 150 \ } \ ]'
    
    Successful output
    
    hyperconverged.hco.kubevirt.io/kubevirt-hyperconverged patched

Verification

To verify the deployment of wasp-agent, run the following command:
```
$ oc rollout status ds wasp-agent -n wasp
```
If the deployment is successful, the following message is displayed:
Example output
```
daemon set "wasp-agent" successfully rolled out
```

To verify that swap is correctly provisioned, complete the following steps:

View a list of worker nodes by running the following command:
```
$ oc get nodes -l node-role.kubernetes.io/worker
```

Select a node from the list and display its memory usage by running the following command:

$ oc debug node/<selected_node> -- free -m

Replace <selected_node> with the node name.

If swap is provisioned, an amount greater than zero is displayed in the Swap: row.

Expand

Table 5.1. Example output
	total	used	free	shared	buff/cache	available
Mem:	31846	23155	1044	6014	14483	8690
Swap:	8191	2337	5854

Verify the OpenShift Virtualization memory overcommitment configuration by running the following command:
```
$ oc -n openshift-cnv get HyperConverged/kubevirt-hyperconverged -o jsonpath='{.spec.higherWorkloadDensity}{"\n"}'
```
Example output
```
{"memoryOvercommitPercentage":150}
```
The returned value must match the value you had previously configured.

5.5.2. Removing the wasp-agent component
Copy link

If you no longer need memory overcommitment, you can remove the wasp-agent component and associated resources from your cluster.

Prerequisites

You are logged in to the cluster with the cluster-admin role.
You have installed the OpenShift CLI (oc).

Procedure

Remove the wasp-agent DaemonSet:

$ oc delete daemonset wasp-agent -n wasp

If deployed, remove the alerting rules:

$ oc delete prometheusrule wasp-rules -n wasp

Optionally, delete the wasp namespace if no other resources depend on it:
```
$ oc delete namespace wasp
```

Revert the memory overcommitment configuration:

$ oc -n openshift-cnv patch HyperConverged/kubevirt-hyperconverged \
  --type='json' \
  -p='[{"op": "remove", "path": "/spec/higherWorkloadDensity"}]'

Delete the MachineConfig that provisions swap memory:
```
$ oc delete machineconfig 90-worker-swap
```
Delete the associated KubeletConfig:
```
$ oc delete kubeletconfig custom-config
```

Wait for the worker nodes to reconcile:

$ oc wait mcp worker --for condition=Updated=True --timeout=-1s

Verification

Confirm that the wasp-agent DaemonSet is removed:
```
$ oc get daemonset -n wasp
```
No wasp-agent should be listed.
Confirm that swap is no longer enabled on a node:
```
$ oc debug node/<selected_node> -- free -m
```
Ensure that the Swap: row shows 0 or that no swap space shows as provisioned.

Chapter 5. Postinstallation configuration

5.1. Postinstallation configuration
Copy link

5.2. Specifying nodes for OpenShift Virtualization components
Copy link

5.2.1. About node placement rules for OpenShift Virtualization components
Copy link

5.2.2. Applying node placement rules
Copy link

5.2.3. Node placement rule examples
Copy link

5.2.3.1. Subscription object node placement rule examples
Copy link

5.2.3.2. HyperConverged object node placement rule example
Copy link

5.2.3.3. HostPathProvisioner object node placement rule example
Copy link

5.3. Postinstallation network configuration
Copy link

5.3.1. Installing networking Operators
Copy link

5.3.2. Configuring a Linux bridge network
Copy link

5.3.2.1. Creating a Linux bridge NNCP
Copy link

5.3.2.2. Creating a Linux bridge NAD by using the web console
Copy link

5.3.3. Next steps
Copy link

5.3.4. Configuring a network for live migration
Copy link

5.3.4.1. Configuring a dedicated secondary network for live migration
Copy link

5.3.4.2. Selecting a dedicated network by using the web console
Copy link

5.3.5. Configuring an SR-IOV network
Copy link

5.3.5.1. Configuring SR-IOV network devices
Copy link

5.3.6. Next steps
Copy link

5.3.7. Enabling load balancer service creation by using the web console
Copy link

5.3.8. Configuring additional routes to the cdi-uploadproxy service
Copy link

5.4. Postinstallation storage configuration
Copy link

5.4.1. Configuring local storage by using the HPP
Copy link

5.4.1.1. Creating a storage class for the CSI driver with the storagePools stanza
Copy link

5.5. Configuring higher VM workload density
Copy link

5.5.1. Using wasp-agent to increase VM workload density
Copy link

5.5.2. Removing the wasp-agent component
Copy link

Learn

Try, buy, & sell

Communities

About Red Hat

Making open source more inclusive

About Red Hat Documentation

Theme

Red Hat legal and privacy links

Red Hat legal and privacy links

Chapter 5. Postinstallation configuration

5.1. Postinstallation configurationCopy linkLink copied to clipboard!

5.2. Specifying nodes for OpenShift Virtualization componentsCopy linkLink copied to clipboard!

5.2.1. About node placement rules for OpenShift Virtualization componentsCopy linkLink copied to clipboard!

5.2.2. Applying node placement rulesCopy linkLink copied to clipboard!

5.2.3. Node placement rule examplesCopy linkLink copied to clipboard!

5.2.3.1. Subscription object node placement rule examplesCopy linkLink copied to clipboard!

5.2.3.2. HyperConverged object node placement rule exampleCopy linkLink copied to clipboard!

5.2.3.3. HostPathProvisioner object node placement rule exampleCopy linkLink copied to clipboard!

5.3. Postinstallation network configurationCopy linkLink copied to clipboard!

5.3.1. Installing networking OperatorsCopy linkLink copied to clipboard!

5.3.2. Configuring a Linux bridge networkCopy linkLink copied to clipboard!

5.3.2.1. Creating a Linux bridge NNCPCopy linkLink copied to clipboard!

5.3.2.2. Creating a Linux bridge NAD by using the web consoleCopy linkLink copied to clipboard!

5.3.3. Next stepsCopy linkLink copied to clipboard!

5.3.4. Configuring a network for live migrationCopy linkLink copied to clipboard!

5.3.4.1. Configuring a dedicated secondary network for live migrationCopy linkLink copied to clipboard!

5.3.4.2. Selecting a dedicated network by using the web consoleCopy linkLink copied to clipboard!

5.3.5. Configuring an SR-IOV networkCopy linkLink copied to clipboard!

5.3.5.1. Configuring SR-IOV network devicesCopy linkLink copied to clipboard!

5.3.6. Next stepsCopy linkLink copied to clipboard!

5.3.7. Enabling load balancer service creation by using the web consoleCopy linkLink copied to clipboard!

5.3.8. Configuring additional routes to the cdi-uploadproxy serviceCopy linkLink copied to clipboard!

5.4. Postinstallation storage configurationCopy linkLink copied to clipboard!

5.4.1. Configuring local storage by using the HPPCopy linkLink copied to clipboard!

5.4.1.1. Creating a storage class for the CSI driver with the storagePools stanzaCopy linkLink copied to clipboard!

5.5. Configuring higher VM workload densityCopy linkLink copied to clipboard!

5.5.1. Using wasp-agent to increase VM workload densityCopy linkLink copied to clipboard!

5.5.2. Removing the wasp-agent componentCopy linkLink copied to clipboard!

Learn

Try, buy, & sell

Communities

About Red Hat

Making open source more inclusive

About Red Hat Documentation

Theme

Red Hat legal and privacy links

Red Hat legal and privacy links

5.1. Postinstallation configuration
Copy link

5.2. Specifying nodes for OpenShift Virtualization components
Copy link

5.2.1. About node placement rules for OpenShift Virtualization components
Copy link

5.2.2. Applying node placement rules
Copy link

5.2.3. Node placement rule examples
Copy link

5.2.3.1. Subscription object node placement rule examples
Copy link

5.2.3.2. HyperConverged object node placement rule example
Copy link

5.2.3.3. HostPathProvisioner object node placement rule example
Copy link

5.3. Postinstallation network configuration
Copy link

5.3.1. Installing networking Operators
Copy link

5.3.2. Configuring a Linux bridge network
Copy link

5.3.2.1. Creating a Linux bridge NNCP
Copy link

5.3.2.2. Creating a Linux bridge NAD by using the web console
Copy link

5.3.3. Next steps
Copy link

5.3.4. Configuring a network for live migration
Copy link

5.3.4.1. Configuring a dedicated secondary network for live migration
Copy link

5.3.4.2. Selecting a dedicated network by using the web console
Copy link

5.3.5. Configuring an SR-IOV network
Copy link

5.3.5.1. Configuring SR-IOV network devices
Copy link

5.3.6. Next steps
Copy link

5.3.7. Enabling load balancer service creation by using the web console
Copy link

5.3.8. Configuring additional routes to the cdi-uploadproxy service
Copy link

5.4. Postinstallation storage configuration
Copy link

5.4.1. Configuring local storage by using the HPP
Copy link

5.4.1.1. Creating a storage class for the CSI driver with the storagePools stanza
Copy link

5.5. Configuring higher VM workload density
Copy link

5.5.1. Using wasp-agent to increase VM workload density
Copy link

5.5.2. Removing the wasp-agent component
Copy link