Chapter 10. Node maintenance
10.1. Automatic renewal of TLS certificates
All TLS certificates for OpenShift Virtualization components are renewed and rotated automatically. You are not required to refresh them manually.
10.1.1. Automatic renewal of TLS certificates
TLS certificates are automatically deleted and replaced according to the following schedule:
- KubeVirt certificates are renewed daily.
- Containerized Data Importer controller (CDI) certificates are renewed every 15 days.
- MAC pool certificates are renewed every year.
Automatic TLS certificate rotation does not disrupt any operations. For example, the following operations continue to function without any disruption:
- Migrations
- Image uploads
- VNC and console connections
10.2. Node maintenance mode
10.2.1. Understanding node maintenance mode
Placing a node into maintenance marks the node as unschedulable and drains all the virtual machines and pods from it. Virtual machine instances that have a LiveMigrate eviction strategy are live migrated to another node without loss of service. This eviction strategy is configured by default in virtual machines created from common templates but must be configured manually for custom virtual machines.
Virtual machine instances without an eviction strategy will be deleted on the node and recreated on another node.
Virtual machines must have a PersistentVolumeClaim (PVC) with a shared ReadWriteMany (RWX) access mode to be live migrated.
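For a custom virtual machine, the eviction strategy can be set with a patch. The following is a minimal sketch, not a definitive procedure. It assumes the field lives at spec.template.spec.evictionStrategy of the VirtualMachine object; eviction_patch is a hypothetical helper name and <vm_name> is a placeholder.

```shell
# Hypothetical helper: print a JSON merge patch that sets the eviction strategy.
# Assumption: the field path is spec.template.spec.evictionStrategy.
eviction_patch() {
  printf '{"spec":{"template":{"spec":{"evictionStrategy":"%s"}}}}\n' "${1:-LiveMigrate}"
}

# Example (not run here; replace <vm_name> with a real virtual machine name):
#   oc patch vm <vm_name> --type merge -p "$(eviction_patch)"
```

Printing the patch first makes it easy to review before it is applied to a running cluster.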
10.3. Setting a node to maintenance mode
10.3.1. Understanding node maintenance mode
Placing a node into maintenance marks the node as unschedulable and drains all the virtual machines and pods from it. Virtual machine instances that have a LiveMigrate eviction strategy are live migrated to another node without loss of service. This eviction strategy is configured by default in virtual machines created from common templates but must be configured manually for custom virtual machines.
Virtual machine instances without an eviction strategy will be deleted on the node and recreated on another node.
Virtual machines must have a PersistentVolumeClaim (PVC) with a shared ReadWriteMany (RWX) access mode to be live migrated.
Place a node into maintenance from either the web console or the CLI.
10.3.2. Setting a node to maintenance mode in the web console
Set a node to maintenance mode using the Options menu found on each node in the Compute → Nodes list, or using the Actions control on the Node Details screen.
Procedure
- In the OpenShift Virtualization console, click Compute → Nodes.
- Set the node to maintenance either from this screen, which makes it easier to perform actions on multiple nodes at once, or from the Node Details screen, where you can view comprehensive details of the selected node:
  - Click the Options menu at the end of the node and select Start Maintenance.
  - Click the node name to open the Node Details screen and click Actions → Start Maintenance.
- Click Start Maintenance in the confirmation window.
The node live migrates virtual machine instances that have the LiveMigrate eviction strategy, and the node is no longer schedulable. All other pods and virtual machines on the node are deleted and recreated on another node.
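Whichever way maintenance is started, the unschedulable state can be confirmed from the command line. The following is a rough sketch under the assumption that maintenance sets the standard spec.unschedulable field on the Node object. The function name node_is_unschedulable and the OC variable, which allows substituting another client binary, are illustrative only.

```shell
# Client binary to use; defaults to oc (OC is a hypothetical override variable).
OC="${OC:-oc}"

# Succeeds (exit status 0) when the given node reports spec.unschedulable=true.
# Assumption: maintenance mode cordons the node through this standard field.
node_is_unschedulable() {
  [ "$("$OC" get node "$1" -o jsonpath='{.spec.unschedulable}')" = "true" ]
}

# Example (not run here):
#   node_is_unschedulable node02 && echo "node02 is unschedulable"
```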
10.3.3. Setting a node to maintenance mode in the CLI
Set a node to maintenance mode by creating a NodeMaintenance Custom Resource (CR) object that references the node name and the reason for setting it to maintenance mode.
Procedure
Create the node maintenance CR configuration. This example uses a CR that is called node02-maintenance.yaml:

    apiVersion: nodemaintenance.kubevirt.io/v1beta1
    kind: NodeMaintenance
    metadata:
      name: node02-maintenance
    spec:
      nodeName: node02
      reason: "Replacing node02"
Create the NodeMaintenance object in the cluster:

    $ oc apply -f <node02-maintenance.yaml>
The node live migrates virtual machine instances that have the LiveMigrate eviction strategy and is tainted so that it is no longer schedulable. All other pods and virtual machines on the node are deleted and recreated on another node.
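When several nodes need maintenance, the CR from this procedure can be generated rather than written by hand. The sketch below reuses the fields from the node02 example; the function name node_maintenance_manifest and the default reason text are assumptions made for illustration.

```shell
# Hypothetical helper: print a NodeMaintenance CR for the given node.
#   $1 = node name, $2 = optional reason (the default text is an assumption)
node_maintenance_manifest() {
  node="$1"
  reason="${2:-Maintenance of $node}"
  cat <<EOF
apiVersion: nodemaintenance.kubevirt.io/v1beta1
kind: NodeMaintenance
metadata:
  name: ${node}-maintenance
spec:
  nodeName: ${node}
  reason: "${reason}"
EOF
}

# Example (not run here): pipe the manifest straight into oc apply.
#   node_maintenance_manifest node02 "Replacing node02" | oc apply -f -
```

Naming the object after the node keeps it easy to locate later when the node is resumed.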
10.4. Resuming a node from maintenance mode
Resuming a node brings it out of maintenance mode and makes it schedulable again.
Resume a node from maintenance from either the web console or the CLI.
10.4.1. Resuming a node from maintenance mode in the web console
Resume a node from maintenance mode using the Options menu found on each node in the Compute → Nodes list, or using the Actions control on the Node Details screen.
Procedure
- In the OpenShift Virtualization console, click Compute → Nodes.
- Resume the node either from this screen, which makes it easier to perform actions on multiple nodes at once, or from the Node Details screen, where you can view comprehensive details of the selected node:
  - Click the Options menu at the end of the node and select Stop Maintenance.
  - Click the node name to open the Node Details screen and click Actions → Stop Maintenance.
- Click Stop Maintenance in the confirmation window.
The node becomes schedulable, but virtual machine instances that were running on the node prior to maintenance will not automatically migrate back to this node.
10.4.2. Resuming a node from maintenance mode in the CLI
Resume a node from maintenance mode and make it schedulable again by deleting the NodeMaintenance object for the node.
Procedure
Find the NodeMaintenance object:

    $ oc get nodemaintenance
Optional: Inspect the NodeMaintenance object to ensure it is associated with the correct node:

    $ oc describe nodemaintenance <node02-maintenance>
Example output

    Name:         node02-maintenance
    Namespace:
    Labels:
    Annotations:
    API Version:  nodemaintenance.kubevirt.io/v1beta1
    Kind:         NodeMaintenance
    ...
    Spec:
      Node Name:  node02
      Reason:     Replacing node02
Delete the NodeMaintenance object:

    $ oc delete nodemaintenance <node02-maintenance>
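The find, inspect, and delete steps can be combined by looking up the object through its spec.nodeName field instead of its name. This is a sketch only: maintenance_for_node is a hypothetical helper, the OC variable exists so that a different client binary can be substituted, and the lookup assumes NodeMaintenance objects are listed by a plain oc get nodemaintenance as in the step above.

```shell
# Client binary to use; defaults to oc (OC is a hypothetical override variable).
OC="${OC:-oc}"

# Print the name of every NodeMaintenance object whose spec.nodeName is $1.
maintenance_for_node() {
  "$OC" get nodemaintenance \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.nodeName}{"\n"}{end}' |
    awk -F '\t' -v node="$1" '$2 == node { print $1 }'
}

# Example (not run here): resume node02 by deleting its NodeMaintenance object.
#   oc delete nodemaintenance "$(maintenance_for_node node02)"
```

Matching on the node name guards against deleting the wrong object when CR names do not follow a fixed pattern.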