Chapter 11. Nodes
11.1. Node maintenance
Nodes can be placed into maintenance mode by using the `oc adm` utility or `NodeMaintenance` custom resources (CRs).

The `node-maintenance-operator` (NMO) is no longer shipped with OpenShift Virtualization. It is deployed as a standalone Operator from the OperatorHub in the OpenShift Container Platform web console or by using the OpenShift CLI (`oc`).
For more information on remediation, fencing, and maintaining nodes, see the Workload Availability for Red Hat OpenShift documentation.
Virtual machines (VMs) must have a persistent volume claim (PVC) with a shared `ReadWriteMany` (RWX) access mode to be live migrated.
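For reference, a minimal PVC sketch with the shared access mode might look like the following. The name and storage size are placeholders, not values from this document:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: <vm_disk_pvc>     # placeholder name
spec:
  accessModes:
    - ReadWriteMany       # shared RWX access mode required for live migration
  resources:
    requests:
      storage: 30Gi       # example size
```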
The Node Maintenance Operator watches for new or deleted `NodeMaintenance` CRs. When a new `NodeMaintenance` CR is detected, no new workloads are scheduled and the node is cordoned off from the rest of the cluster. All pods that can be evicted are evicted from the node. When a `NodeMaintenance` CR is deleted, the node that is referenced in the CR is made available for new workloads.
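As a sketch, a `NodeMaintenance` CR names the target node and a free-form reason. The exact `apiVersion` depends on the Operator release you have installed, so treat the value below as an assumption:

```yaml
apiVersion: nodemaintenance.medik8s.io/v1beta1  # may differ by Operator version
kind: NodeMaintenance
metadata:
  name: node-maintenance-example
spec:
  nodeName: <node_name>       # node to place into maintenance
  reason: "NIC replacement"   # free-form reason reported in the CR status
```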
Using a `NodeMaintenance` CR for node maintenance tasks achieves the same results as the `oc adm cordon` and `oc adm drain` commands with standard OpenShift Container Platform custom resource processing.
11.1.1. Eviction strategies
Placing a node into maintenance marks the node as unschedulable and drains all the VMs and pods from it.
You can configure eviction strategies for virtual machines (VMs) or for the cluster.
- VM eviction strategy

  The VM `LiveMigrate` eviction strategy ensures that a virtual machine instance (VMI) is not interrupted if the node is placed into maintenance or drained. VMIs with this eviction strategy will be live migrated to another node.

  You can configure eviction strategies for virtual machines (VMs) by using the web console or the command line.

  Important: The default eviction strategy is `LiveMigrate`. A non-migratable VM with a `LiveMigrate` eviction strategy might prevent nodes from draining or block an infrastructure upgrade because the VM is not evicted from the node. This situation causes a migration to remain in a `Pending` or `Scheduling` state unless you shut down the VM manually.

  You must set the eviction strategy of non-migratable VMs to `LiveMigrateIfPossible`, which does not block an upgrade, or to `None`, for VMs that should not be migrated.
- Cluster eviction strategy
- You can configure an eviction strategy for the cluster to prioritize workload continuity or infrastructure upgrade.
Configuring a cluster eviction strategy is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.
For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.
| Eviction strategy | Description | Interrupts workflow | Blocks upgrades |
|---|---|---|---|
| `LiveMigrate` [1] | Prioritizes workload continuity over upgrades. | No | Yes [2] |
| `LiveMigrateIfPossible` | Prioritizes upgrades over workload continuity to ensure that the environment is updated. | Yes | No |
| `None` [3] | Shuts down VMs with no eviction strategy. | Yes | No |

1. Default eviction strategy for multi-node clusters.
2. If a VM blocks an upgrade, you must shut down the VM manually.
3. Default eviction strategy for single-node OpenShift.
11.1.1.1. Configuring a VM eviction strategy using the command line
You can configure an eviction strategy for a virtual machine (VM) by using the command line.
The default eviction strategy is `LiveMigrate`. A non-migratable VM with a `LiveMigrate` eviction strategy might prevent nodes from draining or block an infrastructure upgrade because the VM is not evicted from the node. This situation causes a migration to remain in a `Pending` or `Scheduling` state unless you shut down the VM manually.

You must set the eviction strategy of non-migratable VMs to `LiveMigrateIfPossible`, which does not block an upgrade, or to `None`, for VMs that should not be migrated.
Procedure
1. Edit the `VirtualMachine` resource by running the following command:

   ```
   $ oc edit vm <vm_name> -n <namespace>
   ```

   Example eviction strategy

   ```yaml
   apiVersion: kubevirt.io/v1
   kind: VirtualMachine
   metadata:
     name: <vm_name>
   spec:
     template:
       spec:
         evictionStrategy: LiveMigrateIfPossible # 1
   # ...
   ```

   1. Specify the eviction strategy. The default value is `LiveMigrate`.

2. Restart the VM to apply the changes:

   ```
   $ virtctl restart <vm_name> -n <namespace>
   ```
11.1.1.2. Configuring a cluster eviction strategy by using the command line
You can configure an eviction strategy for a cluster by using the command line.
Configuring a cluster eviction strategy is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.
For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.
Procedure
1. Edit the `hyperconverged` resource by running the following command:

   ```
   $ oc edit hyperconverged kubevirt-hyperconverged -n openshift-cnv
   ```

2. Set the cluster eviction strategy as shown in the following example:

   Example cluster eviction strategy

   ```yaml
   apiVersion: hco.kubevirt.io/v1beta1
   kind: HyperConverged
   metadata:
     name: kubevirt-hyperconverged
   spec:
     evictionStrategy: LiveMigrate
   # ...
   ```
11.1.2. Run strategies
A virtual machine (VM) configured with `spec.running: true` is immediately restarted. The `spec.runStrategy` key provides greater flexibility for determining how a VM behaves under certain conditions.

Important: The `spec.runStrategy` and `spec.running` keys are mutually exclusive. Only one of them can be used. A VM configuration with both keys is invalid.
11.1.2.1. Run strategies
The `spec.runStrategy` key has four possible values:

- `Always` - The virtual machine instance (VMI) is always present when a virtual machine (VM) is created on another node. A new VMI is created if the original stops for any reason. This is the same behavior as `running: true`.
- `RerunOnFailure` - The VMI is re-created on another node if the previous instance fails. The instance is not re-created if the VM stops successfully, such as when it is shut down.
- `Manual` - You control the VMI state manually with the `start`, `stop`, and `restart` virtctl client commands. The VM is not automatically restarted.
- `Halted` - No VMI is present when a VM is created. This is the same behavior as `running: false`.
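For example, a VM that should be restarted only after a failure could set the key as follows, a minimal fragment in the same shape as the full example later in this section:

```yaml
apiVersion: kubevirt.io/v1
kind: VirtualMachine
spec:
  runStrategy: RerunOnFailure
# ...
```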
Different combinations of the `virtctl start`, `stop`, and `restart` commands affect the run strategy.
The following table describes a VM’s transition between states. The first column shows the VM’s initial run strategy. The remaining columns show a virtctl command and the new run strategy after that command is run.
| Initial run strategy | Start | Stop | Restart |
|---|---|---|---|
| Always | - | Halted | Always |
| RerunOnFailure | RerunOnFailure | RerunOnFailure | RerunOnFailure |
| Manual | Manual | Manual | Manual |
| Halted | Always | - | - |
If a node in a cluster installed by using installer-provisioned infrastructure fails the machine health check and is unavailable, VMs with `runStrategy: Always` or `runStrategy: RerunOnFailure` are rescheduled on a new node.
11.1.2.2. Configuring a VM run strategy by using the command line
You can configure a run strategy for a virtual machine (VM) by using the command line.
The `spec.runStrategy` and `spec.running` keys are mutually exclusive. A VM configuration that contains values for both keys is invalid.
Procedure
1. Edit the `VirtualMachine` resource by running the following command:

   ```
   $ oc edit vm <vm_name> -n <namespace>
   ```

   Example run strategy

   ```yaml
   apiVersion: kubevirt.io/v1
   kind: VirtualMachine
   spec:
     runStrategy: Always
   # ...
   ```
11.1.3. Maintaining bare metal nodes
When you deploy OpenShift Container Platform on bare metal infrastructure, there are additional considerations that must be taken into account compared to deploying on cloud infrastructure. Unlike in cloud environments where the cluster nodes are considered ephemeral, re-provisioning a bare metal node requires significantly more time and effort for maintenance tasks.
When a bare metal node fails, for example, if a fatal kernel error happens or a NIC hardware failure occurs, workloads on the failed node need to be restarted elsewhere on the cluster while the problem node is repaired or replaced. Node maintenance mode allows cluster administrators to gracefully power down nodes, moving workloads to other parts of the cluster and ensuring that workloads are not interrupted. Detailed progress and node status details are provided during maintenance.
11.2. Managing node labeling for obsolete CPU models
You can schedule a virtual machine (VM) on a node as long as the VM CPU model and policy are supported by the node.
11.2.1. About node labeling for obsolete CPU models
The OpenShift Virtualization Operator uses a predefined list of obsolete CPU models to ensure that a node supports only valid CPU models for scheduled VMs.
By default, the following CPU models are eliminated from the list of labels generated for the node:
Example 11.1. Obsolete CPU models
"486"
Conroe
athlon
core2duo
coreduo
kvm32
kvm64
n270
pentium
pentium2
pentium3
pentiumpro
phenom
qemu32
qemu64
This predefined list is not visible in the `HyperConverged` CR. You cannot remove CPU models from this list, but you can add to the list by editing the `spec.obsoleteCPUs.cpuModels` field of the `HyperConverged` CR.
11.2.2. About node labeling for CPU features
Through the process of iteration, the base CPU features in the minimum CPU model are eliminated from the list of labels generated for the node.
For example:
- An environment might have two supported CPU models: `Penryn` and `Haswell`.
- If `Penryn` is specified as the CPU model for `minCPU`, each base CPU feature for `Penryn` is compared to the list of CPU features supported by `Haswell`.

Example 11.2. CPU features supported by `Penryn`

```
apic clflush cmov cx16 cx8 de fpu fxsr lahf_lm lm mca mce mmx msr mtrr nx pae pat pge pni pse pse36 sep sse sse2 sse4.1 ssse3 syscall tsc
```

Example 11.3. CPU features supported by `Haswell`

```
aes apic avx avx2 bmi1 bmi2 clflush cmov cx16 cx8 de erms fma fpu fsgsbase fxsr hle invpcid lahf_lm lm mca mce mmx movbe msr mtrr nx pae pat pcid pclmuldq pge pni popcnt pse pse36 rdtscp rtm sep smep sse sse2 sse4.1 sse4.2 ssse3 syscall tsc tsc-deadline x2apic xsave
```

- If both `Penryn` and `Haswell` support a specific CPU feature, a label is not created for that feature. Labels are generated for CPU features that are supported only by `Haswell` and not by `Penryn`.

Example 11.4. Node labels created for CPU features after iteration

```
aes avx avx2 bmi1 bmi2 erms fma fsgsbase hle invpcid movbe pcid pclmuldq popcnt rdtscp rtm sse4.2 tsc-deadline x2apic xsave
```
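The iteration above amounts to a set difference: features of the node's CPU minus the base features of the minimum CPU model. The following illustrative sketch (not an OpenShift command) demonstrates the mechanism with the two feature lists from the examples:

```shell
# Set difference: features supported by Haswell but not by Penryn.
# The feature lists are copied from the examples above.
penryn="apic clflush cmov cx16 cx8 de fpu fxsr lahf_lm lm mca mce mmx msr mtrr nx pae pat pge pni pse pse36 sep sse sse2 sse4.1 ssse3 syscall tsc"
haswell="aes apic avx avx2 bmi1 bmi2 clflush cmov cx16 cx8 de erms fma fpu fsgsbase fxsr hle invpcid lahf_lm lm mca mce mmx movbe msr mtrr nx pae pat pcid pclmuldq pge pni popcnt pse pse36 rdtscp rtm sep smep sse sse2 sse4.1 sse4.2 ssse3 syscall tsc tsc-deadline x2apic xsave"
# comm -13 prints lines unique to the second (sorted) input
comm -13 <(printf '%s\n' $penryn | sort) <(printf '%s\n' $haswell | sort)
```

Features common to both models, such as `apic`, do not appear in the output; only Haswell-only features such as `aes` and `xsave` do.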
11.2.3. Configuring obsolete CPU models
You can configure a list of obsolete CPU models by editing the
HyperConverged
Procedure
1. Edit the `HyperConverged` custom resource, specifying the obsolete CPU models in the `obsoleteCPUs` array. For example:

   ```yaml
   apiVersion: hco.kubevirt.io/v1beta1
   kind: HyperConverged
   metadata:
     name: kubevirt-hyperconverged
     namespace: openshift-cnv
   spec:
     obsoleteCPUs:
       cpuModels: # 1
         - "<obsolete_cpu_1>"
         - "<obsolete_cpu_2>"
       minCPUModel: "<minimum_cpu_model>" # 2
   ```

   1. Replace the example values in the `cpuModels` array with obsolete CPU models. Any value that you specify is added to a predefined list of obsolete CPU models. The predefined list is not visible in the CR.
   2. Replace this value with the minimum CPU model that you want to use for basic CPU features. If you do not specify a value, `Penryn` is used by default.
11.3. Preventing node reconciliation
Use the `skip-node` annotation to prevent the `node-labeller` from reconciling a node.
11.3.1. Using skip-node annotation
If you want the `node-labeller` to skip a node, annotate that node by using the OpenShift CLI (`oc`).
Prerequisites
- You have installed the OpenShift CLI (`oc`).
Procedure
Annotate the node that you want to skip by running the following command:
   ```
   $ oc annotate node <node_name> node-labeller.kubevirt.io/skip-node=true
   ```

   Replace `<node_name>` with the name of the relevant node to skip.

   Reconciliation resumes on the next cycle after the node annotation is removed or set to `false`.
11.4. Deleting a failed node to trigger virtual machine failover
If a node fails and machine health checks are not deployed on your cluster, virtual machines (VMs) with `runStrategy: Always` configured are not automatically relocated to healthy nodes. To trigger VM failover, you must manually delete the `Node` object.
If you installed your cluster by using installer-provisioned infrastructure and you properly configured machine health checks, the following events occur:
- Failed nodes are automatically recycled.
- Virtual machines with `runStrategy` set to `Always` or `RerunOnFailure` are automatically scheduled on healthy nodes.
11.4.1. Prerequisites
- A node where a virtual machine was running has the `NotReady` condition.
- The virtual machine that was running on the failed node has `runStrategy` set to `Always`.
- You have installed the OpenShift CLI (`oc`).
11.4.2. Deleting nodes from a bare metal cluster
When you delete a node using the CLI, the node object is deleted in Kubernetes, but the pods that exist on the node are not deleted. Any bare pods not backed by a replication controller become inaccessible to OpenShift Container Platform. Pods backed by replication controllers are rescheduled to other available nodes. You must delete local manifest pods.
Procedure
Delete a node from an OpenShift Container Platform cluster running on bare metal by completing the following steps:
1. Mark the node as unschedulable:

   ```
   $ oc adm cordon <node_name>
   ```

2. Drain all pods on the node:

   ```
   $ oc adm drain <node_name> --force=true
   ```

   This step might fail if the node is offline or unresponsive. Even if the node does not respond, it might still be running a workload that writes to shared storage. To avoid data corruption, power down the physical hardware before you proceed.

3. Delete the node from the cluster:

   ```
   $ oc delete node <node_name>
   ```

   Although the node object is now deleted from the cluster, it can still rejoin the cluster after reboot or if the kubelet service is restarted. To permanently delete the node and all its data, you must decommission the node.

4. If you powered down the physical hardware, turn it back on so that the node can rejoin the cluster.
11.4.3. Verifying virtual machine failover
After all resources are terminated on the unhealthy node, a new virtual machine instance (VMI) is automatically created on a healthy node for each relocated VM. To confirm that the VMI was created, view all VMIs by using the `oc` CLI.
11.4.3.1. Listing all virtual machine instances using the CLI
You can list all virtual machine instances (VMIs) in your cluster, including standalone VMIs and those owned by virtual machines, by using the `oc` CLI.
Procedure
1. List all VMIs by running the following command:

   ```
   $ oc get vmis -A
   ```