Chapter 15. Improving cluster stability in high latency environments using worker latency profiles
If the cluster administrator has performed latency tests for platform verification, they might discover the need to adjust the operation of the cluster to ensure stability in cases of high latency. The cluster administrator needs to change only one parameter, recorded in a file, which controls four parameters affecting how supervisory processes read status and interpret the health of the cluster. Changing only the one parameter provides cluster tuning in an easy, supportable manner.
The Kubelet process on each worker node reports its status to the Kubernetes Controller Manager (kube controller) on the control plane at a configured frequency. If the kube controller does not receive an update from the Kubelet within the configured period, the following occurs:
- The node controller on the control plane updates the node health to Unhealthy and marks the node Ready condition `Unknown`.
- In response, the scheduler stops scheduling pods to that node.
- The Node Lifecycle Controller adds a `node.kubernetes.io/unreachable` taint with a `NoExecute` effect to the node and schedules any pods on the node for eviction after five minutes, by default.
This behavior can cause problems if your network is prone to latency issues, especially if you have nodes at the network edge. In some cases, the Kubernetes Controller Manager might not receive an update from a healthy node due to network latency. The Kubernetes Controller Manager would then mark the node as unhealthy, even though the Kubelet is healthy.
To avoid this problem, you can use worker latency profiles to adjust the frequency at which the Kubelet reports its status and the period that the Kubernetes Controller Manager waits for those reports before acting.
These worker latency profiles contain three sets of parameters that are pre-defined with carefully tuned values to control the reaction of the cluster to increased latency, so you do not need to determine the best values experimentally.
You can configure worker latency profiles when installing a cluster or at any time you notice increased latency in your cluster network.
15.1. Understanding worker latency profiles
Worker latency profiles are sets of carefully tuned values for four parameters:
- node-status-update-frequency
- node-monitor-grace-period
- default-not-ready-toleration-seconds
- default-unreachable-toleration-seconds
Setting these parameters manually is not supported. Incorrect parameter settings adversely affect cluster stability.
All worker latency profiles configure the following parameters:
- node-status-update-frequency
- Specifies how often the kubelet posts node status to the API server.
- node-monitor-grace-period
- Specifies the amount of time in seconds that the Kubernetes Controller Manager waits for an update from a kubelet before marking the node unhealthy and adding the node.kubernetes.io/not-ready or node.kubernetes.io/unreachable taint to the node.
- default-not-ready-toleration-seconds
- Specifies the amount of time in seconds after marking a node unhealthy that the Kube API Server Operator waits before evicting pods from that node.
- default-unreachable-toleration-seconds
- Specifies the amount of time in seconds after marking a node unreachable that the Kube API Server Operator waits before evicting pods from that node.
The following Operators monitor the changes to the worker latency profiles and respond accordingly:
- The Machine Config Operator (MCO) updates the node-status-update-frequency parameter on the worker nodes.
- The Kubernetes Controller Manager updates the node-monitor-grace-period parameter on the control plane nodes.
- The Kubernetes API Server Operator updates the default-not-ready-toleration-seconds and default-unreachable-toleration-seconds parameters on the control plane nodes.
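All four parameters are driven by the single spec.workerLatencyProfile field in the cluster-scoped Node object, which the Operators above watch. As a minimal sketch of that one supported knob (the same object shape appears in the procedures later in this chapter):

```yaml
# Cluster-scoped Node configuration object: the one supported knob.
# The Operators listed above translate this single field into the four
# underlying parameters; setting those parameters directly is unsupported.
apiVersion: config.openshift.io/v1
kind: Node
metadata:
  name: cluster
spec:
  workerLatencyProfile: MediumUpdateAverageReaction
```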
While the default configuration works in most cases, OpenShift Container Platform offers two other worker latency profiles for situations where the network is experiencing higher latency than usual. The three worker latency profiles are described in the following sections:
- Default worker latency profile
With the Default profile, each Kubelet updates its status every 10 seconds (node-status-update-frequency). The Kube Controller Manager checks the status of the Kubelet every 5 seconds.
The Kubernetes Controller Manager waits 40 seconds (node-monitor-grace-period) for a status update from the Kubelet before considering the Kubelet unhealthy. If no status is made available to the Kubernetes Controller Manager, it then marks the node with the node.kubernetes.io/not-ready or node.kubernetes.io/unreachable taint and evicts the pods on that node.
If a pod on that node has the NoExecute taint, the pod is run according to tolerationSeconds. If the pod has no taint, it is evicted in 300 seconds (the default-not-ready-toleration-seconds and default-unreachable-toleration-seconds settings of the Kube API Server).
Profile: Default
  Component                        Parameter                                Value
  kubelet                          node-status-update-frequency             10s
  Kubernetes Controller Manager    node-monitor-grace-period                40s
  Kubernetes API Server Operator   default-not-ready-toleration-seconds     300s
  Kubernetes API Server Operator   default-unreachable-toleration-seconds   300s
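For a rough feel of the worst case, these values compose additively: a node can go quiet for up to node-monitor-grace-period before it is marked unhealthy, and untolerated pods are then evicted only after the toleration period expires. A minimal shell sketch of that arithmetic, with values copied from the Default table above:

```shell
grace=40         # node-monitor-grace-period, in seconds
toleration=300   # default-not-ready/unreachable-toleration-seconds

# Earliest point at which pod eviction can begin, measured from the
# last status update the Kubernetes Controller Manager received:
echo "eviction starts no earlier than $((grace + toleration))s"   # → eviction starts no earlier than 340s
```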
- Medium worker latency profile
Use the MediumUpdateAverageReaction profile if the network latency is slightly higher than usual.
The MediumUpdateAverageReaction profile reduces the frequency of kubelet updates to 20 seconds and changes the period that the Kubernetes Controller Manager waits for those updates to 2 minutes. The pod eviction period for a pod on that node is reduced to 60 seconds. If the pod has the tolerationSeconds parameter, the eviction waits for the period specified by that parameter.
The Kubernetes Controller Manager waits for 2 minutes to consider a node unhealthy. In another minute, the eviction process starts.
Profile: MediumUpdateAverageReaction
  Component                        Parameter                                Value
  kubelet                          node-status-update-frequency             20s
  Kubernetes Controller Manager    node-monitor-grace-period                2m
  Kubernetes API Server Operator   default-not-ready-toleration-seconds     60s
  Kubernetes API Server Operator   default-unreachable-toleration-seconds   60s
- Low worker latency profile
Use the LowUpdateSlowReaction profile if the network latency is extremely high.
The LowUpdateSlowReaction profile reduces the frequency of kubelet updates to 1 minute and changes the period that the Kubernetes Controller Manager waits for those updates to 5 minutes. The pod eviction period for a pod on that node is reduced to 60 seconds. If the pod has the tolerationSeconds parameter, the eviction waits for the period specified by that parameter.
The Kubernetes Controller Manager waits for 5 minutes to consider a node unhealthy. In another minute, the eviction process starts.
Profile: LowUpdateSlowReaction
  Component                        Parameter                                Value
  kubelet                          node-status-update-frequency             1m
  Kubernetes Controller Manager    node-monitor-grace-period                5m
  Kubernetes API Server Operator   default-not-ready-toleration-seconds     60s
  Kubernetes API Server Operator   default-unreachable-toleration-seconds   60s
The latency profiles do not support custom machine config pools, only the default worker machine config pools.
15.2. Implementing worker latency profiles at cluster creation
To edit the configuration of the installer, first use the command openshift-install create manifests to generate the installation manifests. The workerLatencyProfile setting is then added in a manifest file before the cluster is created.
- Create the manifests needed to build the cluster, using a folder name appropriate for your installation.
- Create a YAML file to define config.node. The file must be in the manifests directory.
- When defining workerLatencyProfile in the manifest for the first time, specify any of the profiles at cluster creation time: Default, MediumUpdateAverageReaction, or LowUpdateSlowReaction.
Verification
Here is an example manifest creation showing the spec.workerLatencyProfile Default value in the manifest file:
$ openshift-install create manifests --dir=<cluster-install-dir>
Edit the manifest and add the value. In this example, vi shows an example manifest file with the "Default" workerLatencyProfile value added:
$ vi <cluster-install-dir>/manifests/config-node-default-profile.yaml
Example output
apiVersion: config.openshift.io/v1
kind: Node
metadata:
  name: cluster
spec:
  workerLatencyProfile: "Default"
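The same edit can be scripted instead of typed into vi. The sketch below assumes an install directory named ocp-install (any name works); in a real installation the directory is populated by openshift-install create manifests, so the mkdir here only makes the snippet stand alone:

```shell
# In a real install, this directory comes from:
#   openshift-install create manifests --dir=ocp-install
mkdir -p ocp-install/manifests

# Drop the Node manifest with the chosen profile into place:
cat > ocp-install/manifests/config-node-default-profile.yaml <<'EOF'
apiVersion: config.openshift.io/v1
kind: Node
metadata:
  name: cluster
spec:
  workerLatencyProfile: "Default"
EOF
```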
15.3. Using and changing worker latency profiles
To change a worker latency profile to deal with network latency, edit the node.config object and add the name of the profile. You must move one worker latency profile at a time as latency increases or decreases. For example, you cannot move directly from the Default profile to the LowUpdateSlowReaction worker latency profile. You must move from the Default worker latency profile to the MediumUpdateAverageReaction profile first, then to LowUpdateSlowReaction. Afterward, when returning to the Default profile, you must first move from the low profile to the medium profile, then to Default.
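The one-step rule can be made mechanical. The helper below is purely illustrative (it is not part of any OpenShift tooling): it answers whether a single edit between two profiles is an allowed transition:

```shell
# Hypothetical helper: is a single profile change allowed?
# Allowed edits move exactly one step along:
#   Default <-> MediumUpdateAverageReaction <-> LowUpdateSlowReaction
valid_transition() {
  case "$1=>$2" in
    "Default=>MediumUpdateAverageReaction") echo valid ;;
    "MediumUpdateAverageReaction=>Default") echo valid ;;
    "MediumUpdateAverageReaction=>LowUpdateSlowReaction") echo valid ;;
    "LowUpdateSlowReaction=>MediumUpdateAverageReaction") echo valid ;;
    *) echo invalid ;;
  esac
}

valid_transition Default MediumUpdateAverageReaction   # → valid
valid_transition Default LowUpdateSlowReaction         # → invalid (skips a step)
```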
You can also configure worker latency profiles upon installing an OpenShift Container Platform cluster.
Procedure
To move from the default worker latency profile:
Move to the medium worker latency profile:
Edit the node.config object:
$ oc edit nodes.config/cluster
Add spec.workerLatencyProfile: MediumUpdateAverageReaction:
Example node.config object
apiVersion: config.openshift.io/v1
kind: Node
metadata:
  annotations:
    include.release.openshift.io/ibm-cloud-managed: "true"
    include.release.openshift.io/self-managed-high-availability: "true"
    include.release.openshift.io/single-node-developer: "true"
    release.openshift.io/create-only: "true"
  creationTimestamp: "2022-07-08T16:02:51Z"
  generation: 1
  name: cluster
  ownerReferences:
  - apiVersion: config.openshift.io/v1
    kind: ClusterVersion
    name: version
    uid: 36282574-bf9f-409e-a6cd-3032939293eb
  resourceVersion: "1865"
  uid: 0c0f7a4c-4307-4187-b591-6155695ac85b
spec:
  workerLatencyProfile: MediumUpdateAverageReaction 1
# ...
1 Specifies the medium worker latency policy.
Scheduling on each worker node is disabled as the change is being applied.
Optional: Move to the low worker latency profile:
Edit the node.config object:
$ oc edit nodes.config/cluster
Change the spec.workerLatencyProfile value to LowUpdateSlowReaction:
Example node.config object
apiVersion: config.openshift.io/v1
kind: Node
metadata:
  annotations:
    include.release.openshift.io/ibm-cloud-managed: "true"
    include.release.openshift.io/self-managed-high-availability: "true"
    include.release.openshift.io/single-node-developer: "true"
    release.openshift.io/create-only: "true"
  creationTimestamp: "2022-07-08T16:02:51Z"
  generation: 1
  name: cluster
  ownerReferences:
  - apiVersion: config.openshift.io/v1
    kind: ClusterVersion
    name: version
    uid: 36282574-bf9f-409e-a6cd-3032939293eb
  resourceVersion: "1865"
  uid: 0c0f7a4c-4307-4187-b591-6155695ac85b
spec:
  workerLatencyProfile: LowUpdateSlowReaction 1
# ...
1 Specifies use of the low worker latency policy.
Scheduling on each worker node is disabled as the change is being applied.
Verification
When all nodes return to the Ready condition, you can use the following command to look in the Kubernetes Controller Manager to ensure the profile was applied:
$ oc get KubeControllerManager -o yaml | grep -i workerlatency -A 5 -B 5
Example output
# ...
  - lastTransitionTime: "2022-07-11T19:47:10Z"
    reason: ProfileUpdated
    status: "False"
    type: WorkerLatencyProfileProgressing
  - lastTransitionTime: "2022-07-11T19:47:10Z" 1
    message: all static pod revision(s) have updated latency profile
    reason: ProfileUpdated
    status: "True"
    type: WorkerLatencyProfileComplete
  - lastTransitionTime: "2022-07-11T19:20:11Z"
    reason: AsExpected
    status: "False"
    type: WorkerLatencyProfileDegraded
  - lastTransitionTime: "2022-07-11T19:20:36Z"
    status: "False"
# ...
1 Specifies that the profile is applied and active.
To change the medium profile to default, or the default profile to medium, edit the node.config object and set the spec.workerLatencyProfile parameter to the appropriate value.
15.4. Example steps for displaying resulting values of workerLatencyProfile
You can display the values applied by the workerLatencyProfile with the following commands.
Verification
Check the default-not-ready-toleration-seconds and default-unreachable-toleration-seconds fields output by the Kube API Server:
$ oc get KubeAPIServer -o yaml | grep -A 1 default-
Example output
default-not-ready-toleration-seconds:
- "300"
default-unreachable-toleration-seconds:
- "300"
Check the value of the node-monitor-grace-period field from the Kube Controller Manager:
$ oc get KubeControllerManager -o yaml | grep -A 1 node-monitor
Example output
node-monitor-grace-period:
- 40s
Check the nodeStatusUpdateFrequency value from the Kubelet. Set the directory /host as the root directory within the debug shell. By changing the root directory to /host, you can run binaries contained in the host's executable paths:
$ oc debug node/<worker-node-name>
$ chroot /host
# cat /etc/kubernetes/kubelet.conf | grep nodeStatusUpdateFrequency
Example output
"nodeStatusUpdateFrequency": "10s"
These outputs validate the set of timing variables for the Worker Latency Profile.
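When comparing live output against the intended profile, a small lookup of the documented values can help. The expected helper below is hypothetical; its values are copied directly from the tables in Section 15.1:

```shell
# Expected values per profile, in the order:
#   node-status-update-frequency, node-monitor-grace-period,
#   default-not-ready-toleration-seconds, default-unreachable-toleration-seconds
expected() {
  case "$1" in
    Default)                     echo "10s 40s 300 300" ;;
    MediumUpdateAverageReaction) echo "20s 2m 60 60" ;;
    LowUpdateSlowReaction)       echo "1m 5m 60 60" ;;
    *)                           echo "unknown profile" >&2; return 1 ;;
  esac
}

expected Default   # → 10s 40s 300 300
```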