Chapter 19. Improving cluster stability in high latency environments using worker latency profiles

If the cluster administrator has performed latency tests for platform verification, they can discover the need to adjust the operation of the cluster to ensure stability in cases of high latency. The cluster administrator needs to change only one parameter, recorded in a file, which controls four parameters affecting how supervisory processes read status and interpret the health of the cluster. Changing only the one parameter provides cluster tuning in an easy, supportable manner.

The Kubelet process provides the starting point for monitoring cluster health. The Kubelet sets status values for all nodes in the OpenShift Container Platform cluster. The Kubernetes Controller Manager (kube controller) reads the status values every 10 seconds, by default. If the kube controller cannot read a node status value, it loses contact with that node after a configured period. The default behavior is:

The node controller on the control plane updates the node health to Unhealthy and marks the node Ready condition `Unknown`.
In response, the scheduler stops scheduling pods to that node.
The Node Lifecycle Controller adds a node.kubernetes.io/unreachable taint with a NoExecute effect to the node and schedules any pods on the node for eviction after five minutes, by default.

This behavior can cause problems if your network is prone to latency issues, especially if you have nodes at the network edge. In some cases, the Kubernetes Controller Manager might not receive an update from a healthy node due to network latency. Pods are then evicted from the node even though the node is healthy.
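
To make this failure mode concrete, the following is a minimal sketch, not the actual controller logic. It treats a node as unhealthy whenever the gap between two status updates received by the control plane exceeds a grace period. The function name and the sample arrival times are hypothetical; 40 seconds is the Default node-monitor-grace-period value and 120 seconds is the MediumUpdateAverageReaction value.

```python
def gaps_exceeding_grace(arrival_times_s, grace_period_s):
    """Return (start, end) pairs where no status update arrived within the grace period."""
    gaps = []
    for previous, current in zip(arrival_times_s, arrival_times_s[1:]):
        if current - previous > grace_period_s:
            gaps.append((previous, current))
    return gaps


# Hypothetical arrival times (seconds) at the control plane: the node is
# healthy, but a latency spike delays the updates sent after t=20 until t=75.
arrivals = [0, 10, 20, 75, 80, 90]

print(gaps_exceeding_grace(arrivals, grace_period_s=40))   # [(20, 75)]
print(gaps_exceeding_grace(arrivals, grace_period_s=120))  # []
```

With the shorter grace period, the healthy node would be flagged during the 55-second gap; with the longer one, the same gap is tolerated.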

To avoid this problem, you can use worker latency profiles to adjust how often the Kubelet posts status updates and how long the Kubernetes Controller Manager waits for those updates before taking action. These adjustments help to ensure that your cluster runs properly if network latency between the control plane and the worker nodes is not optimal.

These worker latency profiles contain three sets of parameters that are predefined with carefully tuned values to control the reaction of the cluster to increased latency. There is no need to experimentally find the best values manually.

You can configure worker latency profiles when installing a cluster or at any time you notice increased latency in your cluster network.

19.1. Understanding worker latency profiles

Worker latency profiles are predefined sets of carefully tuned values for four parameters: node-status-update-frequency, node-monitor-grace-period, default-not-ready-toleration-seconds, and default-unreachable-toleration-seconds. These values allow you to control the reaction of the cluster to latency issues without needing to determine the best values by using manual methods.

Important

Setting these parameters manually is not supported. Incorrect parameter settings adversely affect cluster stability.

All worker latency profiles configure the following parameters:

node-status-update-frequency: Specifies how often the kubelet posts node status to the API server.
node-monitor-grace-period: Specifies the amount of time in seconds that the Kubernetes Controller Manager waits for an update from a kubelet before marking the node unhealthy and adding the node.kubernetes.io/not-ready or node.kubernetes.io/unreachable taint to the node.
default-not-ready-toleration-seconds: Specifies the amount of time in seconds after marking a node unhealthy that the Kube API Server Operator waits before evicting pods from that node.
default-unreachable-toleration-seconds: Specifies the amount of time in seconds after marking a node unreachable that the Kube API Server Operator waits before evicting pods from that node.

The following Operators monitor the changes to the worker latency profiles and respond accordingly:

The Machine Config Operator (MCO) updates the node-status-update-frequency parameter on the worker nodes.
The Kubernetes Controller Manager updates the node-monitor-grace-period parameter on the control plane nodes.
The Kubernetes API Server Operator updates the default-not-ready-toleration-seconds and default-unreachable-toleration-seconds parameters on the control plane nodes.

Although the default configuration works in most cases, OpenShift Container Platform offers two other worker latency profiles for situations where the network is experiencing higher latency than usual. The three worker latency profiles are described in the following sections:

Default worker latency profile

With the Default profile, each Kubelet updates its status every 10 seconds (node-status-update-frequency). The Kube Controller Manager checks the statuses of Kubelet every 5 seconds.

The Kubernetes Controller Manager waits 40 seconds (node-monitor-grace-period) for a status update from Kubelet before considering the Kubelet unhealthy. If no status is made available to the Kubernetes Controller Manager, it then marks the node with the node.kubernetes.io/not-ready or node.kubernetes.io/unreachable taint and evicts the pods on that node.

If a pod is on a node that has a NoExecute taint and the pod sets tolerationSeconds for that taint, the pod stays on the node for that period. If the pod does not set tolerationSeconds, it is evicted after 300 seconds (the default-not-ready-toleration-seconds and default-unreachable-toleration-seconds settings of the Kube API Server).

Profile	Component	Parameter	Value
Default	kubelet	`node-status-update-frequency`	10s
	Kubernetes Controller Manager	`node-monitor-grace-period`	40s
	Kubernetes API Server Operator	`default-not-ready-toleration-seconds`	300s
	Kubernetes API Server Operator	`default-unreachable-toleration-seconds`	300s

Medium worker latency profile

Use the MediumUpdateAverageReaction profile if the network latency is slightly higher than usual.

The MediumUpdateAverageReaction profile reduces the frequency of kubelet updates to once every 20 seconds and changes the period that the Kubernetes Controller Manager waits for those updates to 2 minutes. The pod eviction period for a pod on that node is reduced to 60 seconds. If the pod has the tolerationSeconds parameter, the eviction waits for the period specified by that parameter.

The Kubernetes Controller Manager waits for 2 minutes to consider a node unhealthy. In another minute, the eviction process starts.

Profile	Component	Parameter	Value
MediumUpdateAverageReaction	kubelet	`node-status-update-frequency`	20s
	Kubernetes Controller Manager	`node-monitor-grace-period`	2m
	Kubernetes API Server Operator	`default-not-ready-toleration-seconds`	60s
	Kubernetes API Server Operator	`default-unreachable-toleration-seconds`	60s

Low worker latency profile

Use the LowUpdateSlowReaction profile if the network latency is extremely high.

The LowUpdateSlowReaction profile reduces the frequency of kubelet updates to once every minute and changes the period that the Kubernetes Controller Manager waits for those updates to 5 minutes. The pod eviction period for a pod on that node is reduced to 60 seconds. If the pod has the tolerationSeconds parameter, the eviction waits for the period specified by that parameter.

The Kubernetes Controller Manager waits for 5 minutes to consider a node unhealthy. In another minute, the eviction process starts.

Profile	Component	Parameter	Value
LowUpdateSlowReaction	kubelet	`node-status-update-frequency`	1m
	Kubernetes Controller Manager	`node-monitor-grace-period`	5m
	Kubernetes API Server Operator	`default-not-ready-toleration-seconds`	60s
	Kubernetes API Server Operator	`default-unreachable-toleration-seconds`	60s

Note

The latency profiles do not support custom machine config pools, only the default worker machine config pools.
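
The timing of the three profiles can be worked through with simple arithmetic. The following sketch is illustrative only: the dictionary and function names are hypothetical, the values are copied from the profile tables in this section, and the calculation ignores the interval at which the Kubernetes Controller Manager checks node status. It computes how many update intervals fit into one grace period, and the approximate time from the last received update until eviction starts for pods that do not set their own tolerationSeconds.

```python
# Values from the profile tables in this section, in seconds.
PROFILES = {
    "Default": {"update_frequency": 10, "grace_period": 40, "toleration": 300},
    "MediumUpdateAverageReaction": {"update_frequency": 20, "grace_period": 120, "toleration": 60},
    "LowUpdateSlowReaction": {"update_frequency": 60, "grace_period": 300, "toleration": 60},
}


def updates_per_grace_period(profile):
    """Number of status update intervals that fit into one grace period."""
    p = PROFILES[profile]
    return p["grace_period"] // p["update_frequency"]


def seconds_until_eviction(profile):
    """Approximate time from the last received update until eviction starts."""
    p = PROFILES[profile]
    return p["grace_period"] + p["toleration"]


for name in PROFILES:
    print(name, updates_per_grace_period(name), seconds_until_eviction(name))
# Default 4 340
# MediumUpdateAverageReaction 6 180
# LowUpdateSlowReaction 5 360
```

Under these assumptions, the medium and low profiles wait longer before declaring a node unhealthy, but start eviction sooner once the node has been marked.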

19.2. Implementing worker latency profiles at cluster creation

Important

To edit the configuration of the installation program, first use the command openshift-install create manifests to create the default node manifest and other manifest YAML files. This file structure must exist before you can add workerLatencyProfile. The platform on which you are installing might have varying requirements. Refer to the Installing section of the documentation for your specific platform.

The workerLatencyProfile must be added to the manifest in the following sequence:

Create the manifest needed to build the cluster, using a folder name appropriate for your installation.
Create a YAML file to define config.node. The file must be in the manifests directory.
When defining workerLatencyProfile in the manifest for the first time, specify any of the profiles at cluster creation time: Default, MediumUpdateAverageReaction or LowUpdateSlowReaction.
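
The manifest described in these steps can also be generated by a script. The following is a minimal sketch under stated assumptions: the helper name write_node_manifest is hypothetical, the file name matches the example file used in this section, and the manifests directory is assumed to have been created already by openshift-install create manifests.

```python
from pathlib import Path

VALID_PROFILES = ("Default", "MediumUpdateAverageReaction", "LowUpdateSlowReaction")

MANIFEST_TEMPLATE = """\
apiVersion: config.openshift.io/v1
kind: Node
metadata:
  name: cluster
spec:
  workerLatencyProfile: "{profile}"
"""


def write_node_manifest(install_dir, profile="Default"):
    """Write the config.node manifest into <install_dir>/manifests and return its path."""
    if profile not in VALID_PROFILES:
        raise ValueError(f"unknown worker latency profile: {profile}")
    manifests_dir = Path(install_dir) / "manifests"
    if not manifests_dir.is_dir():
        # The directory is expected to come from `openshift-install create manifests`.
        raise FileNotFoundError(f"{manifests_dir} does not exist")
    path = manifests_dir / "config-node-default-profile.yaml"
    path.write_text(MANIFEST_TEMPLATE.format(profile=profile))
    return path
```

Rejecting unknown profile names before writing the file catches a misspelled value before the installation program reads the manifest.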

Verification

Here is an example manifest creation showing the spec.workerLatencyProfile Default value in the manifest file:
```
$ openshift-install create manifests --dir=<cluster-install-dir>
```

Edit the manifest and add the value. In this example we use vi to show an example manifest file with the "Default" workerLatencyProfile value added:

$ vi <cluster-install-dir>/manifests/config-node-default-profile.yaml

Example output

apiVersion: config.openshift.io/v1
kind: Node
metadata:
  name: cluster
spec:
  workerLatencyProfile: "Default"

19.3. Using and changing worker latency profiles

To change a worker latency profile to deal with network latency, edit the node.config object to add the name of the profile. You can change the profile at any time as latency increases or decreases.

You must move one worker latency profile at a time. For example, you cannot move directly from the Default profile to the LowUpdateSlowReaction worker latency profile. You must move from the Default worker latency profile to the MediumUpdateAverageReaction profile first, then to LowUpdateSlowReaction. Similarly, when returning to the Default profile, you must move from the low profile to the medium profile first, then to Default.
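
The one-step-at-a-time rule can be sketched as a small helper that lists the profiles to apply, in order, for any change. The list and function names are hypothetical and only model the ordering described in the preceding paragraph; they do not call any cluster API.

```python
# Profiles in the order in which they must be traversed.
PROFILE_ORDER = ["Default", "MediumUpdateAverageReaction", "LowUpdateSlowReaction"]


def transition_steps(current, target):
    """Return the profiles to apply, in order, to move from current to target."""
    start = PROFILE_ORDER.index(current)
    end = PROFILE_ORDER.index(target)
    step = 1 if end >= start else -1
    return [PROFILE_ORDER[i] for i in range(start + step, end + step, step)]


print(transition_steps("Default", "LowUpdateSlowReaction"))
# ['MediumUpdateAverageReaction', 'LowUpdateSlowReaction']
print(transition_steps("LowUpdateSlowReaction", "Default"))
# ['MediumUpdateAverageReaction', 'Default']
```

Each entry in the returned list corresponds to one edit of spec.workerLatencyProfile.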

Note

You can also configure worker latency profiles upon installing an OpenShift Container Platform cluster.

Procedure

To move from the default worker latency profile:

Move to the medium worker latency profile:

Edit the node.config object:
```
$ oc edit nodes.config/cluster
```

Add spec.workerLatencyProfile: MediumUpdateAverageReaction:

Example node.config object

apiVersion: config.openshift.io/v1
kind: Node
metadata:
  annotations:
    include.release.openshift.io/ibm-cloud-managed: "true"
    include.release.openshift.io/self-managed-high-availability: "true"
    include.release.openshift.io/single-node-developer: "true"
    release.openshift.io/create-only: "true"
  creationTimestamp: "2022-07-08T16:02:51Z"
  generation: 1
  name: cluster
  ownerReferences:
  - apiVersion: config.openshift.io/v1
    kind: ClusterVersion
    name: version
    uid: 36282574-bf9f-409e-a6cd-3032939293eb
  resourceVersion: "1865"
  uid: 0c0f7a4c-4307-4187-b591-6155695ac85b
spec:
  workerLatencyProfile: MediumUpdateAverageReaction 

# ...

The spec.workerLatencyProfile value MediumUpdateAverageReaction specifies the medium worker latency profile.

Scheduling on each worker node is disabled as the change is being applied.

Optional: Move to the low worker latency profile:

Edit the node.config object:
```
$ oc edit nodes.config/cluster
```

Change the spec.workerLatencyProfile value to LowUpdateSlowReaction:

Example node.config object

apiVersion: config.openshift.io/v1
kind: Node
metadata:
  annotations:
    include.release.openshift.io/ibm-cloud-managed: "true"
    include.release.openshift.io/self-managed-high-availability: "true"
    include.release.openshift.io/single-node-developer: "true"
    release.openshift.io/create-only: "true"
  creationTimestamp: "2022-07-08T16:02:51Z"
  generation: 1
  name: cluster
  ownerReferences:
  - apiVersion: config.openshift.io/v1
    kind: ClusterVersion
    name: version
    uid: 36282574-bf9f-409e-a6cd-3032939293eb
  resourceVersion: "1865"
  uid: 0c0f7a4c-4307-4187-b591-6155695ac85b
spec:
  workerLatencyProfile: LowUpdateSlowReaction 

# ...

The spec.workerLatencyProfile value LowUpdateSlowReaction specifies the low worker latency profile.

Scheduling on each worker node is disabled as the change is being applied.

Verification

When all nodes return to the Ready condition, you can use the following command to check the Kubernetes Controller Manager and confirm that the profile was applied:

$ oc get KubeControllerManager -o yaml | grep -i workerlatency -A 5 -B 5

Example output

# ...
    - lastTransitionTime: "2022-07-11T19:47:10Z"
      reason: ProfileUpdated
      status: "False"
      type: WorkerLatencyProfileProgressing
    - lastTransitionTime: "2022-07-11T19:47:10Z"
      message: all static pod revision(s) have updated latency profile
      reason: ProfileUpdated
      status: "True"
      type: WorkerLatencyProfileComplete
    - lastTransitionTime: "2022-07-11T19:20:11Z"
      reason: AsExpected
      status: "False"
      type: WorkerLatencyProfileDegraded
    - lastTransitionTime: "2022-07-11T19:20:36Z"
      status: "False"
# ...

The WorkerLatencyProfileComplete condition with a status of "True" specifies that the profile is applied and active.
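
If you want to check these conditions in a script instead of reading the output by eye, something like the following sketch could work. It is an illustration under assumptions: the function name is hypothetical, the sample data reproduces the condition types from the example output as JSON, and requiring the Progressing and Degraded conditions to be "False" is an assumed interpretation of a healthy result. How you extract the conditions list from the KubeControllerManager object is not shown here.

```python
import json


def profile_applied(conditions):
    """Return True if the conditions report a completed, healthy profile update."""
    status = {c.get("type"): c.get("status") for c in conditions}
    return (
        status.get("WorkerLatencyProfileComplete") == "True"
        and status.get("WorkerLatencyProfileProgressing") == "False"
        and status.get("WorkerLatencyProfileDegraded") == "False"
    )


# Conditions from the example output, expressed as JSON.
sample = json.loads("""
[
  {"type": "WorkerLatencyProfileProgressing", "status": "False", "reason": "ProfileUpdated"},
  {"type": "WorkerLatencyProfileComplete", "status": "True", "reason": "ProfileUpdated"},
  {"type": "WorkerLatencyProfileDegraded", "status": "False", "reason": "AsExpected"}
]
""")

print(profile_applied(sample))  # True
```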

To change the medium profile to default or change the default to medium, edit the node.config object and set the spec.workerLatencyProfile parameter to the appropriate value.

19.4. Example steps for displaying resulting values of workerLatencyProfile

You can display the values in the workerLatencyProfile with the following commands.

Verification

Check the default-not-ready-toleration-seconds and default-unreachable-toleration-seconds fields output by the Kube API Server:

$ oc get KubeAPIServer -o yaml | grep -A 1 default-

Example output

default-not-ready-toleration-seconds:
- "300"
default-unreachable-toleration-seconds:
- "300"

Check the values of the node-monitor-grace-period field from the Kube Controller Manager:
```
$ oc get KubeControllerManager -o yaml | grep -A 1 node-monitor
```
Example output
```
node-monitor-grace-period:
- 40s
```
Check the nodeStatusUpdateFrequency value from the Kubelet. Set the directory /host as the root directory within the debug shell. By changing the root directory to /host, you can run binaries contained in the host’s executable paths:
```
$ oc debug node/<worker-node-name>
```
```
# chroot /host
```
```
# cat /etc/kubernetes/kubelet.conf|grep nodeStatusUpdateFrequency
```
Example output
```
"nodeStatusUpdateFrequency": "10s"
```

These outputs validate the set of timing variables for the Worker Latency Profile.
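
As a cross-check, the values displayed by these commands can be compared with the profile tables in section 19.1. The following sketch is a hypothetical helper, not part of any OpenShift tooling. It assumes the values are plain seconds (such as "300") or simple duration strings built from h, m, and s units (such as "40s" or "2m0s"), and it rejects any other format.

```python
import re

# Expected values per profile, in seconds, from the tables in section 19.1:
# (update frequency, grace period, not-ready toleration, unreachable toleration)
EXPECTED = {
    "Default": (10, 40, 300, 300),
    "MediumUpdateAverageReaction": (20, 120, 60, 60),
    "LowUpdateSlowReaction": (60, 300, 60, 60),
}

UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1}


def to_seconds(value):
    """Convert values such as "300", "40s", or "2m0s" to seconds."""
    value = value.strip()
    if value.isdigit():
        return int(value)
    parts = re.findall(r"(\d+)([hms])", value)
    if not parts or "".join(n + u for n, u in parts) != value:
        raise ValueError(f"unsupported duration: {value!r}")
    return sum(int(n) * UNIT_SECONDS[u] for n, u in parts)


def matching_profile(update_frequency, grace_period, not_ready, unreachable):
    """Return the name of the profile whose values match the observed ones, or None."""
    observed = tuple(
        to_seconds(v) for v in (update_frequency, grace_period, not_ready, unreachable)
    )
    for name, expected in EXPECTED.items():
        if observed == expected:
            return name
    return None


print(matching_profile("10s", "40s", "300", "300"))  # Default
print(matching_profile("1m", "5m0s", "60", "60"))    # LowUpdateSlowReaction
```

A result of None would suggest that the observed values do not correspond to any single profile, for example while a profile change is still being rolled out.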
