Chapter 5. Configuring PID limits
A process identifier (PID) is a unique identifier assigned by the Linux kernel to each process or thread currently running on a system. The number of processes that can run simultaneously on a system is limited to 4,194,304 by the Linux kernel. This number might also be affected by limited access to other system resources such as memory, CPU, and disk space.
In Red Hat OpenShift Service on AWS 4.11 and later, by default, a pod can have a maximum of 4,096 PIDs. If your workload requires more than that, you can increase the allowed maximum number of PIDs by configuring a KubeletConfig object.
Red Hat OpenShift Service on AWS clusters running versions earlier than 4.11 use a default PID limit of 1,024.
5.1. Understanding process ID limits
In Red Hat OpenShift Service on AWS, consider these two supported limits for process ID (PID) usage before you schedule work on your cluster:
- Maximum number of PIDs per pod. The default value is 4,096 in Red Hat OpenShift Service on AWS 4.11 and later. This value is controlled by the podPidsLimit parameter set on the node.
- Maximum number of PIDs per node. The default value depends on node resources. In Red Hat OpenShift Service on AWS, this value is controlled by the --system-reserved parameter, which reserves PIDs on each node based on the total resources of the node.
When a pod exceeds the allowed maximum number of PIDs per pod, the pod might stop functioning correctly and might be evicted from the node. See the Kubernetes documentation for eviction signals and thresholds for more information.
When a node exceeds the allowed maximum number of PIDs per node, the node can become unstable because new processes cannot have PIDs assigned. If existing processes cannot complete without creating additional processes, the entire node can become unusable and require a reboot. This situation can result in data loss, depending on the processes and applications being run. Customer administrators and Red Hat Site Reliability Engineering are notified when this threshold is reached, and a "Worker node is experiencing PIDPressure" warning appears in the cluster logs.
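As a rough illustration of spotting affected nodes, the following sketch filters a list of node names and PIDPressure statuses. It assumes the standard Kubernetes PIDPressure node condition; the function name nodes_with_pid_pressure and the sample node names are hypothetical, and the oc command in the comment is one possible way to produce the input, not a documented ROSA procedure.

```shell
# Hypothetical helper: reads "<node_name> <PIDPressure status>" lines on stdin
# and prints the names of nodes whose status is True.
#
# On a live cluster, input in this shape could be produced with (assumption):
#   oc get nodes -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.status.conditions[?(@.type=="PIDPressure")].status}{"\n"}{end}'
nodes_with_pid_pressure() {
  awk '$2 == "True" { print $1 }'
}

# Demonstration with made-up node names; only node-b is reported.
nodes_with_pid_pressure <<'EOF'
node-a False
node-b True
node-c False
EOF
```

An empty result means that no node in the input reported PID pressure.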
5.2. Risks of setting higher process ID limits for Red Hat OpenShift Service on AWS pods
The podPidsLimit parameter for a pod controls the maximum number of processes and threads that can run simultaneously in that pod.
You can increase the value for podPidsLimit from the default of 4,096 to a maximum of 16,384. Changing this value might incur downtime for applications, because changing the podPidsLimit requires rebooting the affected node.
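Because a change triggers node reboots, it can help to sanity-check a candidate value before applying it. The following is a minimal sketch that assumes the 4,096 to 16,384 range stated above. The function name check_pod_pids_limit is hypothetical and is not part of the ROSA CLI.

```shell
# Hypothetical pre-flight check; the bounds come from the text above.
check_pod_pids_limit() {
  value=$1
  # Reject anything that is not a plain non-negative integer.
  case $value in
    ''|*[!0-9]*) echo "not a number: $value"; return 1 ;;
  esac
  # Reject values outside the default-to-maximum range.
  if [ "$value" -lt 4096 ] || [ "$value" -gt 16384 ]; then
    echo "out of range (4096-16384): $value"
    return 1
  fi
  echo "ok: $value"
}

check_pod_pids_limit 8192   # prints "ok: 8192"
```

The function returns a non-zero status for rejected values, so it can gate a later command in a script.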
If you are running a large number of pods per node, and you have a high podPidsLimit value on your nodes, you risk exceeding the PID maximum for the node.
To find the maximum number of pods that you can run simultaneously on a single node without exceeding the PID maximum for the node, divide 3,650,000 by your podPidsLimit value. For example, if your podPidsLimit value is 16,384, and you expect the pods to use close to that number of process IDs, you can safely run 222 pods on a single node.
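The division described above can be scripted as a quick check. The following is a minimal sketch, not part of any ROSA tooling: the function name max_pods_per_node is hypothetical, and 3,650,000 is the figure quoted in the text. Shell integer division rounds down, which gives the conservative estimate.

```shell
# Hypothetical helper: estimate how many pods fit on one node if every pod
# uses its full podPidsLimit. 3650000 is the per-node figure from the text.
max_pods_per_node() {
  pod_pids_limit=$1
  # Arithmetic expansion performs integer division, so the result rounds down.
  echo $(( 3650000 / pod_pids_limit ))
}

max_pods_per_node 16384   # prints 222
max_pods_per_node 4096    # prints 891
```

The result is an upper bound for PIDs only and assumes that each pod uses close to its full limit.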
Memory, CPU, and available storage can also limit the maximum number of pods that can run simultaneously, even when the podPidsLimit value is set appropriately. For more information, see "Planning your environment" and "Limits and scalability".
5.3. Setting a higher process ID limit on an existing Red Hat OpenShift Service on AWS cluster
You can set a higher podPidsLimit on an existing Red Hat OpenShift Service on AWS (ROSA) cluster by creating or editing a KubeletConfig object that changes the --pod-pids-limit parameter.
Changing the podPidsLimit on an existing cluster triggers the non-control plane nodes in the cluster to reboot one at a time. Make this change outside of peak usage hours for your cluster, and avoid upgrading or hibernating your cluster until all nodes have rebooted.
Prerequisites
- You have a Red Hat OpenShift Service on AWS cluster.
- You have installed the ROSA CLI (rosa).
- You have installed the OpenShift CLI (oc).
- You have logged in to your Red Hat account by using the ROSA CLI.
Procedure
Create or edit the KubeletConfig object to change the PID limit.
- If this is the first time you are changing the default PID limit, create the KubeletConfig object and set the --pod-pids-limit value by running the following command:
$ rosa create kubeletconfig -c <cluster_name> --name <kubeletconfig_name> --pod-pids-limit=<value>
Note: The --name parameter is optional on ROSA Classic clusters, because only one KubeletConfig object is supported per ROSA Classic cluster.
For example, the following command sets a maximum of 16,384 PIDs per pod for cluster my-cluster:
$ rosa create kubeletconfig -c my-cluster --name set-high-pids --pod-pids-limit=16384
- If you previously created a KubeletConfig object, edit the existing KubeletConfig object and set the --pod-pids-limit value by running the following command:
$ rosa edit kubeletconfig -c <cluster_name> --name <kubeletconfig_name> --pod-pids-limit=<value>
A cluster-wide rolling reboot of worker nodes is triggered.
Verify that all of the worker nodes rebooted by running the following command:
$ oc get machineconfigpool
Example output
NAME     CONFIG                    UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
master   rendered-master-06c9c4…   True      False      False      3              3                   3                     0                      4h42m
worker   rendered-worker-f4b64…    True      False      False      4              4                   4                     0                      4h42m
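If you want to check this output in a script instead of reading it by eye, a small filter along the following lines could work. It is a sketch under assumptions: the function name all_pools_updated is hypothetical, and it relies on UPDATED, UPDATING, and DEGRADED being the third, fourth, and fifth columns, as in the example output above.

```shell
# Hypothetical helper: reads `oc get machineconfigpool` output on stdin and
# succeeds only if every pool shows UPDATED=True, UPDATING=False, DEGRADED=False.
# Assumes the default column order shown in the example output.
all_pools_updated() {
  awk 'NR > 1 && ($3 != "True" || $4 != "False" || $5 != "False") {
         bad = 1
         print "not ready: " $1
       }
       END { exit bad + 0 }'
}

# Demonstration using the example output from this section.
# On a live cluster: oc get machineconfigpool | all_pools_updated
all_pools_updated <<'EOF' && echo "all pools updated"
NAME     CONFIG                    UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
master   rendered-master-06c9c4…   True      False      False      3              3                   3                     0                      4h42m
worker   rendered-worker-f4b64…    True      False      False      4              4                   4                     0                      4h42m
EOF
```

While a rolling reboot is still in progress, the function prints the names of the pools that are not ready and returns a non-zero status.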
Verification
When each node in the cluster has rebooted, you can verify that the new setting is in place.
Check the Pod Pids limit in the KubeletConfig object:
$ rosa describe kubeletconfig --cluster=<cluster_name>
The new PIDs limit appears in the output, as shown in the following example:
Example output
Pod Pids Limit: 16384
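To compare the reported value with the one you set, the limit can be pulled out of the describe output with a short filter. This sketch assumes that the output contains a line in the "Pod Pids Limit: <number>" form shown in the example. The function name pod_pids_limit_from_describe is hypothetical.

```shell
# Hypothetical helper: reads `rosa describe kubeletconfig` output on stdin and
# prints only the numeric Pod Pids Limit value.
# Assumes a line of the form "Pod Pids Limit: <number>", as in the example.
pod_pids_limit_from_describe() {
  sed -n 's/^[[:space:]]*Pod Pids Limit:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Demonstration using the example output line.
# On a live cluster: rosa describe kubeletconfig --cluster=<cluster_name> | pod_pids_limit_from_describe
echo "Pod Pids Limit: 16384" | pod_pids_limit_from_describe   # prints 16384
```

If the expected line is missing from the input, the function prints nothing.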
5.4. Removing custom configuration from a cluster
You can remove custom configuration from your cluster by removing the KubeletConfig object that contains the configuration details.
Prerequisites
- You have an existing Red Hat OpenShift Service on AWS cluster.
- You have installed the ROSA CLI (rosa).
- You have logged in to your Red Hat account by using the ROSA CLI.
Procedure
Remove custom configuration from the cluster by deleting the relevant custom KubeletConfig object:
$ rosa delete kubeletconfig --cluster <cluster_name> --name <kubeletconfig_name>
Verification steps
Confirm that the custom KubeletConfig object is not listed for the cluster:
$ rosa describe kubeletconfig --cluster <cluster_name>