Chapter 44. Setting limits for applications
You can use the control groups (cgroups) kernel functionality to set limits on, prioritize, or isolate the hardware resources of processes. This gives you granular control over the resource usage of applications, so that system resources can be utilized more efficiently.
44.1. Understanding control groups
Control groups is a Linux kernel feature that enables you to organize processes into hierarchically ordered groups called cgroups. The hierarchy (the control groups tree) is defined by giving structure to the cgroups virtual file system, mounted by default on the /sys/fs/cgroup/ directory. The systemd system and service manager utilizes cgroups to organize all units and services that it governs. Alternatively, you can manage cgroups hierarchies manually by creating and removing sub-directories in the /sys/fs/cgroup/ directory.
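The manual approach amounts to ordinary directory operations on the cgroups file system. The following is a hedged sketch; the group name demo is illustrative (not created by the system), and root privileges are required:

```shell
# Sketch: manage a cgroups-v1 hierarchy by hand (run as root).
# "demo" is an illustrative group name, not part of the default layout.
mkdir /sys/fs/cgroup/memory/demo    # creating the directory creates the cgroup
ls /sys/fs/cgroup/memory/demo       # the kernel populates its interface files
rmdir /sys/fs/cgroup/memory/demo    # removing the (empty) directory deletes the cgroup
```

Note that the interface files inside the directory appear and disappear automatically; you never create or delete them yourself.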
The resource controllers (a kernel component) then modify the behavior of processes in cgroups by limiting, prioritizing, or allocating system resources, such as CPU time, memory, network bandwidth, or various combinations of these.
The added value of cgroups is process aggregation, which enables the division of hardware resources among applications and users. This can increase the overall efficiency, stability, and security of the users' environment.
- Control groups version 1
Control groups version 1 (cgroups-v1) provides a per-resource controller hierarchy. This means that each resource, such as CPU, memory, or I/O, has its own control group hierarchy. It is possible to combine different control group hierarchies in a way that one controller can coordinate with another one in managing their respective resources. However, the two controllers may belong to different process hierarchies, which does not permit their proper coordination.
The cgroups-v1 controllers were developed across a large time span and as a result, the behavior and naming of their control files is not uniform.
- Control groups version 2
The problems with controller coordination, which stemmed from hierarchy flexibility, led to the development of control groups version 2.
Control groups version 2 (cgroups-v2) provides a single control group hierarchy against which all resource controllers are mounted.
The control file behavior and naming is consistent among different controllers.
Note: cgroups-v2 is fully supported in RHEL 8.2 and later versions. For more information, see Control Group v2 is now fully supported in RHEL 8.
This sub-section was based on a Devconf.cz 2019 presentation.[4]
Additional resources
- What are kernel resource controllers
- cgroups(7) manual page
- Role of systemd in control groups
44.2. What are kernel resource controllers
The functionality of control groups is enabled by kernel resource controllers. RHEL 8 supports various controllers for control groups version 1 (cgroups-v1) and control groups version 2 (cgroups-v2).
A resource controller, also called a control group subsystem, is a kernel subsystem that represents a single resource, such as CPU time, memory, network bandwidth, or disk I/O. The Linux kernel provides a range of resource controllers that are mounted automatically by the systemd system and service manager. You can find a list of the currently mounted resource controllers in the /proc/cgroups file.
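For example, you can print that file directly and extract just the controller names; the awk invocation below is a convenience for filtering, not part of the file format itself:

```shell
# /proc/cgroups is a tab-separated table: subsystem name, hierarchy ID,
# number of cgroups, and an enabled flag.
cat /proc/cgroups
# Print only the controller names, skipping the "#subsys_name" header line:
awk 'NR > 1 { print $1 }' /proc/cgroups
```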
The following controllers are available for cgroups-v1:
- blkio - can set limits on input/output access to and from block devices.
- cpu - can adjust the parameters of the Completely Fair Scheduler (CFS) for a control group's tasks. It is mounted together with the cpuacct controller on the same mount.
- cpuacct - creates automatic reports on CPU resources used by tasks in a control group. It is mounted together with the cpu controller on the same mount.
- cpuset - can be used to restrict control group tasks to run only on a specified subset of CPUs and to direct the tasks to use memory only on specified memory nodes.
- devices - can control access to devices for tasks in a control group.
- freezer - can be used to suspend or resume tasks in a control group.
- memory - can be used to set limits on memory use by tasks in a control group and generates automatic reports on memory resources used by those tasks.
- net_cls - tags network packets with a class identifier (classid) that enables the Linux traffic controller (the tc command) to identify packets that originate from a particular control group task. A subsystem of net_cls, the net_filter (iptables), can also use this tag to perform actions on such packets. The net_filter subsystem tags network sockets with a firewall identifier (fwid) that allows the Linux firewall (through the iptables command) to identify packets originating from a particular control group task.
- net_prio - sets the priority of network traffic.
- pids - can set limits on the number of processes and their children in a control group.
- perf_event - can group tasks for monitoring by the perf performance monitoring and reporting utility.
- rdma - can set limits on Remote Direct Memory Access/InfiniBand specific resources in a control group.
- hugetlb - can be used to limit the usage of large size virtual memory pages by tasks in a control group.
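As a concrete illustration of one of these controllers, the pids controller can cap the process count of a group. This is a hedged sketch (run as root); the group name Demo is hypothetical, not one created by the system:

```shell
# Sketch: limit a control group to at most 10 processes with the
# cgroups-v1 pids controller. "Demo" is an illustrative name.
mkdir /sys/fs/cgroup/pids/Demo
echo 10 > /sys/fs/cgroup/pids/Demo/pids.max      # cap the number of PIDs
echo $$ > /sys/fs/cgroup/pids/Demo/cgroup.procs  # move the current shell in
cat /sys/fs/cgroup/pids/Demo/pids.current        # processes currently counted
# Once the group holds 10 PIDs, further fork()/clone() calls in it fail.
```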
The following controllers are available for cgroups-v2:
- io - a follow-up to blkio of cgroups-v1.
- memory - a follow-up to memory of cgroups-v1.
- pids - same as pids in cgroups-v1.
- rdma - same as rdma in cgroups-v1.
- cpu - a follow-up to cpu and cpuacct of cgroups-v1.
- cpuset - supports only the core functionality (cpus{,.effective}, mems{,.effective}) with a new partition feature.
- perf_event - support is inherent, with no explicit control file. You can specify a v2 cgroup as a parameter to the perf command, which will profile all the tasks within that cgroup.
A resource controller can be used either in a cgroups-v1 hierarchy or in a cgroups-v2 hierarchy, not simultaneously in both.
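You can check which hierarchy types a system has mounted. The grep patterns below assume the usual mount output format, in which v1 hierarchies use the cgroup file system type and the unified v2 hierarchy uses cgroup2:

```shell
# v1 hierarchies: one mount per controller (or controller group)
mount | grep '^cgroup '
# v2 unified hierarchy: a single cgroup2 mount, if present
mount | grep '^cgroup2 '
```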
Additional resources
- cgroups(7) manual page
- Documentation in the /usr/share/doc/kernel-doc-<kernel_version>/Documentation/cgroups-v1/ directory (after installing the kernel-doc package)
44.3. What are namespaces
Namespaces are one of the most important methods for organizing and identifying software objects.
A namespace wraps a global system resource (for example, a mount point, a network device, or a hostname) in an abstraction that makes it appear to processes within the namespace that they have their own isolated instance of the global resource. One of the most common technologies that use namespaces is containers.
Changes to a particular global resource are visible only to processes in that namespace and do not affect the rest of the system or other namespaces.
To inspect which namespaces a process is a member of, you can check the symbolic links in the /proc/<PID>/ns/ directory.
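For example, comparing the link targets for two processes shows whether they share a namespace; this is a minimal sketch using the current shell:

```shell
# Each entry in /proc/<PID>/ns/ is a symbolic link whose target has the
# form <type>:[<inode>]; two processes with equal targets for a given
# entry are in the same namespace of that type.
ls -l /proc/$$/ns/
readlink /proc/$$/ns/uts     # e.g. uts:[4026531838]
```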
The following table shows supported namespaces and resources which they isolate:
Namespace | Isolates |
---|---|
Mount | Mount points |
UTS | Hostname and NIS domain name |
IPC | System V IPC, POSIX message queues |
PID | Process IDs |
Network | Network devices, stacks, ports, etc. |
User | User and group IDs |
Control groups | Control group root directory |
Additional resources
- namespaces(7) and cgroup_namespaces(7) manual pages
- Understanding control groups
44.4. Setting CPU limits to applications using cgroups-v1
Sometimes an application consumes so much CPU time that it negatively impacts the overall health of your environment. Use the cgroups virtual file system, mounted under /sys/fs/cgroup/, to configure CPU limits for an application using control groups version 1 (cgroups-v1).
Prerequisites
- You have root permissions.
- You have an application whose CPU consumption you want to restrict.
- You verified that the cgroups-v1 controllers are mounted:

    # mount -l | grep cgroup
    tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,seclabel,mode=755)
    cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd)
    cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,cpu,cpuacct)
    cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,perf_event)
    cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,pids)
    ...
Procedure
Identify the process ID (PID) of the application whose CPU consumption you want to restrict:

    # top
    top - 11:34:09 up 11 min,  1 user,  load average: 0.51, 0.27, 0.22
    Tasks: 267 total,   3 running, 264 sleeping,   0 stopped,   0 zombie
    %Cpu(s): 49.0 us,  3.3 sy,  0.0 ni, 47.5 id,  0.0 wa,  0.2 hi,  0.0 si,  0.0 st
    MiB Mem :   1826.8 total,    303.4 free,   1046.8 used,    476.5 buff/cache
    MiB Swap:   1536.0 total,   1396.0 free,    140.0 used.    616.4 avail Mem

      PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
     6955 root      20   0  228440   1752   1472 R  99.3   0.1   0:32.71 sha1sum
     5760 jdoe      20   0 3603868 205188  64196 S   3.7  11.0   0:17.19 gnome-shell
     6448 jdoe      20   0  743648  30640  19488 S   0.7   1.6   0:02.73 gnome-terminal-
        1 root      20   0  245300   6568   4116 S   0.3   0.4   0:01.87 systemd
      505 root      20   0       0      0      0 I   0.3   0.0   0:00.75 kworker/u4:4-events_unbound
    ...
The example output of the top program reveals that PID 6955 (illustrative application sha1sum) consumes a lot of CPU resources.

Create a sub-directory in the cpu resource controller directory:

    # mkdir /sys/fs/cgroup/cpu/Example/
The directory above represents a control group, where you can place specific processes and apply certain CPU limits to them. At the same time, some cgroups-v1 interface files and cpu controller-specific files are created in the directory.

Optionally, inspect the newly created control group:

    # ll /sys/fs/cgroup/cpu/Example/
    -rw-r--r--. 1 root root 0 Mar 11 11:42 cgroup.clone_children
    -rw-r--r--. 1 root root 0 Mar 11 11:42 cgroup.procs
    -r--r--r--. 1 root root 0 Mar 11 11:42 cpuacct.stat
    -rw-r--r--. 1 root root 0 Mar 11 11:42 cpuacct.usage
    -r--r--r--. 1 root root 0 Mar 11 11:42 cpuacct.usage_all
    -r--r--r--. 1 root root 0 Mar 11 11:42 cpuacct.usage_percpu
    -r--r--r--. 1 root root 0 Mar 11 11:42 cpuacct.usage_percpu_sys
    -r--r--r--. 1 root root 0 Mar 11 11:42 cpuacct.usage_percpu_user
    -r--r--r--. 1 root root 0 Mar 11 11:42 cpuacct.usage_sys
    -r--r--r--. 1 root root 0 Mar 11 11:42 cpuacct.usage_user
    -rw-r--r--. 1 root root 0 Mar 11 11:42 cpu.cfs_period_us
    -rw-r--r--. 1 root root 0 Mar 11 11:42 cpu.cfs_quota_us
    -rw-r--r--. 1 root root 0 Mar 11 11:42 cpu.rt_period_us
    -rw-r--r--. 1 root root 0 Mar 11 11:42 cpu.rt_runtime_us
    -rw-r--r--. 1 root root 0 Mar 11 11:42 cpu.shares
    -r--r--r--. 1 root root 0 Mar 11 11:42 cpu.stat
    -rw-r--r--. 1 root root 0 Mar 11 11:42 notify_on_release
    -rw-r--r--. 1 root root 0 Mar 11 11:42 tasks
The example output shows files, such as cpuacct.usage and cpu.cfs_period_us, that represent specific configurations and limits, which can be set for processes in the Example control group. Notice that the respective file names are prefixed with the name of the control group controller to which they belong.

By default, the newly created control group inherits access to the system's entire CPU resources without a limit.

Configure CPU limits for the control group:

    # echo "1000000" > /sys/fs/cgroup/cpu/Example/cpu.cfs_period_us
    # echo "200000" > /sys/fs/cgroup/cpu/Example/cpu.cfs_quota_us
The cpu.cfs_period_us file represents a period of time, in microseconds (µs, represented here as "us"), for how frequently a control group's access to CPU resources should be reallocated. The upper limit is 1 second and the lower limit is 1000 microseconds.

The cpu.cfs_quota_us file represents the total amount of time in microseconds for which all processes collectively in a control group can run during one period (as defined by cpu.cfs_period_us). As soon as processes in a control group, during a single period, use up all the time specified by the quota, they are throttled for the remainder of the period and are not allowed to run until the next period. The lower limit is 1000 microseconds.

The example commands above set the CPU time limits so that all processes collectively in the Example control group can run for only 0.2 seconds (defined by cpu.cfs_quota_us) out of every 1 second (defined by cpu.cfs_period_us), that is, 20% of one CPU.

Optionally, verify the limits:

    # cat /sys/fs/cgroup/cpu/Example/cpu.cfs_period_us /sys/fs/cgroup/cpu/Example/cpu.cfs_quota_us
    1000000
    200000
Add the application's PID to the Example control group:

    # echo "6955" > /sys/fs/cgroup/cpu/Example/cgroup.procs

or:

    # echo "6955" > /sys/fs/cgroup/cpu/Example/tasks
The previous command ensures that the desired application becomes a member of the Example control group and hence does not exceed the CPU limits configured for the Example control group. The PID should represent an existing process in the system. PID 6955 here was assigned to the process sha1sum /dev/zero &, used to illustrate the use case of the cpu controller.

Verify that the application runs in the specified control group:

    # cat /proc/6955/cgroup
    12:cpuset:/
    11:hugetlb:/
    10:net_cls,net_prio:/
    9:memory:/user.slice/user-1000.slice/user@1000.service
    8:devices:/user.slice
    7:blkio:/
    6:freezer:/
    5:rdma:/
    4:pids:/user.slice/user-1000.slice/user@1000.service
    3:perf_event:/
    2:cpu,cpuacct:/Example
    1:name=systemd:/user.slice/user-1000.slice/user@1000.service/gnome-terminal-server.service
The example output above shows that the process of the desired application runs in the Example control group, which applies CPU limits to the application's process.

Identify the current CPU consumption of your throttled application:

    # top
    top - 12:28:42 up  1:06,  1 user,  load average: 1.02, 1.02, 1.00
    Tasks: 266 total,   6 running, 260 sleeping,   0 stopped,   0 zombie
    %Cpu(s): 11.0 us,  1.2 sy,  0.0 ni, 87.5 id,  0.0 wa,  0.2 hi,  0.0 si,  0.2 st
    MiB Mem :   1826.8 total,    287.1 free,   1054.4 used,    485.3 buff/cache
    MiB Swap:   1536.0 total,   1396.7 free,    139.2 used.    608.3 avail Mem

      PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
     6955 root      20   0  228440   1752   1472 R  20.6   0.1  47:11.43 sha1sum
     5760 jdoe      20   0 3604956 208832  65316 R   2.3  11.2   0:43.50 gnome-shell
     6448 jdoe      20   0  743836  31736  19488 S   0.7   1.7   0:08.25 gnome-terminal-
      505 root      20   0       0      0      0 I   0.3   0.0   0:03.39 kworker/u4:4-events_unbound
     4217 root      20   0   74192   1612   1320 S   0.3   0.1   0:01.19 spice-vdagentd
    ...
Notice that the CPU consumption of PID 6955 has decreased from 99% to 20%.
The cgroups-v2 counterpart for cpu.cfs_period_us and cpu.cfs_quota_us is the cpu.max file, which is available through the cpu controller.
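Under cgroups-v2 the same 20% limit would be written as a single quota-and-period pair. The sketch below assumes a v2 (unified) hierarchy mounted at /sys/fs/cgroup and an existing child cgroup named Example, which are assumptions, not the state of the system configured above:

```shell
# cgroups-v2 sketch: cpu.max holds "<quota> <period>" in microseconds.
# Path assumes a unified mount at /sys/fs/cgroup with an existing
# "Example" child cgroup; run as root.
echo "200000 1000000" > /sys/fs/cgroup/Example/cpu.max
cat /sys/fs/cgroup/Example/cpu.max
# Effective CPU share = quota / period = 200000 / 1000000 = 20%.
```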
Additional resources
- Understanding control groups
- What are kernel resource controllers
- cgroups(7) and sysfs(5) manual pages