Chapter 10. Using cgroupfs to manually manage cgroups
You can manage cgroup hierarchies on your system by creating directories in the cgroupfs virtual file system. The file system is mounted by default on the /sys/fs/cgroup/ directory, and you specify the desired configuration in dedicated control files.
In general, Red Hat recommends that you use systemd to control the usage of system resources. Manually configure the cgroups virtual file system only in special cases, for example, when you need cgroup-v1 controllers that have no equivalents in the cgroup-v2 hierarchy.
10.1. Creating cgroups and enabling controllers in cgroups-v2 file system
You can manage the control groups (cgroups) by creating or removing directories and by writing to files in the cgroups virtual file system. The file system is mounted by default on the /sys/fs/cgroup/ directory. To use settings from the cgroups controllers, you also need to enable the required controllers for child cgroups. By default, the root cgroup enables the memory and pids controllers for its child cgroups. Therefore, create at least two levels of child cgroups inside the /sys/fs/cgroup/ root cgroup. This way, you can optionally remove the memory and pids controllers from the second-level child cgroups and keep the cgroup files better organized.
Prerequisites
- You have root permissions on the system.
Procedure
Create the /sys/fs/cgroup/Example/ directory:
# mkdir /sys/fs/cgroup/Example/
The /sys/fs/cgroup/Example/ directory defines a child group. When you create the directory, some cgroups-v2 interface files are automatically created in it. The directory also contains controller-specific files for the memory and pids controllers.
Optional: Inspect the newly created child control group by listing the contents of the /sys/fs/cgroup/Example/ directory.
The listing shows general cgroup control interface files such as cgroup.procs or cgroup.controllers. These files are common to all control groups, regardless of enabled controllers.
Files such as memory.high and pids.max relate to the memory and pids controllers, which are in the root control group (/sys/fs/cgroup/) and are enabled by default by systemd.
By default, the newly created child group inherits all settings from the parent cgroup. In this case, there are no limits from the root cgroup.
Verify that the required controllers are available in the /sys/fs/cgroup/cgroup.controllers file:
# cat /sys/fs/cgroup/cgroup.controllers
cpuset cpu io memory hugetlb pids rdma
Enable the required controllers. In this example, these are the cpu and cpuset controllers:
# echo "+cpu" >> /sys/fs/cgroup/cgroup.subtree_control
# echo "+cpuset" >> /sys/fs/cgroup/cgroup.subtree_control
These commands enable the cpu and cpuset controllers for the immediate child groups of the /sys/fs/cgroup/ root control group, including the newly created Example control group. A child group is where you can specify processes and apply control checks to each of the processes based on your criteria.
You can read the contents of the cgroup.subtree_control file at any level to see which controllers are going to be available for enablement in the immediate child group.
Note: By default, the /sys/fs/cgroup/cgroup.subtree_control file in the root control group contains the memory and pids controllers.
Enable the required controllers for child cgroups of the Example control group:
# echo "+cpu +cpuset" >> /sys/fs/cgroup/Example/cgroup.subtree_control
This command ensures that the immediate child control group has only the controllers relevant to regulating the distribution of CPU time, not the memory or pids controllers.
Create the /sys/fs/cgroup/Example/tasks/ directory:
# mkdir /sys/fs/cgroup/Example/tasks/
The /sys/fs/cgroup/Example/tasks/ directory defines a child group with files that relate purely to the cpu and cpuset controllers. You can now assign processes to this control group and use cpu and cpuset controller options for your processes.
Optional: Inspect the child control group by listing the contents of the /sys/fs/cgroup/Example/tasks/ directory.
The cpu controller is only activated if the relevant child control group has at least 2 processes that compete for time on a single CPU.
Verification
Optional: Confirm that you have created a new cgroup with only the required controllers active:
# cat /sys/fs/cgroup/Example/tasks/cgroup.controllers
cpuset cpu
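The procedure above can be collected into one small shell function. This is only a sketch under stated assumptions: the function name `setup_example_cgroup` and its optional root-directory argument are hypothetical additions, included so that the sequence can be rehearsed against a scratch directory. On a real system the root is /sys/fs/cgroup and the commands need root permissions.

```shell
#!/bin/sh
# Sketch: create Example/ and Example/tasks/ and pass the cpu and cpuset
# controllers down two levels, in the same order as the procedure above.
# The optional first argument (hypothetical) overrides the cgroup root.
setup_example_cgroup() {
    root="${1:-/sys/fs/cgroup}"

    # First-level child group.
    mkdir -p "$root/Example" || return 1

    # Enable cpu and cpuset for the immediate children of the root group.
    echo "+cpu" >> "$root/cgroup.subtree_control" || return 1
    echo "+cpuset" >> "$root/cgroup.subtree_control" || return 1

    # Enable only cpu and cpuset for the children of Example.
    echo "+cpu +cpuset" >> "$root/Example/cgroup.subtree_control" || return 1

    # Second-level child group that holds the processes.
    mkdir -p "$root/Example/tasks" || return 1

    # On a real cgroupfs this file exists and lists the active controllers.
    if [ -r "$root/Example/tasks/cgroup.controllers" ]; then
        cat "$root/Example/tasks/cgroup.controllers"
    fi
}
```

Calling `setup_example_cgroup` without arguments as root targets the real hierarchy. Passing a temporary directory only exercises the sequence of writes and does not create real control groups.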
10.2. Controlling distribution of CPU time for applications by adjusting CPU weight
To regulate the distribution of CPU time to applications under a specific cgroup tree, assign values to the relevant files of the cpu controller.
Prerequisites
- You have root permissions on the system.
- You have applications for which you want to control distribution of CPU time.
- You mounted the cgroups-v2 file system.
- You created a two-level hierarchy of child control groups inside the /sys/fs/cgroup/ root control group, for example, an Example control group that contains the g1, g2, and g3 child groups used in this procedure.
- You enabled the cpu controller in the parent control group and in the child control groups, as described in Creating cgroups and enabling controllers in cgroups-v2 file system.
Procedure
Configure the required CPU weights to achieve resource restrictions within the control groups:
# echo "150" > /sys/fs/cgroup/Example/g1/cpu.weight
# echo "100" > /sys/fs/cgroup/Example/g2/cpu.weight
# echo "50" > /sys/fs/cgroup/Example/g3/cpu.weight
Add the applications' PIDs to the g1, g2, and g3 child groups:
# echo "33373" > /sys/fs/cgroup/Example/g1/cgroup.procs
# echo "33374" > /sys/fs/cgroup/Example/g2/cgroup.procs
# echo "33377" > /sys/fs/cgroup/Example/g3/cgroup.procs
These commands ensure that the required applications become members of the Example/g*/ child cgroups and get their CPU time distributed based on the configuration of those cgroups.
The weights of the child cgroups (g1, g2, g3) that have running processes are summed up at the level of the parent cgroup (Example). The CPU resource is then distributed proportionally based on the assigned weights.
As a result, when all processes run at the same time, the kernel allocates to each of them a proportionate share of CPU time based on the cpu.weight file of its cgroup:
Child cgroup | cpu.weight file | CPU time allocation |
---|---|---|
g1 | 150 | ~50% (150/300) |
g2 | 100 | ~33% (100/300) |
g3 | 50 | ~16% (50/300) |
The value of the cpu.weight controller file is not a percentage.
If one process stopped running, leaving cgroup g2 with no running processes, the calculation would omit cgroup g2 and account only for the weights of cgroups g1 and g3:
Child cgroup | cpu.weight file | CPU time allocation |
---|---|---|
g1 | 150 | ~75% (150/200) |
g3 | 50 | ~25% (50/200) |
Important: If a child cgroup has multiple running processes, the CPU time allocated to the cgroup is distributed equally among its member processes.
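The arithmetic behind the two tables can be reproduced with a few lines of shell. The function name `cpu_shares` is a hypothetical helper, not part of any tool. It takes the weights of the child cgroups that currently have running processes and prints each weight's share as an integer percentage, rounded down.

```shell
#!/bin/sh
# Print the approximate CPU share (integer percent, rounded down) per weight.
# Pass only the weights of child cgroups that have running processes.
cpu_shares() {
    total=0
    for w in "$@"; do
        total=$((total + w))
    done
    for w in "$@"; do
        echo "$w -> $((w * 100 / total))%"
    done
}

# All three child groups are busy: the total weight is 300.
cpu_shares 150 100 50

# g2 is idle: the total weight is 200.
cpu_shares 150 50
```

The first call prints shares of 50%, 33%, and 16%. The second prints 75% and 25%, which matches the case where g2 has no running processes.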
Verification
Verify that the applications run in the specified control groups:
# cat /proc/33373/cgroup /proc/33374/cgroup /proc/33377/cgroup
0::/Example/g1
0::/Example/g2
0::/Example/g3
The command output shows the processes of the specified applications that run in the Example/g*/ child cgroups.
Inspect the current CPU consumption of the throttled applications.
Note: All processes run on a single CPU for clear illustration. The CPU weight applies the same principles when used on multiple CPUs.
Notice that the CPU resource for PID 33373, PID 33374, and PID 33377 was allocated based on the weights of 150, 100, and 50 that you assigned to the child cgroups. The weights correspond to around 50%, 33%, and 16% allocation of CPU time for each application.
10.3. Mounting cgroups-v1
During the boot process, RHEL 10 mounts the cgroup-v2 virtual filesystem by default. To use cgroup-v1 functionality to limit resources for your applications, manually configure the system.
Both cgroup-v1 and cgroup-v2 are fully enabled in the kernel. From the kernel point of view, there is no default control group version. Instead, systemd decides which version to mount at startup.
Prerequisites
- You have root permissions on the system.
Procedure
Configure the system to mount cgroups-v1 by default during system boot by the systemd system and service manager:
# grubby --update-kernel=/boot/vmlinuz-$(uname -r) --args="systemd.unified_cgroup_hierarchy=0 systemd.legacy_systemd_cgroup_controller"
This adds the necessary kernel command-line parameters to the current boot entry.
To add the same parameters to all kernel boot entries:
# grubby --update-kernel=ALL --args="systemd.unified_cgroup_hierarchy=0 systemd.legacy_systemd_cgroup_controller"
Reboot the system for the changes to take effect.
Verification
Verify that the cgroups-v1 filesystem was mounted. The mount listing should show the cgroups-v1 filesystems that correspond to the various cgroup-v1 controllers, mounted on the /sys/fs/cgroup/ directory.
Inspect the contents of the /sys/fs/cgroup/ directory. The /sys/fs/cgroup/ directory, also called the root control group, contains by default controller-specific directories such as cpuset. In addition, there are some directories related to systemd.
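A quick scripted check of which hierarchy is mounted is to look at the file system type of /sys/fs/cgroup/. The sketch below is not taken from the procedure above. It assumes the commonly reported behavior that `stat -fc %T` prints `cgroup2fs` on a cgroup-v2 system and `tmpfs` on a cgroup-v1 system, and the function names are hypothetical.

```shell
#!/bin/sh
# Map a file system type name, as printed by `stat -fc %T`, to a version label.
# Assumption: cgroup2fs indicates cgroups-v2 and tmpfs indicates cgroups-v1.
cgroup_version_from_fstype() {
    case "$1" in
        cgroup2fs) echo "v2" ;;
        tmpfs)     echo "v1" ;;
        *)         echo "unknown" ;;
    esac
}

# Report the version for a mount point (default: /sys/fs/cgroup).
cgroup_version() {
    fstype=$(stat -fc %T "${1:-/sys/fs/cgroup}" 2>/dev/null) || fstype=""
    cgroup_version_from_fstype "$fstype"
}

cgroup_version
```

An `unknown` result means only that the mount point has a different file system type or does not exist. Treat it as a prompt to inspect the mounts manually.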
10.4. Setting CPU limits to applications using cgroups-v1
To configure CPU limits for an application by using control groups version 1 (cgroups-v1), use the /sys/fs/ virtual file system.
Prerequisites
- You have root permissions on the system.
- You have installed the application whose CPU consumption you want to restrict.
- You configured the system to mount cgroups-v1 by default during system boot by the systemd system and service manager:
  # grubby --update-kernel=/boot/vmlinuz-$(uname -r) --args="systemd.unified_cgroup_hierarchy=0 systemd.legacy_systemd_cgroup_controller"
  This adds the necessary kernel command-line parameters to the current boot entry.
Procedure
Identify the process ID (PID) of the application whose CPU consumption you want to restrict.
In this example, the sha1sum application with PID 6955 consumes a large amount of CPU resources.
Create a sub-directory in the cpu resource controller directory:
# mkdir /sys/fs/cgroup/cpu/Example/
This directory represents a control group, where you can place specific processes and apply certain CPU limits to the processes. At the same time, a number of cgroups-v1 interface files and cpu controller-specific files are created in the directory.
Optional: Inspect the newly created control group by listing the contents of the /sys/fs/cgroup/cpu/Example/ directory.
Files such as cpuacct.usage and cpu.cfs_period_us represent specific configurations or limits that can be set for processes in the Example control group. Note that the file names are prefixed with the name of the control group controller they belong to.
By default, the newly created control group inherits access to the system’s entire CPU resources without a limit.
Configure CPU limits for the control group:
# echo "1000000" > /sys/fs/cgroup/cpu/Example/cpu.cfs_period_us
# echo "200000" > /sys/fs/cgroup/cpu/Example/cpu.cfs_quota_us
- The cpu.cfs_period_us file represents how often a control group’s access to CPU resources must be reallocated. The time period is in microseconds (µs, "us"). The upper limit is 1 000 000 microseconds and the lower limit is 1000 microseconds.
- The cpu.cfs_quota_us file represents the total amount of time in microseconds for which all processes in a control group can collectively run during one period, as defined by cpu.cfs_period_us. When processes in a control group use up all the time specified by the quota during a single period, they are throttled for the remainder of the period and not allowed to run until the next period. The lower limit is 1000 microseconds.
The example commands above set the CPU time limits so that all processes in the Example control group can collectively run for only 0.2 seconds (defined by cpu.cfs_quota_us) out of every 1 second (defined by cpu.cfs_period_us).
Optional: Verify the limits:
# cat /sys/fs/cgroup/cpu/Example/cpu.cfs_period_us /sys/fs/cgroup/cpu/Example/cpu.cfs_quota_us
1000000
200000
Add the application’s PID to the Example control group:
# echo "6955" > /sys/fs/cgroup/cpu/Example/cgroup.procs
This command ensures that a specific application becomes a member of the Example control group and does not exceed the CPU limits configured for the Example control group. The PID must represent an existing process in the system. PID 6955 here was assigned to the sha1sum /dev/zero & process, used to illustrate the use case of the cpu controller.
Verification
Verify that the application runs in the specified control group.
The process of the application runs in the Example control group, which applies CPU limits to the application’s process.
Identify the current CPU consumption of your throttled application.
Note that the CPU consumption of PID 6955 has decreased from 99% to 20%.
The cgroups-v2 counterpart for cpu.cfs_period_us and cpu.cfs_quota_us is the cpu.max file. The cpu.max file is available through the cpu controller.
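To make the relationship concrete, the sketch below converts a cgroups-v1 quota and period pair into the single line that cpu.max takes, and computes the resulting limit as a percentage of one CPU. It assumes the `<max> <period>` layout of cpu.max from the kernel's cgroup-v2 documentation, with `max` standing for no limit (a quota of -1 in cgroups-v1). Both function names are hypothetical.

```shell
#!/bin/sh
# Build the "<max> <period>" line for cpu.max from v1-style values (microseconds).
# A negative quota means "no limit" in cgroups-v1 and maps to "max" here.
cpu_max_line() {
    quota="$1"
    period="$2"
    if [ "$quota" -lt 0 ]; then
        quota="max"
    fi
    echo "$quota $period"
}

# Integer percentage of one CPU that the quota allows per period.
cpu_limit_percent() {
    echo "$(( $1 * 100 / $2 ))"
}

# Values from the procedure above: 0.2 seconds out of every 1 second.
cpu_max_line 200000 1000000
cpu_limit_percent 200000 1000000
```

With the example values, the first call prints `200000 1000000` and the second prints `20`, in line with the roughly 20% CPU consumption observed in the verification step. On a cgroups-v2 system, you would write the first line to the cpu.max file of the relevant child group.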
Using the control groups (cgroups) kernel functionality, you can control the resource usage of applications so that system resources are used more efficiently.
You can use cgroups for the following tasks:
- Setting limits for system resource allocation.
- Prioritizing the allocation of hardware resources to specific processes.
- Isolating certain processes from obtaining hardware resources.
10.5. Introducing control groups
Using the control groups Linux kernel feature, you can organize processes into hierarchically ordered groups, called cgroups. You define the hierarchy (control groups tree) by providing structure to the cgroups virtual file system, mounted by default on the /sys/fs/cgroup/ directory.
The systemd service manager uses cgroups to organize all units and services that it governs. You can also manage the hierarchies of cgroups manually by creating and removing sub-directories in the /sys/fs/cgroup/ directory.
The resource controllers in the kernel then modify the behavior of processes in cgroups by limiting, prioritizing, or allocating the system resources of those processes. These resources include the following:
- CPU time
- Memory
- Network bandwidth
- Combinations of these resources
The primary use case of cgroups is aggregating system processes and dividing hardware resources among applications and users. This makes it possible to increase the efficiency, stability, and security of your environment.
Control groups version 1
- Control groups version 1 (cgroups-v1) provide a per-resource controller hierarchy. Each resource, such as CPU, memory, or I/O, has its own control group hierarchy. You can combine different control group hierarchies in a way that one controller can coordinate with another in managing their respective resources. However, when the two controllers belong to different process hierarchies, the coordination is limited. The cgroups-v1 controllers were developed across a large time span, resulting in inconsistent behavior and naming of their control files.
Control groups version 2
- Control groups version 2 (cgroups-v2) provide a single control group hierarchy against which all resource controllers are mounted. The control file behavior and naming are consistent among different controllers.
10.6. Introducing kernel resource controllers
Kernel resource controllers enable the functionality of control groups. RHEL 10 supports various controllers for control groups version 1 (cgroups-v1) and control groups version 2 (cgroups-v2).
A resource controller, also called a control group subsystem, is a kernel subsystem that represents a single resource, such as CPU time, memory, network bandwidth, or disk I/O. The Linux kernel provides a range of resource controllers that are mounted automatically by the systemd service manager.
You can find a list of the currently mounted resource controllers in the /proc/cgroups file.
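As a small illustration of reading that file, the sketch below prints the names of the controllers whose `enabled` column is 1. It assumes the usual four-column layout of /proc/cgroups (`subsys_name`, `hierarchy`, `num_cgroups`, `enabled`) with a header line that starts with `#`. The function name `enabled_controllers` is hypothetical.

```shell
#!/bin/sh
# Print the names of controllers whose "enabled" column is 1.
# Expected columns: subsys_name hierarchy num_cgroups enabled
# The optional argument overrides the input file (default: /proc/cgroups).
enabled_controllers() {
    awk '!/^#/ && $4 == 1 { print $1 }' "${1:-/proc/cgroups}"
}

# Run against the live file only when it is present and readable.
if [ -r /proc/cgroups ]; then
    enabled_controllers
fi
```

The file argument makes it possible to run the same filter on a saved copy of the file, for example one collected from another host.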
Controllers available for cgroups-v1:
blkio
- Sets limits on input/output access to and from block devices.
cpu
- Adjusts the parameters of the default scheduler for a control group’s tasks. The cpu controller is mounted together with the cpuacct controller on the same mount.
cpuacct
- Creates automatic reports on CPU resources used by tasks in a control group. The cpuacct controller is mounted together with the cpu controller on the same mount.
cpuset
- Restricts control group tasks to run only on a specified subset of CPUs and directs the tasks to use memory only on specified memory nodes.
devices
- Controls access to devices for tasks in a control group.
freezer
- Suspends or resumes tasks in a control group.
memory
- Sets limits on memory use by tasks in a control group and generates automatic reports on memory resources used by those tasks.
net_cls
- Tags network packets with a class identifier (classid) that enables the Linux traffic controller (the tc command) to identify packets that originate from a particular control group task. A subsystem of net_cls, the net_filter (iptables), can also use this tag to perform actions on such packets. The net_filter tags network sockets with a firewall identifier (fwid) that allows the Linux firewall to identify packets that originate from a particular control group task (by using the iptables command).
net_prio
- Sets the priority of network traffic.
pids
- Sets limits on the number of processes and their children in a control group.
perf_event
- Groups tasks for monitoring by the perf performance monitoring and reporting utility.
rdma
- Sets limits on Remote Direct Memory Access/InfiniBand specific resources in a control group.
hugetlb
- Limits the usage of large size virtual memory pages by tasks in a control group.
Controllers available for cgroups-v2:
io
- Sets limits on input/output access to and from block devices.
memory
- Sets limits on memory use by tasks in a control group and generates automatic reports on memory resources used by those tasks.
pids
- Sets limits on the number of processes and their children in a control group.
rdma
- Sets limits on Remote Direct Memory Access/InfiniBand specific resources in a control group.
cpu
- Adjusts the parameters of the default scheduler for a control group’s tasks and creates automatic reports on CPU resources used by tasks in a control group.
cpuset
- Restricts control group tasks to run only on a specified subset of CPUs and directs the tasks to use memory only on specified memory nodes. Supports only the core functionality (cpus{,.effective}, mems{,.effective}) with a new partition feature.
perf_event
- Groups tasks for monitoring by the perf performance monitoring and reporting utility. perf_event is enabled automatically on the v2 hierarchy.
A resource controller can be used either in a cgroups-v1 hierarchy or in a cgroups-v2 hierarchy, but not in both at the same time.
10.7. Introducing namespaces
Namespaces create separate spaces for organizing and identifying software objects. This keeps them from affecting each other. As a result, each software object contains its own set of resources, for example, a mount point, a network device, or a hostname, even though they are sharing the same system.
Containers are one of the most common technologies that use namespaces.
Changes to a particular global resource are visible only to processes in that namespace and do not affect the rest of the system or other namespaces.
To inspect which namespaces a process is a member of, you can check the symbolic links in the /proc/<PID>/ns/ directory.
Namespace | Isolates |
---|---|
Mount | Mount points |
UTS | Hostname and NIS domain name |
IPC | System V IPC, POSIX message queues |
PID | Process IDs |
Network | Network devices, stacks, ports, etc |
User | User and group IDs |
Control groups | Control group root directory |
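The symbolic links in the /proc/<PID>/ns/ directory can be read with a short loop. The following is a minimal sketch. The function name `list_namespaces` and its optional directory argument are hypothetical, and it assumes a Linux /proc file system in which each link target has the form `type:[inode]`.

```shell
#!/bin/sh
# Print "name -> target" for every namespace link in a /proc/<PID>/ns directory.
# The optional argument is the directory to read (default: this shell's own).
list_namespaces() {
    ns_dir="${1:-/proc/$$/ns}"
    for link in "$ns_dir"/*; do
        # Skip anything that is not a symbolic link, including an unmatched glob.
        [ -L "$link" ] || continue
        printf '%s -> %s\n' "$(basename "$link")" "$(readlink "$link")"
    done
}

list_namespaces
```

Two processes whose links point to the same target for a given namespace type are members of the same namespace of that type.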