Chapter 25. Using control groups version 1 with systemd
You can manage cgroups with the systemd system and service manager and the utilities it provides. This is also the preferred way of managing cgroups.
25.1. Role of systemd in control groups version 1
RHEL 8 moves the resource management settings from the process level to the application level by binding the system of cgroup hierarchies with the systemd unit tree. Therefore, you can manage the system resources with the systemctl command, or by modifying the systemd unit files.
By default, the systemd system and service manager makes use of the slice, the scope, and the service units to organize and structure processes in the control groups. The systemctl command enables you to further modify this structure by creating custom slices. Also, systemd automatically mounts hierarchies for important kernel resource controllers in the /sys/fs/cgroup/ directory.
Three systemd unit types are used for resource control:
Service - A process or a group of processes that systemd started according to a unit configuration file. Services encapsulate the specified processes so that they can be started and stopped as one set. Services are named in the following way: <name>.service
Scope - A group of externally created processes. Scopes encapsulate processes that are started and stopped by arbitrary processes through the fork() function and then registered by systemd at runtime. For example, user sessions, containers, and virtual machines are treated as scopes. Scopes are named as follows: <name>.scope
Slice - A group of hierarchically organized units. Slices organize a hierarchy in which scopes and services are placed. The actual processes are contained in scopes or in services. Every name of a slice unit corresponds to the path to a location in the hierarchy. The dash ("-") character acts as a separator of the path components to a slice from the -.slice root slice. In the following example: <parent-name>.slice
parent-name.slice is a sub-slice of parent.slice, which is a sub-slice of the -.slice root slice. parent-name.slice can have its own sub-slice named parent-name-name2.slice, and so on.
The service, the scope, and the slice units directly map to objects in the control group hierarchy. When these units are activated, they map directly to control group paths built from the unit names.
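The way a slice name expands into a control group path can be sketched as a small shell function. This is an illustration only: slice_to_path is a hypothetical helper, not a systemd utility, and it assumes that every dash in a slice name separates two levels of the hierarchy, as described for parent-name.slice.

```shell
#!/bin/sh
# Hypothetical helper: expand a slice unit name into the cgroup path
# built from its dash-separated components.
slice_to_path() {
    name=${1%.slice}
    # The root slice "-.slice" maps to the top of the hierarchy.
    if [ "$name" = "-" ]; then
        printf '/\n'
        return 0
    fi
    path="" prefix="" rest=$name
    while [ -n "$rest" ]; do
        part=${rest%%-*}              # next component up to the first dash
        case $rest in
            *-*) rest=${rest#*-} ;;   # more components follow
            *)   rest="" ;;           # this was the last component
        esac
        prefix=${prefix:+$prefix-}$part
        path="$path/$prefix.slice"
    done
    printf '%s\n' "$path"
}

slice_to_path parent-name.slice   # prints /parent.slice/parent-name.slice
slice_to_path user-42.slice       # prints /user.slice/user-42.slice
```

On a cgroups-v1 system, the printed path is relative to a controller directory such as /sys/fs/cgroup/memory/.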
The following is an abbreviated example of a control group hierarchy:
Control group /:
-.slice
├─user.slice
│ ├─user-42.slice
│ │ ├─session-c1.scope
│ │ │ ├─ 967 gdm-session-worker [pam/gdm-launch-environment]
│ │ │ ├─1035 /usr/libexec/gdm-x-session gnome-session --autostart /usr/share/gdm/greeter/autostart
│ │ │ ├─1054 /usr/libexec/Xorg vt1 -displayfd 3 -auth /run/user/42/gdm/Xauthority -background none -noreset -keeptty -verbose 3
│ │ │ ├─1212 /usr/libexec/gnome-session-binary --autostart /usr/share/gdm/greeter/autostart
│ │ │ ├─1369 /usr/bin/gnome-shell
│ │ │ ├─1732 ibus-daemon --xim --panel disable
│ │ │ ├─1752 /usr/libexec/ibus-dconf
│ │ │ ├─1762 /usr/libexec/ibus-x11 --kill-daemon
│ │ │ ├─1912 /usr/libexec/gsd-xsettings
│ │ │ ├─1917 /usr/libexec/gsd-a11y-settings
│ │ │ ├─1920 /usr/libexec/gsd-clipboard
…
├─init.scope
│ └─1 /usr/lib/systemd/systemd --switched-root --system --deserialize 18
└─system.slice
  ├─rngd.service
  │ └─800 /sbin/rngd -f
  ├─systemd-udevd.service
  │ └─659 /usr/lib/systemd/systemd-udevd
  ├─chronyd.service
  │ └─823 /usr/sbin/chronyd
  ├─auditd.service
  │ ├─761 /sbin/auditd
  │ └─763 /usr/sbin/sedispatch
  ├─accounts-daemon.service
  │ └─876 /usr/libexec/accounts-daemon
  ├─example.service
  │ ├─ 929 /bin/bash /home/jdoe/example.sh
  │ └─4902 sleep 1
  …
The example above shows that services and scopes contain processes and are placed in slices that do not contain processes of their own.
Additional resources
- What are kernel resource controllers
- systemd.resource-control(5), cgroups(7), and fork(2) manual pages
25.2. Creating transient control groups
Transient cgroups set limits on resources consumed by a unit (service or scope) during its runtime.
Procedure
To create a transient control group, use the systemd-run command in the following format:
# systemd-run --unit=<name> --slice=<name>.slice <command>
This command creates and starts a transient service or a scope unit and runs a custom command in such a unit.
- The --unit=<name> option gives a name to the unit. If --unit is not specified, the name is generated automatically.
- The --slice=<name>.slice option makes your service or scope unit a member of a specified slice. Replace <name>.slice with the name of an existing slice (as shown in the output of systemctl -t slice), or create a new slice by passing a unique name. By default, services and scopes are created as members of the system.slice.
- Replace <command> with the command you want to run in the service or the scope unit.
The following message is displayed to confirm that you created and started the service or the scope successfully:
# Running as unit <name>.service
Optional: Keep the unit running after its processes finished to collect run-time information:
# systemd-run --unit=<name> --slice=<name>.slice --remain-after-exit <command>
The command creates and starts a transient service unit and runs a custom command in the unit. The --remain-after-exit option ensures that the service keeps running after its processes have finished.
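The following sketch shows one way to assemble such a command line in a script. The run_transient function and the RUN variable are hypothetical names chosen for this illustration; only the systemd-run options shown in this section are used. By default, the function only prints the command so that you can review it. It executes the command only when RUN=1 is set, which requires root privileges on a host running systemd.

```shell
#!/bin/sh
# Hypothetical helper: build a systemd-run command line for a transient service.
# Usage: run_transient <unit> <slice> <command> [args...]
run_transient() {
    unit=$1 slice=$2
    shift 2
    # Prepend systemd-run and its options to the command to be wrapped.
    set -- systemd-run --unit="$unit" --slice="$slice" --remain-after-exit "$@"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"                   # really start the transient unit
    else
        printf '%s\n' "$*"     # dry run: show what would be executed
    fi
}

# Prints: systemd-run --unit=demo --slice=demo.slice --remain-after-exit sleep 30
run_transient demo demo.slice sleep 30
```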
Additional resources
- What are control groups
- Role of systemd in control groups
- Managing systemd in RHEL
- The systemd-run(1) manual page
25.3. Creating persistent control groups
To assign a persistent control group to a service, it is necessary to edit its unit configuration file. The configuration is preserved after the system reboot, so it can be used to manage services that are started automatically.
Procedure
To create a persistent control group, enter:
# systemctl enable <name>.service
The command above enables an existing unit configuration file, for example one in the /usr/lib/systemd/system/ directory, so that the service starts automatically at boot. By default, systemd assigns <name>.service to the system.slice unit.
Additional resources
- Understanding control groups
- Role of systemd in control groups
- Managing system services with systemctl in RHEL
- systemd-run(1) manual page
25.4. Configuring memory resource control settings on the command-line
Executing commands in the command-line interface is one way to set limits, prioritize, or control access to hardware resources for groups of processes.
Procedure
To limit the memory usage of a service, run the following:
# systemctl set-property example.service MemoryMax=1500K
The command instantly assigns the memory limit of 1,500 KB to processes executed in the control group that the example.service service belongs to. The MemoryMax parameter, in this configuration variant, is defined in the /etc/systemd/system.control/example.service.d/50-MemoryMax.conf file and controls the value of the /sys/fs/cgroup/memory/system.slice/example.service/memory.limit_in_bytes file.
Optionally, to temporarily limit the memory usage of a service, run:
# systemctl set-property --runtime example.service MemoryMax=1500K
The command instantly assigns the memory limit to the example.service service. The MemoryMax parameter is defined until the next reboot in the /run/systemd/system.control/example.service.d/50-MemoryMax.conf file. With a reboot, the whole /run/systemd/system.control/ directory and MemoryMax are removed.
The 50-MemoryMax.conf file stores the memory limit as a multiple of 4096 bytes - one kernel page size specific for AMD64 and Intel 64. The actual number of bytes depends on the CPU architecture.
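To see how a value such as 1500K relates to a byte count and to the 4096-byte page size, consider the following sketch. The to_bytes and is_page_multiple functions are hypothetical helpers written for this illustration. They assume that the K, M, G, and T suffixes use base 1024, as documented in systemd.resource-control(5), and that the page size is 4096 bytes.

```shell
#!/bin/sh
# Hypothetical helper: convert a size such as 1500K to bytes (base 1024).
to_bytes() {
    value=$1
    number=${value%[KMGT]}    # strip one trailing suffix, if present
    case $value in
        *K) factor=1024 ;;
        *M) factor=$((1024 * 1024)) ;;
        *G) factor=$((1024 * 1024 * 1024)) ;;
        *T) factor=$((1024 * 1024 * 1024 * 1024)) ;;
        *)  factor=1 ;;
    esac
    echo $((number * factor))
}

# Hypothetical helper: succeed if the byte count is a whole number of
# 4096-byte pages (the page size is an assumption).
is_page_multiple() {
    [ $(( $1 % 4096 )) -eq 0 ]
}

to_bytes 1500K                                                   # prints 1536000
is_page_multiple "$(to_bytes 1500K)" && echo "whole number of pages"
```

In this sketch, 1500K corresponds to 1536000 bytes, which is exactly 375 pages of 4096 bytes.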
Additional resources
- What are control groups
- What are kernel resource controllers
- systemd.resource-control(5) and cgroups(7) manual pages
- Role of systemd in control groups
25.5. Configuring memory resource control settings with unit files
Each persistent unit is supervised by the systemd system and service manager, and has a unit configuration file in the /usr/lib/systemd/system/ directory. To change the resource control settings of a persistent unit, modify its unit configuration file either manually in a text editor or from the command-line interface.
Manually modifying unit files is one way to set limits, prioritize, or control access to hardware resources for groups of processes.
Procedure
To limit the memory usage of a service, modify the /usr/lib/systemd/system/example.service file as follows:
…
[Service]
MemoryMax=1500K
…
The configuration above places a limit on the maximum memory consumption of processes executed in the control group that example.service is part of.
Note: Use suffixes K, M, G, or T to identify Kilobyte, Megabyte, Gigabyte, or Terabyte as a unit of measurement.
Reload all unit configuration files:
# systemctl daemon-reload
Restart the service:
# systemctl restart example.service
Reboot the system.
Optionally, check that the changes took effect:
# cat /sys/fs/cgroup/memory/system.slice/example.service/memory.limit_in_bytes
1536000
The example output shows that the memory consumption was limited to around 1,500 KB.
Note: The memory.limit_in_bytes file stores the memory limit as a multiple of 4096 bytes - one kernel page size specific for AMD64 and Intel 64. The actual number of bytes depends on the CPU architecture.
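As an alternative sketch to editing the file under /usr/lib/systemd/system/ directly, the same [Service] setting can be kept in a drop-in file, similar to the 50-MemoryMax.conf file that systemctl set-property writes. The write_memory_dropin function and the scratch directory are assumptions made for this illustration. On a real system, the drop-in directory for local overrides is typically /etc/systemd/system/example.service.d/, and you still need to run systemctl daemon-reload and restart the service afterwards.

```shell
#!/bin/sh
# Hypothetical helper: write a MemoryMax drop-in for a unit under a chosen root.
# Usage: write_memory_dropin <root-dir> <unit> <limit>
write_memory_dropin() {
    root=$1 unit=$2 limit=$3
    dir="$root/$unit.d"
    mkdir -p "$dir" || return 1
    printf '[Service]\nMemoryMax=%s\n' "$limit" > "$dir/50-MemoryMax.conf"
    # Report where the drop-in was written.
    printf '%s\n' "$dir/50-MemoryMax.conf"
}

# Write into a scratch directory so that nothing on the system is changed.
scratch=$(mktemp -d)
conf=$(write_memory_dropin "$scratch" example.service 1500K)
cat "$conf"
```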
Additional resources
- Understanding control groups
- What kernel resource controllers are
- systemd.resource-control(5), cgroups(7) manual pages
- Managing system services with systemctl in RHEL
- Role of systemd in control groups version 1
25.6. Removing transient control groups
You can use the systemd system and service manager to remove transient control groups (cgroups) if you no longer need to limit, prioritize, or control access to hardware resources for groups of processes.
Transient cgroups are automatically released once all the processes that a service or a scope unit contains finish.
Procedure
To stop the service unit with all its processes, enter:
# systemctl stop <name>.service
To terminate one or more of the unit processes, enter:
# systemctl kill <name>.service --kill-who=<main|control|all> --signal=<signal>
The --kill-who option selects which processes of the unit receive the signal: main for the main process, control for the control process, or all for every process in the control group. If the option is omitted, all is used. The --signal option determines the type of POSIX signal to be sent to the selected processes. The default signal is SIGTERM.
Additional resources
- What are control groups
- What are kernel resource controllers
- The systemd.resource-control(5) and cgroups(7) man pages
- Role of systemd in control groups version 1
- Managing systemd in RHEL
25.7. Removing persistent control groups
You can use the systemd system and service manager to remove persistent control groups (cgroups) if you no longer need to limit, prioritize, or control access to hardware resources for groups of processes.
Persistent cgroups are released when a service or a scope unit is stopped or disabled and its configuration file is deleted.
Procedure
Stop the service unit:
# systemctl stop <name>.service
Disable the service unit:
# systemctl disable <name>.service
Remove the relevant unit configuration file:
# rm /usr/lib/systemd/system/<name>.service
Reload all unit configuration files so that changes take effect:
# systemctl daemon-reload
Additional resources
- What are control groups
- What are kernel resource controllers
- systemd.resource-control(5), cgroups(7), and systemd.kill(5) manual pages
- Role of systemd in control groups
- Managing system services with systemctl in RHEL
25.8. Listing systemd units
Use the systemd system and service manager to list its units.
Procedure
List all active units on the system with the systemctl utility. The terminal returns an output similar to the following example:
# systemctl
UNIT                                         LOAD   ACTIVE SUB     DESCRIPTION
…
init.scope                                   loaded active running System and Service Manager
session-2.scope                              loaded active running Session 2 of user jdoe
abrt-ccpp.service                            loaded active exited  Install ABRT coredump hook
abrt-oops.service                            loaded active running ABRT kernel log watcher
abrt-vmcore.service                          loaded active exited  Harvest vmcores for ABRT
abrt-xorg.service                            loaded active running ABRT Xorg log watcher
…
-.slice                                      loaded active active  Root Slice
machine.slice                                loaded active active  Virtual Machine and Container Slice
system-getty.slice                           loaded active active  system-getty.slice
system-lvm2\x2dpvscan.slice                  loaded active active  system-lvm2\x2dpvscan.slice
system-sshd\x2dkeygen.slice                  loaded active active  system-sshd\x2dkeygen.slice
system-systemd\x2dhibernate\x2dresume.slice  loaded active active  system-systemd\x2dhibernate\x2dresume>
system-user\x2druntime\x2ddir.slice          loaded active active  system-user\x2druntime\x2ddir.slice
system.slice                                 loaded active active  System Slice
user-1000.slice                              loaded active active  User Slice of UID 1000
user-42.slice                                loaded active active  User Slice of UID 42
user.slice                                   loaded active active  User and Session Slice
…
UNIT
- A name of a unit that also reflects the unit position in a control group hierarchy. The units relevant for resource control are a slice, a scope, and a service.
LOAD
- Indicates whether the unit configuration file was properly loaded. If the unit file failed to load, the field contains the state error instead of loaded. Other unit load states are: stub, merged, and masked.
ACTIVE
- The high-level unit activation state, which is a generalization of SUB.
SUB
- The low-level unit activation state. The range of possible values depends on the unit type.
DESCRIPTION
- The description of the unit content and functionality.
List all active and inactive units:
# systemctl --all
Limit the amount of information in the output:
# systemctl --type service,masked
The --type option requires a comma-separated list of unit types such as a service and a slice, or unit load states such as loaded and masked.
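When the list is long, a per-type summary can be easier to read than the full output. The following sketch counts units by their type suffix. The count_unit_types function is a hypothetical helper, and it assumes input in which the first column is the unit name, such as the output of systemctl --no-legend --plain. The sample input reuses unit names from the example above.

```shell
#!/bin/sh
# Hypothetical helper: count units per type, reading a unit listing on stdin.
# The unit type is taken as the text after the last dot of the first column.
count_unit_types() {
    awk '{ n = split($1, p, "[.]"); count[p[n]]++ }
         END { for (t in count) print t, count[t] }' | sort
}

# Sample input; on a live system you might instead pipe in:
#   systemctl --no-legend --plain
count_unit_types <<'EOF'
init.scope loaded active running System and Service Manager
session-2.scope loaded active running Session 2 of user jdoe
abrt-ccpp.service loaded active exited Install ABRT coredump hook
abrt-oops.service loaded active running ABRT kernel log watcher
-.slice loaded active active Root Slice
system.slice loaded active active System Slice
EOF
```

For this sample, the function reports two scopes, two services, and two slices.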
Additional resources
- Managing system services with systemctl in RHEL
- The systemd.resource-control(5), systemd.exec(5) manual pages
25.9. Viewing systemd cgroups hierarchy
Display the control groups (cgroups) hierarchy and the processes running in specific cgroups.
Procedure
Display the whole cgroups hierarchy on your system with the systemd-cgls command.
# systemd-cgls
Control group /:
-.slice
├─user.slice
│ ├─user-42.slice
│ │ ├─session-c1.scope
│ │ │ ├─ 965 gdm-session-worker [pam/gdm-launch-environment]
│ │ │ ├─1040 /usr/libexec/gdm-x-session gnome-session --autostart /usr/share/gdm/greeter/autostart
…
├─init.scope
│ └─1 /usr/lib/systemd/systemd --switched-root --system --deserialize 18
└─system.slice
  …
  ├─example.service
  │ ├─6882 /bin/bash /home/jdoe/example.sh
  │ └─6902 sleep 1
  ├─systemd-journald.service
    └─629 /usr/lib/systemd/systemd-journald
  …
The example output returns the entire cgroups hierarchy, where the highest level is formed by slices.
Display the cgroups hierarchy filtered by a resource controller with the systemd-cgls <resource_controller> command.
# systemd-cgls memory
Controller memory; Control group /:
├─1 /usr/lib/systemd/systemd --switched-root --system --deserialize 18
├─user.slice
│ ├─user-42.slice
│ │ ├─session-c1.scope
│ │ │ ├─ 965 gdm-session-worker [pam/gdm-launch-environment]
…
└─system.slice
  |
  …
  ├─chronyd.service
  │ └─844 /usr/sbin/chronyd
  ├─example.service
  │ ├─8914 /bin/bash /home/jdoe/example.sh
  │ └─8916 sleep 1
  …
The example output lists the services that interact with the selected controller.
Display detailed information about a certain unit and its part of the cgroups hierarchy with the systemctl status <system_unit> command.
# systemctl status example.service
● example.service - My example service
   Loaded: loaded (/usr/lib/systemd/system/example.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2019-04-16 12:12:39 CEST; 3s ago
 Main PID: 17737 (bash)
    Tasks: 2 (limit: 11522)
   Memory: 496.0K (limit: 1.5M)
   CGroup: /system.slice/example.service
           ├─17737 /bin/bash /home/jdoe/example.sh
           └─17743 sleep 1
Apr 16 12:12:39 redhat systemd[1]: Started My example service.
Apr 16 12:12:39 redhat bash[17737]: The current time is Tue Apr 16 12:12:39 CEST 2019
Apr 16 12:12:40 redhat bash[17737]: The current time is Tue Apr 16 12:12:40 CEST 2019
Additional resources
- What are kernel resource controllers
- The systemd.resource-control(5) and cgroups(7) man pages
25.10. Viewing resource controllers
Find out which processes use which resource controllers.
Procedure
To view which resource controllers a process interacts with, enter the cat /proc/<PID>/cgroup command.
# cat /proc/11269/cgroup
12:freezer:/
11:cpuset:/
10:devices:/system.slice
9:memory:/system.slice/example.service
8:pids:/system.slice/example.service
7:hugetlb:/
6:rdma:/
5:perf_event:/
4:cpu,cpuacct:/
3:net_cls,net_prio:/
2:blkio:/
1:name=systemd:/system.slice/example.service
The example output relates to a process of interest. In this case, it is a process identified by PID 11269, which belongs to the example.service unit. You can determine whether the process was placed in the correct control group as defined by the systemd unit file specifications.
Note: By default, the items and their ordering in the list of resource controllers are the same for all units started by systemd, since it automatically mounts all the default resource controllers.
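To pick out the control group for a single controller from this listing, you can filter the lines by controller name. The following sketch assumes the cgroups-v1 line format shown in the example above, that is, ID, a comma-separated controller list, and the path, separated by colons. The cgroup_of function is a hypothetical helper, and the sample input reuses lines from the example above.

```shell
#!/bin/sh
# Hypothetical helper: print the cgroup path for one controller.
# Reads lines in the "ID:controllers:path" format on stdin.
cgroup_of() {
    awk -v want="$1" '{
        line = $0
        split(line, f, ":")
        n = split(f[2], names, ",")    # e.g. "cpu,cpuacct" has two names
        for (i = 1; i <= n; i++)
            if (names[i] == want) {
                # Keep everything after the second colon as the path.
                sub(/^[^:]*:[^:]*:/, "", line)
                print line
                exit
            }
    }'
}

# Sample input; on a live system you might instead use:
#   cgroup_of memory < /proc/<PID>/cgroup
cgroup_of memory <<'EOF'
10:devices:/system.slice
9:memory:/system.slice/example.service
4:cpu,cpuacct:/
1:name=systemd:/system.slice/example.service
EOF
```

For the sample input, the function prints /system.slice/example.service. If the requested controller is not listed, it prints nothing.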
Additional resources
- The cgroups(7) manual page
- Documentation in the /usr/share/doc/kernel-doc-<kernel_version>/Documentation/cgroups-v1/ directory
25.11. Monitoring resource consumption
View a list of currently running control groups (cgroups) and their resource consumption in real-time.
Procedure
Display a dynamic account of currently running cgroups with the systemd-cgtop command.
# systemd-cgtop
Control Group                               Tasks  %CPU  Memory  Input/s  Output/s
/                                           607    29.8  1.5G    -        -
/system.slice                               125    -     428.7M  -        -
/system.slice/ModemManager.service          3      -     8.6M    -        -
/system.slice/NetworkManager.service        3      -     12.8M   -        -
/system.slice/accounts-daemon.service       3      -     1.8M    -        -
/system.slice/boot.mount                    -      -     48.0K   -        -
/system.slice/chronyd.service               1      -     2.0M    -        -
/system.slice/cockpit.socket                -      -     1.3M    -        -
/system.slice/colord.service                3      -     3.5M    -        -
/system.slice/crond.service                 1      -     1.8M    -        -
/system.slice/cups.service                  1      -     3.1M    -        -
/system.slice/dev-hugepages.mount           -      -     244.0K  -        -
/system.slice/dev-mapper-rhel\x2dswap.swap  -      -     912.0K  -        -
/system.slice/dev-mqueue.mount              -      -     48.0K   -        -
/system.slice/example.service               2      -     2.0M    -        -
/system.slice/firewalld.service             2      -     28.8M   -        -
...
The example output displays currently running cgroups ordered by their resource usage (CPU, memory, disk I/O load). The list refreshes every 1 second by default. Therefore, it offers a dynamic insight into the actual resource usage of each control group.
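For scripted checks, a single snapshot can be more convenient than the interactive view. The following sketch ranks control groups by memory usage. The top_by_memory function is a hypothetical helper. It assumes the column layout of the example above (path, tasks, %CPU, memory, input, output) with memory given as a plain byte count, as you might obtain with systemd-cgtop --batch --iterations=1 --raw. The byte values in the sample input are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: list the N control groups that use the most memory.
# Reads cgtop-style lines on stdin; column 4 is assumed to be memory in bytes.
top_by_memory() {
    awk '$1 ~ /^\// && $4 ~ /^[0-9]+$/ { print $4, $1 }' |
        sort -rn | head -n "${1:-5}"
}

# Sample input with made-up byte counts; on a live system you might pipe in:
#   systemd-cgtop --batch --iterations=1 --raw
top_by_memory 2 <<'EOF'
/system.slice 125 - 449524531 - -
/system.slice/boot.mount - - 49152 - -
/system.slice/chronyd.service 1 - 2097152 - -
/system.slice/firewalld.service 2 - 30198989 - -
EOF
```

For the sample input, the two lines printed are /system.slice with 449524531 bytes, followed by /system.slice/firewalld.service with 30198989 bytes. Lines without a numeric memory value are skipped.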
Additional resources
- The systemd-cgtop(1) manual page