Chapter 26. Using systemd to manage resources used by applications
RHEL 9 binds cgroup hierarchies to the systemd unit tree. This moves resource management from the process level to the application level. You can manage system resources by using the systemctl command or by modifying systemd unit files.
To achieve this, systemd takes various configuration options from the unit files or directly via the systemctl command. Then systemd applies those options to specific process groups by using the Linux kernel system calls and features like cgroups and namespaces.
You can review the full set of configuration options for systemd in the following manual pages:
- systemd.resource-control(5)
- systemd.exec(5)
26.1. Role of systemd in resource management
Systemd manages and supervises services. It ensures that they start at the right time and in the correct order during boot and that they run smoothly to optimize hardware usage. It also provides capabilities to define resource management policies and tune performance options.
Use systemd to control the usage of system resources. Manually configure the cgroups virtual file system only in special cases, such as when you need to use cgroups-v1 controllers that have no equivalents in the cgroups-v2 hierarchy.
26.2. Distribution models of system resources
Apply one or more distribution models to modify how system resources are distributed among processes and services.
- Weights
You can distribute the resource by adding up the weights of all sub-groups and giving each sub-group the fraction matching its ratio against the sum.
For example, if you have 10 cgroups, each with weight of value 100, the sum is 1000. Each cgroup receives one tenth of the resource.
Weight is usually used to distribute stateless resources. For example, the CPUWeight= option is an implementation of this resource distribution model.
- Limits
A cgroup can consume up to the configured amount of the resource. The sum of sub-group limits can exceed the limit of the parent cgroup. Therefore, it is possible to overcommit resources in this model.
For example, the MemoryMax= option is an implementation of this resource distribution model.
- Protections
You can set up a protected amount of a resource for a cgroup. If the resource usage is below the protection boundary, the kernel will try not to penalize this cgroup in favor of other cgroups that compete for the same resource. An overcommit is also possible.
For example, the MemoryLow= option is an implementation of this resource distribution model.
- Allocations
Exclusive allocations of an absolute amount of a finite resource. An overcommit is not possible. An example of this resource type in Linux is the real-time budget.
- unit file option
A setting for resource control configuration.
For example, you can configure CPU resources with options like CPUAccounting= or CPUQuota=. Similarly, you can configure memory or I/O resources with options like AllowedMemoryNodes= and IOAccounting=.
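The arithmetic behind the weights model can be sketched in a few lines of shell. This is only an illustration of the ratio calculation described above, not how the kernel implements scheduling. The function name weight_share is hypothetical, and the result is rounded down to a whole percent.

```shell
#!/bin/sh
# Print the share, in percent (rounded down), that the first weight receives
# when it competes with all listed weights. The first argument is the weight
# of the cgroup of interest and is counted in the sum together with the
# remaining arguments (its siblings).
# Usage: weight_share <own_weight> [<sibling_weight>...]
weight_share() {
    own=$1
    sum=0
    for w in "$@"; do
        sum=$((sum + w))
    done
    echo $(( own * 100 / sum ))
}
```

For the example with 10 cgroups of weight 100 each, calling weight_share with ten arguments of 100 prints 10, which corresponds to one tenth of the resource.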
26.3. Allocating system resources using systemd
Allocate system resources by creating and managing systemd services and units, which can be configured to start, stop, or restart at specific times or in response to system events.
Procedure
To change the required value of the unit file option of your service, you can adjust the value in the unit file, or use the systemctl command:
Check the assigned values for the service of your choice.
# systemctl show --property <unit file option> <service name>
Set the required value of the CPU time allocation policy option:
# systemctl set-property <service name> <unit file option>=<value>
Verification
Check the newly assigned values for the service of your choice.
# systemctl show --property <unit file option> <service name>
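The set-and-verify sequence can be wrapped in a small helper. The following is a minimal sketch that assumes it runs as root. The function name set_and_show is hypothetical, and example.service and CPUWeight in the usage note are illustrative values only.

```shell
#!/bin/sh
# Set one resource-control property on a unit, then print the value that
# systemd reports for it afterwards.
# Usage: set_and_show <unit> <property> <value>
set_and_show() {
    unit=$1 prop=$2 value=$3
    # Apply the new value; stop here if systemd rejects it.
    systemctl set-property "$unit" "$prop=$value" || return 1
    # Read the property back to confirm the assignment.
    systemctl show --property "$prop" "$unit"
}
```

For example, set_and_show example.service CPUWeight 200 would run the two commands from the procedure in order for the CPUWeight= option.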
26.4. Overview of systemd hierarchy for cgroups
The systemd system and service manager uses slice, scope, and service units to organize processes in control groups. You can modify this hierarchy by creating custom unit files or by using the systemctl command. Systemd automatically mounts hierarchies for kernel resource controllers at /sys/fs/cgroup/.
For resource control, you can use the following three systemd unit types:
- Service
A process or a group of processes, which systemd started according to a unit configuration file.
Services encapsulate the specified processes so that they can be started and stopped as one set. Services are named in the following way:
<name>.service
- Scope
A group of externally created processes. Scopes encapsulate processes that are started and stopped by arbitrary processes through the fork() function and then registered by systemd at runtime. For example, user sessions, containers, and virtual machines are treated as scopes. Scopes are named as follows:
<name>.scope
- Slice
A group of hierarchically organized units. Slices organize a hierarchy in which scopes and services are placed.
The actual processes are contained in scopes or in services. Every name of a slice unit corresponds to the path to a location in the hierarchy.
The dash (-) character acts as a separator of the path components to a slice from the -.slice root slice. In the following example:
<parent-name>.slice
parent-name.slice is a sub-slice of parent.slice, which is a sub-slice of the -.slice root slice. parent-name.slice can have its own sub-slice named parent-name-name2.slice, and so on.
The service, the scope, and the slice units directly map to objects in the control group hierarchy. When these units are activated, they map directly to control group paths built from the unit names.
In a control group hierarchy, services and scopes contain processes and are placed in slices, which do not contain processes of their own.
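The slice naming rule can be illustrated with a short shell function that expands a slice name into its position in the hierarchy. This is a sketch of the naming convention only. The function name slice_to_path is hypothetical, and the printed path is relative to the cgroup mount point, which is assumed to be /sys/fs/cgroup/.

```shell
#!/bin/sh
# Expand a slice unit name into the nested path implied by its dashes.
# Usage: slice_to_path <name>.slice
slice_to_path() {
    name=${1%.slice}
    # The root slice -.slice corresponds to the top of the hierarchy.
    if [ "$name" = "-" ]; then
        echo "/"
        return 0
    fi
    path=""
    prefix=""
    old_ifs=$IFS
    IFS=-
    # Each dash-separated component adds one more level of nesting.
    for part in $name; do
        prefix=${prefix:+$prefix-}$part
        path=$path/$prefix.slice
    done
    IFS=$old_ifs
    echo "$path"
}
```

For parent-name.slice, the function prints /parent.slice/parent-name.slice, which matches the description of parent-name.slice as a sub-slice of parent.slice.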
26.5. Listing systemd units
Use the systemd system and service manager to list its units.
Procedure
List all active units on the system with the systemctl utility. The output contains the following columns:
- UNIT
A name of a unit that also reflects the unit position in a control group hierarchy. The units relevant for resource control are a slice, a scope, and a service.
- LOAD
Indicates whether the unit configuration file was properly loaded. If the unit file failed to load, the field provides the state error instead of loaded. Other unit load states are: stub, merged, and masked.
- ACTIVE
The high-level unit activation state, which is a generalization of SUB.
- SUB
The low-level unit activation state. The range of possible values depends on the unit type.
- DESCRIPTION
The description of the unit content and functionality.
List all active and inactive units:
# systemctl --all
Limit the amount of information in the output:
# systemctl --type service,masked
The --type option requires a comma-separated list of unit types such as a service and a slice, or unit load states such as loaded and masked.
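When you need only the unit names, for example as input for a script, the listing can be reduced to its first column. The following is a minimal sketch. The function name unit_names is hypothetical, and it relies on the list-units command that systemctl runs when called without a command, together with the --plain and --no-legend options to keep the output easy to parse.

```shell
#!/bin/sh
# Print only the names (UNIT column) of loaded units of the given type or types.
# Usage: unit_names <type>[,<type>...]
unit_names() {
    systemctl list-units --type="$1" --all --plain --no-legend |
        awk '{print $1}'
}
```

For example, unit_names slice prints one slice unit name per line.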
26.6. Viewing systemd cgroups hierarchy
Display the control groups (cgroups) hierarchy and the processes running in specific cgroups.
Procedure
Display the whole cgroups hierarchy on your system with the systemd-cgls command.
The output shows the entire cgroups hierarchy, where the highest level is formed by slices.
Display the cgroups hierarchy filtered by a resource controller with the systemd-cgls <resource_controller> command.
The output lists the services that interact with the selected controller.
Display detailed information about a certain unit and its part of the cgroups hierarchy with the systemctl status <system_unit> command.
26.7. Viewing cgroups of processes
Identify which control group (cgroup) a process belongs to and check which controllers and controller-specific configurations the cgroup uses.
Procedure
To view which cgroup a process belongs to, run the cat /proc/<PID>/cgroup command:
# cat /proc/2467/cgroup
0::/system.slice/example.service
The example output relates to a process of interest. In this case, it is a process identified by PID 2467, which belongs to the example.service unit. You can determine whether the process is placed in a correct control group as defined by the systemd unit file specifications.
To display what controllers and their configuration files the cgroup uses, check the cgroup directory under /sys/fs/cgroup/, for example /sys/fs/cgroup/system.slice/example.service/.
The version 1 hierarchy of cgroups uses a per-controller model. Therefore, the output from the /proc/PID/cgroup file shows which cgroups under each controller the PID belongs to. You can find the cgroups under the controller directories at /sys/fs/cgroup/<controller_name>/.
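For the cgroups-v2 format shown in the example output, where the entry has the form 0::<path>, the matching directory can be derived with a short helper. This is a sketch under the assumption that the hierarchy is mounted at /sys/fs/cgroup/. The function name cgroup_dir is hypothetical, and the helper does not handle the per-controller lines of version 1.

```shell
#!/bin/sh
# Print the cgroups-v2 directory of a process, given its cgroup file.
# Usage: cgroup_dir <file>   (for example: cgroup_dir /proc/2467/cgroup)
cgroup_dir() {
    # The unified (v2) hierarchy entry has the form "0::<path>".
    rel=$(sed -n 's/^0:://p' "$1")
    # Fail if the file has no v2 entry.
    [ -n "$rel" ] || return 1
    echo "/sys/fs/cgroup$rel"
}
```

You can then inspect that directory, for example by reading its cgroup.controllers file to see which controllers are available to the cgroup.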
26.8. Monitoring resource consumption
View a list of currently running control groups (cgroups) and their resource consumption in real time.
Procedure
Display a dynamic account of currently running cgroups with the systemd-cgtop command.
The output displays currently running cgroups ordered by their resource usage (CPU, memory, disk I/O load). The list refreshes every 1 second by default. Therefore, it offers a dynamic insight into the actual resource usage of each control group.
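For use in scripts or logs, where an interactive display is not practical, systemd-cgtop can also print a single snapshot and exit. The following is a minimal sketch. The function name cgtop_snapshot is hypothetical, and sorting by memory is only one possible choice for the --order option.

```shell
#!/bin/sh
# Print one non-interactive snapshot of cgroup resource usage,
# ordered by memory consumption, and exit.
cgtop_snapshot() {
    systemd-cgtop --batch --iterations=1 --order=memory
}
```

You can redirect the output of cgtop_snapshot to a file to keep a record of resource usage at a given moment.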
26.9. Using systemd unit files to set limits for applications
Modify systemd unit files in /usr/lib/systemd/system/ to set limits, prioritize, and control access to hardware resources for groups of processes. The systemd service manager supervises units and creates control groups for them.
Prerequisites
- You have root privileges.
Procedure
Edit the /usr/lib/systemd/system/example.service file to limit the memory usage of a service:
…
[Service]
MemoryMax=1500K
…
The configuration sets the maximum amount of memory that the processes in the control group can use. The example.service service is part of such a control group, which has imposed limitations. You can use suffixes K, M, G, or T to identify Kilobyte, Megabyte, Gigabyte, or Terabyte as a unit of measurement.
Reload all unit configuration files:
# systemctl daemon-reload
Restart the service:
# systemctl restart example.service
Verification
Check that the changes took effect:
# cat /sys/fs/cgroup/system.slice/example.service/memory.max
1536000
The example output shows that the memory limit is 1,536,000 bytes, that is, 1500 × 1024 bytes (1,500 KB).
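The relationship between the MemoryMax= value and the number in memory.max can be checked with a small conversion helper. This is a sketch that assumes the suffixes use base 1024, which is consistent with the example output above. The function name to_bytes is hypothetical, and the helper accepts whole numbers only.

```shell
#!/bin/sh
# Convert a size such as 1500K, 64M, 2G, or 1T to bytes (base 1024).
# A value without a suffix is printed unchanged.
# Usage: to_bytes <size>
to_bytes() {
    # Strip one trailing unit suffix, if present.
    num=${1%[KMGT]}
    case $1 in
        *K) echo $(( num * 1024 )) ;;
        *M) echo $(( num * 1024 * 1024 )) ;;
        *G) echo $(( num * 1024 * 1024 * 1024 )) ;;
        *T) echo $(( num * 1024 * 1024 * 1024 * 1024 )) ;;
        *)  echo "$num" ;;
    esac
}
```

Calling to_bytes 1500K prints 1536000, the same value that the verification step reads from memory.max.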
26.10. Using systemctl command to set limits to applications
CPU affinity settings restrict the access of a process to specific CPUs. The CPU scheduler never schedules the process on a CPU that is not in the affinity mask of the process.
The default CPU affinity mask applies to all services managed by systemd.
To configure the CPU affinity mask for a particular systemd service, systemd provides CPUAffinity= both as:
- a unit file option.
- a configuration option in the [Manager] section of the /etc/systemd/system.conf file.
The CPUAffinity= unit file option sets a list of CPUs or CPU ranges that are merged and used as the affinity mask.
You can set the CPU affinity mask for a systemd service by using the CPUAffinity unit file option.
Procedure
Check the values of the CPUAffinity unit file option in the service of your choice:
$ systemctl show --property <CPU affinity configuration option> <service name>
As the root user, set the required value of the CPUAffinity unit file option for the CPU ranges used as the affinity mask:
# systemctl set-property <service name> CPUAffinity=<value>
Restart the service to apply the changes.
# systemctl restart <service name>
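To see which individual CPUs a list of CPU numbers and ranges covers after merging, you can expand the list before you set it. The following is an illustrative sketch and not a reimplementation of the systemd parser. The function name expand_cpus is hypothetical, and it assumes a comma-separated list of non-negative integers and ranges such as 0-2,5.

```shell
#!/bin/sh
# Expand a comma-separated CPU list such as "0-2,5" into one CPU number
# per line, sorted, with duplicates from overlapping ranges removed.
# Usage: expand_cpus <list>
expand_cpus() {
    old_ifs=$IFS
    IFS=,
    for item in $1; do
        case $item in
            *-*)
                # A range: print every CPU from the first to the last.
                cpu=${item%-*}
                last=${item#*-}
                while [ "$cpu" -le "$last" ]; do
                    echo "$cpu"
                    cpu=$((cpu + 1))
                done
                ;;
            *)
                # A single CPU number.
                echo "$item"
                ;;
        esac
    done | sort -n -u
    IFS=$old_ifs
}
```

For example, expand_cpus 0-2,1-3,8 prints the CPUs 0, 1, 2, 3, and 8, one per line.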
26.11. Setting global default CPU affinity through manager configuration
Set the global default CPU affinity mask using the CPUAffinity option in /etc/systemd/system.conf for PID 1 and all processes forked from it. You can override this setting on a per-service basis.
To set the default CPU affinity mask for all systemd services using the /etc/systemd/system.conf file:
- Set the CPU numbers for the CPUAffinity= option in the [Manager] section of the /etc/systemd/system.conf file.
- Save the edited file and reload the systemd service:
# systemctl daemon-reload
- Reboot the server to apply the changes.
26.12. Configuring NUMA policies using systemd
Non-uniform memory access (NUMA) is a computer memory subsystem design, in which the memory access time depends on the physical memory location relative to the processor.
Memory close to the CPU has lower latency (local memory) than memory that is local for a different CPU (foreign memory) or is shared between a set of CPUs.
In terms of the Linux kernel, NUMA policy governs where (for example, on which NUMA nodes) the kernel allocates physical memory pages for the process.
You can use the NUMAPolicy and NUMAMask systemd unit file options to control memory allocation policies for services.
Procedure
To set the NUMA memory policy through the NUMAPolicy unit file option:
Check the values of the NUMAPolicy unit file option in the service of your choice:
$ systemctl show --property <NUMA policy configuration option> <service name>
As the root user, set the required policy type of the NUMAPolicy unit file option:
# systemctl set-property <service name> NUMAPolicy=<value>
Restart the service to apply the changes.
# systemctl restart <service name>
To set a global NUMAPolicy setting using the [Manager] configuration option:
- Search in the /etc/systemd/system.conf file for the NUMAPolicy option in the [Manager] section of the file.
- Edit the policy type and save the file.
- Reload the systemd configuration:
# systemctl daemon-reload
- Reboot the server.
When you configure a strict NUMA policy, for example bind, make sure that you also appropriately set the CPUAffinity= unit file option.
26.13. NUMA policy configuration options for systemd
Systemd provides the following options to configure the NUMA policy:
NUMAPolicy
Controls the NUMA memory policy of the executed processes. You can use these policy types:
- default
- preferred
- bind
- interleave
- local
NUMAMask
Controls the NUMA node list that is associated with the selected NUMA policy.
Note that you do not have to specify the NUMAMask option for the following policies:
- default
- local
For the preferred policy, the list specifies only a single NUMA node.
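The relationship between the policy types and the NUMAMask option can be captured in a small check that you might run before you edit a unit file. This is a sketch only. The function name numa_mask_need is hypothetical, and the "required" result for preferred, bind, and interleave is inferred from the statement above that only default and local work without a node list.

```shell
#!/bin/sh
# Report whether a NUMAPolicy= value needs an accompanying NUMAMask=.
# Prints "optional", "required", or "invalid"; returns 1 for an
# unknown policy type.
# Usage: numa_mask_need <policy>
numa_mask_need() {
    case $1 in
        default|local)
            echo "optional" ;;
        preferred|bind|interleave)
            echo "required" ;;
        *)
            echo "invalid"
            return 1 ;;
    esac
}
```

For example, numa_mask_need bind prints required, and numa_mask_need local prints optional.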
26.14. Creating transient cgroups using systemd-run command
Transient cgroups set limits on resources consumed by a unit (service or scope) during its runtime.
Procedure
To create a transient control group, use the systemd-run command in the following format:
# systemd-run --unit=<name> --slice=<name>.slice <command>
This command creates and starts a transient service or a scope unit and runs a custom command in such a unit.
- The --unit=<name> option gives a name to the unit. If --unit is not specified, the name is generated automatically.
- The --slice=<name>.slice option makes your service or scope unit a member of a specified slice. Replace <name>.slice with the name of an existing slice (as shown in the output of systemctl -t slice), or create a new slice by passing a unique name. By default, services and scopes are created as members of the system.slice.
- Replace <command> with the command you want to enter in the service or the scope unit.
The following message is displayed to confirm that you created and started the service or the scope successfully:
Running as unit <name>.service
Optional: Keep the unit running after its processes finished to collect runtime information:
# systemd-run --unit=<name> --slice=<name>.slice --remain-after-exit <command>
The command creates and starts a transient service unit and runs a custom command in the unit. The --remain-after-exit option ensures that the service keeps running after its processes have finished.
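A transient unit becomes useful for resource control when you pass a resource-control option together with the command. The following is a minimal sketch that assumes the --property option of systemd-run and the MemoryMax= option described earlier in this chapter. The function name run_limited and the values demo, test.slice, and 200M in the usage note are hypothetical.

```shell
#!/bin/sh
# Run a command in a transient service unit with a memory ceiling.
# Usage: run_limited <unit> <slice> <memory_max> <command> [<args>...]
run_limited() {
    unit=$1 slice=$2 mem=$3
    # Drop the three fixed arguments; the rest is the command to run.
    shift 3
    systemd-run --unit="$unit" --slice="$slice" \
        --property=MemoryMax="$mem" "$@"
}
```

For example, as root, run_limited demo test.slice 200M sleep 60 would start sleep 60 in a transient unit named demo in test.slice with a 200M memory limit.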
26.15. Removing transient control groups
You can use the systemd system and service manager to remove transient control groups (cgroups) if you no longer need to limit, prioritize, or control access to hardware resources for groups of processes.
Transient cgroups are automatically released when all the processes that a service or a scope unit contains finish.
Procedure
To stop the service unit with all its processes, enter:
# systemctl stop <name>.service
To terminate one or more of the unit processes, enter:
# systemctl kill <name>.service --kill-who=PID,… --signal=<signal>
The command uses the --kill-who option to select process(es) from the control group you want to terminate. To kill multiple processes at the same time, pass a comma-separated list of PIDs. The --signal option determines the type of POSIX signal to be sent to the specified processes. The default signal is SIGTERM.