Chapter 17. Minimizing system latency by isolating interrupts and user processes
Real-time environments need to minimize or eliminate latency when responding to various events. To do this, you can isolate interrupts (IRQs) and user processes from one another on different dedicated CPUs.
17.1. Interrupt and process binding
Isolating interrupts (IRQs) from user processes on different dedicated CPUs can minimize or eliminate latency in real-time environments.
Interrupts are generally shared evenly between CPUs. This can delay interrupt processing, because the CPU that receives the interrupt has to load new data and instructions into its caches. These interrupt delays can cause conflicts with other processing being performed on the same CPU.
It is possible to allocate time-critical interrupts and processes to a specific CPU (or a range of CPUs). In this way, the code and data structures for processing this interrupt will most likely be in the processor data and instruction caches. As a result, the dedicated process can run as quickly as possible, while all other non-time-critical processes run on the other CPUs. This can be particularly important where the speeds involved are near or at the limits of memory and available peripheral bus bandwidth. Any wait for memory to be fetched into processor caches will have a noticeable impact on overall processing time and determinism.
In practice, optimal performance is entirely application-specific. For example, tuning applications with similar functions for different companies required completely different optimal performance tunings.
- One firm saw optimal results when they isolated 2 out of 4 CPUs for operating system functions and interrupt handling. The remaining 2 CPUs were dedicated purely for application handling.
- Another firm found optimal determinism when they bound the network-related application processes onto the single CPU that was handling the network device driver interrupt.
To bind a process to a CPU, you usually need to know the CPU mask for a given CPU or range of CPUs. The CPU mask is typically represented as a 32-bit bitmask, a decimal number, or a hexadecimal number, depending on the command you are using.
CPUs | Bitmask | Decimal | Hexadecimal |
0 | 00000000000000000000000000000001 | 1 | 0x00000001 |
0, 1 | 00000000000000000000000000000011 | 3 | 0x00000003 |
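The mask arithmetic behind the table can be sketched in a few lines of POSIX shell. This is only an illustration: `cpu_mask` and `show_mask` are hypothetical helper names, not part of any standard tool, and the sketch assumes CPU numbers in the range 0 to 31.

```shell
# cpu_mask CPU...
# Hypothetical helper: prints the decimal value of a mask in which the bit
# for each listed CPU number (0-31) is set.
cpu_mask() {
    mask=0
    for cpu in "$@"; do
        # Set the bit whose position equals the CPU number.
        mask=$(( mask | (1 << cpu) ))
    done
    echo "$mask"
}

# show_mask CPU...
# Prints the same mask in decimal and in zero-padded hexadecimal form.
show_mask() {
    m=$(cpu_mask "$@")
    printf 'decimal=%d hex=0x%08x\n' "$m" "$m"
}

show_mask 0      # CPU 0 only
show_mask 0 1    # CPUs 0 and 1
```

For CPU 0 this prints `decimal=1 hex=0x00000001`, and for CPUs 0 and 1 it prints `decimal=3 hex=0x00000003`.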
17.2. Disabling the irqbalance daemon
The irqbalance daemon is enabled by default and periodically forces interrupts to be handled by CPUs in an even manner. However, in real-time deployments, irqbalance is not needed, because applications are typically bound to specific CPUs.
Procedure
- Check the status of irqbalance.
# systemctl status irqbalance
irqbalance.service - irqbalance daemon
   Loaded: loaded (/usr/lib/systemd/system/irqbalance.service; enabled)
   Active: active (running) …
- If irqbalance is running, disable it and stop it.
# systemctl disable irqbalance
# systemctl stop irqbalance
Verification
- Check that the irqbalance status is inactive.
# systemctl status irqbalance
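For scripted checks, the verification step could be reduced to a one-line report. The following is a minimal sketch, assuming a systemd-based host. `irqbalance_state` is a hypothetical helper name, and the fallback to `unknown` is an assumption made for environments where systemd does not answer the query.

```shell
# irqbalance_state
# Hypothetical helper: prints one line with the active and enabled state of
# the irqbalance unit, or a notice if systemctl is not installed.
irqbalance_state() {
    if ! command -v systemctl >/dev/null 2>&1; then
        echo "systemctl not found"
        return 0
    fi
    # Both subcommands exit non-zero for inactive or disabled units,
    # so the exit status is deliberately ignored here.
    active=$(systemctl is-active irqbalance 2>/dev/null) || true
    enabled=$(systemctl is-enabled irqbalance 2>/dev/null) || true
    echo "irqbalance active=${active:-unknown} enabled=${enabled:-unknown}"
}

irqbalance_state
```

After the procedure above, a host would typically be expected to report the unit as inactive and disabled.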
17.3. Excluding CPUs from IRQ balancing
You can use the IRQ balancing service to specify which CPUs you want to exclude from consideration for interrupt (IRQ) balancing. The IRQBALANCE_BANNED_CPUS parameter in the /etc/sysconfig/irqbalance configuration file controls these settings. The value of the parameter is a 64-bit hexadecimal bit mask, where each bit of the mask represents a CPU core.
Procedure
- Open /etc/sysconfig/irqbalance in your preferred text editor and find the section of the file titled IRQBALANCE_BANNED_CPUS.
# IRQBALANCE_BANNED_CPUS
# 64 bit bitmask which allows you to indicate which cpu's should
# be skipped when reblancing irqs. Cpu numbers which have their
# corresponding bits set to one in this mask will not have any
# irq's assigned to them on rebalance
#
#IRQBALANCE_BANNED_CPUS=
- Uncomment the IRQBALANCE_BANNED_CPUS variable.
- Enter the appropriate bitmask to specify the CPUs to be ignored by the IRQ balance mechanism.
- Save and close the file.
- Restart the irqbalance service for the changes to take effect:
# systemctl restart irqbalance
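If you are running a system with up to 64 CPU cores, separate each group of eight hexadecimal digits with a comma. The leftmost group covers the highest-numbered CPUs. For example, the following value bans CPUs 8 to 15 and CPU 32:
IRQBALANCE_BANNED_CPUS=00000001,0000ff00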
CPUs | Bitmask |
0 | 00000001 |
8 - 15 | 0000ff00 |
8 - 15, 33 | 00000002,0000ff00 |
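The comma-separated format can be derived mechanically from a list of CPU numbers. The sketch below reproduces the table rows. `banned_cpus_mask` is a hypothetical helper name, and the code assumes CPU numbers from 0 to 63, with each 32-bit group tracked separately.

```shell
# banned_cpus_mask CPU...
# Hypothetical helper: prints an IRQBALANCE_BANNED_CPUS value for the given
# CPU numbers (0-63), as groups of eight hexadecimal digits.
banned_cpus_mask() {
    low=0    # bits for CPUs 0-31
    high=0   # bits for CPUs 32-63
    for cpu in "$@"; do
        if [ "$cpu" -lt 32 ]; then
            low=$(( low | (1 << cpu) ))
        else
            high=$(( high | (1 << (cpu - 32)) ))
        fi
    done
    if [ "$high" -eq 0 ]; then
        printf '%08x\n' "$low"
    else
        # The group for the highest-numbered CPUs comes first.
        printf '%08x,%08x\n' "$high" "$low"
    fi
}

banned_cpus_mask 0
banned_cpus_mask 8 9 10 11 12 13 14 15
banned_cpus_mask 8 9 10 11 12 13 14 15 33
```

The three calls print `00000001`, `0000ff00`, and `00000002,0000ff00`.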
In RHEL 7.2 and higher, the irqbalance utility automatically avoids IRQs on CPU cores isolated via the isolcpus kernel parameter if IRQBALANCE_BANNED_CPUS is not set in /etc/sysconfig/irqbalance.
17.4. Manually assigning CPU affinity to individual IRQs
Assigning CPU affinity enables binding and unbinding processes and threads to a specified CPU or range of CPUs. This can reduce caching problems.
Procedure
- Check the IRQs in use by each device by viewing the /proc/interrupts file.
# cat /proc/interrupts
Each line shows the IRQ number, the number of interrupts handled on each CPU, followed by the IRQ type and a description.
         CPU0       CPU1
0:   26575949         11   IO-APIC-edge   timer
1:         14          7   IO-APIC-edge   i8042
- Write the CPU mask to the smp_affinity entry of a specific IRQ. The CPU mask must be expressed as a hexadecimal number.
For example, the following command instructs IRQ number 142 to run only on CPU 0.
# echo 1 > /proc/irq/142/smp_affinity
The change only takes effect when an interrupt occurs.
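The write and a read-back can be wrapped in a small function. This is a sketch rather than a hardened tool. `set_irq_affinity` is a hypothetical helper name, and its optional third argument is an assumption added so that the function can be tried against a scratch directory without root privileges.

```shell
# set_irq_affinity IRQ HEXMASK [PROC_IRQ_DIR]
# Hypothetical helper: writes HEXMASK to the smp_affinity entry of IRQ and
# prints the value read back. PROC_IRQ_DIR defaults to /proc/irq.
set_irq_affinity() {
    irq=$1
    mask=$2
    dir=${3:-/proc/irq}
    target="$dir/$irq/smp_affinity"
    if [ ! -e "$target" ]; then
        echo "no such entry: $target" >&2
        return 1
    fi
    echo "$mask" > "$target" || return 1
    cat "$target"
}

# Dry run against a scratch directory that mimics /proc/irq/142.
scratch=$(mktemp -d)
mkdir -p "$scratch/142"
echo f > "$scratch/142/smp_affinity"
set_irq_affinity 142 1 "$scratch"
rm -rf "$scratch"
```

On a real system, the call would be `set_irq_affinity 142 1`, run as root. The kernel may print the value that is read back with leading zeros.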
Verification
- Perform an activity that will trigger the specified interrupt.
- Check /proc/interrupts for changes.
The number of interrupts on the specified CPU for the configured IRQ increased, and the number of interrupts for the configured IRQ on CPUs outside the specified affinity did not increase.
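To compare counts before and after triggering the interrupt, it can help to extract one IRQ's row in a compact form. The sketch below assumes the /proc/interrupts layout shown earlier in this section, with a header row naming the CPUs. `irq_counts` is a hypothetical helper name, and the sample file stands in for real data.

```shell
# irq_counts IRQ [FILE]
# Hypothetical helper: prints the per-CPU interrupt counts for IRQ as
# "CPU0=<n> CPU1=<n> ...", reading FILE (default /proc/interrupts).
irq_counts() {
    awk -v irq="$1:" '
        NR == 1 { ncpu = NF; next }   # header row: one field per CPU
        $1 == irq {
            out = ""
            for (i = 1; i <= ncpu; i++)
                out = out (i > 1 ? " " : "") "CPU" (i - 1) "=" $(i + 1)
            print out
        }
    ' "${2:-/proc/interrupts}"
}

# Sample input in the /proc/interrupts format (illustrative values only).
sample=$(mktemp)
cat > "$sample" <<'EOF'
           CPU0       CPU1
  0:   26575949         11   IO-APIC-edge      timer
  1:         14          7   IO-APIC-edge      i8042
EOF
irq_counts 1 "$sample"
rm -f "$sample"
```

For the sample input this prints `CPU0=14 CPU1=7`. Running `irq_counts 142` twice, once before and once after the triggering activity, shows which CPU counters moved.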
17.5. Binding processes to CPUs with the taskset utility
The taskset utility uses the process ID (PID) of a task to view or set its CPU affinity. You can use the utility to run a command with a chosen CPU affinity.
To set the affinity, you need the CPU mask as a hexadecimal number. The mask argument is a bitmask that specifies which CPU cores are legal for the command or PID being modified. taskset reads the mask as hexadecimal whether or not it has a 0x prefix, so single-digit masks such as 8 have the same value in either notation.
The taskset utility works on a NUMA (Non-Uniform Memory Access) system, but it does not allow the user to bind threads to CPUs and the closest NUMA memory node. On such systems, taskset is not the preferred tool, and the numactl utility should be used instead for its advanced capabilities.
For more information, see the numactl(8) man page on your system.
Procedure
- Run taskset with the necessary options and arguments.
You can specify a CPU list using the -c parameter instead of a CPU mask. In this example, my_embedded_process is being instructed to run only on CPUs 0,4,7-11.
# taskset -c 0,4,7-11 /usr/local/bin/my_embedded_process
This invocation is more convenient in most cases.
- To set the affinity of a process that is not currently running, use taskset and specify the CPU mask and the process.
In this example, my_embedded_process is being instructed to use only CPU 3 (mask 8, in which only bit 3 is set).
# taskset 8 /usr/local/bin/my_embedded_process
- You can specify more than one CPU in the bitmask. In this example, my_embedded_process is being instructed to execute on processors 4, 5, 6, and 7 (using the hexadecimal version of the CPU mask).
# taskset 0xF0 /usr/local/bin/my_embedded_process
- You can set the CPU affinity for processes that are already running by using the -p (--pid) option with the CPU mask and the PID of the process you want to change. In this example, the process with a PID of 7013 is being instructed to run only on CPU 0.
# taskset -p 1 7013
You can combine the listed options.
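When checking a mask before passing it to taskset, it can be useful to expand it into the CPU numbers it selects. The sketch below does this with shell arithmetic. `mask_to_cpus` is a hypothetical helper name, and the code assumes masks of up to 32 bits written in hexadecimal, with or without a lowercase 0x prefix.

```shell
# mask_to_cpus HEXMASK
# Hypothetical helper: prints the CPU numbers selected by a hexadecimal
# affinity mask as a comma-separated list.
mask_to_cpus() {
    mask=$(( 0x${1#0x} ))   # strip an optional 0x prefix, then parse as hex
    cpu=0
    list=""
    while [ "$mask" -ne 0 ]; do
        if [ $(( mask & 1 )) -eq 1 ]; then
            # Append the CPU number, adding a comma after the first entry.
            list="${list:+$list,}$cpu"
        fi
        mask=$(( mask >> 1 ))
        cpu=$(( cpu + 1 ))
    done
    echo "$list"
}

mask_to_cpus 0xF0
mask_to_cpus 8
mask_to_cpus 1
```

The three calls print `4,5,6,7`, `3`, and `0`, matching the masks used in the procedure above. The output uses the same list syntax that the -c option accepts.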
Additional resources
- taskset(1) and numactl(8) man pages on your system