Chapter 36. Configuring huge pages
Physical memory is managed in fixed-size chunks called pages. On the x86_64 architecture, supported by Red Hat Enterprise Linux 8, the default size of a memory page is 4 KB. This default page size has proved to be suitable for general-purpose operating systems, such as Red Hat Enterprise Linux, which supports many different kinds of workloads.
However, specific applications can benefit from using larger page sizes in certain cases. For example, an application that works with a large and relatively fixed data set of hundreds of megabytes or even dozens of gigabytes can have performance issues when using 4 KB pages. Such data sets can require a huge number of 4 KB pages, which can lead to overhead in the operating system and the CPU.
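For example, you can verify the base page size and inspect the current huge page counters on a running system. This is a minimal sketch using standard kernel interfaces; the counter values depend on your system:

# getconf PAGESIZE
4096
# grep -i huge /proc/meminfo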
This section provides information about huge pages available in RHEL 8 and how you can configure them.
36.1. Available huge page features
With Red Hat Enterprise Linux 8, you can use huge pages for applications that work with big data sets, and improve the performance of such applications.
The following are the huge page methods, which are supported in RHEL 8:
HugeTLB pages
HugeTLB pages are also called static huge pages. There are two ways of reserving HugeTLB pages:
- At boot time: Reserving at boot time increases the probability of success because the memory has not yet been significantly fragmented. However, on NUMA machines, the number of pages is automatically split among the NUMA nodes.
For more information about parameters that influence HugeTLB page behavior at boot time, see Parameters for reserving HugeTLB pages at boot time. For information about how to use these parameters to configure HugeTLB pages at boot time, see Configuring HugeTLB at boot time.
- At run time: It allows you to reserve the huge pages per NUMA node. If the run-time reservation is done as early as possible in the boot process, the probability of memory fragmentation is lower.
For more information about parameters that influence HugeTLB page behavior at run time, see Parameters for reserving HugeTLB pages at run time. For information about how to use these parameters to configure HugeTLB pages at run time, see Configuring HugeTLB at run time.
Transparent HugePages (THP)
With THP, the kernel automatically assigns huge pages to processes, and therefore there is no need to manually reserve the static huge pages. The following are the two modes of operation in THP:
- system-wide: Here, the kernel tries to assign huge pages to a process whenever it is possible to allocate the huge pages and the process is using a large contiguous virtual memory area.
- per-process: Here, the kernel only assigns huge pages to the memory areas of individual processes, which you can specify using the madvise() system call.
Note: The THP feature only supports 2 MB pages.
For more information about parameters that influence THP behavior, see Enabling transparent hugepages and Disabling transparent hugepages.
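To see which THP mode is currently active, and whether a particular process actually received transparent huge pages, you can read the kernel interfaces directly. This is a minimal sketch; pid is a placeholder for the ID of the process you want to inspect, and the bracketed value in the first output marks the active mode:

# cat /sys/kernel/mm/transparent_hugepage/enabled
[always] madvise never
# grep AnonHugePages /proc/pid/smaps_rollup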
36.2. Parameters for reserving HugeTLB pages at boot time
Use the following parameters to influence HugeTLB page behavior at boot time.
For more information about how to use these parameters to configure HugeTLB pages at boot time, see Configuring HugeTLB at boot time.
Parameter | Description | Default value |
---|---|---|
hugepages | Defines the number of persistent huge pages configured in the kernel at boot time. In a NUMA system, huge pages that have this parameter defined are divided equally between nodes. You can assign huge pages to specific nodes at runtime by changing the value of the nodes in the /sys/devices/system/node/node_id/hugepages/hugepages-size/nr_hugepages file. | The default value is 0. To update this value at boot, change the value of this parameter in the /proc/sys/vm/nr_hugepages file. |
hugepagesz | Defines the size of persistent huge pages configured in the kernel at boot time. | Valid values are 2 MB and 1 GB. The default value is 2 MB. |
default_hugepagesz | Defines the default size of persistent huge pages configured in the kernel at boot time. | Valid values are 2 MB and 1 GB. The default value is 2 MB. |
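As an illustration of how these parameters combine, the following hypothetical command line reserves eight 1 GB pages at boot by setting the default huge page size, enabling the 1 GB page size, and requesting the page count; adjust the sizes and count to your workload:

# grubby --update-kernel=ALL --args="default_hugepagesz=1G hugepagesz=1G hugepages=8"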
36.3. Configuring HugeTLB at boot time
The page sizes that the HugeTLB subsystem supports depend on the architecture. The x86_64 architecture supports 2 MB huge pages and 1 GB gigantic pages.
This procedure describes how to reserve a 1 GB page at boot time.
Procedure
To create a HugeTLB pool for 1 GB pages, enable the default_hugepagesz=1G and hugepagesz=1G kernel options:
# grubby --update-kernel=ALL --args="default_hugepagesz=1G hugepagesz=1G"
Create a new file called hugetlb-gigantic-pages.service in the /usr/lib/systemd/system/ directory and add the following content:
[Unit]
Description=HugeTLB Gigantic Pages Reservation
DefaultDependencies=no
Before=dev-hugepages.mount
ConditionPathExists=/sys/devices/system/node
ConditionKernelCommandLine=hugepagesz=1G

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/lib/systemd/hugetlb-reserve-pages.sh

[Install]
WantedBy=sysinit.target
Create a new file called hugetlb-reserve-pages.sh in the /usr/lib/systemd/ directory and add the following content:
While adding the following content, replace number_of_pages with the number of 1 GB pages you want to reserve, and node with the name of the node on which to reserve these pages.
#!/bin/sh
nodes_path=/sys/devices/system/node/
if [ ! -d $nodes_path ]; then
	echo "ERROR: $nodes_path does not exist"
	exit 1
fi

reserve_pages()
{
	echo $1 > $nodes_path/$2/hugepages/hugepages-1048576kB/nr_hugepages
}

reserve_pages number_of_pages node
For example, to reserve two 1 GB pages on node0 and one 1 GB page on node1, replace number_of_pages with 2 for node0 and 1 for node1:
reserve_pages 2 node0
reserve_pages 1 node1
Make the script executable:
# chmod +x /usr/lib/systemd/hugetlb-reserve-pages.sh
Enable early boot reservation:
# systemctl enable hugetlb-gigantic-pages
- You can try reserving more 1 GB pages at runtime by writing to nr_hugepages at any time. However, to prevent failures due to memory fragmentation, reserve 1 GB pages early during the boot process.
- Reserving static huge pages can effectively reduce the amount of memory available to the system, and prevent it from properly utilizing its full memory capacity. Although a properly sized pool of reserved huge pages can be beneficial to applications that utilize it, an oversized or unused pool of reserved huge pages will eventually be detrimental to overall system performance. When setting a reserved huge page pool, ensure that the system can properly utilize its full memory capacity.
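After rebooting, you can confirm that the early boot reservation succeeded by reading back the per-node counters that hugetlb-reserve-pages.sh writes to. This sketch assumes 1 GB pages; each output line corresponds to one node, in node order:

# cat /sys/devices/system/node/node*/hugepages/hugepages-1048576kB/nr_hugepages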
Additional resources
- systemd.service(5) man page on your system
- /usr/share/doc/kernel-doc-kernel_version/Documentation/vm/hugetlbpage.txt file
36.4. Parameters for reserving HugeTLB pages at run time
Use the following parameters to influence HugeTLB page behavior at run time.
For more information about how to use these parameters to configure HugeTLB pages at run time, see Configuring HugeTLB at run time.
Parameter | Description | File name |
---|---|---|
nr_hugepages | Defines the number of huge pages of a specified size assigned to a specified NUMA node. | /sys/devices/system/node/node_id/hugepages/hugepages-size/nr_hugepages |
nr_overcommit_hugepages | Defines the maximum number of additional huge pages that can be created and used by the system through overcommitting memory. Writing any non-zero value into this file indicates that the system obtains that number of huge pages from the kernel’s normal page pool if the persistent huge page pool is exhausted. As these surplus huge pages become unused, they are then freed and returned to the kernel’s normal page pool. | /proc/sys/vm/nr_overcommit_hugepages |
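For example, to let the system create up to 20 surplus huge pages of the default size on demand when the persistent pool is exhausted, you can write to the overcommit file. This is a sketch; surplus pages are only allocated if the kernel can find enough contiguous memory at request time:

# echo 20 > /proc/sys/vm/nr_overcommit_hugepages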
36.5. Configuring HugeTLB at run time
This procedure describes how to add 20 huge pages of size 2048 kB to node2.
To reserve pages based on your requirements, replace:
- 20 with the number of huge pages you wish to reserve,
- 2048kB with the size of the huge pages,
- node2 with the node on which you wish to reserve the pages.
Procedure
Display the memory statistics:
# numastat -cm | egrep 'Node|Huge'
                 Node 0 Node 1 Node 2 Node 3  Total
AnonHugePages         0      2      0      8     10
HugePages_Total       0      0      0      0      0
HugePages_Free        0      0      0      0      0
HugePages_Surp        0      0      0      0      0
Add the number of huge pages of a specified size to the node:
# echo 20 > /sys/devices/system/node/node2/hugepages/hugepages-2048kB/nr_hugepages
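The kernel reserves as many of the requested pages as it can find contiguous memory for, which can be fewer than requested on a fragmented system. To see how many pages were actually reserved, read the same file back:

# cat /sys/devices/system/node/node2/hugepages/hugepages-2048kB/nr_hugepages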
Verification
Verify that the huge pages were added:
# numastat -cm | egrep 'Node|Huge'
                 Node 0 Node 1 Node 2 Node 3  Total
AnonHugePages         0      2      0      8     10
HugePages_Total       0      0     40      0     40
HugePages_Free        0      0     40      0     40
HugePages_Surp        0      0      0      0      0
Additional resources
- numastat(8) man page on your system
36.6. Enabling transparent hugepages
THP is enabled by default in Red Hat Enterprise Linux 8. However, you can enable or disable THP.
This procedure describes how to enable THP.
Procedure
Check the current status of THP:
# cat /sys/kernel/mm/transparent_hugepage/enabled
Enable THP:
# echo always > /sys/kernel/mm/transparent_hugepage/enabled
To prevent applications from allocating more memory resources than necessary, disable the system-wide transparent huge pages and enable them only for the applications that explicitly request them through the madvise() system call:
# echo madvise > /sys/kernel/mm/transparent_hugepage/enabled
Sometimes, providing low latency to short-lived allocations has higher priority than immediately achieving the best performance with long-lived allocations. In such cases, you can disable direct compaction while leaving THP enabled.
Direct compaction is a synchronous memory compaction during the huge page allocation. Disabling direct compaction provides no guarantee of saving memory, but can decrease the risk of higher latencies during frequent page faults. Note that if the workload benefits significantly from THP, disabling direct compaction can decrease performance. Disable direct compaction:
# echo madvise > /sys/kernel/mm/transparent_hugepage/defrag
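Note that the echo commands above take effect immediately but do not persist across reboots. To make a THP mode persistent, you can set the transparent_hugepage kernel command-line option, for example with grubby; this sketch assumes the madvise mode:

# grubby --update-kernel=ALL --args="transparent_hugepage=madvise"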
Additional resources
- madvise(2) man page on your system
- Disabling transparent hugepages
36.7. Disabling transparent hugepages
THP is enabled by default in Red Hat Enterprise Linux 8. However, you can enable or disable THP.
This procedure describes how to disable THP.
Procedure
Check the current status of THP:
# cat /sys/kernel/mm/transparent_hugepage/enabled
Disable THP:
# echo never > /sys/kernel/mm/transparent_hugepage/enabled
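This setting is also lost at reboot. To keep THP disabled persistently, you can set the corresponding kernel command-line option, for example with grubby:

# grubby --update-kernel=ALL --args="transparent_hugepage=never"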
36.8. Impact of page size on translation lookaside buffer size
Reading address mappings from the page table is time-consuming and resource-expensive, so CPUs are built with a cache for recently-used addresses, called the Translation Lookaside Buffer (TLB). However, the default TLB can only cache a certain number of address mappings.
If a requested address mapping is not in the TLB, called a TLB miss, the system still needs to read the page table to determine the virtual-to-physical address mapping. Because of the relationship between application memory requirements and the size of pages used to cache address mappings, applications with large memory requirements are more likely to suffer performance degradation from TLB misses than applications with minimal memory requirements. It is therefore important to avoid TLB misses wherever possible.
Both HugeTLB and Transparent Huge Page features allow applications to use pages larger than 4 KB. This allows addresses stored in the TLB to reference more memory, which reduces TLB misses and improves application performance.
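As a concrete illustration, consider a hypothetical TLB with 1,024 entries: with 4 KB pages those entries can map 1,024 × 4 KB = 4 MB of memory, with 2 MB pages they map 2 GB, and with 1 GB pages they map 1 TB. A workload whose working set fits within this TLB reach takes far fewer TLB misses, which is why applications with large, relatively fixed data sets benefit from huge pages.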