5. Kernel-Related Updates
5.1. All Architectures
- Bugzilla #467714
- The ibmphp module is not safe to unload. Previously, the mechanism that prevented the ibmphp module from unloading was insufficient, and eventually triggered a bug halt. With this update, the method to prevent this module from unloading has been improved, preventing the bug halt. However, attempting to unload the module may produce a warning in the message log, indicating that the module is not safe to unload. This warning can be safely ignored.
- Bugzilla #461564
- With this update, physical memory will be limited to 64GB for 32-bit x86 kernels running on systems with more than 64GB. The kernel splits memory into 2 separate regions: Lowmem and Highmem. Lowmem is mapped into the kernel address space at all times. Highmem, however, is mapped into a kernel virtual window a page at a time as needed. If memory I/Os are allowed to exceed 64GB, the mem_map (also known as the page array) size can approach or even exceed the size of Lowmem. If this happens, the kernel panics during boot or starts prematurely. In the latter case, the kernel fails to allocate kernel memory after booting and either panics or hangs.
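The arithmetic behind this limit can be sketched as follows. The 4 KiB page size, the 32-byte page descriptor, and the 896 MiB Lowmem figure are assumptions typical of 32-bit x86 kernels; none of them is stated in this note.

```python
# Rough estimate of mem_map size versus Lowmem on a 32-bit x86 kernel.
# ASSUMPTIONS (not from the release note): 4 KiB pages, 32 bytes per
# page descriptor, and about 896 MiB of Lowmem.
PAGE_SIZE = 4 * 1024
PAGE_DESCRIPTOR_SIZE = 32
LOWMEM_BYTES = 896 * 1024 * 1024
GIB = 1024 ** 3


def mem_map_bytes(physical_bytes):
    """Return the size of the page array needed to describe physical_bytes."""
    pages = physical_bytes // PAGE_SIZE
    return pages * PAGE_DESCRIPTOR_SIZE


for ram_gib in (16, 64, 128):
    size = mem_map_bytes(ram_gib * GIB)
    share = 100 * size / LOWMEM_BYTES
    print(f"{ram_gib:>3} GiB RAM -> mem_map {size // 2**20} MiB ({share:.0f}% of Lowmem)")
```

Under these assumptions, 64GB of memory already needs a 512 MiB page array, and 128GB would need 1 GiB, which is more than the assumed Lowmem.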
- Bugzilla #246233
- Previously, if a user pressed the arrow keys continuously on a Hardware Virtual Machine (HVM), a race condition between the hardware interrupt and the timer interrupt was encountered. As a result, the keyboard driver reported unknown keycode events. With this update, the i8042 polling timer has been removed, which resolves this issue.
- Bugzilla #435705
- With this update, the diskdump utility (which provides the ability to create and collect vmcore kernel dumps) is now supported for use with the sata_svw driver.
- Bugzilla #439043
- With this update, the swap_token_timeout parameter has been added to /proc/sys/vm. This file contains the valid hold time of the swap-out protection token. The Linux Virtual Memory (VM) subsystem has a token-based thrashing control mechanism and uses the token to prevent unnecessary page faults in thrashing situations. The value is specified in seconds and can be used to tune thrashing behavior. Setting it to 0 disables the swap token mechanism.
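A minimal sketch of reading and setting this tunable is shown below. The helper names are hypothetical, and the sketch assumes the file holds a single integer, as is usual for entries under /proc/sys. Writing requires root privileges and a kernel that provides the parameter.

```python
# Read or set the swap token hold time through procfs.
# The path comes from the note above; read_timeout() and write_timeout()
# are hypothetical helper names. ASSUMPTION: the file holds one integer.
SWAP_TOKEN_TIMEOUT = "/proc/sys/vm/swap_token_timeout"


def read_timeout(path=SWAP_TOKEN_TIMEOUT):
    """Return the current hold time in seconds."""
    with open(path) as f:
        return int(f.read().strip())


def write_timeout(seconds, path=SWAP_TOKEN_TIMEOUT):
    """Set the hold time in seconds; 0 disables the swap token mechanism."""
    if seconds < 0:
        raise ValueError("hold time must not be negative")
    with open(path, "w") as f:
        f.write(f"{seconds}\n")
```

For example, write_timeout(0) would disable the swap token mechanism until the value is changed again or the system is rebooted.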
- Bugzilla #439431
- Previously, when an NFSv4 (Network File System Version 4) client encountered issues while processing a directory using readdir(), an error for the entire readdir() call was returned. With this update, the fattr4_rdattr_error flag is now set when readdir() is called, instructing the server to continue on and only report an error on the specific directory entry that was causing the issue.
- Bugzilla #443655
- Previously, the NFS (Network File System) client was not handling malformed replies from the readdir() function. Consequently, the reply from the server would indicate that the call to the readdir() function was successful, but the reply would contain no entries. With this update, the readdir() reply parsing logic has been changed, such that when a malformed reply is received, the client returns an EIO error.
- Bugzilla #448076
- The RPC client stores the result of a portmap call at a place in memory that can be freed and reallocated under the right circumstances. However, under some circumstances, the result of the portmap call was freed from memory too early, which may have resulted in memory corruption. With this update, reference counting has been added to the memory location where the portmap result is stored, and the memory is freed only after it has been used.
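The idea of reference counting can be illustrated with a small sketch. This is not the kernel code: PortmapResult, get(), and put() are hypothetical names chosen for the example, and the sketch only shows that the stored result is released when the last holder drops its reference.

```python
# Illustrative reference counting: the stored result is released only when
# the last holder drops its reference. PortmapResult, get() and put() are
# hypothetical names, not kernel identifiers.
import threading


class PortmapResult:
    def __init__(self, port):
        self.port = port
        self.freed = False
        self._refs = 1                  # reference held by the creator
        self._lock = threading.Lock()

    def get(self):
        """Take an additional reference before using the result."""
        with self._lock:
            if self.freed:
                raise RuntimeError("result already freed")
            self._refs += 1
        return self

    def put(self):
        """Drop a reference; mark the result freed when the count reaches zero."""
        with self._lock:
            if self._refs <= 0:
                raise RuntimeError("reference count underflow")
            self._refs -= 1
            if self._refs == 0:
                self.freed = True


result = PortmapResult(port=2049)
user = result.get()       # a second holder starts using the result
result.put()              # the creator is done; the result must survive
assert not result.freed
user.put()                # the last holder is done; now it can be freed
assert result.freed
```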
- Bugzilla #450743
- Under some circumstances, the allocation of some data structures for RPC calls may have been blocked when the system memory was low. Consequently, deadlock may have been encountered under heavy memory pressure when there were a large number of NFS pages awaiting writeback. With this update, the allocation of these data structures is now non-blocking, which resolves this issue.
- Bugzilla #451088
- Previously, degraded performance may have been encountered when writing to an LVM mirrored volume synchronously (using the O_SYNC flag). Consequently, every write I/O to a mirrored volume was delayed by 3ms, resulting in the mirrored volume being approximately 5-10 times slower than a linear volume. With this update, I/O queue unplugging has been added to the dm-raid1 driver, and the performance of mirrored volumes has been improved to be comparable with that of linear volumes.
- Bugzilla #476997
- A new tuning parameter has been added to allow system administrators to change the maximum number of modified pages that kupdate writes to disk each time it runs. This new tunable (/proc/sys/vm/max_writeback_pages) defaults to a value of 1024 (4MB) so that a maximum of 1024 pages get written out by each iteration of kupdate. Increasing this value alters how aggressively kupdate flushes modified pages and decreases the potential amount of data loss if the system crashes between kupdate runs. However, increasing the max_writeback_pages value may have negative performance consequences on systems that are sensitive to I/O loads.
- Bugzilla #456911
- A new allowable value has been added to the /proc/sys/kernel/wake_balance tunable parameter. Setting wake_balance to a value of 2 instructs the scheduler to run the thread on any available CPU rather than scheduling it on the optimal CPU. This setting forces the scheduler to reduce overall latency, even at the cost of total system throughput.
- Bugzilla #475715
- When checking a directory tree, the kernel module could, in some circumstances, incorrectly decide the tree was not busy. An active offset mount with an open file handle being used for expires caused the file handle to not count toward the busyness check. This resulted in mount requests being made for already mounted offsets. With this update, the kernel module check has been corrected and incorrect mount requests are no longer generated.
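For the max_writeback_pages tunable described under Bugzilla #476997 above, the relationship between the page count and the amount of data written per kupdate run can be sketched as follows. The 4 KiB page size is inferred from that entry's "1024 (4MB)" default and should be treated as an assumption; the helper names are hypothetical.

```python
# Convert a max_writeback_pages value to the amount of data kupdate may
# write per run. ASSUMPTION: 4 KiB pages, inferred from the note's
# "1024 (4MB)" default. Helper names are hypothetical.
PAGE_SIZE = 4 * 1024


def writeback_bytes(max_writeback_pages, page_size=PAGE_SIZE):
    """Return the upper bound, in bytes, written per kupdate iteration."""
    return max_writeback_pages * page_size


def pages_for(target_bytes, page_size=PAGE_SIZE):
    """Return the smallest page count that covers target_bytes."""
    return -(-target_bytes // page_size)   # ceiling division


print(writeback_bytes(1024) // 2**20, "MiB per iteration at the default")
print(pages_for(16 * 2**20), "pages are needed for 16 MiB per iteration")
```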
- Bugzilla #453470
- During system initialization, the CPU vendor was detected after the initialization of the Advanced Programmable Interrupt Controllers (APICs). Consequently, on x86_64 AMD systems with more than 8 cores, APIC clustered mode was used, resulting in suboptimal system performance. With this update, the CPU vendor is now queried prior to initializing the APICs, resulting in APIC physical flat mode being used by default, which resolves this issue.
- Bugzilla #462459
- The Common Internet File System (CIFS) code has been updated in Red Hat Enterprise Linux 4.8, fixing a number of bugs that had been repaired upstream, including the following change: Previously, when mounting a server without Unix extensions, it was possible to change the mode of a file. However, this mode change could not be permanently stored, and may have changed back to the original mode at any time. With this update, the mode of the file cannot be temporarily changed by default; chmod() calls will return success, but have no effect. A new mount option, dynperm, needs to be used if the old behavior is required.
- Bugzilla #451819
- Previously, a race condition may have been encountered in the kernel between dio_bio_end_aio() and dio_await_one(). This may have led to a situation where direct I/O is left waiting indefinitely on an I/O process that has already completed. With this update, the reference counting operations involved are now locked so that the submission and completion paths see a unified state, which resolves this issue.
- Bugzilla #249775
- Previously, upgrading a fully virtualized guest system from Red Hat Enterprise Linux 4.6 (with the kmod-xenpv package installed) to newer versions of Red Hat Enterprise Linux 4 resulted in an improper module dependency between the built-in kernel modules xen-vbd.ko and xen-vnif.ko and the older xen-platform-pci.ko module. Consequently, file systems mounted via the xen-vbd.ko block driver, and guest networking using the xen-vnif.ko network driver, would fail. In Red Hat Enterprise Linux 4.7, the functionality in the xen-platform-pci.ko module was built into the kernel. However, when a formerly loadable kernel module becomes a part of the kernel, the symbol dependency check for existing loadable modules is not accounted for correctly in module-init-tools. With this update, the xen-platform-pci.ko functionality has been removed from the built-in kernel and placed back into a loadable module, allowing module-init-tools to check and create the proper dependencies during a kernel upgrade.
- Bugzilla #463897
- Previously, attempting to mount disks or partitions in a 32-bit Red Hat Enterprise Linux 4.6 fully virtualized guest using the paravirtualized block driver (xen-vbd.ko) on a 64-bit host would fail. With this update, the block front driver (block.c) has been updated to inform the block back driver that the guest is using the 32-bit protocol, which resolves this issue.
- Bugzilla #460984
- Previously, installing the pv-on-hvm drivers on a bare-metal kernel automatically created the /proc/xen directory. Consequently, applications that verify if the system is running a virtualized kernel by checking for the existence of the /proc/xen directory may have incorrectly assumed that the virtualized kernel is being used. With this update, the pv-on-hvm drivers no longer create the /proc/xen directory, which resolves this issue.
- Bugzilla #455756
- Previously, paravirtualized guests could only have a maximum of 16 disk devices. In this update, this limit has been increased to a maximum of 256 disk devices.
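The detection pattern affected by Bugzilla #460984 above can be sketched as follows. This is only an illustration of the kind of check that entry describes; looks_like_xen_kernel() is a hypothetical helper name, and real applications may combine this test with others.

```python
# Detect a virtualized (Xen) kernel by the presence of /proc/xen, the check
# described in the note. looks_like_xen_kernel() is a hypothetical name.
import os


def looks_like_xen_kernel(proc_xen="/proc/xen"):
    """Return True when the /proc/xen directory exists."""
    return os.path.isdir(proc_xen)


if looks_like_xen_kernel():
    print("/proc/xen present: assuming a virtualized kernel")
else:
    print("/proc/xen absent: assuming a bare-metal kernel")
```

Before this update, a check like this could report a virtualized kernel on bare metal once the pv-on-hvm drivers were installed.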
- Bugzilla #523930
- In some circumstances, write operations to a particular TTY device opened by more than one user (for example, one user opened it as /dev/console and the other opened it as /dev/ttyS0) were blocked. If one user opened the TTY terminal without setting the O_NONBLOCK flag, this user's write operations were suspended if the output buffer was full or if a STOP (Ctrl-S) signal was sent. In addition, because the O_NONBLOCK flag was not respected, write operations for user terminals opened with the O_NONBLOCK flag set were also blocked. This update re-implements TTY locks, ensuring O_NONBLOCK works as expected, even if a STOP signal is sent from another terminal.
- Bugzilla #519692
- Previously, the get_random_int() function returned the same number until the jiffies counter (which ticks at a clock interrupt frequency) or process ID (PID) changed, making it possible to predict the random numbers. This may have weakened the ASLR security feature. With this update, get_random_int() is more random and no longer uses a common seed value. This reduces the possibility of predicting the values get_random_int() returns.
- Bugzilla #518707
- ib_mthca, the driver for Host Channel Adapter (HCA) cards based on the Mellanox Technologies MT25408 InfiniHost III Lx HCA integrated circuit device, uses kmalloc() to allocate large bitmasks. This ensures allocated memory is a contiguous physical block, as is required by DMA devices such as these HCA cards. Previously, the largest allowed kmalloc() allocation was 128kB. If ib_mthca was set to allocate more than 128kB (for example, by setting the num_mutt option to "num_mutt=2097152", causing kmalloc() to allocate 256kB), the driver failed to load, returning the message "Failed to initialize memory region table, aborting." This update alters the allocation methods of the ib_mthca driver. When mthca_buddy_init() wants more than a page, memory is allocated directly from the page allocator, rather than using kmalloc(). It is now possible to pin large amounts of memory for use by the ib_mthca driver by increasing the values assigned to num_mutt and num_mtt.
- Bugzilla #519446
- Previously, there were some instances in the kernel where the __ptrace_unlink() function (part of the ptrace system call) used REMOVE_LINKS and SET_LINKS, rather than add_parent and remove_parent, while changing the parent of a process. This approach could abuse the global process list and, as a consequence, create deadlocked and unkillable processes in some circumstances. With this update, __ptrace_unlink() now uses add_parent and remove_parent in every instance, ensuring that deadlocked and unkillable processes cannot be created.
Note: Unkillable or deadlocked processes created by this bug had no effect on system availability.
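The 256kB figure quoted for num_mutt=2097152 under Bugzilla #518707 above is consistent with a bitmask that holds one bit per entry. That layout is an assumption made for this sketch, not something the entry states, and the helper names are hypothetical.

```python
# Estimate the bitmask size for a given num_mutt value.
# ASSUMPTION: one bit per entry, which matches the note's example
# (num_mutt=2097152 -> 256kB). KMALLOC_LIMIT is the 128kB limit from the note.
KMALLOC_LIMIT = 128 * 1024


def bitmask_bytes(entries):
    """Return the bytes needed for a bitmask with one bit per entry."""
    return (entries + 7) // 8


def fits_in_kmalloc(entries, limit=KMALLOC_LIMIT):
    """Return True when the bitmask fits within the old kmalloc() limit."""
    return bitmask_bytes(entries) <= limit


for num_mutt in (1048576, 2097152):
    size_kb = bitmask_bytes(num_mutt) // 1024
    verdict = "fits" if fits_in_kmalloc(num_mutt) else "too large"
    print(num_mutt, f"{size_kb}kB", verdict)
```

Under this assumption, 1048576 entries need exactly 128kB and fit within the old limit, while 2097152 entries need 256kB and do not.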