- BZ#622327
Previously, an operation such as madvise(MADV_MERGEABLE) may have split VMAs (Virtual Memory Area) without checking if any huge page had to be split into regular pages, leading to huge pages to be still mapped in VMA ranges that would not be large enough to fit huge pages. With this update, huge pages are checked whether they have been split when any VMA is being truncated.
- BZ#640576
Occasionally, the anon_vma variable could contain the value null in the page_address_in_vma function and cause kernel panic. With this update, kernel panic no longer occurs.
- BZ#640579
Previously, building under memory pressure with KSM (Kernel Shared Memory) caused KSM to collapse with an internal compiler error indicating an error in swapping. With this update, data corruption during swapping no longer occurs.
- BZ#640611
The fork() system call led to an rmap walk finding the parent huge-pmd twice instead of once, thus causing a discrepancy between the mapcount and page_mapcount check, which could have led to erratic page counts for subpages. This fix ensures that that the rmap walk is accurate when a process is forked, thus resolving the issue.
- BZ#642570
The fork() system call led to an rmap walk finding the parent huge-pmd twice instead of once, thus causing a discrepancy between the mapcount and page_mapcount check, which could have led to erratic page counts for subpages. This fix ensures that that the rmap walk is accurate when a process is forked, thus resolving the issue.
- BZ#646384
Running certain workload tests on a Non-Uniform Memory Architecture (NUMA) system could cause kernel panic at mm/migrate.c:113. This was due to a false positive BUG_ON. With this update, the false positive BUG_ON has been removed.
- BZ#622640
If an Intel 82598 10 Gigabit Ethernet Controller was configured in a way that caused peer-to-peer traffic to be sent to the Intel X58 I/O hub (IOH), a PCIe credit starvation problem occurred. As a result, the system would hang. With this update, the system continues to work and does not hang.
- BZ#637332
The ixgbe driver has been upgraded to upstream version 3.0.12, which provides a number of bug fixes and enhancements over the previous version.
- BZ#696337
During light or no network traffic, the active-backup interface bond using ARP monitoring with validation could go down and return due to an overflow or underflow of system timer interrupt ticks (jiffies). With this update, the jiffies calculation issues problems have been fixed and a bond interface works as expected.
- BZ#609516
Booting a system via the Extensible Firmware Interface (EFI) could result in a low resolution of the boot screen due to a VGA palette corruption. With this update, the VGA palette corruption no longer occurs and the boot screen is displayed in the correct resolution and colors.
- BZ#626454
Systems with an updated Video BIOS for the AMD RS880 would not properly boot with KMS (Kernel mode-setting) enabled. With this update, the Video BIOS boots successfully when KMS is enabled.
- BZ#640870
This update fixes the slow memory leak in the i915 module in DRM (Direct Rendering Manager) and GEM (Graphics Execution Manager).
- BZ#640871
Previously, a race condition in the TTM (Translation Table Maps) module of the DRM (Direct Rendering Manager) between the object destruction thread and object eviction could result in a major loss of large objects reference counts. Consequently, this caused a major amount of memory leak. With this update, the race condition no longer occurs and any memory leaks are prevented.
- BZ#644896
When booting the latest Red Hat Enterprise Linux 6 kernel (-78.el6), the system hanged shortly after the booting. Access to the file system died and the console started outputting soft lockup messages from the TTM code. With this update, the aforementioned behavior no longer occurs and the system boots as expected.
- BZ#530618
Under some circumstances, a kernel panic on installation or boot may occur if the "Interrupt Remapping" feature is enabled in the BIOS. To work around this issue, disable interrupt remapping in the BIOS.
- BZ#681017
Under some circumstances, faulty logic in the system BIOS could report that ASPM (Active State Power Management) was not supported on the system, but leave ASPM enabled on a device. This could lead to AER (Advanced Error Reporting) errors that the kernel was unable to handle. With this update, the kernel proactively disables ASPM on devices when the BIOS reports that ASPM is not supported, safely eliminating the aforementioned issues.
- BZ#624628
Prior to this update, a guest could use the poll() function to find out whether the host-side connection was open or closed. However, with a SIGIO signal, this can be done asynchronously, without having to explicitly poll each port. With this update, a SIGIO signal is sent for any host connect/disconnect events. Once the SIGIO signal is received, the open/close status of virtio-serial ports can be obtained using the poll() system call.
- BZ#628805
The virtio-console device did not handle the hot-unplug operation properly. As a result, virtio-console could access the memory outside the driver's memory area and cause kernel panic on the guest. With this update, multiple fixes to the virtio-console device resolved this issue and the hot-unplug operation works as expected.
- BZ#634232
Applications and agents using virtio serial ports would block messages even though there were messages queued up and ready to be read in the virtqueue. This was due to virtio_console's poll function checking whether a port was NULL to determine if a read operation would result in a block of the port. However, in some cases, a port can be NULL even though there are buffers left in the virtqueue to be read. This update introduces a more sophisticated method of checking whether a port contains any data; thus, preventing queued up messages from being incorrectly blocked.
- BZ#635535
Prior to this update, user space could submit (using the write() operation) a buffer with zero length to be written to the host, causing the qemu hypervisor instance running on that host to crash. This was caused by the write() operation triggering a virtqueue event on the host, causing a NULL buffer to be accessed. With this update, user space is no longer allowed to submit zero-sized buffers and the aforementioned crash no longer occur.
- BZ#643750
Using a virtio serial port from an application, filling it until the write command returns -EAGAIN and then executing a select command for the write command, caused the select command to not return any values when using the virtio serial port in a non-blocking mode. When used in blocking mode, the write command waited until the host indicated it had used up the buffers. This was due to the fact that the poll operation waited for the port->waitqueue pointer; however, nothing woke the waitqueue when there was room again in the queue. With this update, the queue is woken via host notifications so that buffers consumed by the host can be reclaimed, the queue freed, and the application write operations may proceed again.
- BZ#643751
If a host was slow in reading data or did not read data at all, blocking write() calls not only blocked the program that called the write() call but also the entire guest. This was caused by the write() calls waiting until an acknowledgment that the data consumed was received from the host. With this update, write() calls no longer wait for such acknowledgment: control is immediately returned to the user space application. This ensures that even if the host is busy processing other data or is not consuming data at all, the guest is not blocked.
- BZ#605786
Please note that in future versions of Red Hat Enterprise Linux 6 (i.e. Red Hat Enterprise Linux 6.1 and later) the auto value setting of the crashkernel= parameter (i.e. crashkernel=auto) is deprecated.
- BZ#675102
Prior to this update, the /usr/include/linux/fs.h file was broken, causing other packages to fail to build. With this update, the underlying source code has been modified to address this issue, and packages no longer fail to build.
- BZ#629178
Prior to this update, the execve utility exhibited the following flaw. When an argument and any environment data were copied from an old task's user stack to the user stack of a newly-execve'd task, the kernel would not allow the process to be interrupted or rescheduled. Therefore, when the argument or environment string data was (abnormally) large, there was no "interactivity" with the process while the execve() function was transferring the data. With this update, fatal signals (like CTRL+c) can now be received and handled and a process is allowed to yield to higher priority processes during the data transfer.
- BZ#616296
While not mandated by any specification, Linux systems rely on NMIs (Non-maskable Interrupts) being blocked by an IF-enabling (Interrupt Flag) STI instruction (an x86 instruction that enables interrupts; Set Interrupts); this is also the common behavior of all known hardware. Prior to this update, kernel panic could occur on guests using NMIs extensively (for example, a Linux system with the nmi_watchdog kernel parameter enabled). With this update, an NMI is disallowed when interrupts are blocked by an STI. This is done by checking for the condition and requesting an interrupt window exit if it occurs. As a result, kernel panic no longer occurs.
- BZ#645898
Prior to this update, running context-switch intensive workloads on KVM guests resulted in a large number of exits (kvm_exit) due to control register (CR) accesses by the guest, thus, resulting in poor performance. This update includes a number of optimizations which allow the guest not to exit to the hypervisor in the aforementioned case and improve the overall performance.
- BZ#626814
In some cases the NFS server fails to notify NFSv4 clients about renames and unlinks done by other clients, or by non-NFS users of the server. An application on a client may then be able to open the file at its old pathname (and read old cached data from it, and perform read locks on it), long after the file no longer exists at that pathname on the server. To work around this issue, use NFSv3 instead of NFSv4. Alternatively, turn off support for leases by writing 0 to /proc/sys/fs/leases-enable (ideally on boot, before the nfs server is started). This change prevents NFSv4 delegations from being given out, restoring correctness at the expense of some performance.
- BZ#695488
In a four node cluster environment, a deadlock could occur on machines in the cluster when the nodes accessed a GFS2 file system. This resulted in memory fragmentation which caused the number of network packet fragments in requests to exceed the network hardware limit. The network hardware firmware dropped the network packets exceeding this limit. With this update, the network packet fragmentation was reduced to the limit of the network hardware, no longer causing problems during memory fragmentation.
- BZ#627741
The zfcpdump (kdump) kernel on IBM System z could not be debugged using the dump analysis tool crash, because the vmlinux file in the kernel-kdump-debuginfo RPM did not contain DWARF debug information. With this update, the CONFIG_DEBUG_KERNEL parameter is set to yes and the needed debug information is provided.
- BZ#647365
On IBM System z systems, user space programs could access the /dev/mem file (which contains an image of main memory), where an accidental memory (write) access could potentially be harmful. To restrict access to memory from user space through the /dev/mem file, the CONFIG_STRICT_DEVMEM configuration option has been enabled for the default kernel. The kdump and debug kernels have this option switched off by default.
- BZ#668470
If a CPU is set offline, the nohz_load_balancer CPU is updated. However, under certain circumstances, the nohz_load_balancer CPU would not be updated, causing the offlined CPU to be enqueued with various timers which never expired. As a result, the system could become unresponsive. With this update, the nohz_load_balancer CPU is always updated; systems no longer become unresponsive.
- BZ#636678
Previously, in order to install Snapshot 13, boot parameter nomodeset xforcevesa had to be added to the kernel command line, otherwise, the screen turned black and and prevented the installation. With this update, the aforementioned boot parameter no longer has to be specified and the installation works as expected.
- BZ#635710
The qla2xxx driver for QLogic Fibre Channel Host Bus Adapters (HBAs) has been updated to upstream version 8.03.05.01.06.1-k0, which provides a number of bug fixes and enhancements over the previous version.
- BZ#695478
The driver for the NetXen NX3031 network adapter did not support more than 14 fragments for a non-TSO (TCP Segmentation Offload) packet, which could have caused network failures. This update corrects the driver.
- BZ#641764
Previously, accounting of reclaimable inodes did not work correctly. When an inode was reclaimed it was only deleted from the per-AG (per Allocation Group) tree. Neither the counter was decreased, nor was the parent tree's AG entry untagged properly. This caused the system to hang indefinitely. With this update, the accounting of reclaimable inodes works properly and the system remains responsive.
- BZ#632021
If a Xen guest which specifies a physical path such as /dev/sda1 in its /etc/fstab configuration file, instead of a labeled path, then the following workaround procedure should be followed:
The "xen_emul_unplug=never" option should be added to the guest's kernel boot line.
The /etc/fstab entry should be modified to specify a partition such as /dev/xvda1 for the /boot partition, or a proper partition label should be used for the file systems on the emulated block device.
Finally, if the Xen guest configuration spec uses a line similar to the following:
disk = [ 'file:/var/lib/xen/images/rhel6-guest.dsk,hda,w', ]
…then that line should be changed to:
disk = [ 'tap:aio:/var/lib/xen/images/rhel6-guest.dsk,hda,w', ]
This line needs to be changed because the Xen para-virtualized disk driver is not supported with file-backed I/O.
- BZ#680126
Using the pam_tty_audit.so module (which enables or disables TTY auditing for specified users) in the /etc/pam.d/sudo file and in the /etc/pam.d/system-auth file when the audit package is not installed resulted in soft lock-ups on CPUs. As a result, the kernel became unresponsive. This was due to the kernel exiting immediately after TTY auditing was disabled, without emptying the buffer, which caused the kernel to spin in a loop, copying 0 bytes at each iteration and attempting to push each time without any effect. With this update, a locking mechanism is introduced to prevent the aforementioned behavior.
- BZ#625914
Previously, a kernel module not shipped by Red Hat was successfully loaded when the FIPS boot option was enabled. With this update, kernel self-integrity is improved by rejecting to load kernel modules which are not shipped by Red Hat when the FIPS boot option is enabled.
- BZ#631547
Previously the cxgb3 (Chelsio Communications T3 10Gb Ethernet) adapter experienced parity errors. With this update, the parity errors are correctly detected and the cxgb3 adapter successfully recovers from them.
- BZ#698016
When the iscsi driver detected the platform option-rom, it bypassed its local defaults and used the platform-provided parameters. With this update, if the platform specifies invalid OEM parameters, a warning message is printed, and the iSCSI driver falls back on its sensible internal default parameters rather than failing to load the driver altogether.
- BZ#694106
After a raid45->raid0 takeover operation, another takeover operation (for example, raid0->raid5) resulted in kernel panic. This was due to the 'degraded' and 'plug' variables from the mddev structure not being cleared after the raid4->raid0 takeover. With this update, aforementioned variables are properly cleared, and no longer cause kernel panic.
- BZ#550724
In some cases, under a small system load involve some I/O operation, processes started to lock up in the D state (that is, became unresponsive). The system load could in some cases climb steadily. This was due to the way the event channel IRQ (Interrupt Request) was set up. Xen events behave like edge-triggered IRQs, however, the kernel was setting them up as level-triggered IRQs. As a result, any action using Xen event channels could lock up a process in the D state. With this update, the handling has been changed from edge-triggered IRQs to level-triggered IRQs and process no longer lock up in the D state.
- BZ#643371
A race condition occurred when Xen was presented with an inconsistent page type resulting in the crash of the kernel. With this update, the race condition is prevented and kernel crashes no longer occur.
- BZ#645198
The Red Hat Enterprise Linux kernel can now be tainted with a "tech preview" status. If a kernel module causes the tainted status, then running the command "cat /proc/modules" will display a "(T)" next to any module that is tainting the kernel.
For more information about Technology Previews, refer to:
Running a kernel with the tainted flag set may limit the amount of support that Red Hat can provide for the system.
- BZ#694913
The "perf" subsystem failed to load on HP ProLiant servers, and messages similar to the following were logged to the console at boot time:
NMI watchdog disabled for cpu1: unable to create perf event: -2
This update includes a patch that allows the "perf" subsystem to load when using these servers, but only using the same counter that the BIOS uses. The implications of this are that "perf" statistics could be corrupted.
- BZ#643667
Previously, Red Hat Enterprise Linux 6 enabled the CONFIG_IMA option in the kernel. This caused the kernel to track all inodes in the system in a radix tree, leading to a huge waste of memory. With this update, an optimized version of a tree (rbtree) is used and memory is no longer wasted.
- BZ#615309
Direct Asynchronous I/O (AIO) which is not issued on file system block boundaries, and falls into a hole in a sparse file on ext4 or xfs file systems, may corrupt file data if multiple I/O operations modify the same file system block. Specifically, if qemu-kvm is used with the aio=native I/O mode over a sparse device image hosted on the ext4 or xfs filesystem, guest file system corruption will occur if partitions are not aligned with the host file system block size. This issue can be avoided by using one of the following techniques:
Align AIOs on file system block boundaries, or do not write to sparse files using AIO on xfs or ext4 filesystems.
KVM: Use a non-sparse system image file or allocate the space by zeroing out the entire file.
KVM: Create the image using an ext3 host filesystem instead of ext4.
KVM: Invoke qemu-kvm with aio=threads (this is the default).
KVM: Align all partitions within the guest image to the host's file system block boundary (default 4k).
- BZ#624909
Running a fsstress test which issues various operations on a ext4 filesystem when usrquota is enabled, the following JBD (Journaling Block Device) error was output in /var/log/messages:
JBD: Spotted dirty metadata buffer (dev = sda10, blocknr = 17635). There's a risk of filesystem corruption in case of system crash.
With this update, by always journaling the quota file modification in an ext4 file system the aforementioned message no longer appears in the logs.
- BZ#593766
The /var/log/messages file could have slowly filled up with error messages similar to the following:
ACPI Error: Illegal I/O port address/length above 64K: 0x0000000000400020/4 (20090903/hwvalid-154)
ACPI Exception: AE_LIMIT, Returned by Handler for [SystemIO] (20090903/evregion-424)
ACPI Error (psparse-0537): Method parse/execution failed [\_GPE._L09] (Node ffff8800797cd298), AE_LIMIT
ACPI Exception: AE_LIMIT, while evaluating GPE method [_L09] (20090903/evgpe-568)
This error message no longer occurs with this update.
- BZ#653245
The kernel syslog contains debugging information that is often useful during exploitation of other vulnerabilities such as kernel heap addresses. With this update, a new CONFIG_SECURITY_DMESG_RESTRICT option has been added to config-generic-rhel which prevents unprivileged users from reading the kernel syslog. This option is by default turned off (0), which means no restrictions.
- BZ#627653
A regression was discovered that caused kernel panic during the booting of any SGI UV100 and UV1000 system unless the virtefi command line option was passed to the kernel by GRUB. With this update, the need for the virtefi command line option is removed and the kernel will boots as expected without it.
- BZ#659480
Prior to this update, running the hwclock --systohc command could halt a running system. This was due to the interrupt transactions being looped back from a local IOH (Input/Output Hub), through the IOH to a local CPU (erroneously), which caused a conflict with I/O port operations and other transactions. With this update, the conflicts are avoided and the system continues to run after executing the hwclock --systohc command.
- BZ#621304
The RELEASE_LOCKOWNER operation has been implemented for the NFSv4 client in order to avoid an exhaustion of NFS server state IDs, which could result in an NFS4ERR_RESOURCE error. Additionally, this update introduces NFSv4 lock state tracking in read/write requests and lock owners labeling.
- BZ#626515
An implementation of the SHA (Secure Hash Algorithm) hashing algorithm for the IBM System z architecture did not produce correct hashes and could potentially cause memory corruption due to broken partial block handling. A partial block could break when it was followed by an update which filled it with leftover bytes. Instead of storing the new leftover bytes at the start of the buffer, they were stored immediately after the previous partial block. With this update, the index pointer is reset, thus resolving the aforementioned partial block handling issue.
- BZ#661113
Outgoing packets were not fragmented after receiving the icmpv6 pkt-too-big message when using the IPSecv6 tunnel mode. This was due to the lack of IPv6 fragmentation support over an IPsec tunnel. With this update, IPv6 fragmentation is fully supported and works as expected when using the IPSecv6 tunnel mode.
- BZ#630810
Prior to this update, performing live migration back and forth during guest installation with network adapters based on the 8168c chipset or the 8111c chipset triggered an rtl8169_interrupt hang due to a RxFIFO overflow. With this update, infinite loops in the IRQ (Interrupt Request) handler caused by RxFIFO overflows are prevented and the aforementioned hang no longer occurs.
- BZ#629066
When booting a Red Hat Enterprise Linux 5.5 kernel on a guest on an AMD host system running Red Hat Enterprise Linux 6, the guest kernel crashes due to an unsupported MSR (Model Specific Registers) read of the MSR_K7_CLK_CTL model. With this update, KVM support was added for the MSR_K7_CLK_CTL model specific register used in the AMD K7 CPU models, thus, the kernel crashes no longer occur.
- BZ#629836
Previously, a Windows XP host experienced the stop error screen (i.e. the "Blue Screen Of Death" error) when booted with the CPU mode name. With this update, a Windows XP host no longer experiences the aforementioned error due to added KVM (Kernel-based Virtual Machine) support for the MSR_EBC_FREQUENCY_ID model specific register.
- BZ#629085
Under certain circumstances, a kernel thread that handles incoming messages from a server could unexpectedly exit by itself. As a result, the kernel thread would free some data structures which could then be referenced by another data structure, resulting in a kernel panic. With this update, kernel threads no longer unexpectedly exit; thus, kernel panic no longer occurs in the aforementioned case.
- BZ#641408
Previously, calling the elevator_change function immediately after the blk_init_queue function resulted in a null pointer dereference. With this update, the null pointer dereference no longer occurs.
- BZ#623199
In certain network setups (specifically, using VLAN on certain NICs where packets are sent through the VLAN GRO rx path), sending packets from an active ethernet port to another inactive ethernet port could affect the network's bridge and cause the bridge to acquire a wrong bridge port. This resulted in all packets not being passed along in the network. With this update, the underlying source code has been modified to address this issue, and network traffic works as expected.
- BZ#683496
Prior to this update, adding a bond over a bridge inside a virtual guest caused the kernel to crash due to a NULL dereference. This update improves the tests for the presence of VLANs configured above bonding (additionally, this update fixes a regression introduced by the patch for BZ#633571) . The new logic determines whether a registration has occurred, instead of testing that the internal vlan_list of a bond is empty. Previously, the system panicked and crashed when vlan_list was not empty, but the vlgrp pointer was still NULL.
- BZ#592879
The memory cgroup controller has its own Out of Memory routine (OOM killer) and kills a process at an OOM event. However, a race condition could cause the pagefault_out_of_memory function to be called after the memory cgroup's OOM. This invoked the generic OOM killer and a panic_on_oom could occur. With this update, only the memory cgroup's OOM killer is invoked and used to kill a process should an OOM occur.
- BZ#613812
This update provides a number of patches that resolve a mutual exclusion fault which could cause the kernel to become unresponsive.
- BZ#634500
Previously, MADV_HUGEPAGE was missing in the include/asm-generic/mman-common.h file which caused madvise to fail to utilize TPH. With this update, the madvise option was removed from /sys/kernel/mm/redhat_transparent_hugepage/enabled since MADV_HUGEPAGE was removed from the madvise system call.
- BZ#619818
If device-mapper-multipath is used, and the default path failure timeout value (/sys/class/fc_remote_ports/rport-xxx/dev_loss_tmo) is changed, that the timeout value will revert to the default value after a path fails, and later restored. Note that this issue will present the lpfc, qla2xxx, ibmfc or fnic Fibre Channel drivers. To work around this issue the dev_loss_tmo value must be adjusted after each path fail/restore event
- BZ#633907
During an installation through Cisco NPV (N port virtualization) to Brocade, adding a LUN (Logical Unit Number) through Add Advanced Target did not work properly. This was caused by the faulty resending of FLOGI (Fabric Login) when a Fibre Channel switch in the NPV mode rejected requests with zero Destination ID. With this update, the LUN is seen and able to be selected for installation.
- BZ#633915
An I/O operation could fast fail when using Device Mapper Multipathing (dm-multipath) if the I/O operation could be retried by the scsi layer. This prevented the multipath layer from starting its error recovery procedure and resulted in unnecessary log messages in the appropriate log files. This update includes a number of optimizations that resolve the aforementioned issue.
- BZ#636233
Previously, timing issues could cause the FIP (FCoE Initialization Protocol) FLOGIs to timeout even if there were no problems. This caused the kernel to go into a non-FIP mode even though it should have been in the FIP mode. With this update, the timing issues no longer occur and the kernel no longer switches to the non-FIP mode when logging to the Fibre Channel Switch/Forwarder.
- BZ#636771
A Red Hat Enterprise Linux 6.0 host (with root on a local disk) with dm-multipath configured on multiple LUNs (Logical Unit Number) hit kernel panic (at scsi_error_handler) with target controller faults during an I/O operation on the dm-multipath devices. This was caused by multipath using the blk_abort_queue() function to allow lower latency path deactivation. The call to blk_abort_queue proved to be unsafe due to a race (between blk_abort_queue and scsi_request_fn). With this update, the race has been resolved and kernel panic no longer occurs on Red Hat Enterprise Linux 6.0 hosts.
- BZ#638297
When an scsi command timed out and the fcoe/libfc driver aborted the command, a race could occur during the clean-up of the command which could result in kernel panic. With this update, the locking mechanism in the clean-up and abort paths was modified, thus, fixing the aforementioned issue.
- BZ#643237
Prior to this update, when using Red Hat Enterprise Linux 6 with a qla4xxx driver and FC (Fibre Channel) drivers using the fc class, a device might have been put in the offline state due to a transport problem. Once the transport problem was resolved, the device was not usable until a user manually corrected the state. This update enables the transition from the offline state to the running state, thus, fixing the problem.
- BZ#668114
Operating in the FIP (FCoE Initialization Protocol) mode and performing operations that bring up ports could cause the fcoe.ko and fnic.ko modules to not be able to re-login when a port was brought back up. This was due to a bug in the FCoE (Fiber Channel over Ethernet) layer causing improper handling of FCoE LOGO frames while in the FIP mode. With this update, FCoE LOGO frames are properly handled when in the FIP mode and the fcoe.ko and fnic.ko modules no longer fail to re-login.
- BZ#632631
Previously, the s390 tape block driver crashed whenever it tried to switch the I/O scheduler. With this update, an official in-kernel API (elevator_change()) is used to switch the I/O scheduler safely; thus, the crashes no longer occurs.
- BZ#635199
The barrier implementation in the Red Hat Enterprise Linux 6 kernel works by completely draining the I/O scheduler's queue, then issuing a preflush, a barrier, and finally a postflush request. However, since the supported file systems in Red Hat Enterprise Linux 6 all implement their own ordering guarantees, the block layer need only provide a mechanism to ensure that a barrier request is ordered with respect to other I/O already in the disk cache. This mechanism avoids I/O stalls experienced by queue draining. The block layer will be updated in future kernels to provide this more efficient mechanism of ensuring ordering.
Workloads that include heavy fsync or metadata activity will see an overall improvement in disk performance. Users taking advantage of the proportional weight I/O controller will also see a boost in performance. In preparation for the block layer updates, third party file system developers need to ensure that data ordering surrounding journal commits are handled within the file system itself, since the block layer will no longer provide this functionality.
These future block layer improvements will change some kernel interfaces such that symbols which are not on the kABI whitelist shall be modified. This may result in the need to recompile third party file system or storage drivers.
- BZ#636994
Handling ALUA (Asymmetric Logical Unit Access) transitioning states did not work properly due to a faulty SCSI (Small Computer System Interface) ALUA handler. With this update, optimized state transitioning prevents the aforementioned behavior.
- BZ#637805
Previously, a write request may have merged with a discard request. This could have posed a potential risk for 3rd party drivers which could possibly issue a discard without waiting properly. With this update, discarding of write block I/O requests by preventing merges of discard and write requests in one block I/O has been introduced, resolving the possible risks.
- BZ#638525
Previously, the /proc/maps file which is read by LVM2 (Logical Volume Manager) contained inconsistencies.
- BZ#644380
Running the Virtual Desktop Server Manager (VDSM) and performing an lvextend during an intensive Virtual Guest power up caused this operation to fail. Since lvextend was blocked, all components became non-responsive: vgs and lvs commands froze the session, Virtual Guests became Paused or Not Responding. This was caused by a faulty use of a lock. With this update, performing an lvextend operation works as expected.
- BZ#658293
The lack of synchronization between the clearing of the QUEUE_FLAG_CLUSTER flag and the setting of the no_cluster flag in the queue_limits variable caused corruption of data. Note that this issue only occurred on hardware that did not support segment merging (that is, clustering). With this update, the synchronization between the aforementioned flags works as expected, thus, corruption of data no longer occurs.
- BZ#669411
Deleting an SCSI (Small Computer System Interface) device attached to a device handler caused applications running in user space, which were performing I/O operations on that device, to become unresponsive. This was due to the fact that the SCSI device handler's activation did not propagate the SCSI device deletion via an error code and a callback to the Device-Mapper Multipath. With this update, deletion of an SCSI device attached to a device handler is properly handled and no longer causes certain applications to become unresponsive.
- BZ#670572
For a device that used a Target Portal Group (TPG) ID which occupied the full 2 bytes in the RTPG (Report Target Port Groups) response (with either byte exceeding the maximum value that may be stored in a signed char), the kernel's calculated TPG ID would never match the group_id that it should. As a result, this signed char overflow also caused the ALUA handler to incorrectly identify the Asymmetric Access State (AAS) of the specified device as well as incorrectly interpret the supported AAS of the target. With this update, the aforementioned issue has been addressed and no longer occurs.
- BZ#680140
Deleting an SCSI (Small Computer System Interface) device attached to a device handler caused applications running in user space, which were performing I/O operations on that device, to become unresponsive. This was due to the fact that the SCSI device handler's activation did not propagate the SCSI device deletion via an error code and a callback to the Device-Mapper Multipath. With this update, deletion of an SCSI device attached to a device handler is properly handled and no longer causes certain applications to become unresponsive.
- BZ#647367
Migrating a guest could have resulted in dirty values for the guest being retained in memory, which could have caused both the guest and qemu to crash. The trigger for this was memory pages being both write-protected and dirty simultaneously. With this update, memory pages in the current bitmap are either dirty or write-protected when migrating a guest, with the result that neither qemu nor guest operating systems crash following a migration.
- BZ#676579
Intensive usage of resources on a guest lead to a failure of networking on that guest: packets could no longer be received. The failure occurred when a DMA (Direct Memory Access) ring was consumed before NAPI (New API; an interface for networking devices which makes use of interrupt mitigation techniques) was enabled which resulted in a failure to receive the next interrupt request. The regular interrupt handler was not affected in this situation (because it can process packets in-place), however, the OOM (Out Of Memory) handler did not detect the aforementioned situation and caused networking to fail. With this update, NAPI is subsequently scheduled for each napi_enable operation; thus, networking no longer fails under the aforementioned circumstances.
- BZ#626956
The kernel panicked when booting the kdump kernel on a s390 system with an initramfs that contained an odd number of bytes. With this update, an initramfs with sufficient padding such that it contains an even number of bytes is generated; thus, the kernel no longer panics.
- BZ#631246
Previously, the destination MAC address validation was not checking for NPIV (N_Port ID Virtualization) addresses, which results in FCoE (Fibre Channel over Ethernet) frames being dropped. With this update, the destination MAC address check for FCoE frames has been modified so that multiple N_port IDs can be multiplexed on a single physical N_port.
- BZ#641315
Reading the /proc/vmcore file on a Red Hat Enterprise Linux 6 system was not optimal because it did not always take advantage of reading through the cached memory. With this update, access to the /dev/oldmem device in the /proc/vmcore file is cached, resulting in faster copying to user space.
- BZ#665110
Bonding, when operating in the ARP monitoring mode, made erroneous assumptions regarding the ownership of ARP frames when it received them for processing. Specifically, it was assumed that the the bonding driver code was the only execution context which had access to the ARP frames network buffer data. As a result, an operation was attempted on the said buffer (specifically, to modify the size of the data buffer) which was forbidden by the kernel when a buffer was shared among several execution contexts. The result of such an operation on a shared buffer could lead to data corruption. Consequently, trying to prevent the corruption, the kernel panicked. This shared state in the network buffer could be forced to occur, for example, when running the tcpdump utility to monitor traffic on the bonding interface. Every buffer the bond interface received would be shared between the driver and the tcpdump process, thus, resulting in the aforementioned kernel panic. With this update, for the particular affected path in the bonding driver, each inbound frame is checked whether it is in the shared state. In case a buffer is shared, a private copy is made for exclusive use by the bonding driver, thus, preventing the kernel panic.
- BZ#672937
Reading the /proc/vmcore file was previously significantly slower on a Red Hat Enterprise Linux 6 system when compared to a Red Hat Enterprise Linux 5 system. This update enables caching of memory accesses; reading of the /proc/vmcore file is now noticeably faster.
- BZ#680478
The kdump kernel (the second kernel) could in some cases become unresponsive due to a pending IPI (Inter-processor Interrupt) from the first kernel. The kernel tries to handle the IPI, but fails to do so due to a NULL pointer dereference. With this update, the underlying source code has been modified to address this issue, and kdump no longer hangs.
- BZ#625585
Physical CPU Hotplug is not supported on Red Hat Enterprise Linux 6 i686.
- BZ#681870
A Peripheral Component Interconnect Express (PCIe) Active State Power Management (ASPM) was not being properly enabled on some platforms. This resulted in the system becoming unresponsive and followed by a Non-Maskable Interrupt (NMI) on some HP ProLiant systems in the Hewlett Packard Smart Array (HPSA) or on some network cards. With this update, the underlying source code has been modified to address this issue.
- BZ#694891
Intel Xeon processor E7 family processors have an issue in which some c-state transitions can cause false correctable Machine Check Exception (MCE) errors to be reported from MCE bank 6 to the user. On some E7 processor family systems, this resulted in "floods" of MCE errors. This patch disables MCE error reporting for bank 6.
- BZ#634703
Systems that have an Emulex FC controller (with SLI-3 based firmware) installed could return a kernel panic during installation. With this update, kernel panic no longer occurs during installation.
- BZ#651584
Kernel panic could occur when the gfs2_glock_hold function was called within the gfs2_process_unlinked_inode function. This was due to the fact that gfs2_glock_hold was being called without a reference already held on the inode in question. This update, resolves this problem by changing the order in which it acquires references to match that of the NFS code; thus, kernel panic no longer occurs.
- BZ#695751
A previously applied patch accidentally removed a check that handled invalid EEPROM (Electrically Erasable Programmable Read-Only Memory) sizes. Without this check the EEPROM validation failed if the EEPROM size was invalid, causing the NIC (Network Interface Controller) to not function properly. This update reintroduces the aforementioned patch, fixing the problem.
- BZ#628951
PowerPC systems having more than 1 TB of RAM could randomly crash or become unresponsive due to an incorrect setup of the Segment Lookaside Buffer (SLB) entry for the kernel stack. With this update, the SLB entry is properly set up.
- BZ#636978
Previously, the vmstat (virtual memory statistics) tool incorrectly reported the disk I/O as swap-in on ppc64 and other architectures that do not support the TRANSPARENT_HUGEPAGE configuration option in the kernel. With this update, the vmstat tool no longer reports incorrect statistics and works as expected.
- BZ#676640
The bnx2i driver could cause a system crash on IBM POWER7 systems. The driver's page tables were not set up properly on Big Endian machines, causing extended error handling (EEH) errors on PowerPC machines. With this update, the page tables are properly set up and a system crash no longer occurs in the aforementioned case.
- BZ#678099
A race condition could occur during a threaded coredump causing some threads to not have a full register set. With this update, the underlying source code has been modified to address this issue and prevent the aforementioned race condition.
- BZ#681668
If an EEH (Enhanced I/O Error Handling) error occurred too early in the boot process, the kernel panicked with an error message similar to the following:
Unable to handle kernel paging request for data at address 0x00000468
Oops: Kernel access of bad area, sig: 11 [#1]
This situation is detected and avoided with this update, with the result that the machine continues to boot normally.
- BZ#683115
A race condition caused by a missing mutual exclusion lock in the device_pm_pre_add() function and the device_pm_pre_add_cleanup() function could occur during the booting of an IBM Power system. As a result, error diagnostic messages were displayed in dmesg. This update adds the missing mutual exclusion lock, resolving this issue.
- BZ#684961
On the PowerPC architecture, a PCI adapter with two functions, one of which uses MSI (Message Signaled Interrupts), and one of which uses MSI-X (an extended version of MSI), could have triggered an EEH (Enhanced I/O Error Handling) from an MSI-X signal when MSI was disabled using an older interface. With this update, the newer interface is used to disable MSI, with the result that the adapter no longer signals a stray MSI-X interrupt, and no EEH is registered.
- BZ#694327
A previously released patch added a spin_unlock into the dtl_disable function for the virtual processor dispatch trace log file. However, the dtl_function did not include a spin_unlock which could cause a deadlock to occur. With this update, the missing spin_unlock has been added, and a deadlock no longer occurs.
- BZ#695678
Section 14.11.3.2 "H_REGISTER_VPA" in the POWER Architecture Platform Reference (PAPR) specified that Dispatch Trace Log (DTL) buffers could not cross Active Memory Sharing (AMS) environments and memory entitlement granule boundaries (of size 4kB). However, kmalloc (a method for allocating memory in the kernel) did not guarantee an alignment of the allocation beyond 8 bytes. This update adds a special kmem cache for DLT buffers with the aforementioned alignment requirement.
- BZ#593566
Certain scan requests failed to complete before the network interface was brought down. As a result, a warning will appear in the kernel log regarding wdev_cleanup_work. In some cases connectivity may be lost until the next reboot. If connectivity is restored, then the warning may be safely ignored. In other cases, the driver module may need to be reloaded or the system may need to be rebooted.
- BZ#633836
Installing a debug kernel caused the PERC (Dell PowerEdge RAID Controller) 700 adapter to enter an undefined state and produce incorrect error messages. This has been corrected so that installing the debug kernel no longer causes the PERC 700 adapter to enter an undefined state and display erroneous RAID DIMM error messages.
- BZ#664832
Systems Management Applications using the libsmbios package could become unresponsive on Dell PowerEdge servers (specifically, Dell PowerEdge 2970 and Dell PowerEdge SC1435). The dcdbas driver can perform an I/O write operation which causes an SMI (System Management Interrupt) to occur. However, the SMI handler processed the SMI well after the outb function was processed, which caused random failures resulting in the aforementioned hang. With this update, the underlying source code has been modified to address this issue, and systems management applications using the libsmbios package no longer become unresponsive.
- BZ#692673
If an error occurred during an I/O operation, the SCSI driver reset the megaraid_sas controller to restore it to normal state. However, on Red Hat Enterprise Linux 6, the waiting time to allow a full reset completion for the megaraid_sas controller was too short. The driver incorrectly recognized the controller as stalled, and, as a result, the system stalled as well. With this update, more time is given to the controller to properly restart, thus, the controller operates as expected after being reset.
- BZ#638269
The lock reclaim operation on a Red Hat Enterprise Linux 6 NFSv4 client did not work properly when, after a server reboot, an I/O operation which resulted in a STALE_STATEID response was performed before the RENEW call was sent to the server. This behavior was caused due to the improper use of the state flags. While investigating this bug, a different bug was discovered in the state recovery operation which resulted in a reclaim thread looping in the nfs4_reclaim_open_state() function. With this update, both operations have been fixed and work as expected.
- BZ#680549
Previously, the UDP (User Datagram Protocol) transmit path ran under a socket lock due to the corking feature, which limited scalability due to having to transmit to the same socket in multiple threads. With this update, the transmit path has been made lockless when corking is not used, which greatly increases UDP transmit speed.
- BZ#630060
On a system configured with an HP Smart Array controller, during the kdump process, the capturing kernel could have become unresponsive and the following error message logged:
NMI: IOCK error (debug interrupt?)
As a workaround, the system can be configured by blacklisting the hpsa module in a configuration file such as /etc/modules.d/blacklist.conf, and specifying the disk_timeout option so that saving the vmcore over the network is possible.
- BZ#700430
Under certain circumstances, a command could be left unprocessed when using either the cciss or the hpsa driver. This was because the HP Smart Array controller considered all commands to be completed when, in fact, some commands were still left in the completion queue. This could cause the file system to become read-only or panic and the whole system to become unstable. With this update, an extra read operation has been added to both of the aforementioned drivers, fixing this issue.
- BZ#617137
On platforms using an Intel 7500 or an Intel 5500 chipset (or their derivatives), occasionally, a VT-d specification defined error occurred in the kdump kernel (the second kernel). As a result of the VT-d error, on some platforms, an SMI (System Management Interrupt) was issued and the system became unresponsive. With this update, a VT-d error is properly handled so that an SMI is no longer issued, and the system no longer hangs.
- BZ#664364
Invocating an EFI (Extensible Firmware Interface) call caused a restart or a failure to boot to occur on a system with more than 512GB of memory because the EFI page tables did not map the whole kernel space. EFI page tables used only one PGD (Page Global Directory) entry to map the kernel space; thus, virtual addresses higher than PAGE_OFFSET + 512GB could not be accessed. With this update, EFI page tables map the whole kernel space.
- BZ#655231
A previously introduced patch that prevented kbuild to attempt to sign an out-of-the-tree module only fixed this issue for cases when a full kernel tree was used for compiling. Using the kernel-devel package for compilation remained broken. This update allows out-of-the-tree modules to compile using the kernel-devel package only.
- BZ#703504
Prior to this update, external modules could be built using the "-Werr" option, which resulted in a failure to build any major third party module. This update, disables the "-Werr" option for external modules, fixing the issue.