Dieser Inhalt ist in der von Ihnen ausgewählten Sprache nicht verfügbar.
7.103. kernel
Security Fixes
- CVE-2014-2523, Important
- A flaw was found in the way the Linux kernel's netfilter connection tracking implementation for Datagram Congestion Control Protocol (DCCP) packets used the skb_header_pointer() function. A remote attacker could use this flaw to send a specially crafted DCCP packet to crash the system or, potentially, escalate their privileges on the system.
- CVE-2013-6383, Moderate
- A flaw was found in the way the Linux kernel's Adaptec RAID controller (aacraid) checked permissions of compat IOCTLs. A local attacker could use this flaw to bypass intended security restrictions.
- CVE-2014-0077, Moderate
- A flaw was found in the way the handle_rx() function handled large network packets when mergeable buffers were disabled. A privileged guest user could use this flaw to crash the host or corrupt QEMU process memory on the host, which could potentially result in arbitrary code execution on the host with the privileges of the QEMU process.
Bug Fixes
- BZ#1078512
- The memory page allocation mechanism of the mlx4 driver used exclusively "order 2" allocations when allocating memory for incoming frames. This led to a high memory page allocation failure rate on systems with high memory fragmentation. With this update, the mlx4 driver firstly attempts to perform "order 3" allocations and then continues with lower order allocations up to "order 0" if the memory is too fragmented. As a result, performance of mlx4 cards is now significantly higher and mlx4 no longer generates memory page allocation failures when the system is under memory pressure.
- BZ#1091161
- Due to a ndlp list corruption bug in the lpfc driver, systems with Emulex LPe16002B-M6 PCIe 2-port 16Gb Fibre Channel Adapters could trigger a kernel panic during I/O operations. A series of patches has been backported to address this problem so the kernel no longer panics during I/O operations on the aforementioned systems.
- BZ#1064912
- Previously, the GFS2 kernel module leaked memory in the gfs2_bufdata slab cache and allowed a use-after-free race condition to be triggered in the gfs2_remove_from_journal() function. As a consequence after unmounting the GFS2 file system, the GFS2 slab cache could still contain some objects, which subsequently could, under certain circumstances, result in a kernel panic. A series of patches has been applied to the GFS2 kernel module, ensuring that all objects are freed from the slab cache properly and the kernel panic is avoided.
- BZ#1078492
- Due to a regression bug in the mlx4 driver, Mellanox mlx4 adapters could become unresponsive on heavy load along with IOMMU allocation errors being logged to the systems logs. A patch has been applied to the mlx4 driver so that the driver now calculates the last memory page fragment when allocating memory in the Rx path.
- BZ#1086845
- A system could enter a deadlock situation when the Real-Time (RT) scheduler was moving RT tasks between CPUs and the wakeup_kswapd() function was called on multiple CPUs, resulting in a kernel panic. This problem has been fixed by removing a problematic memory allocation and therefore calling the wakeup_kswapd() function from a deadlock-safe context.
- BZ#1079868
- Due to a bug in the hrtimers subsystem, the clock_was_set() function called an inter-processor interrupt (IPI) from soft IRQ context and waited to its completion, which could result in a deadlock situation. A patch has been applied to fix this problem by moving the clock_was_set() function call to the working context. Also during the resume process, the hrtimers_resume() function reprogrammed kernel timers only for the current CPU because it assumed that all other CPUs are offline. However, this assumption was incorrect in certain scenarios, such as when resuming a Xen guest with some non-boot CPUs being only stopped with IRQs disabled. As a consequence, kernel timers were not corrected on other than the boot CPU even though those CPUs were online. To resolve this problem, hrtimers_resume() has been modified to trigger an early soft IRQ to correctly reprogram kernel timers on all CPUs that are online.
- BZ#1094621
- When processing a directory with a huge amount of files (over five hundred thousand) on a GFS2 file system, the respective task could become unresponsive and memory allocation failures could occur. This happened because the GFS2 was updating atime in a memory reclamation path, resulting in occasional failures under memory pressure. To handle atime updates effectively, this update introduces a new super block operation, dirty_inode(). GFS2 now processes large directories as expected without any memory allocation failures or hanging tasks.
- BZ#1092352
- Prior to this update, a guest-provided value was used as the head length of the socket buffer allocated on the host. If the host was under heavy memory load and the guest-provided value was too large, the allocation could have failed, resulting in stalls and packet drops in the guest's Tx path. With this update, the guest-provided value has been limited to a reasonable size so that socket buffer allocations on the host succeed regardless of the memory load on the host, and guests can send packets without experiencing packet drops or stalls.
Security Fixes
- CVE-2014-0101, Important
- A flaw was found in the way the Linux kernel processed an authenticated COOKIE_ECHO chunk during the initialization of an SCTP connection. A remote attacker could use this flaw to crash the system by initiating a specially crafted SCTP handshake in order to trigger a NULL pointer dereference on the system.
Bug Fixes
- BZ#1077872
- Previously, the vmw_pwscsi driver could attempt to complete a command to the SCSI mid-layer after reporting a successful abort of the command. This led to a double completion bug and a subsequent kernel panic. This update ensures that the pvscsi_abort() function returns SUCCESS only after the abort is completed, preventing the driver from invalid attempts to complete the command.
- BZ#1028593
- A bug in the kernel's file system code allowed the d_splice_alias() function to create a new dentry for a directory with an already-existing non-DISCONNECTED dentry. As a consequence, a thread accessing the directory could attempt to take the i_mutex on that directory twice, resulting in a deadlock situation. To resolve this problem, d_splice_alias() has been modified so that in the problematic cases, it reuses an existing dentry instead of creating a new dentry.
- BZ#1063198
- Recent changes in the d_splice_alias() function introduced a bug that allowed d_splice_alias() to return a dentry from a different directory than was the directory being looked up. As a consequence in cluster environment, a kernel panic could be triggered when a directory was being removed while a concurrent cross-directory operation was performed on this directory on another cluster node. This update avoids the kernel panic in this situation by correcting the search logic in the d_splice_alias() function so that the function can no longer return a dentry from an incorrect directory.
- BZ#1078873
- The Red Hat GFS2 file system previously limited a number of ACL entries per inode to 25. However, this number was insufficient in some cases, causing the setfacl command to fail. This update increases this limit to maximum of 300 ACL entries for the 4 KB block size. If the block size is smaller, this value is adjusted accordingly.
- BZ#1078640
- A bug in the megaraid_sas driver could cause the driver to read the hardware status values incorrectly. As a consequence, the RAID card was disabled during the system boot and the system could fail to boot. With this update, the megaraid_sas driver has been corrected so that the RAID card is now enabled on system boot as expected.
- BZ#1017904
- Previously, the kernel did not support unsharing for PID name spaces. With this update, a series of patches has been applied to the relevant kernel code to support the unshare() system call for PID name spaces.
- BZ#1075553
- When allocating kernel memory, the SCSI device handlers called the sizeof() function with a structure name as its argument. However, the modified files were using an incorrect structure name, which resulted in an insufficient amount of memory being allocated and subsequent memory corruption. This update modifies the relevant sizeof() function calls to rather use a pointer to the structure instead of the structure name so that the memory is now always allocated correctly.
- BZ#1085307
- Previously, GFS2 marked files that were written to for in-core data flushing only if the file size was actually increased. When the gfs2_fsync() function was called on a file that was not marked for in-core data flushing, any metadata or journaled data was not synchronized to the disk. This could, under certain circumstances, cause writes to files that were open for synchronous I/O to return before the data was written to the disk, allowing the data to be lost during a crash. A patch has been applied to mark files correctly whenever metadata has been updated during a write, ensuring that all in-core data are written to the disk with synchronous I/O operations.
- BZ#1086590
- Due to a bug in the GFS2 resource group code, the GFS2 block allocator did not switch from using blocking locks to non-blocking locks after the selected reservation group was found unsatisfactory for the allocation request with a block reservation. As a consequence, the block allocator used only blocking locks for all resource groups since that point, greatly reducing performance of the file system unless it was periodically remounted. This update ensures that the GFS2 block allocator overrides the non-blocking lock only for the appropriate resource group, and the file system performs as expected without any intervention.
Security Fixes
- CVE-2013-4387, Important
- A flaw was found in the way the Linux kernel's IPv6 implementation handled certain UDP packets when the UDP Fragmentation Offload (UFO) feature was enabled. A remote attacker could use this flaw to crash the system or, potentially, escalate their privileges on the system.
- CVE-2013-4470, Important
- A flaw was found in the way the Linux kernel's TCP/IP protocol suite implementation handled sending of certain UDP packets over sockets that used the UDP_CORK option when the UDP Fragmentation Offload (UFO) feature was enabled on the output device. A local, unprivileged user could use this flaw to cause a denial of service or, potentially, escalate their privileges on the system.
- CVE-2013-6367, Important
- A divide-by-zero flaw was found in the apic_get_tmcct() function in KVM's Local Advanced Programmable Interrupt Controller (LAPIC) implementation. A privileged guest user could use this flaw to crash the host.
- CVE-2013-6368, Important
- A memory corruption flaw was discovered in the way KVM handled virtual APIC accesses that crossed a page boundary. A local, unprivileged user could use this flaw to crash the system or, potentially, escalate their privileges on the system.
- CVE-2013-6381, Important
- A buffer overflow flaw was found in the way the qeth_snmp_command() function in the Linux kernel's QETH network device driver implementation handled SNMP IOCTL requests with an out-of-bounds length. A local, unprivileged user could use this flaw to crash the system or, potentially, escalate their privileges on the system.
- CVE-2013-4591, Moderate
- It was found that the fix for CVE-2012-2375 released via RHSA-2012:1580 accidentally removed a check for small-sized result buffers. A local, unprivileged user with access to an NFSv4 mount with ACL support could use this flaw to crash the system or, potentially, escalate their privileges on the system.
- CVE-2013-2851, Low
- A format string flaw was found in the Linux kernel's block layer. A privileged, local user could potentially use this flaw to escalate their privileges to kernel level (ring0).
Bug Fixes
- BZ#1063353
- Previously, the sysfs_dev_char_kobj variable was freed on shutdown, but the variable could be used later by the USB stack, and possibly other code, which could cause the system to terminate unexpectedly. The underlying source code has been modified to prevent "kobjects" in the device_shutdown() function from being removed, as the /sys/dev/block/ and /sys/dev/char/ directories must be kept because of the symbolic links pointing to the devices. As a result, the system no longer crashes in the described scenario.
- BZ#1062112
- Previously, when hot adding memory to the system, the memory management subsystem always performed unconditional page-block scans for all memory sections being set online. The total duration of the hot add operation depends on both, the size of memory that the system already has and the size of memory that is being added. Therefore, the hot add operation took an excessive amount of time to complete if a large amount of memory was added or if the target node already had a considerable amount of memory. This update optimizes the code so that page-block scans are performed only when necessary, which greatly reduces the duration of the hot add operation.
- BZ#1058417
- When performing read operations on an XFS file system, failed buffer readahead can leave the buffer in the cache memory marked with an error. This could lead to incorrect detection of stale errors during completion of an I/O operation because most callers do not zero out the b_error field of the buffer on a subsequent read. To avoid this problem and ensure correct I/O error detection, the b_error field of the used buffer is now zeroed out before submitting an I/O operation on a file.
- BZ#1060490
- When transferring a large amount of data over the peer-to-peer (PPP) link, a rare race condition between the throttle() and unthrottle() functions in the tty driver could be triggered. As a consequence, the tty driver became unresponsive, remaining in the throttled state, which resulted in the traffic being stalled. Also, if the PPP link was heavily loaded, another race condition in the tty driver could has been triggered. This race allowed an unsafe update of the available buffer space, which could also result in the stalled traffic. A series of patches addressing both race conditions has been applied to the tty driver; if the first race is triggered, the driver loops and forces re-evaluation of the respective test condition, which ensures uninterrupted traffic flow in the described situation. The second race is now completely avoided due to a well-placed read lock, and the update of the available buffer space proceeds correctly.
- BZ#1059990
- Due to a bug in the SELinux socket receive hook, network traffic was not dropped upon receiving a peer:recv access control denial on some configurations. A broken labeled networking check in the SELinux socket receive hook has been corrected, and network traffic is now properly dropped in the described case.
- BZ#1059382
- Due to a bug in ext4 metadata allocation code, the number of metadata blocks needed to complete a file system operation could be calculated incorrectly. Consequently, when performing file system operations on a nearly full ext4 file system, unexpected allocation failures could occur at writeback time, leading to possible data loss and file system inconsistency. A series of patches has been applied, fixing metadata allocation estimation problems and introducing a reserved space concept that ensures correct allocation of metadata in specific situations, such as the aforementioned scenario.
- BZ#1055363
- Previously, certain SELinux functions did not correctly handle the TCP synchronize-acknowledgment (SYN-ACK) packets when processing IPv4 labeled traffic over an INET socket. The initial SYN-ACK packets were labeled incorrectly by SELinux, and as a result, the access control decision was made using the server socket's label instead of the new connection's label. In addition, SELinux was not properly inspecting outbound labeled IPsec traffic, which led to similar problems with incorrect access control decisions. A series of patches that addresses these problems has been applied to SELinux. The initial SYN-ACK packets are now labeled correctly and SELinux processes all SYN-ACK packets as expected.
- BZ#1041143
- A bug in the mlx4 driver could trigger a race between the "blue flame" feature's traffic flow and the stamping mechanism in the Tx ring flow when processing Work Queue Elements (WQEs) in the Tx ring. Consequently, the related queue pair (QP) of the mlx4 Ethernet card entered an error state and the traffic on the related Tx ring was blocked. A patch has been applied to the mlx4 driver so that the driver does not stamp the last completed WQE in the Tx ring, and thus avoids the aforementioned race.
- BZ#1058419
- Previously, the e752x_edac module incorrectly handled the pci_dev usage count, which could reach zero and deallocate a PCI device structure. As a consequence, a kernel panic could occur when the module was loaded multiple times on some systems. This update fixes the usage count that is triggered by loading and unloading of the module repeatedly, and a kernel panic no longer occurs.
- BZ#1048098
- Previously, task management commands in the lpfc driver had a fixed timeout value of 60 seconds, which could pose a problem for error handling. The lpfc driver has been upgraded to version 8.3.7.21.2p in order to include a fix of this problem. The timeout of the task management commands is now adjustable in range from 5 to 180 seconds, and by default, it is set to 60 seconds.
- BZ#1046042
- Inefficient usage of Big Kernel Locks (BKLs) in the ptrace() system call could lead to BKL contention on certain systems that widely utilize ptrace(), such as User-mode Linux (UML) systems, resulting in degraded performance on these systems. This update removes the relevant BKLs from the ptrace() system call, thus resolving any related performance issues.
- BZ#1038122
- An improper function call in a previous kernel patch backport caused the PID namespace nesting to malfunction. This could adversely affect the proper functioning of other components, such as the Linux Container (LXC) driver, that rely on nested PID namespace usage. A patch has been applied to correct this problem so that nested PID namespaces can be used as expected.
- BZ#1056143
- Previously, GFS2 marked files that were written to for in-core data flushing only if the file size was actually increased. When the gfs2_fsync() function was called on a file that was not marked for in-core data flushing, any metadata or journaled data was not synchronized to the disk. This could, under certain circumstances, cause writes to files that were open for synchronous I/O to return before the data was written to the disk, allowing the data to be lost during a crash. A patch has been applied to mark files correctly whenever metadata has been updated during a write, ensuring that all in-core data are written to the disk with synchronous I/O operations.
Bug Fixes
- BZ#962894
- When extending memory, the hot-add operation could fail while the machine was under memory pressure, causing a kernel panic. A patch has been applied to improve the memory hot-add operation and this problem can now occur only in extremely rare occasions.
- BZ#1030168
- A bug in the netpoll transmit (Tx) code path that is used for netconsole logging could lead to various problems with bonding devices, for example, an invalid Tx queue index could have been used. To avoid these problems, an upstream patch has been backported to allow netpoll calling the external netdev_pick_tx() function from the netpoll_send_skb_on_dev() function.
- BZ#1014968
- The igb driver previously used a 16-bit mask when writing values of the flow control high-water mark to hardware registers on a network device. Consequently, the values were truncated on some network device, disrupting the flow control. A patch has been applied to the igb driver so that it now uses 32-bit mask as expected.
- BZ#1006167
- Previously, mounting a GFS2 file system in spectator mode on a cluster node was not possible if no other cluster node had already mounted this GFS2 file system. In such a case, a "file system consistency error" occurred and the GFS2 file system was withdrawn. A patch has been applied to allow the first cluster node mounting a GFS2 file system in spectator mode if all the file system journals are clean.
- BZ#1006388
- Due to a bug in the transmit path of the bonding driver, a buffer for the bonding device queue mapping could become corrupted. As a consequence, a kernel panic could occur or the system could become unresponsive in certain environments, such as is running a KVM guest in the Red Hat Enterprise Virtualization (RHEV) hypervisor with netconsole enabled and a bonding device over a network bridge configured. A patch has been applied to save the bonding device queue mapping buffer properly, and buffer corruption in this scenario is now prevented.
- BZ#1006664
- A bug in the kernel's super block code allowed a race between the get_active_super() and umount() functions that could lead to a use-after-free issue, resulting in a kernel oops. An upstream patch has been backported to fix this problem so that get_active_super() repeats attempts to obtain the active super block until it succeeds. The aforementioned race no longer occurs.
- BZ#1008508
- A kernel panic could occur during path failover on systems using multiple iSCSI, FC or SRP paths to connect an iSCSI initiator and an iSCSI target. This happened because a race condition in the SCSI driver allowed removing a SCSI device from the system before processing its run queue, which led to a NULL pointer dereference. The SCSI driver has been modified and the race is now avoided by holding a reference to a SCSI device run queue while it is active.
- BZ#1009251
- When a driver does not support namespace, the user must use the VLAN splinter feature from Open vSwitch to support VLANs and TCP traffic. However,when using the be2net driver and the VLAN splinter feature was enabled, the floating IP traffic could fail. This bug has been fixed and incompatibilities no longer occur, when using the VLAN splinter feature with the be2net driver.
- BZ#1019614
- When removing neigh entries, the list_del() function removed the neigh entry from the associated struct ipoib_path, while the ipoib_neigh_free() function removed the neigh entry from the device's neigh entry lookup table. Both of these operations were protected by a spinlock. However, the table was also protected by RCU kernel locking, and thus the spinlock was not held when performing read operations. Consequently, a race condition occurred, in which a thread could successfully look up a neigh entry that has already been deleted from the list of neighbor characters, but the previous deletion had marked the entry as "poisoned", and list_del() on the object caused a kernel panic. The list_del() function has been into ipoib_neigh_free(), so that deletion happens only once, after the entry has been successfully removed from the lookup table, thus fixing the bug.
- BZ#1010451
- If the arp_interval and arp_validate bonding options were not enabled on the configured bond device in the correct order, the bond device did not process ARP replies, which led to link failures and changes of the active slave device. A series of patches has been applied to modify an internal bond ARP hook based on the values of arp_validate and arp_interval. Therefore, the ARP hook is registered even if arp_interval is set after arp_validate has already been enabled, and ARP replies are processed as expected.
- BZ#1018966
- When GFS2 files were unlinked, sometimes they were not deleted completely. This could happen because when multiple nodes in a cluster accessed the same deleted file, the node responsible for freeing the "unlinked" blocks could not have been determined properly. Consequently, many deleted dinode blocks that should have been freed were often left in an "unlinked" state. With this update, the responsibility handover for deleting unlinked dinodes is accomplished through a mechanism known as the "iopen" glock. The "iopen" glocks are no longer cached by nodes after the point where it becomes impossible to free the dinode blocks. As a result, the dinode blocks for unlinked dinodes are now freed properly by the last process to close the file.
- BZ#1017905
- When the Audit subsystem was under heavy load, it could loop infinitely in the audit_log_start() function instead of failing over to the error recovery code. This could cause soft lockups in the kernel. With this update, the timeout condition in the audit_log_start() function has been modified to properly fail over when necessary.
- BZ#1018965
- Due to a race condition in the kernel's key management code, any process searching for a key in a keyring could dereference a NULL pointer while that key was instantiated as negative. This led to a kernel panic. A patch to fix this bug has been provided so that the kernel now handles the aforementioned situation properly without triggering the race.
- BZ#1016108
- The crypto_larval_lookup() function could return a larval, an in-between state when a cryptographic algorithm is being registered, even if it did not create one. This could cause a larval to be terminated twice, and result in a kernel panic. This occurred for example when the NFS service was running in FIPS mode, and attempted to use the MD5 hashing algorithm even though FIPS mode has this algorithm blacklisted. A condition has been added to the crypto_larval_lookup() function to check whether a larval was created before returning it.
- BZ#1012049
- Previously, the tcp_ioctl() function tried to take into account if a TCP socket has received a packet with a FIN flag in order to report the correct number of bytes in the receive queue. However, in certain cases, the reported number of bytes in the receive queue was incorrect. This bug has been fixed by using an improved way to detect if a TCP packet with a FIN flag has been received.
- BZ#988807
- Previously, on systems with RAID10 arrays defined, stack memory could become corrupted due to an insufficient amount of memory being allocated for a dynamically sized kernel data structure, leading to a kernel panic. This bug has been fixed and RAID10 arrays can now safely run without the risk of causing a kernel panic.
- BZ#1009756
- Due to the way the VFS code resolves dentry lookups, a race between multiple threads could have been triggered if the threads performed lookups on the same FUSE dentry subtree that contained an invalid (or stale) dentry or inode. As a consequence, the threads could fail with an ENOENT error instead of properly resolving a new dentry or inode. This update applies a series of patches to the FUSE code that addresses this problem and the aforementioned race can no longer occur.
- BZ#1004661
- Previously, the Hyper-V utility services negotiated the highest version of the Key-Value Pair (KVP) protocol that a Windows Server 2012 R2 host advertised but the host implemented a KVP protocol version that was not compatible with prior versions of the KVP protocol. Consequently, the IP injection functionality did not work on the latest Windows Server 2012 R2 host. This update explicitly specifies the KVP protocol version that the guest can support.
- BZ#1012495
- When a userspace process was reading the /proc/$PID/pagemap file, a memory leak could occur. An upstream patch has been provided to fix this bug, and memory usage before and after the mm_leak call is now the same.
- BZ#1020994
- Previously, when a CPU was brought offline, a race window occurred. During the race window, if an inter processor interrupt (IPI) was received, it got lost. As a consequence, the system became unresponsive. To fix this bug, a check has been added to the __cpu_disable() function, which executes the enqueued but not yet received IPIs before the CPU is marked offline.
- BZ#1023350
- Previously, when the user added an IPv6 route for local delivery, the route did not work and packets could not be sent. A patch has been applied to limit the neighbor entry creation only for input flow, thus fixing this bug. As a result, IPv6 routes for local delivery now work as expected.
- BZ#1014687, BZ#1025736
- The qla2xxx driver did not use any locking mechanism when passing information between its ISR and mailbox routines. Under certain conditions, this led to multiple mailbox command completions being signaled, which, in turn, led to a false mailbox timeout error for the subsequently issued mailbox command. This bug has been fixed and a mailbox timeout error no longer occurs in this scenario.
Enhancements
- BZ#1011168
- With this update, the missing values for the PG_buddy variable have been added to the kexec system call in order to increase dump performance relating to the buddy system for filtering free pages.
- BZ#990483
- Support for the fallocate method has been added to Filesystem in Userspace (FUSE). This method allows the caller to preallocate and deallocate blocks of a file.
Security Fixes
- CVE-2013-4162, Moderate
- A flaw was found in the way the Linux kernel's TCP/IP protocol suite implementation handled IPv6 sockets that used the UDP_CORK option. A local, unprivileged user could use this flaw to cause a denial of service.
- CVE-2013-4299, Moderate
- An information leak flaw was found in the way Linux kernel's device mapper subsystem, under certain conditions, interpreted data written to snapshot block devices. An attacker could use this flaw to read data from disk blocks in free space, which are normally inaccessible.
Bug Fixes
- BZ#987261
- Due to a bug in the NFS code, kernel size-192 and size-256 slab caches could leak memory. This could eventually result in an OOM issue when the most of available memory was used by the respective slab cache. A patch has been applied to fix this problem and the respective attributes in the NFS code are now freed properly.
- BZ#987262
- NFS previously allowed extending an NFS file write to cover a full page only if the file had not set a byte-range lock. However, extending the write to cover the entire page is sometimes desirable in order to avoid fragmentation inefficiencies. For example, a noticeable performance decrease was reported if a series of small non-contiguous writes was performed on the file. A patch has been applied to the NFS code that allows NFS extending a file write to a full page write if the whole file is locked for writing or if the client holds a write delegation.
- BZ#988228
- A change in the ipmi_si driver handling caused an extensively long delay while booting Red Hat Enterprise Linux 6.4 on SIG UV platforms. The driver was loaded as a kernel module on previous versions of Red Hat Enterprise Linux 6 while it is now built within the kernel. However, SIG UV does not use, and thus does not support the ipmi_si driver. A patch has been applied and the kernel now does not initialize the ipmi_si driver when booting on SIG UV.
- BZ#988384
- The GFS2 did not reserve journal space for a quota change block while growing the size of a file. Consequently, a fatal assertion causing a withdraw of the GFS2 file system could have been triggered when the free blocks were allocated from the secondary bitmap. With this update, GFS2 reserves additional blocks in the journal for the quota change so the file growing transaction can now complete successfully in this situation.
- BZ#988708
- A dentry leak occurred in the FUSE code when, after a negative lookup, a negative dentry was neither dropped nor was the reference counter of the dentry decremented. This triggered a BUG() macro when unmounting a FUSE subtree containing the dentry, resulting in a kernel panic. A series of patches related to this problem has been applied to the FUSE code and negative dentries are now properly dropped so that triggering the BUG() macro is now avoided.
- BZ#991346
- The fnic driver previously allowed I/O requests with the number of SGL descriptors greater than is supported by Cisco UCS Palo adapters. Consequently, the adapter returned any I/O request with more than 256 SGL descriptors with an error indicating invalid SGLs. A patch has been applied to limit the maximum number of supported SGLs in the fnic driver to 256 and the problem no longer occurs.
- BZ#993544
- An NFS server could terminate unexpectedly due to a NULL pointer dereference caused by a rare race condition in the lockd daemon. An applied patch fixes this problem by protecting the relevant code with spin locks, and thus avoiding the race in lockd.
- BZ#993547
- The kernel interface to ACPI had implemented error messaging incorrectly. The following error message was displayed when the system had a valid ACPI Error Record Serialization Table (ERST) and the pstore.backend kernel parameter had been used to disable use of ERST by the pstore interface:
ERST: Could not register with persistent store
However, the same message was also used to indicate errors precluding registration. A series of patches modifies the relevant ACPI code so that ACPI now properly distinguish between different cases and accordingly prints unique and informative messages. - BZ#994140
- Due a bug in the memory mapping code, the fadvise64() system call sometimes did not flush all the relevant pages of the given file from cache memory. A patch addresses this problem by adding a test condition that verifies whether all the requested pages were flushed and retries with an attempt to empty the LRU pagevecs in the case of test failure.
- BZ#994866
- A previous patch to the CIFS code caused a regression of a problem where under certain conditions, a mount attempt of a CIFS DFS share fails with a "mount error(6): No such device or address" error message. This happened because the return code variable was not properly reset after a previous unsuccessful mount attempt. A backported patch has been applied to properly reset the variable and CIFS DFS shares can now be mounted as expected.
- BZ#994867
- Previously, systems running heavily-loaded NFS servers could experience poor performance of the NFS READDIR operations on large directories that were undergoing concurrent modifications, especially over higher latency connections. This happened because the NFS code performed certain dentry operations inefficiently and revalidated directory attributes too often. This update applies a series of patches that address the problem as follows; needed dentries can be accessed from dcache after the READDIR operation, and directory attributes are revalidated only at the beginning of the directory or if the cached attributes expire.
- BZ#995334
- A previous change in the bridge multicast code allowed sending general multicast queries in order to achieve faster convergence on startup. To prevent interference with multicast routers, send packets contained a zero source IP address. However, these packets interfered with certain multicast-aware switches, which resulted in the system being flooded with the IGMP membership queries with zero source IP address. A series of patches addresses this problem by disabling multicast queries by default and implementing multicast querier that allows to toggle up sending of general multicast queries if needed.
- BZ#995458
- When a slave device started up, the current_arp_slave parameter was unset but the active flags on the slave were not marked inactive. Consequently, more than one slave device with active flags in active-backup mode could be present on the system. A patch has been applied to fix this problem by marking the active flags inactive for a slave device before the current_arp_slave parameter is unset.
- BZ#996014
- An infinite loop bug in the NFSv4 code caused an NFSv4 mount process to hang on a busy loop of the LOOKUP_ROOT operation when attempting to mount an NFSv4 file system and the first iteration on this operation failed. A patch has been applied that allows to exit the LOOKUP_ROOT operation properly and a mount attempt now either succeeds or fails in this situation.
- BZ#996424
- An NFS client previously did not wait for completing of unfinished I/O operations before sending the LOCKU and RELEASE_LOCKOWNER operations to the NFS server in order to release byte range locks on files. Consequently, if the server processed the LOCKU and RELEASE_LOCKOWNER operations before some of the related READ operations, it released all locking states associated with the requested lock owner, and the READs returned the NFS4ERR_BAD_STATEID error code. This resulted in the "Lock reclaim failed!" error messages being generated in the system log and the NFS client had to recover from the error. A series of patches has been applied ensuring that an NFS client waits for all outstanding I/O operations to complete before releasing the locks.
- BZ#997746
- A previous patch to the bridge multicast code introduced a bug allowing reinitialization of an active timer for a multicast group whenever an IPv6 multicast query was received. A patch has been applied to the bridge multicast code so that a bridge multicast timer is no longer reinitialized when it is active.
- BZ#997916
- An use-after-free issue in the PPS (Pulse-per-second) driver could cause the kernel to crash when unregistering the PPS source. A patch has been applied to resolve this problem so the respective char device is now removed from the system prior to its deallocating. The patch also prevents deallocating a PPS device with open file descriptors.
- BZ#999328
- Previously, power-limit notification interrupts were enabled by default on the system. This could lead to degradation of system performance or even render the system unusable on certain platforms, such as Dell PowerEdge servers. A patch has been applied to disable power-limit notification interrupts by default and a new kernel command line parameter "int_pln_enable" has been added to allow users observing these events using the existing system counters. Power-limit notification messages are also no longer displayed on the console. The affected platforms no longer suffer from degraded system performance due to this problem.
- BZ#1000314
- A bug in the autofs4 mount expiration code could cause the autofs4 module to falsely report a busy tree of NFS mounts as "not in use". Consequently, automount attempted to unmount the tree and failed with a "failed to umount offset" error, leaving the mount tree to appear as empty directories. A patch has been applied to remove an incorrectly used autofs dentry mount check and the aforementioned problem no longer occurs.
- BZ#1001954
- An insufficiently designed calculation in the CPU accelerator could cause an arithmetic overflow in the set_cyc2ns_scale() function if the system uptime exceeded 208 days prior to using kexec to boot into a new kernel. This overflow led to a kernel panic on the systems using the Time Stamp Counter (TSC) clock source, primarily the systems using Intel Xeon E5 processors that do not reset TSC on soft power cycles. A patch has been applied to modify the calculation so that this arithmetic overflow and kernel panic can no longer occur under these circumstances.
- BZ#1001963
- Due to a bug in firmware, systems using the LSI MegaRAID controller failed to initialize this device in the kdump kernel if the "intel_iommu=on" and "iommu=pt"kernel parameters were specified in the first kernel. As a workaround until a firmware fix is available, a patch to the megaraid_sas driver has been applied so if the firmware is not in the ready state upon the first attempt to initialize the controller, the driver resets the controller and retries for firmware transition to the ready state.
- BZ#1002184
- Due to a bug in the SCTP code, a NULL pointer dereference could occur when freeing an SCTP association that was hashed, resulting in a kernel panic. A patch addresses this problem by trying to unhash SCTP associations before freeing them and the problem no longer occurs.
- BZ#1003765
- The RAID1 and RAD10 code previously called the raise_barrier() and lower_barrier() functions instead of the freeze_array() and unfreeze_array() functions that are safe being called from within the management thread. As a consequence, a deadlock situation could occur if an MD array contained a spare disk, rendering the respective kernel thread unresponsive. Furthermore, if a shutdown sequence was initiated after this problem had occurred, the shutdown sequence became unresponsive and any in-cache file system data that were not synchronized to the disk were lost. A patch correcting this problem has been applied and the RAID1 and RAID10 code now uses management-thread safe functions as expected.
- BZ#1003931
- A function in the RPC code responsible for verifying whether the cached credentials matches the current process did not perform the check correctly. The code checked only whether the groups in the current process credentials appear in the same order as in the cached credential but did not ensure that no other groups are present in the cached credentials. As a consequence, when accessing files in NFS mounts, a process with the same UID and GID as the original process but with a non-matching group list could have been granted an unauthorized access to a file, or under certain circumstances, the process could have been wrongly prevented from accessing the file. The incorrect test condition has been fixed and the problem can no longer occur.
- BZ#1004657
- The xen-netback and xen-netfront drivers cannot handle packets with size greater than 64 KB including headers. The xen-netfront driver previously did not account for any headers when determining the maximum size of GSO (Generic Segmentation Offload). Consequently, Xen DomU guest operations could have caused a network DoS issue on DomU when sending packets larger than 64 KB. This update adds a patch that corrects calculation of the GSO maximum size and the problem no longer occurs.
- BZ#1006932
- A bug in the real-time (RT) scheduler could cause a RT priority process to stop running due to an invalid attribute of the run queue. When a CPU became affected by this bug, the migration kernel thread stopped running on the CPU, and subsequently every other process that was migrated to the affected CPU by the system stopped running as well. A patch has been applied to the RT scheduler and RT priority processes are no longer affected this problem.
- BZ#1006956
- A patch included in kernel version 2.6.32-358.9.1.el6, to fix handling of revoked NFSv4 delegations, introduced a regression bug to the NFSv4 code. This regression in the NFSv4 exception and asynchronous error handling allowed, under certain circumstances, passing a NULL inode to an NFSv4 delegation-related function, which resulted in a kernel panic. The NFSv4 exception and asynchronous error handling has been fixed so that a NULL inode can no longer be passed in this situation.
Security Fixes
- CVE-2013-2206, Important
- A flaw was found in the way the Linux kernel's Stream Control Transmission Protocol (SCTP) implementation handled duplicate cookies. If a local user queried SCTP connection information at the same time a remote attacker has initialized a crafted SCTP connection to the system, it could trigger a NULL pointer dereference, causing the system to crash.
- CVE-2013-2224, Important
- It was found that the fix for CVE-2012-3552 released via RHSA-2012:1304 introduced an invalid free flaw in the Linux kernel's TCP/IP protocol suite implementation. A local, unprivileged user could use this flaw to corrupt kernel memory via crafted sendmsg() calls, allowing them to cause a denial of service or, potentially, escalate their privileges on the system.
- CVE-2013-2146, Moderate
- A flaw was found in the Linux kernel's Performance Events implementation. On systems with certain Intel processors, a local, unprivileged user could use this flaw to cause a denial of service by leveraging the perf subsystem to write into the reserved bits of the OFFCORE_RSP_0 and OFFCORE_RSP_1 model-specific registers.
- CVE-2013-2232, Moderate
- An invalid pointer dereference flaw was found in the Linux kernel'sTCP/IP protocol suite implementation. A local, unprivileged user could use this flaw to crash the system or, potentially, escalate their privileges on the system by using sendmsg() with an IPv6 socket connected to an IPv4 destination.
- CVE-2012-6544, Low
- Information leak flaws in the Linux kernel's Bluetooth implementation could allow a local, unprivileged user to leak kernel memory to user-space.
- CVE-2013-2237, Low
- An information leak flaw in the Linux kernel could allow a privileged, local user to leak kernel memory to user-space.
Bug Fixes
- BZ#956054
- The kernel could rarely terminate instead of creating a dump file when a multi-threaded process using FPU aborted. This happened because the kernel did not wait until all threads became inactive and attempted to dump the FPU state of active threads into memory which triggered a BUG_ON() routine. A patch addressing this problem has been applied and the kernel now waits for the threads to become inactive before dumping their FPU state into memory.
- BZ#959930
- Due to the way the CPU time was calculated, an integer multiplication overflow bug could occur after several days of running CPU bound processes that were using hundreds of kernel threads. As a consequence, the kernel stopped updating the CPU time and provided an incorrect CPU time instead. This could confuse users and lead to various application problems. This update applies a patch fixing this problem by decreasing the precision of calculations when the stime and rtime values become too large. Also, a bug allowing stime values to be sometimes erroneously calculated as utime values has been fixed.
- BZ#963557
- Due to several bugs in the ext4 code, data integrity system calls did not always properly persist data on the disk. Therefore, the unsynchronized data in the ext4 file system could have been lost after the system's unexpected termination. A series of patches has been applied to the ext4 code to address this problem, including a fix that ensures proper usage of data barriers in the code responsible for file synchronization. Data loss no longer occurs in the described situation.
- BZ#974597
- A previous patch that modified dcache and autofs code caused a regression. Due to this regression, unmounting a large number of expired automounts on a system under heavy NFS load caused soft lockups, rendering the system unresponsive. If a "soft lockup" watchdog was configured, the machine rebooted. To fix the regression, the erroneous patch has been reverted and the system now handle the aforementioned scenario properly without any soft lockups.
- BZ#975576
- A system could become unresponsive due to an attempt to shut down an XFS file system that was waiting for log I/O completion. A patch to the XFS code has been applied that allows for the shutdown method to be called from different contexts so XFS log items can be deleted properly even outside the AIL, which fixes this problem.
- BZ#975578
- XFS file systems were occasionally shut down with the "xfs_trans_ail_delete_bulk: attempting to delete a log item that is not in the AIL" error message. This happened because the EFI/EFD handling logic was incorrect and the EFI log item could have been freed before it was placed in the AIL and committed. A patch has been applied to the XFS code fixing the EFI/EFD handling logic and ensuring that the EFI log items are never freed before the EFD log items are processed. The aforementioned error no longer occurs on an XFS shutdown.
- BZ#977668
- A race condition between the read_swap_cache_async() and get_swap_page() functions in the memory management (mm) code could lead to a deadlock situation. The deadlock could occur only on systems that deployed swap partitions on devices supporting block DISCARD and TRIM operations if kernel preemption was disabled (the !CONFIG_PREEMPT parameter). If the read_swap_cache_async() function was given a SWAP_HAS_CACHE entry that did not have a page in the swap cache yet, a DISCARD operation was performed in the scan_swap_map() function. Consequently, completion of an I/O operation was scheduled on the same CPU's working queue the read_swap_cache_async() was running on. This caused the thread in read_swap_cache_async() to loop indefinitely around its "-EEXIST" case, rendering the system unresponsive. The problem has been fixed by adding an explicit cond_resched() call to read_swap_cache_async(), which allows other tasks to run on the affected CPU, and thus avoiding the deadlock.
- BZ#977680, BZ#989923
- A previous change in the port auto-selection code allowed sharing ports with no conflicts extending its usage. Consequently, when binding a socket with the SO_REUSEADDR socket option enabled, the bind(2) function could allocate an ephemeral port that was already used. A subsequent connection attempt failed in such a case with the EADDRNOTAVAIL error code. This update applies a patch that modifies the port auto-selection code so that bind(2) now selects a non-conflict port even with the SO_REUSEADDR option enabled.
- BZ#979293
- Cyclic adding and removing of the st kernel module could previously cause a system to become unresponsive. This was caused by a disk queue reference count bug in the SCSI tape driver. An upstream patch addressing this bug has been backported to the SCSI tape driver and the system now responds as expected in this situation.
- BZ#979912
- On KVM guests with the KVM clock (kvmclock) as a clock source and with some VCPUs pinned, certain VCPUs could experience significant sleep delays (elapsed time was greater 20 seconds). This resulted in unexpected delays by sleeping functions and inaccurate measurement for low latency events. The problem happened because a kvmclock update was isolated to a certain VCPU so the NTP frequency correction applied only to that single VCPU. This problem has been resolved by a patch allowing kvmclock updates to all VCPUs on the KVM guest. VCPU sleep time now does not exceed the expected amount and no longer causes the aforementioned problems.
- BZ#981177
- When using applications that intensively utilized memory mapping, customers experienced significant application latency, which led to serious performance degradation. A series of patches has been applied to fix the problem. Among other, the patches modifies the memory mapping code to allow block devices to require stable page writes, enforce stable page writes only if required by a backing device, and optionally snapshot page content to provide stable pages during write. As a result, application latency has been improved by a considerable amount and applications with high demand of memory mapping now perform as expected.
- BZ#982116
- The bnx2x driver could have previously reported an occasional MDC/MDIO timeout error along with the loss of the link connection. This could happen in environments using an older boot code because the MDIO clock was set in the beginning of each boot code sequence instead of per CL45 command. To avoid this problem, the bnx2x driver now sets the MDIO clock per CL45 command. Additionally, the MDIO clock is now implemented per EMAC register instead of per port number, which prevents ports from using different EMAC addresses for different PHY accesses. Also, a boot code or Management Firmware (MFW) upgrade is required to prevent the boot code (firmware) from taking over link ownership if the driver's pulse is delayed. The BCM57711 card requires boot code version 6.2.24 or later, and the BCM57712/578xx cards require MFW version 7.4.22 or later.
- BZ#982472
- If the Audit queue is too long, the kernel schedules the kauditd daemon to alleviate the load on the Audit queue. Previously, if the current Audit process had any pending signals in such a situation, it entered a busy-wait loop for the duration of an Audit backlog timeout because the wait_for_auditd() function was called as an interruptible task. This could lead to system lockup in non-preemptive uniprocessor systems. This update fixes the problem by setting wait_for_auditd() as uninterruptible.
- BZ#982496
- A possible race in the tty layer could result in a kernel panic after triggering the BUG_ON() macro. As a workaround, the BUG_ON() macro has been replaced by the WARN_ON() macro, which allows for avoiding the kernel panic and investigating the race problem further.
- BZ#982571
- A recent change in the memory mapping code introduced a new optional next-fit algorithm for allocating VMAs to map processed files to the address space. This change, however, broke behavior of a certain internal function which then always followed the next-fit VMA allocation scheme instead of the first-fit VMA allocation scheme. Consequently, when the first-fit VMA allocation scheme was in use, this bug caused linear address space fragmentation and could lead to early "-ENOMEM" failures for mmap() requests. This patch restores the original first-fit behavior to the function so the aforementioned problems no longer occur.
- BZ#982697
- When using certain HP hardware with UHCI HDC support and the uhci-hdc driver performed the auto-stop operation, the kernel emitted the "kernel: uhci_hcd 0000:01:00.4: Controller not stopped yet!" warning messages. This happened because HP's virtual UHCI host controller takes extremely long time to suspend (several hundred microseconds) even with no attached USB device and the driver was not adjusted to handle this situation. To avoid this problem, the uhci-hdc driver has been modified to not run the auto-stop operation until the controller is suspended.
- BZ#982703
- A previously released erratum, RHSA-2013:0911, included a patch that added support for memory configurations greater than 1 TB of RAM on AMD systems, and a patch that fixed a kernel panic preventing installation of Red Hat Enterprise Linux on such systems. However, these patches broke booting of Red Hat Enterprise Linux 6.4 on the SGI UV platform, and therefore they have been reverted with this update. Red Hat Enterprise Linux 6.4 now boots on SGI UV as expected.
- BZ#982758
- Due to a bug in descriptor handling, the ioat driver did not correctly process pending descriptors on systems with the Intel Xeon Processor E5 family. Consequently, the CPU was utilized excessively on these systems. A patch has been applied to the ioat driver so the driver now determines pending descriptors correctly and CPU usage is normal again for the described processor family.
- BZ#990464
- A bug in the network bridge code allowed an internal function to call code which was not atomic-safe while holding a spin lock. Consequently, a "BUG: scheduling while atomic" error has been triggered and a call trace logged by the kernel. This update applies a patch that orders the function properly so the function no longer holds a spin lock while calling code which is not atomic-safe. The aforementioned error with a call trace no longer occurs in this case.
- BZ#990470
- A race condition in the abort task and SPP device task management path of the isci driver could, under certain circumstances, cause the driver to fail cleaning up timed-out I/O requests that were pending on an SAS disk device. As a consequence, the kernel removed such a device from the system. A patch applied to the isci driver fixes this problem by sending the task management function request to the SAS drive anytime the abort function is entered and the task has not completed. The driver now cleans up timed-out I/O requests as expected in this situation.
Security Fixes
- CVE-2013-2128, Moderate
- A flaw was found in the tcp_read_sock() function in the Linux kernel's IPv4 TCP/IP protocol suite implementation in the way socket buffers (skb) were handled. A local, unprivileged user could trigger this issue via a call to splice(), leading to a denial of service.
- CVE-2012-6548, CVE-2013-2634, CVE-2013-2635, CVE-2013-3222, CVE-2013-3224, CVE-2013-3225, Low
- Information leak flaws in the Linux kernel could allow a local, unprivileged user to leak kernel memory to user-space.
- CVE-2013-0914, Low
- An information leak was found in the Linux kernel's POSIX signals implementation. A local, unprivileged user could use this flaw to bypass the Address Space Layout Randomization (ASLR) security feature.
- CVE-2013-1848, Low
- A format string flaw was found in the ext3_msg() function in the Linux kernel's ext3 file system implementation. A local user who is able to mount an ext3 file system could use this flaw to cause a denial of service or, potentially, escalate their privileges.
- CVE-2013-2852, Low
- A format string flaw was found in the b43_do_request_fw() function in the Linux kernel's b43 driver implementation. A local user who is able to specify the "fwpostfix" b43 module parameter could use this flaw to cause a denial of service or, potentially, escalate their privileges.
- CVE-2013-3301, Low
- A NULL pointer dereference flaw was found in the Linux kernel's ftrace and function tracer implementations. A local user who has the CAP_SYS_ADMIN capability could use this flaw to cause a denial of service.
Bug Fixes
- BZ#924847
- An error in backporting the block reservation feature from upstream resulted in a missing allocation of a reservation structure when an allocation is required during the rename system call. Renaming a file system object (for example, file or directory) requires a block allocation for the destination directory. If the destination directory had not had a reservation structure allocated, a NULL pointer dereference occurred, leading to a kernel panic. With this update, a reservation structure is allocated before the rename operation, and a kernel panic no longer occurs in this scenario. This patch also ensures that the inode's multi-block reservation is not deleted when a file is closed while changing the inode's size.
- BZ#927308
- When an inconsistency is detected in a GFS2 file system after an I/O operation, the kernel performs the withdraw operation on the local node. However, the kernel previously did not wait for an acknowledgement from the GFS control daemon (gfs_controld) before proceeding with the withdraw operation. Therefore, if a failure isolating the GFS2 file system from a data storage occurred, the kernel was not aware of this problem and an I/O operation to the shared block device may have been performed after the withdraw operation was logged as successful. This could lead to corruption of the file system or prevent the node from journal recovery. This patch modifies the GFS2 code so the withdraw operation no longer proceeds without the acknowledgement from gfs_controld, and the GFS2 file system can no longer become corrupted after performing the withdraw operation.
- BZ#927317
- The GFS2 discard code did not calculate the sector offset correctly for block devices with the sector size of 4 KB, which led to loss of data and metadata on these devices. A patch correcting this problem has been applied so the discard and FITRIM requests now work as expected for the block devices with the 4 KB sector size.
- BZ#956296
- The virtual file system (VFS) code had a race condition between the unlink and link system calls that allowed creating hard links to deleted (unlinked) files. This could, under certain circumstances, cause inode corruption that eventually resulted in a file system shutdown. The problem was observed in Red Hat Storage during rsync operations on replicated Gluster volumes that resulted in an XFS shutdown. A testing condition has been added to the VFS code, preventing hard links to deleted files from being created.
- BZ#956979
- The sunrpc code paths that wake up an RPC task are highly optimized for speed so the code avoids using any locking mechanism but requires precise operation ordering. Multiple bugs were found related to operation ordering, which resulted in a kernel crash involving either a BUG_ON() assertion or an incorrect use of a data structure in the sunrpc layer. These problems have been fixed by properly ordering operations related to the RPC_TASK_QUEUED and RPC_TASK_RUNNING bits in the wake-up code paths of the sunrpc layer.
- BZ#958684
- A previous update introduced a new failure mode to the blk_get_request() function returning the -ENODEV error code when a block device queue is being destroyed. However, the change did not include a NULL pointer check for all callers of the function. Consequently, the kernel could dereference a NULL pointer when removing a block device from the system, which resulted in a kernel panic. This update applies a patch that adds these missing NULL pointer checks. Also, some callers of the blk_get_request() function could previously return the -ENOMEM error code instead of -ENODEV, which would lead to incorrect call chain propagation. This update applies a patch ensuring that correct return codes are propagated.
- BZ#962368
- A rare race condition between the "devloss" timeout and discovery state machine could trigger a bug in the lpfc driver that nested two levels of spin locks in reverse order. The reverse order of spin locks led to a deadlock situation and the system became unresponsive. With this update, a patch addressing the deadlock problem has been applied and the system no longer hangs in this situation.
- BZ#962370
- When attempting to deploy a virtual machine on a hypervisor with multiple NICs and macvtap devices, a kernel panic could occur. This happened because the macvtap driver did not gracefully handle a situation when the macvlan_port.vlans list was empty and returned a NULL pointer. This update applies a series of patches which fix this problem using a read-copy-update (RCU) mechanism and by preventing the driver from returning a NULL pointer if the list is empty. The kernel no longer panics in this scenario.
- BZ#962372
- Certain CPUs contain on-chip virtual-machine control structure (VMCS) caches that are used to keep active VMCSs managed by the KVM module. These VMCSs contain runtime information of the guest machines operated by KVM. These CPUs require support of the VMCLEAR instruction that allows flushing the cache's content into memory. The kernel previously did not use the VMCLEAR instruction in Kdump. As a consequence, when dumping a core of the QEMU KVM host, the respective CPUs did not flush VMCSs to the memory and the guests' runtime information was not included in the core dump. This problem has been addressed by a series of patches that implement support of using the VMCLEAR instruction in Kdump. The kernel is now performs the VMCLEAR operation in Kdump if it is required by a CPU so the vmcore file of the QEMU KVM host contains all VMCSs information as expected.
- BZ#963564
- When a network interface (NIC) is running in promiscuous (PROMISC) mode, the NIC may receive and process VLAN tagged frames even though no VLAN is attached to the NIC. However, some network drivers, such as bnx2, igb, tg3, and e1000e did not handle processing of packets with VLAN tagged frames in PROMISC mode correctly if the frames had no VLAN group assigned. The drivers processed the packets with incorrect routines and various problems could occur; for example, a DHCPv6 server connected to a VLAN could assign an IPv6 address from the VLAN pool to a NIC with no VLAN interface. To handle the VLAN tagged frames without a VLAN group properly, the frames have to be processed by the VLAN code so the aforementioned drivers have been modified to restrain from performing a NULL value test of the packet's VLAN group field when the NIC is in PROMISC mode. This update also includes a patch fixing a bug where the bnx2x driver did not strip a VLAN header from the frame if no VLAN was configured on the NIC, and another patch that implements some register changes in order to enable receiving and transmitting of VLAN packets on a NIC even if no VLAN is registered with the card.
- BZ#964046
- Due to a bug in the NFSv4 nfsd code, a NULL pointer could have been dereferenced when nfsd was looking up a path to the NFSv4 recovery directory for the fsync operation, which resulted in a kernel panic. This update applies a patch that modifies the NFSv4 nfsd code to open a file descriptor for fsync in the NFSv4 recovery directory instead of looking up the path. The kernel no longer panics in this situation.
- BZ#966432
- When adding a virtual PCI device, such as virtio disk, virtio net, e1000 or rtl8139, to a KVM guest, the kacpid thread reprograms the hot plug parameters of all devices on the PCI bus to which the new device is being added. When reprogramming the hot plug parameters of a VGA or QXL graphics device, the graphics device emulation requests flushing of the guest's shadow page tables. Previously, if the guest had a huge and complex set of shadow page tables, the flushing operation took a significant amount of time and the guest could appear to be unresponsive for several minutes. This resulted in exceeding the threshold of the "soft lockup" watchdog and the "BUG: soft lockup" events were logged by both, the guest and host kernel. This update applies a series of patches that deal with this problem. The KVM's Memory Management Unit (MMU) now avoids creating multiple page table roots in connection with processors that support Extended Page Tables (EPT). This prevents the guest's shadow page tables from becoming too complex on machines with EPT support. MMU now also flushes only large memory mappings, which alleviates the situation on machines where the processor does not support EPT. Additionally, a free memory accounting race that could prevent KVM MMU from freeing memory pages has been fixed.
- BZ#968557
- A race condition could occur in the uhci-hcd kernel module if the IRQ line was shared with other devices. The race condition allowed the IRQ handler routine to be called before the data structures were fully initialized, which caused the system to become unresponsive. This update applies a patch that fixes the problem by adding a test condition to the IRQ handler routine; if the data structure initialization is still in progress, the handler routine finishes immediately.
- BZ#969306
- When setting up a bonding device, a certain flag was used to distinguish between TLB and ALB modes. However, usage of this flag in ALB mode allowed enslaving NICs before the bond was activated. This resulted in enslaved NICs not having unique MAC addresses as required, and consequent loss of "reply" packets sent to the slaves. This patch modifies the function responsible for the setup of the slave's MAC address so the flag is no longer needed to discriminate ALB mode from TLB and the flag was removed. The described problem no longer occur in this situation.
- BZ#969326
- When booting the normal kernel on certain servers, such as HP ProLiant DL980 G7, some interrupts may have been lost which resulted in the system being unresponsive or rarely even in data loss. This happened because the kernel did not set correct destination mode during the boot; the kernel booted in "logical cluster mode" that is default while this system supported only "x2apic physical mode". This update applies a series of patches addressing the problem. The underlying APIC code has been modified so the x2apic probing code now checks the Fixed ACPI Description Table (FADT) and installs the x2apic "physical" driver as expected. Also, the APIC code has been simplified and the code now uses probe routines to select destination APIC mode and install the correct APIC drivers.
- BZ#972586
- A bug in the OProfile tool led to a NULL pointer dereference while unloading the OProfile kernel module, which resulted in a kernel panic. The problem was triggered if the kernel was running with the nolapic parameter set and OProfile was configured to use the NMI timer interrupt. The problem has been fixed by correctly setting the NMI timer when initializing OProfile.
- BZ#973198
- Previously, when booting a Red Hat Enterprise Linux 6.4 system and the ACPI Static Resource Affinity Table (SRAT) had a hot-pluggable bit enabled, the kernel considered the SRAT table incorrect and NUMA was not configured. This led to a general protection fault and a kernel panic occurring on the system. The problem has been fixed by using an SMBIOS check in the code in order to avoid the SRAT code table consistency checks. NUMA is now configured as expected and the kernel no longer panics in this situation.
- BZ#973555
- A bug in the PCI driver allowed to use a pointer to the Virtual Function (VF) device entry that was already freed. Consequently, when hot-removing an I/O unit with enabled SR-IOV devices, a kernel panic occurred. This update modifies the PCI driver so a valid pointer to the Physical Function (PF) device entry is used and the kernel no longer panics in this situation.
- BZ#975086
- The kernel previously did not handle situation where the system needed to fall back from non-flat Advanced Programmable Interrupt Controller (APIC) mode to flat APIC mode. Consequently, a NULL pointer was dereferenced and a kernel panic occurred. This update adds the flat_probe() function to the APIC driver, which allows the kernel using flat APIC mode as a fall-back option. The kernel no longer panics in this situation.
Security Fixes
- CVE-2013-1935, Important
- A flaw was found in the way KVM (Kernel-based Virtual Machine) initialized a guest's registered pv_eoi (paravirtualized end-of-interrupt) indication flag when entering the guest. An unprivileged guest user could potentially use this flaw to crash the host.
- CVE-2013-1943, Important
- A missing sanity check was found in the kvm_set_memory_region() function in KVM, allowing a user-space process to register memory regions pointing to the kernel address space. A local, unprivileged user could use this flaw to escalate their privileges.
- CVE-2013-2017, Moderate
- A double free flaw was found in the Linux kernel's Virtual Ethernet Tunnel driver (veth). A remote attacker could possibly use this flaw to crash a target system.
Bug Fixes
- BZ#923096
- Previously, the queue limits were not being retained as they should have been if a device did not contain any data or if a multipath device temporarily lost all its paths. This problem has been fixed by avoiding a call to the
dm_calculate_queue_limits()
function. - BZ#924823
- A bug in the
dm_btree_remove()
function could cause leaf values to have incorrect reference counts. Removal of a shared block could result in space maps considering the block as no longer used. As a consequence, sending a discard request to a shared region of a thin device could corrupt its snapshot. The bug has been fixed to prevent corruption in this scenario. - BZ#927292
- Prior to this update, if Large Receive Offload (LRO) was enabled, Broadcom, QLogic, and Intel card drivers did not fill in all the packet fields. Consequently, when the
macvtap
driver received a packet with agso_type
field that was not set, a kernel panic occurred. With this update, theixgbe
,qlcnic
, andbnx2x
drivers have been fixed to always set thegso_type
field. Thus, kernel panic no longer occurs in the previously-described scenario. - BZ#927294
- Reading a large number of files from a pNFS (parallel NFS) mount and canceling the running operation by pressing Ctrl+c caused a general protection fault in the XDR code, which could manifest itself as a kernel oops with an
unable to handle kernel paging request
message. This happened because decoding of theLAYOUTGET
operation is done by a worker thread and the caller waits for the worker thread to complete. When the reading operation was canceled, the caller stopped waiting and freed the pages. So the pages no longer existed at the time the worker thread called the relevant function in the XDR code. The cleanup process of these pages has been moved to a different place in the code, which prevents the kernel oops from happening in this scenario. - BZ#961431
- By default, the kernel uses a best-fit algorithm for allocating Virtual Memory Areas (VMAs) to map processed files to the address space. However, if an enormous number of small files (hundreds of thousands or millions) was being mapped, the address space became extremely fragmented, which resulted in significant CPU usage and performance degradation. This update introduces an optional next-fit policy which, if enabled, allows for mapping of a file to the first suitable unused area in the address space that follows after the previously allocated VMA.
- BZ#960864
- C-states for the Intel Family 6, Model 58 and 62, processors were not properly initialized in Red Hat Enterprise Linux 6. Consequently, these processors were unable to enter deep C-states. Also, C-state accounting was not functioning properly and power management tools, such as powertop or turbostat, thus displayed incorrect C-state transitions. This update applies a patch that ensures proper C-states initialization so the aforementioned processors can now enter deep core power states as expected. Note that this update does not correct C-state accounting which has been addressed by a separate patch.
- BZ#960436
- If an NFSv4 client was checking open permissions for a delegated OPEN operation during OPEN state recovery of an NFSv4 server, the NFSv4 state manager could enter a deadlock. This happened because the client was holding the NFSv4 sequence ID of the OPEN operation. This problem is resolved by releasing the sequence ID before the client starts checking open permissions.
- BZ#960429
- When using parallel NFS (pNFS), a kernel panic could occur when a process was killed while getting the file layout information during the open() system call. A patch has been applied to prevent this problem from occurring in this scenario.
- BZ#960426
- In the RPC code, when a network socket backed up due to high network traffic, a timer was set causing a retransmission, which in turn could cause even larger amount of network traffic to be generated. To prevent this problem, the RPC code now waits for the socket to empty instead of setting the timer.
- BZ#960420
- Previously, the fsync(2) system call incorrectly returned the EIO (Input/Output) error instead of the ENOSPC (No space left on device) error. This was caused by incorrect error handling in the page cache. This problem has been fixed and the correct error value is now returned.
- BZ#960417
- Previously, an NFS RPC task could enter a deadlock and become unresponsive if it was waiting for an NFSv4 state serialization lock to become available and the session slot was held by the NFSv4 server. This update fixes this problem along with the possible race condition in the pNFS return-on-close code. The NFSv4 client has also been modified to not accepting delegated OPEN operations if a delegation recall is in effect. The client now also reports NFSv4 servers that try to return a delegation when the client is using the CLAIM_DELEGATE_CUR open mode.
- BZ#952613
- When pNFS code was in use, a file locking process could enter a deadlock while trying to recover form a server reboot. This update introduces a new locking mechanism that avoids the deadlock situation in this scenario.
- BZ#960412
- Previously, when open(2) system calls were processed, the GETATTR routine did not check to see if valid attributes were also returned. As a result, the open() call succeeded with invalid attributes instead of failing in such a case. This update adds the missing check, and the open() call succeeds only when valid attributes are returned.
- BZ#955504
- The be2iscsi driver previously leaked memory in the driver's control path when mapping tasks.This update fixes the memory leak by freeing all resources related to a task when the task was completed. Also, the driver did not release a task after responding to the received NOP-IN acknowledgment with a valid Target Transfer Tag (TTT). Consequently, the driver run out of tasks available for the session and no more iscsi commands could be issued. A patch has been applied to fix this problem by releasing the task.
- BZ#960415
- Due to a missing structure, the NFSv4 error handler did not handle exceptions caused by revoking NFSv4 delegations. Consequently, the NFSv4 client received the EIO error message instead of the NFS4ERR_ADMIN_REVOKED error. This update modifies the NFSv4 code to no longer require the nfs4_state structure in order to revoke a delegation.
- BZ#954298
- Under rare circumstances, if a TCP retransmission was multiple times partially acknowledged and collapsed, the used Socked Buffer (SKB) could become corrupted due to an overflow caused by the transmission headroom. This resulted in a kernel panic. The problem was observed rarely when using an IP-over-InfiniBand (IPoIB) connection. This update applies a patch that verifies whether a transmission headroom exceeded the maximum size of the used SKB, and if so, the headroom is reallocated. It was also discovered that a TCP stack could retransmit misaligned SKBs if a malicious peer acknowledged sub MSS frame and output interface did not have a sequence generator (SG) enabled. This update introduces a new function that allows for copying of a SKB with a new head so the SKB remains aligned in this situation.
- BZ#921964
- In a case of a broken or malicious server, an index node (inode) of an incorrect type could be matched. This led to an NFS client NULL pointer dereference, and, consequently, to a kernel oops. To prevent this problem from occurring in this scenario, a check has been added to verify that the inode type is correct.
- BZ#962482
- When using more than 4 GB of RAM with an AMD processor, reserved regions and memory holes (E820 regions) can also be placed above the 4 GB range. For example, on configurations with more than 1 TB of RAM, AMD processors reserve the 1012 GB - 1024 GB range for the Hyper Transport (HT) feature. However, the Linux kernel does not correctly handle E820 regions that are located above the 4 GB range. Therefore, when installing Red Hat Enterprise Linux on a machine with an AMD processor and 1 TB of RAM, a kernel panic occurred and the installation failed. This update modifies the kernel to exclude E820 regions located above the 4 GB range from direct mapping. The kernel also no longer maps the whole memory on boot but only finds memory ranges that are necessary to be mapped. The system can now be successfully installed on the above-described configuration.
- BZ#950529
- This update reverts two previously-included
qla2xxx
patches. These patches changed the fibre channel target port discovery procedure, which resulted in some ports not being discovered in some corner cases. Reverting these two patches fixes the discovery issues. - BZ#928817
- A previously-applied patch introduced a bug in the
ipoib_cm_destroy_tx()
function, which allowed a CM object to be moved between lists without any supported locking. Under a heavy system load, this could cause the system to crash. With this update, proper locking of the CM objects has been re-introduced to fix the race condition, and the system no longer crashes under a heavy load. - BZ#928683
- A bug in the
do_filp_open()
function caused it to exit early if any write access was requested on a read-only file system. This prevented the opening of device nodes on a read-only file system. With this update, thedo_filp_open()
has been fixed to no longer exit if a write request is made on a read-only file system. - BZ#960433
- An NFSv4 client could previously enter a deadlock situation with the state recovery thread during state recovery after a reboot of an NFSv4 server. This happened because the client did not release the NFSv4 sequence ID of an OPEN operation that was requested before the reboot. This problem is resolved by releasing the sequence ID before the client starts waiting for the server to recover.
Enhancement
- BZ#952570
- The kernel now supports memory configurations with more than 1 TB of RAM on AMD systems.
Security Fixes
- CVE-2013-0913, Important
- An integer overflow flaw, leading to a heap-based buffer overflow, was found in the way the Intel i915 driver in the Linux kernel handled the allocation of the buffer used for relocation copies. A local user with console access could use this flaw to cause a denial of service or escalate their privileges.
- CVE-2013-1773, Important
- A buffer overflow flaw was found in the way UTF-8 characters were converted to UTF-16 in the utf8s_to_utf16s() function of the Linux kernel's FAT file system implementation. A local user able to mount a FAT file system with the "utf8=1" option could use this flaw to crash the system or, potentially, to escalate their privileges.
- CVE-2013-1796, Important
- A flaw was found in the way KVM handled guest time updates when the buffer the guest registered by writing to the MSR_KVM_SYSTEM_TIME machine state register (MSR) crossed a page boundary. A privileged guest user could use this flaw to crash the host or, potentially, escalate their privileges, allowing them to execute arbitrary code at the host kernel level.
- CVE-2013-1797, Important
- A potential use-after-free flaw was found in the way KVM handled guest time updates when the GPA (guest physical address) the guest registered by writing to the MSR_KVM_SYSTEM_TIME machine state register (MSR) fell into a movable or removable memory region of the hosting user-space process (by default, QEMU-KVM) on the host. If that memory region is deregistered from KVM using KVM_SET_USER_MEMORY_REGION and the allocated virtual memory reused, a privileged guest user could potentially use this flaw to escalate their privileges on the host.
- CVE-2013-1798, Important
- A flaw was found in the way KVM emulated IOAPIC (I/O Advanced Programmable Interrupt Controller). A missing validation check in the ioapic_read_indirect() function could allow a privileged guest user to crash the host, or read a substantial portion of host kernel memory.
- CVE-2013-1792, Moderate
- A race condition in install_user_keyrings(), leading to a NULL pointer dereference, was found in the key management facility. A local, unprivileged user could use this flaw to cause a denial of service.
- CVE-2013-1826, Moderate
- A NULL pointer dereference in the XFRM implementation could allow a local user who has the CAP_NET_ADMIN capability to cause a denial of service.
- CVE-2013-1827, Moderate
- A NULL pointer dereference in the Datagram Congestion Control Protocol (DCCP) implementation could allow a local user to cause a denial of service.
- CVE-2012-6537, Low
- Information leak flaws in the XFRM implementation could allow a local user who has the CAP_NET_ADMIN capability to leak kernel stack memory to user-space.
- CVE-2012-6546, Low
- Two information leak flaws in the Asynchronous Transfer Mode (ATM) subsystem could allow a local, unprivileged user to leak kernel stack memory to user-space.
- CVE-2012-6547, Low
- An information leak was found in the TUN/TAP device driver in the networking implementation. A local user with access to a TUN/TAP virtual interface could use this flaw to leak kernel stack memory to user-space.
- CVE-2013-0349, Low
- An information leak in the Bluetooth implementation could allow a local user who has the CAP_NET_ADMIN capability to leak kernel stack memory to user-space.
- CVE-2013-1767, Low
- A use-after-free flaw was found in the tmpfs implementation. A local user able to mount and unmount a tmpfs file system could use this flaw to cause a denial of service or, potentially, escalate their privileges.
- CVE-2013-1774, Low
- A NULL pointer dereference was found in the Linux kernel's USB Inside Out Edgeport Serial Driver implementation. An attacker with physical access to a system could use this flaw to cause a denial of service.
Bug Fixes
- BZ#909156
- When running the Hyper-V hypervisor, the host expects guest virtual machines to report free memory and the memory used for memory ballooning, including the pages that were ballooned out. However, the memory ballooning code did not handle reporting correctly, and the pages that were ballooned out were not included in the report. Consequently, after the memory was ballooned out from the guest, the Hyper-V Manager reported an incorrect value of the demanded memory and a memory status. This update provides a patch that adjusts the memory ballooning code to include the ballooned-out pages and to determine the demanded memory correctly.
- BZ#911267
- The Intel 5520 and 5500 chipsets do not properly handle remapping of MSI and MSI-X interrupts. If the interrupt remapping feature is enabled on the system with such a chipset, various problems and service disruption could occur (for example, a NIC could stop receiving frames), and the "kernel: do_IRQ: 7.71 No irq handler for vector (irq -1)" error message appears in the system logs. As a workaround to this problem, it has been recommended to disable the interrupt remapping feature in the BIOS on such systems, and many vendors have updated their BIOS to disable interrupt remapping by default. However, the problem is still being reported by users without proper BIOS level with this feature properly turned off. Therefore, this update modifies the kernel to check if the interrupt remapping feature is enabled on these systems and to provide users with a warning message advising them on turning off the feature and updating the BIOS.
- BZ#911475
- If a logical volume was created on devices with thin provisioning enabled, the mkfs.ext4 command took a long time to complete, and the following message was recorded in the system log:
kernel: blk: request botched
This was caused by discard request merging that was not completely functional in the block and SCSI layers. This functionality has been temporarily disabled to prevent such problems from occurring. - BZ#915579
- Timeout errors could occur on an NFS client with heavy read workloads; for example when using the rsync and ldconfig utilities. Both, client-side and server-side causes were found for the problem. On the client side, problems that could prevent the client reconnecting lost TCP connections; this was fixed prior to this update. On the server side, TCP memory pressure on the server forced the send buffer size to be lower than the size required to send a single Remote Procedure Call (RPC), which consequently caused the server to be unable to reply to the client. A series of patches addressing the server-side problem has been applied. This update provides the last of those patches that removes the redundant xprt->shutdown bit field from the sunrpc code. Setting this bit field could lead to a race causing the aforementioned problem. Timeout errors no longer occur on NFS clients that are under heavy read workload.
- BZ#915583
- Previously, running commands such as "ls", "find" or "move" on a MultiVersion File System (MVFS) could cause a kernel panic. This happened because the d_validate() function, which is used for dentry validation, called the kmem_ptr_validate() function to validate a pointer to a parent dentry. The pointer could have been freed anytime so the kmem_ptr_validate() function could not guarantee the pointer to be dereferenced, which could lead to a NULL pointer derefence. This update modifies d_validate() to verify the parent-child relationship by traversing the parent dentry's list of child dentries, which solves this problem. The kernel no longer panics in the described scenario.
- BZ#916957
- A previous patch introduced the use of the page_descs length field to describe the size of a fuse request. That patch incorrectly handled a code path that does not exist in the upstream fuse code, which resulted in a data corruption when using loop devices over FUSE. This patch fixes this problem by setting the fuse request size before submitting the request.
- BZ#917690
- When the state of the netfilter module was out-of-sync, a TCP connection was recorded in the conntrack table although the TCP connection did not exist between two hosts. If a host re-established this connection with the same source, port, destination port, source address and destination address, the host sent a TCP SYN packet and the peer sent back acknowledgment for this SYN package. However, because netfilter was out-of-sync, netfilter dropped this acknowledgment, and deleted the connection item from the conntrack table, which consequently caused the host to retransmit the SYN packet. A patch has been applied to improve this handling; if an unexpected SYN packet appears, the TCP options are annotated. Acknowledgment for the SYN packet serves as a confirmation of the connection tracking being out-of-sync, then a new connection record is created using the information annotated previously to avoid the retransmission delay.
- BZ#920266
- The NFS code implements the "silly rename" operation to handle an open file that is held by a process while another process attempts to remove it. The "silly rename" operation works according to the "delete on last close" semantics so the inode of the file is not released until the last process that opens the file closes it. A previous update of the NFS code broke the mechanics that prevented an NFS client from deleting a silly-renamed dentry. This affected the "delete on last close" semantics and silly-renamed files could be deleted by any process while the files were open for I/O by another process. As a consequence, the process reading the file failed with the "ESTALE" error code. This update modifies the way how the NFS code handles dentries of silly-renamed files and silly-renamed files can not be deleted until the last process that has the file open for I/O closes it.
- BZ#920268
- The NFSv4 code uses byte range locks to simulate the flock() function, which is used to apply or remove an exclusive advisory lock on an open file. However, using the NFSv4 byte range locks precludes a possibility to open a file with read-only permissions and subsequently to apply an exclusive advisory lock on the file. A previous patch broke a mechanism used to verify the mode of the open file. As a consequence, the system became unresponsive and the system logs filled with a "kernel: nfs4_reclaim_open_state: Lock reclaim failed!" error message if the file was open with read-only permissions and an attempt to apply an exclusive advisory lock was made. This update modifies the NFSv4 code to check the mode of the open file before attempting to apply the exclusive advisory lock. The "-EBADF" error code is returned if the type of the lock does not match the file mode.
- BZ#921145
- Due to a bug in the tty driver, an ioctl call could return the "-EINTR" error code when the "read" command was interrupted by a signal, such as SIGCHLD. As a consequence, the subsequent"read" command caused the Bash shell to abort with a "double free or corruption (out)" error message. An applied patch corrects the tty driver to use the "-ERESTARTSYS" error code so the system call is restarted if needed. Bash no longer crashes in this scenario.
- BZ#921150
- Previously, the NFS Lock Manager (NLM) did not resend blocking lock requests after NFSv3 server reboot recovery. As a consequence, when an application was running on a NFSv3 mount and requested a blocking lock, the application received an -ENOLCK error. This patch ensures that NLM always resend blocking lock requests after the grace period has expired.
- BZ#921535
- Virtual LAN (VLAN) support of the eHEA ethernet adapter did not work as expected. A "device ethX has buggy VLAN hw accel" message could have been reported when running the "dmesg" command. This was because a backported upstream patch removed the vlan_rx_register() function. This update adds the function back, and eHEA VLAN support works as expected. This update also addresses a possible kernel panic, which could occur due to a NULL pointer dereference when processing received VLAN packets. The patch adds a test condition verifying whether a VLAN group is set by the network stack, which prevents a possible NULL pointer to be dereferenced, and the kernel no longer crashes in this situation.
- BZ#921958
- When the Active Item List (AIL) becomes empty, the xfsaild daemon is moved to a task sleep state that depends on the timeout value returned by the xfsaild_push() function. The latest changes modified xfsaild_push() to return a 10-ms value when the AIL is empty, which sets xfsaild into the uninterruptible sleep state (D state) and artificially increased system load average. This update applies a patch that fixes this problem by setting the timeout value to the allowed maximum, 50 ms. This moves xfsaild to the interruptible sleep state (S state), avoiding the impact on load average.
- BZ#921961
- When running a high thread workload of small-sized files on an XFS file system, sometimes, the system could become unresponsive or a kernel panic could occur. This occurred because the xfsaild daemon had a subtle code path that led to lock recursion on the xfsaild lock when a buffer in the AIL was already locked and an attempt was made to force the log to unlock it. This patch removes the dangerous code path and queues the log force to be invoked from a safe locking context with respect to xfsaild. This patch also fixes the race condition between buffer locking and buffer pinned state that exposed the original problem by rechecking the state of the buffer after a lock failure. The system no longer hangs and the kernel no longer panics in this scenario.
- BZ#921963
- The kernel's implementation of RTAS (RunTime Abstraction Services) previously allowed the stop_topology_update() function to be called from an interrupt context during live partition migration on PowerPC and IMB System p machines. As a consequence, the system became unresponsive. This update fixes the problem by calling stop_topology_update() earlier in the migration process, and the system no longer hangs in this situation.
- BZ#922154
- A previous kernel update broke queue pair (qp) hash list deletion in the qp_remove() function. This could cause a general protection fault in the InfiniBand stack or QLogic InfiniBand driver. A patch has been applied to restore the former behavior so the general protection fault no longer occurs.
- BZ#923098
- Due to a bug in the CIFS mount code, it was not possible to mount Distributed File System (DFS) shares in Red Hat Enterprise Linux 6.4. This update applies a series of patches that address this problem and modifies the CIFS mount code so that DFS shares can now be mounted as expected.
- BZ#923204
- When the Red Hat Enterprise Linux 6 kernel runs as a virtual machine, it performs boot-time detection of the hypervisor in order to enable hypervisor-specific optimizations. Red Hat Enterprise Linux 6.4 introduces detection and optimization for the Microsoft Hyper-V hypervisor. Previously Hyper-V was detected first, however, because some Xen hypervisors can attempt to emulate Hyper-V, this could lead to a boot failure when that emulation was not exact. A patch has been applied to ensure that the attempt to detect Xen is always done before Hyper-V, resolving this issue.
- BZ#927309
- When using the congestion window lock functionality of the ip utility, the system could become unresponsive. This happened because the tcp_slow_start() function could enter an infinite loop if the congestion window was locked using route metrics. A set of patches has been applied to comply with the upstream kernel, ensuring the problem no longer occurs in this scenario.
- BZ#928686
- Previously, the tty driver allowed a race condition to occur in the tty buffer code. If the tty buffer was requested by multiple users of the same tty device in the same time frame, the same tty's buffer structure was used and the buffer could exceed the reserved size. This resulted in a buffer overflow problem and a subsequent memory corruption issue, which caused the kernel to panic. This update fixes the problem by implementing a locking mechanism around the tty buffer structure using spin locks. The described race can no longer occur so the kernel can no longer crash due to a tty buffer overflow.
Security Fixes
- CVE-2013-0228, Important
- A flaw was found in the way the xen_iret() function in the Linux kernel used the DS (the CPU's Data Segment) register. A local, unprivileged user in a 32-bit para-virtualized guest could use this flaw to crash the guest or, potentially, escalate their privileges.
- CVE-2013-0268, Important
- A flaw was found in the way file permission checks for the "/dev/cpu/[x]/msr" files were performed in restricted root environments (for example, using a capability-based security model). A local user with the ability to write to these files could use this flaw to escalate their privileges to kernel level, for example, by writing to the SYSENTER_EIP_MSR register.
Bug Fixes
- BZ#908398
- Truncating files on a GFS2 file system could fail with an "unable to handle kernel NULL pointer dereference" error. This was because of a missing reservation structure that caused the truncate code to reference an incorrect pointer. To prevent this, a patch has been applied to allocate a block reservation structure before truncating a file.
- BZ#908733
- Previously, when using parallel network file system (pNFS) and data was written to the appropriate storage device, the LAYOUTCOMMIT requests being sent to the metadata server could fail internally. The metadata server was not provided with the modified layout based on the written data, and these changes were not visible to the NFS client. This happened because the encoding functions for the LAYOUTCOMMIT and LAYOUTRETURN operations were defined as void, and returned thus an arbitrary status. This update corrects these encoding functions to return 0 on success as expected. The changes on the storage device are now propagated to the metadata server and can be observed as expected.
- BZ#908737
- Previously, init scripts were unable to set the MAC address of the master interface properly because it was overwritten by the first slave MAC address. To avoid this problem, this update re-introduces the check for an unassigned MAC address before setting the MAC address of the first slave interface as the MAC address of the master interface.
- BZ#908739
- During device discovery, the system creates a temporary SCSI device with the LUN ID 0 if the LUN 0 is not mapped on the system. Previously, this led to a NULL pointer dereference because inquiry data was not allocated for the temporary LUN 0 device, which resulted in a kernel panic. This update adds a NULL pointer test in the underlying SCSI code, and the kernel no longer panics in this scenario.
- BZ#908744
- Previously on system boot, devices with associated Reserved Memory Region Reporting (RMRR) information had lost their RMRR information after they were removed from the static identity (SI) domain. Consequently, a system unexpectedly terminated in an endless loop due to unexpected NMIs triggered by DMA errors. This problem was observed on HP ProLiant Generation 7 (G7) and 8 (G8) systems. This update prevents non-USB devices that have RMRR information associated with them from being placed into the SI domain during system boot. HP ProLiant G7 and G8 systems that contain devices with the RMRR information now boot as expected.
- BZ#908794
- When counting CPU time, the utime and stime values are scaled based on rtime. Prior to this update, the utime value was multiplied with the rtime value, but the integer multiplication overflow could happen, and the resulting value could be then truncated to 64 bits. As a consequence, utime values visible in the user space were stall even if an application consumed a lot of CPU time. With this update, the multiplication is performed on stime instead of utime. This significantly reduces the chances of an overflow on most workloads because the stime value, unlike the utime value, cannot grow fast.
- BZ#909159
- When using transparent proxy (TProxy) over IPv6, the kernel previously created neighbor entries for local interfaces and peers that were not reachable directly. This update corrects this problem and the kernel no longer creates invalid neighbor entries.
- BZ#909813
- Due to a bug in the superblock code, a NULL pointer could be dereferenced when handling a kernel paging request. Consequently, the request failed and a kernel oops occurred. This update corrects this problem and kernel page requests are processed as expected.
- BZ#909814
- Sometimes, the irqbalance tool could not get the CPU NUMA node information due to missing symlinks for CPU devices in sysfs. This update adds the NUMA node symlinks for CPU devices in sysfs, which is also useful when using irqbalance to build a CPU topology.
- BZ#909815
- A previous kernel patch introduced a bug by assigning a different value to the IFLA_EXT_MASK Netlink attribute than found in the upstream kernels. This could have caused various problems; for example, a binary compiled against the upstream kernel headers could have failed or behaved unexpectedly on Red Hat Enterprise Linux 6.4 and later kernels. This update realigns IFLA_EXT_MASK in the enumeration correctly by synchronizing the IFLA_* enumeration with the upstream. This ensures that binaries compiled against Red Hat Enterprise Linux 6.4 kernel headers will function as expected. Backwards compatibility is guaranteed.
- BZ#909816
- Broadcom 5719 NIC could previously sometimes drop received jumbo frame packets due to cyclic redundancy check (CRC) errors. This update modifies the tg3 driver so that CRC errors no longer occur and Broadcom 5719 NICs process jumbo frame packets as expected.
- BZ#909818
- Previously, the VLAN code incorrectly cleared the timestamping interrupt bit for network devices using the igb driver. Consequently, timestamping failed on the igb network devices with Precision Time Protocol (PTP) support. This update modifies the igb driver to preserve the interrupt bit if interrupts are disabled.
- BZ#910370
- The NFSv4.1 client could stop responding while recovering from a server reboot on an NFSv4.1 or pNFS mount with delegations disabled. This could happen due to insufficient locking in the NFS code and several related bugs in the NFS and RPC scheduler code which could trigger a deadlock situation. This update applies a series of patches which prevent possible deadlock situations from occurring. The NFSv4.1 client now recovers and continue with workload as expected in the described situation.
- BZ#910373
- Previously, race conditions could sometimes occur in interrupt handling on the Emulex BladeEngine 2 (BE2) controllers, causing the network adapter to become unresponsive. This update provides a series of patches for the be2net driver, which prevents the race from occurring. The network cards using BE2 chipsets no longer hang due to incorrectly handled interrupt events.
- BZ#910998
- A previous patch to the mlx4 driver enabled an internal loopback to allow communication between functions on the same host. However, this change introduced a regression that caused virtual switch (vSwitch) bridge devices using Mellanox Ethernet adapter as the uplink to become inoperative in native (non-SRIOV) mode under certain circumstances. To fix this problem, the destination MAC address is written to Tx descriptors of transmitted packets only in SRIOV or eSwitch mode, or during the device self-test. Uplink traffic works as expected in the described setup.
- BZ#911000
- Previously, the kernel did not support a storage discard granularity that was not a power of two. Consequently, if the underlying storage reported such a granularity, the kernel issued discard requests incorrectly, which resulted in I/O errors. This update modifies the kernel to correct calculation of the storage discard granularity and the kernel now process discard requests correctly even for storage devices with the discard granularity that is not power of two.
- BZ#911655
- Previously, a kernel panic could occur on machines using the SCSI sd driver with Data Integrity Field (DIF) type 2 protection. This was because the scsi_register_driver() function registered a prep_fn() function that might have needed to use the sd_cdp_pool variable for the DIF functionality. However, the variable had not yet been initialized at this point. The underlying code has been updated so that the driver is registered last, which prevents a kernel panic from occurring in this scenario.
- BZ#911663
- Previously, the mlx4 driver set the number of requested MSI-X vectors to 2 under multi-function mode on mlx4 cards. However, the default setting of the mlx4 firmware allows for a higher number of requested MSI-X vectors (4 of them with the current firmware). This update modifies the mlx4 driver so that it uses these default firmware settings, which improves performance of mlx4 cards.
Security Fixes
- CVE-2012-4508, Important
- A race condition was found in the way asynchronous I/O and fallocate() interacted when using the ext4 file system. A local, unprivileged user could use this flaw to expose random data from an extent whose data blocks have not yet been written, and thus contain data from a deleted file.
- CVE-2013-0311, Important
- A flaw was found in the way the vhost kernel module handled descriptors that spanned multiple regions. A privileged guest user in a KVM guest could use this flaw to crash the host or, potentially, escalate their privileges on the host.
- CVE-2012-4542, Moderate
- It was found that the default SCSI command filter does not accommodate commands that overlap across device classes. A privileged guest user could potentially use this flaw to write arbitrary data to a LUN that is passed-through as read-only.
- CVE-2013-0190, Moderate
- A flaw was found in the way the xen_failsafe_callback() function in the Linux kernel handled the failed iret (interrupt return) instruction notification from the Xen hypervisor. An unprivileged user in a 32-bit para-virtualized guest could use this flaw to crash the guest.
- CVE-2013-0309, Moderate
- A flaw was found in the way pmd_present() interacted with PROT_NONE memory ranges when transparent hugepages were in use. A local, unprivileged user could use this flaw to crash the system.
- CVE-2013-0310, Moderate
- A flaw was found in the way CIPSO (Common IP Security Option) IP options were validated when set from user mode. A local user able to set CIPSO IP options on the socket could use this flaw to crash the system.
Bug Fixes
- BZ#807385
- Suspending a system (mode S3) running on a HP Z1 All-in-one Workstation with an internal Embedded DisplayPort (eDP) panel and an external DisplayPort (DP) monitor, and, consequently, waking up the system caused the backlight of the eDP panel to not be re-enabled. To fix this issue, the code that handles suspending in the i915 module has been modified to write the BLC_PWM_CPU_CTL parameter using the I915_WRITE function after writing the BLC_PWM_CPU_CTL2 parameter.
- BZ#891839
- Prior to this update, when a VLAN device was set up on a qlge interface, running the TCP Stream Performance test using the netperf utility to test TCP/IPv6 traffic caused the kernel to produce warning messages that impacted the overall performance. This was due to an unsupported feature (NETIF_F_IPV6_CSUM) which was enabled via the NETIF_F_TSO6 flag. This update removes the NETIF_F_TSO6 flag from qlge code and TCP/IPv6 traffic performance is no longer impacted.
- BZ#876912
- The isci driver copied the result of a "Register Device to Host" frame into the wrong buffer causing the SATA DOWNLOAD MICROCODE command to fail, preventing the download of hard drive firmware. This bug in the frame handler routine caused a timeout, resulting in a reset. With this update, the underlying source code has been modified to address this issue, and the isci driver successfully completes SATA DOWNLOAD MICROCODE commands as expected.
- BZ#813677
- In the xHCI code, due to a descriptor that incorrectly pointed at the USB 3.0 register instead of USB 2.0 registers, kernel panic could occur when more USB 2.0 registers were available than USB 3.0 registers. This update fixes the descriptor to point at the USB 2.0 registers, and kernel panic no longer occurs in the aforementioned case.
- BZ#879509
- When the "perf script --gen-script" command was called with a perf.data file which contained no tracepoint events, the command terminated unexpectedly with a segmentation fault due to a NULL "pevent" pointer. With this update, the underlying source code has been modified to address this issue, and the aforementioned command no longer crashes.
- BZ#885030
- Running the mq_notify/5-1 test case from the Open POSIX test suite resulted in corrupted memory, later followed by various kernel crash/BUG messages. This update addresses the mq_send/receive memory corruption issue in the inter-process communication code, and the aforementioned test case no longer fails.
- BZ#841983
- Bond masters and slaves now have separate VLAN groups. As such, if a slave device incurred a network event that resulted in a failover, the VLAN device could process this event erroneously. With this update, when a VLAN is attached to a master device, it ignores events generated by slave devices so that the VLANs do not go down until the bond master does.
- BZ#836748
- Previously in the kernel, when the leap second hrtimer was started, it was possible that the kernel livelocked on the xtime_lock variable. This update fixes the problem by using a mixture of separate subsystem locks (timekeeping and ntp) and removing the xtime_lock variable, thus avoiding the livelock scenarios that could occur in the kernel.
- BZ#836803
- After the leap second was inserted, applications calling system calls that used futexes consumed almost 100% of available CPU time. This occurred because the kernel's timekeeping structure update did not properly update these futexes. The futexes repeatedly expired, re-armed, and then expired immediately again. This update fixes the problem by properly updating the futex expiration times by calling the clock_was_set_delayed() function, an interrupt-safe method of the clock_was_set() function.
- BZ#822691
- When the Fibre Channel (FC) layer sets a device to "running", the layer also scans for other new devices. Previously, there was a race condition between these two operations. Consequently, for certain targets, thousands of invalid devices were created by the SCSI layer and the udev service. This update ensures that the FC layer always sets a device to "online" before scanning for others, thus fixing this bug.
- BZ#845135
- Certain disk device arrays report a medium error without returning any data. This was not being handled correctly in cases where low level device drivers were not setting the optional residual field, however, most modern low level drivers do set it. This update correctly handles cases where low level drivers do not set the residual field in the upper level sd driver, avoiding the potential data corruption.
- BZ#890454
- This update reverts a previously-applied patch that caused the qla2xxx driver to not be able to load on an IBM POWER 7 7895-81X system. This patch has also been isolated as the cause of Dynamic Logical Partitioning (DLPAR) memory remove failures on 2 adapters.
- BZ#841149
- Previous update changed the /proc/stat code to use the get_cpu_idle_time_us() and get_cpu_iowait_time_us() macros if dynamic ticks are enabled in the kernel. This could lead to problems on IBM System z architecture that defines the arch_idle_time() macro. For example, executing the "vmstat" command could fail with "Floating point exception" followed by a core dump. The underlying source code has been modified so that the arch_idle_time() macro is used for idle and iowait times, which prevents the mentioned problem.
- BZ#809792
- The Stream Control Transmission Protocol (SCTP) process became unresponsive inside the sctp_wait_for_sndbuf() function when the sender exhausted the send buffer and waited indefinitely to be woken up. This was because twice the amount of data was accounted for during a packet transmission, once when constructing the packet and the second time when transmitting it. Thus, the available memory resources were used up too early, causing a deadlock. With this update, only a single byte is reserved to ensure the socket stays alive for the life time of the packet, and the SCTP process no longer hangs.
- BZ#852847
- If there are no active threads using a semaphore, blocked threads should be unblocked. Previously, the R/W semaphore code looked for a semaphore counter as a whole to reach zero - which is incorrect because at least one thread is usually queued on the semaphore and the counter is marked to reflect this. As a consequence, the system could become unresponsive when an application used direct I/O on the XFS file system. With this update, only the count of active semaphores is checked, thus preventing the hang in this scenario.
- BZ#861164
- When performing PCI device assignment on AMD systems, a virtual machine using the assigned device could not be able to boot, as the device had failed the assignment, leaving the device in an unusable state. This was due to an improper range check that omitted the last PCI device in a PCI subsystem or tree. The check has been fixed to include the full range of PCI devices in a PCI subsystem or tree. This bug fix avoids boot failures of a virtual machine when the last device in a PCI subsystem is assigned to a virtual machine on an AMD host system.
- BZ#859533
- The mlx4 driver must program the mlx4 card so that it is able to resolve which MAC addresses to listen to, including multicast addresses. Therefore, the mlx4 card keeps a list of trusted MAC addresses. The driver used to perform updates to this list on the card by emptying the entire list and then programming in all of the addresses. Thus, whenever a user added or removed a multicast address or put the card into or out of promiscuous mode, the card's entire address list was re-written. This introduced a race condition, which resulted in a packet loss if a packet came in on an address the card should be listening to, but had not yet been reprogrammed to listen to. With this update, the driver no longer rewrites the entire list of trusted MAC addresses on the card but maintains a list of addresses that are currently programmed into the card. On address addition, only the new address is added to the end of the list, and on removal, only the to-be-removed address is removed from the list. The mlx4 card no longer experiences the described race condition and packets are no longer dropped in this scenario.
- BZ#858850
- Filesystem in Userspace (FUSE) did not implement scatter-gather direct I/O optimally. Consequently, the kernel had to process an extensive number of FUSE requests, which had a negative impact on system performance. This update applies a set of patches which improves internal request management for other features, such as readahead. FUSE direct I/O overhead has been significantly reduced to minimize negative effects on system performance.
- BZ#865637
- A previous kernel update introduced a bug that caused RAID0 and linear arrays larger than 4 TB to be truncated to 4 TB when using 0.90 metadata. The underlying source code has been modified so that 0.90 RAID0 and linear arrays larger than 4 TB are no longer truncated in the md RAID layer.
- BZ#865682
- A larger command descriptor block (CDB) is allocated for devices using Data Integrity Field (DIF) type 2 protection. The CDB was being freed in the sd_done() function, which resulted in a kernel panic if the command had to be retried in certain error recovery cases. With this update, the larger CDB is now freed in the sd_unprep_fn() function instead. This prevents the kernel panic from occurring.
- BZ#857518
- Previously, a use-after-free bug in the usbhid code caused a NULL pointer dereference. Consequent kernel memory corruption resulted in a kernel panic and could cause data loss. This update adds a NULL check to avoid these problems.
- BZ#856325
- A race condition could occur between page table sharing and virtual memory area (VMA) teardown. As a consequence, multiple "bad pmd" message warnings were displayed and "kernel BUG at mm/filemap.c:129" was reported while shutting down applications that share memory segments backed by huge pages. With this update, the VM_MAYSHARE flag is explicitly cleaned during the unmap_hugepage_range() call under the i_mmap_lock. This makes VMA ineligible for sharing and avoids the race condition. After using shared segments backed by huge pages, applications like databases and caches shut down correctly, with no crash.
- BZ#855984
- When I/O is issued through blk_execute_rq(), the blk_execute_rq_nowait() routine is called to perform various tasks. At first, the routine checks for a dead queue. Previously, however, if a dead queue was detected, the blk_execute_rq_nowait() function did not invoke the done() callback function. This resulted in blk_execute_rq() being unresponsive when waiting for completion, which had never been issued. To avoid such hangs, the rq->end_io pointer is initialized to the done() callback before the queue state is verified.
- BZ#855759
- The Stream Control Transmission Protocol (SCTP) ipv6 source address selection logic did not take the preferred source address into consideration. With this update, the source address is chosen from the routing table by taking this aspect into consideration. This brings the SCTP source address selection on par with IPv4.
- BZ#855139
- Under certain circumstances, a system crash could result in data loss on XFS file systems. If files were created immediately before the file system was left to idle for a long period of time and then the system crashed, those files could appear as zero-length once the file system was remounted. This occurred even if a sync or fsync was run on the files. This was because XFS was not correctly idling the journal, and therefore it incorrectly replayed the inode allocation transactions upon mounting after the system crash, which zeroed the file size. This problem has been fixed by re-instating the periodic journal idling logic to ensure that all metadata is flushed within 30 seconds of modification, and the journal is updated to prevent incorrect recovery operations from occurring.
- BZ#854376
- Mellanox hardware keeps a separate list of Ethernet hardware addresses it listens to depending on whether the Ethernet hardware address is unicast or multicast. Previously, the mlx4 driver was incorrectly adding multicast addresses to the unicast list. This caused unstable behavior in terms of whether or not the hardware would have actually listened to the addresses requested. This update fixes the problem by always putting multicast addresses on the multicast list and vice versa.
- BZ#854140
- Previously, the kernel had no way to distinguish between a device I/O failure due to a transport problem and a failure as a result of command timeout expiration. I/O errors always resulted in a device being set offline and the device had to be brought online manually even though the I/O failure occured due to a transport problem. With this update, the SCSI driver has been modified and a new SDEV_TRANSPORT_OFFLINE state has been added to help distinguish transport problems from another I/O failure causes. Transport errors are now handled differently and storage devices can now recover from these failures without user intervention.
- BZ#854053
- In a previous release of Red Hat Enterprise Linux, the new Mellanox packet steering architecture had been intentionally left out of the Red Hat kernel. With Red Hat Enterprise Linux 6.4, the new Mellanox packet steering architecture was merged into Red Hat Mellanox driver. One merge detail was missing, and as a result, the multicast promiscuous flag on an interface was not checked during an interface reset to see if the flag was on prior to the reset and should be re-enabled after the reset. This update fixes the problem, so if an adapter is reset and the multicast promiscuous flag was set prior to the reset, the flag is now still set after the reset.
- BZ#854052
- On dual port Mellanox hardware, the mlx4 driver was adding promiscuous mode to the correct port, but when attempting to remove promiscuous mode from a port, it always tried to remove it from port one. It was therefore impossible to remove promiscuous mode from the second port, and promiscuous mode was incorrectly removed from port one even if it was not intended. With this update, the driver now properly attempts to remove promiscuous mode from port two when needed.
- BZ#853007
- The kernel provided by the Red Hat Enterprise Linux 6.3 release included an unintentional kernel ABI (kABI) breakage with regards to the "contig_page_data" symbol. Unfortunately, this breakage did not cause the checksums to change. As a result, drivers using this symbol could silently corrupt memory on the kernel. This update reverts the previous behavior.
- BZ#852148
- In case of a regular CPU hot plug event, the kernel does not keep the original cpuset configuration and can reallocate running tasks to active CPUs. Previously, the kernel treated switching between suspend and resume modes as a regular CPU hot plug event, which could have a significant negative impact on system performance in certain environments such as SMP KVM virtualization. When resuming an SMP KVM guest from suspend mode, the libvirtd daemon and all its child processes were pinned to a single CPU (the boot CPU) so that all VMs used only the single CPU. This update applies a set of patches which ensure that the kernel does not modify cpusets during suspend and resume operations. The system is now resumed in the exact state before suspending without any performance decrease.
- BZ#851118
- Prior to this update, it was not possible to set IPv6 source addresses in routes as it was possible with IPv4. With this update, users can select the preferred source address for a specific IPv6 route with the "src" option of the "ip -6 route" command.
- BZ#849702
- Previously, when a server attempted to shut down a socket, the svc_tcp_sendto() function set the XPT_CLOSE variable if the entire reply failed to be transmitted. However, before XPT_CLOSE could be acted upon, other threads could send further replies before the socket was really shut down. Consequently, data corruption could occur in the RPC record marker. With this update, send operations on a closed socket are stopped immediately, thus preventing this bug.
- BZ#849188
- The usb_device_read() routine used the bus->root_hub pointer to determine whether or not the root hub was registered. However, this test was invalid because the pointer was set before the root hub was registered and remained set even after the root hub was unregistered and deallocated. As a result, the usb_device_read() routine accessed freed memory, causing a kernel panic; for example, on USB device removal. With this update, the hcs->rh_registered flag - which is set and cleared at the appropriate times - is used in the test, and the kernel panic no longer occurs in this scenario.
- BZ#894344
- BE family hardware could falsely indicate an unrecoverable error (UE) on certain platforms and stop further access to be2net-based network interface cards (NICs). A patch has been applied to disable the code that stops further access to hardware for BE family network interface cards (NICs). For a real UE, it is not necessary as the corresponding hardware block is not accessible in this situation.
- BZ#847838
- Previously, a race condition existed whereby device open could race with device removal (for example when hot-removing a storage device), potentially leading to a kernel panic. This was due a use-after-free error in the block device open patch, which has been corrected by not referencing the "disk" pointer after it has been passed to the module_put() function.
- BZ#869750
- The hugetlbfs file system implementation was missing a proper lock protection of enqueued huge pages at the gather_surplus_pages() function. Consequently, the hstate.hugepages_freelist list became corrupted, which caused a kernel panic. This update adjusts the code so that the used spinlock protection now assures atomicity and safety of enqueued huge pages when handling hstate.hugepages_freelist. The kernel no longer panics in this scenario.
- BZ#847310
- An unnecessary check for the RXCW.CW bit could cause the Intel e1000e NIC (Network Interface Controller) to not work properly. The check has been removed so that the Intel e1000e NIC now works as expected.
- BZ#846585
- If a mirror or redirection action is configured to cause packets to go to another device, the classifier holds a reference count. However, it was previously assuming that the administrator cleaned up all redirections before removing. Packets were therefore dropped if the mirrored device was not present, and connectivity to the host could be lost. To prevent such problems, a notifier and cleanup are now run during the unregister action. Packets are not dropped if the a mirrored device is not present.
- BZ#846419
- Previously, the MultiTech MT9234MU USB serial device was not supported by version 0.9 of the it_usb_3410_5052 kernel module. With this update, the MultiTech MT9234MU USB serial device is supported by this version.
- BZ#846024
- Previously, the I/O watchdog feature was disabled when Intel Enhanced Host Controller Interface (EHCI) devices were detected. This could cause incorrect detection of USB devices upon addition or removal. Also, in some cases, even though such devices were detected properly, they were non-functional. The I/O watchdog feature can now be enabled on the kernel command line, which improves hardware detection on underlying systems.
- BZ#845347
- A kernel panic could occur when using the be2net driver. This was because the Bottom Half (BF) was enabled even if the Interrupt ReQuest (IRQ) was already disabled. With this update, the BF is disabled in callers of the be_process_mcc() function and the kernel no longer crashes in this scenario. Note that, in certain cases, it is possible to experience the network card being unresponsive after installing this update. A future update will correct this problem.
- BZ#844814
- This issue affects O_DSYNC performance on GFS2 when only data (and not metadata such as file size) has been dirtied as the result of a write system call. Prior to this patch, O_DSYNC writes were behaving in the same way as O_SYNC for all cases. After this patch, O_DSYNC writes will only write back data, if the inode's metadata is not dirty. This gives a considerable performance improvement for this specific case. Note that the issue does not affect data integrity. The same issue also applies to the pairing of write and fdatasync calls.
- BZ#844531
- Previously, a cgroup or its hierarchy could only be modified under the cgroup_mutex master lock. This introduced a locking dependency on cred_guard_mutex from cgroup_mutex and completed a circular dependency, which involved cgroup_mutex, namespace_sem and workqueue, and led to a deadlock. As a consequence, many processes were unresponsive, and the system could be eventually unusable. This update introduces a new mutex, cgroup_root_mutex, which protects cgroup root modifications and is now used by mount options readers instead of the master lock. This breaks the circular dependency and avoids the deadlock.
- BZ#843771
- On architectures with the 64-bit cputime_t type, it was possible to trigger the "divide by zero" error, namely, on long-lived processes. A patch has been applied to address this problem, and the "divide by zero" error no longer occurs under these circumstances.
- BZ#843541
- The kernel allows high priority real time tasks, such as tasks scheduled with the SCHED_FIFO policy, to be throttled. Previously, the CPU stop tasks were scheduled as high priority real time tasks and could be thus throttled accordingly. However, the replenishment timer, which is responsible for clearing a throttle flag on tasks, could be pending on the just disabled CPU. This could lead to a situation that the throttled tasks were never scheduled to run. Consequently, if any of such tasks was needed to complete the CPU disabling, the system became unresponsive. This update introduces a new scheduler class, which gives a task the highest possible system priority and such a task cannot be throttled. The stop-task scheduling class is now used for the CPU stop tasks, and the system shutdown completes as expected in the scenario described.
- BZ#843163
- The previous implementation of socket buffers (SKBs) allocation for a NIC was node-aware, that is, memory was allocated on the node closest to the NIC. This increased performance of the system because all DMA transfer was handled locally. This was a good solution for networks with a lower frame transmitting rate where CPUs of the local node handled all the traffic of the single NIC well. However, when using 10Gb Ethernet devices, CPUs of one node usually do not handle all the traffic of a single NIC efficiently enough. Therefore, system performance was poor even though the DMA transfer was handled by the node local to the NIC. This update modifies the kernel to allow SKBs to be allocated on a node that runs applications receiving the traffic. This ensures that the NIC's traffic is handled by as many CPUs as needed, and since SKBs are accessed very frequently after allocation, the kernel can now operate much more efficiently even though the DMA can be transferred cross-node.
- BZ#872813
- Bug 768304 introduced a deadlock on the super block umount mutex. Consequently, when two processes attempted to mount an NFS file system at the same time they would block. This was because a backport mistake with one of the patches of bug 768304, which resulted in an imbalance between the mutex aquires and releases. Rather than just fix the imbalance, an upstream patch that the problem patch depended on was identified and backported so that the kernel code then matched the upstream code. The deadlock no longer occurs in this scenario.
- BZ#842881
- A kernel oops could occur due to a NULL pointer dereference upon USB device removal. The NULL pointer dereference has been fixed and the kernel no longer crashes in this scenario.
- BZ#842435
- When an NFSv4 client received a read delegation, a race between the OPEN and DELEGRETURN operation could occur. If the DELEGRETURN operation was processed first, the NFSv4 client treated the delegation returned by the following OPEN as a new delegation. Also, the NFSv4 client did not correctly handle errors caused by requests that used a bad or revoked delegation state ID. As a result, applications running on the client could receive spurious EIO errors. This update applies a series of patches that fix the NFSv4 code so an NFSv4 client recovers correctly in the described situations instead of returning errors to applications.
- BZ#842312
- Due to a missing return statement, the nfs_attr_use_mounted_on_file() function returned a wrong value. As a consequence, redundant ESTALE errors could potentially be returned. This update adds the proper return statement to nfs_attr_use_mounted_on_file(), thus preventing this bug. Note that this bug only affects NFSv4 file systems.
- BZ#841987
- Previously, soft interrupt requests (IRQs) under the bond_alb_xmit() function were locked even when the function contained soft IRQs that were disabled. This could cause a system to become unresponsive or terminate unexpectedly. With this update, such IRQs are no longer disabled, and the system no longer hangs or crashes in this scenario.
- BZ#873949
- Previously, the IP over Infiniband (IPoIB) driver maintained state information about neighbors on the network by attaching it to the core network's neighbor structure. However, due to a race condition between the freeing of the core network neighbor struct and the freeing of the IPoIB network struct, a use after free condition could happen, resulting in either a kernel oops or 4 or 8 bytes of kernel memory being zeroed when it was not supposed to be. These patches decouple the IPoIB neighbor struct from the core networking stack's neighbor struct so that there is no race between the freeing of one and the freeing of the other.
- BZ#874322
- Previously, XFS could, under certain circumstances, incorrectly read metadata from the journal during XFS log recovery. As a consequence, XFS log recovery terminated with an error message and prevented the file system from being mounted. This problem could result in a loss of data if the user forcibly "zeroed" the log to allow the file system to be mounted. This update ensures that metadata is read correctly from the log so that journal recovery completes successfully and the file system mounts as expected.
- BZ#748827
- If a dirty GFS2 inode was being deleted but was in use by another node, its metadata was not written out before GFS2 checked for dirty buffers in the gfs2_ail_flush() function. GFS2 was relying on the inode_go_sync() function to write out the metadata when the other node tried to free the file. However, this never happened because GFS2 failed the error check. With this update, the inode is written out before calling the gfs2_ail_flush() function. If a process has the PF_MEMALLOC flag set, it does not start a new transaction to update the access time when it writes out the inode. The inode is marked as dirty to make sure that the access time is updated later unless the inode is being freed.
- BZ#839973
- A USB Human Interface Device (HID) can be disconnected at any time. If this happened right before or while the hiddev_ioctl() call was in progress, hiddev_ioctl() attempted to access the invalid hiddev->hid pointer. When the HID device was disconnected, the hiddev_disconnect() function called the hid_device_release() function, which frees the hid_device structure type, but did not set the hiddev->hid pointer to NULL. If the deallocated memory region was re-used by the kernel, a kernel panic or memory corruption could occur. The hiddev->exist flag is now checked while holding the existancelock and hid_device is used only if such a device exists. As a result, the kernel no longer crashes in this scenario.
- BZ#839311
- The CONFIG_CFG80211_WEXT configuration option previously defined in the KConfig of the ipw2200 driver was removed with a recent update. This led to a build failure of the driver. The driver no longer depends on the CONFIG_CFG80211_WEXT option, so it can build successfully.
- BZ#875036
- The mmap_rnd() function is expected to return a value in the [0x00000000 .. 0x000FF000] range on 32-bit x86 systems. This behavior is used to randomize the base load address of shared libraries by a bug fix resolving the CVE-2012-1568 issue. However, due to a signedness bug, the mmap_rnd() function could return values outside of the intended scope. Consequently, the shared libraries base address could be less than one megabyte. This could cause binaries that use the MAP_FIXED mappings in the first megabyte of the process address space (typically, programs using vm86 functionality) to work incorrectly. This update modifies the mmap_rnd() function to no longer cast values returned by the get_random_int() function to the long data type. The aforementioned binaries now work correctly in this scenario.
- BZ#837607
- Due to an error in the dm-mirror driver, when using LVM mirrors on disks with discard support (typically SSD disks), repairing such disks caused the system to terminate unexpectedly. The error in the driver has been fixed and repairing disks with discard support is now successful.
- BZ#837230
- During the update of the be2net driver between the Red Hat Enterprise Linux 6.1 and 6.2, the NETIF_F_GRO flag was incorrectly removed, and the GRO (Generic Receive Offload) feature was therefore disabled by default. In OpenVZ kernels based on Red Hat Enterprise Linux 6.2, this led to a significant traffic decrease. To prevent this problem, the NETIF_F_GRO flag has been included in the underlying source code.
- BZ#875091
- Previously, the HP Smart Array driver (hpsa) used the target reset functionality. However, HP Smart Array logical drives do not support the target reset functionality. Therefore, if the target reset failed, the logical drive was taken offline with a file system error. The hpsa driver has been updated to use the LUN reset functionality instead of target reset, which is supported by these drives.
- BZ#765665
- A possible race between the n_tty_read() and reset_buffer_flags() functions could result in a NULL pointer dereference in the n_tty_read() function under certain circumstances. As a consequence, a kernel panic could have been triggered when interrupting a current task on a serial console. This update modifies the tty driver to use a spin lock to prevent functions from a parallel access to variables. A NULL pointer dereference causing a kernel panic can no longer occur in this scenario.
- BZ#769045
- Traffic to the NFS server could trigger a kernel oops in the svc_tcp_clear_pages() function. The source code has been modified, and the kernel oops no longer occurs in this scenario.
- BZ#836164
- Previously, reference counting was imbalanced in the slave add and remove paths for bonding. If a network interface controller (NIC) did not support the NETIF_F_HW_VLAN_FILTER flag, the bond_add_vlans_on_slave() and bond_del_vlans_on_slave() functions did not work properly, which could lead to a kernel panic if the VLAN module was removed while running. The underlying source code for adding and removing a slave and a VLAN has been revised and now also contains a common path, so that kernel crashes no kernel no longer occur in the described scenario.
- BZ#834764
- The bonding method for adding VLAN Identifiers (VIDs) did not always add the VID to a slave VLAN group. When the NETIF_F_HW_VLAN_FILTER flag was not set on a slave, the bonding module could not add new VIDs to it. This could cause networking problems and the system to be unreachable even if NIC messages did not indicate any problems. This update changes the bond VID add path to always add a new VID to the slaves (if the VID does not exist). This ensures that networking problems no longer occur in this scenario.
- BZ#783322
- Previously, after a crash, preparing to switch to the kdump kernel could in rare cases race with IRQ migration, causing a deadlock of the ioapic_lock variable. As a consequence, kdump became unresponsive. The race condition has been fixed, and switching to kdump no longer causes hangs in this scenario.
- BZ#834038
- Previously, futex operations on read-only (RO) memory maps did not work correctly. This broke workloads that had one or more reader processes performing the FUTEX_WAIT operation on a futex within a read-only shared file mapping and a writer process that had a writable mapping performing the FUTEX_WAKE operation. With this update, the FUTEX_WAKE operation is performed with a RO MAP_PRIVATE mapping, and is successfully awaken if another process updates the region of the underlying mapped file.
- BZ#833098
- When a device was registered to a bus, a race condition could occur between the device being added to the list of devices of the bus and binding the device to a driver. As a result, the device could already be bound to a driver which led to a warning and incorrect reference counting, and consequently to a kernel panic on device removal. To avoid the race condition, this update adds a check to identify an already bound device.
- BZ#832135
- Sometimes, the crypto allocation code could become unresponsive for 60 seconds or multiples thereof due to an incorrect notification mechanism. This could cause applications, like openswan, to become unresponsive. The notification mechanism has been improved to avoid such hangs.
- BZ#832009
- When a device is added to the system at runtime, the AMD IOMMU driver initializes the necessary data structures to handle translation for it. Previously, however, the per-device dma_ops structure types were not changed to point to the AMD IOMMU driver, so mapping was not performed and direct memory access (DMA) ended with the IO_PAGE_FAULT message. This consequently led to networking problems. With this update, the structure types point correctly to the AMD IOMMU driver, and networking works as expected when the AMD IOMMU driver is used.
- BZ#830716
- It is possible to receive data on multiple transports. Previously, however, data could be selectively acknowledged (SACKed) on a transport that had never received any data. This was against the SHOULD requirement in section 6.4 of the RFC 2960 standard. To comply with this standard, bundling of SACK operations is restricted to only those transports which have moved the ctsn of the association forward since the last sack. As a result, only outbound SACKs on a transport that has received a chunk since the last SACK are bundled.
- BZ#830209
- On ext4 file systems, when fallocate() failed to allocate blocks due to the ENOSPC condition (no space left on device) for a file larger than 4 GB, the size of the file became corrupted and, consequently, caused file system corruption. This was due to a missing cast operator in the "ext4_fallocate()" function. With this update, the underlying source code has been modified to address this issue, and file system corruption no longer occurs.
- BZ#829739
- Previously, on Fibre Channel hosts using the QLogic QLA2xxx driver, users could encounter error messages and long I/O outages during fabric faults. This was because the number of outstanding requests was hard-coded. With this update, the number of outstanding requests the driver keeps track of is based on the available resources instead of being hard-coded, which avoids the aforementioned problems.
- BZ#829211
- Previously introduced firmware files required for new Realtek chipsets contained an invalid prefix ("rtl_nic_") in the file names, for example "/lib/firmware/rtl_nic/rtl_nic_rtl8168d-1.fw". This update corrects these file names. For example, the aforementioned file is now correctly named "/lib/firmware/rtl_nic/rtl8168d-1.fw".
- BZ#829149
- Due to insufficient handling of a dead Input/Output Controller (IOC), the mpt2sas driver could fail Enhanced I/O Error Handling (EEH) recovery for certain PCI bus failures on 64-bit IBM PowerPC machines. With this update, when a dead IOC is detected, EEH recovery routine has more time to resolve the failure and the controller in a non-operational state is removed.
- BZ#828271
- USB Request Blocks (URBs) coming from user space were not allowed to have transfer buffers larger than an arbitrary maximum. This could lead to various problems; for example, attempting to redirect certain USB mass-storage devices could fail. To avoid such problems, programs are now allowed to submit URBs of any size; if there is not sufficient contiguous memory available, the submission fails with an ENOMEM error. In addition, to prevent programs from submitting a lot of small URBs and so using all the DMA-able kernel memory, this update also replaces the old limits on individual transfer buffers with a single global limit of 16MB on the total amount of memory in use by USB file system (usbfs).
- BZ#828065
- A race condition could occur due to incorrect locking scheme in the code for software RAID. Consequently, this could cause the mkfs utility to become unresponsive when creating an ext4 file system on software RAID5. This update introduces a locking scheme in the handle_stripe() function, which ensures that the race condition no longer occurs.
- BZ#826375
- Previously, using the e1000e driver could lead to a kernel panic. This was caused by a NULL pointer dereference that occurred if the adapter was being closed and reset simultaneously. The source code of the driver has been modified to address this problem, and kernel no longer crashes in this scenario.
- BZ#878204
- When a new rpc_task is created, the code takes a reference to rpc_cred and sets the task->tk_cred pointer to it. After the call completes, the resources held by the rpc_task are freed. Previously, however, after the rpc_cred was released, the pointer to it was not zeroed out. This led to an rpc_cred reference count underflow, and consequently to a kernel panic. With this update, the pointer to rpc_cred is correctly zeroed out, which prevents a kernel panic from occurring in this scenario.
- BZ#823822
- When removing a bonding module, the bonding driver uses code separate from the net device operations to clean up the VLAN code. Recent changes to the kernel introduced a bug which caused a kernel panic if the vlan module was removed after the bonding module had been removed. To fix this problem, the VLAN group removal operations found in the bonding kill_vid path are now duplicated in alternate paths which are used when removing a bonding module.
- BZ#823371
- When TCP segment offloading (TSO) or jumbo packets are used on the Broadcom BCM5719 network interface controller (NIC) with multiple TX rings, small packets can be starved for resources by the simple round-robin hardware scheduling of these TX rings, thus causing lower network performance. To ensure reasonable network performance for all NICs, multiple TX rings are now disabled by default.
- BZ#822651
- Previously, the default minimum entitled capacity of a virtual processor was 10%. This update changes the PowerPC architecture vector to support a lower minimum virtual processor capacity of 1%.
- BZ#821374
- On PowerPC architecture, the "top" utility displayed incorrect values for the CPU idle time, delays and workload. This was caused by a previous update that used jiffies for the I/O wait and idle time, but the change did not take into account that jiffies and CPU time are represented by different units. These differences are now taken into account, and the "top" utility displays correct values on PowerPC architecture.
- BZ#818172
- A bug in the writeback livelock avoidance scheme could result in some dirty data not being written to disk during a sync operation. In particular, this could occasionally occur at unmount time, when previously written file data was not synced, and was unavailable after the file system was remounted. Patches have been applied to address this issue, and all dirty file data is now synced to disk at unmount time.
- BZ#807704
- Previously, the TCP socket bound to NFS server contained a stale skb_hints socket buffer. Consequently, kernel could terminate unexpectedly. A patch has been provided to address this issue and skb_hints is now properly cleared from the socket, thus preventing this bug.
- BZ#814877
- Previously, bnx2x devices did not disable links with a large number of RX errors and overruns, and such links could still be detected as active. This prevented the bonding driver from failing over to a working link. This update restores remote-fault detection, which periodically checks for remote faults on the MAC layer. In case the physical link appears to be up but an error occurs, the link is disabled. Once the error is cleared, the link is brought up again.
- BZ#813137
- Various race conditions that led to indefinite log reservation hangs due to xfsaild "idle" mode occurred in XFS file system. This could lead to certain tasks being unresponsive; for example, the cp utility could become unresponsive on heavy workload. This update improves the Active Item List (AIL) pushing logic in xfsaild. Also, the log reservation algorithm and interactions with xfsaild have been improved. As a result, the aforementioned problems no longer occur in this scenario.
- BZ#811255
- The Out of Memory (OOM) killer killed processes outside of a memory cgroup when one or more processes inside that memory cgroup exceeded the "memory.limit_in_bytes" value. This was because when a copy-on-write fault happened on a Transparent Huge Page (THP), the 2 MB THP caused the cgroup to exceed the memory.limit_in_bytes value but the individual 4 KB page was not exceeded. With this update, the 2 MB THP is correctly split into 4 KB pages when the memory.limit_in_bytes value is exceeded. The OOM kill is delivered within the memory cgroup; tasks outside the memory cgroups are no longer killed by the OOM killer.
- BZ#812904
- This update blacklists the ADMA428M revision of the 2GB ATA Flash Disk device. This is due to data corruption occurring on the said device when the Ultra-DMA 66 transfer mode is used. When the "libata.force=5:pio0,6:pio0" kernel parameter is set, the aforementioned device works as expected.
- BZ#814044
- With certain switch peers and firmware, excessive link flaps could occur due to the way DCBX (Data Center Bridging Exchange) was handled. To prevent link flaps, changes were made to examine the capabilities in more detail and only initialize hardware if the capabilities have changed.
- BZ#865115
- If an abort request times out to the virtual Fibre Channel adapter, the ibmvfc driver initiates a reset of the adapter. Previously, however, the ibmvfc driver incorrectly returned success to the eh_abort handler and then sent a response to the same command, which led to a kernel oops on IBM System p machines. This update ensures that both the abort request and the request being aborted are completed prior to exiting the en_abort handler, and the kernel oops no longer occurs in this scenario.
- BZ#855906
- A kernel panic occurred when the size of a block device was changed and an I/O operation was issued at the same time. This was because the direct and non-direct I/O code was written with the assumption that the block size would not change. This update introduces a new read-write lock, bd_block_size_semaphore. The lock is taken for read during I/O operations and for write when changing the block size of a device. As a result, block size cannot be changed while I/O is being submitted. This prevents the kernel from crashing in the described scenario.
- BZ#883643
- The bonding driver previously did not honor the maximum Generic Segmentation Offload (GSO) length of packets and segments requested by the underlying network interface. This caused the firmware of the underlying NIC, such as be2net, to become unresponsive. This update modifies the bonding driver to set up the lowest gso_max_size and gso_max_segs values of network devices while attaching and detaching the devices as slaves. The network drivers no longer hangs and network traffic now proceeds as expected in setups using a bonding interface.
- BZ#855131
- In Fibre Channel fabrics with large zones, the automatic port rescan on incoming Extended Link Service (ELS) frames and any adapter recovery could cause high traffic, in particular if many Linux instances shared a host bus adapter (HBA), which is common on IBM System z architecture. This could lead to various failures; for example, names server requests, port or adapter recovery could fail. With this update, ports are re-scanned only when setting an adapter online or on manual user-triggered writes to the sysfs attribute "port_rescan".
- BZ#824964
- A deadlock sometimes occurred between the dlm_controld daemon closing a lowcomms connection through the configfs file system and the dlm_send process looking up the address for a new connection in configfs. With this update, the node addresses are saved within the lowcomms code so that the lowcomms work queue does not need to use configfs to get a node address.
- BZ#827031
- On Intel systems with Pause Loop Exiting (PLE), or AMD systems with Pause Filtering (PF), it was possible for larger multi-CPU KVM guests to experience slowdowns and soft lock-ups. Due to a boundary condition in kvm_vcpu_on_spin, all the VCPUs could try to yield to VCPU0, causing contention on the run queue lock of the physical CPU where the guest's VCPU0 is running. This update eliminates the boundary condition in kvm_vcpu_on_spin.
- BZ#796352
- On Red Hat Enterprise Linux 6, mounting an NFS export from a Windows 2012 server failed due to the fact that the Windows server contains support for the minor version 1 (v4.1) of the NFS version 4 protocol only, along with support for versions 2 and 3. The lack of the minor version 0 (v4.0) support caused Red Hat Enterprise Linux 6 clients to fail instead of rolling back to version 3 as expected. This update fixes this bug and mounting an NFS export works as expected.
- BZ#832575
- Previously, the size of the multicast IGMP (Internet Group Management Protocol) snooping hash table for a bridge was limited to 256 entries even though the maximum is 512. This was due to the hash table size being incorrectly compared to the maximum hash table size, hash_max, and the following message could have been produced by the kernel:
Multicast hash table maximum reached, disabling snooping: vnet1, 512
With this update, the hash table value is correctly compared to the hash_max value, and the error message no longer occurs under these circumstances. - BZ#834185
- The xmit packet size was previously 64K, exceeding the hardware capability of the be2net card because the size did not account for the Ethernet header. The adapter was therefore unable to process xmit requests exceeding this size, produced error messages and could become unresponsive. To prevent these problems, GSO (Generic Segmentation Offload) maximum size has been reduced to account for the Ethernet header.
- BZ#835797
- Signed-unsigned values comparison could under certain circumstances lead to a superfluous reshed_task() routine to be called, causing several unnecessary cycles in the scheduler. This problem has been fixed, preventing the unnecessary cycles in the scheduler.
- BZ#838025
- When using virtualization with the netconsole module configured over the main system bridge, guests could not be added to the bridge, because TAP interfaces did not support netpoll. This update adds support of netpoll to the TUN/TAP interfaces so that bridge devices in virtualization setups can use netconsole.
- BZ#838640
- In the ext4 file system, splitting an unwritten extent while using Direct I/O could fail to mark the modified extent as dirty, resulting in multiple extents claiming to map the same block. This could lead to the kernel or fsck reporting errors due to multiply claimed blocks being detected in certain inodes. In the ext4_split_unwritten_extents() function used for Direct I/O, the buffer which contains the modified extent is now properly marked as dirty in all cases. Errors due to multiply claimed blocks in inodes should no longer occur for applications using Direct I/O.
- BZ#839266
- When the netconsole module was configured over bridge and the "service network restart" command was executed, a deadlock could occur, resulting in a kernel panic. This was caused by recursive rtnl locking by both bridge and netconsole code during network interface unregistration. With this update, the rtnl lock usage is fixed, and the kernel no longer crashes in this scenario.
- BZ#756044
- Migrating virtual machines from Intel hosts that supported the VMX "Unrestricted Guest" feature to older hosts without this feature could result in kvm returning the "unhandled exit 80000021" error for guests in real mode. The underlying source code has been modified so that migration completes successfully on hosts where "Unrestricted Guest" is disabled or not supported.
- BZ#843849
- The kernel contains a rule to blacklist direct memory access (DMA) modes for "2GB ATA Flash Disk" devices. However, this device ID string did not contain a space at the beginning of the name. Due to this, the rule failed to match the device and failed to disable DMA modes. With this update, the string correctly reads " 2GB ATA Flash Disk", and the rule can be matched as expected.
Enhancements
Note
procfs
entries, sysfs
default values, boot parameters, kernel configuration options, or any noticeable behavior changes, refer to Chapter 1, Important Changes to External Kernel Parameters.
- BZ#872799
- The INET socket interface has been modified to send a warning message when the ip_options structure is allocated directly by a third-party module using the kmalloc() function.
- BZ#823010
- The z90crypt device driver has been updated to support the new Crypto Express 4 (CEX4) adapter card.
- BZ#586028
- This update adds the ability to use InfiniBand's Queue Pair (QP) interface under KVM. The QP interface can be exported to a KVM guest.
- BZ#795598
- With this update, it possible to adjust the TCP initial receive window, using the "initrwnd" iproute setting, on a per-route basis.
- BZ#831623
- A new "route_localnet" interface option has been added, which enables routing of addresses within the 127.0.0.0/8 block.
- BZ#847998
- With this update, a warning message is logged when a storage device reports a certain SCSI Unit Attention code.
Security Fix
- CVE-2013-2094, Important
- It was found that the Red Hat Enterprise Linux 6.1 kernel update (RHSA-2011:0542) introduced an integer conversion issue in the Linux kernel's Performance Events implementation. This led to a user-supplied index into the perf_swevent_enabled array not being validated properly, resulting in out-of-bounds kernel memory access. A local, unprivileged user could use this flaw to escalate their privileges.
Security Fix
- CVE-2013-0871, Important
- A race condition was found in the way the Linux kernel's ptrace implementation handled PTRACE_SETREGS requests when the debuggee was woken due to a SIGKILL signal instead of being stopped. A local, unprivileged user could use this flaw to escalate their privileges.
7.103.14. RHBA-2013:0093 — kernel bug fix update
Bug Fixes
- BZ#1012275
- A bug in the kernel's memory management code allowed the callback routine to call the
find_vma()
function without acquiring the memory map semaphore when reading from the/proc/<pid>/pagemap
file. This could trigger a kernel panic or lead to another system failure. A set of patches has been applied to address this problem so that memory map semaphore is now held when walking through the memory page map. Moreover, this update fix a bug that allowed exceeding of the page mid-level directory (PMD) boundary between memory pages, which could cause various problems. The memory management code now handles reading of memory page map as expected. - BZ#1021813
- As a result of a recent fix preventing a deadlock upon an attempt to cover an active
XFS
log, the behavior of thexfs_log_need_covered()
function has changed. However,xfs_log_need_covered()
is also called to ensure that theXFS
log tail is correctly updated as a part of theXFS
journal sync operation. As a consequence, when shutting down anXFS
file system, the sync operation failed and some files might have been lost. A patch has been applied to ensure that the tail of the XFS log is updated by logging a dummy record to theXFS
journal. The sync operation completes successfully and files are properly written to the disk in this situation. - BZ#1026622
- When the Wacom Intuous 4 tablet was attached to the system using GNOME, and writing to the LED control occurred before the input device was opened, a bug that set the tablet to an incorrect mode could have been triggered. Consequently, after hot plugging the tablet or logging into the system, the Wacom Intuous 4 tablet could stop functioning correctly and was no longer detected by the Wacom control panel. A patch has been applied to fix this problem by correctly marking get requests as input requests in the
wacom_get_report()
function. The Wacom Intuous 4 tablets now work as expected in the described scenario. - BZ#1028171
- A change introduced in an earlier kernel version caused the
d_mountpoint()
function to return a different value than before, although the semantics of the function stayed the same. Consequently, certain third-party kernel modules, such as the Veritas file system (VxFS
) module, could stop working properly. A patch has been applied, ensuring that thed_mountpoint()
function returns expected values. - BZ#1029091
- The stripe_width is the stripe device length divided by the number of stripes. However, previously, the stripe_width was not being calculated properly. As a consequence, device topologies that had previously worked correctly with the stripe target were no longer considered valid. In particular, there was a higher risk of encountering this problem if one of the stripe devices had a 4K logical block size. As a result, the following error message could be displayed:
device-mapper: table: 253:4: len=21845 not aligned to h/w logical block size 4096 of dm-1
This update fixes the stripe_width calculation error, and device topologies work as expected in this scenario. - BZ#1029425
- The kernel's thread helper previously used larvals of the request threads without holding a reference count. This could result in a NULL pointer dereference and a subsequent kernel panic if the helper thread completed after the larval had been destroyed upon the request thread exiting. With this update, the helper thread holds a reference count on the request threads larvals so that a NULL pointer dereference is now avoided.
- BZ#1029731
- When printing information about SCSI devices on systems with a very large number (more than about 1600) of SCSI devices, intermittent page allocation errors occurred, and the
cat
command failed with an ENOMEM error. This was caused by the show routine that iterated over all SCSI devices and attempted to dump information about all of them into the buffer at the same time. This update modifies the SCSI driver to define its own seq_file operations to iterate over the SCSI devices. As a result, each/proc/scsi/scsi
show()
operation only dumps approximately 180 bytes into the buffer at a time, and the error no longer occurs. - BZ#1029860
- Under high memory pressure, the
shrink_zone()
function could complete an iteration without reclaiming or even scanning any pages, in which case it should give up. The latter was detected by comparing the sc->nr_scanned pointer against a snapshot taken at the start ofshrink_zone()
, instead of at the start of the iteration. This meantshrink_zone()
could enter an indefinite loop if it scanned some pages in one iteration, but failed on subsequent iterations due to memory pressure. A patch has been applied to fix this bug, preventing the indefinite loop. - BZ#1029904
- Due to a bug in the SELinux Makefile, a kernel compilation could fail when the
-j
option was specified to perform the compilation with multiple parallel jobs. This happened because SELinux expected the existence of an automatically generated file,flask.h
, prior to the compiling of some dependent files. The Makefile has been corrected and theflask.h
dependency now applies to all objects from theselinux-y
list. The parallel compilation of the kernel now succeeds as expected. - BZ#1030169
- A previous change in the NFSv4 code resulted in breaking the sync NFSv4 mount option. A patch has been applied that restores functionality of the sync mount option.
- BZ#1032160
- When performing I/O operations on a heavily-fragmented
GFS2
file system, significant performance degradation could occur. This was caused by the allocation strategy that GFS2 used to search for an ideal contiguous chunk of free blocks in all the available resource groups. A series of patches has been applied that improves performance of GFS2 file systems in case of heavy fragmentation. GFS2 now allocates the biggest extent found in the resource groups if it fulfills the minimum requirements. GFS2 has also reduced the amount of bitmap searching in case of multi-block reservations by keeping track of the smallest extent for which the multi-block reservation would fail in the given resource groups. This improves GFS2 performance by avoiding unnecessary resource groups free block searches that would fail. Additionally, this patch series fixes a bug in the GFS2 block allocation code where a multi-block reservation was not properly removed from the resource groups' reservation tree when it was disqualified, which eventually triggered a BUG_ON() macro due to an incorrect count of reserved blocks. - BZ#1032165
- An earlier patch to the kernel added the dynamic queue depth throttling functionality to the QLogic's qla2xxx driver that allowed the driver to adjust queue depth for attached SCSI devices. However, the kernel might have crashed when having this functionality enabled in certain environments, such as on systems with EMC PowerPath Multipathing installed that were under heavy I/O load. To resolve this problem, the dynamic queue depth throttling functionality has been removed from the qla2xxx driver.
- BZ#1032248
- As a result of a recent fix preventing a deadlock upon an attempt to cover an active
XFS
log, the behavior of thexfs_log_need_covered()
function has changed. However,xfs_log_need_covered()
is also called to ensure that the XFS log tail is correctly updated as a part of the XFS journal sync operation. As a consequence, when shutting down an XFS file system, the sync operation failed and some files might have been lost. A patch has been applied to ensure that the tail of the XFS log is updated by logging a dummy record to the XFS journal. The sync operation completes successfully and files are properly written to the disk in this situation. - BZ#1032394
- Due to a bug in the mlx4 driver, Mellanox Ethernet cards were brought down unexpectedly while adjusting their Transmission (Tx) or Reception (Rx) ring. A patch has been applied so that the mlx4 driver now properly verifies the state of the Ethernet card when the coalescing of the Tx or Rx ring is being set, which resolves this problem.
- BZ#1036775
- During the probe operations on the Broadcom 5717 and later devices, the tg3 driver was incorrectly switching the devices to a low-power mode. Consequently, some ports on the certain device slots were not recognized. With this update, the tg3 driver no longer resumes a low-power mode when probing the aforementioned devices.
- BZ#1038934
- Certain storage device or storage environment failures could cause all SCSI commands and task management functions that were sent to a SCSI target to time out, without any other indication of an error. As a consequence, the Linux SCSI error handling code stopped issuing any I/O operations on the entire host bus adapter (HBA) until the recovery operations completed. Additionally, when using DM Multipath, I/O operations did not fail over to a working path in this situation. To resolve this problem, a new
sysfs
parameter,eh_deadline
, has been added to the SCSI host object. This parameter allows to set the maximum amount of time for which the SCSI error handling attempts to perform error recovery before resetting the entire HBA adapter. This timeout is disabled by default. The default value of this timeout can be reset for all SCSI HBA adapters on the system using theeh_deadline
parameter. The described scenario no longer occurs ifeh_deadline
is properly used. - BZ#1040825
- Due to several bugs in the IPv6 code, a soft lockup could occur when the number of cached IPv6 destination entries reached the garbage collector treshold on a high-traffic router. A series of patches has been applied to address this problem. These patches ensure that the route probing is performed asynchronously to prevent a dead lock with garbage collection. Also, the garbage collector is now run asynchronously, preventing CPUs that concurrently requested the garbage collector from waiting until all other CPUs finish the garbage collection. As a result, soft lockups no longer occur in the described situation.
- BZ#1042733
- The RPC client always retransmitted zero-copy of the page data if it timed out before the first RPC transmission completed. However, such a retransmission could cause data corruption if using the O_DIRECT buffer and the first RPC call completed while the respective TCP socket still held a reference to the pages. To prevent the data corruption, retransmission of the RPC call is, in this situation, performed using the
sendmsg()
function. Thesendmsg()
function retransmits an authentic reproduction of the first RPC transmission because the TCP socket holds the full copy of the page data.