5.13. kernel
The Kernel
- NUMA class systems should not be booted with a single memory node configuration. Configuration of single node NUMA systems will result in contention for the memory resources on all of the non-local memory nodes. As only one node will have local memory the CPUs on that single node will starve the remaining CPUs for memory allocations, locks, and any kernel data structure access. This contention will lead to the "CPU#n stuck for 10s!" error messages. This configuration can also result in NMI watchdog timeout panics if a spinlock is acquired via spinlock_irq() and held for more than 60 seconds. The system can also hang for indeterminate lengths of timeTo minimize this problem NUMA class systems need to have their memory evenly distributed between nodes. NUMA information can be obtained from dmesg output as well as from the numastat command.
- When upgrading from Red Hat Enterprise Linux 5.0, 5.1 or 5.2 to more recent releases, the gfs2-kmod may still be installed on the system. This package must be manually removed or it will override the (newer) version of GFS2 which is built into the kernel. Do not install the
gfs2-kmod
package on later versions of Red Hat Enterprise Linux.gfs2-kmod
is not required since GFS2 is built into the kernel from 5.3 onwards. The content of the gfs2-kmod package is considered a Technology Preview of GFS2, and has not received any updates since Red Hat Enterprise Linux 5.3 was released.Note that this note only applies to GFS2 and not to GFS, for which the gfs-kmod package continues to be the only method of obtaining the required kernel module. - Issues might be encountered on a system with 8Gb/s LPe1200x HBAs and firmware version 2.00a3 when the Red Hat Enterprise Linux 5.6 kernel is used with the in-box LPFC driver. Such issues include loss of LUNs and/or fiber channel host hangs during fabric faults with multipathing.To work around these issues, it is recommended to either:
- Downgrade the firmware revision of the 8Gb/s LPe1200x HBA to revision 1.11a5, or
- Modify the LPFC driver’s
lpfc_enable_npiv
module parameter to zero.When loading the LPFC driver from the initrd image (i.e. at system boot time), add the lineoptions lpfc_enable_npiv=0
to/etc/modprobe.conf
and re-build the initrd image.When loading the LPFC driver dynamically, include thelpfc_enable_npiv=0
option in the insmod or modprobe command line.
For additional information on how to set the LPFC driver module parameters, refer to the Emulex Drivers for Linux User Manual. - If AMD IOMMU is enabled in BIOS on ProLiant DL165 G7 systems, the system will reboot automatically when IOMMU attempts to initalize. To work around this issue, either disable IOMMU, or update the BIOS to version
2010.09.06
or later. - As of Red Hat Enterprise Linux 5.6 the ext4 file system is fully supported. However, provisioning ext4 file systems with the anaconda installer is not supported, and ext4 file systems need to be provisioned manually after the installation.
- In some cases the NFS server fails to notify NFSv4 clients about renames and unlinks done by other clients, or by non-NFS users of the server. An application on a client may then be able to open the file at its old pathname (and read old cached data from it, and perform read locks on it), long after the file no longer exists at that pathname on the server.To work around this issue, use NFSv3 instead of NFSv4. Alternatively, turn off support for leases by writing
0
to/proc/sys/fs/leases-enable
(ideally on boot, before the nfs server is started). This change prevents NFSv4 delegations from being given out, restore correctness at the expense of some performance. - Some laptops may generate continuous events in response to the lid being shut. Consequently, the gnome-power-manager utility will consume CPU resources as it responds to each event.
- A kernel panic may be triggered by the lpfc driver when multiple Emulex OneConnect Universal Converged Network Adapter initiators are included in the same Storage Area Network (SAN) zone. Typically, this kernel panic will present after a cable is pulled or one of the systems is rebooted. To work around this issue, configure the SAN to use single initiator zoning. (BZ#574858)
- If a Huawei USB modem is unplugged from a system, the device may not be detected when it is attached again. To work around this issue, the usbserial and usb-storage driver modules need to be reloaded, allowing the system to detect the device. Alternatively, the if the system is rebooted, the modem will be detected also. (BZ#517454)
- Memory on-line is not currently supported with the Boxboro-EX platform. (BZ#515299)
- Unloading a PF (SR-IOV Physical function) driver from a host when a guest is using a VF (virtual function) from that device can cause a host crash. A PF driver for an SR-IOV device should not be unloaded until after all guest virtual machines with assigned VFs from that SR-IOV device have terminated. (BZ#514360)
- Data corruption on NFS filesystems might be encountered on network adapters without support for error-correcting code (ECC) memory that also have TCP segmentation offloading (TSO) enabled in the driver. Note: data that might be corrupted by the sender still passes the checksum performed by the IP stack of the receiving machine A possible work around to this issue is to disable TSO on network adapters that do not support ECC memory. BZ#504811
- After installation, a System z machine with a large number of memory and CPUs (e.g. 16 CPU's and 200GB of memory) might may fail to IPL. To work around this issue, change the line
ramdisk=/boot/initrd-2.6.18-<kernel-version-number>.el5.img
toramdisk=/boot/initrd-2.6.18-<kernel-version-number>.el5.img,0x02000000
The commandzipl -V
should now show0x02000000
as the starting address for the inital RAM disk (initrd). Stop the logigal partiton (LPAR), and then manually increase the the storage size of the LPAR. - On certain hardware configurations the kernel may panic when the Broadcom iSCSI offload driver (
bnx2i.ko
andcnic.ko
) is loaded. To work around this do not manually load the bnx2i or cnic modules, and temporarily disable theiscsi
service from starting. To disable the iscsi service, runchkconfig --del iscsi chkconfig --del iscsid
On the first boot of your system, theiscsi
service may start automatically. To bypass this, during bootup, enter interactive start up and stop the iscsi service from starting. - In Red Hat Enterprise Linux 5, invoking the kernel system call "setpriority()" with a "which" parameter of type "PRIO_PROCESS" does not set the priority of child threads. (BZ#472251)
- Physical CPUs cannot be safely placed offline or online when the 'kvm_intel' or 'kvm_amd' module is loaded. This precludes physical CPU offline and online operations when KVM guests that utilize processor virtualization support are running. It also precludes physical CPU offline and online operations without KVM guests running when the 'kvm_intel' or 'kvm_amd' module is simply loaded and not being used.If the kmod-kvm package is installed, the 'kvm_intel' or 'kvm_amd' module automatically loads during boot on some systems. If a physical CPU is placed offline while the 'kvm_intel' or 'kvm_amd' module is loaded a subsequent attempt to online that CPU may fail with an I/O error.To work around this issue, unload the 'kvm_intel' or 'kvm_amd' before performing physical CPU hot-plug operations. It may be necessary to shut down KVM guests before the 'kvm_intel' or 'kvm_amd' will successfully unload.For example, to offline a physical CPU 6 on an Intel based system:
# rmmod kvm_intel # echo 0 > /sys/devices/system/cpu/cpu6/online # modprobe kvm_intel
- A change to the cciss driver in Red Hat Enterprise Linux 5.4 made it incompatible with the "echo disk > /sys/power/state" suspend-to-disk operation. Consequently, the system will not suspend properly, returning messages such as:
Stopping tasks: ====================================================================== stopping tasks timed out after 20 seconds (1 tasks remaining): cciss_scan00 Restarting tasks...<6> Strange, cciss_scan00 not stopped done
(BZ#513472) - The kernel is unable to properly detect whether there is media present in a CD-ROM drive during kickstart installs. The function to check the presence of media incorrectly interprets the "logical unit is becoming ready" sense, returning that the drive is ready when it is not. To work around this issue, wait several seconds between inserting a CD and asking the installer (anaconda) to refresh the CD. (BZ#510632)
- Applications attempting to
malloc
memory approximately larger than the size of the physical memory on the node on a NUMA system may hang or appear to stall. This issue may occur on a NUMA system where the remote memory distance, as defined in SLIT, is greater than 20 and RAM based filesystem liketmpfs
orramfs
is mounted.To work around this issue, unmount all RAM based filesystems (i.e. tmpfs or ramfs). If unmounting the RAM based filesystems is not possible, modify the application to allocate lesser memory. Finally, if modifying the application is not possible, disable NUMA memory reclaim by running:sysctl vm.zone_reclaim_mode=0
Important
Turning NUMA reclaim negatively effects the overall throughput of the system. - Configuring IRQ SMP affinity has no effect on some devices that use message signalled interrupts (MSI) with no MSI per-vector masking capability. Examples of such devices include Broadcom NetXtreme Ethernet devices that use the
bnx2
driver.If you need to configure IRQ affinity for such a device, disable MSI by creating a file in/etc/modprobe.d/
containing the following line:options bnx2 disable_msi=1
Alternatively, you can disable MSI completely using the kernel boot parameterpci=nomsi
. (BZ#432451) - The
smartctl
tool cannot properly read SMART parameters from SATA devices. (BZ#429606) - IBM T60 laptops will power off completely when suspended and plugged into a docking station. To avoid this, boot the system with the argument
acpi_sleep=s3_bios
. (BZ#439006) - The QLogic iSCSI Expansion Card for the IBM Bladecenter provides both ethernet and iSCSI functions. Some parts on the card are shared by both functions. However, the current
qla3xxx
andqla4xxx
drivers support ethernet and iSCSI functions individually. Both drivers do not support the use of ethernet and iSCSI functions simultaneously.Because of this limitation, successive resets (via consecutiveifdown
/ifup
commands) may hang the device. To avoid this, allow a 10-second interval after anifup
before issuing anifdown
. Also, allow the same 10-second interval after anifdown
before issuing anifup
. This interval allows ample time to stabilize and re-initialize all functions when anifup
is issued. (BZ#276891) - Laptops equipped with the Cisco Aironet MPI-350 wireless may hang trying to get a DHCP address during any network-based installation using the wired ethernet port.To work around this, use local media for your installation. Alternatively, you can disable the wireless card in the laptop BIOS prior to installation (you can re-enable the wireless card after completing the installation). (BZ#213262)
- Hardware testing for the Mellanox MT25204 has revealed that an internal error occurs under certain high-load conditions. When the
ib_mthca
driver reports a catastrophic error on this hardware, it is usually related to an insufficient completion queue depth relative to the number of outstanding work requests generated by the user application.Although the driver will reset the hardware and recover from such an event, all existing connections at the time of the error will be lost. This generally results in a segmentation fault in the user application. Further, ifopensm
is running at the time the error occurs, then you need to manually restart it in order to resume proper operation. (BZ#251934) - The IBM T41 laptop model does not enter properly; as such, will still consume battery life as normal. This is because Red Hat Enterprise Linux 5 does not yet include the
radeonfb
module.To work around this, add a script namedhal-system-power-suspend
to/usr/share/hal/scripts/
containing the following lines:chvt 1 radeontool light off radeontool dac off
This script will ensure that the IBM T41 laptop enters properly. To ensure that the system resumes normal operations properly, add the scriptrestore-after-standby
to the same directory as well, containing the following lines:radeontool dac on radeontool light on chvt 7
- If the
edac
module is loaded, BIOS memory reporting will not work. This is because theedac
module clears the register that the BIOS uses for reporting memory errors.The current Red Hat Enterprise Linux Driver Update Model instructs the kernel to load all available modules (including theedac
module) by default. If you wish to ensure BIOS memory reporting on your system, you need to manually blacklist theedac
modules. To do so, add the following lines to/etc/modprobe.conf
:blacklist edac_mc blacklist i5000_edac blacklist i3000_edac blacklist e752x_edac
- Due to outstanding driver issues with hardware encryption acceleration, users of Intel WiFi Link 4965, 5100, 5150, 5300, and 5350 wireless cards are advised to disable hardware accelerated encryption using module parameters. Failure to do so may result in the inability to connect to Wired Equivalent Privacy (WEP) protected wireless networks after connecting to WiFi Protected Access (WPA) protected wireless networks.To do so, add the following options to
/etc/modprobe.conf
:alias wlan0 iwlagn options iwlagn swcrypto50=1 swcrypto=1
(where wlan0 is the default interface name of the first Intel WiFi Link device)
The following note applies to PowerPC Architectures:
- The size of the PPC kernel image is too large for OpenFirmware to support. Consequently, network booting will fail, resulting in the following error message:
Please wait, loading kernel... /pci@8000000f8000000/ide@4,1/disk@0:2,vmlinux-anaconda: No such file or directory boot:
To work around this:- Boot to the OpenFirmware prompt, by pressing the '8' key when the IBM splash screen is displayed.
- Run the following command:
setenv real-base 2000000
- Boot into System Managment Services (SMS) with the command:
0> dev /packages/gui obe