Chapter 32. Using Advanced Error Reporting
When you use Advanced Error Reporting (AER), you receive notifications of error events for Peripheral Component Interconnect Express (PCIe) devices. RHEL enables this kernel feature by default and collects the reported errors in the kernel logs. If you use the rasdaemon program, these errors are parsed and stored in its database.
32.1. Overview of AER
Advanced Error Reporting (AER) is a kernel feature that provides enhanced error reporting for Peripheral Component Interconnect Express (PCIe) devices. The AER kernel driver attaches to root ports that support the PCIe AER capability in order to:
- Gather comprehensive error information
- Report errors to users
- Perform error recovery actions
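One way to see which devices advertise the capability that the driver looks for is to scan the verbose lspci listing. The following is a minimal sketch, not part of the documented procedure: the function name aer_capable_devices is hypothetical, and it assumes that lspci (from the pciutils package) labels the capability as "Advanced Error Reporting" and starts each device header in the first column.

```shell
# Hypothetical helper: print the header line of every device whose
# capability list mentions AER. Reads `lspci -vv` style text on stdin.
aer_capable_devices() {
    awk '
        /^[0-9a-f]/                { header = $0 }     # device header starts in column 0
        /Advanced Error Reporting/ { print header }    # indented capability line
    '
}

# Typical use, as root so that lspci can read the extended capabilities:
#   lspci -vv | aer_capable_devices
```

Reading from standard input keeps the helper usable on a saved listing from another machine as well as on live output.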
When AER captures an error, it sends an error message to the console. For a repairable error, the console output is a warning.
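Because the messages also land in the kernel logs, they can be reviewed after the fact. The sketch below filters kernel log text for lines that mention AER. The function name aer_lines is hypothetical, and the search pattern is an assumption about how the messages are labeled on your kernel; adjust it if your output differs.

```shell
# Hypothetical helper: keep only the lines that mention AER
# (case-insensitive, whole word). Reads log text on stdin.
aer_lines() {
    grep -iw 'aer'
}

# Typical use, as root:
#   journalctl -k --no-pager | aer_lines
#   journalctl -k -p warning --no-pager | aer_lines   # warnings and more severe only
```

The second form relies on the journalctl -p option, which limits the output to the given priority and anything more severe.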
Example 32.1. Example AER output
32.2. Collecting and displaying AER messages
To collect and display AER messages, use the rasdaemon program.
Procedure
1. Install the rasdaemon package.

   # yum install rasdaemon

2. Enable and start the rasdaemon service.

   # systemctl enable --now rasdaemon
   Created symlink /etc/systemd/system/multi-user.target.wants/rasdaemon.service → /usr/lib/systemd/system/rasdaemon.service.

3. Issue the ras-mc-ctl command.

   # ras-mc-ctl --summary
   # ras-mc-ctl --errors

   The command displays a summary of the logged errors (the --summary option) or displays the errors stored in the error database (the --errors option).
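If you want both reports in one pass, the two ras-mc-ctl calls can be wrapped in a small function. This is a minimal sketch, not part of the documented procedure: the function name show_aer_reports and its optional argument are hypothetical, and only the --summary and --errors options shown in the procedure are used.

```shell
# Hypothetical helper: run the summary and error reports back to back.
# The optional first argument replaces the command, which allows a dry run.
show_aer_reports() {
    ctl=${1:-ras-mc-ctl}
    for option in --summary --errors; do
        printf '== %s %s ==\n' "$ctl" "$option"
        "$ctl" "$option" || return 1    # stop at the first failing report
    done
}

# Typical use, as root, only when the service is running:
#   systemctl is-active --quiet rasdaemon && show_aer_reports
```

Passing echo as the argument prints the commands that would run without querying the database.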