Chapter 15. Checking and repairing a file system
RHEL provides file system administration utilities which are capable of checking and repairing file systems. These tools are often referred to as fsck tools, where fsck is a shortened version of file system check. In most cases, these utilities are run automatically during system boot, if needed, but can also be manually invoked if required.
File system checkers guarantee only metadata consistency across the file system. They have no awareness of the actual data contained within the file system and are not data recovery tools.
15.1. Scenarios that require a file system check
The relevant fsck tools can be used to check your system if any of the following occurs:
- System fails to boot
- Files on a specific disk become corrupt
- The file system shuts down or changes to read-only due to inconsistencies
- A file on the file system is inaccessible
File system inconsistencies can occur for various reasons, including but not limited to hardware errors, storage administration errors, and software bugs.
File system check tools cannot repair hardware problems. A file system must be fully readable and writable if repair is to operate successfully. If a file system was corrupted due to a hardware error, the file system must first be moved to a good disk, for example with the dd(8) utility.
For journaling file systems, all that is normally required at boot time is to replay the journal, which is usually a very short operation.
However, if a file system inconsistency or corruption occurs, even for journaling file systems, then the file system checker must be used to repair the file system.
It is possible to disable file system check at boot by setting the sixth field in /etc/fstab to 0. However, Red Hat does not recommend doing so unless you are having issues with fsck at boot time, for example with extremely large or remote file systems.
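To see which entries currently skip the boot-time check, you can inspect the sixth field of each /etc/fstab entry. The following is a minimal sketch: the function name list_fsck_disabled is hypothetical, and it assumes, as described in fstab(5), that a missing sixth field is treated the same as 0.

```shell
# Print the mount point of every fstab entry whose sixth field
# (fs_passno) is 0 or absent, meaning fsck skips it at boot.
# list_fsck_disabled is a hypothetical helper name, not a RHEL tool.
list_fsck_disabled() {
    fstab="${1:-/etc/fstab}"
    # Skip comment and blank lines; $2 is the mount point, $6 the pass number.
    awk '!/^[[:space:]]*#/ && NF >= 2 && ($6 == "" || $6 == 0) { print $2 }' "$fstab"
}

# Typical use:
#   list_fsck_disabled /etc/fstab
```

The function only reads the file, so it is safe to run on a production system before deciding whether to change any entry.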
Additional resources
- fstab(5), fsck(8), and dd(8) man pages on your system
15.2. Potential side effects of running fsck
Generally, running the file system check and repair tool can be expected to automatically repair at least some of the inconsistencies it finds. In some cases, the following issues can arise:
- Severely damaged inodes or directories may be discarded if they cannot be repaired.
- Significant changes to the file system may occur.
To prevent unexpected or undesirable changes from becoming permanent, follow any precautionary steps outlined in the procedure.
15.3. Error-handling mechanisms in XFS
This section describes how XFS handles various kinds of errors in the file system.
Unclean unmounts
Journaling maintains a transactional record of metadata changes that happen on the file system.
In the event of a system crash, power failure, or other unclean unmount, XFS uses the journal (also called log) to recover the file system. The kernel performs journal recovery when mounting the XFS file system.
Corruption
In this context, corruption means errors on the file system caused by, for example:
- Hardware faults
- Bugs in storage firmware, device drivers, the software stack, or the file system itself
- Problems that cause parts of the file system to be overwritten by something outside of the file system
When XFS detects corruption in the file system or the file-system metadata, it may shut down the file system and report the incident in the system log. Note that if the corruption occurred on the file system hosting the /var directory, these logs will not be available after a reboot.
Example 15.1. System log entry reporting an XFS corruption
# dmesg --notime | tail -15
XFS (loop0): Mounting V5 Filesystem
XFS (loop0): Metadata CRC error detected at xfs_agi_read_verify+0xcb/0xf0 [xfs], xfs_agi block 0x2
XFS (loop0): Unmount and run xfs_repair
XFS (loop0): First 128 bytes of corrupted metadata buffer:
00000000027b3b56: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
000000005f9abc7a: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
000000005b0aef35: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00000000da9d2ded: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
000000001e265b07: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
000000006a40df69: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
000000000b272907: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00000000e484aac5: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
XFS (loop0): metadata I/O error in "xfs_trans_read_buf_map" at daddr 0x2 len 1 error 74
XFS (loop0): xfs_imap_lookup: xfs_ialloc_read_agi() returned error -117, agno 0
XFS (loop0): Failed to read root inode 0x80, error 11
User-space utilities usually report the Input/output error message when trying to access a corrupted XFS file system. Mounting an XFS file system with a corrupted log results in a failed mount and the following error message:
mount: /mount-point: mount(2) system call failed: Structure needs cleaning.
You must manually use the xfs_repair utility to repair the corruption.
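Because XFS reports detected corruption in the kernel log, a simple filter over that log can surface the relevant lines. The sketch below matches only message strings that appear in Example 15.1; the function name xfs_corruption_lines is hypothetical, and other kernel versions may word these messages differently.

```shell
# Read kernel log text on stdin and print only the XFS lines that
# match corruption messages shown in Example 15.1.
# xfs_corruption_lines is a hypothetical helper name.
xfs_corruption_lines() {
    grep -E '^XFS \(.*\): (Metadata CRC error detected|Unmount and run xfs_repair|metadata I/O error)'
}

# Typical use (requires permission to read the kernel log):
#   dmesg --notime | xfs_corruption_lines
```

The function returns a non-zero status when no line matches, which is the standard grep behavior, so a script can use it directly in an if condition.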
Additional resources
- xfs_repair(8) man page on your system
15.4. Checking an XFS file system with xfs_repair
Perform a read-only check of an XFS file system by using the xfs_repair utility. Unlike other file system repair utilities, xfs_repair does not run at boot time, even when an XFS file system was not cleanly unmounted. In case of an unclean unmount, XFS simply replays the log at mount time, ensuring a consistent file system; xfs_repair cannot repair an XFS file system with a dirty log without remounting it first.
Although an fsck.xfs binary is present in the xfsprogs package, this is present only to satisfy initscripts that look for an fsck.file-system binary at boot time. fsck.xfs immediately exits with an exit code of 0.
Procedure
Replay the log by mounting and unmounting the file system:
# mount file-system
# umount file-system
Note: If the mount fails with a Structure needs cleaning error, the log is corrupted and cannot be replayed. The dry run should discover and report more on-disk corruption as a result.
Use the xfs_repair utility to perform a dry run to check the file system. Any errors are printed, along with an indication of the actions that would be taken, without modifying the file system.
# xfs_repair -n block-device
Mount the file system:
# mount file-system
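The steps above can be chained in a small script. The following is a sketch, not a supported tool: the function name check_xfs and its two arguments are hypothetical, it must run as root, and it relies on the behavior documented in xfs_repair(8) that the -n option exits with status 1 when corruption is detected and 0 otherwise.

```shell
# Replay the log by mounting and unmounting, then run a read-only check.
# check_xfs is a hypothetical helper; both arguments are placeholders.
#   $1 = block device, $2 = mount point
check_xfs() {
    dev="$1"
    mnt="$2"
    if mount "$dev" "$mnt"; then
        umount "$mnt" || return 2
    else
        # A failed mount can mean the log is corrupted and cannot be replayed.
        echo "mount failed; continuing with the dry run" >&2
    fi
    # -n: no-modify mode, reports problems without changing the file system.
    if xfs_repair -n "$dev"; then
        echo "no corruption detected"
    else
        rc=$?
        echo "xfs_repair -n reported problems (exit status $rc)" >&2
        return 1
    fi
}

# Typical use:
#   check_xfs block-device /mount-point
```

The function leaves the file system unmounted, so mounting it again remains a separate, deliberate step.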
Additional resources
- xfs_repair(8) and xfs_metadump(8) man pages on your system
15.5. Repairing an XFS file system with xfs_repair
This procedure repairs a corrupted XFS file system using the xfs_repair utility.
Procedure
Create a metadata image prior to repair for diagnostic or testing purposes using the xfs_metadump utility. A pre-repair file system metadata image can be useful for support investigations if the corruption is due to a software bug. Patterns of corruption present in the pre-repair image can aid in root-cause analysis.
Use the xfs_metadump debugging tool to copy the metadata from an XFS file system to a file. The resulting metadump file can be compressed using standard compression utilities to reduce the file size if large metadump files need to be sent to support.
# xfs_metadump block-device metadump-file
Replay the log by remounting the file system:
# mount file-system
# umount file-system
Use the xfs_repair utility to repair the unmounted file system:
- If the mount succeeded, no additional options are required:
# xfs_repair block-device
- If the mount failed with the Structure needs cleaning error, the log is corrupted and cannot be replayed. Use the -L option (force log zeroing) to clear the log:
Warning: This command causes all metadata updates in progress at the time of the crash to be lost, which might cause significant file system damage and data loss. This should be used only as a last resort if the log cannot be replayed.
# xfs_repair -L block-device
Mount the file system:
# mount file-system
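The branching in this procedure can be expressed as a guarded script. This is a minimal sketch under stated assumptions: the function name repair_xfs and its arguments are hypothetical, it must run as root, and it deliberately stops instead of running xfs_repair -L automatically, because clearing the log is a last resort.

```shell
# Save a metadata image, replay the log, then repair the unmounted file system.
# repair_xfs is a hypothetical helper; all arguments are placeholders.
#   $1 = block device, $2 = mount point, $3 = metadump output file
repair_xfs() {
    dev="$1"
    mnt="$2"
    dump="$3"
    # Keep a pre-repair metadata image for later analysis.
    xfs_metadump "$dev" "$dump" || echo "warning: metadump failed" >&2
    if mount "$dev" "$mnt"; then
        # The log was replayed; unmount again before repairing.
        umount "$mnt" || return 2
        xfs_repair "$dev"
    else
        # Never clear the log automatically: -L discards in-flight metadata updates.
        echo "mount failed; if the error is 'Structure needs cleaning'," >&2
        echo "consider 'xfs_repair -L $dev' only as a last resort" >&2
        return 3
    fi
}

# Typical use:
#   repair_xfs block-device /mount-point metadump-file
```

Returning a distinct status for the failed-mount case lets an operator decide manually whether losing the logged updates is acceptable.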
Additional resources
- xfs_repair(8) man page on your system
15.6. Error handling mechanisms in ext2, ext3, and ext4
The ext2, ext3, and ext4 file systems use the e2fsck utility to perform file system checks and repairs. The file names fsck.ext2, fsck.ext3, and fsck.ext4 are hardlinks to the e2fsck utility. These binaries are run automatically at boot time and their behavior differs based on the file system being checked and the state of the file system.
A full file system check and repair is invoked for ext2, which is not a metadata journaling file system, and for ext4 file systems without a journal.
For ext3 and ext4 file systems with metadata journaling, the journal is replayed in userspace and the utility exits. This is the default action because journal replay ensures a consistent file system after a crash.
If these file systems encounter metadata inconsistencies while mounted, they record this fact in the file system superblock. If e2fsck finds that a file system is marked with such an error, e2fsck performs a full check after replaying the journal (if present).
Additional resources
- fsck(8) and e2fsck(8) man pages on your system
15.7. Checking an ext2, ext3, or ext4 file system with e2fsck
This procedure checks an ext2, ext3, or ext4 file system using the e2fsck utility.
Procedure
Replay the log by remounting the file system:
# mount file-system
# umount file-system
Perform a dry run to check the file system.
# e2fsck -n block-device
Note: Any errors are printed, along with an indication of the actions that would be taken, without modifying the file system. Later phases of consistency checking may print extra errors as they discover inconsistencies that would have been fixed in early phases if e2fsck were running in repair mode.
Additional resources
- e2image(8) and e2fsck(8) man pages on your system
15.8. Repairing an ext2, ext3, or ext4 file system with e2fsck
This procedure repairs a corrupted ext2, ext3, or ext4 file system using the e2fsck utility.
Procedure
Save a file system image for support investigations. A pre-repair file system metadata image can be useful for support investigations if the corruption is due to a software bug. Patterns of corruption present in the pre-repair image can aid in root-cause analysis.
Note: Severely damaged file systems may cause problems with metadata image creation.
- If you are creating the image for testing purposes, use the -r option to create a sparse file of the same size as the file system itself. e2fsck can then operate directly on the resulting file.
# e2image -r block-device image-file
- If you are creating the image to be archived or provided for diagnostic purposes, use the -Q option, which creates a more compact file format suitable for transfer.
# e2image -Q block-device image-file
Replay the log by remounting the file system:
# mount file-system
# umount file-system
Automatically repair the file system. If user intervention is required, e2fsck indicates the unfixed problem in its output and reflects this status in the exit code.
# e2fsck -p block-device
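Because e2fsck reports its result as a sum of flag values in the exit code, a script can decode that value to decide what to do next. The sketch below uses the exit code values documented in e2fsck(8); the function name describe_e2fsck_status is hypothetical.

```shell
# Translate an e2fsck exit status (a sum of flag values, per e2fsck(8))
# into readable messages, one per line.
# describe_e2fsck_status is a hypothetical helper name.
describe_e2fsck_status() {
    status="$1"
    if [ "$status" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    [ $(( status & 1 )) -ne 0 ]   && echo "errors corrected"
    [ $(( status & 2 )) -ne 0 ]   && echo "errors corrected, reboot recommended"
    [ $(( status & 4 )) -ne 0 ]   && echo "errors left uncorrected"
    [ $(( status & 8 )) -ne 0 ]   && echo "operational error"
    [ $(( status & 16 )) -ne 0 ]  && echo "usage or syntax error"
    [ $(( status & 32 )) -ne 0 ]  && echo "canceled by user request"
    [ $(( status & 128 )) -ne 0 ] && echo "shared library error"
    return 0
}

# Typical use (as root, on an unmounted file system):
#   e2fsck -p block-device; describe_e2fsck_status $?
```

A status with the value 4 set means e2fsck -p could not fix everything on its own and that user intervention is required.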
Additional resources
- e2image(8) man page on your system
- e2fsck(8) man page on your system