Deployment Guide
Deployment, configuration and administration of Red Hat Enterprise Linux 5
Edition 11
Abstract
Introduction
- Setting up a network interface card (NIC)
- Configuring a Virtual Private Network (VPN)
- Configuring Samba shares
- Managing your software with RPM
- Determining information about your system
- Upgrading your kernel
- File systems
- Package management
- Network-related configuration
- System configuration
- System monitoring
- Kernel and Driver Configuration
- Security and Authentication
- Red Hat Training and Certification
1. Document Conventions
command
- Linux commands (and other operating system commands, when used) are represented this way. This style should indicate to you that you can type the word or phrase on the command line and press Enter to invoke a command. Sometimes a command contains words that would be displayed in a different style on their own (such as file names). In these cases, they are considered to be part of the command, so the entire phrase is displayed as a command. For example:
Use the cat testfile command to view the contents of a file, named testfile, in the current working directory.
file name
- File names, directory names, paths, and RPM package names are represented this way. This style indicates that a particular file or directory exists with that name on your system. Examples:
The .bashrc file in your home directory contains bash shell definitions and aliases for your own use.
The /etc/fstab file contains information about different system devices and file systems.
Install the webalizer RPM if you want to use a Web server log file analysis program.
- application
- This style indicates that the program is an end-user application (as opposed to system software). For example:Use Mozilla to browse the Web.
- key
- A key on the keyboard is shown in this style. For example:
To use Tab completion to list particular files in a directory, type ls, then a character, and finally the Tab key. Your terminal displays the list of files in the working directory that begin with that character.
- key+combination
- A combination of keystrokes is represented in this way. For example:The Ctrl+Alt+Backspace key combination exits your graphical session and returns you to the graphical login screen or the console.
- text found on a GUI interface
- A title, word, or phrase found on a GUI interface screen or window is shown in this style. Text shown in this style indicates a particular GUI screen or an element on a GUI screen (such as text associated with a checkbox or field). Example:Select the Require Password checkbox if you would like your screensaver to require a password before stopping.
- A word in this style indicates that the word is the top level of a pulldown menu. If you click on the word on the GUI screen, the rest of the menu should appear. For example:Underon a GNOME terminal, the option allows you to open multiple shell prompts in the same window.Instructions to type in a sequence of commands from a GUI menu look like the following example:Go to Emacs text editor.(the main menu on the panel) > > to start the
- This style indicates that the text can be found on a clickable button on a GUI screen. For example:Click on thebutton to return to the webpage you last viewed.
computer output
- Text in this style indicates text displayed at a shell prompt, such as error messages and responses to commands. For example:
The ls command displays the contents of a directory. For example:
Desktop about.html logs paulwesterberg.png Mail backupfiles mail reports
The output returned in response to the command (in this case, the contents of the directory) is shown in this style.
prompt
- A prompt, which is a computer's way of signifying that it is ready for you to input something, is shown in this style. Examples:
$
#
[stephen@maturin stephen]$
leopard login:
user input
- Text that the user types, either on the command line or into a text box on a GUI screen, is displayed in this style. In the following example, text is displayed in this style:
To boot your system into the text based installation program, you must type in the text command at the boot: prompt.
- <replaceable>
- Text used in examples that is meant to be replaced with data provided by the user is displayed in this style. In the following example, <version-number> is displayed in this style:
The directory for the kernel source is /usr/src/kernels/<version-number>/, where <version-number> is the version and type of kernel installed on this system.
Note
Note
The /usr/share/doc/ directory contains additional documentation for packages installed on your system.
Important
Warning
Warning
2. Send in Your Feedback
http://bugzilla.redhat.com/bugzilla/
) against the component Deployment_Guide
.
Part I. File Systems
parted
utility to manage partitions and access control lists (ACLs) to customize file permissions.
Chapter 1. File System Structure
1.1. Why Share a Common Structure?
- Shareable vs. unshareable files
- Variable vs. static files
1.2. Overview of File System Hierarchy Standard (FHS)
/usr/
partition as read-only. This second point is important because the directory contains common executables and should not be changed by users. Also, since the /usr/
directory is mounted as read-only, it can be mounted from the CD-ROM or from another machine via a read-only NFS mount.
1.2.1. FHS Organization
1.2.1.1. The /boot/ Directory
The /boot/ directory contains static files required to boot the system, such as the Linux kernel. These files are essential for the system to boot properly.
Warning
Do not remove the /boot/ directory. Doing so renders the system unbootable.
1.2.1.2. The /dev/ Directory
The /dev/ directory contains device nodes that either represent devices that are attached to the system or virtual devices that are provided by the kernel. These device nodes are essential for the system to function properly. The udev daemon takes care of creating and removing all these device nodes in /dev/.
Devices in the /dev directory and its subdirectories are either character devices (providing only a serial stream of input/output) or block devices (accessible randomly). Character devices include the mouse, keyboard, and modem, while block devices include hard disks and floppy drives. If you have GNOME or KDE installed on your system, devices such as external drives or CDs are automatically detected when connected (for example, via USB) or inserted (for example, into a CD or DVD drive), and a pop-up window displaying the contents is automatically displayed. Files in the /dev directory are essential for the system to function properly.
File | Description |
---|---|
/dev/hda | The master device on primary IDE channel. |
/dev/hdb | The slave device on primary IDE channel. |
/dev/tty0 | The first virtual console. |
/dev/tty1 | The second virtual console. |
/dev/sda | The first device on primary SCSI or SATA channel. |
/dev/lp0 | The first parallel port. |
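The character/block distinction described above can be checked from the shell with the standard test operators -c and -b. The following is a minimal sketch; dev_type is a hypothetical helper name chosen for illustration, not a command shipped with the system.

```shell
# dev_type PATH: report whether PATH is a character device, a block device,
# or neither. "dev_type" is an illustrative name, not a standard utility.
dev_type() {
    if [ -c "$1" ]; then
        echo "character"
    elif [ -b "$1" ]; then
        echo "block"
    else
        echo "other"
    fi
}

# /dev/null is a character device on Linux systems.
dev_type /dev/null
```

Running the same function against a disk node such as /dev/sda would be expected to report a block device, provided that node exists on the system.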
1.2.1.3. The /etc/ Directory
The /etc/ directory is reserved for configuration files that are local to the machine. No binaries are to be placed in /etc/. Any binaries that were once located in /etc/ should be placed into /sbin/ or /bin/.
Examples of directories in /etc are the X11/ and skel/ directories:
/etc
  |- X11/
  |- skel/
The /etc/X11/ directory is for X Window System configuration files, such as xorg.conf. The /etc/skel/ directory is for "skeleton" user files, which are used to populate a home directory when a user is first created. Applications also store their configuration files in this directory and may reference them when they are executed.
1.2.1.4. The /lib/ Directory
The /lib/ directory should contain only those libraries needed to execute the binaries in /bin/ and /sbin/. These shared library images are particularly important for booting the system and executing commands within the root file system.
1.2.1.5. The /media/ Directory
The /media/ directory contains subdirectories used as mount points for removable media such as USB storage media, DVDs, CD-ROMs, and Zip disks.
1.2.1.6. The /mnt/ Directory
The /mnt/ directory is reserved for temporarily mounted file systems, such as NFS file system mounts. For all removable media, use the /media/ directory. Automatically detected removable media are mounted in the /media directory.
Note
The /mnt directory must not be used by installation programs.
1.2.1.7. The /opt/ Directory
The /opt/ directory provides storage for most application software packages.
A package placing files in the /opt/ directory creates a directory bearing the same name as the package. This directory, in turn, holds files that otherwise would be scattered throughout the file system, giving the system administrator an easy way to determine the role of each file within a particular package.
For example, if sample is the name of a particular software package located within the /opt/ directory, then all of its files are placed in directories inside the /opt/sample/ directory, such as /opt/sample/bin/ for binaries and /opt/sample/man/ for manual pages.
A large package made up of many sub-packages can also be placed within the /opt/ directory, giving that large package a way to organize itself. In this way, our sample package may have different tools that each go in their own sub-directories, such as /opt/sample/tool1/ and /opt/sample/tool2/, each of which can have its own bin/, man/, and other similar directories.
1.2.1.8. The /proc/ Directory
The /proc/ directory contains special files that either extract information from or send information to the kernel. Examples include system memory, CPU information, and hardware configuration.
Because of the great variety of data available within /proc/ and the many ways this directory can be used to communicate with the kernel, an entire chapter has been devoted to the subject. For more information, refer to Chapter 5, The proc File System.
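As a small illustration of extracting kernel information from /proc/, the sketch below reads the MemTotal line from a meminfo-style file. The mem_total_kb function name is hypothetical; the optional file argument exists only so that the function can also be tried against a saved copy of /proc/meminfo.

```shell
# mem_total_kb [FILE]: print the MemTotal value (in kB) from a meminfo-style
# file. FILE defaults to /proc/meminfo. The function name is illustrative.
mem_total_kb() {
    awk '$1 == "MemTotal:" { print $2; exit }' "${1:-/proc/meminfo}"
}

# Only query the live system when /proc/meminfo is actually readable.
if [ -r /proc/meminfo ]; then
    mem_total_kb
fi
```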
1.2.1.9. The /sbin/ Directory
The /sbin/ directory stores executables used by the root user. The executables in /sbin/ are used at boot time, for system administration, and to perform system recovery operations. Of this directory, the FHS says:
/sbin contains binaries essential for booting, restoring, recovering, and/or repairing the system in addition to the binaries in /bin. Programs executed after /usr/ is known to be mounted (when there are no problems) are generally placed into /usr/sbin. Locally-installed system administration programs should be placed into /usr/local/sbin.
At a minimum, the following programs should be in /sbin/:
arp, clock, halt, init, fsck.*, grub, ifconfig, mingetty, mkfs.*, mkswap, reboot, route, shutdown, swapoff, swapon
1.2.1.10. The /srv/ Directory
The /srv/ directory contains site-specific data served by your system running Red Hat Enterprise Linux. This directory gives users the location of data files for a particular service, such as FTP, WWW, or CVS. Data that only pertains to a specific user should go in the /home/ directory.
1.2.1.11. The /sys/ Directory
The /sys/ directory utilizes the new sysfs virtual file system specific to the 2.6 kernel. With the increased support for hot plug hardware devices in the 2.6 kernel, the /sys/ directory contains information similar to that held in /proc/, but displays a hierarchical view of specific device information with regard to hot plug devices.
1.2.1.12. The /usr/ Directory
The /usr/ directory is for files that can be shared across multiple machines. The /usr/ directory is often on its own partition and is mounted read-only. At a minimum, the following directories should be subdirectories of /usr/:
/usr
  |- bin/
  |- etc/
  |- games/
  |- include/
  |- kerberos/
  |- lib/
  |- libexec/
  |- local/
  |- sbin/
  |- share/
  |- src/
  |- tmp -> ../var/tmp/
Under the /usr/ directory, the bin/ subdirectory contains executables, etc/ contains system-wide configuration files, games/ is for games, include/ contains C header files, kerberos/ contains binaries and other Kerberos-related files, and lib/ contains object files and libraries that are not designed to be directly utilized by users or shell scripts. The libexec/ directory contains small helper programs called by other programs, sbin/ is for system administration binaries (those that do not belong in the /sbin/ directory), share/ contains files that are not architecture-specific, and src/ is for source code.
1.2.1.13. The /usr/local/ Directory
The FHS says:
The /usr/local hierarchy is for use by the system administrator when installing software locally. It needs to be safe from being overwritten when the system software is updated. It may be used for programs and data that are shareable among a group of hosts, but not found in /usr.
The /usr/local/ directory is similar in structure to the /usr/ directory. It has the following subdirectories, which are similar in purpose to those in the /usr/ directory:
/usr/local
  |- bin/
  |- etc/
  |- games/
  |- include/
  |- lib/
  |- libexec/
  |- sbin/
  |- share/
  |- src/
In Red Hat Enterprise Linux, the intended use for the /usr/local/ directory is slightly different from that specified by the FHS. The FHS says that /usr/local/ should be where software that is to remain safe from system software upgrades is stored. Since software upgrades can be performed safely with RPM Package Manager (RPM), it is not necessary to protect files by putting them in /usr/local/. Instead, the /usr/local/ directory is used for software that is local to the machine.
For instance, if the /usr/ directory is mounted as a read-only NFS share from a remote host, it is still possible to install a package or program under the /usr/local/ directory.
1.2.1.14. The /var/ Directory
Since the FHS requires Linux to mount /usr/ as read-only, any programs that write log files or need spool/ or lock/ directories should write them to the /var/ directory. The FHS states /var/ is for:
...variable data files. This includes spool directories and files, administrative and logging data, and transient and temporary files.
Below are some of the directories found within the /var/ directory:
/var
  |- account/
  |- arpwatch/
  |- cache/
  |- crash/
  |- db/
  |- empty/
  |- ftp/
  |- gdm/
  |- kerberos/
  |- lib/
  |- local/
  |- lock/
  |- log/
  |- mail -> spool/mail/
  |- mailman/
  |- named/
  |- nis/
  |- opt/
  |- preserve/
  |- run/
  +- spool/
     |- at/
     |- clientmqueue/
     |- cron/
     |- cups/
     |- exim/
     |- lpd/
     |- mail/
     |- mailman/
     |- mqueue/
     |- news/
     |- postfix/
     |- repackage/
     |- rwho/
     |- samba/
     |- squid/
     |- squirrelmail/
     |- up2date/
     |- uucp/
     |- uucppublic/
     |- vbox/
  |- tmp/
  |- tux/
  |- www/
  |- yp/
System log files, such as messages and lastlog, go in the /var/log/ directory. The /var/lib/rpm/ directory contains RPM system databases. Lock files go in the /var/lock/ directory, usually in directories for the program using the file. The /var/spool/ directory has subdirectories for programs in which data files are stored.
1.3. Special File Locations Under Red Hat Enterprise Linux
Files pertaining to RPM are kept in the /var/lib/rpm/ directory. For more information on RPM, refer to Chapter 12, Package Management with RPM.
The /var/cache/yum/ directory contains files used by the Package Updater, including RPM header information for the system. This location may also be used to temporarily store RPMs downloaded while updating the system. For more information about Red Hat Network, refer to Chapter 15, Registering a System and Managing Subscriptions.
Another location specific to Red Hat Enterprise Linux is the /etc/sysconfig/ directory. This directory stores a variety of configuration information. Many scripts that run at boot time use the files in this directory. Refer to Chapter 32, The sysconfig Directory for more information about what is within this directory and the role these files play in the boot process.
Chapter 2. Using the mount Command
To attach or detach a file system, use the mount or umount command respectively. This chapter describes the basic usage of these commands, and covers some advanced topics such as moving a mount point or creating shared subtrees.
2.1. Listing Currently Mounted File Systems
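To display all currently attached file systems, run the mount command with no additional arguments:
mount
This command displays the list of known mount points. Each line provides the device name, the directory in which it is mounted, the file system type, and the relevant mount options in the following form:
device on directory type type (options)
The output also includes various virtual file systems such as sysfs, tmpfs, and others. To display only the devices with a certain file system type, supply the -t option on the command line:
mount -t type
For an example of how to use the mount command to list the mounted file systems, see Example 2.1, “Listing Currently Mounted ext3 File Systems”.
Example 2.1. Listing Currently Mounted ext3 File Systems
Usually, both the / and /boot partitions are formatted to use ext3. To display only the mount points that use this file system, type the following at a shell prompt: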
~]$ mount -t ext3
/dev/mapper/VolGroup00-LogVol00 on / type ext3 (rw)
/dev/vda1 on /boot type ext3 (rw)
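Because each line of mount output follows the fixed "device on directory type type (options)" form, it is straightforward to post-process. The following sketch extracts only the mount points for a given file system type; mounts_of_type is a hypothetical helper name, and the sample lines are illustrative. It assumes mount points that contain no spaces.

```shell
# mounts_of_type TYPE: read `mount`-style output on standard input and print
# the directory of every entry whose file system type equals TYPE.
# "mounts_of_type" is an illustrative name; it assumes no spaces in paths.
mounts_of_type() {
    awk -v t="$1" '$2 == "on" && $4 == "type" && $5 == t { print $3 }'
}

# Demonstration on sample lines in the same format as Example 2.1.
printf '%s\n' \
    '/dev/mapper/VolGroup00-LogVol00 on / type ext3 (rw)' \
    'proc on /proc type proc (rw)' \
    '/dev/vda1 on /boot type ext3 (rw)' | mounts_of_type ext3
```

On a live system the same function could be fed directly, for example with mount | mounts_of_type ext3.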
2.2. Mounting a File System
To attach a certain file system, use the mount command in the following form:
mount [option…] device directory
When the mount command is run, it reads the content of the /etc/fstab configuration file to see if the given file system is listed. This file contains a list of device names and the directory in which the selected file systems should be mounted, as well as the file system type and mount options. Because of this, when you are mounting a file system that is specified in this file, you can use one of the following variants of the command:
mount [option…] directory
mount [option…] device
Note that unless you are logged in as root, you must have permissions to mount the file system (see Section 2.2.2, “Specifying the Mount Options”).
2.2.1. Specifying the File System Type
In most cases, mount detects the file system automatically. However, there are certain file systems, such as NFS (Network File System) or CIFS (Common Internet File System), that are not recognized and need to be specified manually. To specify the file system type, use the mount command in the following form:
mount -t type device directory
The following table lists common file system types that can be used with the mount command. For a complete list of all available file system types, consult the relevant manual page as referred to in Section 2.4.1, “Installed Documentation”.
Type | Description |
---|---|
ext2 | The ext2 file system. |
ext3 | The ext3 file system. |
ext4 | The ext4 file system. |
iso9660 | The ISO 9660 file system. It is commonly used by optical media, typically CDs. |
jfs | The JFS file system created by IBM. |
nfs | The NFS file system. It is commonly used to access files over the network. |
nfs4 | The NFSv4 file system. It is commonly used to access files over the network. |
ntfs | The NTFS file system. It is commonly used on machines that are running the Windows operating system. |
udf | The UDF file system. It is commonly used by optical media, typically DVDs. |
vfat | The FAT file system. It is commonly used on machines that are running the Windows operating system, and on certain digital media such as USB flash drives or floppy disks. |
Example 2.2. Mounting a USB Flash Drive
Assuming that a USB flash drive formatted with the FAT file system uses the /dev/sdc1 device and that the /media/flashdisk/ directory exists, you can mount it to this directory by typing the following at a shell prompt as root:
~]# mount -t vfat /dev/sdc1 /media/flashdisk
2.2.2. Specifying the Mount Options
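To specify additional mount options, use the command in the following form:
mount -o options
When supplying multiple options, do not insert a space after a comma, or mount will incorrectly interpret the values following spaces as additional parameters.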
Option | Description |
---|---|
async | Allows the asynchronous input/output operations on the file system. |
auto | Allows the file system to be mounted automatically using the mount -a command. |
defaults | Provides an alias for async,auto,dev,exec,nouser,rw,suid . |
exec | Allows the execution of binary files on the particular file system. |
loop | Mounts an image as a loop device. |
noauto | Disallows the automatic mount of the file system using the mount -a command. |
noexec | Disallows the execution of binary files on the particular file system. |
nouser | Disallows an ordinary user (that is, other than root ) to mount and unmount the file system. |
remount | Remounts the file system in case it is already mounted. |
ro | Mounts the file system for reading only. |
rw | Mounts the file system for both reading and writing. |
user | Allows an ordinary user (that is, other than root ) to mount and unmount the file system. |
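Since the option list passed to -o must be comma-separated with no spaces, scripts that assemble options from separate words can join them explicitly. The following is a minimal sketch; join_opts is a hypothetical helper name.

```shell
# join_opts OPTION...: print the given mount options as one comma-separated
# string with no spaces, suitable as the argument of mount -o.
# The subshell keeps the IFS change from leaking into the caller.
join_opts() {
    ( IFS=,; printf '%s\n' "$*" )
}

join_opts ro loop
```

The result could then be used as in mount -o "$(join_opts ro loop)" image directory, where image and directory stand for values supplied by the user.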
Example 2.3. Mounting an ISO Image
Assuming that you have an ISO image file and that the /media/cdrom/ directory exists, you can mount the image to this directory by running the following command as root:
~]# mount -o ro,loop Fedora-14-x86_64-Live-Desktop.iso /media/cdrom
2.2.3. Sharing Mounts
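The mount command implements the --bind option that provides a means for duplicating certain mounts. Its usage is as follows:
mount --bind old_directory new_directory
This makes the file system accessible from both places, but does not include file systems that are mounted within the original directory. To include these mounts as well, use the --rbind option:
mount --rbind old_directory new_directory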
- Shared Mount
- A shared mount allows you to create an exact replica of a given mount point. When a shared mount is created, any mount within the original mount point is reflected in it, and vice versa. To create a shared mount, type the following at a shell prompt:
mount --make-shared mount_point
Alternatively, you can change the mount type for the selected mount point and all mount points under it:
mount --make-rshared mount_point
See Example 2.4, “Creating a Shared Mount Point” for an example usage.
- Slave Mount
- A slave mount allows you to create a limited duplicate of a given mount point. When a slave mount is created, any mount within the original mount point is reflected in it, but no mount within a slave mount is reflected in its original. To create a slave mount, type the following at a shell prompt:
mount --make-slave mount_point
Alternatively, you can change the mount type for the selected mount point and all mount points under it:
mount --make-rslave mount_point
See Example 2.5, “Creating a Slave Mount Point” for an example usage.
Example 2.5. Creating a Slave Mount Point
Imagine you want the content of the /media directory to appear in /mnt as well, but you do not want any mounts in the /mnt directory to be reflected in /media. To do so, as root, first mark the /media directory as “shared”:
~]# mount --bind /media /media
~]# mount --make-shared /media
Then create its duplicate in /mnt, but mark it as “slave”:
~]# mount --bind /media /mnt
~]# mount --make-slave /mnt
You can now verify that a mount within /media also appears in /mnt. For example, if you have non-empty media in your CD-ROM drive and the /media/cdrom/ directory exists, run the following commands:
~]# mount /dev/cdrom /media/cdrom
~]# ls /media/cdrom
EFI GPL isolinux LiveOS
~]# ls /mnt/cdrom
EFI GPL isolinux LiveOS
You can also verify that file systems mounted in the /mnt directory are not reflected in /media. For instance, if you have a non-empty USB flash drive that uses the /dev/sdc1 device plugged in and the /mnt/flashdisk/ directory is present, type:
~]# mount /dev/sdc1 /mnt/flashdisk
~]# ls /media/flashdisk
~]# ls /mnt/flashdisk
en-US publican.cfg
- Private Mount
- A private mount allows you to create an ordinary mount. When a private mount is created, no subsequent mounts within the original mount point are reflected in it, and no mount within a private mount is reflected in its original. To create a private mount, type the following at a shell prompt:
mount --make-private mount_point
Alternatively, you can change the mount type for the selected mount point and all mount points under it:
mount --make-rprivate mount_point
See Example 2.6, “Creating a Private Mount Point” for an example usage.
Example 2.6. Creating a Private Mount Point
Taking into account the scenario in Example 2.4, “Creating a Shared Mount Point”, assume that you have previously created a shared mount point by using the following commands as root:
~]# mount --bind /media /media
~]# mount --make-shared /media
~]# mount --bind /media /mnt
To mark the /mnt directory as “private”, type:
~]# mount --make-private /mnt
You can now verify that none of the mounts within /media appears in /mnt. For example, if you have non-empty media in your CD-ROM drive and the /media/cdrom/ directory exists, run the following commands:
~]# mount /dev/cdrom /media/cdrom
~]# ls /media/cdrom
EFI GPL isolinux LiveOS
~]# ls /mnt/cdrom
~]#
You can also verify that file systems mounted in the /mnt directory are not reflected in /media. For instance, if you have a non-empty USB flash drive that uses the /dev/sdc1 device plugged in and the /mnt/flashdisk/ directory is present, type:
~]# mount /dev/sdc1 /mnt/flashdisk
~]# ls /media/flashdisk
~]# ls /mnt/flashdisk
en-US publican.cfg
- Unbindable Mount
- An unbindable mount allows you to prevent a given mount point from being duplicated whatsoever. To create an unbindable mount, type the following at a shell prompt:
mount --make-unbindable mount_point
Alternatively, you can change the mount type for the selected mount point and all mount points under it:
mount --make-runbindable mount_point
See Example 2.7, “Creating an Unbindable Mount Point” for an example usage.
Example 2.7. Creating an Unbindable Mount Point
To prevent the /media directory from being shared, as root, type the following at a shell prompt:
~]# mount --bind /media /media
~]# mount --make-unbindable /media
This way, any subsequent attempt to make a duplicate of this mount will fail with an error:
~]# mount --bind /media /mnt
mount: wrong fs type, bad option, bad superblock on /media/,
       missing code page or other error
       In some cases useful info is found in syslog - try
       dmesg | tail  or so
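The four mount types above can be told apart on kernels that provide /proc/self/mountinfo (Linux 2.6.26 and later, so this file may not be available on the kernel shipped with Red Hat Enterprise Linux 5). The sketch below is an illustration under that assumption: it classifies mountinfo-format lines by their optional fields, where shared:N marks a shared mount, master:N a slave mount, and unbindable an unbindable mount. The propagation_of name and the sample lines are hypothetical.

```shell
# propagation_of: read lines in /proc/self/mountinfo format on standard input
# and print "mount_point type" for each. Assumes a kernel that provides
# mountinfo; the function name is illustrative.
propagation_of() {
    awk '{
        shared = 0; slave = 0; unbindable = 0
        # Optional fields start at field 7 and end at the "-" separator.
        for (i = 7; i <= NF && $i != "-"; i++) {
            if ($i ~ /^shared:/) shared = 1
            else if ($i ~ /^master:/) slave = 1
            else if ($i == "unbindable") unbindable = 1
        }
        if (unbindable) kind = "unbindable"
        else if (shared && slave) kind = "shared,slave"
        else if (shared) kind = "shared"
        else if (slave) kind = "slave"
        else kind = "private"
        print $5, kind
    }'
}

# Demonstration on made-up sample lines resembling the slave mount example.
printf '%s\n' \
    '21 1 253:0 /media /media rw shared:1 - ext3 /dev/mapper/VolGroup00-LogVol00 rw' \
    '22 1 253:0 /media /mnt rw master:1 - ext3 /dev/mapper/VolGroup00-LogVol00 rw' | propagation_of
```

Where the file exists, the live state could be inspected with propagation_of < /proc/self/mountinfo.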
2.2.4. Moving a Mount Point
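To change the directory in which a file system is mounted, use the following command:
mount --move old_directory new_directory
Example 2.8. Moving an Existing NFS Mount Point
Assuming that an NFS file system containing user directories is already mounted in /mnt/userdirs/, as root, you can move this mount point to /home by using the following command:
~]# mount --move /mnt/userdirs /home
To verify that the mount point has been moved, list the content of both directories:
~]# ls /mnt/userdirs
~]# ls /home
jill joe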
2.3. Unmounting a File System
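To detach a previously mounted file system, use either of the following variants of the umount command:
umount directory
umount device
Note that unless you are logged in as root, you must have permissions to unmount the file system (see Section 2.2.2, “Specifying the Mount Options”). See Example 2.9, “Unmounting a CD” for an example usage.
Important
When a file system is in use (for example, when a process is reading a file on it), the umount command will fail with an error. To determine which processes are accessing the file system, use the fuser command in the following form:
fuser -m directory
For example, to list the processes that are accessing a file system mounted to the /media/cdrom/ directory, type: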
~]$ fuser -m /media/cdrom
/media/cdrom: 1793 2013 2022 2435 10532c 10672c
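In the output above, some process IDs carry a trailing letter (such as c) that indicates how the process is using the file system. When the bare process IDs are needed, for example to look the processes up with ps, the suffixes can be stripped. The following is a sketch that works on one line of output in the form shown; fuser_pids is a hypothetical helper name, and paths containing spaces are not handled.

```shell
# fuser_pids LINE: given one line in the "directory: pid pid ..." form shown
# above, print the process IDs separated by spaces, without the trailing
# access-type letters. "fuser_pids" is an illustrative name.
fuser_pids() {
    printf '%s\n' "$1" | awk '{
        out = ""
        for (i = 2; i <= NF; i++) {
            pid = $i
            sub(/[^0-9]+$/, "", pid)          # drop suffixes such as "c"
            out = out (out == "" ? "" : " ") pid
        }
        print out
    }'
}

fuser_pids '/media/cdrom: 1793 2013 2022 2435 10532c 10672c'
```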
Example 2.9. Unmounting a CD
To unmount a CD that was previously mounted to the /media/cdrom/ directory, type the following at a shell prompt:
~]$ umount /media/cdrom
2.4. Additional Resources
2.4.1. Installed Documentation
- man 8 mount: The manual page for the mount command, which provides full documentation on its usage.
- man 8 umount: The manual page for the umount command, which provides full documentation on its usage.
- man 5 fstab: The manual page providing a thorough description of the /etc/fstab file format.
2.4.2. Useful Websites
- Shared subtrees — An LWN article covering the concept of shared subtrees.
- sharedsubtree.txt — Extensive documentation that is shipped with the shared subtrees patches.
Chapter 3. The ext3 File System
3.1. Features of ext3
- Availability
- After an unexpected power failure or system crash (also called an unclean system shutdown), each mounted ext2 file system on the machine must be checked for consistency by the e2fsck program. This is a time-consuming process that can delay system boot time significantly, especially with large volumes containing a large number of files. During this time, any data on the volumes is unreachable.
The journaling provided by the ext3 file system means that this sort of file system check is no longer necessary after an unclean system shutdown. The only time a consistency check occurs using ext3 is in certain rare hardware failure cases, such as hard drive failures. The time to recover an ext3 file system after an unclean system shutdown does not depend on the size of the file system or the number of files; rather, it depends on the size of the journal used to maintain consistency. The default journal size takes about a second to recover, depending on the speed of the hardware.
- Data Integrity
- The ext3 file system prevents loss of data integrity in the event that an unclean system shutdown occurs. The ext3 file system allows you to choose the type and level of protection that your data receives. By default, the ext3 volumes are configured to keep a high level of data consistency with regard to the state of the file system.
- Speed
- Despite writing some data more than once, ext3 has a higher throughput in most cases than ext2 because ext3's journaling optimizes hard drive head motion. You can choose from three journaling modes to optimize speed, but doing so means trade-offs with regard to data integrity if the system were to fail.
- Easy Transition
- It is easy to migrate from ext2 to ext3 and gain the benefits of a robust journaling file system without reformatting. Refer to Section 3.3, “Converting to an ext3 File System” for more on how to perform this task.
3.2. Creating an ext3 File System
- Format the partition with the ext3 file system using mkfs.
- Label the partition using e2label.
3.3. Converting to an ext3 File System
The tune2fs utility allows you to convert an ext2 filesystem to ext3.
Note
Always use the e2fsck utility to check your filesystem before and after using tune2fs. A default installation of Red Hat Enterprise Linux uses ext3 for all file systems.
To convert an ext2 filesystem to ext3, log in as root and type the following command in a terminal:
tune2fs -j <block_device>
where <block_device> contains the ext2 filesystem you wish to convert. A valid block device could be one of two types of entries:
- A mapped device: A logical volume in a volume group, for example, /dev/mapper/VolGroup00-LogVol02.
- A static device: A traditional storage volume, for example, /dev/hdbX, where hdb is a storage device name and X is the partition number.
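Putting the note and the conversion command together, the overall sequence is a check, the conversion, and a second check. The sketch below only prints that sequence for a given block device instead of running it, so it can be reviewed first; ext3_convert_cmds is a hypothetical helper name, and the use of the -f option of e2fsck (to force a check) is an assumption about how one might want to run the checks, not something this guide prescribes.

```shell
# ext3_convert_cmds DEVICE: print (do not run) the commands for checking an
# ext2 file system, adding a journal with tune2fs -j, and checking it again.
# The function name is illustrative; e2fsck -f forces a check (assumption).
ext3_convert_cmds() {
    printf 'e2fsck -f %s\n' "$1"
    printf 'tune2fs -j %s\n' "$1"
    printf 'e2fsck -f %s\n' "$1"
}

ext3_convert_cmds /dev/mapper/VolGroup00-LogVol02
```

The printed commands would need to be run as root, with the checks performed while the file system is not mounted.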
Issue the df command to display mounted file systems.
/dev/mapper/VolGroup00-LogVol02
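If the converted file system is the root file system, you must use an initrd image to boot; create one with the mkinitrd program. For information on using the mkinitrd command, type man mkinitrd. Also, make sure your GRUB configuration loads the initrd.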
3.4. Reverting to an ext2 File System
umount /dev/mapper/VolGroup00-LogVol02
tune2fs -O ^has_journal /dev/mapper/VolGroup00-LogVol02
e2fsck -y /dev/mapper/VolGroup00-LogVol02
mount -t ext2 /dev/mapper/VolGroup00-LogVol02 /mount/point
Next, remove the .journal file at the root level of the partition by changing to the directory where it is mounted and typing:
rm -f .journal
If you want to permanently change the partition to ext2, remember to update the /etc/fstab file.
Chapter 4. The ext4 File System
4.1. Features of ext4
- Main Features
- The ext4 file system uses extents (as opposed to the traditional block mapping scheme used by ext2 and ext3), which improves performance when using large files and reduces metadata overhead for large files. In addition, ext4 also labels unallocated block groups and inode table sections accordingly, which allows them to be skipped during a file system check. This makes for quicker file system checks, which becomes more beneficial as the file system grows in size.
- Allocation Features
- The ext4 file system features the following allocation schemes:
- Persistent pre-allocation
- Delayed allocation
- Multi-block allocation
- Stripe-aware allocation
Because of delayed allocation and other performance optimizations, ext4's behavior of writing files to disk is different from ext3. In ext4, a program's writes to the file system are not guaranteed to be on-disk unless the program issues an fsync() call afterwards.
By default, ext3 automatically forces newly created files to disk almost immediately even without fsync(). This behavior hid bugs in programs that did not use fsync() to ensure that written data was on-disk. The ext4 file system, on the other hand, often waits several seconds to write out changes to disk, allowing it to combine and reorder writes for better disk performance than ext3.
Warning
Unlike ext3, the ext4 file system does not force data to disk on transaction commit. As such, it takes longer for buffered writes to be flushed to disk. As with any file system, use data integrity calls such as fsync() to ensure that data is written to permanent storage.
- Other ext4 Features
- The ext4 file system also supports the following:
- Extended attributes (xattr), which allow the system to associate several additional name/value pairs per file.
- Quota journaling, which avoids the need for lengthy quota consistency checks after a crash.
Note
The only supported journaling mode in ext4 is data=ordered (default).
- Subsecond timestamps, which allow inode timestamp fields to be specified in nanosecond resolution.
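The warning above about fsync() applies to shell scripts as well as to compiled programs. One way to request a flush from the shell, assuming GNU dd is available, is its conv=fsync conversion, which asks dd to physically write the output file's data and metadata before it exits. The sketch below is an illustration under that assumption; write_durably is a hypothetical helper name.

```shell
# write_durably FILE: copy standard input to FILE and ask dd to fsync the
# output before exiting. Assumes GNU dd (conv=fsync); the name is illustrative.
write_durably() {
    dd of="$1" conv=fsync 2>/dev/null
}

# Demonstration using a temporary file.
tmp=$(mktemp)
printf 'important data\n' | write_durably "$tmp"
cat "$tmp"
rm -f "$tmp"
```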
4.2. Managing an ext4 File System
~]# yum install e4fsprogs
- mke4fs: A utility used to create an ext4 file system.
- mkfs.ext4: Another command used to create an ext4 file system.
- e4fsck: A utility used to repair inconsistencies of an ext4 file system.
- tune4fs: A utility used to modify ext4 file system attributes.
- resize4fs: A utility used to resize an ext4 file system.
- e4label: A utility used to display or modify the label of the ext4 file system.
- dumpe4fs: A utility used to display the super block and blocks group information for the ext4 file system.
- debuge4fs: An interactive file system debugger, used to examine ext4 file systems, manually repair corrupted file systems, and create test cases for e4fsck.
4.3. Creating an ext4 File System
mke4fs
and mkfs.ext4
commands for available options. Also, you may want to examine and modify the configuration file of mke4fs
, /etc/mke4fs.conf
, if you plan to create ext4 file systems more often.
- Format the partition with the ext4 file system using the mkfs.ext4 or mke4fs command:
~]# mkfs.ext4 block_device
~]# mke4fs -t ext4 block_device
where block_device is a partition which will contain the ext4 file system you wish to create.
- Label the partition using the e4label command.
~]# e4label <block_device> new-label
- Create a mount point and mount the new file system to that mount point:
~]# mkdir /mount/point
~]# mount block_device /mount/point
- A mapped device — A logical volume in a volume group, for example,
/dev/mapper/VolGroup00-LogVol02
. - A static device — A traditional storage volume, for example,
/dev/hdbX
, where hdb is a storage device name and X is the partition number.
mkfs.ext4
chooses an optimal geometry. This may also be true on some hardware RAIDs which export geometry information to the operating system.
-E
option of mkfs.ext4
(that is, extended file system options) with the following sub-options:
- stride=value
- Specifies the RAID chunk size.
- stripe-width=value
- Specifies the number of data disks in a RAID device, or the number of stripe units in the stripe.
value
must be specified in file system block units. For example, to create a file system with a 64k stride (that is, 16 x 4096) on a 4k-block file system, use the following command:
~]# mkfs.ext4 -E stride=16,stripe-width=64 block_device
man mkfs.ext4
.
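The stride arithmetic above can be sketched in Python. The function name raid_geometry is hypothetical, and the four data disks are an assumption chosen so that the result matches the stripe-width=64 used in the example command; the relationship stripe-width = stride x number of data disks is the usual convention, not something stated in this section.

```python
def raid_geometry(chunk_size_kib, block_size_bytes, data_disks):
    """Return (stride, stripe_width) in file system blocks.

    Hypothetical helper: stride is the RAID chunk size expressed in
    file system blocks; stripe_width assumes one chunk per data disk.
    """
    chunk_bytes = chunk_size_kib * 1024
    if chunk_bytes % block_size_bytes:
        raise ValueError("chunk size must be a multiple of the block size")
    stride = chunk_bytes // block_size_bytes
    stripe_width = stride * data_disks
    return stride, stripe_width


if __name__ == "__main__":
    # 64 KiB chunks, 4 KiB blocks, 4 data disks (assumed for illustration).
    stride, width = raid_geometry(64, 4096, 4)
    print("mkfs.ext4 -E stride=%d,stripe-width=%d block_device" % (stride, width))
```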
4.4. Mounting an ext4 File System
~]# mount block_device /mount/point
acl
, noacl
, data
, quota
, noquota
, user_xattr
, nouser_xattr
, and many others that were already used with the ext2 and ext3 file systems, are backward compatible and have the same usage and functionality. Also, with the ext4 file system, several new ext4-specific mount options have been added, for example:
- barrier / nobarrier
- By default, ext4 uses write barriers to ensure file system integrity even when power is lost to a device with write caches enabled. For devices without write caches, or with battery-backed write caches, you can disable barriers using the nobarrier option:
~]# mount -o nobarrier block_device /mount/point
- stripe=value
- This option allows you to specify the number of file system blocks allocated for a single file operation. For RAID5 this number should be equal to the RAID chunk size multiplied by the number of disks.
- journal_ioprio=value
- This option allows you to set priority of I/O operations submitted during a commit operation. The option can have a value from 7 to 0 (0 is the highest priority), and is set to 3 by default, which is slightly higher priority than the default I/O priority.
tune4fs
utility. For example, the following command sets the file system on the /dev/mapper/VolGroup00-LogVol02
device to be mounted by default with debugging disabled and user-specified extended attributes and Posix access control lists enabled:
~]# tune4fs -o ^debug,user_xattr,acl /dev/mapper/VolGroup00-LogVol02
tune4fs
(8) manual page.
~]# mount -t ext4 block_device /mount/point
delayed allocation
and multi-block allocation
, and exclude features such as extent mapping
.
Warning
mount
(8) manual page.
Note
/etc/fstab
file accordingly. For example:
/dev/mapper/VolGroup00-LogVol02 /test ext4 defaults 0 0
4.5. Resizing an ext4 File System
resize4fs
command:
~]# resize4fs block_device new_size
resize4fs
utility reads the size in units of file system block size, unless a suffix indicating a specific unit is used. The following suffixes indicate specific units:
- s: 512 byte sectors
- K: kilobytes
- M: megabytes
- G: gigabytes
size
parameter is optional (and often redundant) when expanding. The resize4fs utility automatically expands the file system to fill all available space of the container, usually a logical volume or partition. For more information about resizing an ext4 file system, refer to the resize4fs(8) manual page.
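To make the size suffixes concrete, the following Python sketch converts a size argument to bytes. The function name size_to_bytes is hypothetical, and treating K, M, and G as powers of 1024 is an assumption; check the manual page before relying on it.

```python
# Bytes per unit suffix. Assumption: K, M and G are powers of 1024.
_UNIT_BYTES = {"s": 512, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def size_to_bytes(size, fs_block_size=4096):
    """Convert a resize-style size string to bytes.

    A bare number is counted in file system blocks; a trailing
    s, K, M or G selects the unit instead.
    """
    size = size.strip()
    if size and size[-1] in _UNIT_BYTES:
        return int(size[:-1]) * _UNIT_BYTES[size[-1]]
    return int(size) * fs_block_size


if __name__ == "__main__":
    for spec in ("100", "8s", "512M", "20G"):
        print(spec, "->", size_to_bytes(spec), "bytes")
```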
Chapter 5. The proc
File System
/proc/
directory — also called the proc
file system — contains a hierarchy of special files which represent the current state of the kernel — allowing applications and users to peer into the kernel's view of the system.
/proc/
directory, one can find a wealth of information detailing the system hardware and any processes currently running. In addition, some of the files within the /proc/
directory tree can be manipulated by users and applications to communicate configuration changes to the kernel.
5.1. A Virtual File System
/proc/
directory contains another type of file called a virtual file. It is for this reason that /proc/
is often referred to as a virtual file system.
/proc/interrupts
, /proc/meminfo
, /proc/mounts
, and /proc/partitions
provide an up-to-the-moment glimpse of the system's hardware. Others, like the /proc/filesystems
file and the /proc/sys/
directory provide system configuration information and interfaces.
/proc/ide/
contains information for all physical IDE devices. Likewise, process directories contain information about each running process on the system.
5.1.1. Viewing Virtual Files
cat
, more
, or less
commands on files within the /proc/
directory, users can immediately access enormous amounts of information about the system. For example, to display the type of CPU a computer has, type cat /proc/cpuinfo
to receive output similar to the following:
processor : 0 vendor_id : AuthenticAMD cpu family : 5 model : 9 model name : AMD-K6(tm) 3D+ Processor stepping : 1 cpu MHz : 400.919 cache size : 256 KB fdiv_bug : no hlt_bug : no f00f_bug : no coma_bug : no fpu : yes fpu_exception : yes cpuid level : 1 wp : yes flags : fpu vme de pse tsc msr mce cx8 pge mmx syscall 3dnow k6_mtrr bogomips : 799.53
/proc/
file system, some of the information is easily understandable while some is not human-readable. This is in part why utilities exist to pull data from virtual files and display it in a useful way. Examples of these utilities include lspci
, apm
, free
, and top
.
Note
/proc/
directory are readable only by the root user.
5.1.2. Changing Virtual Files
/proc/
directory are read-only. However, some can be used to adjust settings in the kernel. This is especially true for files in the /proc/sys/
subdirectory.
echo
command and a greater than symbol (>
) to redirect the new value to the file. For example, to change the hostname on the fly, type:
echo www.example.com > /proc/sys/kernel/hostname
cat /proc/sys/net/ipv4/ip_forward
returns either a 0
or a 1
. A 0
indicates that the kernel is not forwarding network packets. Using the echo
command to change the value of the ip_forward
file to 1
immediately turns packet forwarding on.
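Reading such a setting from a program amounts to opening the file. The sketch below is a minimal Python illustration; the helper names sysctl_path and read_setting are hypothetical, and the dotted naming (net.ipv4.ip_forward) simply mirrors the directory layout under /proc/sys/.

```python
import os


def sysctl_path(name, root="/proc/sys"):
    """Map a dotted setting name to its file under root."""
    return os.path.join(root, *name.split("."))


def read_setting(name, root="/proc/sys"):
    """Return the current value of a kernel setting as a string."""
    with open(sysctl_path(name, root)) as f:
        return f.read().strip()


if __name__ == "__main__":
    name = "net.ipv4.ip_forward"
    if os.path.exists(sysctl_path(name)):
        state = read_setting(name)
        print("packet forwarding is", "on" if state == "1" else "off")
```

Writing a new value works the same way with the file opened for writing, but requires root privileges.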
Note
/proc/sys/
subdirectory is /sbin/sysctl
. For more information on this command, refer to Section 5.4, “Using the sysctl
Command”
/proc/sys/
subdirectory, refer to Section 5.3.9, “ /proc/sys/
”.
5.1.3. Restricting Access to Process Directories
/proc/
so that they can be viewed only by the root
user. You can restrict the access to these directories with the use of the hidepid
option.
mount
command with the -o remount
option. As root
, type:
mount -o remount,hidepid=value /proc
hidepid
is one of:
- 0 (default): every user can read all world-readable files stored in a process directory.
- 1: users can access only their own process directories. This protects sensitive files like cmdline, sched, or status from access by non-root users. This setting does not affect the actual file permissions.
- 2: process files are invisible to non-root users. The existence of a process can be learned by other means, but its effective UID and GID are hidden. Hiding these IDs complicates an intruder's task of gathering information about running processes.
Example 5.1. Restricting access to process directories
root
user, type:
~]# mount -o remount,hidepid=1 /proc
hidepid
=1
, a non-root user cannot access the contents of process directories. An attempt to do so fails with the following message:
~]$ ls /proc/1/
ls: /proc/1/: Operation not permitted
hidepid
=2
enabled, process directories are made invisible to non-root users:
~]$ ls /proc/1/
ls: /proc/1/: No such file or directory
hidepid
is set to 1 or 2. To do this, use the gid
option. As root
, type:
mount -o remount,hidepid=value,gid=gid /proc
hidepid
was set to 0. However, users who are not supposed to monitor the tasks in the whole system should not be added to the group. For more information on managing users and groups, see Chapter 37, Users and Groups.
5.2. Top-level Files within the proc
File System
/proc/
directory.
Note
5.2.1. /proc/apm
apm
command. If a system with no battery is connected to an AC power source, this virtual file would look similar to the following:
1.16 1.2 0x07 0x01 0xff 0x80 -1% -1 ?
apm -v
command on such a system results in output similar to the following:
APM BIOS 1.2 (kernel driver 1.16ac) AC on-line, no system battery
apm
is able to do little more than put the machine in standby mode. The apm
command is much more useful on laptops. For example, the following output is from the command cat /proc/apm
on a laptop while plugged into a power outlet:
1.16 1.2 0x03 0x01 0x03 0x09 100% -1 ?
apm
file changes to something like the following:
1.16 1.2 0x03 0x00 0x00 0x01 99% 1792 min
apm -v
command now yields more useful data, such as the following:
APM BIOS 1.2 (kernel driver 1.16) AC off-line, battery status high: 99% (1 day, 5:52)
5.2.2. /proc/buddyinfo
DMA
row references the first 16 MB on a system, the HighMem
row references all memory above approximately 896 MB on a 32-bit x86 system, and the Normal
row references all memory in between.
/proc/buddyinfo
:
Node 0, zone DMA 90 6 2 1 1 ...
Node 0, zone Normal 1650 310 5 0 0 ...
Node 0, zone HighMem 2 0 0 1 1 ...
5.2.3. /proc/cmdline
/proc/cmdline
file looks like the following:
ro root=/dev/VolGroup00/LogVol00 rhgb quiet 3
- ro
- The root device is mounted read-only at boot time. The presence of ro on the kernel boot line overrides any instances of rw.
- root=/dev/VolGroup00/LogVol00
- This tells us on which disk device or, in this case, on which logical volume, the root file system image is located. With our sample /proc/cmdline output, the root file system image is located on the first logical volume (LogVol00) of the first LVM volume group (VolGroup00). On a system not using Logical Volume Management, the root file system might be located on /dev/sda1 or /dev/sda2, meaning on either the first or second partition of the first SCSI or SATA disk drive, depending on whether we have a separate (preceding) boot or swap partition on that drive.
For more information on LVM used in Red Hat Enterprise Linux, refer to http://www.tldp.org/HOWTO/LVM-HOWTO/index.html.
- rhgb
- A short lowercase acronym that stands for Red Hat Graphical Boot. Providing "rhgb" on the kernel command line signals that graphical booting is supported, assuming that /etc/inittab shows that the default runlevel is set to 5 with a line like this:
id:5:initdefault:
- quiet
- Indicates that all verbose kernel messages except those which are extremely serious should be suppressed at boot time.
5.2.4. /proc/cpuinfo
/proc/cpuinfo
:
processor : 0 vendor_id : GenuineIntel cpu family : 15 model : 2 model name : Intel(R) Xeon(TM) CPU 2.40GHz stepping : 7 cpu MHz : 2392.371 cache size : 512 KB physical id : 0 siblings : 2 runqueue : 0 fdiv_bug : no hlt_bug : no f00f_bug : no coma_bug : no fpu : yes fpu_exception : yes cpuid level : 2 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm bogomips : 4771.02
- processor: Provides each processor with an identifying number. On systems that have one processor, only a 0 is present.
- cpu family: Authoritatively identifies the type of processor in the system. For an Intel-based system, place the number in front of "86" to determine the value. This is particularly helpful for those attempting to identify the architecture of an older system such as a 586, 486, or 386. Because some RPM packages are compiled for each of these particular architectures, this value also helps users determine which packages to install.
- model name: Displays the common name of the processor, including its project name.
- cpu MHz: Shows the precise speed in megahertz for the processor to the thousandths decimal place.
- cache size: Displays the amount of level 2 memory cache available to the processor.
- siblings: Displays the number of sibling CPUs on the same physical CPU for architectures which use hyper-threading.
- flags: Defines a number of different qualities about the processor, such as the presence of a floating point unit (FPU) and the ability to process MMX instructions.
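Because /proc/cpuinfo consists of "key : value" lines, with a blank line between processors, it is straightforward to read from a script. The following Python sketch is one possible approach; the function name parse_cpuinfo is hypothetical, and the sample text is an abbreviated copy of the output shown above.

```python
def parse_cpuinfo(text):
    """Return one dict of fields per processor found in cpuinfo-style text."""
    cpus, current = [], {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current processor's block.
            if current:
                cpus.append(current)
                current = {}
            continue
        key, _, value = line.partition(":")
        current[key.strip()] = value.strip()
    if current:
        cpus.append(current)
    return cpus


SAMPLE = """\
processor : 0
vendor_id : GenuineIntel
cpu family : 15
model name : Intel(R) Xeon(TM) CPU 2.40GHz
cpu MHz : 2392.371
cache size : 512 KB
siblings : 2
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov mmx
"""

if __name__ == "__main__":
    cpu = parse_cpuinfo(SAMPLE)[0]
    print(cpu["model name"], "supports MMX:", "mmx" in cpu["flags"].split())
```

On a live system, the same function can be given the contents of /proc/cpuinfo instead of SAMPLE.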
5.2.5. /proc/crypto
/proc/crypto
file looks like the following:
name : sha1 module : kernel type : digest blocksize : 64 digestsize : 20 name : md5 module : md5 type : digest blocksize : 64 digestsize : 16
5.2.6. /proc/devices
Character devices: 1 mem 4 /dev/vc/0 4 tty 4 ttyS 5 /dev/tty 5 /dev/console 5 /dev/ptmx 7 vcs 10 misc 13 input 29 fb 36 netlink 128 ptm 136 pts 180 usb Block devices: 1 ramdisk 3 ide0 9 md 22 ide1 253 device-mapper 254 mdp
/proc/devices
includes the major number and name of the device, and is broken into two major sections: Character devices
and Block devices
.
- Character devices do not require buffering. Block devices have a buffer available, allowing them to order requests before addressing them. This is important for devices designed to store information — such as hard drives — because the ability to order the information before writing it to the device allows it to be placed in a more efficient order.
- Character devices send data with no preconfigured size. Block devices can send and receive information in blocks of a size configured per device.
/usr/share/doc/kernel-doc-<version>/Documentation/devices.txt
5.2.7. /proc/dma
/proc/dma
file looks like the following:
4: cascade
5.2.8. /proc/execdomains
0-0 Linux [kernel]
PER_LINUX
execution domain, different personalities can be implemented as dynamically loadable modules.
5.2.9. /proc/fb
/proc/fb
for systems which contain frame buffer devices looks similar to the following:
0 VESA VGA
5.2.10. /proc/filesystems
/proc/filesystems
file looks similar to the following:
nodev sysfs nodev rootfs nodev bdev nodev proc nodev sockfs nodev binfmt_misc nodev usbfs nodev usbdevfs nodev futexfs nodev tmpfs nodev pipefs nodev eventpollfs nodev devpts ext2 nodev ramfs nodev hugetlbfs iso9660 nodev mqueue ext3 nodev rpc_pipefs nodev autofs
nodev
are not mounted on a device. The second column lists the names of the file systems supported.
mount
command cycles through the file systems listed here when one is not specified as an argument.
5.2.11. /proc/interrupts
/proc/interrupts
looks similar to the following:
CPU0 0: 80448940 XT-PIC timer 1: 174412 XT-PIC keyboard 2: 0 XT-PIC cascade 8: 1 XT-PIC rtc 10: 410964 XT-PIC eth0 12: 60330 XT-PIC PS/2 Mouse 14: 1314121 XT-PIC ide0 15: 5195422 XT-PIC ide1 NMI: 0 ERR: 0
CPU0 CPU1 0: 1366814704 0 XT-PIC timer 1: 128 340 IO-APIC-edge keyboard 2: 0 0 XT-PIC cascade 8: 0 1 IO-APIC-edge rtc 12: 5323 5793 IO-APIC-edge PS/2 Mouse 13: 1 0 XT-PIC fpu 16: 11184294 15940594 IO-APIC-level Intel EtherExpress Pro 10/100 Ethernet 20: 8450043 11120093 IO-APIC-level megaraid 30: 10432 10722 IO-APIC-level aic7xxx 31: 23 22 IO-APIC-level aic7xxx NMI: 0 ERR: 0
- XT-PIC: These are the old AT computer interrupts.
- IO-APIC-edge: The voltage signal on this interrupt transitions from low to high, creating an edge, where the interrupt occurs and is only signaled once. This kind of interrupt, as well as the IO-APIC-level interrupt, is only seen on systems with processors from the 586 family and higher.
- IO-APIC-level: Generates interrupts when its voltage signal is high until the signal is low again.
5.2.12. /proc/iomem
00000000-0009fbff : System RAM 0009fc00-0009ffff : reserved 000a0000-000bffff : Video RAM area 000c0000-000c7fff : Video ROM 000f0000-000fffff : System ROM 00100000-07ffffff : System RAM 00100000-00291ba8 : Kernel code 00291ba9-002e09cb : Kernel data e0000000-e3ffffff : VIA Technologies, Inc. VT82C597 [Apollo VP3] e4000000-e7ffffff : PCI Bus #01 e4000000-e4003fff : Matrox Graphics, Inc. MGA G200 AGP e5000000-e57fffff : Matrox Graphics, Inc. MGA G200 AGP e8000000-e8ffffff : PCI Bus #01 e8000000-e8ffffff : Matrox Graphics, Inc. MGA G200 AGP ea000000-ea00007f : Digital Equipment Corporation DECchip 21140 [FasterNet] ea000000-ea00007f : tulip ffff0000-ffffffff : reserved
5.2.13. /proc/ioports
/proc/ioports
provides a list of currently registered port regions used for input or output communication with a device. This file can be quite long. The following is a partial listing:
0000-001f : dma1 0020-003f : pic1 0040-005f : timer 0060-006f : keyboard 0070-007f : rtc 0080-008f : dma page reg 00a0-00bf : pic2 00c0-00df : dma2 00f0-00ff : fpu 0170-0177 : ide1 01f0-01f7 : ide0 02f8-02ff : serial(auto) 0376-0376 : ide1 03c0-03df : vga+ 03f6-03f6 : ide0 03f8-03ff : serial(auto) 0cf8-0cff : PCI conf1 d000-dfff : PCI Bus #01 e000-e00f : VIA Technologies, Inc. Bus Master IDE e000-e007 : ide0 e008-e00f : ide1 e800-e87f : Digital Equipment Corporation DECchip 21140 [FasterNet] e800-e87f : tulip
5.2.14. /proc/kcore
/proc/
files, kcore
displays a size. This value is given in bytes and is equal to the size of the physical memory (RAM) used plus 4 KB.
gdb
, and is not human readable.
Warning
/proc/kcore
virtual file. The contents of the file scramble text output on the terminal. If this file is accidentally viewed, press Ctrl+C to stop the process and then type reset
to bring back the command line prompt.
5.2.15. /proc/kmsg
/sbin/klogd
or /bin/dmesg
.
5.2.16. /proc/loadavg
uptime
and other commands. A sample /proc/loadavg
file looks similar to the following:
0.20 0.18 0.12 1/80 11206
5.2.17. /proc/locks
/proc/locks
file for a lightly loaded system looks similar to the following:
1: POSIX ADVISORY WRITE 3568 fd:00:2531452 0 EOF
2: FLOCK ADVISORY WRITE 3517 fd:00:2531448 0 EOF
3: POSIX ADVISORY WRITE 3452 fd:00:2531442 0 EOF
4: POSIX ADVISORY WRITE 3443 fd:00:2531440 0 EOF
5: POSIX ADVISORY WRITE 3326 fd:00:2531430 0 EOF
6: POSIX ADVISORY WRITE 3175 fd:00:2531425 0 EOF
7: POSIX ADVISORY WRITE 3056 fd:00:2548663 0 EOF
FLOCK
signifying the older-style UNIX file locks from a flock
system call and POSIX
representing the newer POSIX locks from the lockf
system call.
ADVISORY
or MANDATORY
. ADVISORY
means that the lock does not prevent other people from accessing the data; it only prevents other attempts to lock it. MANDATORY
means that no other access to the data is permitted while the lock is held. The fourth column reveals whether the lock is allowing the holder READ
or WRITE
access to the file. The fifth column shows the ID of the process holding the lock. The sixth column shows the ID of the file being locked, in the format of MAJOR-DEVICE:MINOR-DEVICE:INODE-NUMBER
. The seventh and eighth columns show the start and end of the file's locked region.
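The column layout described above maps directly onto a small record type. The Python sketch below splits one line of /proc/locks into named fields; the names Lock and parse_lock are hypothetical, and the code assumes the eight-column form shown in the sample, without any extra markers that other kernels may add.

```python
from collections import namedtuple

# Field names follow the column descriptions in the text.
Lock = namedtuple(
    "Lock", "number kind mode access pid major minor inode start end"
)


def parse_lock(line):
    """Parse one eight-column /proc/locks line into a Lock record."""
    number, kind, mode, access, pid, file_id, start, end = line.split()
    major, minor, inode = file_id.split(":")
    return Lock(
        number=int(number.rstrip(":")),
        kind=kind,        # FLOCK or POSIX
        mode=mode,        # ADVISORY or MANDATORY
        access=access,    # READ or WRITE
        pid=int(pid),
        major=major,      # device numbers are kept as printed
        minor=minor,
        inode=int(inode),
        start=int(start),
        end=None if end == "EOF" else int(end),  # None means "to end of file"
    )


if __name__ == "__main__":
    print(parse_lock("1: POSIX ADVISORY WRITE 3568 fd:00:2531452 0 EOF"))
```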
5.2.18. /proc/mdstat
/proc/mdstat
looks similar to the following:
Personalities : read_ahead not set unused devices: <none>
md
device is present. In that case, view /proc/mdstat
to find the current status of mdX
RAID devices.
/proc/mdstat
file below shows a system with its md0
configured as a RAID 1 device, while it is currently re-syncing the disks:
Personalities : [linear] [raid1] read_ahead 1024 sectors md0: active raid1 sda2[1] sdb2[0] 9940 blocks [2/2] [UU] resync=1% finish=12.3min algorithm 2 [3/3] [UUU] unused devices: <none>
5.2.19. /proc/meminfo
/proc/
directory, as it reports a large amount of valuable information about the system's RAM usage.
/proc/meminfo
virtual file is from a system with 256 MB of RAM and 512 MB of swap space:
MemTotal: 255908 kB
MemFree: 69936 kB
Buffers: 15812 kB
Cached: 115124 kB
SwapCached: 0 kB
Active: 92700 kB
Inactive: 63792 kB
HighTotal: 0 kB
HighFree: 0 kB
LowTotal: 255908 kB
LowFree: 69936 kB
SwapTotal: 524280 kB
SwapFree: 524280 kB
Dirty: 4 kB
Writeback: 0 kB
Mapped: 42236 kB
Slab: 25912 kB
Committed_AS: 118680 kB
PageTables: 1236 kB
VmallocTotal: 3874808 kB
VmallocUsed: 1416 kB
VmallocChunk: 3872908 kB
HugePages_Total: 0
HugePages_Free: 0
Hugepagesize: 4096 kB
free
, top
, and ps
commands. In fact, the output of the free
command is similar in appearance to the contents and structure of /proc/meminfo
. But by looking directly at /proc/meminfo
, more details are revealed:
- MemTotal: Total amount of physical RAM, in kilobytes.
- MemFree: The amount of physical RAM, in kilobytes, left unused by the system.
- Buffers: The amount of physical RAM, in kilobytes, used for file buffers.
- Cached: The amount of physical RAM, in kilobytes, used as cache memory.
- SwapCached: The amount of swap, in kilobytes, used as cache memory.
- Active: The total amount of buffer or page cache memory, in kilobytes, that is in active use. This is memory that has been recently used and is usually not reclaimed for other purposes.
- Inactive: The total amount of buffer or page cache memory, in kilobytes, that is free and available. This is memory that has not been recently used and can be reclaimed for other purposes.
- HighTotal and HighFree: The total and free amount of memory, in kilobytes, that is not directly mapped into kernel space. The HighTotal value can vary based on the type of kernel used.
- LowTotal and LowFree: The total and free amount of memory, in kilobytes, that is directly mapped into kernel space. The LowTotal value can vary based on the type of kernel used.
- SwapTotal: The total amount of swap available, in kilobytes.
- SwapFree: The total amount of swap free, in kilobytes.
- Dirty: The total amount of memory, in kilobytes, waiting to be written back to the disk.
- Writeback: The total amount of memory, in kilobytes, actively being written back to the disk.
- Mapped: The total amount of memory, in kilobytes, which has been used to map devices, files, or libraries using the mmap system call.
- Slab: The total amount of memory, in kilobytes, used by the kernel to cache data structures for its own use.
- Committed_AS: The total amount of memory, in kilobytes, estimated to complete the workload. This value represents the worst case scenario value, and also includes swap memory.
- PageTables: The total amount of memory, in kilobytes, dedicated to the lowest page table level.
- VmallocTotal: The total amount of memory, in kilobytes, of total allocated virtual address space.
- VmallocUsed: The total amount of memory, in kilobytes, of used virtual address space.
- VmallocChunk: The largest contiguous block of memory, in kilobytes, of available virtual address space.
- HugePages_Total: The total number of hugepages for the system. The number is derived by dividing the memory set aside for hugepages, specified in /proc/sys/vm/hugetlb_pool, by Hugepagesize. This statistic only appears on the x86, Itanium, and AMD64 architectures.
- HugePages_Free: The total number of hugepages available for the system. This statistic only appears on the x86, Itanium, and AMD64 architectures.
- Hugepagesize: The size for each hugepages unit in kilobytes. By default, the value is 4096 KB on uniprocessor kernels for 32 bit architectures. For SMP, hugemem kernels, and AMD64, the default is 2048 KB. For Itanium architectures, the default is 262144 KB. This statistic only appears on the x86, Itanium, and AMD64 architectures.
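The fields described above can be loaded into a dictionary for simple calculations. The Python sketch below is an illustration only: parse_meminfo is a hypothetical name, the sample is an excerpt of the output shown earlier, and subtracting free, buffer, and cache memory from the total is a common rough estimate of application memory use rather than a figure the kernel reports.

```python
def parse_meminfo(text):
    """Return a dict mapping each meminfo field name to its integer value."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep or not rest.strip():
            continue
        info[key.strip()] = int(rest.split()[0])  # drop the trailing "kB"
    return info


SAMPLE = """\
MemTotal: 255908 kB
MemFree: 69936 kB
Buffers: 15812 kB
Cached: 115124 kB
SwapTotal: 524280 kB
SwapFree: 524280 kB
HugePages_Total: 0
Hugepagesize: 4096 kB
"""

if __name__ == "__main__":
    m = parse_meminfo(SAMPLE)
    # Rough estimate: memory not free and not used for buffers or cache.
    in_use = m["MemTotal"] - m["MemFree"] - m["Buffers"] - m["Cached"]
    print("approx. application memory:", in_use, "kB")
    print("swap in use:", m["SwapTotal"] - m["SwapFree"], "kB")
```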
5.2.20. /proc/misc
63 device-mapper
175 agpgart
135 rtc
134 apm_bios
5.2.21. /proc/modules
/proc/modules
file output:
Note
/sbin/lsmod
command.
nfs 170109 0 - Live 0x129b0000 lockd 51593 1 nfs, Live 0x128b0000 nls_utf8 1729 0 - Live 0x12830000 vfat 12097 0 - Live 0x12823000 fat 38881 1 vfat, Live 0x1287b000 autofs4 20293 2 - Live 0x1284f000 sunrpc 140453 3 nfs,lockd, Live 0x12954000 3c59x 33257 0 - Live 0x12871000 uhci_hcd 28377 0 - Live 0x12869000 md5 3777 1 - Live 0x1282c000 ipv6 211845 16 - Live 0x128de000 ext3 92585 2 - Live 0x12886000 jbd 65625 1 ext3, Live 0x12857000 dm_mod 46677 3 - Live 0x12833000
Live
, Loading
, or Unloading
are the only possible values.
oprofile
.
5.2.22. /proc/mounts
rootfs / rootfs rw 0 0
/proc /proc proc rw,nodiratime 0 0
none /dev ramfs rw 0 0
/dev/mapper/VolGroup00-LogVol00 / ext3 rw 0 0
none /dev ramfs rw 0 0
/proc /proc proc rw,nodiratime 0 0
/sys /sys sysfs rw 0 0
none /dev/pts devpts rw 0 0
usbdevfs /proc/bus/usb usbdevfs rw 0 0
/dev/hda1 /boot ext3 rw 0 0
none /dev/shm tmpfs rw 0 0
none /proc/sys/fs/binfmt_misc binfmt_misc rw 0 0
sunrpc /var/lib/nfs/rpc_pipefs rpc_pipefs rw 0 0
/etc/mtab
, except that /proc/mounts
is more up-to-date.
ro
) or read-write (rw
). The fifth and sixth columns are dummy values designed to match the format used in /etc/mtab
.
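The six-column layout makes /proc/mounts easy to process in a script. The following Python sketch lists mount points by file system type; parse_mounts is a hypothetical name and the sample lines are taken from the output above.

```python
def parse_mounts(text):
    """Return a list of dicts, one per six-column mounts line."""
    mounts = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 6:
            continue  # skip anything that is not a complete entry
        device, mount_point, fstype, options, _dump, _passno = fields
        mounts.append({
            "device": device,
            "mount_point": mount_point,
            "fstype": fstype,
            "options": options.split(","),  # e.g. ["rw", "nodiratime"]
        })
    return mounts


SAMPLE = """\
rootfs / rootfs rw 0 0
/proc /proc proc rw,nodiratime 0 0
/dev/mapper/VolGroup00-LogVol00 / ext3 rw 0 0
/dev/hda1 /boot ext3 rw 0 0
"""

if __name__ == "__main__":
    for m in parse_mounts(SAMPLE):
        if m["fstype"] == "ext3":
            mode = "read-only" if "ro" in m["options"] else "read-write"
            print(m["mount_point"], "on", m["device"], "is", mode)
```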
5.2.23. /proc/mtrr
/proc/mtrr
file may look similar to the following:
reg00: base=0x00000000 ( 0MB), size= 256MB: write-back, count=1 reg01: base=0xe8000000 (3712MB), size= 32MB: write-combining, count=1
/proc/mtrr
file can increase performance more than 150%.
/usr/share/doc/kernel-doc-<version>/Documentation/mtrr.txt
5.2.24. /proc/partitions
major minor #blocks name
3 0 19531250 hda
3 1 104391 hda1
3 2 19422585 hda2
253 0 22708224 dm-0
253 1 524288 dm-1
- major: The major number of the device with this partition. The major number in the /proc/partitions sample (3) corresponds with the block device ide0 in /proc/devices.
- minor: The minor number of the device with this partition. This serves to separate the partitions into different physical devices and relates to the number at the end of the name of the partition.
- #blocks: Lists the number of physical disk blocks contained in a particular partition.
- name: The name of the partition.
5.2.25. /proc/pci
/proc/pci
can be rather long. A sampling of this file from a basic system looks similar to the following:
Bus 0, device 0, function 0: Host bridge: Intel Corporation 440BX/ZX - 82443BX/ZX Host bridge (rev 3). Master Capable. Latency=64. Prefetchable 32 bit memory at 0xe4000000 [0xe7ffffff]. Bus 0, device 1, function 0: PCI bridge: Intel Corporation 440BX/ZX - 82443BX/ZX AGP bridge (rev 3). Master Capable. Latency=64. Min Gnt=128. Bus 0, device 4, function 0: ISA bridge: Intel Corporation 82371AB PIIX4 ISA (rev 2). Bus 0, device 4, function 1: IDE interface: Intel Corporation 82371AB PIIX4 IDE (rev 1). Master Capable. Latency=32. I/O at 0xd800 [0xd80f]. Bus 0, device 4, function 2: USB Controller: Intel Corporation 82371AB PIIX4 USB (rev 1). IRQ 5. Master Capable. Latency=32. I/O at 0xd400 [0xd41f]. Bus 0, device 4, function 3: Bridge: Intel Corporation 82371AB PIIX4 ACPI (rev 2). IRQ 9. Bus 0, device 9, function 0: Ethernet controller: Lite-On Communications Inc LNE100TX (rev 33). IRQ 5. Master Capable. Latency=32. I/O at 0xd000 [0xd0ff]. Bus 0, device 12, function 0: VGA compatible controller: S3 Inc. ViRGE/DX or /GX (rev 1). IRQ 11. Master Capable. Latency=32. Min Gnt=4.Max Lat=255.
Note
lspci -vb
5.2.26. /proc/slabinfo
/proc/slabinfo
file manually, the /usr/bin/slabtop
program displays kernel slab cache information in real time. This program allows for custom configurations, including column sorting and screen refreshing.
/usr/bin/slabtop
usually looks like the following example:
Active / Total Objects (% used) : 133629 / 147300 (90.7%) Active / Total Slabs (% used) : 11492 / 11493 (100.0%) Active / Total Caches (% used) : 77 / 121 (63.6%) Active / Total Size (% used) : 41739.83K / 44081.89K (94.7%) Minimum / Average / Maximum Object : 0.01K / 0.30K / 128.00K OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME 44814 43159 96% 0.62K 7469 6 29876K ext3_inode_cache 36900 34614 93% 0.05K 492 75 1968K buffer_head 35213 33124 94% 0.16K 1531 23 6124K dentry_cache 7364 6463 87% 0.27K 526 14 2104K radix_tree_node 2585 1781 68% 0.08K 55 47 220K vm_area_struct 2263 2116 93% 0.12K 73 31 292K size-128 1904 1125 59% 0.03K 16 119 64K size-32 1666 768 46% 0.03K 14 119 56K anon_vma 1512 1482 98% 0.44K 168 9 672K inode_cache 1464 1040 71% 0.06K 24 61 96K size-64 1320 820 62% 0.19K 66 20 264K filp 678 587 86% 0.02K 3 226 12K dm_io 678 587 86% 0.02K 3 226 12K dm_tio 576 574 99% 0.47K 72 8 288K proc_inode_cache 528 514 97% 0.50K 66 8 264K size-512 492 372 75% 0.09K 12 41 48K bio 465 314 67% 0.25K 31 15 124K size-256 452 331 73% 0.02K 2 226 8K biovec-1 420 420 100% 0.19K 21 20 84K skbuff_head_cache 305 256 83% 0.06K 5 61 20K biovec-4 290 4 1% 0.01K 1 290 4K revoke_table 264 264 100% 4.00K 264 1 1056K size-4096 260 256 98% 0.19K 13 20 52K biovec-16 260 256 98% 0.75K 52 5 208K biovec-64
/proc/slabinfo
that are included into /usr/bin/slabtop
include:
- OBJS: The total number of objects (memory blocks), including those in use (allocated), and some spares not in use.
- ACTIVE: The number of objects (memory blocks) that are in use (allocated).
- USE: Percentage of total objects that are active, that is, (ACTIVE/OBJS) x 100.
- OBJ SIZE: The size of the objects.
- SLABS: The total number of slabs.
- OBJ/SLAB: The number of objects that fit into a slab.
- CACHE SIZE: The cache size of the slab.
- NAME: The name of the slab.
/usr/bin/slabtop
program, refer to the slabtop
man page.
5.2.27. /proc/stat
/proc/stat
, which can be quite long, usually begins like the following example:
cpu 259246 7001 60190 34250993 137517 772 0
cpu0 259246 7001 60190 34250993 137517 772 0
intr 354133732 347209999 2272 0 4 4 0 0 3 1 1249247 0 0 80143 0 422626 5169433
ctxt 12547729
btime 1093631447
processes 130523
procs_running 1
procs_blocked 0
preempt 5651840

cpu 209841 1554 21720 118519346 72939 154 27168
cpu0 42536 798 4841 14790880 14778 124 3117
cpu1 24184 569 3875 14794524 30209 29 3130
cpu2 28616 11 2182 14818198 4020 1 3493
cpu3 35350 6 2942 14811519 3045 0 3659
cpu4 18209 135 2263 14820076 12465 0 3373
cpu5 20795 35 1866 14825701 4508 0 3615
cpu6 21607 0 2201 14827053 2325 0 3334
cpu7 18544 0 1550 14831395 1589 0 3447
intr 15239682 14857833 6 0 6 6 0 5 0 1 0 0 0 29 0 2 0 0 0 0 0 0 0 94982 0 286812
ctxt 4209609
btime 1078711415
processes 21905
procs_running 1
procs_blocked 0
- cpu: Measures the number of jiffies (1/100 of a second for x86 systems) that the system has been in user mode, user mode with low priority (nice), system mode, idle task, I/O wait, IRQ (hardirq), and softirq respectively. The IRQ (hardirq) is the direct response to a hardware event. The IRQ does minimal work itself and queues the "heavy" work for the softirq to execute. The softirq runs at a lower priority than the IRQ and therefore may be interrupted more frequently. The total for all CPUs is given at the top, while each individual CPU is listed below with its own statistics. The eight-CPU sample shown above is from a 4-way Intel Pentium Xeon configuration with multi-threading enabled, therefore showing four physical processors and four virtual processors totaling eight processors.
- page: The number of memory pages the system has written in and out to disk.
- swap: The number of swap pages the system has brought in and out.
- intr: The number of interrupts the system has experienced.
- btime: The boot time, measured in the number of seconds since January 1, 1970, otherwise known as the epoch.
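As an illustration of how these counters can be used, the Python sketch below reads the cpu lines and the btime value and derives the fraction of time spent idle. The names parse_stat and idle_fraction are hypothetical, and the sample is an excerpt of the single-CPU output shown above.

```python
from datetime import datetime, timezone

# Order of the per-CPU counters, as described in the text.
CPU_FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq")


def parse_stat(text):
    """Extract per-CPU jiffy counters and the boot time from stat-style text."""
    result = {"cpus": {}}
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0].startswith("cpu"):
            values = [int(v) for v in parts[1:1 + len(CPU_FIELDS)]]
            result["cpus"][parts[0]] = dict(zip(CPU_FIELDS, values))
        elif parts[0] == "btime":
            result["btime"] = int(parts[1])
    return result


def idle_fraction(cpu):
    """Share of all counted jiffies that were spent in the idle task."""
    total = sum(cpu.values())
    return cpu["idle"] / total if total else 0.0


SAMPLE = """\
cpu 259246 7001 60190 34250993 137517 772 0
cpu0 259246 7001 60190 34250993 137517 772 0
ctxt 12547729
btime 1093631447
"""

if __name__ == "__main__":
    stat = parse_stat(SAMPLE)
    print("idle: %.1f%%" % (100 * idle_fraction(stat["cpus"]["cpu"])))
    print("booted:", datetime.fromtimestamp(stat["btime"], tz=timezone.utc))
```

The counters are cumulative since boot, so a monitoring tool would normally take two readings and work with the difference between them.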
5.2.28. /proc/swaps
/proc/swaps
may look similar to the following:
Filename                         Type       Size    Used  Priority
/dev/mapper/VolGroup00-LogVol01  partition  524280  0     -1
/proc/
directory, /proc/swaps
provides a snapshot of every swap file name, the type of swap space, the total size, and the amount of space in use (in kilobytes). The priority column is useful when multiple swap files are in use. Swap areas with a higher priority value are used before those with a lower one.
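A script can read the same columns to report how much of each swap area is in use. The Python sketch below assumes the five-column layout shown in the sample and file names without embedded spaces; parse_swaps is a hypothetical name.

```python
def parse_swaps(text):
    """Return one dict per swap area listed after the header line."""
    areas = []
    for line in text.strip().splitlines()[1:]:  # first line is the header
        name, kind, size, used, priority = line.split()
        areas.append({
            "name": name,
            "type": kind,
            "size_kb": int(size),
            "used_kb": int(used),
            "priority": int(priority),
        })
    return areas


SAMPLE = """\
Filename Type Size Used Priority
/dev/mapper/VolGroup00-LogVol01 partition 524280 0 -1
"""

if __name__ == "__main__":
    for area in parse_swaps(SAMPLE):
        pct = 100.0 * area["used_kb"] / area["size_kb"] if area["size_kb"] else 0.0
        print("%s: %.1f%% used, priority %d" % (area["name"], pct, area["priority"]))
```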
5.2.29. /proc/sysrq-trigger
echo
command to write to this file, a remote root user can execute most System Request Key commands remotely as if at the local terminal. To echo
values to this file, the /proc/sys/kernel/sysrq
must be set to a value other than 0
. For more information about the System Request Key, refer to Section 5.3.9.3, “ /proc/sys/kernel/
”.
5.2.30. /proc/uptime
/proc/uptime
is quite minimal:
350735.47 234388.90
5.2.31. /proc/version
gcc
in use, as well as the version of Red Hat Enterprise Linux installed on the system:
Linux version 2.6.8-1.523 (user@foo.redhat.com) (gcc version 3.4.1 20040714 \ (Red Hat Enterprise Linux 3.4.1-7)) #1 Mon Aug 16 13:27:03 EDT 2004
5.3. Directories within /proc/
/proc/
directory.
5.3.1. Process Directories
/proc/
directory contains a number of directories with numerical names. A listing of them may be similar to the following:
dr-xr-xr-x 3 root root 0 Feb 13 01:28 1
dr-xr-xr-x 3 root root 0 Feb 13 01:28 1010
dr-xr-xr-x 3 xfs xfs 0 Feb 13 01:28 1087
dr-xr-xr-x 3 daemon daemon 0 Feb 13 01:28 1123
dr-xr-xr-x 3 root root 0 Feb 13 01:28 11307
dr-xr-xr-x 3 apache apache 0 Feb 13 01:28 13660
dr-xr-xr-x 3 rpc rpc 0 Feb 13 01:28 637
dr-xr-xr-x 3 rpcuser rpcuser 0 Feb 13 01:28 666
/proc/
process directory vanishes.
cmdline — Contains the command issued when starting the process.
cwd — A symbolic link to the current working directory for the process.
environ — A list of the environment variables for the process. By convention, variable names are given in upper-case characters; values appear exactly as they were set.
exe — A symbolic link to the executable of this process.
fd — A directory containing all of the file descriptors for a particular process. These are given in numbered links:
total 0
lrwx------ 1 root root 64 May 8 11:31 0 -> /dev/null
lrwx------ 1 root root 64 May 8 11:31 1 -> /dev/null
lrwx------ 1 root root 64 May 8 11:31 2 -> /dev/null
lrwx------ 1 root root 64 May 8 11:31 3 -> /dev/ptmx
lrwx------ 1 root root 64 May 8 11:31 4 -> socket:[7774817]
lrwx------ 1 root root 64 May 8 11:31 5 -> /dev/ptmx
lrwx------ 1 root root 64 May 8 11:31 6 -> socket:[7774829]
lrwx------ 1 root root 64 May 8 11:31 7 -> /dev/ptmx
maps — A list of memory maps to the various executables and library files associated with this process. This file can be rather long, depending upon the complexity of the process, but sample output from the sshd process begins like the following:
08048000-08086000 r-xp 00000000 03:03 391479 /usr/sbin/sshd
08086000-08088000 rw-p 0003e000 03:03 391479 /usr/sbin/sshd
08088000-08095000 rwxp 00000000 00:00 0
40000000-40013000 r-xp 00000000 03:03 293205 /lib/ld-2.2.5.so
40013000-40014000 rw-p 00013000 03:03 293205 /lib/ld-2.2.5.so
40031000-40038000 r-xp 00000000 03:03 293282 /lib/libpam.so.0.75
40038000-40039000 rw-p 00006000 03:03 293282 /lib/libpam.so.0.75
40039000-4003a000 rw-p 00000000 00:00 0
4003a000-4003c000 r-xp 00000000 03:03 293218 /lib/libdl-2.2.5.so
4003c000-4003d000 rw-p 00001000 03:03 293218 /lib/libdl-2.2.5.so
mem — The memory held by the process. This file cannot be read by the user.
root — A link to the root directory of the process.
stat — The status of the process.
statm — The status of the memory in use by the process. Below is a sample statm file:
263 210 210 5 0 205 0
The seven columns relate to different memory statistics for the process. All values are measured in pages. From left to right, they report the following aspects of the memory used:
- Total program size.
- Size of memory portions.
- Number of pages that are shared.
- Number of pages that are code.
- Number of pages of data/stack.
- Number of library pages.
- Number of dirty pages.
status — The status of the process in a more readable form than stat or statm. Sample output for sshd looks similar to the following:
Name: sshd
State: S (sleeping)
Tgid: 797
Pid: 797
PPid: 1
TracerPid: 0
Uid: 0 0 0 0
Gid: 0 0 0 0
FDSize: 32
Groups:
VmSize: 3072 kB
VmLck: 0 kB
VmRSS: 840 kB
VmData: 104 kB
VmStk: 12 kB
VmExe: 300 kB
VmLib: 2528 kB
SigPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 8000000000001000
SigCgt: 0000000000014005
CapInh: 0000000000000000
CapPrm: 00000000fffffeff
CapEff: 00000000fffffeff
The information in this output includes the process name and ID, the state (such as S (sleeping) or R (running)), user/group ID running the process, and detailed data regarding memory usage.
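Because each line of the status file is a name followed by a colon and a value, it is straightforward to load into a dictionary. The sketch below is illustrative only; `parse_status` is a hypothetical helper, and the sample lines are a subset of the sshd output shown above.

```python
def parse_status(text):
    """Split 'Name: value' lines from a process status file into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # ignore any line without a colon
            fields[key.strip()] = value.strip()
    return fields


sample = "\n".join([
    "Name: sshd",
    "State: S (sleeping)",
    "Pid: 797",
    "PPid: 1",
    "Groups:",
    "VmRSS: 840 kB",
])
status = parse_status(sample)
print(status["Name"], status["State"], status["VmRSS"])
```

On a running system, `open("/proc/self/status").read()` would supply the status of the Python process itself.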
5.3.1.1. /proc/self/
/proc/self/
directory is a link to the currently running process. This allows a process to look at itself without having to know its process ID.
/proc/self/
directory produces the same contents as listing the process directory for that process.
5.3.2. /proc/bus/
/proc/bus/
by the same name, such as /proc/bus/pci/
.
/proc/bus/
vary depending on the devices connected to the system. However, each bus type has at least one directory. Within these bus directories there is normally at least one subdirectory with a numerical name, such as 001, which contains binary files.
/proc/bus/usb/
subdirectory contains files that track the various devices on any USB buses, as well as the drivers required for them. The following is a sample listing of a /proc/bus/usb/
directory:
total 0
dr-xr-xr-x 1 root root 0 May 3 16:25 001
-r--r--r-- 1 root root 0 May 3 16:25 devices
-r--r--r-- 1 root root 0 May 3 16:25 drivers
/proc/bus/usb/001/
directory contains all devices on the first USB bus and the devices
file identifies the USB root hub on the motherboard.
/proc/bus/usb/devices
file:
T: Bus=01 Lev=00 Prnt=00 Port=00 Cnt=00 Dev#= 1 Spd=12 MxCh= 2
B: Alloc= 0/900 us ( 0%), #Int= 0, #Iso= 0
D: Ver= 1.00 Cls=09(hub ) Sub=00 Prot=00 MxPS= 8 #Cfgs= 1
P: Vendor=0000 ProdID=0000 Rev= 0.00
S: Product=USB UHCI Root Hub
S: SerialNumber=d400
C:* #Ifs= 1 Cfg#= 1 Atr=40 MxPwr= 0mA
I: If#= 0 Alt= 0 #EPs= 1 Cls=09(hub ) Sub=00 Prot=00 Driver=hub
E: Ad=81(I) Atr=03(Int.) MxPS= 8 Ivl=255ms
5.3.3. /proc/driver/
rtc
which provides output from the driver for the system's Real Time Clock (RTC), the device that keeps the time while the system is switched off. Sample output from /proc/driver/rtc
looks like the following:
rtc_time : 16:21:00
rtc_date : 2004-08-31
rtc_epoch : 1900
alarm : 21:16:27
DST_enable : no
BCD : yes
24hr : yes
square_wave : no
alarm_IRQ : no
update_IRQ : no
periodic_IRQ : no
periodic_freq : 1024
batt_status : okay
/usr/share/doc/kernel-doc-<version>/Documentation/rtc.txt
.
5.3.4. /proc/fs
cat /proc/fs/nfsd/exports
displays the file systems being shared and the permissions granted for those file systems. For more on file system sharing with NFS, refer to Chapter 21, Network File System (NFS).
5.3.5. /proc/ide/
/proc/ide/ide0
and /proc/ide/ide1
. In addition, a drivers
file is available, providing the version number of the various drivers used on the IDE channels:
ide-floppy version 0.99.newide
ide-cdrom version 4.61
ide-disk version 1.18
/proc/ide/piix
file which reveals whether DMA or UDMA is enabled for the devices on the IDE channels:
Intel PIIX4 Ultra 33 Chipset.
------------- Primary Channel ---------------- Secondary Channel -------------
                 enabled                          enabled
------------- drive0 --------- drive1 -------- drive0 ---------- drive1 ------
DMA enabled:     yes              no               yes               no
UDMA enabled:    yes              no               no                no
UDMA enabled:    2                X                X                 X
UDMA
DMA
PIO
ide0
, provides additional information. The channel
file provides the channel number, while the model
identifies the bus type for the channel (such as pci
).
5.3.5.1. Device Directories
/dev/
directory. For instance, the first IDE drive on ide0
would be hda
.
Note
/proc/ide/
directory.
cache — The device cache.
capacity — The capacity of the device, in 512 byte blocks.
driver — The driver and version used to control the device.
geometry — The physical and logical geometry of the device.
media — The type of device, such as a disk.
model — The model name or number of the device.
settings — A collection of current device parameters. This file usually contains quite a bit of useful, technical information. A sample settings file for a standard IDE hard disk looks similar to the following:
name            value       min   max    mode
----            -----       ---   ---    ----
acoustic        0           0     254    rw
address         0           0     2      rw
bios_cyl        38752       0     65535  rw
bios_head       16          0     255    rw
bios_sect       63          0     63     rw
bswap           0           0     1      r
current_speed   68          0     70     rw
failures        0           0     65535  rw
init_speed      68          0     70     rw
io_32bit        0           0     3      rw
keepsettings    0           0     1      rw
lun             0           0     7      rw
max_failures    1           0     65535  rw
multcount       16          0     16     rw
nice1           1           0     1      rw
nowerr          0           0     1      rw
number          0           0     3      rw
pio_mode        write-only  0     255    w
unmaskirq       0           0     1      rw
using_dma       1           0     1      rw
wcache          1           0     1      rw
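The settings file is a simple five-column table, so it can be loaded into a lookup structure. The following is a sketch under the assumption that every data row has exactly the columns name, value, min, max, and mode, as in the sample above; `parse_ide_settings` is a hypothetical helper name.

```python
def parse_ide_settings(text):
    """Parse an IDE device 'settings' table into a dict keyed by setting name."""
    settings = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines, the header row, and the dashed separator row.
        if len(parts) != 5 or parts[0] in ("name", "----"):
            continue
        name, value, low, high, mode = parts
        settings[name] = {
            # Some values (such as pio_mode) are not numeric.
            "value": int(value) if value.isdigit() else value,
            "min": int(low),
            "max": int(high),
            "mode": mode,
        }
    return settings


sample = (
    "name value min max mode\n"
    "---- ----- --- --- ----\n"
    "bios_cyl 38752 0 65535 rw\n"
    "pio_mode write-only 0 255 w\n"
    "using_dma 1 0 1 rw\n"
)
settings = parse_ide_settings(sample)
print("DMA in use:", settings["using_dma"]["value"] == 1)
```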
5.3.6. /proc/irq/
/proc/irq/prof_cpu_mask
file is a bitmask that contains the default values for the smp_affinity
file in the IRQ directory. The values in smp_affinity
specify which CPUs handle that particular IRQ.
/proc/irq/
directory, refer to the following installed documentation:
/usr/share/doc/kernel-doc-<version>/Documentation/filesystems/proc.txt
5.3.7. /proc/net/
/proc/net/
directory:
arp — Lists the kernel's ARP table. This file is particularly useful for connecting a hardware address to an IP address on a system.
atm/ directory — The files within this directory contain Asynchronous Transfer Mode (ATM) settings and statistics. This directory is primarily used with ATM networking and ADSL cards.
dev — Lists the various network devices configured on the system, complete with transmit and receive statistics. This file displays the number of bytes each interface has sent and received, the number of packets inbound and outbound, the number of errors seen, the number of packets dropped, and more.
dev_mcast — Lists Layer2 multicast groups on which each device is listening.
igmp — Lists the IP multicast addresses which this system joined.
ip_conntrack — Lists tracked network connections for machines that are forwarding IP connections.
ip_tables_names — Lists the types of iptables in use. This file is only present if iptables is active on the system and contains one or more of the following values: filter, mangle, or nat.
ip_mr_cache — Lists the multicast routing cache.
ip_mr_vif — Lists multicast virtual interfaces.
netstat — Contains a broad yet detailed collection of networking statistics, including TCP timeouts, SYN cookies sent and received, and much more.
psched — Lists global packet scheduler parameters.
raw — Lists raw device statistics.
route — Lists the kernel's routing table.
rt_cache — Contains the current routing cache.
snmp — Lists Simple Network Management Protocol (SNMP) data for various networking protocols in use.
sockstat — Provides socket statistics.
tcp — Contains detailed TCP socket information.
tr_rif — Lists the token ring RIF routing table.
udp — Contains detailed UDP socket information.
unix — Lists UNIX domain sockets currently in use.
wireless — Lists wireless interface data.
5.3.8. /proc/scsi/
/proc/ide/
directory, but it is for connected SCSI devices.
/proc/scsi/scsi
, which contains a list of every recognized SCSI device. From this listing, the type of device, as well as the model name, vendor, SCSI channel and ID data is available.
Attached devices:
Host: scsi1 Channel: 00 Id: 05 Lun: 00
  Vendor: NEC Model: CD-ROM DRIVE:466 Rev: 1.06
  Type: CD-ROM ANSI SCSI revision: 02
Host: scsi1 Channel: 00 Id: 06 Lun: 00
  Vendor: ARCHIVE Model: Python 04106-XXX Rev: 7350
  Type: Sequential-Access ANSI SCSI revision: 02
Host: scsi2 Channel: 00 Id: 06 Lun: 00
  Vendor: DELL Model: 1x6 U2W SCSI BP Rev: 5.35
  Type: Processor ANSI SCSI revision: 02
Host: scsi2 Channel: 02 Id: 00 Lun: 00
  Vendor: MegaRAID Model: LD0 RAID5 34556R Rev: 1.01
  Type: Direct-Access ANSI SCSI revision: 02
/proc/scsi/
, which contains files specific to each SCSI controller using that driver. From the previous example, aic7xxx/
and megaraid/
directories are present, since two drivers are in use. The files in each of the directories typically contain an I/O address range, IRQ information, and statistics for the SCSI controller using that driver. Each controller can report a different type and amount of information. The Adaptec AIC-7880 Ultra SCSI host adapter's file in this example system produces the following output:
Adaptec AIC7xxx driver version: 5.1.20/3.2.4
Compile Options:
  TCQ Enabled By Default : Disabled
  AIC7XXX_PROC_STATS : Enabled
  AIC7XXX_RESET_DELAY : 5
Adapter Configuration:
  SCSI Adapter: Adaptec AIC-7880 Ultra SCSI host adapter
  Ultra Narrow Controller
  PCI MMAPed I/O Base: 0xfcffe000
  Adapter SEEPROM Config: SEEPROM found and used.
  Adaptec SCSI BIOS: Enabled
  IRQ: 30
  SCBs: Active 0, Max Active 1, Allocated 15, HW 16, Page 255
  Interrupts: 33726
  BIOS Control Word: 0x18a6
  Adapter Control Word: 0x1c5f
  Extended Translation: Enabled
  Disconnect Enable Flags: 0x00ff
  Ultra Enable Flags: 0x0020
  Tag Queue Enable Flags: 0x0000
  Ordered Queue Tag Flags: 0x0000
  Default Tag Queue Depth: 8
  Tagged Queue By Device array for aic7xxx host instance 1:
    {255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255}
  Actual queue depth per device for aic7xxx host instance 1:
    {1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1}
Statistics:
(scsi1:0:5:0)
  Device using Narrow/Sync transfers at 20.0 MByte/sec, offset 15
  Transinfo settings: current(12/15/0/0), goal(12/15/0/0), user(12/15/0/0)
  Total transfers 0 (0 reads and 0 writes)
           < 2K  2K+  4K+  8K+  16K+  32K+  64K+  128K+
  Reads:      0    0    0    0     0     0     0      0
  Writes:     0    0    0    0     0     0     0      0
(scsi1:0:6:0)
  Device using Narrow/Sync transfers at 10.0 MByte/sec, offset 15
  Transinfo settings: current(25/15/0/0), goal(12/15/0/0), user(12/15/0/0)
  Total transfers 132 (0 reads and 132 writes)
           < 2K  2K+  4K+  8K+  16K+  32K+  64K+  128K+
  Reads:      0    0    0    0     0     0     0      0
  Writes:     0    0    0    1   131     0     0      0
5.3.9. /proc/sys/
/proc/sys/
directory is different from others in /proc/
because it not only provides information about the system but also allows the system administrator to immediately enable and disable kernel features.
Warning
/proc/sys/
directory. Changing the wrong setting may render the kernel unstable, requiring a system reboot.
/proc/sys/
.
-l
option at the shell prompt. If the file is writable, it may be used to configure the kernel. For example, a partial listing of /proc/sys/fs
looks like the following:
-r--r--r-- 1 root root 0 May 10 16:14 dentry-state
-rw-r--r-- 1 root root 0 May 10 16:14 dir-notify-enable
-r--r--r-- 1 root root 0 May 10 16:14 dquot-nr
-rw-r--r-- 1 root root 0 May 10 16:14 file-max
-r--r--r-- 1 root root 0 May 10 16:14 file-nr
dir-notify-enable
and file-max
can be written to and, therefore, can be used to configure the kernel. The other files only provide feedback on current settings.
/proc/sys/
file is done by echoing the new value into the file. For example, to enable the System Request Key on a running kernel, type the command:
echo 1 > /proc/sys/kernel/sysrq
sysrq
from 0
(off) to 1
(on).
/proc/sys/
configuration files contain more than one value. To correctly send new values to them, place a space character between each value passed with the echo
command, such as is done in this example:
echo 4 2 45 > /proc/sys/kernel/acct
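The same write can be made from a script. The following sketch mirrors the echo commands above: it joins the values with single spaces and writes them to the file. The helper names `write_sysctl` and `read_sysctl` are hypothetical, and the demonstration deliberately writes to a temporary file rather than to /proc/sys/, since changing a live kernel setting requires root privileges and care.

```python
import os
import tempfile


def write_sysctl(path, *values):
    """Write one or more values to a file, separated by single spaces."""
    data = " ".join(str(v) for v in values) + "\n"
    with open(path, "w") as f:
        f.write(data)


def read_sysctl(path):
    """Return the whitespace-separated values currently stored in a file."""
    with open(path) as f:
        return f.read().split()


# Demonstration against a stand-in file; on a real system the path would be
# something like /proc/sys/kernel/acct and the script would need to run as root.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "acct")
    write_sysctl(demo_path, 4, 2, 45)
    print(read_sysctl(demo_path))
```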
Note
echo
command disappear when the system is restarted. To make configuration changes take effect after the system is rebooted, refer to Section 5.4, “Using the sysctl
Command”.
/proc/sys/
directory contains several subdirectories controlling different aspects of a running kernel.
5.3.9.1. /proc/sys/dev/
cdrom/
and raid/
. Customized kernels can have other directories, such as parport/
, which provides the ability to share one parallel port between multiple device drivers.
cdrom/
directory contains a file called info
, which reveals a number of important CD-ROM parameters:
CD-ROM information, Id: cdrom.c 3.20 2003/12/17
drive name: hdc
drive speed: 48
drive # of slots: 1
Can close tray: 1
Can open tray: 1
Can lock tray: 1
Can change speed: 1
Can select disk: 0
Can read multisession: 1
Can read MCN: 1
Reports media changed: 1
Can play audio: 1
Can write CD-R: 0
Can write CD-RW: 0
Can read DVD: 0
Can write DVD-R: 0
Can write DVD-RAM: 0
Can read MRW: 0
Can write MRW: 0
Can write RAM: 0
/proc/sys/dev/cdrom
, such as autoclose
and checkmedia
, can be used to control the system's CD-ROM. Use the echo
command to enable or disable these features.
/proc/sys/dev/raid/
directory becomes available with at least two files in it: speed_limit_min
and speed_limit_max
. These settings determine the acceleration of RAID devices for I/O intensive tasks, such as resyncing the disks.
5.3.9.2. /proc/sys/fs/
binfmt_misc/
directory is used to provide kernel support for miscellaneous binary formats.
/proc/sys/fs/
include:
dentry-state — Provides the status of the directory cache. The file looks similar to the following:
57411 52939 45 0 0 0
The first number reveals the total number of directory cache entries, while the second number displays the number of unused entries. The third number tells the number of seconds between when a directory has been freed and when it can be reclaimed, and the fourth measures the pages currently requested by the system. The last two numbers are not used and display only zeros.
dquot-nr — Lists the maximum number of cached disk quota entries.
file-max — Lists the maximum number of file handles that the kernel allocates. Raising the value in this file can resolve errors caused by a lack of available file handles.
file-nr — Lists the number of allocated file handles, used file handles, and the maximum number of file handles.
overflowgid and overflowuid — Defines the fixed group ID and user ID, respectively, for use with file systems that only support 16-bit group and user IDs.
super-max — Controls the maximum number of superblocks available.
super-nr — Displays the current number of superblocks in use.
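As an illustration of how the dentry-state fields described above line up, the sketch below labels the first four numbers and ignores the two unused ones. The function name and the field labels are chosen here for readability and are assumptions of this example.

```python
def parse_dentry_state(text):
    """Label the first four fields of dentry-state; the last two are unused."""
    values = [int(v) for v in text.split()]
    names = ("total_entries", "unused_entries", "age_limit_secs", "pages_wanted")
    # zip() stops at the shorter sequence, so the two unused fields are dropped.
    return dict(zip(names, values))


state = parse_dentry_state("57411 52939 45 0 0 0")
in_use = state["total_entries"] - state["unused_entries"]
print("directory cache entries in use:", in_use)
```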
5.3.9.3. /proc/sys/kernel/
acct — Controls the suspension of process accounting based on the percentage of free space available on the file system containing the log. By default, the file looks like the following:
4 2 30
The first value dictates the percentage of free space required for logging to resume, while the second value sets the threshold percentage of free space when logging is suspended. The third value sets the interval, in seconds, that the kernel polls the file system to see if logging should be suspended or resumed.
cap-bound — Controls the capability bounding settings, which provide a list of capabilities for any process on the system. If a capability is not listed here, then no process, no matter how privileged, can use it. The idea is to make the system more secure by ensuring that certain things cannot happen, at least beyond a certain point in the boot process.
For a valid list of values for this virtual file, refer to the following installed documentation:
/lib/modules/<kernel-version>/build/include/linux/capability.h
ctrl-alt-del — Controls whether Ctrl+Alt+Delete gracefully restarts the computer using init (0) or forces an immediate reboot without syncing the dirty buffers to disk (1).
domainname — Configures the system domain name, such as example.com.
exec-shield — Configures the Exec Shield feature of the kernel. Exec Shield provides protection against certain types of buffer overflow attacks.
There are two possible values for this virtual file:
0 — Disables Exec Shield.
1 — Enables Exec Shield. This is the default value.
Important
If a system is running security-sensitive applications that were started while Exec Shield was disabled, these applications must be restarted when Exec Shield is enabled in order for Exec Shield to take effect.
exec-shield-randomize — Enables location randomization of various items in memory. This helps deter potential attackers from locating programs and daemons in memory. Each time a program or daemon starts, it is put into a different memory location, never at a static or absolute memory address.
There are two possible values for this virtual file:
0 — Disables randomization of Exec Shield. This may be useful for application debugging purposes.
1 — Enables randomization of Exec Shield. This is the default value. Note: The exec-shield file must also be set to 1 for exec-shield-randomize to be effective.
hostname — Configures the system hostname, such as www.example.com.
hotplug — Configures the utility to be used when a configuration change is detected by the system. This is primarily used with USB and Cardbus PCI. The default value of /sbin/hotplug should not be changed unless testing a new program to fulfill this role.
modprobe — Sets the location of the program used to load kernel modules. The default value is /sbin/modprobe, which means kmod calls it to load the module when a kernel thread calls kmod.
msgmax — Sets the maximum size of any message sent from one process to another and is set to 8192 bytes by default. Be careful when raising this value, as queued messages between processes are stored in non-swappable kernel memory. Any increase in msgmax would increase RAM requirements for the system.
msgmnb — Sets the maximum number of bytes in a single message queue. The default is 16384.
msgmni — Sets the maximum number of message queue identifiers. The default is 16.
osrelease — Lists the Linux kernel release number. This file can only be altered by changing the kernel source and recompiling.
ostype — Displays the type of operating system. By default, this file is set to Linux, and this value can only be changed by changing the kernel source and recompiling.
overflowgid and overflowuid — Defines the fixed group ID and user ID, respectively, for use with system calls on architectures that only support 16-bit group and user IDs.
panic — Defines the number of seconds the kernel postpones rebooting when the system experiences a kernel panic. By default, the value is set to 0, which disables automatic rebooting after a panic.
printk
— This file controls a variety of settings related to printing or logging error messages. Each error message reported by the kernel has a loglevel associated with it that defines the importance of the message. The loglevel values break down in this order:
0 — Kernel emergency. The system is unusable.
1 — Kernel alert. Action must be taken immediately.
2 — Condition of the kernel is considered critical.
3 — General kernel error condition.
4 — General kernel warning condition.
5 — Kernel notice of a normal but significant condition.
6 — Kernel informational message.
7 — Kernel debug-level messages.
Four values are found in the printk file:
6 4 1 7
Each of these values defines a different rule for dealing with error messages. The first value, called the console loglevel, defines the lowest priority of messages printed to the console. (Note that the lower the priority, the higher the loglevel number.) The second value sets the default loglevel for messages without an explicit loglevel attached to them. The third value sets the lowest possible loglevel configuration for the console loglevel. The last value sets the default value for the console loglevel.
random/
directory — Lists a number of values related to generating random numbers for the kernel.
rtsig-max — Configures the maximum number of POSIX real-time signals that the system may have queued at any one time. The default value is 1024.
rtsig-nr — Lists the current number of POSIX real-time signals queued by the kernel.
sem — Configures semaphore settings within the kernel. A semaphore is a System V IPC object that is used to control utilization of a particular process.
shmall — Sets the total amount of shared memory pages that can be used at one time, system-wide. By default, this value is 2097152.
shmmax — Sets the largest shared memory segment size allowed by the kernel. By default, this value is 33554432. However, the kernel supports much larger values than this.
shmmni — Sets the maximum number of shared memory segments for the whole system. By default, this value is 4096.
sysrq
— Activates the System Request Key if this value is set to anything other than zero (0), the default.
The System Request Key allows immediate input to the kernel through simple key combinations. For example, the System Request Key can be used to immediately shut down or restart a system, sync all mounted file systems, or dump important information to the console. To initiate a System Request Key, type Alt+SysRq+<system request code>. Replace <system request code> with one of the following system request codes:
r — Disables raw mode for the keyboard and sets it to XLATE (a limited keyboard mode which does not recognize modifiers such as Alt, Ctrl, or Shift for all keys).
k — Kills all processes active in a virtual console. Also called Secure Access Key (SAK), it is often used to verify that the login prompt is spawned from init and not a Trojan copy designed to capture usernames and passwords.
b — Reboots the kernel without first unmounting file systems or syncing disks attached to the system.
c — Crashes the system without first unmounting file systems or syncing disks attached to the system.
o — Shuts off the system.
s — Attempts to sync disks attached to the system.
u — Attempts to unmount and remount all file systems as read-only.
p — Outputs all flags and registers to the console.
t — Outputs a list of processes to the console.
m — Outputs memory statistics to the console.
0 through 9 — Sets the log level for the console.
e — Kills all processes except init using SIGTERM.
i — Kills all processes except init using SIGKILL.
l — Kills all processes using SIGKILL (including init). The system is unusable after issuing this System Request Key code.
h — Displays help text.
This feature is most beneficial when using a development kernel or when experiencing system freezes.
Warning
The System Request Key feature is considered a security risk because an unattended console provides an attacker with access to the system. For this reason, it is turned off by default.
Refer to /usr/share/doc/kernel-doc-<version>/Documentation/sysrq.txt for more information about the System Request Key.
sysrq-key
— Defines the key code for the System Request Key (84 is the default).
sysrq-sticky — Defines whether the System Request Key is a chorded key combination. The accepted values are as follows:
0 — Alt+SysRq and the system request code must be pressed simultaneously. This is the default value.
1 — Alt+SysRq must be pressed simultaneously, but the system request code can be pressed anytime before the number of seconds specified in /proc/sys/kernel/sysrq-timer elapses.
sysrq-timer — Specifies the number of seconds allowed to pass before the system request code must be pressed. The default value is 10.
tainted — Indicates whether a non-GPL module is loaded.
0 — No non-GPL modules are loaded.
1 — At least one module without a GPL license (including modules with no license) is loaded.
2 — At least one module was force-loaded with the command insmod -f.
threads-max — Sets the maximum number of threads to be used by the kernel, with a default value of 2048.
version — Displays the date and time the kernel was last compiled. The first field in this file, such as #3, relates to the number of times a kernel was built from the source base.
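To make the four printk values described earlier in this list more concrete, the sketch below labels them and applies the kernel's rule that a message reaches the console only when its loglevel number is strictly lower than the console loglevel. The function names and dictionary keys are hypothetical and chosen for this example.

```python
def parse_printk(text):
    """Label the four values found in /proc/sys/kernel/printk."""
    console, default_message, minimum_console, default_console = (
        int(v) for v in text.split()
    )
    return {
        "console_loglevel": console,
        "default_message_loglevel": default_message,
        "minimum_console_loglevel": minimum_console,
        "default_console_loglevel": default_console,
    }


def shown_on_console(message_loglevel, console_loglevel):
    """A message is printed only if its loglevel number is lower than the console's."""
    return message_loglevel < console_loglevel


levels = parse_printk("6 4 1 7")
for level in range(8):
    visible = shown_on_console(level, levels["console_loglevel"])
    print(level, "printed" if visible else "suppressed")
```

With the sample values, loglevels 0 through 5 are printed to the console, while informational (6) and debug (7) messages are not.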
5.3.9.4. /proc/sys/net/
ethernet/
, ipv4/
, ipx/
, and ipv6/
. By altering the files within these directories, system administrators are able to adjust the network configuration on a running system.
/proc/sys/net/
directories are discussed.
/proc/sys/net/core/
directory contains a variety of settings that control the interaction between the kernel and networking layers. The most important of these files are:
message_burst — Sets the amount of time in tenths of a second required to write a new warning message. This setting is used to mitigate Denial of Service (DoS) attacks. The default setting is 50.
message_cost — Sets a cost on every warning message. The higher the value of this file (default of 5), the more likely the warning message is ignored. This setting is used to mitigate DoS attacks.
The idea of a DoS attack is to bombard the targeted system with requests that generate errors and fill up disk partitions with log files or require all of the system's resources to handle the error logging. The settings in message_burst and message_cost are designed to be modified based on the system's acceptable risk versus the need for comprehensive logging.
netdev_max_backlog — Sets the maximum number of packets allowed to queue when a particular interface receives packets faster than the kernel can process them. The default value for this file is 300.
optmem_max — Configures the maximum ancillary buffer size allowed per socket.
rmem_default — Sets the receive socket buffer default size in bytes.
rmem_max — Sets the receive socket buffer maximum size in bytes.
wmem_default — Sets the send socket buffer default size in bytes.
wmem_max — Sets the send socket buffer maximum size in bytes.
/proc/sys/net/ipv4/
directory contains additional networking settings. Many of these settings, used in conjunction with one another, are useful in preventing attacks on the system or when using the system to act as a router.
Warning
/proc/sys/net/ipv4/
directory:
icmp_destunreach_rate, icmp_echoreply_rate, icmp_paramprob_rate, and icmp_timeexceed_rate — Set the maximum ICMP send packet rate, in 1/100 of a second, to hosts under certain conditions. A setting of 0 removes any delay and is not a good idea.
icmp_echo_ignore_all and icmp_echo_ignore_broadcasts — Allows the kernel to ignore ICMP ECHO packets from every host or only those originating from broadcast and multicast addresses, respectively. A value of 0 allows the kernel to respond, while a value of 1 ignores the packets.
ip_default_ttl — Sets the default Time To Live (TTL), which limits the number of hops a packet may make before reaching its destination. Increasing this value can diminish system performance.
ip_forward — Permits interfaces on the system to forward packets to one another. By default, this file is set to 0. Setting this file to 1 enables network packet forwarding.
ip_local_port_range — Specifies the range of ports to be used by TCP or UDP when a local port is needed. The first number is the lowest port to be used and the second number specifies the highest port. Any systems that expect to require more ports than the default 1024 to 4999 should use a range from 32768 to 61000.
tcp_syn_retries — Provides a limit on the number of times the system re-transmits a SYN packet when attempting to make a connection.
tcp_retries1 — Sets the number of permitted re-transmissions attempting to answer an incoming connection. Default of 3.
tcp_retries2 — Sets the number of permitted re-transmissions of TCP packets. Default of 15.
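The difference between the two ip_local_port_range settings mentioned in the list above is easier to appreciate as a count of available ports. This is a small arithmetic sketch; `port_range_size` is a hypothetical helper, and both ends of the range are counted as usable.

```python
def port_range_size(text):
    """Return how many local ports an ip_local_port_range setting provides."""
    low, high = (int(v) for v in text.split())
    return high - low + 1  # both the lowest and highest port are included


print("default range:", port_range_size("1024 4999"), "ports")
print("larger range: ", port_range_size("32768 61000"), "ports")
```

The larger range offers roughly seven times as many local ports as the default.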
/usr/share/doc/kernel-doc-<version>/Documentation/networking/ip-sysctl.txt
/proc/sys/net/ipv4/
directory.
/proc/sys/net/ipv4/
directory and each covers a different aspect of the network stack. The /proc/sys/net/ipv4/conf/
directory allows each system interface to be configured in different ways, including the use of default settings for unconfigured devices (in the /proc/sys/net/ipv4/conf/default/
subdirectory) and settings that override all special configurations (in the /proc/sys/net/ipv4/conf/all/
subdirectory).
/proc/sys/net/ipv4/neigh/
directory contains settings for communicating with a host directly connected to the system (called a network neighbor) and also contains different settings for systems more than one hop away.
/proc/sys/net/ipv4/route/
. Unlike conf/
and neigh/
, the /proc/sys/net/ipv4/route/
directory contains specifications that apply to routing with any interfaces on the system. Many of these settings, such as max_size
, max_delay
, and min_delay
, relate to controlling the size of the routing cache. To clear the routing cache, write any value to the flush
file.
/usr/share/doc/kernel-doc-<version>/Documentation/filesystems/proc.txt
5.3.9.5. /proc/sys/vm/
The following files are commonly found in the /proc/sys/vm/ directory:
block_dump
— Configures block I/O debugging when enabled. All read/write and block dirtying operations done to files are logged accordingly. This can be useful when diagnosing disk spin-ups and spin-downs for laptop battery conservation. All output when block_dump is enabled can be retrieved via dmesg. The default value is 0.
Note
If block_dump is enabled at the same time as kernel debugging, it is prudent to stop the klogd daemon, as it generates erroneous disk activity caused by block_dump.
dirty_background_ratio
— Starts background writeback of dirty data at this percentage of total memory, via a pdflush daemon. The default value is 10.
dirty_expire_centisecs
— Defines when dirty in-memory data is old enough to be eligible for writeout. Data which has been dirty in memory for longer than this interval is written out the next time a pdflush daemon wakes up. The default value is 3000, expressed in hundredths of a second.
dirty_ratio
— Starts active writeback of dirty data at this percentage of total memory for the generator of dirty data, via pdflush. The default value is 40.
dirty_writeback_centisecs
— Defines the interval between pdflush daemon wakeups, which periodically write dirty in-memory data out to disk. The default value is 500, expressed in hundredths of a second.
laptop_mode
— Minimizes the number of times that a hard disk needs to spin up by keeping the disk spun down for as long as possible, thereby conserving battery power on laptops. This increases efficiency by combining all future I/O processes together, reducing the frequency of spin-ups. The default value is 0, but it is automatically enabled when a laptop runs on battery power.
This value is controlled automatically by the acpid daemon once a user is notified that battery power is enabled. No user modifications or interactions are necessary if the laptop supports the ACPI (Advanced Configuration and Power Interface) specification.
For more information, refer to the following installed documentation:
/usr/share/doc/kernel-doc-<version>/Documentation/laptop-mode.txt
lower_zone_protection
— Determines how aggressive the kernel is in defending lower memory allocation zones. This is effective when utilized with machines configured with highmem memory space enabled. The default value is 0, no protection at all. All other integer values are in megabytes, and lowmem memory is therefore protected from being allocated by users.
For more information, refer to the following installed documentation:
/usr/share/doc/kernel-doc-<version>/Documentation/filesystems/proc.txt
max_map_count
— Configures the maximum number of memory map areas a process may have. In most cases, the default value of 65536 is appropriate.
min_free_kbytes
— Forces the Linux VM (virtual memory manager) to keep a minimum number of kilobytes free. The VM uses this number to compute a pages_min value for each lowmem zone in the system. The default value depends on the total memory on the machine.
nr_hugepages
— Indicates the current number of configured hugetlb pages in the kernel.
For more information, refer to the following installed documentation:
/usr/share/doc/kernel-doc-<version>/Documentation/vm/hugetlbpage.txt
nr_pdflush_threads
— Indicates the number of pdflush daemons that are currently running. This file is read-only, and should not be changed by the user. Under heavy I/O loads, the default value of two is increased by the kernel.
overcommit_memory
— Configures the conditions under which a large memory request is accepted or denied. The following three modes are available:
0
— The kernel performs heuristic memory overcommit handling by estimating the amount of memory available and failing requests that are blatantly invalid. Unfortunately, since memory is allocated using a heuristic rather than a precise algorithm, this setting can sometimes allow available memory on the system to be overloaded. This is the default setting.
1
— The kernel performs no memory overcommit handling. Under this setting, the potential for memory overload is increased, but so is performance for memory intensive tasks (such as those executed by some scientific software).
2
— The kernel fails requests for memory once the total committed would exceed all of swap plus the percentage of physical RAM specified in /proc/sys/vm/overcommit_ratio. This setting is best for those who desire less risk of memory overcommitment.
Note
This setting is only recommended for systems with swap areas larger than physical memory.
overcommit_ratio
— Specifies the percentage of physical RAM considered when /proc/sys/vm/overcommit_memory is set to 2. The default value is 50.
page-cluster
— Sets the number of pages read in a single attempt, expressed as a power of two. The default value of 3, which corresponds to 8 pages, is appropriate for most systems.
swappiness
— Determines how much a machine should swap. The higher the value, the more swapping occurs. The default value, as a percentage, is set to 60.
/usr/share/doc/kernel-doc-<version>/Documentation/
, which contains additional information.
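As a rough illustration of how overcommit_ratio feeds into mode 2 of overcommit_memory, the following sketch computes the commit limit as swap plus the configured percentage of physical RAM. The function name commit_limit_kb and the sample sizes are assumptions made for this example, and the calculation ignores refinements such as memory reserved for huge pages.

```shell
# commit_limit_kb SWAP_KB RAM_KB RATIO
# Hypothetical helper: prints swap plus RATIO percent of RAM, in kilobytes.
commit_limit_kb() {
    swap_kb=$1
    ram_kb=$2
    ratio=$3
    echo $(( swap_kb + ram_kb * ratio / 100 ))
}

# Assumed sample figures: 2 GB of swap, 4 GB of RAM, the default ratio of 50.
commit_limit_kb 2097152 4194304 50
```

With these figures the limit equals the size of physical RAM, which shows why the ratio matters most on systems with little swap.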
5.3.10. /proc/sysvipc/
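This directory contains information about System V IPC resources. The files in this directory relate to System V IPC calls for messages (msg), semaphores (sem), and shared memory (shm).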
5.3.11. /proc/tty/
The drivers file is a list of the current tty devices in use, as in the following example:
serial               /dev/cua        5  64-127 serial:callout
serial               /dev/ttyS       4  64-127 serial
pty_slave            /dev/pts      136   0-255 pty:slave
pty_master           /dev/ptm      128   0-255 pty:master
pty_slave            /dev/ttyp       3   0-255 pty:slave
pty_master           /dev/pty        2   0-255 pty:master
/dev/vc/0            /dev/vc/0       4       0 system:vtmaster
/dev/ptmx            /dev/ptmx       5       2 system
/dev/console         /dev/console    5       1 system:console
/dev/tty             /dev/tty        5       0 system:/dev/tty
unknown              /dev/vc/%d      4    1-63 console
The /proc/tty/driver/serial file lists the usage statistics and status of each of the serial tty lines.
Registered line disciplines are stored in the ldiscs file, and more detailed information is available within the ldisc/ directory.
5.3.12. /proc/<PID>/
Out of memory (OOM) behavior is controlled by the switch in /proc/sys/vm/panic_on_oom. When set to 1, the kernel panics on OOM. A setting of 0 instructs the kernel to call a function named oom_killer on an OOM. Usually, oom_killer can kill rogue processes and the system survives.
To change this setting, write the new value to /proc/sys/vm/panic_on_oom.
~]# cat /proc/sys/vm/panic_on_oom
1
~]# echo 0 > /proc/sys/vm/panic_on_oom
~]# cat /proc/sys/vm/panic_on_oom
0
You can also influence which processes are killed by adjusting the oom_killer score. In /proc/<PID>/ there are two files named oom_adj and oom_score. Valid scores for oom_adj are in the range -16 to +15. To see the current oom_killer score, view the oom_score for the process. oom_killer kills processes with the highest scores first.
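To see which processes currently carry the highest scores, a loop over the per-process oom_score files is enough. The sketch below is a minimal illustration; the function name top_oom_scores and its two optional arguments (the proc root and the number of lines to print) are assumptions introduced here, not part of any system tool.

```shell
# top_oom_scores [PROC_ROOT] [COUNT]
# Hypothetical helper: prints "score pid" pairs, highest score first.
top_oom_scores() {
    proc_root=${1:-/proc}
    count=${2:-5}
    for f in "$proc_root"/[0-9]*/oom_score; do
        [ -r "$f" ] || continue
        pid=${f%/oom_score}
        pid=${pid##*/}
        # A process may exit between the glob and the read; skip it quietly.
        score=$(cat "$f" 2>/dev/null) || continue
        printf '%s %s\n' "$score" "$pid"
    done | sort -rn | head -n "$count"
}

# List the five processes that oom_killer would consider first.
top_oom_scores /proc 5
```

Passing a different first argument makes it possible to try the function against a copy of the files instead of the live /proc tree.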
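The following example adjusts the oom_adj value of the process with PID 12465 to make it less likely that oom_killer will kill it.
~]# cat /proc/12465/oom_score
79872
~]# echo -5 > /proc/12465/oom_adj
~]# cat /proc/12465/oom_score
78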
The special value -17 disables oom_killer for that process. In the example below, oom_score returns a value of 0, indicating that this process would not be killed.
~]# cat /proc/12465/oom_score
78
~]# echo -17 > /proc/12465/oom_adj
~]# cat /proc/12465/oom_score
0
A function called badness() is used to determine the actual score for each process. This is done by adding up 'points' for each examined process. The process scoring is done in the following way:
- The basis of each process's score is its memory size.
- The memory size of any of the process's children (not including a kernel thread) is also added to the score.
- The process's score is increased for 'niced' processes and decreased for long running processes.
- Processes with the CAP_SYS_ADMIN and CAP_SYS_RAWIO capabilities have their scores reduced.
- The final score is then bitshifted by the value saved in the oom_adj file.
Thus, the process with the highest oom_score value will most probably be a non-privileged, recently started process that, along with its children, uses a large amount of memory, has been 'niced', and handles no raw I/O.
5.4. Using the sysctl Command
The /sbin/sysctl command is used to view, set, and automate kernel settings in the /proc/sys/ directory.
For a quick overview of all settings configurable in the /proc/sys/ directory, type the /sbin/sysctl -a command as root. This creates a large, comprehensive list, a small portion of which looks something like the following:
net.ipv4.route.min_delay = 2
kernel.sysrq = 0
kernel.sem = 250 32000 32 128
This is the same information seen if each of the files were viewed individually. The only difference is the file location. For example, the /proc/sys/net/ipv4/route/min_delay file is listed as net.ipv4.route.min_delay, with the directory slashes replaced by dots and the proc.sys portion assumed.
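The naming rule can be expressed as a pair of small shell functions. This is only a sketch: the function names are made up for this example, and the conversion assumes that no path component itself contains a dot.

```shell
# Hypothetical helpers that translate between /proc/sys/ paths and sysctl keys.
sysctl_key_from_path() {
    # Drop the /proc/sys/ prefix, then turn the remaining slashes into dots.
    printf '%s\n' "$1" | sed -e 's|^/proc/sys/||' -e 's|/|.|g'
}

sysctl_path_from_key() {
    # Turn dots back into slashes and restore the /proc/sys/ prefix.
    printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr . /)"
}

sysctl_key_from_path /proc/sys/net/ipv4/route/min_delay
sysctl_path_from_key kernel.sysrq
```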
The sysctl command can be used in place of echo to assign values to writable files in the /proc/sys/ directory. For example, instead of using the command
echo 1 > /proc/sys/kernel/sysrq
use the equivalent sysctl command as follows:
~]# sysctl -w kernel.sysrq="1"
kernel.sysrq = 1
While quickly setting single values like this in /proc/sys/ is helpful during testing, this method does not work as well on a production system, because special settings within /proc/sys/ are lost when the machine is rebooted. To preserve custom settings, add them to the /etc/sysctl.conf file.
Each time the system boots, the init program runs the /etc/rc.d/rc.sysinit script. This script contains a command to execute sysctl using /etc/sysctl.conf to determine the values passed to the kernel. Any values added to /etc/sysctl.conf therefore take effect each time the system boots.
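When a setting is added to /etc/sysctl.conf from a script, it helps to replace an existing assignment instead of appending a second one. The sketch below shows one way to do that; the function name persist_sysctl is an assumption for this example, and the demonstration deliberately works on a scratch file rather than the real configuration file.

```shell
# persist_sysctl FILE KEY VALUE
# Hypothetical helper: ensures FILE contains exactly one "KEY = VALUE" line.
persist_sysctl() {
    file=$1
    key=$2
    value=$3
    # Escape dots so that grep matches the key literally.
    pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
    touch "$file"
    # Keep every line except an earlier assignment of the same key.
    grep -v "^[[:space:]]*${pattern}[[:space:]]*=" "$file" > "$file.new" || true
    printf '%s = %s\n' "$key" "$value" >> "$file.new"
    # Write back in place so the ownership and mode of FILE are preserved.
    cat "$file.new" > "$file"
    rm -f "$file.new"
}

# Demonstration on a scratch file; the second call replaces the first value.
conf=$(mktemp)
persist_sysctl "$conf" kernel.sysrq 1
persist_sysctl "$conf" kernel.sysrq 0
cat "$conf"
rm -f "$conf"
```

To use it for real, run it as root against /etc/sysctl.conf; the new value still has to be applied to the running kernel separately, for example with sysctl -w.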
5.5. Additional Resources
Below are additional sources of information about the proc file system.
5.5.1. Installed Documentation
Documentation about the proc file system is installed on the system by default.
/usr/share/doc/kernel-doc-<version>/Documentation/filesystems/proc.txt
— Contains assorted, but limited, information about all aspects of the /proc/ directory.
/usr/share/doc/kernel-doc-<version>/Documentation/sysrq.txt
— An overview of System Request Key options.
/usr/share/doc/kernel-doc-<version>/Documentation/sysctl/
— A directory containing a variety of sysctl tips, including modifying values that concern the kernel (kernel.txt), accessing file systems (fs.txt), and virtual memory use (vm.txt).
/usr/share/doc/kernel-doc-<version>/Documentation/networking/ip-sysctl.txt
— A detailed overview of IP networking options.
5.5.2. Useful Websites
- http://www.linuxhq.com/ — This website maintains a complete database of source, patches, and documentation for various versions of the Linux kernel.
Chapter 6. Redundant Array of Independent Disks (RAID)
6.1. What is RAID?
6.1.1. Who Should Use RAID?
- Enhances speed
- Increases storage capacity using a single virtual disk
- Minimizes data loss from disk failure
6.1.2. Hardware RAID versus Software RAID
- Hardware RAID
- The hardware-based array manages the RAID subsystem independently from the host. It presents a single disk per RAID array to the host.
A hardware RAID device connects to the SCSI controller and presents the RAID arrays as a single SCSI drive. An external RAID system moves all RAID handling “intelligence” into a controller located in the external disk subsystem. The whole subsystem is connected to the host via a normal SCSI controller and appears to the host as a single disk.
RAID controller cards function like a SCSI controller to the operating system, and handle all the actual drive communications. The user plugs the drives into the RAID controller (just like a normal SCSI controller) and then adds them to the RAID controller's configuration, and the operating system does not know the difference.
- Software RAID
- Software RAID implements the various RAID levels in the kernel disk (block device) code. It offers the cheapest possible solution, as expensive disk controller cards or hot-swap chassis[1] are not required. Software RAID also works with cheaper IDE disks as well as SCSI disks. With today's faster CPUs, software RAID can often match or outperform hardware RAID.
The Linux kernel contains an MD driver that allows the RAID solution to be completely hardware independent. The performance of a software-based array depends on the server CPU performance and load.
Key features of software RAID include:
- Threaded rebuild process
- Kernel-based configuration
- Portability of arrays between Linux machines without reconstruction
- Backgrounded array reconstruction using idle system resources
- Hot-swappable drive support
- Automatic CPU detection to take advantage of certain CPU optimizations
6.1.3. RAID Levels and Linear Support
- Level 0
- RAID level 0, often called “striping”, is a performance-oriented striped data mapping technique. This means the data being written to the array is broken down into strips and written across the member disks of the array. This allows high I/O performance at low inherent cost but provides no redundancy. The storage capacity of a level 0 array is equal to the total capacity of the member disks in a hardware RAID or the total capacity of member partitions in a software RAID.
- Level 1
- RAID level 1, or “mirroring”, has been used longer than any other form of RAID. Level 1 provides redundancy by writing identical data to each member disk of the array, leaving a “mirrored” copy on each disk. Mirroring remains popular due to its simplicity and high level of data availability. Level 1 operates with two or more disks that may use parallel access for high data-transfer rates when reading but more commonly operate independently to provide high I/O transaction rates. Level 1 provides very good data reliability and improves performance for read-intensive applications but at a relatively high cost. The storage capacity of the level 1 array is equal to the capacity of one of the mirrored hard disks in a hardware RAID or one of the mirrored partitions in a software RAID.
Note
RAID level 1 comes at a high cost because you write the same information to all of the disks in the array, which wastes drive space. For example, if you have RAID level 1 set up so that your root (/) partition exists on two 40G drives, you have 80G total but are only able to access 40G of that 80G. The other 40G acts like a mirror of the first 40G.
- Level 4
- RAID level 4 uses parity[2] concentrated on a single disk drive to protect data. It is better suited to transaction I/O than to large file transfers. Because the dedicated parity disk represents an inherent bottleneck, level 4 is seldom used without accompanying technologies such as write-back caching. Although RAID level 4 is an option in some RAID partitioning schemes, it is not an option allowed in Red Hat Enterprise Linux RAID installations. The storage capacity of hardware RAID level 4 is equal to the capacity of member disks, minus the capacity of one member disk. The storage capacity of software RAID level 4 is equal to the capacity of the member partitions, minus the size of one of the partitions if they are of equal size.
Note
RAID level 4 takes up the same amount of space as RAID level 5, but level 5 has more advantages. For this reason, level 4 is not supported.
- Level 5
- RAID level 5 is the most common type of RAID. By distributing parity across some or all of an array's member disk drives, RAID level 5 eliminates the write bottleneck inherent in level 4. The only performance bottleneck is the parity calculation process. With modern CPUs and software RAID, that usually is not a very big problem. As with level 4, the result is asymmetrical performance, with reads substantially outperforming writes. Level 5 is often used with write-back caching to reduce the asymmetry. The storage capacity of hardware RAID level 5 is equal to the capacity of member disks, minus the capacity of one member disk. The storage capacity of software RAID level 5 is equal to the capacity of the member partitions, minus the size of one of the partitions if they are of equal size.
- Linear RAID
- Linear RAID is a simple grouping of drives to create a larger virtual drive. In linear RAID, the chunks are allocated sequentially from one member drive, going to the next drive only when the first is completely filled. This grouping provides no performance benefit, as it is unlikely that any I/O operations will be split between member drives. Linear RAID also offers no redundancy and, in fact, decreases reliability — if any one member drive fails, the entire array cannot be used. The capacity is the total of all member disks.
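The capacity rules for the levels above reduce to simple arithmetic when all members are the same size. The sketch below encodes them in a shell function; the name raid_capacity and the equal-size assumption are introduced for this illustration only.

```shell
# raid_capacity LEVEL MEMBERS SIZE
# Hypothetical helper: prints the usable capacity of an array whose
# MEMBERS are all SIZE units large (any unit, for example gigabytes).
raid_capacity() {
    level=$1
    members=$2
    size=$3
    case $level in
        0|linear) echo $(( members * size )) ;;        # sum of all members
        1)        echo "$size" ;;                      # one mirrored copy
        4|5)      echo $(( (members - 1) * size )) ;;  # one member's worth of parity
        *)        echo "unsupported level: $level" >&2; return 1 ;;
    esac
}

# The mirror from the note above: two 40G drives provide 40G.
raid_capacity 1 2 40
# A three-member level 5 array of 40G partitions.
raid_capacity 5 3 40
```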
6.2. Configuring Software RAID
- Creating software RAID partitions on physical hard drives.
- Creating RAID devices from the software RAID partitions.
- (Optional) Configuring LVM from the RAID devices.
- Creating file systems from the RAID devices.
The examples in this section use two drives (/dev/hda and /dev/hdb) to illustrate the creation of simple RAID 1 and RAID 0 configurations, and detail how to create a simple RAID configuration by implementing multiple RAID devices.
6.2.1. Creating the RAID Partitions
Figure 6.1. Two Blank Drives, Ready For Configuration
- In Disk Druid, click the button to enter the software RAID creation screen.
- Choose the option to create a RAID partition as shown in Figure 6.2, “RAID Partition Options”. Note that no other RAID options (such as entering a mount point) are available until RAID partitions, as well as RAID devices, are created. Confirm the choice.
Figure 6.2. RAID Partition Options
- A software RAID partition must be constrained to one drive. Select the drive to use for RAID. If you have multiple drives, by default all drives are selected and you must deselect the drives you do not want.
Figure 6.3. Adding a RAID Partition
- Edit the Size (MB) field, and enter the size that you want the partition to be (in MB).
- Select Fixed Size to specify partition size. Select Fill all space up to (MB) and enter a value (in MB) to specify partition size range. Select Fill to maximum allowable size to allow maximum available space of the hard disk. Note that if you make more than one space growable, they share the available free space on the disk.
- Select Force to be a primary partition if you want the partition to be a primary partition. A primary partition is one of the first four partitions on the hard drive. If unselected, the partition is created as a logical partition. If other operating systems are already on the system, unselecting this option should be considered. For more information on primary versus logical/extended partitions, refer to the appendix section of the Red Hat Enterprise Linux Installation Guide.
For example, you can configure only the /boot partition as a software RAID device, leaving the root partition (/), /home, and swap as regular file systems. Figure 6.4, “RAID 1 Partitions Ready, Pre-Device and Mount Point Creation” shows successfully allocated space for the RAID 1 configuration (for /boot), which is now ready for RAID device and mount point creation:
Figure 6.4. RAID 1 Partitions Ready, Pre-Device and Mount Point Creation
6.2.2. Creating the RAID Devices and Mount Points
- On the main partitioning screen, click the button to open the RAID Options dialog, which appears as shown in Figure 6.5, “RAID Options”.
Figure 6.5. RAID Options
- Select the Create a RAID device option and confirm the selection. As shown in Figure 6.6, “Making a RAID Device and Assigning a Mount Point”, the Make RAID Device dialog appears, allowing you to make a RAID device and assign a mount point.
Figure 6.6. Making a RAID Device and Assigning a Mount Point
- Select a mount point from the Mount Point pulldown list.
- Choose the file system type for the partition from the File System Type pulldown list. At this point you can either configure a dynamic LVM file system or a traditional static ext2/ext3 file system. For more information on LVM and its configuration during the installation process, refer to Chapter 11, LVM (Logical Volume Manager). If LVM is not required, continue on with the following instructions.
- From the RAID Device pulldown list, select a device name such as md0.
- From the RAID Level, choose the required RAID level.
Note
If you are making a RAID partition of /boot, you must choose RAID level 1, and it must use one of the first two drives (IDE first, SCSI second). If you are not creating a separate RAID partition of /boot, and you are making a RAID partition for the root file system (that is, /), it must be RAID level 1 and must use one of the first two drives (IDE first, SCSI second).
- The RAID partitions created appear in the RAID Members list. Select which of these partitions should be used to create the RAID device.
- If configuring RAID 1 or RAID 5, specify the number of spare partitions in the Number of spares field. If a software RAID partition fails, the spare is automatically used as a replacement. For each spare you want to specify, you must create an additional software RAID partition (in addition to the partitions for the RAID device). Select the partitions for the RAID device and the partition(s) for the spare(s).
- Confirm the setup. The RAID device appears in the Drive Summary list.
- Repeat this chapter's entire process for configuring additional partitions, devices, and mount points, such as the root partition (/), home partition (/home), or swap.
Figure 6.7. Sample RAID Configuration
Figure 6.8. Sample RAID With LVM Configuration
6.3. Managing Software RAID
- Reviewing existing software RAID configuration.
- Creating a new RAID device.
- Replacing a faulty device in an array.
- Adding a new device to an existing array.
- Deactivating and removing an existing RAID device.
- Saving the configuration.
6.3.1. Reviewing RAID Configuration
Information about RAID devices is stored in the /proc/mdstat special file. To list these devices, display the content of this file by typing the following at a shell prompt:
cat /proc/mdstat
To query a device, to show the details of a RAID device, or to examine a component device, run the following commands as root:
mdadm --query device…
mdadm --detail raid_device…
mdadm --examine component_device…
While the mdadm --detail command displays information about a RAID device, mdadm --examine only relays information about a RAID device as it relates to a given component device. This distinction is particularly important when working with a RAID device that itself is a component of another RAID device.
The mdadm --query command, as well as both the mdadm --detail and mdadm --examine commands, allow you to specify multiple devices at once.
Example 6.1. Reviewing RAID configuration
To verify that /dev/md0 is a RAID device, type the following at a shell prompt:
~]# mdadm --query /dev/md0
/dev/md0: 125.38MiB raid1 2 devices, 0 spares. Use mdadm --detail for more detail.
/dev/md0: No md super block found, not an md component.
~]# mdadm --detail /dev/md0
/dev/md0:
Version : 0.90
Creation Time : Tue Jun 28 16:05:49 2011
Raid Level : raid1
Array Size : 128384 (125.40 MiB 131.47 MB)
Used Dev Size : 128384 (125.40 MiB 131.47 MB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 0
Persistence : Superblock is persistent
Update Time : Thu Jun 30 17:06:34 2011
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
UUID : 49c5ac74:c2b79501:5c28cb9c:16a6dd9f
Events : 0.6
Number Major Minor RaidDevice State
0 3 1 0 active sync /dev/hda1
1 3 65 1 active sync /dev/hdb1
~]$ cat /proc/mdstat
Personalities : [raid0] [raid1]
md0 : active raid1 hdb1[1] hda1[0]
128384 blocks [2/2] [UU]
md1 : active raid0 hdb2[1] hda2[0]
1573888 blocks 256k chunks
md2 : active raid0 hdb3[1] hda3[0]
19132928 blocks 256k chunks
unused devices: <none>
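For scripting, the array names and levels can be pulled out of this file with a one-line awk filter. The sketch below assumes the line layout shown in the example above (name, colon, the word active, then the level); the function name mdstat_arrays is hypothetical, and other kernel versions may insert additional fields.

```shell
# mdstat_arrays [FILE]
# Hypothetical helper: prints "device level" for each active array.
# Reads standard input when no file is given.
mdstat_arrays() {
    awk '$2 == ":" && $3 == "active" { print $1, $4 }' "$@"
}

# Feed it a copy of the sample output instead of the live /proc/mdstat.
mdstat_arrays <<'EOF'
Personalities : [raid0] [raid1]
md0 : active raid1 hdb1[1] hda1[0]
      128384 blocks [2/2] [UU]
md1 : active raid0 hdb2[1] hda2[0]
      1573888 blocks 256k chunks
unused devices: <none>
EOF
```

On a system with software RAID, running mdstat_arrays /proc/mdstat gives the same kind of list for the live arrays.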
6.3.2. Creating a New RAID Device
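To create a new RAID device, run the following command as root:
mdadm --create raid_device --level=level --raid-devices=number component_device…
For a complete list of available options, refer to the mdadm(8) manual page.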
Example 6.2. Creating a new RAID device
~]# ls /dev/sd*
/dev/sda /dev/sda1 /dev/sdb /dev/sdb1
To create /dev/md3 as a new RAID level 1 array from /dev/sda1 and /dev/sdb1, run the following command:
~]# mdadm --create /dev/md3 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
mdadm: array /dev/md3 started.
6.3.3. Replacing a Faulty Device
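To replace a device in a software RAID, run the following commands as root to mark the device as faulty, remove it from the array, and add it back once it is operational again:
mdadm raid_device --fail component_device
mdadm raid_device --remove component_device
mdadm raid_device --add component_device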
Example 6.3. Replacing a faulty device
Assume the system has an active RAID device, /dev/md3, with the following layout (that is, the RAID device created in Example 6.2, “Creating a new RAID device”):
~]# mdadm --detail /dev/md3 | tail -n 3
Number Major Minor RaidDevice State
0 8 1 0 active sync /dev/sda1
1 8 17 1 active sync /dev/sdb1
Mark the /dev/sdb1 device as faulty:
~]# mdadm /dev/md3 --fail /dev/sdb1
mdadm: set /dev/sdb1 faulty in /dev/md3
~]# mdadm /dev/md3 --remove /dev/sdb1
mdadm: hot removed /dev/sdb1
~]# mdadm /dev/md3 --add /dev/sdb1
mdadm: added /dev/sdb1
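Because the three commands always follow the same order, it can be useful to generate them for review before running anything. The sketch below only prints the commands and never calls mdadm; the function name replace_member_cmds and the optional third argument for a different replacement device are assumptions made for this example.

```shell
# replace_member_cmds ARRAY FAULTY [REPLACEMENT]
# Hypothetical helper: prints the fail, remove, and add commands in order.
# REPLACEMENT defaults to FAULTY, as in the example above.
replace_member_cmds() {
    array=$1
    faulty=$2
    replacement=${3:-$2}
    echo "mdadm $array --fail $faulty"
    echo "mdadm $array --remove $faulty"
    echo "mdadm $array --add $replacement"
}

# Print the sequence used in Example 6.3 without touching the array.
replace_member_cmds /dev/md3 /dev/sdb1
```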
6.3.4. Extending a RAID Device
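To add a new device to an existing array and then grow the array to use it, run the following commands as root:
mdadm raid_device --add component_device
mdadm --grow raid_device --raid-devices=number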
Example 6.4. Extending a RAID device
Assume the system has an active RAID device, /dev/md3, with the following layout (that is, the RAID device created in Example 6.2, “Creating a new RAID device”):
~]# mdadm --detail /dev/md3 | tail -n 3
Number Major Minor RaidDevice State
0 8 1 0 active sync /dev/sda1
1 8 17 1 active sync /dev/sdb1
Assume that a new disk, /dev/sdc, has been added and has exactly one partition. To add it to the /dev/md3 array, type the following at a shell prompt:
~]# mdadm /dev/md3 --add /dev/sdc1
mdadm: added /dev/sdc1
This adds /dev/sdc1 as a spare device. To change the size of the array to actually use it, type:
~]# mdadm --grow /dev/md3 --raid-devices=3
6.3.5. Removing a RAID Device
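To stop a RAID device, remove it, and erase the superblocks on its component devices, run the following commands as root:
mdadm --stop raid_device
mdadm --remove raid_device
mdadm --zero-superblock component_device…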
Example 6.5. Removing a RAID device
Assume the system has an active RAID device, /dev/md3, with the following layout (that is, the RAID device created in Example 6.4, “Extending a RAID device”):
~]# mdadm --detail /dev/md3 | tail -n 4
Number Major Minor RaidDevice State
0 8 1 0 active sync /dev/sda1
1 8 17 1 active sync /dev/sdb1
2 8 33 2 active sync /dev/sdc1
~]# mdadm --stop /dev/md3
mdadm: stopped /dev/md3
Remove the /dev/md3 device by running the following command:
~]# mdadm --remove /dev/md3
~]# mdadm --zero-superblock /dev/sda1 /dev/sdb1 /dev/sdc1
6.3.6. Preserving the Configuration
By default, changes made by the mdadm command only apply to the current session and do not survive a system restart. At boot time, the mdmonitor service reads the content of the /etc/mdadm.conf configuration file to see which RAID devices to start. If the software RAID was configured during the graphical installation process, this file contains directives listed in Table 6.1, “Common mdadm.conf directives” by default.
Option | Description |
---|---|
ARRAY | Allows you to identify a particular array. |
DEVICE | Allows you to specify a list of devices to scan for a RAID component (for example, “/dev/hda1”). You can also use the keyword partitions to use all partitions listed in /proc/partitions, or containers to specify an array container. |
MAILADDR | Allows you to specify an email address to use in case of an alert. |
To list which ARRAY lines are presently in use regardless of the configuration, run the following command as root:
mdadm --detail --scan
Use the output of this command to update the /etc/mdadm.conf file. You can also display the ARRAY line for a particular device:
mdadm --detail --brief raid_device
To append this line to the configuration file, redirect the output:
mdadm --detail --brief raid_device >> /etc/mdadm.conf
Example 6.6. Preserving the configuration
By default, /etc/mdadm.conf contains the software RAID configuration created during the system installation:
# mdadm.conf written out by anaconda
DEVICE partitions
MAILADDR root
ARRAY /dev/md0 level=raid1 num-devices=2 UUID=49c5ac74:c2b79501:5c28cb9c:16a6dd9f
ARRAY /dev/md1 level=raid0 num-devices=2 UUID=76914c11:5bfa2c00:dc6097d1:a1f4506d
ARRAY /dev/md2 level=raid0 num-devices=2 UUID=2b5d38d0:aea898bf:92be20e2:f9d893c5
If you have created the /dev/md3 device as shown in Example 6.2, “Creating a new RAID device”, you can make it persistent by running the following command:
~]# mdadm --detail --brief /dev/md3 >> /etc/mdadm.conf
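Running the append command twice leaves two identical ARRAY lines in the file. One way to guard against that is to append a line only when it is not already present, as in the sketch below. The function name append_once is an assumption for this example, the ARRAY line is copied from the sample configuration above, and the demonstration uses a scratch file rather than /etc/mdadm.conf.

```shell
# append_once FILE LINE
# Hypothetical helper: appends LINE to FILE unless an identical line exists.
append_once() {
    file=$1
    line=$2
    touch "$file"
    # -x matches whole lines, -F takes the text literally, -q prints nothing.
    if ! grep -qxF -- "$line" "$file"; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Demonstration on a scratch file: the second call changes nothing.
conf=$(mktemp)
entry='ARRAY /dev/md0 level=raid1 num-devices=2 UUID=49c5ac74:c2b79501:5c28cb9c:16a6dd9f'
append_once "$conf" "$entry"
append_once "$conf" "$entry"
cat "$conf"
rm -f "$conf"
```

On a real system the second argument would be the output of mdadm --detail --brief for the device in question.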
6.4. Additional Resources
6.4.1. Installed Documentation
mdadm man page — A manual page for the mdadm utility.
mdadm.conf man page — A manual page that provides a comprehensive list of available /etc/mdadm.conf configuration options.
Chapter 7. Swap Space
7.1. What is Swap Space?
Amount of RAM in the System | Recommended Amount of Swap Space |
---|---|
4GB of RAM or less | a minimum of 2GB of swap space |
4GB to 16GB of RAM | a minimum of 4GB of swap space |
16GB to 64GB of RAM | a minimum of 8GB of swap space |
64GB to 256GB of RAM | a minimum of 16GB of swap space |
256GB to 512GB of RAM | a minimum of 32GB of swap space |
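The table can be turned into a small lookup for use in provisioning scripts. The following sketch is only an illustration: the function name recommended_swap_gb is made up, and because the ranges in the table share their boundary values, it assumes that a boundary value such as 4 or 16 belongs to the lower range.

```shell
# recommended_swap_gb RAM_GB
# Hypothetical helper: prints the minimum swap size from the table, in GB.
recommended_swap_gb() {
    ram_gb=$1
    if   [ "$ram_gb" -le 4 ];   then echo 2
    elif [ "$ram_gb" -le 16 ];  then echo 4
    elif [ "$ram_gb" -le 64 ];  then echo 8
    elif [ "$ram_gb" -le 256 ]; then echo 16
    elif [ "$ram_gb" -le 512 ]; then echo 32
    else
        # The table gives no figure beyond 512 GB of RAM.
        echo "no recommendation above 512 GB of RAM" >&2
        return 1
    fi
}

recommended_swap_gb 8
```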
Important
Use the free and cat /proc/swaps commands to verify how much and where swap is in use.
7.2. Adding Swap Space
7.2.1. Extending Swap on an LVM2 Logical Volume
To extend an LVM2 swap logical volume (assuming /dev/VolGroup00/LogVol01 is the volume you want to extend):
- Disable swapping for the associated logical volume:
swapoff -v /dev/VolGroup00/LogVol01
- Resize the LVM2 logical volume by 256 MB:
lvm lvresize /dev/VolGroup00/LogVol01 -L +256M
- Format the new swap space:
mkswap /dev/VolGroup00/LogVol01
- Enable the extended logical volume:
swapon -va
- Test that the logical volume has been extended properly:
cat /proc/swaps
free
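To check the result in a script rather than by eye, the Size column of /proc/swaps can be added up. The sketch below assumes the usual layout of that file (a header line followed by one line per swap area, with the size in kilobytes in the third column); the function name swap_total_kb and the sample lines are made up for illustration.

```shell
# swap_total_kb [FILE]
# Hypothetical helper: prints the total of the Size column in kilobytes.
# Reads standard input when no file is given.
swap_total_kb() {
    # Skip the header line, then add up the third column.
    awk 'NR > 1 { total += $3 } END { print total + 0 }' "$@"
}

# Sample input with two swap areas (values are illustrative only).
swap_total_kb <<'EOF'
Filename                              Type       Size     Used  Priority
/dev/mapper/VolGroup00-LogVol01       partition  2097144  0     -1
/swapfile                             file       65528    0     -2
EOF
```

Comparing the value before and after the resize shows whether the extra 256 MB is in use; on a live system, pass /proc/swaps as the argument.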
7.2.2. Creating an LVM2 Logical Volume for Swap
To add an LVM2 logical volume for swap (assuming /dev/VolGroup00/LogVol02 is the swap volume you want to add):
- Create the LVM2 logical volume of size 256 MB:
lvm lvcreate VolGroup00 -n LogVol02 -L 256M
- Format the new swap space:
mkswap /dev/VolGroup00/LogVol02
- Add the following entry to the /etc/fstab file:
/dev/VolGroup00/LogVol02 swap swap defaults 0 0
- Enable the new logical volume:
swapon -va
- Test that the logical volume has been created properly:
cat /proc/swaps
free
7.2.3. Creating a Swap File
- Determine the size of the new swap file in megabytes and multiply by 1024 to determine the number of blocks. For example, a 64 MB swap file needs 65536 blocks.
- At a shell prompt as root, type the following command with count being equal to the desired number of blocks:
dd if=/dev/zero of=/swapfile bs=1024 count=65536
- Change the permissions of the newly created file:
chmod 0600 /swapfile
- Set up the swap file with the command:
mkswap /swapfile
- To enable the swap file immediately but not automatically at boot time:
swapon /swapfile
- To enable it at boot time, edit /etc/fstab to include the following entry:
/swapfile swap swap defaults 0 0
The next time the system boots, it enables the new swap file.
- After adding the new swap file and enabling it, verify it is enabled by viewing the output of the command cat /proc/swaps or free.
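The block count in the first step is the size in megabytes multiplied by 1024, because dd is told to use 1024-byte blocks. The sketch below performs that calculation and prints the resulting dd command without running it; the function name swap_file_blocks and the size_mb variable are assumptions for this example.

```shell
# swap_file_blocks SIZE_MB
# Hypothetical helper: prints the number of 1024-byte blocks for SIZE_MB.
swap_file_blocks() {
    echo $(( $1 * 1024 ))
}

# Print (but do not run) the dd command for a 64 MB swap file.
size_mb=64
echo "dd if=/dev/zero of=/swapfile bs=1024 count=$(swap_file_blocks "$size_mb")"
```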
7.3. Removing Swap Space
7.3.1. Reducing Swap on an LVM2 Logical Volume
To reduce an LVM2 swap logical volume (assuming /dev/VolGroup00/LogVol01 is the volume you want to reduce):
- Disable swapping for the associated logical volume:
swapoff -v /dev/VolGroup00/LogVol01
- Reduce the LVM2 logical volume by 512 MB:
lvm lvreduce /dev/VolGroup00/LogVol01 -L -512M
- Format the new swap space:
mkswap /dev/VolGroup00/LogVol01
- Enable the reduced logical volume:
swapon -va
- Test that the logical volume has been reduced properly:
cat /proc/swaps
free
7.3.2. Removing an LVM2 Logical Volume for Swap
To remove a swap logical volume (assuming /dev/VolGroup00/LogVol02 is the swap volume you want to remove):
- Disable swapping for the associated logical volume:
swapoff -v /dev/VolGroup00/LogVol02
- Remove the LVM2 logical volume of size 512 MB:
lvm lvremove /dev/VolGroup00/LogVol02
- Remove the following entry from the /etc/fstab file:
/dev/VolGroup00/LogVol02 swap swap defaults 0 0
- Test that the logical volume has been removed:
cat /proc/swaps
free
Chapter 8. Managing Disk Storage
8.1. Standard Partitions using parted
The parted utility allows users to:
- View the existing partition table
- Change the size of existing partitions
- Add partitions from free space or additional hard drives
The parted package is included when installing Red Hat Enterprise Linux. To start parted, log in as root and type the command parted /dev/sda at a shell prompt (where /dev/sda is the device name for the drive you want to configure).
Before you modify partitions on a device that is in use, unmount its partitions with the umount command and turn off all the swap space on the hard drive with the swapoff command.
The “parted commands” table contains a list of commonly used parted commands. The sections that follow explain some of these commands and arguments in more detail.
Command | Description |
---|---|
check minor-num | Perform a simple check of the file system |
cp from to | Copy file system from one partition to another; from and to are the minor numbers of the partitions |
help | Display list of available commands |
mklabel label | Create a disk label for the partition table |
mkfs minor-num file-system-type | Create a file system of type file-system-type |
mkpart part-type fs-type start-mb end-mb | Make a partition without creating a new file system |
mkpartfs part-type fs-type start-mb end-mb | Make a partition and create the specified file system |
move minor-num start-mb end-mb | Move the partition |
name minor-num name | Name the partition for Mac and PC98 disklabels only |
print | Display the partition table |
quit | Quit parted |
rescue start-mb end-mb | Rescue a lost partition from start-mb to end-mb |
resize minor-num start-mb end-mb | Resize the partition from start-mb to end-mb |
rm minor-num | Remove the partition |
select device | Select a different device to configure |
set minor-num flag state | Set the flag on a partition; state is either on or off |
toggle [NUMBER [FLAG]] | Toggle the state of FLAG on partition NUMBER |
unit UNIT | Set the default unit to UNIT |
8.1.1. Viewing the Partition Table
After starting parted, use the command print to view the partition table. A table similar to the following appears:
Model: ATA ST3160812AS (scsi)
Disk /dev/sda: 160GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos

Number  Start   End    Size    Type      File system  Flags
 1      32.3kB  107MB  107MB   primary   ext3         boot
 2      107MB   105GB  105GB   primary   ext3
 3      105GB   107GB  2147MB  primary   linux-swap
 4      107GB   160GB  52.9GB  extended               root
 5      107GB   133GB  26.2GB  logical   ext3
 6      133GB   133GB  107MB   logical   ext3
 7      133GB   160GB  26.6GB  logical                lvm
In the partition table, the minor number is the partition number. For example, the partition with minor number 1 corresponds to /dev/sda1. The Start and End values are in megabytes. Valid Type values are metadata, free, primary, extended, or logical. The Filesystem is the file system type, which can be any of the following:
- ext2
- ext3
- fat16
- fat32
- hfs
- jfs
- linux-swap
- ntfs
- reiserfs
- hp-ufs
- sun-ufs
- xfs
If the File system column of a device shows no value, its file system type is unknown.
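The column layout described above can be read by a script. The following is a minimal sketch, assuming output in the same format as the example table; list_partitions is a hypothetical helper name, and the sample text is fed in through a here-document rather than read from a real disk.

```shell
#!/bin/bash
# Hypothetical helper: turn saved `parted print` output (on stdin) into
# "device size type" lines. Assumes the column order shown in the example:
# Number, Start, End, Size, Type, ...
list_partitions() {
    local disk=$1
    # Partition rows start with the minor number; header lines do not.
    awk -v disk="$disk" '$1 ~ /^[0-9]+$/ { printf "%s%s %s %s\n", disk, $1, $4, $5 }'
}

list_partitions /dev/sda <<'EOF'
Model: ATA ST3160812AS (scsi)
Disk /dev/sda: 160GB
Partition Table: msdos
Number Start End Size Type File system Flags
1 32.3kB 107MB 107MB primary ext3 boot
2 107MB 105GB 105GB primary ext3
3 105GB 107GB 2147MB primary linux-swap
EOF
```

The minor number is appended to the disk name, which mirrors the rule that minor number 1 on /dev/sda corresponds to /dev/sda1.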
8.1.2. Creating a Partition
Warning
Start parted, where /dev/sda is the device on which to create the partition:
parted /dev/sda
print
8.1.2.1. Making the Partition
mkpart primary ext3 1024 2048
Note
If you use the mkpartfs command instead, the file system is created after the partition is created. However, parted does not support creating an ext3 file system. Thus, if you wish to create an ext3 file system, use mkpart and create the file system with the mkfs command as described later.
After creating the partition, use the print command to confirm that it is in the partition table with the correct partition type, file system type, and size. Also remember the minor number of the new partition so that you can label it. You should also view the output of
cat /proc/partitions
8.1.2.2. Formatting the Partition
mkfs -t ext3 /dev/sda6
Warning
8.1.2.3. Labeling the Partition
Next, give the partition a label. For example, if the new partition is /dev/sda6 and you want to label it /work:
e2label /dev/sda6 /work
8.1.2.4. Creating the Mount Point
mkdir /work
8.1.2.5. Add to /etc/fstab
As root, edit the /etc/fstab file to include the new partition. The new line should look similar to the following:
LABEL=/work /work ext3 defaults 1 2
The first column should contain LABEL= followed by the label you gave the partition. The second column should contain the mount point for the new partition, and the next column should be the file system type (for example, ext3 or swap). If you need more information about the format, read the man page with the command man fstab.
If the fourth column is the word defaults, the partition is mounted at boot time. To mount the partition without rebooting, as root, type the command:
mount /work
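The steps of this section can be collected into one script. The following is a dry-run sketch only: it prints the commands instead of executing them. The device names, the label, and the start and end values are taken from the examples above and are assumptions about your system. It uses parted's -s (script) option in place of the interactive session shown in this section.

```shell
#!/bin/bash
# Dry-run sketch: prints each command rather than running it.
# DISK, PART, LABEL and the 1024/2048 bounds are illustrative assumptions;
# PART must be the device that `print` reports for the new partition.
DISK=/dev/sda
PART=/dev/sda6
LABEL=/work

run() { echo "+ $*"; }   # change the body to "$@" to execute for real

create_partition_steps() {
    run parted -s "$DISK" mkpart primary ext3 1024 2048   # make the partition
    run mkfs -t ext3 "$PART"                              # create the file system
    run e2label "$PART" "$LABEL"                          # label the partition
    run mkdir -p "$LABEL"                                 # create the mount point
    # Line to add to /etc/fstab, in the format shown above.
    printf 'LABEL=%s %s ext3 defaults 1 2\n' "$LABEL" "$LABEL"
}

create_partition_steps
```

Printing first makes it possible to review every command against the partition table before anything is written to disk.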
8.1.3. Removing a Partition
Warning
Start parted, where /dev/sda is the device on which to remove the partition:
parted /dev/sda
print
Remove the partition with the command rm. For example, to remove the partition with minor number 3:
rm 3
After removing the partition, use the print command to confirm that it is removed from the partition table. You should also view the output of
cat /proc/partitions
Finally, remove the partition from the /etc/fstab file. Find the line that declares the removed partition, and remove it from the file.
8.1.4. Resizing a Partition
Warning
Start parted, where /dev/sda is the device on which to resize the partition:
parted /dev/sda
print
To resize the partition, use the resize command followed by the minor number for the partition, the starting place in megabytes, and the end place in megabytes. For example:
resize 3 1024 2048
Warning
After resizing the partition, use the print command to confirm that the partition has been resized correctly, is the correct partition type, and is the correct file system type.
Then use the command df to make sure the partition was mounted and is recognized with the new size.
8.2. LVM Partition Management
The following commands can be listed by issuing lvm help at a command prompt.
Command | Description |
---|---|
dumpconfig | Dump the active configuration |
formats | List the available metadata formats |
help | Display the help commands |
lvchange | Change the attributes of logical volume(s) |
lvcreate | Create a logical volume |
lvdisplay | Display information about a logical volume |
lvextend | Add space to a logical volume |
lvmchange | Due to use of the device mapper, this command has been deprecated |
lvmdiskscan | List devices that may be used as physical volumes |
lvmsadc | Collect activity data |
lvmsar | Create activity report |
lvreduce | Reduce the size of a logical volume |
lvremove | Remove logical volume(s) from the system |
lvrename | Rename a logical volume |
lvresize | Resize a logical volume |
lvs | Display information about logical volumes |
lvscan | List all logical volumes in all volume groups |
pvchange | Change attributes of physical volume(s) |
pvcreate | Initialize physical volume(s) for use by LVM |
pvdata | Display the on-disk metadata for physical volume(s) |
pvdisplay | Display various attributes of physical volume(s) |
pvmove | Move extents from one physical volume to another |
pvremove | Remove LVM label(s) from physical volume(s) |
pvresize | Resize a physical volume in use by a volume group |
pvs | Display information about physical volumes |
pvscan | List all physical volumes |
segtypes | List available segment types |
vgcfgbackup | Backup volume group configuration |
vgcfgrestore | Restore volume group configuration |
vgchange | Change volume group attributes |
vgck | Check the consistency of a volume group |
vgconvert | Change volume group metadata format |
vgcreate | Create a volume group |
vgdisplay | Display volume group information |
vgexport | Unregister a volume group from the system |
vgextend | Add physical volumes to a volume group |
vgimport | Register exported volume group with system |
vgmerge | Merge volume groups |
vgmknodes | Create the special files for volume group devices in /dev/ |
vgreduce | Remove a physical volume from a volume group |
vgremove | Remove a volume group |
vgrename | Rename a volume group |
vgs | Display information about volume groups |
vgscan | Search for all volume groups |
vgsplit | Move physical volumes into a new volume group |
version | Display software and driver version information |
Chapter 9. Implementing Disk Quotas
The quota RPM must be installed to implement disk quotas.
Note
9.1. Configuring Disk Quotas
- Enable quotas per file system by modifying the /etc/fstab file.
- Remount the file system(s).
- Create the quota database files and generate the disk usage table.
- Assign quota policies.
9.1.1. Enabling Quotas
As root, edit the /etc/fstab file. Add the usrquota and/or grpquota options to the file systems that require quotas:
/dev/VolGroup00/LogVol00 /        ext3   defaults                   1 1
LABEL=/boot              /boot    ext3   defaults                   1 2
none                     /dev/pts devpts gid=5,mode=620             0 0
none                     /dev/shm tmpfs  defaults                   0 0
none                     /proc    proc   defaults                   0 0
none                     /sys     sysfs  defaults                   0 0
/dev/VolGroup00/LogVol02 /home    ext3   defaults,usrquota,grpquota 1 2
/dev/VolGroup00/LogVol01 swap     swap   defaults                   0 0
. . .
In this example, the /home file system has both user and group quotas enabled.
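The edit to the options column can also be previewed with a script before /etc/fstab is touched. The following is a minimal sketch, assuming whitespace-separated fstab fields; add_quota_opts is a hypothetical helper that reads an fstab on standard input and prints the modified version, leaving the real file alone.

```shell
#!/bin/bash
# Hypothetical helper: append usrquota,grpquota to the options (4th field)
# of the entry whose mount point (2nd field) matches the argument.
# Note: awk rewrites a modified line with single spaces between fields.
add_quota_opts() {
    local mountpoint=$1
    awk -v mp="$mountpoint" '
        $2 == mp && $4 !~ /usrquota/ { $4 = $4 ",usrquota,grpquota" }
        { print }
    '
}

add_quota_opts /home <<'EOF'
/dev/VolGroup00/LogVol00 / ext3 defaults 1 1
/dev/VolGroup00/LogVol02 /home ext3 defaults 1 2
EOF
```

Reviewing the printed result and then editing /etc/fstab by hand keeps the change under your control.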
Note
The following examples assume that a separate /home partition was created during the installation of Red Hat Enterprise Linux. The root (/) partition can be used for setting quota policies in the /etc/fstab file.
9.1.2. Remounting the File Systems
After adding the usrquota and/or grpquota options, remount each file system whose fstab entry has been modified. If the file system is not in use by any process, use one of the following methods:
- Issue the umount command followed by the mount command to remount the file system. (See the man page for both umount and mount for the specific syntax for mounting and unmounting various file system types.)
- Issue the mount -o remount <file-system> command (where <file-system> is the name of the file system) to remount the file system. For example, to remount the /home file system, the command to issue is mount -o remount /home.
9.1.3. Creating the Quota Database Files
The next step is to run the quotacheck command.
The quotacheck command examines quota-enabled file systems and builds a table of the current disk usage per file system. The table is then used to update the operating system's copy of disk usage. In addition, the file system's disk quota files are updated.
To create the quota files (aquota.user and aquota.group) on the file system, use the -c option of the quotacheck command. For example, if user and group quotas are enabled for the /home file system, create the files in the /home directory:
quotacheck -cug /home
The -c option specifies that the quota files should be created for each file system with quotas enabled, the -u option specifies to check for user quotas, and the -g option specifies to check for group quotas.
If neither the -u nor the -g option is specified, only the user quota file is created. If only -g is specified, only the group quota file is created.
quotacheck -avug
- a: Check all quota-enabled, locally-mounted file systems
- v: Display verbose status information as the quota check proceeds
- u: Check user disk quota information
- g: Check group disk quota information
After quotacheck has finished running, the quota files corresponding to the enabled quotas (user and/or group) are populated with data for each quota-enabled locally-mounted file system such as /home.
9.1.4. Assigning Quotas per User
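Quotas are assigned with the edquota command.
To configure the quota for a user, as root, execute the command:
edquota username
For example, if a quota is enabled in /etc/fstab for the /home partition (/dev/VolGroup00/LogVol02 in the example below) and the command edquota testuser is executed, the following is shown in the editor configured as the default for the system: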
Disk quotas for user testuser (uid 501):
Filesystem                blocks  soft  hard  inodes  soft  hard
/dev/VolGroup00/LogVol02  440436  0     0     37418   0     0
Note
The text editor defined by the EDITOR environment variable is used by edquota. To change the editor, set the EDITOR environment variable in your ~/.bash_profile file to the full path of the editor of your choice.
The inodes column shows how many inodes the user is currently using. The last two columns are used to set the soft and hard inode limits for the user on the file system.
Disk quotas for user testuser (uid 501):
Filesystem                blocks  soft    hard    inodes  soft  hard
/dev/VolGroup00/LogVol02  440436  500000  550000  37418   0     0
quota testuser
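When choosing values for the soft and hard block columns, it can help to convert between blocks and a more familiar unit. The following is a small sketch that assumes the block counts shown by edquota are in 1 KiB units; the function names are hypothetical.

```shell
#!/bin/bash
# Assumption: one quota block is 1 KiB, so 1 MiB corresponds to 1024 blocks.
mib_to_blocks() { echo $(( $1 * 1024 )); }
blocks_to_mib() { echo $(( $1 / 1024 )); }   # integer division, rounds down

echo "A soft limit of 500000 blocks is about $(blocks_to_mib 500000) MiB"
echo "A 512 MiB limit corresponds to $(mib_to_blocks 512) blocks"
```

Under that assumption, the soft limit of 500000 in the example above is a little under 500 MiB.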
9.1.5. Assigning Quotas per Group
Quotas can also be assigned on a per-group basis. For example, to set a group quota for the devel group (the group must exist prior to setting the group quota), use the command:
edquota -g devel
Disk quotas for group devel (gid 505):
Filesystem                blocks  soft  hard  inodes  soft  hard
/dev/VolGroup00/LogVol02  440400  0     0     37418   0     0
quota -g devel
9.1.6. Setting the Grace Period for Soft Limits
edquota -t
While other edquota commands operate on a particular user's or group's quota, the -t option operates on every file system with quotas enabled.
9.2. Managing Disk Quotas
9.2.1. Enabling and Disabling
quotaoff -vaug
If neither the -u nor the -g option is specified, only the user quotas are disabled. If only -g is specified, only group quotas are disabled. The -v switch causes verbose status information to display as the command executes.
To enable quotas again, use the quotaon command with the same options.
quotaon -vaug
To enable quotas for a specific file system, such as /home, use the following command:
quotaon -vug /home
If neither the -u nor the -g option is specified, only the user quotas are enabled. If only -g is specified, only group quotas are enabled.
9.2.2. Reporting on Disk Quotas
To create a disk usage report, run the repquota utility. For example, the command repquota /home produces this output:
*** Report for user quotas on device /dev/mapper/VolGroup00-LogVol02
Block grace time: 7days; Inode grace time: 7days
                        Block limits                File limits
User            used    soft    hard  grace    used  soft  hard  grace
----------------------------------------------------------------------
root      --      36       0       0              4     0     0
kristin   --     540       0       0            125     0     0
testuser  --  440400  500000  550000          37418     0     0
To view the disk usage report for all (option -a) quota-enabled file systems, use the command:
repquota -a
The -- displayed after each user is a quick way to determine whether the block or inode limits have been exceeded. If either soft limit is exceeded, a + appears in place of the corresponding -; the first - represents the block limit, and the second represents the inode limit.
The grace columns are normally blank. If a soft limit has been exceeded, the column contains a time specification equal to the amount of time remaining on the grace period. If the grace period has expired, none appears in its place.
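The two-character flag described above lends itself to a quick filter. The following is a minimal sketch, assuming report rows in the format shown earlier; over_soft_limit is a hypothetical helper, and the sample rows (including the exceeded values for kristin) are invented for illustration rather than taken from a real report.

```shell
#!/bin/bash
# Hypothetical helper: read repquota output on stdin and print each user whose
# flag column contains a "+", with the limit type(s) that were exceeded.
over_soft_limit() {
    awk '$2 ~ /^[-+][-+]$/ && $2 ~ /\+/ {
        what = ""
        if (substr($2, 1, 1) == "+") what = "blocks"
        if (substr($2, 2, 1) == "+") what = (what == "" ? "inodes" : what ",inodes")
        print $1, what
    }'
}

# Illustrative sample; the kristin row is made up to show a "+" flag.
over_soft_limit <<'EOF'
User used soft hard grace used soft hard grace
root -- 36 0 0 4 0 0
kristin +- 540 500 600 6days 125 0 0
testuser -- 440400 500000 550000 37418 0 0
EOF
```

On a live system the same function could be fed from repquota -a.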
9.2.3. Keeping Quotas Accurate
Safe methods for periodically running quotacheck include:
- Ensuring quotacheck runs on next reboot
Note
This method works best for (busy) multiuser systems which are periodically rebooted.
As root, place a shell script that contains the touch /forcequotacheck command into the /etc/cron.daily/ or /etc/cron.weekly/ directory, or schedule one using the crontab -e command. This creates an empty forcequotacheck file in the root directory, which the system init script looks for at boot time. If it is found, the init script runs quotacheck. Afterward, the init script removes the /forcequotacheck file; thus, scheduling this file to be created periodically with cron ensures that quotacheck is run during the next reboot.
Refer to Chapter 39, Automated Tasks for more information about configuring cron.
- Running quotacheck in single user mode
An alternative way to safely run quotacheck is to (re-)boot the system into single-user mode to prevent the possibility of data corruption in quota files and run:
~]# quotaoff -vaug /<file_system>
~]# quotacheck -vaug /<file_system>
~]# quotaon -vaug /<file_system>
- Running quotacheck on a running system
If necessary, it is possible to run quotacheck on a machine during a time when no users are logged in, and thus have no open files on the file system being checked. Run the command quotacheck -vaug <file_system>; this command will fail if quotacheck cannot remount the given <file_system> as read-only. Note that, following the check, the file system will be remounted read-write.
Important
Running quotacheck on a live file system mounted read-write is not recommended due to the possibility of quota file corruption.
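The first method above, creating /forcequotacheck from cron, could be implemented with a script along these lines. This is a sketch under assumptions: the script name and its placement in /etc/cron.weekly/ are up to you, and the FLAG_FILE variable is a hypothetical addition that only exists so the script can be tried without root.

```shell
#!/bin/bash
# Sketch of a script for /etc/cron.weekly/ (or /etc/cron.daily/).
# FLAG_FILE defaults to the file the init script looks for at boot;
# the override is an illustrative convenience for testing.
FLAG_FILE=${FLAG_FILE:-/forcequotacheck}

request_quotacheck() {
    # An empty file is enough; the init script removes it after the check.
    touch "$FLAG_FILE"
}

request_quotacheck 2>/dev/null ||
    echo "could not create $FLAG_FILE (root required?)" >&2
```

Because the init script deletes the file after running quotacheck, the cron job simply recreates it for the following reboot.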
Chapter 10. Access Control Lists
The acl package is required to implement ACLs. It contains the utilities used to add, modify, remove, and retrieve ACL information.
The cp and mv commands copy or move any ACLs associated with files and directories.
10.1. Mounting File Systems
mount -t ext3 -o acl <device-name> <partition>
mount -t ext3 -o acl /dev/VolGroup00/LogVol02 /work
Alternatively, if the partition is listed in the /etc/fstab file, the entry for the partition can include the acl option:
LABEL=/work /work ext3 acl 1 2
Samba recognizes ACLs on an ext3 file system because it has been compiled with the --with-acl-support option. No special flags are required when accessing or mounting a Samba share.
10.1.1. NFS
To disable ACLs on NFS shares when configuring the server, include the no_acl option in the /etc/exports file. To disable ACLs on an NFS share when mounting it on a client, mount it with the no_acl option via the command line or the /etc/fstab file.
10.2. Setting Access ACLs
- Per user
- Per group
- Via the effective rights mask
- For users not in the user group for the file
The setfacl utility sets ACLs for files and directories. Use the -m option to add or modify the ACL of a file or directory:
setfacl -m <rules> <files>
u:<uid>:<perms>
- Sets the access ACL for a user. The user name or UID may be specified. The user may be any valid user on the system.
g:<gid>:<perms>
- Sets the access ACL for a group. The group name or GID may be specified. The group may be any valid group on the system.
m:<perms>
- Sets the effective rights mask. The mask is the union of all permissions of the owning group and all of the user and group entries.
o:<perms>
- Sets the access ACL for users other than the ones in the group for the file.
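Permissions (<perms>) must be a combination of the characters r, w, and x for read, write, and execute.
If a file or directory already has an ACL, and the setfacl command is used, the additional rules are added to the existing ACL or the existing rule is modified.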
setfacl -m u:andrius:rw /project/somefile
To remove all the permissions for a user, group, or others, use the -x option and do not specify any permissions:
setfacl -x <rules> <files>
setfacl -x u:500 /project/somefile
10.3. Setting Default ACLs
To set a default ACL, add d: before the rule and specify a directory instead of a file name.
For example, to set the default ACL for the /share/ directory to read and execute for users not in the user group (an access ACL for an individual file can override it):
setfacl -m d:o:rx /share
10.4. Retrieving ACLs
To determine the existing ACLs for a file or directory, use the getfacl command. In the example below, getfacl is used to determine the existing ACLs for a file.
getfacl home/john/picture.png
# file: home/john/picture.png
# owner: john
# group: john
user::rw-
group::r--
other::r--
[john@main /]$ getfacl home/sales/
# file: home/sales/
# owner: john
# group: john
user::rw-
user:barryg:r--
group::r--
mask::r--
other::r--
default:user::rwx
default:user:john:rwx
default:group::r-x
default:mask::rwx
default:other::r-x
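The mask entry in output like the above limits what named users actually get: a permission is effective only if it appears both in the user's entry and in the mask. The following is a minimal sketch of that intersection, assuming getfacl output in the format shown; mask_perms and effective_user_perms are hypothetical helper names.

```shell
#!/bin/bash
# Hypothetical helper: intersect two permission strings such as "rw-" and "r--".
mask_perms() {
    local perms=$1 mask=$2 out="" i
    for i in 0 1 2; do
        if [ "${perms:$i:1}" != "-" ] && [ "${mask:$i:1}" != "-" ]; then
            out+="${perms:$i:1}"
        else
            out+="-"
        fi
    done
    echo "$out"
}

# Print "name effective-perms" for each named user entry in getfacl output.
# Default ACL entries (lines starting with "default:") are ignored.
effective_user_perms() {
    local acl mask name perms
    acl=$(cat)
    mask=$(printf '%s\n' "$acl" | sed -n 's/^mask::\(...\).*/\1/p')
    printf '%s\n' "$acl" | grep '^user:[^:]' | while IFS=: read -r _ name perms; do
        echo "$name $(mask_perms "${perms:0:3}" "${mask:-rwx}")"
    done
}

effective_user_perms <<'EOF'
user::rw-
user:barryg:r--
group::r--
mask::r--
other::r--
default:user:john:rwx
EOF
```

On a real system the input could come from getfacl itself, for example getfacl home/sales/ piped into effective_user_perms.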
10.5. Archiving File Systems With ACLs
Warning
The tar and dump commands do not back up ACLs.
The star utility is similar to the tar utility in that it can be used to generate archives of files; however, some of its options are different. Refer to Table 10.1, “Command Line Options for star” for a listing of more commonly used options. For all available options, refer to the star man page. The star package is required to use this utility.
Option | Description |
---|---|
-c | Creates an archive file. |
-n | Do not extract the files; use in conjunction with -x to show what extracting the files does. |
-r | Replaces files in the archive. The files are written to the end of the archive file, replacing any files with the same path and file name. |
-t | Displays the contents of the archive file. |
-u | Updates the archive file. The files are written to the end of the archive if they do not exist in the archive or if the files are newer than the files of the same name in the archive. This option only works if the archive is a file or an unblocked tape that may backspace. |
-x | Extracts the files from the archive. If used with -U and a file in the archive is older than the corresponding file on the file system, the file is not extracted. |
-help | Displays the most important options. |
-xhelp | Displays the least important options. |
-/ | Do not strip leading slashes from file names when extracting the files from an archive. By default, they are stripped when files are extracted. |
-acl | When creating or extracting, archive or restore any ACLs associated with the files and directories. |
10.6. Compatibility with Older Systems
A file system on which ACLs have been set has the ext_attr attribute. This attribute can be seen using the following command:
tune2fs -l <filesystem-device>
A file system that has acquired the ext_attr attribute can be mounted with older kernels, but those kernels do not enforce any ACLs which have been set.
Versions of the e2fsck utility included in version 1.22 and higher of the e2fsprogs package (including the versions in Red Hat Enterprise Linux 2.1 and 4) can check a file system with the ext_attr attribute. Older versions refuse to check it.
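The tune2fs -l output mentioned earlier in this section is long, so a filter can help when checking for ext_attr. The following is a minimal sketch that assumes the attribute appears on the "Filesystem features:" line of that output; has_ext_attr is a hypothetical helper, and the sample line is illustrative.

```shell
#!/bin/bash
# Hypothetical helper: succeed if `tune2fs -l` output (on stdin) lists
# ext_attr on its "Filesystem features:" line.
has_ext_attr() {
    grep '^Filesystem features:' | grep -qw 'ext_attr'
}

# Illustrative sample line; real feature lists vary by file system.
if echo 'Filesystem features:      has_journal ext_attr resize_inode dir_index' | has_ext_attr; then
    echo "ext_attr is set"
fi
```

Against a real device, the function would be used as tune2fs -l <filesystem-device> | has_ext_attr, run as root.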
10.7. Additional Resources
10.7.1. Installed Documentation
- acl man page: Description of ACLs
- getfacl man page: Discusses how to get file access control lists
- setfacl man page: Explains how to set file access control lists
- star man page: Explains more about the star utility and its many options
10.7.2. Useful Websites
- http://acl.bestbits.at/ — Website for ACLs
Chapter 11. LVM (Logical Volume Manager)
11.1. What is LVM?
The /boot partition cannot be on a logical volume group because the boot loader cannot read it. If the root (/) partition is on a logical volume, create a separate /boot partition which is not a part of a volume group.
Figure 11.1. Logical Volumes
Logical volumes are assigned mount points, such as /home and /, and file system types, such as ext2 or ext3. When "partitions" reach their full capacity, free space from the volume group can be added to the logical volume to increase the size of the partition. When a new hard drive is added to the system, it can be added to the volume group, and partitions that are logical volumes can be increased in size.
Figure 11.2. Logical Volumes
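Growing a logical volume after a new drive is added, as described above, might look like the following on the command line. This is a dry-run sketch that only prints the commands. The new partition /dev/sdc1, the volume names, and the +10G size are assumptions for illustration, and the final resize2fs step assumes an ext3 file system on the logical volume.

```shell
#!/bin/bash
# Dry-run sketch: prints each command rather than running it.
# /dev/sdc1, VolGroup00/LogVol02 and +10G are illustrative assumptions.
run() { echo "+ $*"; }   # change the body to "$@" to execute for real

grow_volume_steps() {
    run pvcreate /dev/sdc1                          # initialize the new drive's partition
    run vgextend VolGroup00 /dev/sdc1               # add it to the volume group
    run lvextend -L +10G /dev/VolGroup00/LogVol02   # add space to the logical volume
    run resize2fs /dev/VolGroup00/LogVol02          # grow the file system to match
}

grow_volume_steps
```

The order matters: the volume group has to gain free space before the logical volume can grow, and the file system is resized last.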
11.2. LVM Configuration
You can use the system-config-lvm utility to create your own LVM configuration post-installation. The next two sections focus on using Disk Druid during installation to complete this task. The third section introduces the LVM utility (system-config-lvm), which allows you to manage your LVM volumes graphically in the X Window System.
- Creating physical volumes from the hard drives.
- Creating volume groups from the physical volumes.
- Creating logical volumes from the volume groups and assigning the logical volumes mount points.
Two SCSI drives (/dev/sda and /dev/sdb) are used in the following examples. They detail how to create a simple configuration using a single LVM volume group with associated logical volumes during installation.
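The three steps listed above (physical volumes, then a volume group, then logical volumes) also exist as commands in the LVM command table of the previous chapter. The following dry-run sketch shows one way they might be combined outside the installer. It only prints the commands; the partition names, sizes, and volume names are assumptions, and the %FREE size suffix depends on the lvcreate version installed.

```shell
#!/bin/bash
# Dry-run sketch: prints the LVM commands rather than running them.
# Partitions, the 2G swap size and the volume names are illustrative.
run() { echo "+ $*"; }   # change the body to "$@" to execute for real

lvm_setup_steps() {
    run pvcreate /dev/sda2 /dev/sdb1                  # 1. create physical volumes
    run vgcreate VolGroup00 /dev/sda2 /dev/sdb1       # 2. create the volume group
    run lvcreate -L 2G -n LogVol01 VolGroup00         # 3. logical volume for swap
    run lvcreate -l 100%FREE -n LogVol00 VolGroup00   #    remaining space for /
}

lvm_setup_steps
```

Mount points are not part of these commands; they are assigned afterward, for example through /etc/fstab.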
11.3. Automatic Partitioning
- The /boot partition resides on its own non-LVM partition. In the following example, it is the first partition on the first drive (/dev/sda1). Bootable partitions cannot reside on LVM logical volumes.
- A single LVM volume group (VolGroup00) is created, which spans all selected drives and all remaining space available. In the following example, the remainder of the first drive (/dev/sda2), and the entire second drive (/dev/sdb1) are allocated to the volume group.
- Two LVM logical volumes (LogVol00 and LogVol01) are created from the newly created spanned volume group. In the following example, the recommended swap space is automatically calculated and assigned to LogVol01, and the remainder is allocated to the root file system, LogVol00.
Figure 11.3. Automatic LVM Configuration With Two SCSI Drives
Note
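If you plan to enable quotas, consider modifying the automatic configuration to include other mount points, such as /home or /var, so that each file system has its own independent quota configuration limits.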
Note
11.4. Manual LVM Partitioning
11.4.1. Creating the /boot Partition
Figure 11.4. Two Blank Drives, Ready for Configuration
Warning
The /boot partition cannot reside on an LVM volume because the GRUB boot loader cannot read it.
- Select.
- Select /boot from the Mount Point pulldown menu.
- Select ext3 from the File System Type pulldown menu.
- Select only the sda checkbox from the Allowable Drives area.
- Leave 100 (the default) in the Size (MB) menu.
- Leave the Fixed size (the default) radio button selected in the Additional Size Options area.
- Select Force to be a primary partition to make the partition be a primary partition. A primary partition is one of the first four partitions on the hard drive. If unselected, the partition is created as a logical partition. If other operating systems are already on the system, unselecting this option should be considered. For more information on primary versus logical/extended partitions, refer to the appendix section of the Red Hat Enterprise Linux Installation Guide.
Figure 11.5. Creation of the Boot Partition
Figure 11.6. The /boot Partition Displayed
11.4.2. Creating the LVM Physical Volumes
- Select.
- Select physical volume (LVM) from the File System Type pulldown menu as shown in Figure 11.7, “Creating a Physical Volume”.
Figure 11.7. Creating a Physical Volume
- You cannot enter a mount point yet (you can once you have created all your physical volumes and then all volume groups).
- A physical volume must be constrained to one drive. For Allowable Drives, select the drive on which the physical volume is created. If you have multiple drives, all drives are selected, and you must deselect all but one drive.
- Enter the size that you want the physical volume to be.
- Select Fixed size to make the physical volume the specified size, select Fill all space up to (MB) and enter a size in MBs to give a range for the physical volume size, or select Fill to maximum allowable size to make it grow to fill all available space on the hard disk. If you make more than one growable, they share the available free space on the disk.
- Select Force to be a primary partition if you want the partition to be a primary partition.
- Clickto return to the main screen.
Figure 11.8. Two Physical Volumes Created
11.4.3. Creating the LVM Volume Groups
- Click the button to collect the physical volumes into volume groups. A volume group is basically a collection of physical volumes. You can have multiple volume groups, but a physical volume can only be in one volume group.
Note
There is overhead disk space reserved in the volume group. The volume group size is slightly less than the total of physical volume sizes.
Figure 11.9. Creating an LVM Volume Group
- Change the Volume Group Name if desired.
- Select which physical volumes to use for the volume group.
11.4.4. Creating the LVM Logical Volumes
Create logical volumes with mount points such as /, /home, and swap space. Remember that /boot cannot be a logical volume. To add a logical volume, click the button in the Logical Volumes section. A dialog window as shown in Figure 11.10, “Creating a Logical Volume” appears.
Figure 11.10. Creating a Logical Volume
Note
Figure 11.11. Pending Logical Volumes
Figure 11.12. Final Manual Configuration
11.5. Using the LVM utility system-config-lvm
You can start the LVM utility by typing system-config-lvm from a terminal.
/boot - (Ext3) file system. Displayed under 'Uninitialized Entities'. (DO NOT initialize this partition).
LogVol00 - (LVM) contains the (/) directory (312 extents).
LogVol02 - (LVM) contains the (/home) directory (128 extents).
LogVol03 - (LVM) swap (28 extents).
The logical volumes above were created in /dev/hda2 while /boot was created in /dev/hda1. The system also consists of 'Uninitialized Entities' which are illustrated in Figure 11.17, “Uninitialized Entities”. The figure below illustrates the main window in the LVM utility. The logical and the physical views of the above configuration are illustrated below. The three logical volumes exist on the same physical volume (hda2).
Figure 11.13. Main LVM Window
Figure 11.14. Physical View Window
Figure 11.15. Logical View Window
If the logical volume holds the / (root) directory, this task will not be successful as the volume cannot be unmounted.
Figure 11.16. Edit Logical Volume
11.5.1. Utilizing uninitialized entities
/boot
. Uninitialized entities are illustrated below.
Figure 11.17. Uninitialized Entities
11.5.2. Adding Unallocated Volumes to a volume group
- create a new volume group,
- add the unallocated volume to an existing volume group,
- remove the volume from LVM.