此内容没有您所选择的语言版本。
4.2. Disk
The following sections showcase scripts that monitor disk and I/O activity.
4.2.1. Summarizing Disk Read/Write Traffic 复制链接链接已复制到粘贴板!
复制链接链接已复制到粘贴板!
This section describes how to identify which processes are performing the heaviest disk reads/writes to the system.
disktop.stp
#!/usr/bin/env stap
#
# Copyright (C) 2007 Oracle Corp.
#
# Get the status of reading/writing disk every 5 seconds,
# output top ten entries
#
# This is free software,GNU General Public License (GPL);
# either version 2, or (at your option) any later version.
#
# Usage:
# ./disktop.stp
#
global io_stat,device
global read_bytes,write_bytes
probe vfs.read.return {
if ($return>0) {
if (devname!="N/A") {/*skip read from cache*/
io_stat[pid(),execname(),uid(),ppid(),"R"] += $return
device[pid(),execname(),uid(),ppid(),"R"] = devname
read_bytes += $return
}
}
}
probe vfs.write.return {
if ($return>0) {
if (devname!="N/A") { /*skip update cache*/
io_stat[pid(),execname(),uid(),ppid(),"W"] += $return
device[pid(),execname(),uid(),ppid(),"W"] = devname
write_bytes += $return
}
}
}
probe timer.ms(5000) {
/* skip non-read/write disk */
if (read_bytes+write_bytes) {
printf("\n%-25s, %-8s%4dKb/sec, %-7s%6dKb, %-7s%6dKb\n\n",
ctime(gettimeofday_s()),
"Average:", ((read_bytes+write_bytes)/1024)/5,
"Read:",read_bytes/1024,
"Write:",write_bytes/1024)
/* print header */
printf("%8s %8s %8s %25s %8s %4s %12s\n",
"UID","PID","PPID","CMD","DEVICE","T","BYTES")
}
/* print top ten I/O */
foreach ([process,cmd,userid,parent,action] in io_stat- limit 10)
printf("%8d %8d %8d %25s %8s %4s %12d\n",
userid,process,parent,cmd,
device[process,cmd,userid,parent,action],
action,io_stat[process,cmd,userid,parent,action])
/* clear data */
delete io_stat
delete device
read_bytes = 0
write_bytes = 0
}
probe end{
delete io_stat
delete device
delete read_bytes
delete write_bytes
}
disktop.stp outputs the top ten processes responsible for the heaviest reads/writes to disk. Example 4.4, “disktop.stp Sample Output” displays a sample output for this script, and includes the following data per listed process:
UID— user ID. A user ID of0refers to the root user.PID— the ID of the listed process.PPID— the process ID of the listed process's parent process.CMD— the name of the listed process.DEVICE— which storage device the listed process is reading from or writing to.T— the type of action performed by the listed process;Wrefers to write, whileRrefers to read.BYTES— the amount of data read to or written from disk.
The time and date in the output of disktop.stp is returned by the functions
ctime() and gettimeofday_s(). ctime() derives calendar time in terms of seconds passed since the Unix epoch (January 1, 1970). gettimeofday_s() counts the actual number of seconds since Unix epoch, which gives a fairly accurate human-readable timestamp for the output.
In this script, the
$return is a local variable that stores the actual number of bytes each process reads or writes from the virtual file system. $return can only be used in return probes (e.g. vfs.read.return and vfs.read.return).
Example 4.4. disktop.stp Sample Output
[...]
Mon Sep 29 03:38:28 2008 , Average: 19Kb/sec, Read: 7Kb, Write: 89Kb
UID PID PPID CMD DEVICE T BYTES
0 26319 26294 firefox sda5 W 90229
0 2758 2757 pam_timestamp_c sda5 R 8064
0 2885 1 cupsd sda5 W 1678
Mon Sep 29 03:38:38 2008 , Average: 1Kb/sec, Read: 7Kb, Write: 1Kb
UID PID PPID CMD DEVICE T BYTES
0 2758 2757 pam_timestamp_c sda5 R 8064
0 2885 1 cupsd sda5 W 1678
4.2.2. Tracking I/O Time For Each File Read or Write 复制链接链接已复制到粘贴板!
复制链接链接已复制到粘贴板!
This section describes how to monitor the amount of time it takes for each process to read from or write to any file. This is useful if you wish to determine what files are slow to load on a given system.
iotime.stp
global start
global entry_io
global fd_io
global time_io
function timestamp:long() {
return gettimeofday_us() - start
}
function proc:string() {
return sprintf("%d (%s)", pid(), execname())
}
probe begin {
start = gettimeofday_us()
}
global filenames
global filehandles
global fileread
global filewrite
probe syscall.open {
filenames[pid()] = user_string($filename)
}
probe syscall.open.return {
if ($return != -1) {
filehandles[pid(), $return] = filenames[pid()]
fileread[pid(), $return] = 0
filewrite[pid(), $return] = 0
} else {
printf("%d %s access %s fail\n", timestamp(), proc(), filenames[pid()])
}
delete filenames[pid()]
}
probe syscall.read {
if ($count > 0) {
fileread[pid(), $fd] += $count
}
t = gettimeofday_us(); p = pid()
entry_io[p] = t
fd_io[p] = $fd
}
probe syscall.read.return {
t = gettimeofday_us(); p = pid()
fd = fd_io[p]
time_io[p,fd] <<< t - entry_io[p]
}
probe syscall.write {
if ($count > 0) {
filewrite[pid(), $fd] += $count
}
t = gettimeofday_us(); p = pid()
entry_io[p] = t
fd_io[p] = $fd
}
probe syscall.write.return {
t = gettimeofday_us(); p = pid()
fd = fd_io[p]
time_io[p,fd] <<< t - entry_io[p]
}
probe syscall.close {
if (filehandles[pid(), $fd] != "") {
printf("%d %s access %s read: %d write: %d\n", timestamp(), proc(),
filehandles[pid(), $fd], fileread[pid(), $fd], filewrite[pid(), $fd])
if (@count(time_io[pid(), $fd]))
printf("%d %s iotime %s time: %d\n", timestamp(), proc(),
filehandles[pid(), $fd], @sum(time_io[pid(), $fd]))
}
delete fileread[pid(), $fd]
delete filewrite[pid(), $fd]
delete filehandles[pid(), $fd]
delete fd_io[pid()]
delete entry_io[pid()]
delete time_io[pid(),$fd]
}
iotime.stp tracks each time a system call opens, closes, reads from, and writes to a file. For each file any system call accesses, iotime.stp counts the number of microseconds it takes for any reads or writes to finish and tracks the amount of data (in bytes) read from or written to the file.
iotime.stp also uses the local variable
$count to track the amount of data (in bytes) that any system call attempts to read or write. Note that $return (as used in disktop.stp from Section 4.2.1, “Summarizing Disk Read/Write Traffic”) stores the actual amount of data read/written. $count can only be used on probes that track data reads or writes (e.g. syscall.read and syscall.write).
Example 4.5. iotime.stp Sample Output
[...]
825946 3364 (NetworkManager) access /sys/class/net/eth0/carrier read: 8190 write: 0
825955 3364 (NetworkManager) iotime /sys/class/net/eth0/carrier time: 9
[...]
117061 2460 (pcscd) access /dev/bus/usb/003/001 read: 43 write: 0
117065 2460 (pcscd) iotime /dev/bus/usb/003/001 time: 7
[...]
3973737 2886 (sendmail) access /proc/loadavg read: 4096 write: 0
3973744 2886 (sendmail) iotime /proc/loadavg time: 11
[...]
Example 4.5, “iotime.stp Sample Output” prints out the following data:
- A timestamp, in microseconds
- Process ID and process name
- An
accessoriotimeflag - The file accessed
If a process was able to read or write any data, a pair of
access and iotime lines should appear together. The access line's timestamp refer to the time that a given process started accessing a file; at the end of the line, it will show the amount of data read/written (in bytes). The iotime line will show the amount of time (in microseconds) that the process took in order to perform the read or write.
If an
access line is not followed by an iotime line, it simply means that the process did not read or write any data.
4.2.3. Track Cumulative IO 复制链接链接已复制到粘贴板!
复制链接链接已复制到粘贴板!
This section describes how to track the cumulative amount of I/O to the system.
traceio.stp
#! /usr/bin/env stap
# traceio.stp
# Copyright (C) 2007 Red Hat, Inc., Eugene Teo <eteo@redhat.com>
# Copyright (C) 2009 Kai Meyer <kai@unixlords.com>
# Fixed a bug that allows this to run longer
# And added the humanreadable function
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License version 2 as
# published by the Free Software Foundation.
#
global reads, writes, total_io
probe vfs.read.return {
reads[pid(),execname()] += $return
total_io[pid(),execname()] += $return
}
probe vfs.write.return {
writes[pid(),execname()] += $return
total_io[pid(),execname()] += $return
}
function humanreadable(bytes) {
if (bytes > 1024*1024*1024) {
return sprintf("%d GiB", bytes/1024/1024/1024)
} else if (bytes > 1024*1024) {
return sprintf("%d MiB", bytes/1024/1024)
} else if (bytes > 1024) {
return sprintf("%d KiB", bytes/1024)
} else {
return sprintf("%d B", bytes)
}
}
probe timer.s(1) {
foreach([p,e] in total_io- limit 10)
printf("%8d %15s r: %12s w: %12s\n",
p, e, humanreadable(reads[p,e]),
humanreadable(writes[p,e]))
printf("\n")
# Note we don't zero out reads, writes and total_io,
# so the values are cumulative since the script started.
}
traceio.stp prints the top ten executables generating I/O traffic over time. In addition, it also tracks the cumulative amount of I/O reads and writes done by those ten executables. This information is tracked and printed out in 1-second intervals, and in descending order.
Note that traceio.stp also uses the local variable
$return, which is also used by disktop.stp from Section 4.2.1, “Summarizing Disk Read/Write Traffic”.
Example 4.6. traceio.stp Sample Output
[...]
Xorg r: 583401 KiB w: 0 KiB
floaters r: 96 KiB w: 7130 KiB
multiload-apple r: 538 KiB w: 537 KiB
sshd r: 71 KiB w: 72 KiB
pam_timestamp_c r: 138 KiB w: 0 KiB
staprun r: 51 KiB w: 51 KiB
snmpd r: 46 KiB w: 0 KiB
pcscd r: 28 KiB w: 0 KiB
irqbalance r: 27 KiB w: 4 KiB
cupsd r: 4 KiB w: 18 KiB
Xorg r: 588140 KiB w: 0 KiB
floaters r: 97 KiB w: 7143 KiB
multiload-apple r: 543 KiB w: 542 KiB
sshd r: 72 KiB w: 72 KiB
pam_timestamp_c r: 138 KiB w: 0 KiB
staprun r: 51 KiB w: 51 KiB
snmpd r: 46 KiB w: 0 KiB
pcscd r: 28 KiB w: 0 KiB
irqbalance r: 27 KiB w: 4 KiB
cupsd r: 4 KiB w: 18 KiB
4.2.4. I/O Monitoring (By Device) 复制链接链接已复制到粘贴板!
复制链接链接已复制到粘贴板!
This section describes how to monitor I/O activity on a specific device.
traceio2.stp
#! /usr/bin/env stap
global device_of_interest, dev
probe begin {
/* The following is not the most efficient way to do this.
One could directly put the result of usrdev2kerndev()
into device_of_interest. However, want to test out
the other device functions */
dev = usrdev2kerndev($1)
device_of_interest = MKDEV(MAJOR(dev), MINOR(dev))
}
probe vfs.write, vfs.read
{
if (dev == device_of_interest)
printf ("%s(%d) %s 0x%x\n",
execname(), pid(), probefunc(), dev)
}
traceio2.stp takes 1 argument: the whole device number. To get this number, use
stat -c "0x%D" directory, where directory is located in the device you wish to monitor.
The
usrdev2kerndev() function converts the whole device number into the format understood by the kernel. The output produced by usrdev2kerndev() is used in conjunction with the MKDEV(), MINOR(), and MAJOR() functions to determine the major and minor numbers of a specific device.
The output of traceio2.stp includes the name and ID of any process performing a read/write, the function it is performing (i.e.
vfs_read or vfs_write), and the kernel device number.
The following example is an excerpt from the full output of
stap traceio2.stp 0x805, where 0x805 is the whole device number of /home. /home resides in /dev/sda5, which is the device we wish to monitor.
Example 4.7. traceio2.stp Sample Output
[...]
synergyc(3722) vfs_read 0x800005
synergyc(3722) vfs_read 0x800005
cupsd(2889) vfs_write 0x800005
cupsd(2889) vfs_write 0x800005
cupsd(2889) vfs_write 0x800005
[...]
4.2.5. Monitoring Reads and Writes to a File 复制链接链接已复制到粘贴板!
复制链接链接已复制到粘贴板!
This section describes how to monitor reads from and writes to a file in real time.
inodewatch-simple.stp
probe vfs.write, vfs.read
{
dev_nr = $file->f_dentry->d_inode->i_sb->s_dev
inode_nr = $file->f_dentry->d_inode->i_ino
if (dev_nr == ($1 << 20 | $2) # major/minor device
&& inode_nr == $3)
printf ("%s(%d) %s 0x%x/%u\n",
execname(), pid(), probefunc(), dev_nr, inode_nr)
}
inodewatch-simple.stp takes the following information about the file as arguments on the command line:
- The file's major device number.
- The file's minor device number.
- The file's
inodenumber.
To get this information, use
stat -c '%D %i' filename, where filename is an absolute path.
For instance: if you wish to monitor
/etc/crontab, run stat -c '%D %i' /etc/crontab first. This gives the following output:
805 1078319
805 is the base-16 (hexadecimal) device number. The lower two digits are the minor device number and the upper digits are the major number. 1078319 is the inode number. To start monitoring /etc/crontab, run stap inodewatch.stp 0x8 0x05 1078319 (The 0x prefixes indicate base-16 values.
The output of this command contains the name and ID of any process performing a read/write, the function it is performing (i.e.
vfs_read or vfs_write), the device number (in hex format), and the inode number. Example 4.8, “inodewatch-simple.stp Sample Output” contains the output of stap inodewatch.stp 0x8 0x05 1078319 (when cat /etc/crontab is executed while the script is running) :
Example 4.8. inodewatch-simple.stp Sample Output
cat(16437) vfs_read 0x800005/1078319
cat(16437) vfs_read 0x800005/1078319
4.2.6. Monitoring Changes to File Attributes 复制链接链接已复制到粘贴板!
复制链接链接已复制到粘贴板!
This section describes how to monitor if any processes are changing the attributes of a targeted file, in real time.
inodewatch2-simple.stp
global ATTR_MODE = 1
probe kernel.function("inode_setattr") {
dev_nr = $inode->i_sb->s_dev
inode_nr = $inode->i_ino
if (dev_nr == ($1 << 20 | $2) # major/minor device
&& inode_nr == $3
&& $attr->ia_valid & ATTR_MODE)
printf ("%s(%d) %s 0x%x/%u %o %d\n",
execname(), pid(), probefunc(), dev_nr, inode_nr, $attr->ia_mode, uid())
}
Like inodewatch-simple.stp from Section 4.2.5, “Monitoring Reads and Writes to a File”, inodewatch2-simple.stp takes the targeted file's device number (in integer format) and
inode number as arguments. For more information on how to retrieve this information, refer to Section 4.2.5, “Monitoring Reads and Writes to a File”.
The output for inodewatch2-simple.stp is similar to that of inodewatch-simple.stp, except that inodewatch2-simple.stp also contains the attribute changes to the monitored file, as well as the ID of the user responsible (
uid()). Example 4.9, “inodewatch2-simple.stp Sample Output” contains shows the output of inodewatch2-simple.stp while monitoring /home/joe/bigfile when user joe executes chmod 777 /home/joe/bigfile and chmod 666 /home/joe/bigfile.
Example 4.9. inodewatch2-simple.stp Sample Output
chmod(17448) inode_setattr 0x800005/6011835 100777 500
chmod(17449) inode_setattr 0x800005/6011835 100666 500