5.4.2. Using Perf
To collect statistics on make and its children, use the following command:
# perf stat -- make all
The perf command collects a number of different hardware and software counters. It then prints the following information:
 Performance counter stats for 'make all':

   244011.782059  task-clock-msecs     #      0.925 CPUs
           53328  context-switches     #      0.000 M/sec
             515  CPU-migrations       #      0.000 M/sec
         1843121  page-faults          #      0.008 M/sec
    789702529782  cycles               #   3236.330 M/sec
   1050912611378  instructions         #      1.331 IPC
    275538938708  branches             #   1129.203 M/sec
      2888756216  branch-misses        #      1.048 %
      4343060367  cache-references     #     17.799 M/sec
       428257037  cache-misses         #      1.755 M/sec

   263.779192511  seconds time elapsed
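The derived figures in the right-hand column can be reproduced from the raw counters. The following is a minimal sketch, assuming a POSIX shell with awk available; the ratio helper is a hypothetical name used only for illustration and is not part of perf.

```shell
#!/bin/sh
# ratio NUMERATOR DENOMINATOR [SCALE]
# Hypothetical helper (not part of perf): prints NUMERATOR / DENOMINATOR * SCALE
# rounded to three decimal places. SCALE defaults to 1.
ratio() {
    awk -v n="$1" -v d="$2" -v s="${3:-1}" 'BEGIN { printf "%.3f\n", n / d * s }'
}

# Counter values copied from the perf stat output above.
ratio 1050912611378 789702529782     # instructions / cycles = IPC
ratio 2888756216 275538938708 100    # branch-misses as a percentage of branches
ratio 244011.782059 263779.192511    # task-clock msecs / elapsed msecs = CPUs utilized
```

With the sample values, the three calls print 1.331, 1.048, and 0.925, which match the IPC, branch-miss percentage, and CPUs figures reported by perf stat.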
The perf tool can also record samples. For example, to record data on the make command and its children, use:
# perf record -- make all
[ perf record: Woken up 42 times to write data ]
[ perf record: Captured and wrote 9.753 MB perf.data (~426109 samples) ]
A {} group syntax has been added that allows the creation of event groups based on the way they are specified on the command line.
The --group and -g options remain the same; if either is specified for the record, stat, or top command, all the specified events become members of a single group with the first event as the group leader.
The {} group syntax allows the creation of a group like:
# perf record -e '{cycles,faults}' ls
# perf record -e '{faults:k,cache-references}:p'
The above command results in the :kp modifier being used for the faults event, and the :p modifier being used for the cache-references event.
Both OProfile and Performance Counters for Linux (PCL) use the same hardware Performance Monitoring Unit (PMU). If OProfile is running when you attempt to use the PCL perf command, an error message like the following is displayed:
Error: open_counter returned with 16 (Device or resource busy).
/bin/dmesg may provide additional information.

Fatal: Not all events could be opened.
To use the perf command, first shut down OProfile:
# opcontrol --deinit
You can then analyze perf.data to determine the relative frequency of samples. The report output includes the command, object, and function for the samples. Use perf report to output an analysis of perf.data. For example, the following command produces a report of the executable that consumes the most time:
# perf report --sort=comm
# Samples: 1083783860000
#
# Overhead  Command
# ........  ...............
#
    48.19%  xsltproc
    44.48%  pdfxmltex
     6.01%  make
     0.95%  perl
     0.17%  kernel-doc
     0.05%  xmllint
     0.05%  cc1
     0.03%  cp
     0.01%  xmlto
     0.01%  sh
     0.01%  docproc
     0.01%  ld
     0.01%  gcc
     0.00%  rm
     0.00%  sed
     0.00%  git-diff-files
     0.00%  bash
     0.00%  git-diff-index
The Overhead column shows that make spends most of its time in xsltproc and pdfxmltex. To reduce the time make takes to complete, focus on xsltproc and pdfxmltex. To list the functions executed by xsltproc, run:
# perf report -n --comm=xsltproc
comm: xsltproc
# Samples: 472520675377
#
# Overhead  Samples       Shared Object      Symbol
# ........  ..........    .................  ......
#
    45.54%  215179861044  libxml2.so.2.7.6   [.] xmlXPathCmpNodesExt
    11.63%  54959620202   libxml2.so.2.7.6   [.] xmlXPathNodeSetAdd__internal_alias
     8.60%  40634845107   libxml2.so.2.7.6   [.] xmlXPathCompOpEval
     4.63%  21864091080   libxml2.so.2.7.6   [.] xmlXPathReleaseObject
     2.73%  12919672281   libxml2.so.2.7.6   [.] xmlXPathNodeSetSort__internal_alias
     2.60%  12271959697   libxml2.so.2.7.6   [.] valuePop
     2.41%  11379910918   libxml2.so.2.7.6   [.] xmlXPathIsNaN__internal_alias
     2.19%  10340901937   libxml2.so.2.7.6   [.] valuePush__internal_alias
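The Overhead column in this report appears to be each symbol's sample count divided by the total given in the Samples header line. The following is a minimal sketch of that arithmetic, assuming a POSIX shell with awk; the percent helper is a hypothetical name used for illustration and is not a perf subcommand.

```shell
#!/bin/sh
# percent PART TOTAL
# Hypothetical helper (not part of perf): prints PART as a percentage of TOTAL,
# rounded to two decimal places.
percent() {
    awk -v p="$1" -v t="$2" 'BEGIN { printf "%.2f%%\n", p / t * 100 }'
}

# Sample counts copied from the xsltproc report above (total: 472520675377).
percent 215179861044 472520675377    # xmlXPathCmpNodesExt
percent 54959620202 472520675377     # xmlXPathNodeSetAdd__internal_alias
```

With the sample values, the two calls print 45.54% and 11.63%, matching the first two rows of the report.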