Chapter 21. Recording and analyzing performance profiles with perf
The perf tool allows you to record performance data and analyze it at a later time.
Prerequisites
- You have the perf user space tool installed as described in Installing perf.
21.1. The purpose of perf record
The perf record command samples performance data and stores it in a file, perf.data, which can be read and visualized with other perf commands. perf.data is generated in the current directory and can be accessed at a later time, possibly on a different machine.
If you do not specify a command for perf record to profile, it records until you manually stop the process by pressing Ctrl+C. You can attach perf record to specific processes by passing the -p option followed by one or more process IDs. You can run perf record without root access; however, doing so samples performance data in the user space only. In the default mode, perf record uses CPU cycles as the sampling event and operates in per-thread mode with inherit mode enabled.
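As a rough illustration of attaching to several processes at once, the sketch below builds the comma-separated process ID list that the -p option accepts. The helper name join_pids and the process IDs in the usage comment are made up for this example.

```shell
#!/bin/sh
# join_pids is a hypothetical helper, not part of perf: it joins its
# arguments with commas to form the value passed to the -p option.
join_pids() {
    list=""
    for pid in "$@"; do
        # Add a comma only between entries, not before the first one.
        list="${list:+$list,}$pid"
    done
    printf '%s\n' "$list"
}

# Example usage (the process IDs are placeholders):
#   perf record -p "$(join_pids 1234 5678)"
# Press Ctrl+C to stop recording; the samples are written to perf.data.
```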
21.2. Recording a performance profile without root access
You can use perf record without root access to sample and record performance data in the user space only.
Prerequisites
- You have the perf user space tool installed as described in Installing perf.
Procedure
Sample and record the performance data:
$ perf record command
Replace command with the command during which you want to sample data. If you do not specify a command, perf record samples data until you manually stop it by pressing Ctrl+C.
21.3. Recording a performance profile with root access
You can use perf record with root access to sample and record performance data in both the user space and the kernel space simultaneously.
Prerequisites
- You have the perf user space tool installed as described in Installing perf.
- You have root access.
Procedure
Sample and record the performance data:
# perf record command
Replace command with the command during which you want to sample data. If you do not specify a command, perf record samples data until you manually stop it by pressing Ctrl+C.
21.4. Recording a performance profile in per-CPU mode
You can use perf record in per-CPU mode to sample and record performance data in both the user space and the kernel space simultaneously across all threads on a monitored CPU. By default, per-CPU mode monitors all online CPUs.
Prerequisites
- You have the perf user space tool installed as described in Installing perf.
Procedure
Sample and record the performance data:
# perf record -a command
Replace command with the command during which you want to sample data. If you do not specify a command, perf record samples data until you manually stop it by pressing Ctrl+C.
21.5. Capturing call graph data with perf record
You can configure the perf record tool so that it records which function is calling other functions in the performance profile. This helps to identify a bottleneck if several processes are calling the same function.
Prerequisites
- You have the perf user space tool installed as described in Installing perf.
Procedure
Sample and record performance data with the --call-graph option:
$ perf record --call-graph method command
- Replace command with the command during which you want to sample data. If you do not specify a command, perf record samples data until you manually stop it by pressing Ctrl+C.
- Replace method with one of the following unwinding methods:
  - fp: Uses the frame pointer method. Depending on compiler optimization, such as with binaries built with the GCC option -fomit-frame-pointer, this may not be able to unwind the stack.
  - dwarf: Uses DWARF Call Frame Information to unwind the stack.
  - lbr: Uses the last branch record hardware on Intel processors.
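The choice of unwinding method can be wrapped in a small script so that a mistyped method is caught before recording starts. This is only a sketch: the function names is_unwind_method and record_with_call_graph, and the program ./my_app in the usage comment, are hypothetical and not part of perf.

```shell
#!/bin/sh
# is_unwind_method is a hypothetical helper: it succeeds only for the
# three unwinding methods described in this section.
is_unwind_method() {
    case "$1" in
        fp|dwarf|lbr) return 0 ;;
        *) return 1 ;;
    esac
}

# record_with_call_graph is also hypothetical: it checks the method and
# then runs perf record with the remaining arguments as the command.
record_with_call_graph() {
    method="$1"
    shift
    if ! is_unwind_method "$method"; then
        echo "unknown unwinding method: $method" >&2
        return 2
    fi
    perf record --call-graph "$method" "$@"
}

# Example usage (./my_app is a placeholder program):
#   record_with_call_graph dwarf ./my_app
```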
21.6. Analyzing perf.data with perf report
You can use perf report to display and analyze a perf.data file.
Prerequisites
- You have the perf user space tool installed as described in Installing perf.
- There is a perf.data file in the current directory.
- If the perf.data file was created with root access, you need to run perf report with root access too.
Procedure
Display the contents of the perf.data file for further analysis:
# perf report
This command displays the recorded samples as a table. For a description of its columns, see Interpretation of perf report output.
21.7. Interpretation of perf report output
The table displayed by running the perf report command sorts the data into several columns:
- The 'Overhead' column: Indicates what percentage of overall samples were collected in that particular function.
- The 'Command' column: Tells you which process the samples were collected from.
- The 'Shared Object' column: Displays the name of the ELF image where the samples come from (the name [kernel.kallsyms] is used when the samples come from the kernel).
- The 'Symbol' column: Displays the function name or symbol.
By default, the functions are sorted in descending order of overhead, so those with the highest overhead are displayed first.
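For scripted analysis, the same table can be printed as plain text with the --stdio option of perf report and filtered on the Overhead column. The sketch below assumes the usual text layout, in which header lines start with # and each data row starts with a percentage. The function name top_overhead and the default threshold of 5 percent are arbitrary choices for this example.

```shell
#!/bin/sh
# top_overhead is a hypothetical helper: it reads perf report text output
# on standard input and prints only the rows whose first column is a
# percentage at or above the given threshold (default: 5).
top_overhead() {
    awk -v min="${1:-5}" '
        /^#/ || NF == 0 { next }        # skip header comments and blank lines
        $1 ~ /%$/ && ($1 + 0) >= min    # keep rows at or above the threshold
    '
}

# Example usage, run in the directory that contains perf.data:
#   perf report --stdio | top_overhead 10
```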
21.8. Generating a perf.data file that is readable on a different device
You can use the perf tool to record performance data into a perf.data file to be analyzed on a different device.
Prerequisites
- You have the perf user space tool installed as described in Installing perf.
- The kernel debuginfo package is installed. For more information, see Getting debuginfo packages for an application or library using GDB.
Procedure
Capture performance data you are interested in investigating further:
# perf record -a --call-graph fp sleep seconds
This example generates a perf.data file over the entire system for a period of seconds seconds, as dictated by the use of the sleep command. It also captures call graph data using the frame pointer method.
Generate an archive file containing debug symbols of the recorded data:
# perf archive
Verification
Verify that the archive file has been generated in your current active directory:
# ls perf.data*
The output displays every file in your current directory that begins with perf.data. The archive file is named either perf.data.tar.gz or perf.data.tar.bz2.
21.9. Analyzing a perf.data file that was created on a different device
You can use the perf tool to analyze a perf.data file that was generated on a different device.
Prerequisites
- You have the perf user space tool installed as described in Installing perf.
- A perf.data file and associated archive file generated on a different device are present on the current device being used.
Procedure
Copy both the perf.data file and the archive file into your current active directory.
Extract the archive file into ~/.debug:
# mkdir -p ~/.debug
# tar xf perf.data.tar.bz2 -C ~/.debug
Note: The archive file might also be named perf.data.tar.gz.
Open the perf.data file for further analysis:
# perf report
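Because the archive can have either of two names, a script that automates this procedure might first check which one is present. The following is a minimal sketch under that assumption. The function name find_perf_archive is hypothetical.

```shell
#!/bin/sh
# find_perf_archive is a hypothetical helper: it prints the path of the
# perf archive in the given directory (default: current directory),
# checking both file names that the archive can have.
find_perf_archive() {
    dir="${1:-.}"
    for name in perf.data.tar.bz2 perf.data.tar.gz; do
        if [ -f "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
            return 0
        fi
    done
    echo "no perf archive found in $dir" >&2
    return 1
}

# Example usage, mirroring the extraction step of this procedure:
#   archive=$(find_perf_archive) || exit 1
#   mkdir -p ~/.debug
#   tar xf "$archive" -C ~/.debug
#   perf report
```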
21.10. Why perf displays some function names as raw function addresses
For kernel functions, perf uses the information from the /proc/kallsyms file to map the samples to their respective function names or symbols. For functions executed in the user space, however, you might see raw function addresses because the binary is stripped.
To display the function names or symbols in such a situation, install the debuginfo package of the executable or, if the executable is a locally developed application, compile the application with debugging information turned on (the -g option in GCC).
It is not necessary to re-run the perf record command after installing the debuginfo associated with an executable. Simply re-run the perf report command.
21.11. Enabling debug and source repositories
A standard installation of Red Hat Enterprise Linux does not enable the debug and source repositories. These repositories contain information needed to debug the system components and measure their performance.
Procedure
Enable the source and debug information package channels:
# subscription-manager repos --enable rhel-8-for-$(uname -i)-baseos-debug-rpms
# subscription-manager repos --enable rhel-8-for-$(uname -i)-baseos-source-rpms
# subscription-manager repos --enable rhel-8-for-$(uname -i)-appstream-debug-rpms
# subscription-manager repos --enable rhel-8-for-$(uname -i)-appstream-source-rpms
The $(uname -i) part is automatically replaced with a matching value for the architecture of your system:
Architecture name        Value
64-bit Intel and AMD     x86_64
64-bit ARM               aarch64
IBM POWER                ppc64le
64-bit IBM Z             s390x
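To make the substitution concrete, the sketch below prints the four repository IDs from this procedure for a given architecture value. The function name debug_source_repos is hypothetical. When no architecture is passed, it falls back to the output of uname -i, as the commands in the procedure do.

```shell
#!/bin/sh
# debug_source_repos is a hypothetical helper: it prints the four
# repository IDs used in this procedure for the given architecture, or
# for the output of uname -i when no architecture is given.
debug_source_repos() {
    arch="${1:-$(uname -i)}"
    for channel in baseos appstream; do
        for kind in debug source; do
            printf 'rhel-8-for-%s-%s-%s-rpms\n' "$arch" "$channel" "$kind"
        done
    done
}

# Example usage (run as root, as in the procedure):
#   for repo in $(debug_source_repos); do
#       subscription-manager repos --enable "$repo"
#   done
```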
21.12. Getting debuginfo packages for an application or library using GDB
Debugging information is required to debug code. For code that is installed from a package, the GNU Debugger (GDB) automatically recognizes missing debug information, resolves the package name and provides concrete advice on how to get the package.
Prerequisites
- The application or library you want to debug must be installed on the system.
- GDB and the debuginfo-install tool must be installed on the system. For details, see Setting up to debug applications.
- Repositories providing debuginfo and debugsource packages must be configured and enabled on the system. For details, see Enabling debug and source repositories.
Procedure
Start GDB attached to the application or library you want to debug. GDB automatically recognizes missing debugging information and suggests a command to run.
$ gdb -q /bin/ls
Reading symbols from /bin/ls...Reading symbols from .gnu_debugdata for /usr/bin/ls...(no debugging symbols found)...done.
(no debugging symbols found)...done.
Missing separate debuginfos, use: dnf debuginfo-install coreutils-8.30-6.el8.x86_64
(gdb)
Exit GDB: type q and confirm with Enter.
(gdb) q
Run the command suggested by GDB to install the required debuginfo packages:
# dnf debuginfo-install coreutils-8.30-6.el8.x86_64
The dnf package management tool provides a summary of the changes, asks for confirmation and, once you confirm, downloads and installs all the necessary files.
In case GDB is not able to suggest the debuginfo package, follow the procedure described in Getting debuginfo packages for an application or library manually.