Chapter 19. Recording and analyzing performance profiles with perf
The perf tool allows you to record performance data and analyze it at a later time.
Prerequisites
- You have the perf user space tool installed as described in Installing perf.
19.1. The purpose of perf record
The perf record command samples performance data and stores it in a file, perf.data, which can be read and visualized with other perf commands. The perf.data file is generated in the current directory and can be accessed at a later time, possibly on a different machine.
If you do not specify a command for perf record to run and sample, it records until you manually stop the process by pressing Ctrl+C. You can attach perf record to specific processes by passing the -p option followed by one or more process IDs. You can run perf record without root access; however, doing so samples performance data in the user space only. In the default mode, perf record uses CPU cycles as the sampling event and operates in per-thread mode with inherit mode enabled.
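As a minimal sketch of attaching to running processes, the helper below wraps perf record -p and uses sleep to bound the recording time. The function name record_pids, its argument order, and the process IDs in the example are hypothetical; -p (with a comma-separated list), -o, and the -- separator are documented perf record options.

```shell
# Hypothetical helper: sample already-running processes for a fixed time.
# Usage: record_pids PID[,PID...] SECONDS [OUTPUT_FILE]
record_pids() {
    pids=$1
    seconds=$2
    out=${3:-perf.data}
    # -p attaches to the comma-separated PIDs; "sleep" only bounds the duration.
    perf record -p "$pids" -o "$out" -- sleep "$seconds"
}

# Example (assumes processes 1234 and 5678 exist):
# record_pids 1234,5678 10 web.data
```

Writing to a named file with -o keeps several recordings side by side instead of overwriting perf.data in the current directory.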
19.2. Recording a performance profile without root access
You can use perf record without root access to sample and record performance data in the user space only.
Prerequisites
- You have the perf user space tool installed as described in Installing perf.
Procedure
Sample and record the performance data:
$ perf record command
Replace command with the command that you want to run while sampling data. If you do not specify a command, then perf record samples data until you manually stop it by pressing Ctrl+C.
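What an unprivileged user can sample is governed by the kernel.perf_event_paranoid sysctl. The following sketch reads the current level and prints what it permits. The function name and message wording are illustrative, and the level descriptions paraphrase the kernel documentation, so verify them against your kernel version.

```shell
# Hypothetical helper: explain a perf_event_paranoid level.
describe_paranoid() {
    level=$1
    if [ "$level" -ge 2 ]; then
        echo "user-space profiling only for unprivileged users"
    elif [ "$level" -ge 1 ]; then
        echo "kernel profiling allowed, CPU-wide events restricted"
    elif [ "$level" -ge 0 ]; then
        echo "CPU-wide events allowed, raw tracepoints restricted"
    else
        echo "almost all events allowed"
    fi
}

# Read the live value when the file exists (Linux only).
f=/proc/sys/kernel/perf_event_paranoid
if [ -r "$f" ]; then
    describe_paranoid "$(cat "$f")"
fi
```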
Additional resources
- perf-record(1) man page
19.3. Recording a performance profile with root access
You can use perf record with root access to sample and record performance data in both the user space and the kernel space simultaneously.
Prerequisites
- You have the perf user space tool installed as described in Installing perf.
- You have root access.
Procedure
Sample and record the performance data:
# perf record command
Replace command with the command that you want to run while sampling data. If you do not specify a command, then perf record samples data until you manually stop it by pressing Ctrl+C.
Additional resources
- perf-record(1) man page
19.4. Recording a performance profile in per-CPU mode
You can use perf record in per-CPU mode to sample and record performance data in both the user space and the kernel space simultaneously across all threads on a monitored CPU. By default, per-CPU mode monitors all online CPUs.
Prerequisites
- You have the perf user space tool installed as described in Installing perf.
Procedure
Sample and record the performance data:
# perf record -a command
Replace command with the command that you want to run while sampling data. If you do not specify a command, then perf record samples data until you manually stop it by pressing Ctrl+C.
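For illustration, the following sketch records in per-CPU mode for a fixed time, either on all online CPUs or only on a CPU list that you pass in. The helper name record_cpus and its arguments are hypothetical; -a and -C (--cpu) are documented perf record options. Run it as root.

```shell
# Hypothetical helper: per-CPU recording for SECONDS seconds.
# Usage: record_cpus SECONDS [CPU_LIST]   e.g. record_cpus 5 0,2-3
record_cpus() {
    seconds=$1
    cpus=$2
    if [ -n "$cpus" ]; then
        # -C restricts system-wide sampling to the listed CPUs.
        perf record -a -C "$cpus" -- sleep "$seconds"
    else
        # -a alone samples all online CPUs.
        perf record -a -- sleep "$seconds"
    fi
}
```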
Additional resources
- perf-record(1) man page
19.5. Capturing call graph data with perf record
You can configure the perf record tool so that it records which functions call other functions in the performance profile. This helps to identify a bottleneck if several processes are calling the same function.
Prerequisites
- You have the perf user space tool installed as described in Installing perf.
Procedure
Sample and record performance data with the --call-graph option:
$ perf record --call-graph method command
- Replace command with the command that you want to run while sampling data. If you do not specify a command, then perf record samples data until you manually stop it by pressing Ctrl+C.
- Replace method with one of the following unwinding methods:
  - fp: Uses the frame pointer method. Depending on compiler optimization, such as with binaries built with the GCC option -fomit-frame-pointer, this method might not be able to unwind the stack.
  - dwarf: Uses DWARF Call Frame Information to unwind the stack.
  - lbr: Uses the last branch record hardware on Intel processors.
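The following sketch wraps the call graph procedure in a small function that rejects any unwinding method other than the three listed. The helper name record_callgraph and the ./my_app program in the example are hypothetical.

```shell
# Hypothetical helper: record COMMAND with a validated unwinding method.
# Usage: record_callgraph METHOD COMMAND [ARGS...]
record_callgraph() {
    method=$1
    shift
    case $method in
        fp|dwarf|lbr) ;;
        *) echo "unknown unwinding method: $method" >&2; return 1 ;;
    esac
    # "--" separates perf options from the command being profiled.
    perf record --call-graph "$method" -- "$@"
}

# Example: record_callgraph dwarf ./my_app --input data.bin
```

If fp produces truncated stacks for a binary built without frame pointers, trying dwarf on the same command is a reasonable next step.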
Additional resources
- perf-record(1) man page
19.6. Analyzing perf.data with perf report
You can use perf report to display and analyze a perf.data file.
Prerequisites
- You have the perf user space tool installed as described in Installing perf.
- There is a perf.data file in the current directory.
- If the perf.data file was created with root access, you need to run perf report with root access too.
Procedure
Display the contents of the perf.data file for further analysis:
# perf report
This command displays output similar to the following:
Samples: 2K of event 'cycles', Event count (approx.): 235462960
Overhead  Command      Shared Object                 Symbol
   2.36%  kswapd0      [kernel.kallsyms]             [k] page_vma_mapped_walk
   2.13%  sssd_kcm     libc-2.28.so                  [.] memset_avx2_erms
   2.13%  perf         [kernel.kallsyms]             [k] smp_call_function_single
   1.53%  gnome-shell  libc-2.28.so                  [.] strcmp_avx2
   1.17%  gnome-shell  libglib-2.0.so.0.5600.4       [.] g_hash_table_lookup
   0.93%  Xorg         libc-2.28.so                  [.] memmove_avx_unaligned_erms
   0.89%  gnome-shell  libgobject-2.0.so.0.5600.4    [.] g_object_unref
   0.87%  kswapd0      [kernel.kallsyms]             [k] page_referenced_one
   0.86%  gnome-shell  libc-2.28.so                  [.] memmove_avx_unaligned_erms
   0.83%  Xorg         [kernel.kallsyms]             [k] alloc_vmap_area
   0.63%  gnome-shell  libglib-2.0.so.0.5600.4       [.] g_slice_alloc
   0.53%  gnome-shell  libgirepository-1.0.so.1.0.0  [.] g_base_info_unref
   0.53%  gnome-shell  ld-2.28.so                    [.] _dl_find_dso_for_object
   0.49%  kswapd0      [kernel.kallsyms]             [k] vma_interval_tree_iter_next
   0.48%  gnome-shell  libpthread-2.28.so            [.] pthread_getspecific
   0.47%  gnome-shell  libgirepository-1.0.so.1.0.0  [.] 0x0000000000013b1d
   0.45%  gnome-shell  libglib-2.0.so.0.5600.4       [.] g_slice_free1
   0.45%  gnome-shell  libgobject-2.0.so.0.5600.4    [.] g_type_check_instance_is_fundamentally_a
   0.44%  gnome-shell  libc-2.28.so                  [.] malloc
   0.41%  swapper      [kernel.kallsyms]             [k] apic_timer_interrupt
   0.40%  gnome-shell  ld-2.28.so                    [.] _dl_lookup_symbol_x
   0.39%  kswapd0      [kernel.kallsyms]             [k] raw_callee_save___pv_queued_spin_unlock
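When you want the report as plain text, for example to save it or to process it in a script, a sketch such as the following might help. The helper name report_text and its defaults are hypothetical; --stdio, -i, and --sort are documented perf report options.

```shell
# Hypothetical helper: print a non-interactive report for a perf.data file.
# Usage: report_text [FILE] [SORT_KEYS]
report_text() {
    file=${1:-perf.data}
    keys=${2:-comm,dso,symbol}
    # --stdio prints plain text instead of opening the interactive browser.
    perf report --stdio -i "$file" --sort "$keys"
}

# Example: report_text perf.data comm,dso > report.txt
```

Sorting by comm,dso groups the samples per process and shared object, which gives a coarser view than the per-symbol default.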
Additional resources
- perf-report(1) man page
19.7. Interpretation of perf report output
The table displayed by running the perf report command sorts the data into several columns:
- The 'Overhead' column: Indicates what percentage of overall samples were collected in that particular function.
- The 'Command' column: Tells you which process the samples were collected from.
- The 'Shared Object' column: Displays the name of the ELF image where the samples come from (the name [kernel.kallsyms] is used when the samples come from the kernel).
- The 'Symbol' column: Displays the function name or symbol.
In the default mode, functions are sorted in descending order of overhead, with the highest overhead displayed first.
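To make the column layout concrete, the following sketch splits one row of the sample output into its four columns with awk. The function name parse_row is hypothetical, and the sketch assumes that the command name contains no spaces. The bracketed marker before the symbol ([k] for kernel, [.] for user space) is skipped.

```shell
# One row copied from the sample perf report output in this chapter.
row='2.36% kswapd0 [kernel.kallsyms] [k] page_vma_mapped_walk'

# Hypothetical helper: split a row into overhead, command, object, and symbol.
# Field 4 is the [k] or [.] marker, so the symbol is field 5.
parse_row() {
    printf '%s\n' "$1" | awk '{
        printf "overhead=%s command=%s object=%s symbol=%s\n", $1, $2, $3, $5
    }'
}

parse_row "$row"
# prints: overhead=2.36% command=kswapd0 object=[kernel.kallsyms] symbol=page_vma_mapped_walk
```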
19.8. Generating a perf.data file that is readable on a different device
You can use the perf tool to record performance data into a perf.data file to be analyzed on a different device.
Prerequisites
- You have the perf user space tool installed as described in Installing perf.
- The kernel debuginfo package is installed. For more information, see Getting debuginfo packages for an application or library using GDB.
Procedure
Capture the performance data that you want to investigate further:
# perf record -a --call-graph fp sleep seconds
This example generates a perf.data file over the entire system for the number of seconds that you specify in place of seconds, as dictated by the use of the sleep command. It also captures call graph data using the frame pointer method.
Generate an archive file containing debug symbols of the recorded data:
# perf archive
Verification
Verify that the archive file has been generated in your current active directory:
# ls perf.data*
The output displays every file in your current directory that begins with perf.data. The archive file is named either perf.data.tar.gz or perf.data.tar.bz2.
19.9. Analyzing a perf.data file that was created on a different device
You can use the perf tool to analyze a perf.data file that was generated on a different device.
Prerequisites
- You have the perf user space tool installed as described in Installing perf.
- A perf.data file and associated archive file generated on a different device are present on the current device being used.
Procedure
- Copy both the perf.data file and the archive file into your current active directory.
- Extract the archive file into ~/.debug:
  # mkdir -p ~/.debug
  # tar xf perf.data.tar.bz2 -C ~/.debug
  Note: The archive file might also be named perf.data.tar.gz.
- Open the perf.data file for further analysis:
  # perf report
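Because the archive can have either of two names, a small sketch such as the following can pick whichever one is present and extract it. The function name unpack_perf_archive and its arguments are hypothetical; the tar invocation is the one used in the procedure.

```shell
# Hypothetical helper: unpack whichever perf archive exists in DIR.
# Usage: unpack_perf_archive [DIR] [DEBUG_DIR]
unpack_perf_archive() {
    dir=${1:-.}
    debug_dir=${2:-$HOME/.debug}
    for archive in "$dir/perf.data.tar.bz2" "$dir/perf.data.tar.gz"; do
        if [ -f "$archive" ]; then
            mkdir -p "$debug_dir"
            # tar detects the compression type when reading an archive.
            tar xf "$archive" -C "$debug_dir"
            echo "extracted $archive into $debug_dir"
            return 0
        fi
    done
    echo "no perf.data archive found in $dir" >&2
    return 1
}

# Example: unpack_perf_archive . ~/.debug && perf report
```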
19.10. Why perf displays some function names as raw function addresses
For kernel functions, perf uses the information from the /proc/kallsyms file to map the samples to their respective function names or symbols. For functions executed in the user space, however, you might see raw function addresses because the binary is stripped.
The debuginfo package of the executable must be installed or, if the executable is a locally developed application, the application must be compiled with debugging information turned on (the -g option in GCC) to display the function names or symbols in such a situation.
It is not necessary to re-run the perf record command after installing the debuginfo associated with an executable. Simply re-run the perf report command.
19.11. Enabling debug and source repositories
A standard installation of Red Hat Enterprise Linux does not enable the debug and source repositories. These repositories contain information needed to debug the system components and measure their performance.
Procedure
Enable the source and debug information package channels:
The $(uname -i) part is automatically replaced with a matching value for the architecture of your system:
Architecture name      Value
64-bit Intel and AMD   x86_64
64-bit ARM             aarch64
IBM POWER              ppc64le
64-bit IBM Z           s390x
19.12. Getting debuginfo packages for an application or library using GDB
Debugging information is required to debug code. For code that is installed from a package, the GNU Debugger (GDB) automatically recognizes missing debug information, resolves the package name and provides concrete advice on how to get the package.
Prerequisites
- The application or library you want to debug must be installed on the system.
- GDB and the debuginfo-install tool must be installed on the system. For details, see Setting up to debug applications.
- Repositories providing debuginfo and debugsource packages must be configured and enabled on the system. For details, see Enabling debug and source repositories.
Procedure
Start GDB attached to the application or library you want to debug. GDB automatically recognizes missing debugging information and suggests a command to run.
$ gdb -q /bin/ls
Reading symbols from /bin/ls...Reading symbols from .gnu_debugdata for /usr/bin/ls...(no debugging symbols found)...done.
(no debugging symbols found)...done.
Missing separate debuginfos, use: dnf debuginfo-install coreutils-8.30-6.el8.x86_64
(gdb)
Exit GDB: type q and confirm with Enter.
(gdb) q
Run the command suggested by GDB to install the required debuginfo packages:
# dnf debuginfo-install coreutils-8.30-6.el8.x86_64
The dnf package management tool provides a summary of the changes, asks for confirmation and once you confirm, downloads and installs all the necessary files.
- In case GDB is not able to suggest the debuginfo package, follow the procedure described in Getting debuginfo packages for an application or library manually.
Additional resources
- How can I download or install debuginfo packages for RHEL systems? — Red Hat Knowledgebase solution