Chapter 3. Debugging Applications
You can use common techniques to debug applications in multiple situations.
3.1. Enabling Debugging with Debugging Information
To debug applications and libraries, you must obtain debugging information.
3.1.1. Debugging information
Debugging information links binary code to source code so you can inspect variables and execution flow. The GNU Compiler Collection (GCC) generates this data in the Debug With Arbitrary Record Format (DWARF) within Executable and Linkable Format (ELF) files, and tools such as the GNU Debugger (GDB) use it to analyze program behavior.
Red Hat Enterprise Linux uses the ELF format for executable binaries, shared libraries, and debuginfo files. Within these ELF files, the DWARF format holds the debug information.
To display DWARF information stored within an ELF file, run the readelf -w file command.
STABS (Symbol Table format) is an older, less capable debugging format, occasionally used with UNIX. For debugging on RHEL, use the DWARF format instead. GCC and GDB support STABS on a best-effort basis only. Tools such as Valgrind and elfutils do not support STABS.
3.1.2. Enabling debugging of C and C++ applications with GCC
To debug C and C++ applications effectively, generate debugging information during compilation. Use GCC’s -g option to create this data. Debuggers use this data to map executable code to source lines for inspecting variables and logic.
Prerequisites
- You have the gcc package installed.
Procedure
Compile and link your code with the -g option to generate debugging information:

$ gcc ... -g ...

Optional: Set the optimization level to -Og:

$ gcc ... -g -Og ...

Compiler optimizations can make executable code hard to relate to the source code. The -Og option optimizes the code without interfering with debugging. However, be aware that changing optimization levels can alter the program’s behavior.

Optional: Use -g for moderate debugging information, or -g3 to include macro definitions:

$ gcc ... -g3 ...
Verification
Test the code by using the -fcompare-debug GCC option:

$ gcc -fcompare-debug ...

This option tests code compiled with and without debug information. If the resulting binaries are identical, the executable code is not affected by debugging options. Note that using the -fcompare-debug option significantly increases compilation time.
3.1.3. Debuginfo and debugsource packages
The debuginfo and debugsource packages contain debugging information and source code for programs and libraries. To debug Red Hat Enterprise Linux applications, install these packages from additional repositories.
Debugging information package types:

- Debuginfo packages: The debuginfo packages provide the debugging information needed to give human-readable names to binary code features. These packages contain .debug files, which contain DWARF debugging information. These files are installed to the /usr/lib/debug directory.
- Debugsource packages: The debugsource packages contain the source files used for compiling the binary code. With both debuginfo and debugsource packages installed, debuggers such as GDB or LLDB can relate the execution of binary code to the source code. The source code files are installed to the /usr/src/debug directory.
3.1.4. Getting debuginfo packages for an application or library using GDB
To obtain the necessary debuginfo packages for troubleshooting installed applications or libraries, use the GNU Debugger (GDB). It automatically detects missing symbols and identifies the specific packages needed. Follow GDB’s recommendations to install these packages and enable full debugging capabilities.
Prerequisites
- The application or library you want to debug must be installed on the system.
- GDB and the debuginfo-install tool must be installed on the system.
- Repositories providing debuginfo and debugsource packages must be configured and enabled on the system. For details, see Enabling debug and source repositories.
Procedure
Start GDB attached to the application or library you want to debug. GDB automatically recognizes missing debugging information and suggests a command to run.

$ gdb -q /bin/ls
Reading symbols from /bin/ls...
Reading symbols from .gnu_debugdata for /usr/bin/ls...(no debugging symbols found)...done.
(no debugging symbols found)...done.
Missing separate debuginfos, use: dnf debuginfo-install coreutils-9.5-6.el10.x86_64
(gdb)

Exit GDB: type q and confirm with Enter.

(gdb) q

Run the command suggested by GDB to install the required debuginfo packages:

# dnf debuginfo-install coreutils-9.5-6.el10.x86_64

The dnf package management tool provides a summary of the changes, asks for confirmation, and once you confirm, downloads and installs all the necessary files.

If GDB is not able to suggest the debuginfo package, follow the procedure described in Getting debuginfo packages for an application or library manually.
3.1.5. Getting debuginfo packages for an application or library manually
To manually determine which debuginfo packages you need to install, locate the executable file and find the package that installs it.
Prefer using GDB to determine the packages for installation. Use this manual procedure only if GDB is not able to suggest the package to install.
Prerequisites
- The application or library must be installed on the system.
- The application or library was installed from a package.
- The debuginfo-install tool must be available on the system.
- Channels providing the debuginfo packages must be configured and enabled on the system.
Procedure
Find the executable file of the application or library.
Use the which command to find the application file:

$ which less
/usr/bin/less

Use the locate command to find the library file:

$ locate libz | grep so
/usr/lib64/libz.so.1
/usr/lib64/libz.so.1.2.11

If the original reasons for debugging include error messages, pick the result where the library has the same additional numbers in its file name as those mentioned in the error messages. If in doubt, try following the rest of the procedure with the result where the library file name includes no additional numbers.

Note: The locate command is provided by the mlocate package. To install it and enable its use:

# dnf install mlocate
# updatedb
Search for the name and version of the package that provided the file:

$ rpm -qf /usr/lib64/libz.so.1.3.1.zlib-ng
zlib-ng-compat-2.2.3-1.el10.x86_64

The output provides details for the installed package in the name:epoch-version.release.architecture format.
Important: If this step does not produce any results, it is not possible to determine which package provided the binary file. There are several possible cases:

- The file is installed from a package which is not known to package management tools in their current configuration.
- The file is installed from a locally downloaded and manually installed package. Determining a suitable debuginfo package automatically is impossible in that case.
- Your package management tools are misconfigured.
- The file is not installed from any package. In such a case, no corresponding debuginfo package exists.
Because further steps depend on this one, you must resolve this situation or stop this procedure. Describing the exact troubleshooting steps is beyond the scope of this procedure.
Install the debuginfo packages using the dnf debuginfo-install utility. In the command, use the package name and other details you determined during the previous step:

# dnf debuginfo-install zlib-ng-compat-2.2.3-1.el10.x86_64
3.2. Inspecting Application Internal State with GDB
To identify why an application fails, you can control its execution and examine its internal state with the GNU Debugger (GDB).
3.2.1. GNU debugger (GDB)
Use the GNU Debugger (GDB) to inspect program execution and post-crash states. Also, you can analyze internal data and control execution flow when tracking down runtime errors. This command-line tool shows the detailed application state needed to identify and fix bugs in complex applications.
- GDB capabilities
A single GDB session can debug the following types of programs:
- Multithreaded and forking programs
- Multiple programs at once
- Programs on remote machines or in containers with the gdbserver utility connected over a TCP/IP network connection
- Debugging requirements
To debug any executable code, GDB requires debugging information for that particular code:
- For programs developed by you, you can create the debugging information while building the code.
- For system programs installed from packages, you must install their debuginfo packages.
3.2.2. Attaching GDB to a process
To examine a system process, attach the GNU Debugger (GDB) to the process.
Prerequisites
Procedure
Start a program with GDB.
Launch a program using GDB:

$ gdb program

Replace program with a file name or path to the program.

GDB sets up to start execution of the program. You can set up breakpoints and the gdb environment before beginning the execution of the process with the run command. For more information on setting breakpoints, see Using GDB breakpoints to stop execution at defined code locations.

Attach GDB to an already running process.

Find the process ID (pid) with the ps command:

$ ps -C program -o pid h
 pid

Replace program with a file name or path to the program.

Attach GDB to this process:

$ gdb -p pid

Replace pid with an actual process ID number from the ps output.

Attach an active GDB session to a running program:

Use the shell GDB command to run the ps command and find the program’s process ID (pid):

(gdb) shell ps -C program -o pid h
 pid

Replace program with a file name or path to the program.

Use the attach command to attach GDB to the program:

(gdb) attach pid

Replace pid with an actual process ID number from the ps output.

Note: In some cases, GDB might not be able to find the corresponding executable file. Use the file command to specify the path:

(gdb) file path/to/program
3.2.3. Program execution control with GDB
When the GNU Debugger (GDB) has been attached to a program, you can use several commands to control the execution of the program: you can step through code, set breakpoints, and control program flow during the debugging session.

To use these GDB commands effectively, the following requirements must be met:
- The program is compiled and built with debugging information
- The relevant debuginfo packages are installed
- GDB must be attached to the program to be debugged
GDB provides commands for stepping through and controlling program execution:
r (run): Start the execution of the program. If run is executed with any arguments, those arguments are passed on to the executable as if the program has been started normally. Users normally issue this command after setting breakpoints.
start: Start the execution of the program but stop at the beginning of the program’s main function. If start is executed with any arguments, those arguments are passed on to the executable as if the program has been started normally.
c (continue): Continue the execution of the program from the current state. The execution of the program will continue until one of the following becomes true:
- A breakpoint is reached.
- A specified condition is satisfied.
- A signal is received by the program.
- An error occurs.
- The program terminates.
n (next): Continue the execution of the program from the current state, until the next line of code in the current source file is reached. The execution of the program will continue until one of the following becomes true:
- A breakpoint is reached.
- A specified condition is satisfied.
- A signal is received by the program.
- An error occurs.
- The program terminates.
s (step): The step command also halts execution at each sequential line of code in the current source file. However, if the execution is currently stopped at a source line containing a function call, GDB stops the execution after entering the function call (rather than executing it).
until location: Continue the execution until the code location specified by the location option is reached.
fini (finish): Resume the execution of the program and halt when execution returns from a function. The execution of the program will continue until one of the following becomes true:
- A breakpoint is reached.
- A specified condition is satisfied.
- A signal is received by the program.
- An error occurs.
- The program terminates.
q (quit): Terminate the execution and exit GDB.
3.2.4. Showing program internal values with GDB
Displaying the values of a program’s internal variables is important for understanding what the program is doing. The GNU Debugger (GDB) offers multiple commands that you can use to inspect the internal variables. The following list describes the most useful of these commands.

p (print): Display the value of the given argument. Usually, the argument is the name of a variable of any complexity, from a simple single value to a structure. An argument can also be an expression valid in the current language, including the use of program variables and library functions, or functions defined in the program being tested.

It is possible to extend GDB with pretty-printer Python or Guile scripts for customized display of data structures (such as classes and structs) using the print command.

bt (backtrace): Display the chain of function calls used to reach the current execution point, or the chain of functions used up until execution was terminated. This is useful for investigating serious bugs (such as segmentation faults) with elusive causes.

Adding the full option to the backtrace command displays local variables, too.

It is possible to extend GDB with frame filter Python scripts for customized display of data displayed using the bt and info frame commands. The term frame refers to the data associated with a single function call.

info: The info command is a generic command to provide information about various items. It takes an option specifying the item to describe.

- The info args command displays the arguments of the function call that is the currently selected frame.
- The info locals command displays local variables in the currently selected frame.

For a list of the possible items, run the command help info in a GDB session:

(gdb) help info

l (list): Show the program source code. When the program has been started but is currently stopped, this command lists the source code around the point where the program is stopped, along with a few lines of context. Before the program is started, this command lists the main function. While not strictly a command to show internal state, list helps the user understand what changes to the internal state will happen in the next step of the program’s execution.
3.2.5. Using GDB breakpoints to stop execution at defined code locations
During a debugging session, you often need to investigate only specific sections of code. Breakpoints are markers that instruct GDB to stop the execution of a program at a defined location. Since breakpoints are typically associated with lines of source code, placing them requires you to specify the correct source file and line number.
Procedure
To place a breakpoint, specify the name of the source code file and the line in that file:

(gdb) br file:line

When file is not present, the name of the source file at the current point of execution is used:

(gdb) br line

Alternatively, use a function name to put the breakpoint on its start:

(gdb) br function_name

A program might encounter an error after a certain number of iterations of a task. To specify an additional condition to halt execution:

(gdb) br file:line if condition

Replace condition with a condition in the C or C++ language. The meaning of file and line is the same as above.

To inspect the status of all breakpoints and watchpoints:

(gdb) info br

To remove a breakpoint by using its number as displayed in the output of info br:

(gdb) delete number

To remove a breakpoint at a given location:

(gdb) clear file:line
3.2.6. Using GDB watchpoints to stop execution on data access and changes
GDB watchpoints pause execution when data changes or is accessed. Use them to debug unexpected variable corruption when the cause is unknown. Setting a watchpoint stops the program at the exact moment of access to inspect the state and find the root cause.
Procedure
To place a watchpoint for data change (write):

(gdb) watch expression

Replace expression with an expression that describes what you want to watch. For variables, expression is equal to the name of the variable.

To place a watchpoint for data access (read):

(gdb) rwatch expression

To place a watchpoint for any data access (both read and write):

(gdb) awatch expression

To inspect the status of all watchpoints and breakpoints:

(gdb) info br

To remove a watchpoint:

(gdb) delete num

Replace the num option with the number reported by the info br command.
3.2.7. Debugging forking or threaded programs with GDB
Some programs use forking or threads to achieve parallel code execution. To debug multiple simultaneous execution paths, you can use a variety of commands based on your use case.
- Debugging forked programs with GDB
Forking is a situation when a program (parent) creates an independent copy of itself (child). Use the following settings and commands to affect what GDB does when a fork occurs:
The follow-fork-mode setting controls whether GDB follows the parent or the child after the fork.

- set follow-fork-mode parent: After a fork, debug the parent process. This is the default.
- set follow-fork-mode child: After a fork, debug the child process.
- show follow-fork-mode: Display the current setting of follow-fork-mode.

The set detach-on-fork setting controls whether GDB keeps control of the other (not followed) process or leaves it to run.

- set detach-on-fork on: The process which is not followed (depending on the value of follow-fork-mode) is detached and runs independently. This is the default.
- set detach-on-fork off: GDB keeps control of both processes. The process which is followed (depending on the value of follow-fork-mode) is debugged as usual, while the other is suspended.
- show detach-on-fork: Display the current setting of detach-on-fork.
- Debugging threaded programs with GDB

GDB has the ability to debug individual threads, and to manipulate and examine them independently. To make GDB stop only the thread that is examined, use the commands set non-stop on and set target-async on. You can add these commands to the .gdbinit file. After that functionality is turned on, GDB is ready to conduct thread debugging.

GDB uses a concept of current thread. By default, commands apply to the current thread only.

- info threads: Display a list of threads with their id and gid numbers, indicating the current thread.
- thread id: Set the thread with the specified id as the current thread.
- thread apply ids command: Apply the command command to all threads listed by ids. The ids option is a space-separated list of thread ids. A special value all applies the command to all threads.
- break location thread id if condition: Set a breakpoint at a certain location with a certain condition only for the thread number id.
- watch expression thread id: Set a watchpoint defined by expression only for the thread number id.
- command &: Run the command command and return immediately to the gdb prompt (gdb), continuing any code execution in the background.
- interrupt: Halt execution in the background.
Additional resources
3.3. Recording Application Interactions
The executable code of applications interacts with the code of the operating system and shared libraries. Recording an activity log of these interactions can provide enough insight into the application’s behavior without debugging the actual application code. Alternatively, analyzing an application’s interactions can help pinpoint the conditions in which a bug manifests.
3.3.1. Tools for recording application interactions
To record application interactions, several tools are available in RHEL. For system calls, use strace; for library calls, use ltrace; and for advanced probing, use SystemTap. Select the appropriate tool to log specific runtime behaviors and diagnose integration issues.
- strace
The strace tool primarily enables logging of system calls (kernel functions) used by an application.

- The strace tool can provide a detailed output about calls, because strace interprets parameters and results with knowledge of the underlying kernel code. Numbers are turned into the corresponding constant names, bitwise combined flags expanded to flag lists, pointers to character arrays dereferenced to provide the actual string, and more. Support for more recent kernel features might be lacking.
- To reduce the amount of captured data, filter the traced calls.
- The use of strace does not require any particular setup except for setting up the log filter.
- Tracing the application code with strace results in significant slowdown of the application’s execution. As a result, strace is not suitable for many production deployments. As an alternative, consider using ltrace or SystemTap.
- The version of strace available in Red Hat Developer Toolset can also perform system call tampering. This capability is useful for debugging.
- ltrace
The ltrace tool enables logging of an application’s user space calls into shared objects (dynamic libraries).

- The ltrace tool enables tracing calls to any library.
- To reduce the amount of captured data, filter the traced calls.
- The use of ltrace does not require any particular setup except for setting up the log filter.
- The ltrace tool is lightweight and fast, offering an alternative to strace: it is possible to trace the corresponding interfaces in libraries such as glibc with ltrace instead of tracing kernel functions with strace.
- Because ltrace does not handle a known set of calls the way strace does, it does not attempt to explain the values passed to library functions. The ltrace output contains only raw numbers and pointers. The interpretation of ltrace output requires consulting the actual interface declarations of the libraries present in the output.

Note: In Red Hat Enterprise Linux 10, a known issue prevents ltrace from tracing system executable files. This limitation does not apply to executable files built by users.
- SystemTap
SystemTap is an instrumentation platform for probing running processes and kernel activity on the Linux system. SystemTap uses its own scripting language for programming custom event handlers.

- Compared to using strace and ltrace, scripting the logging means more work in the initial setup phase. However, the scripting capabilities extend SystemTap’s usefulness beyond just producing logs.
- SystemTap works by creating and inserting a kernel module. The use of SystemTap is efficient and does not create a significant slowdown of the system or application execution on its own.
- SystemTap includes a set of usage examples.
- GDB
The GNU Debugger (GDB) is primarily meant for debugging, not logging. However, some of its features make it useful even in the scenario where an application’s interaction is the primary activity of interest.
- With GDB, it is possible to conveniently combine the capture of an interaction event with immediate debugging of the subsequent execution path.
- GDB is best suited for analyzing the response to infrequent or singular events, after the initial identification of a problematic situation by other tools. Using GDB in any scenario with frequent events becomes inefficient or even impossible.
Additional resources
3.3.2. Monitoring an application’s system calls with strace
To monitor the system (kernel) calls performed by an application, use the strace tool.
Prerequisites
Procedure
Identify the system calls to monitor.
Start strace and attach it to the program.

If the program you want to monitor is not running, start strace and specify the program:

$ strace -fvttTyy -s 256 -e trace=call program

If the program is already running, find its process id (pid):

$ ps -C program

Attach strace to the process:

$ strace -fvttTyy -s 256 -e trace=call -p pid

- Replace call with the system calls to be displayed. You can use the -e trace=call option multiple times. If left out, strace will display all system call types. See the strace(1) manual page for more information.
- If you do not want to trace any forked processes or threads, omit the -f option.

The strace tool displays the system calls made by the application and their details.

In most cases, an application and its libraries make a large number of calls and strace output displays immediately, if no filter for system calls is set.

The strace tool exits when the program exits.

To terminate the monitoring before the traced program exits, press Ctrl+C.

- If strace started the program, the program terminates together with strace.
- If you attached strace to an already running program, the program continues to run without strace.
Analyze the list of system calls done by the application.
- Problems with resource access or availability are present in the log as calls returning errors.
- Values passed to the system calls and patterns of call sequences provide insight into the causes of the application’s behaviour.
- If the application crashes, the important information is probably at the end of the log.

The output contains a large amount of unnecessary information. However, you can construct a more precise filter for the system calls of interest and repeat the procedure.

Note: It is advantageous to both see the output and save it to a file. Use the tee command to achieve this:

$ strace ... |& tee your_log_file.log
3.3.3. Monitoring application’s library function calls with ltrace
To monitor an application’s calls to functions available in libraries (shared objects), use the ltrace tool.
Prerequisites
Procedure
- Identify the libraries and functions of interest, if possible.
Start ltrace and attach it to the program.

Note: In Red Hat Enterprise Linux 10, a known issue prevents ltrace from tracing system executable files. This limitation does not apply to executable files built by users.

If the program you want to monitor is not running, start ltrace and specify the program:

$ ltrace -f -l library -e function program

If the program is already running, find its process id (pid):

$ ps -C program

Attach ltrace to the process:

$ ltrace -f -l library -e function -p pid

Use the -e, -f and -l options to filter the output:

- Supply the function names to be displayed as function. The -e function option can be used multiple times. If left out, ltrace displays calls to all functions.
- Instead of specifying functions, you can specify whole libraries with the -l library option. This option behaves similarly to the -e function option.
- If you do not want to trace any forked processes or threads, omit the -f option.

See the ltrace(1) manual page for more information.

ltrace displays the library calls made by the application.

In most cases, an application makes a large number of calls and ltrace output displays immediately, if no filter is set.

ltrace exits when the program exits.

To terminate the monitoring before the traced program exits, press Ctrl+C.

- If ltrace started the program, the program terminates together with ltrace.
- If you attached ltrace to an already running program, the program continues to run without ltrace.
Analyze the list of library calls done by the application.

- If the application crashes, the important information is probably at the end of the log.

The output contains a large amount of unnecessary information. However, you can construct a more precise filter and repeat the procedure.

Note: It is advantageous to both see the output and save it to a file. Use the tee command to achieve this:

$ ltrace ... |& tee your_log_file.log
3.3.4. Monitoring application’s system calls with SystemTap
To register custom event handlers for kernel events, use the SystemTap tool. Compared to strace, SystemTap is more efficient but requires more setup. Installing SystemTap also installs the strace.stp script, which provides an approximation of the strace functionality.
Procedure
Find the process ID (pid) of the process you want to monitor:

$ ps aux

Run SystemTap with the strace.stp script:

# stap /usr/share/systemtap/examples/process/strace.stp -x pid

The value of pid is the process id.
The script is compiled to a kernel module, which is then loaded. This introduces a slight delay between entering the command and getting the output.
- When the process performs a system call, the call name and its parameters are printed to the terminal.
- The script exits when the process terminates, or when you press Ctrl+C.
3.3.5. Using GDB to intercept application system calls
To stop program execution when the program performs specific system calls, use GNU Debugger (GDB) catchpoints, then inspect the program state and system call parameters at those points.
Prerequisites
Procedure
Set the catchpoint:

(gdb) catch syscall syscall-name

The catch syscall command sets a special type of breakpoint that halts execution when the program performs a system call. The syscall-name option specifies the name of the call. You can specify multiple catchpoints for various system calls. Leaving out the syscall-name option causes GDB to stop on any system call.

Start execution of the program.

If the program has not started execution, start it:

(gdb) r

If the program execution is halted, resume it:

(gdb) c
- GDB halts execution after the program performs any specified system call.
- Use further GDB commands to examine the program state and advance execution, according to the particular situation.
To exit the GDB debugging session:
(gdb) q
3.3.6. Using GDB to intercept handling of signals by applications
To stop the execution of a program under specific circumstances, you can use the GNU Debugger (GDB). To stop the execution when the program receives a signal from the operating system, use a GDB catchpoint.
Prerequisites
Procedure
Set the catchpoint:

(gdb) catch signal signal-type

The catch signal command sets a special type of breakpoint that halts execution when a signal is received by the program. The signal-type option specifies the type of the signal. Use the special value 'all' to catch all signals.

Let the program run.

If the program has not started execution, start it:

(gdb) r

If the program execution is halted, resume it:

(gdb) c
- GDB halts execution after the program receives any specified signal.
- Continue by debugging the program code handling the signal, or resume execution with the knowledge of the signal being received.
Later, to exit the GDB debugging session:
(gdb) q
3.4. Debugging a Crashed Application
Sometimes, it is not possible to debug an application directly. In these situations, you can collect information about the application at the moment of its termination and analyze it afterwards.
3.4.1. Core dumps: what they are and how to use them
A core dump records parts of an application’s memory when the application stops. After an application fails, you can analyze the core dump, along with the executable and debuginfo, by using a debugger.
The Linux operating system kernel can record core dumps automatically, if this functionality is enabled. Alternatively, send a signal to any running application to generate a core dump regardless of its actual state.
Some limits might affect the ability to generate a core dump. To see the current limits:
$ ulimit -a
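For example, the core file size limit can be inspected and raised for the current shell session only. This is a minimal sketch; raising the soft limit beyond the hard limit fails for unprivileged users:

```shell
# Show the soft limit for core file size; 0 means core dumps are disabled
ulimit -c

# Remove the cap for this shell session (does not persist across logins)
ulimit -c unlimited

# Confirm the new value
ulimit -c
```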
3.4.2. Recording application crashes with core dumps
To record application crashes, set up core dump saving and add information about the system.
Procedure
To enable core dumps, ensure that the /etc/systemd/system.conf file contains the following lines:
DumpCore=yes
DefaultLimitCORE=infinity
You can also add comments describing whether these settings were previously present and what the previous values were. This enables you to reverse the changes later, if needed. Comments are lines starting with the # character.
Changing the file requires administrator-level access.
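For illustration, the edited section of /etc/systemd/system.conf might then look like the following; the commented lines recording the previous values are hypothetical and depend on what your file contained before:

```ini
# Previous value, kept as a comment so the change can be reversed later:
#DumpCore=no
DumpCore=yes

# This setting was previously absent:
DefaultLimitCORE=infinity
```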
Apply the new configuration:
# systemctl daemon-reexec
Remove the limits for core dump sizes:
# ulimit -c unlimited
To reverse this change, run the command with the value 0 instead of unlimited.
Install the sos package, which provides the sosreport utility for collecting system information:
# dnf install sos
When an application crashes, a core dump is generated and handled by systemd-coredump. Create an SOS report to provide additional information about the system:
# sosreport
This creates a .tar archive containing information about your system, such as copies of configuration files.
Locate the core dump:
$ coredumpctl list executable-name
Export the core dump:
$ coredumpctl dump executable-name > /path/to/file-for-export
If the application crashed multiple times, the output of the first command lists several captured core dumps. In that case, construct a more precise query for the second command by using the other information. See the coredumpctl(1) manual page for details.
Transfer the core dump and the SOS report to the computer where the debugging will take place. Transfer the executable file, too, if it is known.
Important: When the executable file is not known, subsequent analysis of the core file identifies it.
- Optional: Remove the core dump and SOS report after transferring them, to free up disk space.
3.4.3. Inspecting application crash states with core dumps
To inspect the state of an application at the moment it terminated unexpectedly, use core dumps.
Prerequisites
- You must have a core dump file and an SOS report from the system where the crash occurred.
- GDB and elfutils must be installed on your system.
Procedure
To identify the executable file where the crash occurred, run the eu-unstrip command with the core dump file:
$ eu-unstrip -n --core=./core.9814
0x400000+0x207000 2818b2009547f780a5639c904cded443e564973e@0x400284 /usr/bin/sleep /usr/lib/debug/bin/sleep.debug [exe]
0x7fff26fff000+0x1000 1e2a683b7d877576970e4275d41a6aaec280795e@0x7fff26fff340 . - linux-vdso.so.1
0x35e7e00000+0x3b6000 374add1ead31ccb449779bc7ee7877de3377e5ad@0x35e7e00280 /usr/lib64/libc-2.14.90.so /usr/lib/debug/lib64/libc-2.14.90.so.debug libc.so.6
0x35e7a00000+0x224000 3ed9e61c2b7e707ce244816335776afa2ad0307d@0x35e7a001d8 /usr/lib64/ld-2.14.90.so /usr/lib/debug/lib64/ld-2.14.90.so.debug ld-linux-x86-64.so.2
The output contains details for each module on a line, separated by spaces. The information is listed in this order:
- The memory address where the module was mapped
- The build-id of the module and where in the memory it was found
- The module’s executable file name, displayed as - when unknown, or as . when the module has not been loaded from a file
- The source of debugging information, displayed as a file name when available, as . when contained in the executable file itself, or as - when not present at all
- The shared library name (soname), or [exe] for the main module
In this example, the important details are the file name /usr/bin/sleep and the build-id 2818b2009547f780a5639c904cded443e564973e on the line containing the text [exe]. With this information, you can identify the executable file required for analyzing the core dump.
Get the executable file that crashed.
- If possible, copy it from the system where the crash occurred. Use the file name extracted from the core file.
You can also use an identical executable file on your system. Each executable file built on Red Hat Enterprise Linux contains a note with a unique build-id value. Determine the build-id of the relevant locally available executable files:
$ eu-readelf -n executable_file
Use this information to match the executable file on the remote system with your local copy. The build-id of the local file and the build-id listed in the core dump must match.
-
Finally, if the application is installed from an RPM package, you can get the executable file from the package. Use the sosreport output to find the exact version of the package required.
- Get the shared libraries used by the executable file. Use the same steps as for the executable file.
- If the application is distributed as a package, load the executable file in GDB, to display hints for missing debuginfo packages. For more details, see Getting debuginfo packages for an application or library using GDB.
To examine the core file in detail, load the executable file and core dump file with GDB:
$ gdb -e executable_file -c core_file
Further messages about missing files and debugging information help you identify what is missing for the debugging session. Return to the previous step if needed.
If the application’s debugging information is available as a file instead of a package, load this file in GDB with the symbol-file command:
(gdb) symbol-file program.debug
Replace program.debug with the actual file name.
Note: It might not be necessary to install the debugging information for all executable files contained in the core dump. Most of these executable files are libraries used by the application code. These libraries might not directly contribute to the problem you are analyzing, and you do not need to include debugging information for them.
Use the GDB commands to inspect the state of the application at the moment it crashed. See Inspecting Application Internal State with GDB.
Note: When analyzing a core file, GDB is not attached to a running process. Commands for controlling execution have no effect.
To see only the stack of the application at the moment it terminated, open the core file with the eu-stack utility:
$ eu-stack --core core-file
This displays a listing of the application’s stack.
Additional resources
3.4.4. Creating and accessing a core dump with coredumpctl
To manage and analyze core dumps directly on the affected system, use coredumpctl. This tool simplifies finding, capturing, and inspecting crash data. Identify an unresponsive process, force a core dump, and verify its successful capture to diagnose application failures.
Prerequisites
The system must be configured to use systemd-coredump for core dump handling. To verify this is true:
$ sysctl kernel.core_pattern
The configuration is correct if the output starts with the following:
kernel.core_pattern = |/usr/lib/systemd/systemd-coredump
Procedure
Find the PID of the hung process, based on a known part of the executable file name:
$ pgrep -a executable-name-fragment
This command outputs a line in the form:
PID command-line
Use the command-line value to verify that the PID belongs to the intended process.
For example:
$ pgrep -a bc
5459 bc
Send an abort signal to the process:
# kill -ABRT PID
Verify that the core has been captured by coredumpctl:
$ coredumpctl list PID
For example:
$ coredumpctl list 5459
TIME                            PID  UID  GID SIG COREFILE EXE
Thu 2019-11-07 15:14:46 CET    5459 1000 1000   6 present  /usr/bin/bc
Further examine or use the core file as needed.
You can specify the core dump by PID and other values. See the coredumpctl(1) manual page for further details.
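The effect of the abort signal can be seen even without coredumpctl: a process terminated by a signal exits with a status of 128 plus the signal number. A minimal sketch using a disposable sleep process in place of the hung application:

```shell
# Start a disposable background process standing in for the hung application
sleep 60 &
pid=$!

# Send the abort signal, as with 'kill -ABRT PID' above
kill -ABRT "$pid"

# SIGABRT is signal 6, so the reported status is 128 + 6 = 134
wait "$pid"
echo "exit status: $?"
```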
Next steps
To show details of the core file, run:
$ coredumpctl info PID
To load the core file in the GDB debugger, run:
$ coredumpctl debug PID
Depending on the availability of debugging information, GDB might suggest commands to run, such as:
Missing separate debuginfos, use: dnf debuginfo-install bc-1.07.1-23.el10.x86_64
For more details on this process, see Getting debuginfo packages for an application or library using GDB.
To export the core file for further processing elsewhere, run:
$ coredumpctl dump PID > /path/to/file_for_export
Replace /path/to/file_for_export with the file where you want to put the core dump.
3.4.5. Dumping process memory with gcore
To capture the memory state of a running process without terminating it, use the gcore utility. This creates a core dump file for offline analysis. The result is a snapshot of the application’s memory that helps you investigate issues while the service remains available.
Prerequisites
Procedure
Identify the process ID (PID). Use tools such as ps, pgrep, and top:
$ ps -C some-program
Dump the memory of this process:
$ gcore -o filename pid
This creates a file filename and dumps the process memory into it. While the memory is being dumped, the execution of the process is halted.
- After the core dump is finished, the process resumes normal execution.
Create an SOS report to provide additional information about the system:
# sosreport
This creates a tar archive containing information about your system, such as copies of configuration files.
- Transfer the program’s executable file, core dump, and the SOS report to the computer where the debugging will take place.
- Optional: Remove the core dump and SOS report after transferring them, to free up disk space.
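The memory regions that gcore captures are the mappings of the running process, which the kernel exposes in /proc/PID/maps. As a rough, Linux-only sketch of peeking at that layout without gdb installed:

```shell
# Start a disposable process to inspect
sleep 30 &
pid=$!

# /proc/PID/maps lists the mapped memory regions of the process;
# these are the regions a core dump covers
head -n 3 "/proc/$pid/maps"

# Clean up the disposable process
kill "$pid"
```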
Additional resources
3.4.6. Dumping protected process memory with GDB
To dump protected process memory, configure GNU Debugger (GDB) to ignore core dump filters. Capture memory regions flagged as non-dumpable, such as those conserving resources or holding sensitive data. Use the gcore command within GDB to generate the complete core file.
Prerequisites
Procedure
Set GDB to ignore the settings in the /proc/PID/coredump_filter file:
(gdb) set use-coredump-filter off
Set GDB to ignore the memory page flag VM_DONTDUMP:
(gdb) set dump-excluded-mappings on
Dump the memory:
(gdb) gcore core-file
Replace core-file with the name of the file where you want to dump the memory.
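The filter that the first command tells GDB to ignore is exposed per process in /proc; its value is a hexadecimal bitmask of memory types to include in a core dump, documented in the core(5) manual page. A quick Linux-only way to inspect it:

```shell
# Show the core dump filter bitmask of the current shell process
cat /proc/self/coredump_filter
```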
3.5. Debugging applications in containers
To troubleshoot container applications, you can use various command-line tools.
This is not a complete list of command-line tools. The choice of tool for debugging a container application depends heavily on the container image and your use case.
For instance, the systemctl, journalctl, ip, netstat, ping, traceroute, perf, iostat tools might need root access because they interact with system-level resources such as networking, systemd services, or hardware performance counters, which are restricted in rootless containers for security reasons.
Rootless containers operate without requiring elevated privileges, running as non-root users within user namespaces to provide improved security and isolation from the host system. They offer limited interaction with the host, reduced attack surface, and enhanced security by mitigating the risk of privilege escalation vulnerabilities.
Rootful containers run with elevated privileges, typically as the root user, granting full access to system resources and capabilities. While rootful containers offer greater flexibility and control, they pose security risks due to their potential for privilege escalation and exposure of the host system to vulnerabilities.
For more information about rootful and rootless containers, see Creating a rootless container with bind mount by using the podman RHEL system role and Special considerations for rootless containers.
- Systemd and Process Management Tools
  - systemctl: Controls systemd services within containers, allowing start, stop, enable, and disable operations.
  - journalctl: Views logs generated by systemd services, aiding in troubleshooting container issues.
- Networking Tools
  - ip: Manages network interfaces, routing, and addresses within containers.
  - netstat: Displays network connections, routing tables, and interface statistics.
  - ping: Verifies network connectivity between containers or hosts.
  - traceroute: Identifies the path packets take to reach a destination, useful for diagnosing network issues.
- Process and Performance Tools
  - ps: Lists currently running processes within containers.
  - top: Provides real-time insights into resource usage by processes within containers.
  - htop: Interactive process viewer for monitoring resource utilization.
  - perf: CPU performance profiling, tracing, and monitoring, aiding in pinpointing performance bottlenecks within the system or applications.
  - vmstat: Reports virtual memory statistics within containers, aiding in performance analysis.
  - iostat: Monitors input/output statistics for block devices within containers.
  - gdb (GNU Debugger): A command-line debugger that helps examine and debug programs by allowing users to track and control their execution, inspect variables, and analyze memory and registers during runtime. For more information, see the Debugging applications within Red Hat OpenShift containers article.
  - strace: Intercepts and records system calls made by a program, aiding in troubleshooting by revealing interactions between the program and the operating system.
- Security and Access Control Tools
  - sudo: Enables executing commands with elevated privileges.
  - chroot: Changes the root directory for a command, helpful in testing or troubleshooting within a different root directory.
- Podman-Specific Tools
  - podman logs: Batch-retrieves whatever logs are present for one or more containers at the time of execution.
  - podman inspect: Displays the low-level information on containers and images as identified by name or ID.
  - podman events: Monitors and prints events that occur in Podman. Each event includes a timestamp, a type, a status, a name (if applicable), and an image (if applicable). The default logging mechanism is journald.
  - podman run --health-cmd: Uses a health check to determine the health or readiness of the process running inside the container.
  - podman top: Displays the running processes of the container.
  - podman exec: Runs commands in, or attaches to, a running container, which is extremely useful for getting a better understanding of what is happening in the container.
  - podman export: When a container fails, it can be difficult to determine the reasons. Exporting the filesystem structure from the container allows you to check other log files that might not be in the mounted volumes.