10.4. Performance Tools
Red Hat Enterprise Linux 7 includes updates to the most recent versions of several performance tools, such as oprofile, papi, and elfutils, bringing performance, portability, and functionality improvements.
Moreover, Red Hat Enterprise Linux 7 premieres:
- Support for Performance Co-Pilot;
- SystemTap support for (DynInst-based) instrumentation that runs entirely in unprivileged user space, as well as efficient (Byteman-based) pinpoint probing of Java applications;
- Valgrind support for hardware transactional memory and improvements in modeling vector instructions.
10.4.1. Performance Co-Pilot
Red Hat Enterprise Linux 7 introduces support for Performance Co-Pilot (PCP), a suite of tools, services, and libraries for acquisition, archiving, and analysis of system-level performance measurements. Its light-weight, distributed architecture makes it particularly well suited to centralized analysis of complex systems.
Performance metrics can be added using the Python, Perl, C++ and C interfaces. Analysis tools can use the client APIs (Python, C++, C) directly, and rich web applications can explore all available performance data using a JSON interface.
For further information, see the Index of Performance Co-Pilot (PCP) articles, solutions, tutorials and white papers on the Customer Portal, or consult the extensive manual pages in the pcp and pcp-libs-devel packages. The pcp-doc package installs documentation in the /usr/share/doc/pcp-doc/* directory, which also includes the Performance Co-Pilot User's and Administrator's Guide as well as the Performance Co-Pilot Programmer's Guide.
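As a quick illustration of reading a system-level measurement with the PCP command-line tools, the following minimal sketch queries one metric. It assumes that the pcp package is installed and that the pmcd service is running on the local host; the metric name kernel.all.load is only an example choice and is not taken from the text above.

```shell
# Assumption: the pcp package is installed and the pmcd service is running.
# kernel.all.load is used here only as an example metric name.
metric="kernel.all.load"

if command -v pminfo >/dev/null 2>&1; then
    # Show the metric's one-line help text and fetch its current values.
    pminfo -t -f "$metric"
    # Sample the metric 3 times at 1-second intervals.
    pmval -s 3 -t 1 "$metric"
else
    echo "PCP tools not found; install the pcp package first" >&2
fi
```

The pminfo(1) and pmval(1) manual pages describe the remaining options of both tools.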
10.4.2. SystemTap
Red Hat Enterprise Linux 7 includes systemtap version 2.4, which brings several new capabilities, including optional pure user-space script execution, richer and more efficient Java probing, virtual machine probing, improved error messages, and a number of bug fixes. The most notable changes are the following:
- Using the dyninst binary-editing library, SystemTap can now execute some scripts purely at user-space level; no kernel or root privileges are used. This mode, selected using the stap --dyninst command, enables only those types of probes or operations that affect only the user's own processes. Note that this mode is incompatible with programs that throw C++ exceptions;
- A new way of injecting probes into Java applications is supported in conjunction with the byteman tool. New SystemTap probe types, java("com.app").class("class_name").method("name(signature)").*, enable probing of individual method enter and exit events in an application, without system-wide tracing;
- A new facility has been added to the SystemTap driver tooling to enable remote execution on a libvirt-managed KVM instance running on a server. It enables automated and secure transfer of a compiled SystemTap script to a virtual machine guest across a dedicated secure virtio-serial link. A new guest-side daemon loads the scripts and transfers their output back to the host. This approach is faster and does not require an IP-level network connection between the host and the guest. To test this function, run the following command:
stap --remote=libvirt://MyVirtualMachine
- In addition, a number of improvements have been made to SystemTap's diagnostic messages:
  - Many error messages now contain cross-references to the related manual pages. These pages explain the errors and suggest corrections;
  - If a script input is suspected to contain typographic errors, a sorted suggestion list is offered to the user. This suggestion facility is used in a number of contexts when user-specified names may mismatch acceptable names, such as probed function names, markers, variables, files, aliases, and others;
  - Diagnostic duplicate-elimination has been improved;
  - ANSI coloring has been added to make messages easier to understand.
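The unprivileged dyninst mode described above can be tried with a one-line script. The following is a minimal sketch rather than a definitive recipe: the target path ./myapp is a hypothetical placeholder for any program owned by the current user, and the probe on the main function is only an example.

```shell
# Hypothetical target binary; pass your own program as the first argument.
target="${1:-./myapp}"

if command -v stap >/dev/null 2>&1 && [ -x "$target" ]; then
    # Run the script through the dyninst backend instead of a kernel module.
    # -c starts the target program under SystemTap and limits probing to it.
    stap --dyninst -e 'probe process.function("main") { println("main() entered") }' -c "$target"
else
    echo "stap or $target not available; nothing was run" >&2
fi
```

Because only the user's own processes are instrumented, the sketch is meant to be run as a regular user. Probing functions by name generally requires that debugging information for the target is available.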
10.4.3. Valgrind
Red Hat Enterprise Linux 7 includes Valgrind, an instrumentation framework that includes a number of tools to profile applications. This version is based on the Valgrind 3.9.0 release and includes numerous improvements relative to the Red Hat Enterprise Linux 6 and Red Hat Developer Toolset 2.1 counterparts, which were based on Valgrind 3.8.1.
Notable new features of Valgrind included in Red Hat Enterprise Linux 7 are the following:
- Support for IBM System z Decimal Floating Point instructions on hosts that have the DFP facility installed;
- Support for IBM POWER8 (Power ISA 2.07) instructions;
- Support for Intel AVX2 instructions. Note that this is available only on 64-bit architectures;
- Initial support for Intel Transactional Synchronization Extensions, both Restricted Transactional Memory (RTM) and Hardware Lock Elision (HLE);
- Initial support for Hardware Transactional Memory on IBM PowerPC;
- The default size of the translation cache has been increased to 16 sectors, reflecting the fact that large applications require instrumentation and storage of huge amounts of code. For similar reasons, the number of memory-mapped segments that can be tracked has been increased by a factor of 6. The maximum number of sectors in the translation cache can be controlled by the new flag --num-transtab-sectors;
- Valgrind no longer temporarily creates a mapping of the entire object to read from it. Instead, reading is done through a small fixed-size buffer. This avoids virtual memory spikes when Valgrind reads debugging information from large shared objects;
- The list of used suppressions (displayed when the -v option is specified) now shows, for each used suppression, the file name and line number where the suppression is defined;
- A new flag, --sigill-diagnostics, can now be used to control whether a diagnostic message is printed when the just-in-time (JIT) compiler encounters an instruction it cannot translate. The actual behavior (delivery of the SIGILL signal to the application) is unchanged;
- The Memcheck tool has been improved with the following features:
  - Improvements in handling of vector code, leading to significantly fewer false error reports. Use the --partial-loads-ok=yes flag to get the benefits of these changes;
  - Better control over the leak checker. It is now possible to specify which kinds of leaks (definite, indirect, possible, and reachable) should be displayed, which should be regarded as errors, and which should be suppressed by a given leak suppression. This is done using the options --show-leak-kinds=kind1,kind2,.., --errors-for-leak-kinds=kind1,kind2,.., and an optional match-leak-kinds: line in suppression entries, respectively. Note that generated leak suppressions contain this new line and are therefore more specific than in previous releases. To get the same behavior as previous releases, remove the match-leak-kinds: line from generated suppressions before using them;
  - Reduced "possible leak" reports from the leak checker by the use of better heuristics. The available heuristics provide detection of valid interior pointers to std::string, to new[] allocated arrays with elements having destructors, and to interior pointers pointing to an inner part of a C++ object using multiple inheritance. They can be selected individually using the --leak-check-heuristics=heur1,heur2,... option;
  - Better control of stacktrace acquisition for heap-allocated blocks. Using the --keep-stacktraces option, it is possible to control independently whether a stack trace is acquired for each allocation and deallocation. This can be used to create better "use after free" errors or to decrease Valgrind's resource consumption by recording less information;
  - Better reporting of leak suppression usage. The list of suppressions used (shown when the -v option is specified) now shows, for each leak suppression, how many blocks and bytes it suppressed during the last leak search.
- The Valgrind GDB server integration has been improved with the following monitoring commands:
  - A new monitor command, v.info open_fds, that gives the list of open file descriptors and additional details;
  - A new monitor command, v.info execontext, that shows information about the stack traces recorded by Valgrind;
  - A new monitor command, v.do expensive_sanity_check_general, to run certain internal consistency checks.
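The Memcheck leak-checker options listed above can be combined in a single invocation. The following is a minimal sketch under stated assumptions: ./myapp is a hypothetical program under test, and the particular selection of leak kinds is only one possible policy. The --leak-check=full and --error-exitcode options are long-standing Valgrind options that are not new in this release.

```shell
# Hypothetical program under test; pass your own binary as the first argument.
target="${1:-./myapp}"

# Display definite and possible leaks, but count only definite leaks as errors.
# Stack traces are kept for both the allocation and the deallocation of a block.
# --error-exitcode=1 makes Valgrind exit with status 1 if errors are reported.
opts="--leak-check=full --show-leak-kinds=definite,possible --errors-for-leak-kinds=definite --keep-stacktraces=alloc-and-free --error-exitcode=1"

if command -v valgrind >/dev/null 2>&1 && [ -x "$target" ]; then
    # $opts is intentionally left unquoted so that it splits into separate options.
    valgrind $opts "$target"
else
    echo "valgrind or $target not available; nothing was run" >&2
fi
```

The new monitor commands are sent to a program that is already running under Valgrind, either from GDB with the monitor prefix or from a second terminal with the vgdb utility, for example vgdb v.info open_fds. The list of open file descriptors is available only if Valgrind was started with the --track-fds=yes option.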