Linux Performance Tools
Introduction
This page lists a number of performance tools that are used by the LTC Kernel Performance Team.
Publicly Available Tools
- The SGI Kernprof Profile Patch
- The SGI Lockmeter Patch
- Gcov Coverage Support
- The Strace System Call Tracing Facility
- The Linux Trace Toolkit (LTT)
- The Dynamic Probe Facility from IBM (DProbes)
Tools in Development
- The Kparam visual tool for tunable kernel parameters
Publicly Available Tools
The SGI Kernprof Patch supports a variety of hierarchical profile modes.
The most distinctive is the "annotated call-graph" mode. This mode provides
kernel data that is analyzed by the gprof program; however, it requires
that the kernel be recompiled using a patched version of gcc. In addition,
the -pg option must be added and the -fomit-frame-pointer option to gcc
must be removed. The resulting instrumented kernel can therefore perform
significantly differently from the base kernel. Nonetheless, the SGI Kernprof
patch provides a wealth of information about the time spent in the various
kernel routines and the call patterns among those routines.
It should be noted that the gprof output from annotated call-graph mode
combines raw profile data (obtained by recording the instruction address
at each clock tick) with information about function entry as generated
by mcount. The data about time spent in each function, thus, is not an
exact measurement (since there is no recording at function exit) but an
approximation based upon the data collected by the flat profiling.
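The approximation described above can be sketched in a few lines. This is a minimal illustration, not Kernprof or gprof code: it assumes that every call to a function costs the same average time, so a function's sampled time is charged to its callers in proportion to the number of calls each one made. The function names, tick counts, and the 10 ms tick length are made up for the example, and only the callee's own sampled time is distributed (time in its descendants is ignored).

```python
from collections import defaultdict

def attribute_time(self_ticks, call_counts, tick_seconds=0.01):
    """Charge each callee's sampled time to its callers by call count.

    self_ticks:  {function: clock-tick samples that landed in it}
    call_counts: {(caller, callee): calls recorded at function entry}
    Returns {(caller, callee): estimated seconds charged to that caller}.
    """
    # Total number of recorded calls into each callee.
    total_calls = defaultdict(int)
    for (_caller, callee), calls in call_counts.items():
        total_calls[callee] += calls

    charged = {}
    for (caller, callee), calls in call_counts.items():
        callee_seconds = self_ticks.get(callee, 0) * tick_seconds
        # No exit timestamps exist, so every call is assumed to cost
        # the same average amount of time.
        charged[(caller, callee)] = callee_seconds * calls / total_calls[callee]
    return charged

# Hypothetical profile: 300 ticks landed in hot_func, which was entered
# 10 times from caller_a and 30 times from caller_b.
estimate = attribute_time(
    {"hot_func": 300},
    {("caller_a", "hot_func"): 10, ("caller_b", "hot_func"): 30},
)
```

If caller_a's calls were in fact much more expensive than caller_b's, this estimate would still split the time 1:3, which is why the figures are an approximation rather than a measurement.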
Kernprof has other interesting modes as well. Among them are some
that allow profiling by programming one of the Pentium III performance
counters. This can be combined with call-graph profiling (the flat profiling
information is collected when the performance counter overflows rather
than when the timer ticks) and, of course, with flat profiling. For example,
one can cause an interrupt to occur every million instructions or every
100,000 cache-line misses. Just as time-based profiling shows where the
kernel spends its time, an instruction-based profile shows where the kernel
executes most of its instructions, and a cache-line-based profile shows where
the kernel takes most of its cache-line misses. These latter types of profiling
can provide additional insight into kernel performance.
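The idea of sampling on counter overflow rather than on timer ticks can be illustrated with a small simulation. This is only a sketch under simplifying assumptions: the "trace" is a hand-made list of (function, event count) steps, the function names and numbers are hypothetical, and no real performance counter is programmed.

```python
from collections import Counter

def sample_on_overflow(trace, period):
    """Simulate event-based sampling.

    trace:  iterable of (function_name, events) steps in execution order
    period: number of events between two sampling interrupts
    Returns a Counter mapping each function to the samples it received.
    """
    samples = Counter()
    counter = 0
    for func, events in trace:
        counter += events
        # Each time the counter reaches the period it "overflows" and the
        # currently running function receives one sample.
        while counter >= period:
            samples[func] += 1
            counter -= period
    return samples

# The same hypothetical workload seen through two different counters.
by_instructions = sample_on_overflow(
    [("scheduler", 3_000_000), ("copy_loop", 1_000_000)], period=1_000_000
)  # scheduler: 3 samples, copy_loop: 1 sample
by_cache_misses = sample_on_overflow(
    [("scheduler", 50_000), ("copy_loop", 350_000)], period=100_000
)  # copy_loop: 4 samples, scheduler: none
```

In this made-up workload the instruction-based profile points at one function and the cache-miss profile at another, which is the kind of additional insight the text refers to.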
For more information on kernprof, visit SGI-kernprof.
The SGI Lockmeter Patch provides a facility to measure usage statistics
for kernel spin locks. High contention for a particular lock may indicate
that a portion of the kernel needs to be rewritten to use more fine-grained
locking.
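To make "usage statistics" concrete, here is a user-space analogy in Python. It is not Lockmeter and does not use kernel spin locks; the MeteredLock class and its counters are hypothetical names for this sketch. It counts how many lock requests found the lock already held, which is the kind of figure that would flag a highly contended lock.

```python
import threading

class MeteredLock:
    """A lock wrapper that records how often requests had to wait."""

    def __init__(self):
        self._lock = threading.Lock()
        self._stats_lock = threading.Lock()  # protects the two counters
        self.acquisitions = 0  # total lock requests
        self.contended = 0     # requests that found the lock already held

    def acquire(self):
        # Try without blocking first; a failure means someone else holds it.
        contended = not self._lock.acquire(blocking=False)
        with self._stats_lock:
            self.acquisitions += 1
            if contended:
                self.contended += 1
        if contended:
            self._lock.acquire()  # now wait for the lock

    def release(self):
        self._lock.release()

    def __enter__(self):
        self.acquire()
        return self

    def __exit__(self, *exc_info):
        self.release()

    def contention_rate(self):
        """Fraction of requests that had to wait (0.0 if never used)."""
        if self.acquisitions == 0:
            return 0.0
        return self.contended / self.acquisitions
```

A contention rate close to 1.0 on a frequently used lock would suggest that the data it protects is a candidate for finer-grained locking.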
For more information on Lockmeter, visit SGI-lockmeter.
Gcov is a test coverage
program that helps you discover where optimization efforts will best
affect your code. Using gcov, one can obtain some basic performance statistics
at the per-source-file level, such as:
- how often each line of code executes
- which lines of code are actually executed
- how much computing time each section of code uses
Gcov is already available for user-level applications. We implemented gcov
support for the Linux kernel by adding coverage support infrastructure
to the kernel and a dynamic module (gcov-prof.o) that produces the basic block
profile information, which gives the statistics for the running kernel
and modules.
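The per-line execution counts that gcov reports can be illustrated with a small Python analogy. This is not how gcov works (gcov relies on counters that the compiler inserts into the program); the sketch uses Python's tracing hook instead, and count_line_executions and count_negatives are hypothetical names chosen for the example.

```python
import sys
from collections import Counter

def count_line_executions(func, *args, **kwargs):
    """Run func and count how often each of its lines executed.

    Returns (result, counts), where counts maps a line offset
    (0 = the "def" line of func) to its number of executions.
    """
    counts = Counter()
    code = func.__code__

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None  # do not trace other functions
        if event == "line":
            counts[frame.f_lineno - code.co_firstlineno] += 1
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.settrace(previous)  # always restore the earlier hook
    return result, dict(counts)

def count_negatives(values):
    negatives = 0
    for v in values:
        if v < 0:
            negatives += 1
    return negatives

# Offset 3 is the "if" line and offset 4 the increment inside it.
result, counts = count_line_executions(count_negatives, [1, -2, 3])
```

Lines that never ran are simply absent from the result, which corresponds to the "which lines of code are actually executed" question above.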
For more information on gcov, visit http://www-es.fernuni-hagen.de/cgi-bin/info2html?(gcc)Gcov
Strace is a very useful tool for analyzing the system call
activity on a system. A variety of options allows for detailed as well
as summarized reports.
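As an illustration of what a summarized report aggregates, the following sketch tallies system calls from strace-style text output. strace can produce such a summary itself with its -c option; the parser here is only a simplified example, the summarize function is a hypothetical helper, and the sample trace is hand-written in the usual "name(args) = result" line format rather than captured from a real run.

```python
import re
from collections import Counter

# One system call per line: name(arguments) = return value [error text]
LINE_RE = re.compile(
    r"^(?P<name>\w+)\(.*\)\s+=\s+(?P<ret>0x[0-9a-fA-F]+|-?\d+|\?)"
)

def summarize(trace_text):
    """Return (calls, errors): per-syscall totals and failure counts."""
    calls, errors = Counter(), Counter()
    for line in trace_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # signal lines, exit markers, unfinished calls
        calls[match["name"]] += 1
        if match["ret"] == "-1":
            errors[match["name"]] += 1
    return calls, errors

# Hand-written sample in strace's output style (illustrative only).
SAMPLE = r'''
open("/etc/passwd", O_RDONLY) = 3
read(3, "root:x:0:0:root:/root:/bin/bash\n"..., 4096) = 1024
read(3, "", 4096) = 0
close(3) = 0
open("/no/such/file", O_RDONLY) = -1 ENOENT (No such file or directory)
+++ exited with 0 +++
'''

calls, errors = summarize(SAMPLE)
```

The sketch ignores the per-process prefixes and timing columns that some strace options add; handling those would need a more permissive pattern.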
For more information on Strace, visit The Strace Homepage.
The Linux Trace Toolkit (LTT) from OperSys is a comprehensive trace facility for
Linux. The IBM LTC RAS team is using it for its prototype logging facility,
and the IBM DProbes tool is coordinated with LTT.
LTT collects trace data from a fixed set of instrumentation points in the
kernel, which can be extended with DProbes. Post-processing is done with a
graphical tool, which does not appear to be widely used for performance
analysis, and LTT lacks post-processing tools for more detailed analysis of
the collected trace data.
For more information on LTT, visit OperSys.
The following are excerpts from the DProbes page at the external LTC site:
Dynamic Probes is a generic and pervasive debugging facility that will
operate under the most extreme software conditions such as debugging a
deep rooted operating system problem in a live environment, for example
in the page-manager of the kernel or perhaps a problem that will not re-create
easily in either a lab or production environment. For such inaccessible
problem scenarios Dynamic Probes not only offers a technique for gathering
diagnostic information but has a high probability of successful outcome
without the need to build custom modules for debugging purposes.
The DProbes facility can be used to insert software probes dynamically
into executing code modules. When a probe is fired, a user written probe-handler
is executed. The probe-handler is a program written in an assembly-like
language, based on the Reverse Polish Notation. Instructions are provided
to enable the probe-handler to access all the hardware registers, system
data structures and memory.
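For readers unfamiliar with Reverse Polish Notation, the sketch below shows how a stack-based handler program is evaluated. It is purely illustrative: the instruction names (push, reg, add, sub, mul) and the register dictionary are invented for this example and are not the DProbes instruction set.

```python
import operator

# Hypothetical arithmetic instructions; not the DProbes instruction set.
BINARY_OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}

def run_rpn(program, registers):
    """Evaluate a list of RPN instructions and return the final stack."""
    stack = []
    for instruction in program:
        op, *operands = instruction.split()
        if op == "push":
            stack.append(int(operands[0], 0))  # accepts decimal or 0x.. hex
        elif op == "reg":
            stack.append(registers[operands[0]])  # read a (simulated) register
        elif op in BINARY_OPS:
            right = stack.pop()
            left = stack.pop()
            stack.append(BINARY_OPS[op](left, right))
        else:
            raise ValueError(f"unknown instruction: {instruction!r}")
    return stack

# Computes eax * 4 + ebx on simulated register values: 3 * 4 + 10.
final_stack = run_rpn(
    ["reg eax", "push 4", "mul", "reg ebx", "add"],
    {"eax": 3, "ebx": 10},
)
```

Operands are pushed first and operators consume them from the stack, so no parentheses or precedence rules are needed; that property is what makes RPN a convenient basis for a small, assembly-like handler language.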
Some of the unique aspects of the Dynamic Probes facility are:
- Probes can be placed almost anywhere.
- Probes can be placed in any executable code, including the kernel, even
  in interrupt handlers, kernel modules etc.
- Read access to all the hardware registers and write access to most of them.
- Read/write access to any area in the virtual address space that is currently
  resident in physical memory.
- Probes placed on an executable program or shared library are active globally
  under the context of all processes executing it.
- Probes can be placed on programs that are being run under a debugger.
- External debugging facilities (e.g. kdb, crash dump) can be triggered from
  a probe handler.
- Probes can be placed on specific types of memory accesses using h/w watchpoints.
You can now use IBM's DProbes with Opersys' Linux Trace Toolkit to provide
a universal (dynamic) tracing capability for Linux. It is universal because
it provides a common tracing mechanism for all executables whether in user
or kernel space. It is dynamic because tracepoints are defined and applied
dynamically to object modules as probepoints using DProbes - no source
code modification is required.
Tools in Development
Kparam (http://enterpriselinux.watson.ibm.com/kparam) is a visual tool for
tunable kernel parameters by Eric Wu from IBM Research. The tool is still in
development but can already be used as is. It provides a GUI with a
hierarchical view of the parameters and lets you change any parameter if you
are logged in as root.
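A hierarchical view of tunable parameters can be sketched as follows. This is not Kparam code, and how Kparam obtains its data is not stated here; the sketch assumes the parameters are the ones Linux exposes as files under /proc/sys, and read_tunables is a hypothetical helper name.

```python
import os

def read_tunables(root="/proc/sys"):
    """Return the parameter files under root as a nested dictionary.

    Directories become nested dicts and files map to their current value
    as a string (None if the file cannot be read by the current user).
    """
    tree = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        # Walk down to the dictionary node that represents this directory.
        node = tree
        relative = os.path.relpath(dirpath, root)
        if relative != ".":
            for part in relative.split(os.sep):
                node = node.setdefault(part, {})
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, errors="replace") as handle:
                    node[name] = handle.read().strip()
            except OSError:
                node[name] = None  # e.g. write-only or restricted entries
    return tree
```

On a Linux system, read_tunables()["vm"] would then hold the virtual memory parameters. Changing a value means writing to the corresponding file, which normally requires root, in line with the restriction mentioned above.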
last updated: 10/01/2002