Linux Kernel Performance
Linux Performance Tools


This page lists a number of performance tools that are used by the LTC Kernel Performance Team. 

Publicly Available Tools


The SGI Kernprof Patch supports a variety of hierarchical profile modes. The most distinctive is the "annotated call-graph" mode. This mode provides kernel data that is analyzed by the gprof program; however, it requires that the kernel be recompiled with a patched version of gcc, with the -pg option added and the -fomit-frame-pointer option removed. The resulting instrumented kernel can therefore perform significantly differently from the base kernel. Nonetheless, the SGI Kernprof patch provides a wealth of information about the time spent in the various kernel routines and the call patterns among those routines.

The gprof output from annotated call-graph mode combines raw profile data (obtained by recording the instruction address at each clock tick) with function-entry information generated by mcount. The time attributed to each function is therefore not an exact measurement (nothing is recorded at function exit) but an approximation based on the data collected by flat profiling.
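The flat-profile approximation can be illustrated with a minimal Python sketch. It is not Kernprof or gprof code: the symbol table, the addresses, and the 10 ms tick are hypothetical values chosen for the example. Each sampled instruction address is mapped to the function that contains it, and the estimated time per function is the sample count multiplied by the tick length.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical symbol table: (start address, function name), sorted by address.
SYMBOLS = [
    (0x1000, "schedule"),
    (0x1400, "do_page_fault"),
    (0x1A00, "sys_read"),
]
END_OF_TEXT = 0x2000  # assumed end of the last function
STARTS = [start for start, _ in SYMBOLS]


def function_for(pc):
    """Map a sampled instruction address to the function containing it."""
    if pc < STARTS[0] or pc >= END_OF_TEXT:
        return None
    return SYMBOLS[bisect_right(STARTS, pc) - 1][1]


def flat_profile(samples, tick_seconds):
    """Estimate time per function as (number of samples) * (tick length)."""
    hits = Counter(function_for(pc) for pc in samples)
    hits.pop(None, None)  # drop samples that fall outside the symbol table
    return {name: count * tick_seconds for name, count in hits.items()}


# One instruction address recorded per clock tick (made-up values).
samples = [0x1004, 0x1010, 0x1404, 0x1A08, 0x1A0C, 0x1A10, 0x1A14]
profile = flat_profile(samples, tick_seconds=0.01)
```

Because only the address at each tick is known, a function that runs briefly between ticks may receive no time at all, which is why the result is an estimate rather than a measurement.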

Kernprof has other interesting modes as well. Among them are some that allow profiling by programming one of the Pentium III performance counters. This can be combined with call-graph profiling (the flat profiling information is collected when the performance counter overflows rather than when the timer ticks) and, of course, with flat profiling. For example, one can cause an interrupt to occur every million instructions or every 100,000 cache-line misses. Just as time-based profiling shows where the kernel spends its time, an instruction-based profile shows where the kernel executes most of its instructions, and a cache-line-based profile shows where the kernel takes most of its cache-line misses. These latter types of profiling can provide additional insight into kernel performance.
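The idea of sampling on counter overflow rather than on a timer tick can be sketched in a few lines of Python. This is a simulation under stated assumptions, not Kernprof code: the execution trace, the function names, and the event counts are invented, and the sample is simply attributed to whichever function is running when the counter reaches the chosen period.

```python
from collections import Counter


def sample_on_overflow(trace, period):
    """Record one sample each time the event counter reaches `period`.

    `trace` is a sequence of (function_name, events) pairs in execution
    order, where `events` is the number of counted events (for example,
    instructions retired) that occurred while that function ran.
    """
    hits = Counter()
    counter = 0
    for name, events in trace:
        counter += events
        while counter >= period:
            hits[name] += 1      # the "interrupt" lands in this function
            counter -= period    # counter restarts after the overflow
    return hits


# Hypothetical trace: instructions executed per function invocation.
trace = [
    ("copy_page", 2_500_000),
    ("schedule", 400_000),
    ("copy_page", 1_200_000),
    ("sys_read", 900_000),
]
hits = sample_on_overflow(trace, period=1_000_000)
```

With a period of one million instructions, functions that execute many instructions collect most of the samples, while a short routine such as the hypothetical `schedule` call above may collect none.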

For more information on kernprof, visit SGI-kernprof.


The SGI Lockmeter Patch provides a facility to measure usage statistics for kernel spin locks. High contention for a particular lock may indicate that a portion of the kernel needs to be rewritten to use more fine-grained locking.
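As a rough illustration of what per-lock usage statistics can look like, the following Python sketch aggregates a list of lock acquisitions into a contention rate per lock. It does not reproduce Lockmeter's data format or output; the event tuples, lock names, and wait times are hypothetical.

```python
from collections import defaultdict


def lock_stats(events):
    """Summarize lock usage.

    `events` is an iterable of (lock_name, wait_microseconds) pairs, one
    per acquisition; a wait of 0 means the lock was free when requested.
    """
    stats = defaultdict(
        lambda: {"acquisitions": 0, "contended": 0, "total_wait_us": 0}
    )
    for lock, wait in events:
        entry = stats[lock]
        entry["acquisitions"] += 1
        if wait > 0:
            entry["contended"] += 1
            entry["total_wait_us"] += wait
    # Add the fraction of acquisitions that had to spin.
    return {
        lock: {**entry, "contention": entry["contended"] / entry["acquisitions"]}
        for lock, entry in stats.items()
    }


# Made-up acquisitions for two illustrative lock names.
events = [
    ("runqueue_lock", 0),
    ("runqueue_lock", 12),
    ("runqueue_lock", 5),
    ("kernel_flag", 0),
]
report = lock_stats(events)
```

A lock whose contention fraction is high and whose total wait time is large is the kind of candidate the paragraph above describes for finer-grained locking.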

For more information on Lockmeter, visit SGI-lockmeter.


Gcov is a test coverage program, which helps discover where your optimization efforts will best affect your code. Using gcov, one can obtain basic performance statistics at the per-source-file level. Gcov is already available for user-level applications. We implemented gcov support for the Linux kernel by adding coverage support infrastructure to the kernel and a dynamic module (gcov-prof.o) that produces the basic block profile information, which gives the statistics for the running kernel and modules.

For more information on gcov, visit 


Strace is a very useful tool for analyzing the system call activity on a system. A variety of options allows for detailed as well as summarized reports.

For more information on Strace, visit The Strace Homepage.


The Linux Trace Toolkit (LTT) from Opersys is a comprehensive trace facility for Linux. The IBM LTC RAS team is using it for its prototype logging facility, and the IBM DProbes tool is coordinated with LTT.

LTT collects trace data from a fixed (but expandable with DProbes) instrumentation of the kernel. Post-processing is done with a graphical tool, which does not appear to be widely used for performance analysis. LTT lacks post-processing tools for more detailed analysis of the collected trace data.

For more information on LTT, visit Opersys.


The following are excerpts from the DProbes page at the external LTC site:

Dynamic Probes is a generic and pervasive debugging facility that will operate under the most extreme software conditions such as debugging a deep rooted operating system problem in a live environment, for example in the page-manager of the kernel or perhaps a problem that will not re-create easily in either a lab or production environment. For such inaccessible problem scenarios Dynamic Probes not only offers a technique for gathering diagnostic information but has a high probability of successful outcome without the need to build custom modules for debugging purposes.

The DProbes facility can be used to insert software probes dynamically into executing code modules. When a probe is fired, a user written probe-handler is executed. The probe-handler is a program written in an assembly-like language, based on the Reverse Polish Notation. Instructions are provided to enable the probe-handler to access all the hardware registers, system data structures and memory.
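To make the idea of a Reverse Polish Notation probe-handler concrete, here is a minimal Python sketch of a stack-based evaluator. It is a generic RPN illustration, not the DProbes language: the instruction names (`push`, `reg`, `add`, `sub`, `mul`) and the register dictionary are hypothetical and chosen only for this example.

```python
def run_rpn(program, registers):
    """Execute a tiny RPN program and return the final stack.

    `program` is a list of instruction strings; `registers` maps
    register names to integer values captured when the probe fired.
    """
    stack = []
    for instr in program:
        op, *args = instr.split()
        if op == "push":
            stack.append(int(args[0], 0))      # literal, decimal or 0x hex
        elif op == "reg":
            stack.append(registers[args[0]])   # read a captured register
        elif op in ("add", "sub", "mul"):
            right = stack.pop()
            left = stack.pop()
            if op == "add":
                stack.append(left + right)
            elif op == "sub":
                stack.append(left - right)
            else:
                stack.append(left * right)
        else:
            raise ValueError(f"unknown instruction: {instr}")
    return stack


# Compute eax * 4 + 0x10 from a hypothetical register snapshot.
result = run_rpn(["reg eax", "push 4", "mul", "push 0x10", "add"], {"eax": 3})
```

Operands are pushed first and each operator consumes the top of the stack, so no parentheses or operator precedence rules are needed; this is what makes an RPN handler language simple to interpret inside a constrained environment.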

Some of the unique aspects of the Dynamic Probes facility are:

You can now use IBM's DProbes with Opersys' Linux Trace Toolkit to provide a universal (dynamic) tracing capability for Linux. It is universal because it provides a common tracing mechanism for all executables, whether in user or kernel space. It is dynamic because tracepoints are defined and applied dynamically to object modules as probepoints using DProbes; no source code modification is required.

Tools in Development


This tool by Eric Wu from IBM Research is available from . The tool is still in development but can certainly be used as is. It provides a GUI with a hierarchical view of the parameters, and lets you change any parameter if you are logged in as root.

last updated: 10/01/2002