Debug and Profiling Guide
From RidgeRun Developer Connection
This document is intended for users of the RidgeRun SDK on any hardware platform who want to learn about the features and usage of the debugging and profiling tools provided with the software, or how to integrate with hardware debugging tools.
This document provides information on configuring your SDK for debugging purposes, along with usage examples. Debugging covers user space Linux applications, and profiling covers complete system profiling (kernel, drivers, applications).
Introduction to debugging with RidgeRun SDK
Whenever there is a software development task, there might be the need to reproduce certain scenarios that are known to make the process fail or stop working.
In most open source operating systems, the most widely used development tools are those released by the Free Software Foundation, including the GNU Compiler Collection (gcc) and the GNU Debugger (gdb).
The RidgeRun SDK provides those utilities as part of the toolchain, and many get installed in the target hardware as specified by the SDK configuration.
The tools are available only in the executing environment, that is, once the Linux kernel has loaded and the system is up and running. Because of this, they won't be available in other stages of the system, for example in the bootloader (rrload) or during kernel initialization.
For debugging early stage system execution, hardware tools such as JTAG emulators should be used. They can emulate and/or execute the embedded processor instructions and provide the software and tools the developer needs to determine what is causing the behavior being inspected.
How to debug different components of the SDK
It isn't possible to debug the rrload bootloader with software tools. The only way to do this is with hardware assisted debugging, using a JTAG tool that allows stepping through the assembler instructions one by one until the error is reproduced.
Some open source bootloaders like u-boot may support gdb remote debugging. In such cases, please contact RidgeRun for further details and instructions.
It is possible to debug the Linux kernel using software tools, particularly KGDB (http://kgdb.linsyssoft.com/).
KGDB is a debugger for the Linux kernel. It requires two interconnected machines. The connection may be an RS-232 interface using a null modem cable or the UDP/IP networking protocol (KGDB over Ethernet, KGDBoE); on ARM processors it is also possible to use the Debug Communications Channel (DCC). KGDB is implemented as a patch to the Linux kernel. The target machine (the one being debugged) runs the patched kernel and the other (host) machine runs gdb. The GDB remote protocol is used between the two machines.
Applications in the RidgeRun SDK use the standard ELF (Executable and Linking Format) file format, which is the most common format used in Linux for executable files. ELF has several characteristics, many of which have a direct impact on how to proceed with debugging.
The ELF format is widely used in most, if not all, UNIX variants, and in many other specialized devices, such as consumer electronic platforms. Such flexibility is possible only because of the design and architecture independence of the ELF file format.
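The architecture independence mentioned above is visible in the first bytes of every ELF file, which record the word size, byte order, file type and target machine. The following is a minimal sketch, not part of the SDK, that decodes those fields; the function names and the small machine table are illustrative choices, and only three machine codes are listed.

```python
import struct

# A few e_machine codes from the ELF specification (illustrative subset).
ELF_MACHINES = {3: "x86", 40: "ARM", 62: "x86-64"}
ELF_TYPES = {1: "relocatable", 2: "executable", 3: "shared object", 4: "core"}


def describe_elf_header(header: bytes) -> dict:
    """Decode the architecture-related fields at the start of an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    bits = {1: 32, 2: 64}.get(header[4])                 # EI_CLASS
    byte_order = {1: "little", 2: "big"}.get(header[5])  # EI_DATA
    if bits is None or byte_order is None:
        raise ValueError("unrecognized ELF class or data encoding")
    # e_type and e_machine are 16-bit fields right after the 16-byte e_ident.
    fmt = "<HH" if byte_order == "little" else ">HH"
    e_type, e_machine = struct.unpack_from(fmt, header, 16)
    return {
        "bits": bits,
        "byte_order": byte_order,
        "type": ELF_TYPES.get(e_type, "other"),
        "machine": ELF_MACHINES.get(e_machine, f"unknown ({e_machine})"),
    }


def describe_elf_file(path: str) -> dict:
    """Read just enough of a file to describe its ELF header."""
    with open(path, "rb") as f:
        return describe_elf_header(f.read(20))
```

Running describe_elf_file() on a binary from the target file system is a quick way to confirm that it was built for the target (for example ARM) and not for the host.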
ELF allows the use of shared objects between binary files. An implication of this is the use of libraries, for example, many of the functions available in the C programming language are actually part of the C Standard Library, of which there are many implementations.
It is common for a desktop computer running Linux to use the GNU C Library (glibc) as the implementation of the C Standard Library. Because glibc was designed to supply every necessary function and to favor compatibility, it is large and sometimes slower than expected.
However, in embedded devices it is often a better choice to use the uClibc implementation, which was designed from the ground up to supply the C Standard Library to microcontrollers and other devices that don't have a Memory Management Unit and that usually provide a very restricted environment in terms of both storage and system memory. Other reduced implementations for embedded devices include dietlibc and newlib.
Most RidgeRun SDKs use the uClibc implementation, which provides a fast, small and efficient alternative to glibc. However, some custom SDKs may use glibc; contact RidgeRun if you have doubts about the specific C library used in your SDK.
Other libraries may also differ from the ones used in desktop systems. One example is the POSIX thread implementation, which is usually part of the C Standard Library and may therefore be different when using uClibc instead of glibc.
There are many factors involved in the debugging process. If an application fails it could be due to several different factors.
For example, if an application takes a lot of time accomplishing a task, there may be a need to use a profiler. The SDK includes OProfile for this purpose. A profiler shows how much of the execution time was spent in each function, shared object or library.
Another situation might be the introduction of a new library to a project, for example a graphics library, which might be the cause of an error. In that case the best choice would be to use ltrace to check whether the calls into that library are the ones causing the error.
If there is strange behavior, like segmentation faults in certain operations, the best choice would be to use strace to see which system calls were made last before the error, and to check whether a system condition external to the application could be the cause.
In the end, if there are many factors and the error occurs in a specific environment, the best alternative would be to use gdb to step through the source code statement by statement until the error is reproduced again.
Hardware assisted debugging with JTAG tools
It is possible to perform symbolic debugging assisted by hardware on ARM devices using JTAG tools. Many third party tools offer Linux-aware debugging solutions that may be adapted to the SDK.
This section details how to perform debugging with common JTAG solutions, and their scope and limitations under different debugging scenarios. Contact RidgeRun if you are interested in more information or support for hardware assisted debugging.
Debugging with Lauterbach T32
RidgeRun has extensive experience using T32 (Lauterbach) JTAG debuggers to instrument and debug bootloader, kernel, applications or modules.
T32 has great support for RTOSs including Linux, and is capable of debugging, instrumenting and profiling the complete system. Contact RidgeRun to get training, application notes and sample scripts for using T32 with your target hardware (RidgeRun T32 scripts for the target hardware are a good complement to the T32 RTOS debugging documentation).
Embedded Trace Macrocell Support
ARM targets that support ETM technology may be used along with debugging tools like T32 for performance analysis or application tracing. Some System-on-Chip devices include Embedded Trace Buffers (ETB) integrated with the ETM interface, giving a lower cost option for ETM debugging support.
When using T32 and a proper ETM license, the ETB functionality from T32 can be used on platforms that support it to debug and trace Linux applications.
For more information go to Getting Started Guide for Lauterbach.
Debugging with Texas Instruments Code Composer Studio (CCS)
CCS is a powerful IDE for DSP development, but it also includes support for the ARM cores of TI's SoCs. The DSP debugging and tracing functions are outside the scope of this document.
CCS supports the following debug features:
- Advanced ARM control: MMU table listing, Exception Vector traps, SW/HW breakpoints
- Symbolic Assembly or Source Code Debugging
How to perform symbolic debugging with CCS
Symbolic debugging with CCS requires the following steps:
- Compile your code with debugging support enabled and code optimizations disabled (see section Requirements for debugging support. For kernel debugging see Kernel debugging techniques).
- The SDK generates executable files in ELF format; CCS supports loading code and symbols from ELF files. The following table lists the typical image locations:
|bootloader|| bootloader/<bootloader version>/src/|
|Linux kernel||kernel/$(KERNEL)/vmlinux|
|Linux applications||The application executable itself|
- Connect CCS to the target hardware (consult RidgeRun if you need specific instructions to set up your hardware with CCS). CCS can only connect to the target by resetting it; it doesn't support attaching to the board without producing a reset.
- Load the symbols using the menus: File -> Load Symbols -> Load Symbols Only... By default CCS only looks for .out and .sym files, but you may change the filter settings to select the ELF image file that you are using. This step requires that the ELF file was compiled with debug symbols enabled.
- CCS will report that it can't find the source files and ask for a location to find them. This happens because the debugging information stores path locations in Unix format (which are invalid on Windows). You will need to point CCS to the right file location for the code that is being debugged.
Debugging with low-cost OpenOCD based JTAG solutions
The Open On-Chip Debugger (openocd) aims to provide debugging, in-system programming and boundary-scan testing for embedded target devices. The targets are interfaced using JTAG (IEEE 1149.1) compliant hardware, but this may be extended to other connection types in the future.
Openocd currently supports Wiggler (clones), FTDI FT2232 based JTAG interfaces, the Amontec JTAG Accelerator, and the Gateworks GW1602. It allows ARM7 (ARM7TDMI and ARM720t), ARM9 (ARM920t, ARM922t, ARM926ej-s, ARM966e-s), XScale (PXA25x, IXP42x) and Cortex-M3 (Luminary Stellaris LM3 and ST STM32) based cores to be debugged.
For more information visit: http://openfacts.berlios.de/index-en.phtml?title=Open_On-Chip_Debugger
OpenOCD is available on Ubuntu distributions from 7.10 onward, but can be compiled for other host systems.
OpenOCD systems don't provide debugging solutions as powerful as other JTAG tools, but they provide an effective low-cost JTAG solution for big developer teams or manufacturing support. RidgeRun can provide application notes and scripts for using OpenOCD with your particular hardware; contact RidgeRun for specific information on your board.
Software debugging with GDB
GDB, the GNU Project debugger, allows you to see what is going on `inside' another program while it executes -- or what another program was doing at the moment it crashed.
GDB supports four main features to help you catch bugs:
- Start your program, specifying anything that might affect its behavior.
- Make your program stop on specified conditions.
- Examine what has happened, when your program has stopped.
- Change things in your program, so you can experiment with correcting the effects of one bug and go on to learn about another.
GDB also permits remote debugging by running a program (gdbserver) on the remote target, making it suitable for debugging embedded systems.
For more information on GDB visit:
The RidgeRun SDK includes a version of GDB that runs on the host machine as well as a version of gdbserver for the target hardware.
Requirements for debugging support
To obtain the most precise results while debugging, it is necessary to pass a few flags to the compiler that tell it to annotate the binary file with information about the corresponding source file, and to turn off all unnecessary optimization flags. This keeps the generated binary files free from changes introduced by the compiler's optimizations.
Set the SDK toolchain tools in the PATH
Embedded developers often have several different toolchains installed on their machines. Therefore the SDK doesn't rely on finding the tools on the system PATH, as it may mistakenly use the wrong tools. Instead it sets the PATH during the build process according to the information stored in the SDK configuration system (which includes the path to the toolchain tools).
When performing debugging operations it may be useful to have the toolchain tools on the system PATH, so the SDK provides a method to export the right tools quickly on your console session:
$ cd <RidgeRunSDK Devdir>
$ `make env`
The env target of the makefile prints the shell exports that set the DEVDIR and PATH variables to work with the SDK, and the enclosing backquotes make the shell execute those commands. The DEVDIR variable is useful for running make commands inside subdirectories of the SDK.
Produce debugging information (GCC flag -g)
The flag required to produce debugging information is “-g”. It adds references to the source code, such as line numbers and function names, which allow the developer to set breakpoints and step between them when trying to pinpoint errors.
To add this flag, run “make config” and go to:
RidgeRun SDK Configuration --> Toolchain configurations --> Toolchain optimize flags
Press ENTER to select the option. A text field will appear; add the string “-g” to it.
Size Optimization (GCC flag -Os)
One optimization flag that should be removed altogether is the “optimize for size” flag. It is enabled by default because it produces smaller binary files, which are desirable in embedded applications due to the limited storage space. When debugging, however, optimized binary code will not match the source code, making debugging more difficult.
To remove this flag, run “make config” and go to:
RidgeRun SDK Configuration --> Toolchain configurations --> Toolchain optimize flags
A text field will appear with the option enabled by default. Move to the “-Os” string and delete it.
Keep the frame pointer (remove GCC flag -fomit-frame-pointer)
Keep the frame pointer in a register even for functions that don’t need one. This is required to make proper debugging possible.
To verify that this flag is not set, run “make config” and go to:
RidgeRun SDK Configuration --> Toolchain configurations --> Toolchain optimize flags
Press ENTER to select the option. A text field will appear; make sure the string “-fomit-frame-pointer” is not present in the field.
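The three flag changes described above (add -g, remove -Os, remove -fomit-frame-pointer) can be summarized as one transformation of the flag string. The sketch below is only an illustration of that rule; the function name is hypothetical and the SDK itself stores the flags through the “make config” text field, not through Python.

```python
def debug_friendly_flags(flags: str) -> str:
    """Return a compiler flag string adjusted for debugging.

    -Os and -fomit-frame-pointer are dropped, and -g is appended
    if it is not already present. Other flags are kept in order.
    """
    unwanted = {"-Os", "-fomit-frame-pointer"}
    kept = [flag for flag in flags.split() if flag not in unwanted]
    if "-g" not in kept:
        kept.append("-g")
    return " ".join(kept)


if __name__ == "__main__":
    print(debug_friendly_flags("-Os -fomit-frame-pointer -Wall"))
```

The result is the string that would be typed into the Toolchain optimize flags field before rebuilding.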
Don't strip symbols from binaries (SDK Option)
This option strips the binaries (executables and shared object files) of many symbols that aren't needed for execution, including debugging information. By default this option is disabled. It should be used mostly in the late stages of development and quality assurance to guarantee a small file system footprint.
To check the status of this option:
RidgeRun SDK Configuration
File System Configuration --->
[ ] Strip the target file system binaries (may render debug useless)
(Check that the option is not marked)
Using gdb and gdbserver for cross debugging
Gdbserver is a program that allows you to run GDB on a different machine than the one which is running the program being debugged.
In this particular situation, we will run gdbserver in the target hardware, then we will connect gdb running in the host to that gdbserver instance and start the debugging process.
First, gdbserver needs to be installed on your target. On the configuration screen (make config) go to File System Configuration and select the option Install GDB server on target file system.
Gdbserver takes three arguments: the host IP address, a listening port, and the program to be debugged with its own arguments. The command line format is:
gdbserver host_address:listening_port command [arguments]
We will open listening port 2345 on the target, and the program to be executed is named “sleeptest”, which is in the same directory and requires no arguments. The example command line and its output will be:
gdbserver :2345 sleeptest
Process ./sleeptest created; pid = 251
Listening on port 2345
If you are trying to debug a process that is already running, you can attach using its Process ID (PID):
ps
gdbserver :2345 --attach $PROCESS_ID_TO_DEBUG
where $PROCESS_ID_TO_DEBUG is the number you obtained by running ps.
Example output is:
gdbserver :2345 --attach 24601
Attached; pid = 24601
Listening on port 2345
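The two invocation forms shown above (launching a program, or attaching to a running PID) differ only in their trailing arguments. As a minimal sketch, the hypothetical helper below builds the argument list for either form; it does not run gdbserver and is not part of the SDK.

```python
def gdbserver_argv(port, program=None, program_args=(), attach_pid=None, host=""):
    """Build the gdbserver argument vector for launch or attach mode.

    Exactly one of `program` or `attach_pid` must be given. An empty
    host yields the ":port" form used in the examples above.
    """
    if (program is None) == (attach_pid is None):
        raise ValueError("give exactly one of program or attach_pid")
    comm = f"{host}:{port}"
    if attach_pid is not None:
        return ["gdbserver", comm, "--attach", str(attach_pid)]
    return ["gdbserver", comm, program, *program_args]


if __name__ == "__main__":
    print(" ".join(gdbserver_argv(2345, program="./sleeptest")))
    print(" ".join(gdbserver_argv(2345, attach_pid=24601)))
```

Keeping the arguments as a list avoids quoting problems if the command is later handed to a process launcher instead of being typed by hand.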
By itself, gdbserver isn't very functional. The debugging process carried out by gdbserver is driven by gdb running on the host computer.
Setting up the host gdb
We need a gdb program that understands how to debug ARM code. Because of this, a stock gdb (as provided by any Linux distribution's package system) can't be used.
The SDK provides an appropriate gdb in the toolchain, called arm-linux-gnueabi-gdb in the case of ARM platforms. For documentation purposes we will refer to this command with just “gdb”.
After starting, gdb presents a console with the command prompt (gdb). This command line interface is where the debugging will take place.
First, we must tell gdb where to look for the shared objects and libraries. This is necessary because the target system will be using those libraries when executing the program being debugged.
(gdb) set solib-absolute-prefix <path to devdir>/fs/fs
After setting that path, we need to point the host gdb to the full path of the binary file being debugged, so that gdb can load the symbols and other annotations added at compile time:
(gdb) file <path to file>
As a tip, gdb supports auto completion of pathnames just as the shell does: start writing the path and press TAB to complete it automatically if there is a match available. After this, we need to connect to the target hardware with the following command:
(gdb) target remote <ip address:port number>
$ cd <RidgeRunSDK>
$ `make env`
$ arm-linux-gdb
GNU gdb 184.108.40.20680318
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "--host=x86_64-pc-linux-gnu --target=arm-linux-uclibcgnueabi".
(gdb) set solib-absolute-prefix ./fs/fs
(gdb) file ./fs/fs/examples/simple_threads
Reading symbols from ./fs/fs/examples/simple_threads...done.
(gdb) target remote 192.168.200.245:2345
Remote debugging using 192.168.200.245:2345
[New Thread 267]
0x40003c58 in _start () from ./lib/ld-uClibc.so.0
After connecting to the hardware you will be able to use all the standard debugging commands of gdb, including multithreaded debugging.
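The three gdb commands used in this session are always the same apart from the paths and the target address, so they can be generated into a command file. This is a minimal sketch under stated assumptions: the helper name is hypothetical, the development directory in the example is a placeholder, and the program path is taken relative to the <devdir>/fs/fs directory used above.

```python
def gdb_remote_script(devdir, program, target_ip, port=2345):
    """Return the gdb commands for a remote session as one string.

    devdir    -- SDK development directory on the host (assumed layout: devdir/fs/fs)
    program   -- path of the binary relative to the target root file system
    target_ip -- address where gdbserver is listening
    """
    sysroot = f"{devdir}/fs/fs"
    commands = [
        f"set solib-absolute-prefix {sysroot}",
        f"file {sysroot}/{program}",
        f"target remote {target_ip}:{port}",
    ]
    return "\n".join(commands) + "\n"


if __name__ == "__main__":
    # "/home/user/sdk" is a placeholder, not a path defined by the SDK.
    print(gdb_remote_script("/home/user/sdk", "examples/simple_threads",
                            "192.168.200.245"), end="")
```

The generated text can be saved to a file and passed to gdb with its -x option, which reads commands from a file at startup.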
For more information on gdb command line usage please visit:
Integration of gdb with ddd graphical debugger
If you would like to use a graphical front end with gdb, there are several compatible front ends available. One popular, simple front end is the ddd utility (http://www.gnu.org/software/ddd). Below is how you would invoke gdb using the ddd graphical front end:
$ ddd --debugger arm-linux-gdb
Generating and reading core dump files
When a process or application crashes, a corresponding signal is generated. Each signal has a current disposition, which determines how the process behaves when the signal is delivered to it. Depending on the signal, the default action is to terminate the process, terminate it and generate a core dump, stop the process, or continue the process (if stopped).
As mentioned, the default action of certain Linux kernel signals is to cause a process to terminate and produce a core dump file, a disk file containing an image of the process's memory at the time of termination.
The generated dump file can be read using gdb on the host side. Among other debugging tasks, the user can check the state of the processor's registers at the moment of the crash; a backtrace of the functions that were executing at the time of the crash is also available. All this information is very handy when debugging applications or processes that do not provide debugging information at runtime or that crash with no further information.
The procedure to enable core dumping on the Linux kernel is simple, and is mostly limited to setting the maximum size of the dump file.
By default, the core file is called core and is created in the current directory. There are several reasons for a core file not to be generated or to be empty:
- The process does not have permission to write the core file. Writing the core file will fail if the directory in which it is to be created is non-writable.
- The directory in which the core dump file is to be created does not exist.
- The RLIMIT_CORE or RLIMIT_FSIZE resource limits for the process are set to zero (this can be verified through the getrlimit() function).
- The process is executing a set-user-ID (set-group-ID) program that is owned by a user (group) other than the real user (group) ID of the process (i.e. a process that belongs to root).
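The resource limits named in the list above can be inspected from a script as well as from C. The following is a minimal sketch using Python's standard resource module, which wraps getrlimit(); it assumes a Python interpreter is available on the machine being checked, which is not necessarily the case on a small target file system. The helper names are illustrative.

```python
import resource


def describe_limit(soft_limit):
    """Turn a soft resource limit (in bytes) into a short description."""
    if soft_limit == 0:
        return "disabled"
    if soft_limit == resource.RLIM_INFINITY:
        return "unlimited"
    return f"limited to {soft_limit} bytes"


def core_dump_limits():
    """Report the soft RLIMIT_CORE and RLIMIT_FSIZE values of this process."""
    report = {}
    for name in ("RLIMIT_CORE", "RLIMIT_FSIZE"):
        soft, _hard = resource.getrlimit(getattr(resource, name))
        report[name] = describe_limit(soft)
    return report


if __name__ == "__main__":
    for name, text in core_dump_limits().items():
        print(f"{name}: {text}")
```

Limits are inherited by child processes, so the values reported are those of the shell session the script was started from. Note that getrlimit() reports bytes, while the shell's ulimit -c works in blocks.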
Generating the core dump file (TARGET side)
- Once the target has booted successfully, set the dump file size to a value different from zero (it can also be set to unlimited, but this is not recommended).
/ # ulimit -c 10000
/ # ulimit -a
time(seconds) unlimited
file(blocks) unlimited
data(kb) unlimited
stack(kb) 8192
coredump(blocks) 10000
memory(kb) unlimited
locked memory(kb) unlimited
process 944
nofiles 1024
vmemory(kb) unlimited
locks unlimited
- Proceed to create a temporary directory to store the dump files with open privileges.
# mkdir -m 777 /tmp/dumps
- Set the core file name. By default, a core dump file is named core, but the /proc/sys/kernel/core_pattern file can be set to define a template that is used to name core dump files. The basic specifiers for the dump name are:
%% A single % character.
%p PID of dumped process.
%u Real UID of dumped process.
%g Real GID of dumped process.
%s Number of signal causing dump
%t Time of dump (seconds since the Epoch)
%e Executable filename
The following command line has two objectives:
- set the /tmp/dumps as the destination path for the dump files.
- set the dump file's name as <name_of_the_application>.core
/ # echo "/tmp/dumps/%e.core" > /proc/sys/kernel/core_pattern
- Run the application. The crashing application/process will generate output similar to the following (using the “Hello World” example included in RidgeRun's user application examples, in a case where a Segmentation Fault is generated).
# ./hello
Segmentation fault (core dumped)
- Before running gdb on the host side, place both the application's binary file and the generated dump file in the same directory, which must be accessible from the host (a suitable directory would be the target's /opt directory if using an NFS file system).
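To predict where a dump will be written, it can help to expand the core_pattern template from the steps above by hand. The sketch below models only the specifiers listed in this section; the function name and the sample values are illustrative. Unknown specifiers and a trailing % are dropped here, which is the behavior the core(5) manual page describes, though this sketch is not the kernel's implementation.

```python
import re


def expand_core_pattern(pattern, *, pid, uid, gid, signal, time, exe):
    """Expand a core_pattern template using the basic specifiers."""
    values = {
        "%": "%",      # %% -> a single % character
        "p": pid,      # PID of dumped process
        "u": uid,      # real UID
        "g": gid,      # real GID
        "s": signal,   # number of signal causing the dump
        "t": time,     # time of dump, seconds since the Epoch
        "e": exe,      # executable filename
    }
    # "%(.?)" also matches a lone "%" at the end of the template.
    return re.sub(r"%(.?)", lambda m: str(values.get(m.group(1), "")), pattern)


if __name__ == "__main__":
    # Sample values are made up for illustration.
    print(expand_core_pattern("/tmp/dumps/%e.core", pid=366, uid=0, gid=0,
                              signal=11, time=0, exe="hello"))
```

With the template set in the steps above, the hello example is therefore expected to produce /tmp/dumps/hello.core.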
Running gdb on host side
Once the dump file has been generated, gdb can be executed on the host side.
- Compile the application with the -ggdb flag for debugging purposes.
- Go to the directory where both the application's binary and the dump file are located and run gdb (in this example an NFS file system is used and both files are in /opt):
<DEVDIR>/fs/fs/opt$ arm-linux-gnueabi-gdb
GNU gdb 220.127.116.1180318
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "--host=i686-linux --target=arm-linux-gnueabi".
(gdb)
- Load the application as the program to be debugged
(gdb) file hello
Reading symbols from <DEVDIR>/fs/fs/opt/hello...done.
- Load dump's memory and processor's registers data
(gdb) core hello.core
Core was generated by `./hello'.
Program terminated with signal 11, Segmentation fault.
[New process 366]
#0 0x000083a8 in ?? ()
- Display the backtrace of the stack
(gdb) info stack
#0 main (argc=1, argv=0xbed38eb4, envp=0xbed38ebc) at hello.c:41
- List the processor's registers and the content of these.
(gdb) info all-registers
r0 0x1 1
r1 0xbed38eb4 3201535668
r2 0xbed38ebc 3201535676
r3 0x1072c 67372
r4 0x0 0
r5 0x1 1
r6 0x0 0
r7 0x0 0
r8 0x0 0
r9 0x0 0
r10 0x40024000 1073889280
r11 0x0 0
r12 0x0 0
sp 0xbed38d50 0xbed38d50
lr 0x400399cc 1073977804
pc 0x83a8 0x83a8 <main+24>
f0 0 (raw 0x000000000000000000000000)
f1 0 (raw 0x000000000000000000000000)
f2 0 (raw 0x000000000000000000000000)
f3 0 (raw 0x000000000000000000000000)
f4 0 (raw 0x000000000000000000000000)
f5 0 (raw 0x000000000000000000000000)
f6 0 (raw 0x000000000000000000000000)
f7 0 (raw 0x000000000000000000000000)
fps 0x0 0
cpsr 0x60000010 1610612752
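The interactive steps above can also be run unattended, which is convenient when several dumps need the same first look. The sketch below only builds the command line; it relies on gdb's -batch and -ex options and on gdb accepting the program and the core file as positional arguments. The helper name is hypothetical, and the gdb binary name is the one used in this section.

```python
def core_report_argv(gdb, program, core_file):
    """Build a gdb command line that prints the stack and registers of a core dump."""
    return [
        gdb,
        "-batch",                      # exit after running the commands
        "-ex", "info stack",           # backtrace at the time of the crash
        "-ex", "info all-registers",   # processor register contents
        program,                       # binary that provides the symbols
        core_file,                     # dump generated on the target
    ]


if __name__ == "__main__":
    print(" ".join(core_report_argv("arm-linux-gnueabi-gdb", "hello", "hello.core")))
```

The resulting list can be handed to a process launcher such as subprocess.run(), started from the directory that holds both files.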
Eclipse Integrated Development Environment (IDE)
If you would like to use the Eclipse integrated development environment for application development, it is easiest to start with an existing, mostly empty, SDK application directory that builds properly.
For the debugging description below, it is assumed you are using a root NFS mount, so that rebuilt code is immediately available on the target. The steps are similar for other configurations, but you need to copy the updated executable to the target hardware each time you rebuild the application.
Create SDK application directory
For the description below, DEVDIR is set to your development directory path and the name of the application being created is ticker. Run the following commands to get an SDK application directory populated.
cd $DEVDIR/myapps
mkdir ticker
cp hello/Config hello/Makefile ticker
cp hello/hello.c ticker/ticker.c
Edit the files in the ticker directory to rename from hello to ticker. In the Makefile, add the following line after the include statement, if it is not already there:
CFLAGS += $(EXTRA_CFLAGS)
Verify the application builds correctly by issuing the make command.
cd $DEVDIR/myapps/ticker
make
Create eclipse project
Launch Eclipse and specify the DEVDIR directory as the Eclipse workspace.
Create the ticker project as follows (pressing Next and changing tabs as needed):
File -> New -> Project -> C -> Standard Make C Project <Next>
Project name: ticker
Location: <ticker directory>
<Next>
Make Builder
Build (Incremental Build): build install EXTRA_CFLAGS+=-g \
EXTRA_CFLAGS+=-O0 EXTRA_CFLAGS+=-fno-omit-frame-pointer
<Finish>
Note: the second EXTRA_CFLAGS above is capital letter 'O' followed by a zero '0'.
Verify you can build the application from within Eclipse by issuing
Project -> Build All
Create debug configuration
To get remote gdb to work correctly, you need to create a debug configuration using the following steps:
- Create a gdbinit file in the base directory of your SDK with the following contents:
set solib-absolute-prefix <path to devdir>/fs/fs
This instructs gdb to look for shared libraries in the host's copy of the target's root file system.
- Then configure Eclipse as follows:
Run -> Debug...
Create, manage, and run configurations
C/C++ Local Application <double click>
Name: ticker remote debug
Project: ticker <Browse...>
C/C++ Application: ticker <Search Project...>
Debugger
Debugger: gdbserver Debugger
Debugger Options
Main
GDB debugger: /opt/RidgeRun/arm-eabi-uclibc/bin/arm-linux-gdb
GDB Command set: Standard (Linux)
GDB command file: <Select the previously generated gdbinit file>
Verbose console mode: do not check unless you want to debug the gdb connection; this option slows down the responsiveness of the debugger
Connection
TCP
IP address: <target IP address>
Port number: 2345
The host path to GDB depends on the toolchain you have installed. You can check in:
As described previously, start gdbserver on the target hardware:
cd /examples # or wherever your application is installed
gdbserver host_address:2345 ./ticker
Connect to gdbserver
All that is left is to press the Debug button to start debugging.
If everything worked correctly, you will see the ticker.c file displayed with a blue arrow pointing to main(), indicating that the program counter hit a breakpoint at main(). You can now debug.
If you see the error “Target selection failed”, it means the networking parameters on the gdbserver command line or in the Eclipse debug configuration are incorrect. Try to establish a telnet session to make sure the host and target can exchange TCP packets.
If something isn't working, inspect the gdb console window (enabling the verbose console mode is useful to get more information on the problem). After you get remote debugging working correctly, you can disable “Verbose console mode” in the debug configuration to turn off gdb console output.
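When a telnet client is not at hand, the same TCP reachability check can be made with a few lines of script on the host. This is a minimal sketch using Python's standard socket module; the function name is illustrative, and the address and port in the example are assumptions (port 23 presumes a telnet daemon is running on the target).

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False


if __name__ == "__main__":
    # Example values only: replace with your target's address.
    print(can_connect("192.168.200.245", 23))
```

Probing the gdbserver port itself counts as a client connection, so gdbserver may need to be restarted afterwards; checking another open port on the target avoids that.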
If the file system is built without stripping the debugging information (see previous sections), the debugger should be able to load information about the functions in the system libraries (like libc or libpthread). To debug symbolically inside such libraries, however, the toolchain source code must be installed. The SDK installer provides an option to install the toolchain source code.
In any case, do not hesitate to contact RidgeRun for support.
It is often useful to trace what an application is doing, but on operating systems with dynamic library loaders and plenty of system calls, getting a good trace beyond the code of the application itself may be difficult.
RidgeRun SDK includes two open source tools that help to perform system and library tracing for applications: strace and ltrace.
System call tracing with strace
Strace is a debugging utility in Linux to monitor the system calls used by a program and all the signals it receives. A system call is the mechanism used by an application program to request service from the operating system.
This utility can be very helpful when debugging applications that deal with many open files (in Unix terminology, a device, a socket, a pipe and a FIFO queue file are all treated as open files by the kernel). As a tip, the lsof utility prints the list of open files.
For example, if an application stops working suddenly, it is possible to execute it again with strace to check if it is stalled on opening a file or expecting user input.
The most common system calls are open, read, write, close, wait, exec, fork, exit, and kill. Besides those, Linux has hundreds of system calls.
The SDK includes strace under the applications and can be enabled by running “make config”, under:
RidgeRun SDK Configuration
File System Configuration
SDK Applications --->
[*] strace-4.5.15
To use this, try the following command line example:
# strace <application> [<application parameters>]
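A long strace log is easier to read after a first summary of which system calls appear and how often. The sketch below counts call names in text saved with strace's -o option. It assumes the usual "name(arguments) = result" line layout; the function name is illustrative, and the sample lines are hand-written in that layout, not captured from a real run.

```python
import re
from collections import Counter

# Optional "[pid N]" or bare PID prefix, then the call name and "(".
SYSCALL_LINE = re.compile(r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?(\w+)\(")


def count_syscalls(lines):
    """Count system call names in strace output lines.

    Lines that do not start with a call, such as signal reports
    ("--- SIG... ---") or exit notices ("+++ ... +++"), are ignored.
    """
    counts = Counter()
    for line in lines:
        match = SYSCALL_LINE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # Hand-written sample in strace's usual format (illustrative only).
    sample = [
        'open("/etc/hosts", O_RDONLY) = 3',
        'read(3, "", 4096) = 0',
        'close(3) = 0',
        'open("/missing", O_RDONLY) = -1 ENOENT (No such file or directory)',
    ]
    for name, total in count_syscalls(sample).most_common():
        print(name, total)
```

A call that dominates the count, or the last call before the log ends, is usually a good place to start looking when an application stalls.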
Library tracing with ltrace
ltrace is a debugging utility in Linux to monitor the library calls used by a program and all the signals it receives.
Just like strace, ltrace is used to run an application to check calls, but instead of system calls, ltrace tracks all the library calls.
For example, the header file string.h declares many of the most commonly used library calls in Linux programming, among them memcpy(), memset(), strcmp(), and strcpy(). The stdio.h header provides many other common library calls, particularly printf(), fopen() and fclose(). Calls to these functions, the parameters being used and the function return values may be monitored with ltrace.
The SDK includes ltrace under the applications and can be enabled by running “make config”, under:
RidgeRun SDK Configuration
File System Configuration
SDK Applications --->
[*] ltrace-0.5
To use this, try the following command line example:
# ltrace ./application
Linux debugging techniques
Debugging Linux applications or the kernel requires more than the right tools for the right job; it also requires a basic understanding of the concepts behind the subsystem under test. Linux comprises several different subsystems and technologies that may produce complex behaviors during debugging.
The best way to approach debugging when observing odd behaviors is to review reference documentation on the related topics. For example, if an application relies heavily on threads, it's worth understanding how different function calls behave in multi-threaded applications (signal delivery, priority handling, etc).
Several references explain Linux internals in depth. Some recommendations are:
- Understanding the Linux Kernel, Bovet & Cesati, O'Reilly
- Linux Kernel Development, Robert Love, Novell Press
- Linux Device Drivers, Rubini & Corbet, O'Reilly
- Embedded Linux System Design and Development, P. Raghavan & Co., Auerbach Publications
RidgeRun provides support services with experts who have extensive knowledge of Linux internals and can help you quickly identify how best to approach the debugging of your product.
This section provides common procedures that may prove useful when debugging problems.
Kernel debugging techniques
The Linux kernel provides extensive options to debug and instrument different system functionality. To enable kernel debugging use the configuration menu of the SDK:
$ make config
Kernel Configuration --> Kernel Hacking --> [ x ] Kernel Debugging
Two of the more relevant options here are:
- Verbose kernel error messages: provides detailed information and a kernel stack trace on kernel errors.
- Compile kernel with debug info: useful when performing symbolic debugging, as it allows debuggers to associate the assembly code with the source code.
Rebuilding and reloading the kernel is enough for the changes to take effect.
How to interpret a kernel crash log
When the kernel produces an error and the option “Verbose kernel error messages” is enabled (see the previous section), it will produce a dump of the registers as well as of the stack trace that led to the crash.
The best way to interpret this data is to translate the addresses in the crash log into function names. There are two possible ways to perform the translation:
- The file $DEVDIR/kernel/linux-<version>/System.map provides a readable table to translate from kernel address ranges to kernel symbols.
- If you want to trace in the assembly code the instructions that lead to a certain problem, you may use the objdump tool to disassemble the kernel image and look for the problem. Follow these steps:
$ `make env` # This puts the toolchain on your PATH environment variable
$ arm-linux-objdump -d -S $DEVDIR/kernel/linux-<version>/vmlinux
The last command assumes that your toolchain prefix is arm-linux- (which may change depending on your platform); replace <version> with your kernel version. The “-d” option disassembles the binary image, and “-S” interleaves the source code if the kernel was built with debugging information (see the previous section).
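The System.map lookup described above can be sketched as a small shell helper. This is an illustrative sketch only: the function name ksym_lookup is hypothetical, and it assumes that System.map is sorted by address with fixed-width lowercase hexadecimal addresses in the first column and the symbol name in the third, which is how the kernel build normally generates it.

```shell
#!/bin/sh
# ksym_lookup MAP ADDR: print the last symbol in MAP whose address is
# not greater than ADDR. (Hypothetical helper, for illustration.)
ksym_lookup() {
    # Strip an optional 0x prefix and lowercase the hex digits.
    addr=$(printf '%s' "$2" | sed 's/^0[xX]//' | tr 'A-F' 'a-f')
    awk -v addr="$addr" '
        # Pad the address to the width used in the map file.
        NR == 1 { while (length(addr) < length($1)) addr = "0" addr }
        # Appending "" forces a text comparison; fixed-width lowercase
        # hex sorts the same way as text and as numbers.
        ($1 "") <= (addr "") { sym = $3; base = $1 }
        ($1 "") >  (addr "") { exit }
        END { if (sym != "") print sym " (starts at 0x" base ")" }
    ' "$1"
}
```

For example, "ksym_lookup $DEVDIR/kernel/linux-<version>/System.map 0xc0023010" would print the name of the function that contains that address, provided the address belongs to the kernel image.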
How to interpret memory addresses
Refer to the following table to better understand the memory addresses that you see in the logs and how to debug them properly:
|Address||Used by||How to translate it|
|0x0 – 0xBEFF:FFFF||Current user space application||Identify the PID of the running process and review the contents of /proc/<PID>/maps while the process is running.|
|0xBF00:0000 – 0xBFFF:FFFF||Kernel modules||Review the contents of /proc/modules.|
|0xC000:0000 – 0xC000:0000 + <size of your RAM>||Kernel virtual logical addresses||Look at kernel/<kernel version>/System.map.|
|0xC000:0000 + <size of your RAM> and above||Kernel virtual addresses||May require a complex procedure, as this is non-contiguous physical memory built with vmalloc.|
For further information, review Chapter 8, “Process Address Space”, of Understanding the Linux Kernel, 2nd Edition, or contact RidgeRun support.
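As a quick aid, the ranges in the table can be encoded in a small helper. This is a sketch under stated assumptions: the function name classify_addr is hypothetical, the RAM size is supplied by the caller, and the script is meant to run on the development host, where shell arithmetic is normally 64 bits wide (a 32-bit shell on the target may overflow on these values).

```shell
#!/bin/sh
# classify_addr ADDR RAM_BYTES: name the region of the table that holds ADDR.
# (Hypothetical helper; ADDR may be given as 0x-prefixed hex.)
classify_addr() {
    addr=$(( $1 ))
    ram=$(( $2 ))
    lowmem_end=$(( 0xC0000000 + ram ))   # end of the directly mapped RAM
    if [ "$addr" -lt $(( 0xBF000000 )) ]; then
        echo "user space: check /proc/<PID>/maps"
    elif [ "$addr" -lt $(( 0xC0000000 )) ]; then
        echo "kernel module: check /proc/modules"
    elif [ "$addr" -lt "$lowmem_end" ]; then
        echo "kernel logical address: check System.map"
    else
        echo "kernel virtual (vmalloc) address"
    fi
}
```

For a board with 64MB of RAM, "classify_addr 0xC0023000 $((64 * 1024 * 1024))" reports a kernel logical address, so System.map is the place to look.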
How to debug “Segmentation Fault” problems
A common problem developers encounter when developing Linux applications is that an application dies with a “Segmentation Fault” message. This error is produced when the application accesses an illegal memory area, that is, a memory area that doesn't belong to the process.
This kind of error is typically produced by faulty code, and can be difficult to locate in complex applications. This section explains techniques to locate the source of the problem.
Enabling verbose user crash messages
The first step to debug the problem is to enable verbose user crash messages at the kernel level. This is done by following this procedure:
$ make config
Kernel Configuration --> Kernel Hacking --> [ x ] Verbose user fault messages
This enables verbose error message generation from the kernel, but it requires one more step: the option “user_debug=N” must be passed to the kernel on its command line from the bootloader, where N is the sum of:
- 1 - undefined instruction events
- 2 - system calls
- 4 - invalid data aborts
- 8 - SIGSEGV faults
- 16 - SIGBUS faults
So, for example, to see detailed information about all of the previous errors, use the parameter “user_debug=31”.
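The sum can be checked with a few lines of shell. This is only an illustrative sketch: the function name user_debug_value and the short flag names (undef, syscall, abort, segv, bus) are hypothetical labels for the five values listed above, not kernel options.

```shell
#!/bin/sh
# user_debug_value FLAG...: build the user_debug= kernel parameter from the
# values listed above. (Flag names are hypothetical labels.)
user_debug_value() {
    total=0
    for flag in "$@"; do
        case $flag in
            undef)   bit=1 ;;    # undefined instruction events
            syscall) bit=2 ;;    # system calls
            abort)   bit=4 ;;    # invalid data aborts
            segv)    bit=8 ;;    # SIGSEGV faults
            bus)     bit=16 ;;   # SIGBUS faults
            *) echo "unknown flag: $flag" >&2; return 1 ;;
        esac
        total=$(( total | bit ))
    done
    echo "user_debug=$total"
}
```

For instance, "user_debug_value segv bus" prints user_debug=24, which reports only SIGSEGV and SIGBUS faults.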
How to interpret application crash log
Refer to the section “How to interpret memory addresses”. User space problems typically lie at memory addresses between 0x0 – 0xBEFF:FFFF, but a user process has many different libraries and files mapped in that memory range.
To properly identify the location of a memory address from a crash log, the contents of the /proc/<PID>/maps file are useful. This file only exists while the application is running, and <PID> is its process ID. Usually, adding an early pause() call to the application and running it in the background (with the '&' suffix) gives the programmer time to dump the mapping contents for further debugging.
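Once the mapping has been saved to a file, finding the entry that contains a faulting address can be automated. The following is a minimal sketch: the function name map_lookup is hypothetical, it assumes the standard /proc/<PID>/maps line layout (range, permissions, offset, device, inode, path), and it is intended to run on the development host, where shell arithmetic is normally 64 bits wide.

```shell
#!/bin/sh
# map_lookup MAPS ADDR: print the file mapped at ADDR and the offset of
# ADDR inside that file. (Hypothetical helper, for illustration.)
map_lookup() {
    target=$(( $2 ))
    while read -r range perms offset dev inode path; do
        start=$(( 0x${range%-*} ))   # text before the dash
        end=$(( 0x${range#*-} ))     # text after the dash
        if [ "$target" -ge "$start" ] && [ "$target" -lt "$end" ]; then
            printf '%s +0x%x\n' "${path:-[anonymous]}" \
                "$(( target - start + 0x$offset ))"
            return 0
        fi
    done < "$1"
    echo "address not mapped" >&2
    return 1
}
```

On the target, the mapping of a background process can be saved with "cat /proc/$!/maps > /tmp/maps.txt" right after starting it with '&'. The offset printed by the helper can then be compared against a disassembly of the reported library or executable.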
Profiling software on the SDK
The SDK provides the profiling tool 'OProfile'. This allows the analysis of your program's behavior as it runs. It lets you determine which parts of a program to optimize for speed.
This is done by measuring the frequency and duration of function calls. The output is a statistical summary of the events observed. 'OProfile' uses hardware interrupts to probe the target's program counter register at regular intervals. Sampling profiles may be less accurate and specific than a detailed trace, but allow the target program to run at near full speed. As the summation in a profile is done related to the source code positions where the events occur, the size of measurement data will be linear to the code size of your program.
There are different ways 'OProfile' can help you. Generally, things that could be done intentionally to make a program run longer can also occur unintentionally. One commonly accepted kind of slow-down is a "hot spot", which is a tight inner loop where the program counter spends much of its time. For example, if one often finds at the bottom of the call stack a linear search algorithm instead of binary search, this would be a true hot spot slow-down. However, if another function is called in the search loop, such as string compare, that function would be found at the bottom of the stack, and the call to it in the loop would be at the next level up. In this case, the loop would not be a hot spot, but it would still be a slow-down. In all but the smallest programs, hot spot slugs are rare, but slugs are quite common.
Another way to slow down software is to use data structures that are too general for the problem at hand. For example, if a collection of objects remains small, a simple array with linear search could be much faster than something like a "dictionary" class, complete with hash coding. With this kind of slow-down, the program counter is most often found in system memory allocation and freeing routines as the collections are being constructed and destructed. Another common motive is that a powerful function is written to collect a set of useful information (from a database, for example). Then that function is called multiple times, rather than taking the trouble to save the results from a prior call. A possible explanation for this could be that it is beyond a programmer's comprehension that a function call might take a million times as long to execute as an adjacent assignment statement. A contributing factor could be "information hiding", in which external users of a module can be ignorant of what goes on inside it.
There are certain misconceptions in performance analysis. One is that timing is important. Knowing how much time is spent in functions is good for reporting improvements, but it provides only vague help in finding problems. The information that matters is the fraction of time that individual statements reside on the call stack. Another misconception is that statistical precision matters. Typical slow-downs sit on the call stack between 5 and 95 percent of the time. The larger they are, the fewer samples are needed to find them.
To enable OProfile select it under the following options.
RidgeRun SDK Configuration File System Configuration SDK Applications ---> [*] oprofile 0.9.3
There are two storage considerations regarding OProfile usage: the file system may grow larger than the available flash storage, and OProfile requires fast write speeds for its logs.
OProfile automatically enables several options in the kernel configuration, BusyBox, and other applications and libraries that together add up to about 10MB of additional space on the target file system. Because of this, systems with less storage need the NFS root file system enabled.
If your hardware has plenty of storage you may not need NFS root file system.
To enable this file system option, it must be selected under:
RidgeRun SDK Configuration File System Configuration File system image target (...) ---> (X) NFS root file system
Sample collection considerations
It is also suggested that the log files (stored in /var/lib/oprofile) be kept on a storage medium that allows fast writes, so that the time-based sampling is not biased by those writes. NFS is not such a medium. As a solution, the board RAM can be used through the tmpfs file system; to do this, mount the /var/lib/oprofile directory with that file system type.
The following command lines show how to mount this directory as a tmpfs file system:
# opcontrol --shutdown # verify oprofile is not running
# mkdir -p /var/lib/oprofile
# mount -t tmpfs none /var/lib/oprofile
Linux kernel considerations
It may be necessary to place the Linux kernel image in the file system so that OProfile can load its symbols.
The RidgeRun SDK generates all the images needed by the bootloader (the kernel, the file system and the bootloader itself) and puts them in the $DEVDIR/images/ directory.
However, these images are not what OProfile expects. The kernel image it needs is the vmlinux file, located in $DEVDIR/kernel/$KERNEL_VERSION after the kernel has been compiled successfully.
This file can be placed under any directory in the exported $DEVDIR/fs/fs/ directory. If an NFS file system is being used with the RidgeRun SDK on a development board, the file will appear on the target hardware as soon as it has been copied. If a CRAMFS or JFFS2 file system is used, the kernel image file must be copied to the same directory and “make fs” must then be executed again to generate a file system image that includes the kernel image.
Kernel module considerations
OProfile profiles kernel modules by default. However, there are a couple of problems you may have when trying to get results. First, you may have booted via an initrd; this means that the actual path for the module binaries cannot be determined automatically. To get around this, you can use the --image-path / -p [paths] (comma-separated list of additional paths to search for binaries) option to the profiling tools to specify where to look for the kernel modules.
In kernel versions 2.6 upwards, the information on where kernel module binaries are located has been removed. This means OProfile needs guiding with the -p option to find your modules. Normally, you can just use your standard module top-level directory for this. Note that due to this problem, OProfile cannot check that the modification times match; it is your responsibility to make sure you do not modify a binary after a profile has been created.
If you have run insmod or modprobe to insert a module in a particular directory, it is important that you specify this directory with the -p option first, so that it over-rides an older module binary that might exist in other directories you've specified with -p. It is up to you to make sure that these values are correct: 2.6 kernels simply do not provide enough information for OProfile.
Capture profiling data
You need to be the root user to use OProfile. You can profile your application with or without the Linux kernel. If you don't want to profile the Linux kernel, run:
# opcontrol --no-vmlinux
Otherwise, if for example you copied the kernel image to the /kernel directory on the target hardware, run:
# opcontrol --vmlinux=/kernel/vmlinux
You are now ready to start the OProfile daemon process:
# opcontrol --start-daemon
The above step is useful to keep the OProfile startup process out of your profiling data. To start collecting profiling data, run:
# opcontrol --start
You could have skipped the start-daemon step, since --start also starts the daemon if it is not already running. It is now time to start running your application. After running your application through its paces, you'll want to see the collected profile data. There are a couple of ways to do this: you can either shut down profiling altogether, or you can tell OProfile to dump the collected data while it continues to collect more. The samples are stored under /var/lib/oprofile/samples/, and the daemon writes its log to /var/lib/oprofile/samples/oprofiled.log. To just dump the collected data:
# opcontrol --dump
To clear the profile data at any time, you can just do a reset with:
# opcontrol --reset
To shutdown OProfile:
# opcontrol --shutdown
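The commands in this section can be combined into one helper that profiles a single command from start to finish. This is a sketch rather than a tool shipped with the SDK: the function name profile_run is hypothetical, it uses only the opcontrol options shown above, and it assumes the kernel is not being profiled.

```shell
#!/bin/sh
# profile_run COMMAND [ARGS...]: collect one OProfile session around COMMAND.
# (Hypothetical helper; must run as root on the target.)
profile_run() {
    opcontrol --no-vmlinux || return 1   # do not profile the kernel
    opcontrol --reset                    # drop samples from earlier runs
    opcontrol --start || return 1        # start the daemon and the sampling
    "$@"                                 # the workload being profiled
    status=$?
    opcontrol --dump                     # flush the collected samples
    opcontrol --shutdown                 # stop profiling and the daemon
    return "$status"
}
```

After "profile_run ./application" returns, the samples are ready to be examined with opreport. The helper returns the exit status of the profiled command.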
Interpret profiling data
Let's look at an example of OProfile running on a RidgeRun SDK:
BusyBox v1.2.1 (2008.01.04-20:55+0000) Built-in shell (ash)
Enter 'help' for a list of built-in commands.
# opcontrol --no-vmlinux
Let's start profiling and immediately run a time consuming application in the background:
# opcontrol --start; yes > /dev/null &
Using 2.6+ OProfile kernel interface.
Using log file /var/lib/oprofile/samples/oprofiled.log
Daemon started.
Profiler running.
# ps aux
PID Uid VmSize Stat Command
1 root 364 S init
. . . . . . . . . . . . . . .
193 root SW< [IRQ 19]
225 root 344 S /sbin/inetd
230 root 412 S /bin/sh /usr/bin/keep_smapp_alive
231 root 412 S -sh
581 root 312 S /usr/bin/oprofiled --session-dir=/var/lib/oprofile
583 root 292 R yes
609 root 344 R ps aux
# kill -9 583
#  + Killed yes 1>/dev/null
# opcontrol --shutdown
Stopping profiling.
Killing daemon.
'yes' is a BusyBox tool, so it will appear as such in the report. This is the summarized one:
# opreport
CPU: CPU with timer interrupt, speed MHz (estimated)
Profiling through timer interrupt
 TIMER:0|
 samples| %|
------------------
 4753 no-vmlinux
 4670 libuClibc-0.9.29.so
 315 busybox
 260 ld-uClibc-0.9.29.so
 18 oprofiled
And this is the detailed report:
# opreport -l
warning: /no-vmlinux could not be found.
CPU: CPU with timer interrupt, speed MHz (estimated)
Profiling through timer interrupt
samples % app name symbol name
4753 .. no-vmlinux (no symbols)
2499 .. libuClibc-0.9.29.so _ppfs_parsespec
717 .. libuClibc-0.9.29.so vfprintf
315 .. busybox (no symbols)
212 .. libuClibc-0.9.29.so _ppfs_init
193 .. libuClibc-0.9.29.so _memcpy
190 .. libuClibc-0.9.29.so memset
151 .. libuClibc-0.9.29.so _ppfs_setargs
150 .. libuClibc-0.9.29.so _charpad
137 .. libuClibc-0.9.29.so strnlen
79 .. libuClibc-0.9.29.so __stdio_fwrite
76 .. ld-uClibc-0.9.29.so _dl_find_hash_mod
72 .. libuClibc-0.9.29.so fwrite_unlocked
59 .. ld-uClibc-0.9.29.so _dl_load_elf_shared_library
56 .. libuClibc-0.9.29.so memcpy
54 .. libuClibc-0.9.29.so fputs_unlocked
46 .. ld-uClibc-0.9.29.so _start
32 .. libuClibc-0.9.29.so strlen
25 .. libuClibc-0.9.29.so _ppfs_prepargs
20 .. libuClibc-0.9.29.so Laligned
17 .. libuClibc-0.9.29.so Llastword
15 .. ld-uClibc-0.9.29.so _dl_add_elf_hash_table
13 .. ld-uClibc-0.9.29.so _dl_load_shared_library
12 .. libuClibc-0.9.29.so strchr
9 .. ld-uClibc-0.9.29.so _dl_memalign
8 .. ld-uClibc-0.9.29.so _dl_allocate_tls_storage
8 .. ld-uClibc-0.9.29.so _dl_get_ready_to_run
6 .. oprofiled odb_update_node
5 .. libuClibc-0.9.29.so fork
4 .. ld-uClibc-0.9.29.so _dl_allocate_static_tls
4 .. libuClibc-0.9.29.so bsearch
4 .. oprofiled opd_process_samples
3 .. ld-uClibc-0.9.29.so _dl_next_tls_modid
3 .. ld-uClibc-0.9.29.so _dl_parse_dynamic_info
3 .. libuClibc-0.9.29.so __heap_free
3 .. libuClibc-0.9.29.so __psfs_parse_spec
3 .. libuClibc-0.9.29.so _uintmaxtostr
2 .. ld-uClibc-0.9.29.so _dl_calloc
2 .. ld-uClibc-0.9.29.so _dl_find_hash
2 .. ld-uClibc-0.9.29.so _dl_strdup
2 .. ld-uClibc-0.9.29.so _dl_unmap_cache
2 .. ld-uClibc-0.9.29.so init_tls
2 .. libuClibc-0.9.29.so .plt
2 .. libuClibc-0.9.29.so __psfs_do_numeric
2 .. libuClibc-0.9.29.so byte_regex_compile
2 .. libuClibc-0.9.29.so free
2 .. libuClibc-0.9.29.so memcmp
2 .. libuClibc-0.9.29.so re_search_2
2 .. libuClibc-0.9.29.so read
2 .. libuClibc-0.9.29.so strcoll
2 .. libuClibc-0.9.29.so strcspn
2 .. libuClibc-0.9.29.so strpbrk
2 .. oprofiled pop_buffer_value
1 .. ld-uClibc-0.9.29.so _dl_allocate_tls_init
1 .. ld-uClibc-0.9.29.so _dl_app_init_array
1 .. ld-uClibc-0.9.29.so _dl_initial_error_catch_tsd
1 .. ld-uClibc-0.9.29.so _dl_linux_resolve
1 .. ld-uClibc-0.9.29.so _dl_linux_resolver
1 .. ld-uClibc-0.9.29.so _dl_map_cache
1 .. libuClibc-0.9.29.so __pgsreader
1 .. libuClibc-0.9.29.so __sigsetjmp
1 .. libuClibc-0.9.29.so __stdio_trans2w_o
1 .. libuClibc-0.9.29.so __stdio_wcommit
1 .. libuClibc-0.9.29.so __uClibc_init
1 .. libuClibc-0.9.29.so _stdlib_strto_ll
1 .. libuClibc-0.9.29.so bsd_signal
1 .. libuClibc-0.9.29.so close
1 .. libuClibc-0.9.29.so exit
1 .. libuClibc-0.9.29.so fclose
1 .. libuClibc-0.9.29.so fflush
1 .. libuClibc-0.9.29.so fflush_unlocked
1 .. libuClibc-0.9.29.so getc_unlocked
1 .. libuClibc-0.9.29.so putc_unlocked
1 .. libuClibc-0.9.29.so strcpy
1 .. libuClibc-0.9.29.so vsscanf
1 .. oprofiled do_match
1 .. oprofiled get_file
1 .. oprofiled odb_do_hash
1 .. oprofiled odb_open_count
1 .. oprofiled op_get_mtime
1 .. oprofiled sfile_log_sample
For more information on 'opreport' capabilities, you can see the OProfile manual at: http://oprofile.sourceforge.net/doc/index.html
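The percentage column is not shown in the listings above, but each image's share can be recomputed from the sample counts. The following is a minimal sketch: the function name sample_shares is hypothetical, and it assumes input lines that start with a sample count followed by an image name, as in the summarized report.

```shell
#!/bin/sh
# sample_shares: read "samples image" lines on stdin and print the share of
# the total for each image. (Hypothetical helper, for illustration.)
sample_shares() {
    awk '
        # Only lines that start with a sample count are considered.
        NF >= 2 && $1 ~ /^[0-9]+$/ {
            n++
            count[n] = $1
            name[n] = $2
            total += $1
        }
        END {
            for (i = 1; i <= n; i++)
                printf "%6.2f%% %s\n", 100 * count[i] / total, name[i]
        }
    '
}
```

Feeding it the five image lines of the summarized report shows that no-vmlinux and libuClibc-0.9.29.so each account for a little under half of the samples, while busybox itself takes only a few percent.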
Using dmaiperf gstreamer element
dmaiperf is a GStreamer element for observing DSP utilization. One of its properties is engine-name; if you set this property to codecServer, the element displays information about DSP performance, such as usage.
The following is an example pipeline with that property set:
gst-launch -e audiotestsrc num-buffers=600 ! audio/x-raw-int, endianness=1234, signed=true, width=16, depth=16, rate=44100, channels=1 ! dmaienc_aac copyOutput=true ! dmaiperf engine-name=codecServer ! qtmux ! filesink location=test.mp4 -v
The following is an example of the dmaiperf element output. The “DSP: 21” field gives the average utilization of the DSP; if there are two encoders, it is the total amount used by both of them.
The other information specifies the sections that the codec server is using such as DDR2, DDRALGHEAP, etc, their sizes, base address, maximum size of the block. This information is useful if you need to debug issues related to the memory map.
INFO: Timestamp: 0:09:36.950987086; bps: 8377; fps: 15; DSP: 21; mem_seg: DDR2; base: 0xc3c98589; size: 0x20000; maxblocklen: 0x13910; used: 0xc6f0; mem_seg: DDRALGHEAP; base: 0xc3100000; size: 0xa00000; maxblocklen: 0x96ef10; used: 0x90f78; mem_seg: IRAM; base: 0x11800000; size: 0x20000; maxblocklen: 0x10000; used: 0x10000; mem_seg: L3_CBA_RAM; base: 0x80000000; size: 0x10000; maxblocklen: 0x10000; used: 0x0;
INFO: Timestamp: 0:09:37.153399169; bps: 11846; fps: 14;
Information such as the timestamp, bps and fps depends on where you place the dmaiperf element. In the pipeline below, I placed one dmaiperf element after the video encoder, so the timestamp, bps and fps that it reports relate to that encoder. I placed the other dmaiperf element after the audio encoder with the engine-name property set; this lets you see the overall DSP information (utilization, memory map distribution...) as well as the specific information of the video encoder: timestamp, bps and fps.
gst-launch videotestsrc ! ffmpegcolorspace ! videorate ! video/x-raw-yuv,framerate=15/1 ! videoscale ! video/x-raw-yuv,width=320,height=240 ! queue ! dmaienc_h264 maxbitrate=200000 targetbitrate=200000 ! dmaiperf ! queue ! rtph264pay ! queue ! udpsink name=vsink host=192.168.0.199 port=10000 audiotestsrc ! audio/x-raw-int,endianness=1234,signed=true,width=16,depth=16,rate=16000,channels=1 ! queue ! dmaienc_aac copyOutput=true bitrate=32000 maxbitrate=64000 ! dmaiperf engine-name=codecServer ! queue ! rtpmp4gpay ! queue ! udpsink name=asink host=192.168.0.199 port=10002
To wrap up, timestamp, fps and bps are related to the element that is before the dmaiperf element. The DSP utilization and memory information are general.
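If the pipeline output is saved to a file, the DSP figures can be pulled out of the INFO lines for a quick summary. This is an illustrative sketch: the function names dsp_load and dsp_average are hypothetical, and it assumes the "DSP: <number>;" field format shown in the example output above.

```shell
#!/bin/sh
# dsp_load: read dmaiperf INFO lines on stdin and print one DSP figure per
# line that contains one. (Hypothetical helper, for illustration.)
dsp_load() {
    sed -n 's/.*DSP: \([0-9][0-9]*\);.*/\1/p'
}

# dsp_average: mean of the DSP figures, truncated to an integer.
dsp_average() {
    dsp_load | awk '{ sum += $1; n++ } END { if (n) printf "%d\n", sum / n }'
}
```

Lines without a DSP field, such as those printed by a dmaiperf element that has no engine-name set, are ignored. A possible use is "dsp_average < pipeline.log", where pipeline.log is a file name chosen for this example.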
Valgrind is a suite of free software tools for debugging and profiling programs. It helps you improve your programs or applications by making them faster and less buggy. One of the most important Valgrind tools is Memcheck, which identifies common memory problems. It reports when a program:
- Accesses memory that it shouldn't, for instance memory that has already been freed
- Uses values that haven't been defined or initialized
- Frees heap memory incorrectly
- Passes overlapping pointers to functions like memcpy
- Leaks memory
The standard distribution of Valgrind includes other useful tools, among them:
- Cachegrind: a cache profiler that simulates the I1, D1 and L2 caches of your CPU and identifies the source and number of cache misses during the execution of your program.
- Callgrind: an extension of Cachegrind that adds call graph information that Cachegrind doesn't provide. Optionally, it can also use cache simulation to report on the memory access behavior of your application.
- Helgrind: a thread error detector that helps you write correct multi-threaded programs.
- DRD: offers the same functionality as Helgrind but uses different analysis techniques, so it may find different issues.
- Massif: a heap profiler that helps you make your program use less memory.
If you want more information about Valgrind's tools, read the Valgrind User Manual. You can also learn how to debug a simple Hello World program with memory leak issues by looking at the Valgrind Quick Start Guide.
Because of the advantages that Valgrind offers when debugging an application, it is available as a developer tool in RidgeRun's SDKs. Currently Valgrind is supported only on ARMv7 architectures; if you are interested in ARMv5 support, don't hesitate to contact RidgeRun.
In order to enable Valgrind in your SDK, you must select it under:
RidgeRun SDK Configuration File System Configuration Select target's file system software---> [*] Valgrind
Note: Valgrind is only available in the professional SDK.