Profiling GNU/Linux applications
Revision as of 22:29, 19 September 2013
Overview
This document explains the steps needed to profile applications under GNU/Linux. Utilities such as valgrind report how a program uses memory and give detailed information about leaks, but they do not provide sufficient information about the CPU usage of the program's functions: who called them and how much time they took when they ran. This document explains how to profile with gprof, the GNU Profiler.
To produce the gmon.out file, the application must exit normally; if it terminates with an error, the file is not created.
Application setup
Install
To use gprof, install oprofile under the file system applications.
Compile Flags
gprof needs the application to generate profiling data to work with. To enable this, set the following flags in the application package.
Profiling and debugging flags:
CFLAGS = -pg
gprof generates better profiling data for statically linked libraries; keep this in mind for your application:
LDFLAGS = -static-libgcc -Wl,-Bstatic -lc
With this, the profiling build setup is ready. Now run your application as you normally would.
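As a rough illustration of the build-and-run step, the sketch below compiles a small C program with -pg and runs it once. The names demo.c, demo and busy_loop() are hypothetical and exist only for this example; the static-link flags are left out to keep the sketch portable, and the build is skipped if gcc is not available.

```shell
#!/bin/sh
# Sketch: build a small program with -pg and run it so gmon.out gets written.
# demo.c, demo and busy_loop() are hypothetical names used only for illustration.
workdir=$(mktemp -d)
cat > "$workdir/demo.c" <<'EOF'
#include <stdio.h>

/* Burn some CPU time so the profiler has samples to record. */
static double busy_loop(long n)
{
    double acc = 0.0;
    for (long i = 1; i <= n; i++)
        acc += 1.0 / (double)i;
    return acc;
}

int main(void)
{
    printf("%f\n", busy_loop(50000000L));
    return 0;   /* normal exit, so the profile data is flushed to gmon.out */
}
EOF

CFLAGS="-pg"   # passed to a single gcc call, so it applies to compiling and linking
if command -v gcc >/dev/null 2>&1 &&
   gcc $CFLAGS "$workdir/demo.c" -o "$workdir/demo"; then
    # gmon.out is created in the working directory of the profiled process
    (cd "$workdir" && ./demo && ls -l gmon.out) ||
        echo "run did not produce gmon.out" >&2
else
    echo "gcc with -pg support not available; skipping the build" >&2
fi
```

If the build succeeds, gmon.out appears next to the demo binary in the temporary directory and can be passed to gprof as described in the next section.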
gprof Usage
When the application finishes, there should be a *.out file (usually gmon.out) for gprof to work with. Run:
gprof $EXECUTABLE_FILE gmon.out
You should now see all the profile information for the application in the terminal. For more details, visit the GNU Profiler website.
Alternatively, save the gprof output to a file to analyze it later:
gprof $EXECUTABLE_FILE gmon.out > app_gprof_data.txt
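Because gmon.out is only written when the application exits normally, it can help to check for the file before calling gprof. The function below is a minimal sketch of that check; save_profile and its argument order are hypothetical names chosen for this example, not part of gprof.

```shell
#!/bin/sh
# Sketch: write the gprof report to a text file, but only if profile data exists.
# save_profile is a hypothetical helper; usage: save_profile EXECUTABLE [DATA] [REPORT]
save_profile() {
    exe=$1
    data=${2:-gmon.out}
    report=${3:-app_gprof_data.txt}

    if [ ! -f "$data" ]; then
        echo "no profile data at $data (did the application exit normally?)" >&2
        return 1
    fi
    # '>' overwrites the report so repeated runs do not pile up in one file
    gprof "$exe" "$data" > "$report"
}
```

A typical call would be `save_profile ./my_app gmon.out app_gprof_data.txt`, where ./my_app stands in for your own executable.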
Call graph
The profile information can also be shown as a graph (and exported as a *.png file). This requires some extra packages plus a Python script.
sudo apt-get install python graphviz xdot
Next, you need gprof2dot to convert the profiling output to a dot graph. The script can be downloaded from:
https://code.google.com/p/jrfonseca/wiki/Gprof2Dot
or checked out from the git repo:
git clone https://code.google.com/p/jrfonseca.gprof2dot/ gprof2dot
Enter the script directory and convert the saved gprof output to a dot file:
gprof2dot.py app_gprof_data.txt > app_call_graph.dot
To visualize the data:
xdot app_call_graph.dot
To convert the .dot file to an image:
dot -Tpng app_call_graph.dot -o app_call_graph.png
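The conversion steps above can be chained into one helper. This is only a sketch: render_call_graph is a hypothetical function name, and it assumes gprof2dot.py and dot are both reachable through PATH.

```shell
#!/bin/sh
# Sketch: turn a saved gprof report into a .dot file and a .png image.
# render_call_graph is a hypothetical helper; usage: render_call_graph REPORT [BASENAME]
render_call_graph() {
    report=$1
    base=${2:-app_call_graph}

    # Stop early if one of the required tools is missing
    for tool in gprof2dot.py dot; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing tool: $tool" >&2
            return 127
        fi
    done

    gprof2dot.py "$report" > "$base.dot" &&
        dot -Tpng "$base.dot" -o "$base.png"
}
```

For example, `render_call_graph app_gprof_data.txt app_call_graph` would produce app_call_graph.dot and app_call_graph.png in the current directory.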
Example 1 video stabilization
A video stabilization algorithm was ported to the RidgeRun SDK and run on the ARM processor. gprof was used to identify the time-consuming routines so they could be moved to the DSP. The following are the results of running gprof before any optimizations were performed.
APP=yuv_tester
- Normal command execution
$APP -f 1 coastguard_352x288.yuv
- Capture profile data (the profile data file is assumed here to have been saved as $APP.gprof.out)
gprof $APP $APP.gprof.out > $APP.gprof.txt
with the first part of the output being:
Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 32.58      1.01     1.01 45467136     0.00     0.00  interpolateBiLin
 32.09      2.00     0.99      300     3.30     3.30  boxblur_vert_C
 11.67      2.36     0.36      300     1.20     1.20  boxblur_hori_C
  8.43      2.62     0.26      300     0.87     4.22  transformYUV
  8.43      2.88     0.26   600990     0.00     0.00  compareSubImg_thr
  1.46      2.92     0.05      300     0.15     0.15  lowPassTransforms
  1.30      2.96     0.04                             memcpy
  0.97      2.99     0.03                             __write_nocancel
- Create call graph
gprof2dot.py $APP.gprof.txt > $APP.call_graph.dot
with the first part of the call graph data being:
                     Call graph (explanation follows)

granularity: each sample hit covers 2 byte(s) for 0.32% of 3.09 seconds

index % time    self  children    called     name
                                                 <spontaneous>
[1]     95.0    0.00    2.93                 filter_video [1]
                0.00    1.62     300/300         motionDetection [2]
                0.26    1.01     300/300         transformYUV [4]
                0.05    0.00     300/300         lowPassTransforms [11]
                0.00    0.00     300/300         transformPrepare [35]
                0.00    0.00     300/300         transformFinish [34]
- Visualize call graph
xdot $APP.call_graph.dot
- Create PNG image file of call graph
dot -Tpng $APP.call_graph.dot -o $APP.call_graph.png
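When the goal is to pick out the time-consuming routines, as in this example, the flat profile in the saved report can be filtered with a few lines of awk. The function below is a sketch under stated assumptions: hot_functions is a hypothetical helper, the report is assumed to have the standard gprof flat profile layout shown above, and function names are assumed to contain no spaces (true for the C names here, not necessarily for C++ signatures).

```shell
#!/bin/sh
# Sketch: list functions whose "% time" in the flat profile is at least MIN_PERCENT.
# hot_functions is a hypothetical helper; usage: hot_functions REPORT [MIN_PERCENT]
hot_functions() {
    awk -v min="${2:-5}" '
        # The second header line sits directly above the data rows
        /^ *time +seconds +seconds/ { in_flat = 1; next }
        # A blank line ends the flat profile table
        in_flat && NF == 0          { exit }
        # First field is % time, last field is the function name
        in_flat && $1 + 0 >= min    { print $1, $NF }
    ' "$1"
}
```

With the report from this example, `hot_functions $APP.gprof.txt 10` would list interpolateBiLin, boxblur_vert_C and boxblur_hori_C, the three routines at or above 10% of the run time.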
References
You can find more information about profiling in the following links:
The GNU Profiler: http://www.cs.utah.edu/dept/old/texinfo/as/gprof_toc.html
Gprof call-graph visualization: http://redmine.epfl.ch/projects/python_cookbook/wiki/Gprof_call-graph_visualization
Gprof2Dot: https://code.google.com/p/jrfonseca/wiki/Gprof2Dot