RidgeRun Developer Manual/Profiling Tools/Linux Perf: Difference between revisions

From RidgeRun Developer Wiki

Revision as of 19:43, 2 November 2023










Introduction to Linux Perf

Linux Perf (`perf`) is one of the most complete and accurate tools for profiling an application because it is backed by the kernel's performance-monitoring interface. It can trace kernel and user-space calls and read the hardware performance counters. Among other things, it can be used to determine:

  • Branching: how much branching, and branch misprediction in particular, affects performance
  • Cache miss rates: how cache-friendly the application's memory access pattern is
  • Alignment faults: another view of the memory access pattern, here in terms of unaligned accesses and cache line thrashing
  • Context switches: how often the application is switched off the CPU
  • CPU clocks and migration: how much time the application spends on the CPU, active or waiting, and how often it migrates between cores
  • Call graph: how each function is reached, that is, which functions call it
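
As a rough illustration, most of the items above map to events that `perf stat` can count. The sketch below only builds such a command line in Python. The helper name `build_perf_stat_cmd` and the target `./my_app` are hypothetical. The event names are the generic ones shown by `perf list`, and which of them are available depends on the kernel and CPU.

```python
import shutil
import subprocess

# Generic event names as shown by `perf list`; availability depends on
# the kernel and CPU, so treat this mapping as an assumption.
EVENTS = {
    "branching": ["branches", "branch-misses"],
    "cache miss rates": ["cache-references", "cache-misses"],
    "alignment faults": ["alignment-faults"],
    "context switches": ["context-switches"],
    "cpu clocks and migration": ["cpu-clock", "cpu-migrations"],
}


def build_perf_stat_cmd(target, events=EVENTS):
    """Return the argv list for `perf stat -e <events> -- <target...>`."""
    names = [name for group in events.values() for name in group]
    return ["perf", "stat", "-e", ",".join(names), "--", *target]


if __name__ == "__main__":
    cmd = build_perf_stat_cmd(["./my_app"])  # ./my_app is a placeholder target
    print(" ".join(cmd))
    # Run only when perf is installed and the placeholder target exists.
    if shutil.which("perf") and shutil.which(cmd[-1]):
        subprocess.run(cmd, check=False)
```

Running the printed command by hand gives the same result. The Python wrapper is only a convenient way to keep the event list in one place.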

This section covers Perf as a tool for application optimisation, where only specific parts of the application are to be optimised. This is especially valuable when accelerating an unfamiliar application, whose hotspots are not known in advance.
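
One common way to locate the parts worth optimising is to record a profile with call graphs and then inspect the report. The following is a minimal sketch under stated assumptions: the helper name `build_profile_cmds`, the output file `perf.data` and the target `./my_app` are illustrative and do not come from this manual.

```python
import shutil
import subprocess


def build_profile_cmds(target, data_file="perf.data"):
    """Return (record, report) argv lists for a call-graph profile."""
    # -g records call graphs; -o selects the file the samples are written to.
    record = ["perf", "record", "-g", "-o", data_file, "--", *target]
    # --stdio prints a text report instead of opening the interactive UI.
    report = ["perf", "report", "-i", data_file, "--stdio"]
    return record, report


if __name__ == "__main__":
    record, report = build_profile_cmds(["./my_app"])  # placeholder target
    print(" ".join(record))
    print(" ".join(report))
    # Run only when perf is installed and the placeholder target exists.
    if shutil.which("perf") and shutil.which(record[-1]):
        subprocess.run(record, check=False)
        subprocess.run(report, check=False)
```

The functions at the top of the report are the natural candidates for optimisation.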

