NXP i.MX95/Tools/Arm Performance Studio: Difference between revisions

From RidgeRun Developer Wiki


Arm Performance Studio is a free suite of profiling tools for optimizing the performance of applications running on devices with Arm CPUs and Arm GPUs. Arm Performance Studio is the new name for Arm Mobile Studio. The suite includes:
* '''Streamline''': Captures a performance profile using the CPU, GPU, and memory system performance data available in the system
* '''Performance Advisor''': Generates an easy-to-read performance report from an annotated Streamline profile
* '''Frame Advisor''': Captures the API calls and rendering, and provides comprehensive geometry metrics
* '''Mali Offline Compiler''': Compiles your shader programs and reports how they will perform across Mali GPUs
* '''RenderDoc for Arm GPUs''': Debugs Vulkan graphics applications
Performance Advisor and Frame Advisor are only available for Android devices.
{{message|Find all the documentation related to Arm Performance Studio [https://developer.arm.com/Tools%20and%20Software/Arm%20Performance%20Studio here].}}
== Installation ==
1. Download the software from this [https://developer.arm.com/Tools%20and%20Software/Arm%20Performance%20Studio#Software-Download page].
2. Install Arm Performance Studio with the following command:
<pre>
tar xvzf Arm_Performance_Studio_<version>_linux.tgz
</pre>
That is all you need to do; the archive already includes all the necessary binaries.
== Streamline ==
=== Target preparation ===
Streamline requires some kernel options. The kernel configuration is usually available at <code>/proc/config.gz</code>. If the file is not visible, you can create it by loading the <code>configs</code> module with this command:
<pre>
sudo modprobe configs
</pre>
The options you need in your kernel configuration are:
* General Setup -> Profiling Support (CONFIG_PROFILING)
* General Setup -> Kernel Performance Events And Counters -> Kernel performance events and counters (CONFIG_PERF_EVENTS)
* General Setup -> Timers subsystem -> High Resolution Timer Support (CONFIG_HIGH_RES_TIMERS)
* Kernel Features -> Enable hardware performance counter support for perf events (CONFIG_HW_PERF_EVENTS)
'''Note''': If you can't find this last option in menuconfig, verify instead that Device Drivers -> Performance monitor support -> ARM PMU framework (CONFIG_ARM_PMU) is enabled. CONFIG_HW_PERF_EVENTS is enabled by default, but it depends on CONFIG_ARM_PMU.
You can verify that the options are enabled with this command:
<pre>
zcat /proc/config.gz | grep <OPTION>
# For example
zcat /proc/config.gz | grep CONFIG_PROFILING
</pre>
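To check all four options in one pass, a small helper such as the following can be used. This is only a sketch: the function name <code>check_kernel_opts</code> is hypothetical, and it assumes the options are built in (<code>=y</code>) and that the configuration is either a plain-text file or a <code>.gz</code> file readable with <code>zcat</code>.

```shell
# Hypothetical helper: report which of the kernel options listed above are enabled.
# Usage: check_kernel_opts [config-file]   (defaults to /proc/config.gz)
check_kernel_opts() {
    cfg="${1:-/proc/config.gz}"
    missing=0
    for opt in CONFIG_PROFILING CONFIG_PERF_EVENTS \
               CONFIG_HIGH_RES_TIMERS CONFIG_HW_PERF_EVENTS; do
        # Decompress .gz files, read anything else as plain text
        case "$cfg" in
            *.gz) reader=zcat ;;
            *)    reader=cat ;;
        esac
        # Anchor the match so CONFIG_PROFILING does not match longer option names
        if "$reader" "$cfg" | grep -q "^${opt}=y"; then
            echo "${opt}: enabled"
        else
            echo "${opt}: MISSING"
            missing=1
        fi
    done
    # Non-zero exit status if at least one option is missing
    return "$missing"
}
```

Running <code>check_kernel_opts</code> on the target without arguments reads <code>/proc/config.gz</code>. Passing a path, for example to a <code>.config</code> file from a kernel build tree, checks that file instead.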
=== Install gatord ===
Arm Streamline requires a target agent running on the Arm Linux target. This agent is gator, and gatord is the daemon you need to run on your target board.
The pre-built gatord binaries are available in Arm Performance Studio under the path <code><Install directory>/streamline/bin/linux</code>. Copy the gatord binary to your target board and make sure it has execute permission. Use this command to add the permission:
<pre>
chmod +x gatord
</pre>
Now you can execute the binary with the following command:
<pre>
./gatord -a
</pre>
{{message|The <code>-a</code> (or <code>--allow-command</code>) flag allows Streamline to run a command on the target from the host computer.}}
('''Optional''') By default, '''gatord uses port 8080''', but you can specify a different port by using the <code>-p</code> flag.
<pre>
./gatord -a -p 5050
</pre>
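Before moving to the host GUI, it can help to confirm from the host computer that the gatord port is reachable. The following is a minimal sketch, not part of Arm Performance Studio: the function name <code>gatord_reachable</code> is hypothetical, and it assumes <code>bash</code> (for its <code>/dev/tcp</code> redirection) and the coreutils <code>timeout</code> command are installed on the host.

```shell
# Hypothetical helper: succeed if a TCP connection to <host>:<port> can be opened.
# Usage: gatord_reachable <target-ip> [port]   (port defaults to 8080)
gatord_reachable() {
    host="$1"
    port="${2:-8080}"
    # bash opens a TCP socket when redirecting to /dev/tcp/<host>/<port>;
    # timeout gives up after 3 seconds if the target does not answer
    timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}
```

For example, <code>gatord_reachable 192.168.100.79 5050 && echo "port open"</code> prints the message only if something is listening on that port. A successful connection only shows that the port is open, not that the listener is gatord.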
=== Capture a Streamline profile ===
You are now ready to capture CPU and GPU metrics with Streamline. To start Streamline on the host computer, run these commands:
<pre>
cd <Install directory>/streamline/
./Streamline
</pre>
The Streamline window opens and should look similar to the following picture.
<br>
[[File:Streamline-first-screen.png|900px|center|thumb|Fig. 1. Out of the box view of Streamline.]]
<br>
In the Start tab, under ''Select device type'', select '''TCP'''. Just below, under ''Select Target'', check the box ''Enter target details:'' and enter the target IP followed by the port used by gatord, in the form <code><Target IP>:<Port></code>.
{{Message|type=info|text=The target board and host computer must be on the same network|title='''Attention'''}}
You can check the target IP address using this command:
<pre>
ifconfig
</pre>
The output should be similar to this:
<syntaxhighlight lang=bash>
...
eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.100.79  netmask 255.255.255.0  broadcast 192.168.100.255
        inet6 fe80::622:4478:31d9:c881  prefixlen 64  scopeid 0x20<link>
        ether f8:dc:7a:e4:45:e2  txqueuelen 1000  (Ethernet)
        RX packets 35053  bytes 4051181 (3.8 MiB)
        RX errors 0  dropped 340  overruns 0  frame 0
        TX packets 19008  bytes 254895043 (243.0 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
...
</syntaxhighlight>
If gatord uses its default port, the target for this example is <code>192.168.100.79:8080</code>. You can optionally specify the command to run on the board in the ''Configure application'' section. If no command is specified, you can run the application directly on the target.
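The <code><Target IP>:<Port></code> string can also be built from the <code>ifconfig</code> output with a short filter. This is only a sketch: the function name <code>target_endpoint</code> is hypothetical, and it assumes the output format shown above, where the address follows the word <code>inet</code>. Older <code>ifconfig</code> versions print <code>inet addr:</code> instead and would need a different pattern.

```shell
# Hypothetical helper: read ifconfig output on stdin and print <IPv4>:<port>
# for the given interface.
# Usage: ifconfig | target_endpoint <interface> [port]   (port defaults to 8080)
target_endpoint() {
    iface="$1"
    port="${2:-8080}"
    awk -v iface="$iface" -v port="$port" '
        # Header line of the requested interface, e.g. "eth1: flags=..."
        $1 == (iface ":")        { in_block = 1; next }
        # Any other non-indented line starts a different interface block
        /^[^ \t]/                { in_block = 0 }
        # First IPv4 address inside the requested block
        in_block && $1 == "inet" { print $2 ":" port; exit }
    '
}
```

On the board used in this example, <code>ifconfig | target_endpoint eth1</code> would print the string to enter in ''Enter target details:''. The function prints nothing if the interface is not found or has no IPv4 address.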
<br>
[[File:Streamline-start-screen.png|900px|center|thumb|Fig. 2. TCP capture configuration.]]
<br>
Streamline gathers numerous events from the target. Before initiating the capture, choose the appropriate template to select and organize the necessary events effectively. This is shown in the next picture.
<br>
[[File:Streamline-gpu-template.png|900px|center|thumb|Fig. 3. Counters template selection.]]
<br>
Click the ''Select counters'' button at the bottom left. In the window that opens, click the button in the upper right corner, marked in the figure above. Select the appropriate template, depending on whether you want to profile the CPU or the GPU workload, and click ''Save''. Now you can start the capture by clicking the ''Start capture'' button at the bottom right corner, marked in the next picture.
<br>
[[File:Streamline-start-record.png|900px|center|thumb|Fig. 4. Start capture.]]
<br>
When the capture starts, you should see a result similar to the following picture, depending on the template selected.
<br>
[[File:Streamline-capture-screen.png|900px|center|thumb|Fig. 5. Capture screen.]]
<br>
The metric graphs are displayed in the central part of the window, while the lower part shows the CPU usage of the processes running on the target. The capture control buttons are located in the upper left corner, marked in the picture below. From left to right, these buttons are:
* Save the current capture and restart
* Restart capturing, discarding the current capture
* Stop capture from target and analyze collected data
* Stop capture from target and discard collected data
[[Category:Imx95 Exploration RnD]]





Revision as of 15:24, 29 August 2024


