AM5728 Multimedia Performance Testbench: Difference between revisions

Revision as of 20:04, 7 June 2016

1,738 bytes added , 7 June 2016

→‎AAC audio encode

Dgarbanzo

@@ Line 133: / Line 133: @@
 The chart above shows that, with hardware acceleration (NEON and VFPv4 extensions), memory writes consume more memory bandwidth. The average difference is 206.8 MB/s for sequential reads and 79 MB/s for random reads.
+== <span style="color:#008080">H264 video encode</span><br>  ==
+This section compares the performance of H264 video encode GStreamer pipelines using a hardware-accelerated implementation and a software-only implementation. The hardware-accelerated implementation uses gst-plugins-ducati (ducatih264enc element), while the software-only implementation uses the openh264 plugin (openh264enc element). The test pipelines differ only in the H264 encoder element and in the raw video format fed to it (NV12 for ducatih264enc, I420 for openh264enc).
+=== <span style="color:#0931C6">CPU load % per core</span><br>  ===
+'''''Test pipeline (ducatih264enc):'''''
+<pre style="background:#d6e4f1">
+GST_TRACER_PLUGINS="cpuusage" gst-launch-1.0 -e videotestsrc num-buffers=640 is-live=true ! 'video/x-raw,format=(string)NV12,width=720,height=420,framerate=(fraction)30/1' ! ducatih264enc ! fakesink sync=true
+</pre>
+'''''Test pipeline (openh264enc):'''''
+<pre style="background:#d6e4f1">
+GST_TRACER_PLUGINS="cpuusage" gst-launch-1.0 -e videotestsrc num-buffers=640 is-live=true ! 'video/x-raw,format=(string)I420,width=720,height=420,framerate=(fraction)30/1' ! openh264enc ! fakesink sync=true</pre>
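The two test pipelines above share everything except the raw video format and the encoder element. As a minimal sketch of that relationship, the command lines could be generated from one template. The helper name `build_pipeline` and the `TRACER_ENV` constant are hypothetical and introduced only for illustration; the pipeline text itself is copied from the commands above.

```python
import shlex

# Environment variable used by the test pipelines above to enable the tracer.
TRACER_ENV = {"GST_TRACER_PLUGINS": "cpuusage"}

# Caps string shared by both pipelines; only the pixel format changes.
CAPS_TEMPLATE = (
    "video/x-raw,format=(string){fmt},width=720,height=420,"
    "framerate=(fraction)30/1"
)


def build_pipeline(encoder, fmt):
    """Return the gst-launch-1.0 command line for one encoder (hypothetical helper)."""
    caps = CAPS_TEMPLATE.format(fmt=fmt)
    # shlex.quote wraps the caps in single quotes so the shell does not
    # interpret the parentheses.
    return (
        "gst-launch-1.0 -e videotestsrc num-buffers=640 is-live=true ! "
        f"{shlex.quote(caps)} ! {encoder} ! fakesink sync=true"
    )


if __name__ == "__main__":
    # Hardware-accelerated and software-only variants, as in the tests above.
    for encoder, fmt in (("ducatih264enc", "NV12"), ("openh264enc", "I420")):
        print(build_pipeline(encoder, fmt))
```

Generating both commands from one template makes it easy to check that nothing other than the encoder and its input format differs between the two runs.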
+'''''Obtained Results:'''''
+[[Image:AM572x-testbench-H264-enc-cpuload.png|center|700px|AM572x-testbench-H264-enc-cpuload.png]]<br>
+The chart above clearly shows that hardware acceleration substantially reduces the CPU workload. On average, CPU_0_accel carries 20.2 % less load than CPU_1_unaccel. In both cases the other core is practically idle, and there is no difference between the two implementations on that core.
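The page does not state how the average difference was computed. One plausible reading, sketched here under that assumption, is the mean load of the busy core in the software-only run minus the mean load of the busy core in the accelerated run. The function name `average_load_difference` is hypothetical, and the sample values are placeholders for illustration only, not measurements from this testbench.

```python
from statistics import mean


def average_load_difference(unaccel_samples, accel_samples):
    """Return mean(unaccel) - mean(accel), in percentage points of CPU load."""
    if not unaccel_samples or not accel_samples:
        raise ValueError("both sample lists must be non-empty")
    return mean(unaccel_samples) - mean(accel_samples)


if __name__ == "__main__":
    # Placeholder per-sample CPU loads (%); NOT data from the chart above.
    example_unaccel = [30.0, 32.0, 31.0]
    example_accel = [10.0, 11.0, 12.0]
    diff = average_load_difference(example_unaccel, example_accel)
    print(f"average difference: {diff:.1f} percentage points")
```

When both runs have the same number of samples, this equals the mean of the per-sample differences, so either definition gives the same figure.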