AM5728 Multimedia Performance Testbench: Difference between revisions

Revision as of 22:33, 7 June 2016

5,172 bytes added , 7 June 2016

→‎MPEG4 video encode

Dgarbanzo

@@ Line 234: / Line 234: @@
 The chart above clearly shows that hardware acceleration substantially reduces CPU workload. On average, CPU_1_accel carries 48.8 % less load than CPU_1_unaccel. CPU_0 has the same average workload percentage in both cases, so hardware acceleration makes no difference on that core.
+=== <span style="color:#0931C6">Frame-rate</span><br>  ===
+'''''Test pipeline (ducatimpeg4enc):'''''
+<pre style="background:#d6e4f1">
+GST_TRACER_PLUGINS="framerate" gst-launch-1.0 -e videotestsrc num-buffers=640 is-live=true ! 'video/x-raw,format=(string)NV12,width=720,height=420,framerate=(fraction)30/1' ! ducatimpeg4enc ! fakesink sync=true
+</pre>
+'''''Test pipeline (avenc_mpeg4):'''''
+<pre style="background:#d6e4f1">
+GST_TRACER_PLUGINS="framerate" gst-launch-1.0 -e videotestsrc num-buffers=640 is-live=true ! 'video/x-raw,format=(string)I420,width=720,height=420,framerate=(fraction)30/1' ! avenc_mpeg4 ! fakesink sync=true
+</pre>
+'''''Obtained Results:'''''
+[[Image:AM572x-testbench-MPEG4-enc-framerate.png|center|700px|AM572x-testbench-MPEG4-enc-framerate.png]]<br>
+The chart above shows that, in both cases, the frame rate reaches the expected value of 30 fps and then remains stable.
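For reference, with is-live=true the source produces buffers at the negotiated frame rate, so num-buffers=640 at 30 fps limits each run to roughly 21 seconds. A small sketch of that arithmetic follows; the function name is illustrative and not part of the testbench.

```shell
# Expected run time in seconds for a live source:
# number of buffers divided by frames per second.
expected_duration() {
    awk -v n="$1" -v fps="$2" 'BEGIN { printf "%.1f\n", n / fps }'
}
```

Calling `expected_duration 640 30` gives the approximate length of each frame-rate test run in seconds.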
+=== <span style="color:#0931C6">Memory consumption</span><br>  ===
+'''''Test pipeline (ducatimpeg4enc):'''''
+<pre style="background:#d6e4f1">
+gst-launch-1.0 -e videotestsrc num-buffers=640 is-live=true ! 'video/x-raw,format=(string)NV12,width=720,height=420,framerate=(fraction)30/1' ! ducatimpeg4enc ! fakesink sync=true
+</pre>
+'''''Test pipeline (avenc_mpeg4):'''''
+<pre style="background:#d6e4f1">
+gst-launch-1.0 -e videotestsrc num-buffers=640 is-live=true ! 'video/x-raw,format=(string)I420,width=720,height=420,framerate=(fraction)30/1' ! avenc_mpeg4 ! fakesink sync=true
+</pre>
+'''''Obtained Results:'''''
+[[Image:AM572x-testbench-MPEG4-enc-memuse.png|center|700px|AM572x-testbench-MPEG4-enc-memuse.png]]<br>
+The chart above shows that hardware acceleration considerably reduces memory consumption. On average, the hardware-accelerated pipeline uses 4514 KB less memory.
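An average difference like the one quoted above is the mean of one run minus the mean of the other. A minimal sketch, assuming each run's memory samples were saved one value per line in KB; the file names and this log format are hypothetical, not the way the testbench actually records its data.

```shell
# Mean of the numeric values in a file, one sample per line (blank lines skipped).
mean() {
    awk 'NF { sum += $1; n++ } END { if (n) printf "%.1f\n", sum / n }' "$1"
}

# Difference of means: first file (unaccelerated) minus second file (accelerated).
mean_diff() {
    awk -v a="$(mean "$1")" -v b="$(mean "$2")" 'BEGIN { printf "%.1f\n", a - b }'
}
```

For example, `mean_diff unaccel_mem_kb.txt accel_mem_kb.txt` would print how many KB the unaccelerated run uses on average above the accelerated one.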
+=== <span style="color:#0931C6">Memory bandwidth consumption</span><br>  ===
+'''''Test pipeline (ducatimpeg4enc):'''''
+<pre style="background:#d6e4f1">
+gst-launch-1.0 -e videotestsrc is-live=true ! 'video/x-raw,format=(string)NV12,width=720,height=420,framerate=(fraction)30/1' ! ducatimpeg4enc ! fakesink sync=true
+</pre>
+'''''Test pipeline (avenc_mpeg4):'''''
+<pre style="background:#d6e4f1">
+gst-launch-1.0 -e videotestsrc is-live=true ! 'video/x-raw,format=(string)I420,width=720,height=420,framerate=(fraction)30/1' ! avenc_mpeg4 ! fakesink sync=true
+</pre>
+Note: In both charts, memory bandwidth consumption is shown separately for sequential (seq) and random (al, for "aleatory") memory accesses.
+'''''Obtained results (memory bandwidth consumed by reads):'''''
+[[Image:AM572x-testbench-MPEG4-enc-readsbandwidth.png|center|700px|AM572x-testbench-MPEG4-enc-readbandwidth.png]]<br>
+The chart above shows that hardware acceleration consumes less memory read bandwidth. The average difference is 358.1 MB/s for sequential reads and 446.9 MB/s for random reads.
+'''''Obtained results (memory bandwidth consumed by writes):'''''
+[[Image:AM572x-testbench-MPEG4-enc-writebandwidth.png|center|700px|AM572x-testbench-MPEG4-enc-writebandwidth.png]]<br>
+The chart above shows that hardware acceleration consumes less memory write bandwidth. The average difference is 1832.7 MB/s for sequential writes and 499.9 MB/s for random writes.
+== <span style="color:#008080">H264 video decode</span><br>  ==
+This section compares the performance of H264 video decode GStreamer pipelines using a hardware-accelerated implementation and a software-only implementation. The hardware-accelerated implementation uses gst-plugins-ducati (ducatih264dec element); the software-only implementation uses gst-plugins-libav (avdec_h264 element). The test pipelines differ only in the H264 decode element.
+=== <span style="color:#0931C6">CPU load % per core</span><br>  ===
+'''''Test pipeline (ducatih264dec):'''''
+<pre style="background:#d6e4f1">
+GST_TRACER_PLUGINS="cpuusage" gst-launch-1.0 filesrc location=/am5728-gst-tests/video-samples/TearOfSteel-Short-1920x800-H264.mov ! qtdemux name=demux demux.video_0 ! queue ! h264parse ! ducatih264dec ! fakesink sync=true -e
+</pre>
+'''''Test pipeline (avdec_h264):'''''
+<pre style="background:#d6e4f1">
+GST_TRACER_PLUGINS="cpuusage" gst-launch-1.0 filesrc location=/am5728-gst-tests/video-samples/TearOfSteel-Short-1920x800-H264.mov ! qtdemux name=demux demux.video_0 ! queue ! h264parse ! avdec_h264 ! fakesink sync=true -e
+</pre>
+'''''Obtained Results:'''''
+[[Image:AM572x-testbench-MPEG4-enc-cpuload.png|center|700px|AM572x-testbench-MPEG4-enc-cpuload.png]]<br>
+The chart above clearly shows that hardware acceleration substantially reduces CPU workload. On average, CPU_0_accel carries 49.2% less load than CPU_0_unaccel, and CPU_1_accel carries 39% less load than CPU_1_unaccel.
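Because the two H264 decode pipelines above differ only in the decoder element, they can be generated from a single template. The helper names below are hypothetical and only illustrate the idea; the pipeline text itself is copied from the test pipelines in this section.

```shell
# Hypothetical helper: print the H264 decode test pipeline for a given decoder.
#   $1: decoder element (ducatih264dec or avdec_h264)
#   $2: path to the H264 sample file (must not contain spaces)
h264_decode_pipeline() {
    printf '%s\n' "filesrc location=$2 ! qtdemux name=demux demux.video_0 ! queue ! h264parse ! $1 ! fakesink sync=true"
}

# Hypothetical helper: run that pipeline under the cpuusage tracer.
run_h264_cpu_test() {
    # The command substitution is left unquoted on purpose so that the
    # pipeline string is split into separate gst-launch-1.0 arguments.
    GST_TRACER_PLUGINS="cpuusage" gst-launch-1.0 -e $(h264_decode_pipeline "$1" "$2")
}
```

Running `run_h264_cpu_test ducatih264dec <sample>` and then `run_h264_cpu_test avdec_h264 <sample>` with the same sample file keeps everything except the decoder identical between the two runs.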