
AM5728 Multimedia Performance Testbench



The chart above clearly shows that hardware acceleration substantially reduces CPU workload. On average, CPU_1_accel carries 48.8 % less load than CPU_1_unaccel. CPU_0 has the same average workload percentage in both cases, so there is no difference on that core.
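The averages quoted above can be reproduced from the raw per-core samples. The following is a minimal sketch, assuming the load samples of each run have been exported to plain text files with one percentage value per line; the function name <code>mean_diff</code> and the file names in the usage comment are hypothetical and are not part of the testbench.

```shell
#!/bin/sh
# mean_diff UNACCEL_FILE ACCEL_FILE
# Print mean(UNACCEL_FILE) - mean(ACCEL_FILE) with one decimal place.
# Each file is assumed to hold one numeric sample per line (hypothetical format).
mean_diff() {
    awk '
        # Samples from the first file (unaccelerated run)
        FILENAME == ARGV[1] { if (NF) { s1 += $1; n1++ }; next }
        # Samples from the second file (accelerated run)
        NF { s2 += $1; n2++ }
        END {
            # Only print a result when both files contained samples
            if (n1 > 0 && n2 > 0) printf "%.1f\n", s1 / n1 - s2 / n2
        }' "$1" "$2"
}

# Usage (hypothetical file names):
#   mean_diff cpu1_unaccel.txt cpu1_accel.txt
```

Comparing means rather than individual samples avoids having to align the two runs sample by sample.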
=== <span style="color:#0931C6">Frame-rate</span><br>  ===
'''''Test pipeline (ducatimpeg4enc):'''''
<pre style="background:#d6e4f1">
GST_TRACER_PLUGINS="framerate" gst-launch-1.0 -e videotestsrc num-buffers=640 is-live=true ! 'video/x-raw,format=(string)NV12,width=720,height=420,framerate=(fraction)30/1' ! ducatimpeg4enc ! fakesink sync=true
</pre>
'''''Test pipeline (avenc_mpeg4):'''''
<pre style="background:#d6e4f1">
GST_TRACER_PLUGINS="framerate" gst-launch-1.0 -e videotestsrc num-buffers=640 is-live=true ! 'video/x-raw,format=(string)I420,width=720,height=420,framerate=(fraction)30/1' ! avenc_mpeg4 ! fakesink sync=true
</pre>
'''''Obtained Results:'''''
[[Image:AM572x-testbench-MPEG4-enc-framerate.png|center|700px|AM572x-testbench-MPEG4-enc-framerate.png]]<br>
The chart above shows that in both cases the frame rate reaches the expected value of 30 fps and then remains stable.
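The statement that the frame rate "reaches 30 fps and then remains stable" can also be checked numerically. The sketch below assumes the frame-rate samples have been exported to a text file with one fps value per line; the helper name <code>settle_index</code> and the tolerance are illustrative choices, not something the testbench defines.

```shell
#!/bin/sh
# settle_index FILE TARGET TOL
# Print the 1-based index of the first sample from which every remaining
# sample stays within TARGET +/- TOL. Print nothing if the file is empty
# or the last sample is still out of range.
settle_index() {
    awk -v target="$2" -v tol="$3" '
        NF {
            n++
            d = $1 - target
            if (d < 0) d = -d
            # Remember the most recent sample outside the tolerance band
            if (d > tol) last_bad = n
        }
        END { if (n > 0 && last_bad < n) print last_bad + 1 }' "$1"
}

# Usage (hypothetical file name):
#   settle_index fps_accel.txt 30 1
```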
=== <span style="color:#0931C6">Memory consumption</span><br>  ===
'''''Test pipeline (ducatimpeg4enc):'''''
<pre style="background:#d6e4f1">
gst-launch-1.0 -e videotestsrc num-buffers=640 is-live=true ! 'video/x-raw,format=(string)NV12,width=720,height=420,framerate=(fraction)30/1' ! ducatimpeg4enc ! fakesink sync=true
</pre>
'''''Test pipeline (avenc_mpeg4):'''''
<pre style="background:#d6e4f1">
gst-launch-1.0 -e videotestsrc num-buffers=640 is-live=true ! 'video/x-raw,format=(string)I420,width=720,height=420,framerate=(fraction)30/1' ! avenc_mpeg4 ! fakesink sync=true
</pre>
'''''Obtained Results:'''''
[[Image:AM572x-testbench-MPEG4-enc-memuse.png|center|700px|AM572x-testbench-MPEG4-enc-memuse.png]]<br>
The chart above shows that hardware acceleration considerably reduces memory consumption: on average, 4514 KB less memory is used when hardware acceleration is enabled.
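This page does not state how the memory samples were collected. One possible approach on Linux, shown here only as a sketch, is to poll the <code>VmRSS</code> field of <code>/proc/PID/status</code> while the pipeline runs. The helper names <code>vmrss_kb</code> and <code>sample_rss</code> and the output file name are hypothetical.

```shell
#!/bin/sh
# vmrss_kb STATUS_FILE
# Print the VmRSS value (in kB) found in a /proc/PID/status style file.
vmrss_kb() {
    awk '/^VmRSS:/ { print $2; exit }' "$1"
}

# sample_rss PID INTERVAL
# Print one VmRSS sample every INTERVAL seconds until process PID exits.
sample_rss() {
    while kill -0 "$1" 2>/dev/null; do
        vmrss_kb "/proc/$1/status" 2>/dev/null
        sleep "$2"
    done
}

# Usage (assumption: run the test pipeline in the background, then sample it):
#   gst-launch-1.0 -e videotestsrc num-buffers=640 is-live=true ! ... &
#   sample_rss $! 1 > rss_samples.txt
```

Resident set size is only one measure of memory consumption; the figures in the chart may have been obtained with a different metric.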
=== <span style="color:#0931C6">Memory bandwidth consumption</span><br>  ===
'''''Test pipeline (ducatimpeg4enc):'''''
<pre style="background:#d6e4f1">
gst-launch-1.0 -e videotestsrc is-live=true ! 'video/x-raw,format=(string)NV12,width=720,height=420,framerate=(fraction)30/1' ! ducatimpeg4enc ! fakesink sync=true
</pre>
'''''Test pipeline (avenc_mpeg4):'''''
<pre style="background:#d6e4f1">
gst-launch-1.0 -e videotestsrc is-live=true ! 'video/x-raw,format=(string)I420,width=720,height=420,framerate=(fraction)30/1' ! avenc_mpeg4 ! fakesink sync=true
</pre>
Note: In both charts, memory bandwidth consumption is presented separately for sequential (seq) and random (labeled "al", for aleatory) memory access.
'''''Obtained results (memory read bandwidth):'''''
[[Image:AM572x-testbench-MPEG4-enc-readsbandwidth.png|center|700px|AM572x-testbench-MPEG4-enc-readbandwidth.png]]<br>
The chart above shows that hardware acceleration consumes less memory read bandwidth. The average difference is 358.1 MB/s for sequential reads and 446.9 MB/s for random reads.
'''''Obtained results (memory write bandwidth):'''''
[[Image:AM572x-testbench-MPEG4-enc-writebandwidth.png|center|700px|AM572x-testbench-MPEG4-enc-writebandwidth.png]]<br>
The chart above shows that hardware acceleration consumes less memory write bandwidth. The average difference is 1832.7 MB/s for sequential writes and 499.9 MB/s for random writes.
== <span style="color:#008080">H264 video decode</span><br>  ==
This section compares the performance of H264 video decode GStreamer pipelines using hardware-accelerated and software-only implementations. The hardware-accelerated implementation uses gst-plugins-ducati (ducatih264dec element), while the software-only implementation uses gst-plugins-libav (avdec_h264 element). The test pipelines differ only in the H264 decode element: one uses the hardware-accelerated decoder and the other uses the software decoder.
=== <span style="color:#0931C6">CPU load % per core</span><br>  ===
'''''Test pipeline (ducatih264dec):'''''
<pre style="background:#d6e4f1">
GST_TRACER_PLUGINS="cpuusage" gst-launch-1.0 filesrc location=/am5728-gst-tests/video-samples/TearOfSteel-Short-1920x800-H264.mov ! qtdemux name=demux demux.video_0 ! queue ! h264parse ! ducatih264dec ! fakesink sync=true -e
</pre>
'''''Test pipeline (avdec_h264):'''''
<pre style="background:#d6e4f1">
GST_TRACER_PLUGINS="cpuusage" gst-launch-1.0 filesrc location=/am5728-gst-tests/video-samples/TearOfSteel-Short-1920x800-H264.mov ! qtdemux name=demux demux.video_0 ! queue ! h264parse ! avdec_h264 ! fakesink sync=true -e
</pre>
'''''Obtained Results:'''''
[[Image:AM572x-testbench-MPEG4-enc-cpuload.png|center|700px|AM572x-testbench-MPEG4-enc-cpuload.png]]<br>
The chart above clearly shows that hardware acceleration substantially reduces CPU workload. On average, CPU_0_accel carries 49.2 % less load than CPU_0_unaccel, and CPU_1_accel carries 39 % less load than CPU_1_unaccel.



