Streaming RAW Video with GStreamer

Problems running the pipelines shown on this page? Please see our GStreamer Debugging guide for help.


IMX6 RAW Streaming Performance

This test measures the latency of capturing from a sensor at 720x576 resolution on an IMX6 platform and streaming the RAW video to an x86 platform.

Glass-to-glass test

For this test, a VAR-SOM-MX6 board was used.

IMX6 Transmitter to x86 Receiver

IMX6 Transmitter Pipeline
gst-launch-1.0 imxv4l2videosrc imx-capture-mode=3 ! rtpvrawpay ! udpsink host=10.251.101.212 port=5001 sync=false async=false -v

x86 Receiver Pipeline
gst-launch-1.0 udpsrc port=5001 caps = "application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)RAW, sampling=(string)YCbCr-4:2:2, depth=(string)8, \
width=(string)720, height=(string)576, colorimetry=(string)BT601-5, payload=(int)96, ssrc=(uint)155528026, timestamp-offset=(uint)2270520902, seqnum-offset=(uint)27437, \ 
a-framerate=(string)30" ! rtpvrawdepay ! videoconvert ! queue ! xvimagesink sync=false
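
Note that the ssrc, timestamp-offset and seqnum-offset values in the caps above were taken from the transmitter's verbose (-v) output and change on every run; the receiver normally only needs the static fields (media, clock-rate, encoding-name, sampling, depth, width, height, colorimetry, payload, a-framerate), but the safest approach is to copy the full caps string the transmitter prints. A minimal sketch of how to do that (the tx.log file name is just an illustration):

# On the IMX6, keep a copy of the transmitter's verbose output:
gst-launch-1.0 imxv4l2videosrc imx-capture-mode=3 ! rtpvrawpay ! \
  udpsink host=10.251.101.212 port=5001 sync=false async=false -v 2>&1 | tee tx.log

# Then grab the negotiated RTP caps and paste them into the receiver's udpsrc caps property:
grep "caps = application/x-rtp" tx.log | head -n 1
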
Results
Latency Results: Test 1

  Captured time (s)   Received time (s)   Latency (ms)
  0.788               0.346               442
  3.537               3.062               475
  4.978               4.541               437
  5.867               5.377               490
  7.559               7.106               453
  8.938               8.487               451
  9.913               9.428               485
  10.660              10.172              488
  11.623              11.186              437
  13.328              12.886              442

Latency Results: Test 2

  Captured time (s)   Received time (s)   Latency (ms)
  2.545               2.069               476
  3.107               2.632               475
  4.934               4.499               435
  6.595               6.160               435
  7.332               6.811               521
  8.328               7.808               520
  9.410               8.934               476
  10.059              9.584               475
  11.239              10.758              481
  11.673              11.194              479
  14.619              14.186              433

From these tests, the average latency obtained is approximately 470 ms and the CPU usage on the IMX6 is about 43%. On the x86 platform, the CPU usage is between 5% and 15% while receiving the stream.

x86 Transmitter to IMX6 Receiver

x86 Transmitter Pipeline
  • 720x576
gst-launch-1.0 v4l2src ! videoconvert ! videoscale ! videorate ! "video/x-raw,width=720,height=576,format=I420,framerate=25/1"  ! rtpvrawpay ! udpsink host=10.251.101.92 port=5001 sync=false async=false -v 
  • 176x144
gst-launch-1.0 v4l2src ! videoconvert ! videoscale ! "video/x-raw,width=176,height=144,format=UYVY" ! rtpvrawpay ! udpsink host=10.251.101.92 port=5001 sync=false async=false -v
IMX6 Receiver Pipeline
  • 720x576
gst-launch-1.0 udpsrc buffer-size=622080 port=5001 caps="application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)RAW, sampling=YCbCr-4:2:0,depth=(string)8,width=(string)720, height=(string)576,colorimetry=(string)BT601-5, payload=(int)96, a-framerate=25" ! rtpvrawdepay ! imxg2dvideosink window-width=720 window-height=576 sync=true -v
  • 176x144
gst-launch-1.0 udpsrc buffer-size=38016 port=5001 caps="application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)RAW, sampling=YCbCr-4:2:0,depth=(string)8,width=(string)176, height=(string)144,colorimetry=(string)BT601-5, payload=(int)96, a-framerate=25" ! rtpvrawdepay ! imxg2dvideosink window-width=720 window-height=576 sync=true -v
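
The udpsrc buffer-size values above appear to correspond to exactly one I420 frame (width x height x 1.5 bytes); a quick sanity check, assuming that is how they were chosen:

# One I420 (YCbCr 4:2:0) frame is width * height * 1.5 bytes:
echo $(( 720 * 576 * 3 / 2 ))   # 622080 -> buffer-size used for 720x576
echo $(( 176 * 144 * 3 / 2 ))   # 38016  -> buffer-size used for 176x144
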
Results
Latency Results 176x144

  Captured time (s)   Received time (s)   Latency (ms)
  51.730              51.648              82
  50.050              49.926              124
  48.118              48.039              79
  46.225              46.139              86
  44.549              44.420              129
  42.635              42.542              93
  40.593              40.568              25
  38.142              38.056              86
  35.218              35.132              86
  09.632              09.547              85

Latency Results 720x576

  Captured time (s)   Received time (s)   Latency (ms)
  1.936               1.764               172
  2.227               2.056               171
  3.727               3.648               79
  4.452               4.240               212
  5.660               5.487               173
  6.123               5.994               126
  7.016               6.844               172
  8.662               8.533               129
  11.107              10.938              169
  11.925              11.756              169

Build udpsrc for IMX6

The following commands set up the Variscite Yocto (rocko) BSP environment for the VAR-SOM-MX6 and build the GStreamer packages that provide udpsrc (part of gstreamer1.0-plugins-good) together with the IMX6 plugins:

sudo apt-get install gawk wget git-core diffstat unzip texinfo gcc-multilib \
build-essential chrpath socat cpio python python3 python3-pip python3-pexpect \
xz-utils debianutils iputils-ping libsdl1.2-dev xterm

sudo apt-get install autoconf libtool libglib2.0-dev libarchive-dev python-git \
sed cvs subversion coreutils texi2html docbook-utils python-pysqlite2 \
help2man make gcc g++ desktop-file-utils libgl1-mesa-dev libglu1-mesa-dev \
mercurial automake groff curl lzop asciidoc u-boot-tools dos2unix mtd-utils pv \
libncurses5 libncurses5-dev libncursesw5-dev libelf-dev zlib1g-dev

mkdir ~/var-fslc-yocto
cd ~/var-fslc-yocto

repo init -u https://github.com/varigit/variscite-bsp-platform.git -b rocko
repo sync -j4

MACHINE=var-som-mx6 DISTRO=fslc-framebuffer . setup-environment build_fb

bitbake gstreamer1.0-plugins-imx gstreamer1.0-plugins-good
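
Once bitbake finishes, the resulting packages need to be copied to the VAR-SOM-MX6 and installed. A sketch of one way to do this, assuming an opkg-based image and the default deploy directory (the architecture directory, board IP and package file names below are illustrative and depend on your build):

# Built packages end up under the deploy directory of the build:
ls tmp/deploy/ipk/*/gstreamer1.0-plugins-good_*.ipk tmp/deploy/ipk/*/gstreamer1.0-plugins-imx_*.ipk

# Copy them to the board and install them:
scp tmp/deploy/ipk/*/gstreamer1.0-plugins-good_*.ipk tmp/deploy/ipk/*/gstreamer1.0-plugins-imx_*.ipk root@<IMX6-IP>:/tmp/
ssh root@<IMX6-IP> 'opkg install /tmp/gstreamer1.0-plugins-*.ipk'
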

UDP traffic tuning

Linux places fairly restrictive limits on UDP performance by limiting the amount of UDP traffic that is allowed to buffer on the receive socket. Since raw video streaming has high bitrate requirements, we need to tune the socket limits. First check the current UDP/IP receive buffer default and maximum on your IMX6:

root@var-som-mx6:~# sysctl net.core.rmem_max
net.core.rmem_max = 163840
root@var-som-mx6:~# sysctl net.core.rmem_default
net.core.rmem_default = 163840

That is around 160 kB, while a single buffer at the 720x576 resolution is around 607 kB. We recommend increasing the size to at least 8 MB (since several streams are planned for the future, it may need to be increased even further). As a superuser, run the following commands before executing the pipelines:


sysctl -w net.core.rmem_max=8388608
sysctl -w net.core.wmem_max=8388608
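
These sysctl changes do not survive a reboot. If you want them to be persistent, one common approach (not part of the original test) is to add them to the sysctl configuration and reload it:

# Persist the larger socket buffer limits across reboots:
cat << 'EOF' >> /etc/sysctl.conf
net.core.rmem_max=8388608
net.core.wmem_max=8388608
EOF
sysctl -p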


Please check that the quality issues no longer appear when receiving the stream with the following pipelines:

  • Pipeline on Host-PC
gst-launch-1.0 v4l2src ! videoconvert ! videoscale ! videorate ! "video/x-raw,width=720,height=576,format=I420,framerate=25/1"  ! rtpvrawpay ! udpsink host=<IMX6-IP> port=5001 sync=false async=false -v 
  • Pipeline on IMX6
gst-launch-1.0 udpsrc buffer-size=622080 port=5001 caps="application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)RAW, sampling=YCbCr-4:2:0,depth=(string)8,width=(string)720, height=(string)576,colorimetry=(string)BT601-5, payload=(int)96, a-framerate=25" ! rtpvrawdepay !  imxg2dvideosink window-width=720 window-height=576 sync=true -v

After this change, the CPU usage jumps to around 74% on the IMX6 for a single pipeline, but that is expected; the next section applies a trick to decrease that value.

UDP pipeline tuning

After the previous change, the CPU usage jumps from 54% to around 74% on the IMX6. This is because the socket buffers are now large enough, so there are no more quality drops, at the expense of higher CPU usage (more packets are actually received and processed). A possible way out is to increase the MTU size of the RTP packets created by rtpvrawpay; the fragmentation of these large packets then happens in the lower network layers on the PC, and their reassembly on the IMX6 is also handled by those layers instead of by the GStreamer elements.
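
As a rough illustration of why this helps (the packet counts below are estimates based on the 622080-byte frame size, not measurements): with the payloader's default MTU of roughly 1400 bytes, each 720x576 I420 frame becomes several hundred RTP packets that the receiving pipeline must handle one by one, while with mtu=60000 the same frame fits in about a dozen RTP packets and the splitting and reassembly of those large UDP datagrams is left to the kernel's IP stack.

# Approximate RTP packets per 720x576 I420 frame (622080 bytes):
echo $(( 622080 / 1400 ))    # ~444 packets with the default payloader MTU (~1400 bytes)
echo $(( 622080 / 60000 ))   # ~10 packets with mtu=60000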


UDP receiving only on IMX6

Reference pipelines without MTU tuning

  • Pipeline on Host-PC
gst-launch-1.0 v4l2src device=/dev/video0 ! videoconvert ! videoscale ! videorate ! "video/x-raw,width=720,height=576,format=I420,framerate=25/1"  !  rtpvrawpay  ! udpsink host=<IMX6-IP>  port=5001 sync=false async=false -v
  • Pipeline on IMX6
gst-launch-1.0 udpsrc buffer-size=622080 port=5001 caps="application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)RAW, sampling=YCbCr-4:2:0,depth=(string)8,width=(string)720, height=(string)576,colorimetry=(string)BT601-5, payload=(int)96, a-framerate=25" ! fakesink -v


On the IMX6 the pipeline has a CPU usage of around 66%.

Reference pipelines after MTU tuning

  • Pipeline on Host-PC
gst-launch-1.0 v4l2src device=/dev/video0 ! videoconvert ! videoscale ! videorate ! "video/x-raw,width=720,height=576,format=I420,framerate=25/1"  !  rtpvrawpay mtu=60000  ! udpsink host=<IMX6-IP>  port=5001 sync=false async=false -v
  • Pipeline on IMX6
gst-launch-1.0 udpsrc buffer-size=622080 port=5001 caps="application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)RAW, sampling=YCbCr-4:2:0,depth=(string)8,width=(string)720, height=(string)576,colorimetry=(string)BT601-5, payload=(int)96, a-framerate=25" ! fakesink -v


The following report, generated by Wireshark on the PC, shows that the network traffic is still fragmented by the lower network layers.

Wireshark packet fragmentation

On the IMX6 the pipeline has a CPU usage of around 15% to 20%.

UDP receiving on IMX6 + depayloader

  • Pipeline on Host-PC
gst-launch-1.0 v4l2src device=/dev/video0 ! videoconvert ! videoscale ! videorate ! "video/x-raw,width=720,height=576,format=I420,framerate=25/1"  !  rtpvrawpay  mtu=6000  ! udpsink host=<IMX6-IP>  port=5001 sync=false async=false -v
  • Pipeline on IMX6
gst-launch-1.0 udpsrc buffer-size=622080 port=5001 caps="application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)RAW, sampling=YCbCr-4:2:0,depth=(string)8,width=(string)720, height=(string)576,colorimetry=(string)BT601-5, payload=(int)96, a-framerate=25" ! rtpvrawdepay !  fakesink -v

On the IMX6 the pipeline has a CPU usage of around 29% to 37%.

UDP receiving on IMX6 + depayloader + display

  • Pipeline on Host-PC
gst-launch-1.0 v4l2src device=/dev/video0 ! videoconvert ! videoscale ! videorate ! "video/x-raw,width=720,height=576,format=I420,framerate=25/1"  !  rtpvrawpay  mtu=6000  ! udpsink host=<IMX6-IP>  port=5001 sync=false async=false -v
  • Pipeline on IMX6
gst-launch-1.0 udpsrc buffer-size=622080 port=5001 caps="application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)RAW, sampling=YCbCr-4:2:0,depth=(string)8,width=(string)720, height=(string)576,colorimetry=(string)BT601-5, payload=(int)96, a-framerate=25" ! rtpvrawdepay !  imxg2dvideosink window-width=720 window-height=576 sync=true -v

On the IMX6 the pipeline has a CPU usage of around 32% to 40%.

Latency Results 720x576

  Captured time (s)   Received time (s)   Latency (ms)
  12.733              12.606              127
  15.038              14.907              131
  16.335              16.205              130
  17.629              17.466              163
  19.414              19.282              132
  21.659              21.525              134
  28.816              28.686              143
  32.593              32.450              143
  34.057              33.938              119
  35.656              35.544              112

For a preliminary test the latency is around 133.4 ms. Please confirm these results on your side. It also seems that most of the CPU-bound work is currently being spent in rtpvrawdepay (from the previous tests, just adding the depayloader adds around 15% of CPU usage), so we recommend checking this element as the next step.
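
One way to confirm where the CPU time is going (a suggestion, not something measured in the tests above) is to profile per-element processing time, for example with the GstShark proctime tracer if it is installed on the board:

# Requires the gst-shark tracers; per-element processing times are printed in the GStreamer log:
GST_DEBUG="GST_TRACER:7" GST_TRACERS="proctime" \
gst-launch-1.0 udpsrc buffer-size=622080 port=5001 caps="application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)RAW, sampling=YCbCr-4:2:0, depth=(string)8, width=(string)720, height=(string)576, colorimetry=(string)BT601-5, payload=(int)96, a-framerate=25" ! rtpvrawdepay ! fakesink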



For direct inquiries, please refer to the contact information available on our Contact page. Alternatively, you may complete and submit the form provided at the same link. We will respond to your request at our earliest opportunity.

