Streaming RAW Video with GStreamer

From RidgeRun Developer Wiki


IMX6 RAW Streaming Performance

This test measures the latency when capturing from a sensor at 720x576 resolution on an IMX6 platform and streaming the RAW video to an x86 platform.

Glass to glass test

A VAR-SOM-MX6 board was used for this test.

IMX6 Transmitter to X86 Receiver

IMX6 Transmitter Pipeline

gst-launch-1.0 imxv4l2videosrc imx-capture-mode=3 ! rtpvrawpay ! udpsink host=10.251.101.212 port=5001 sync=false async=false -v

x86 Receiver Pipeline

gst-launch-1.0 udpsrc port=5001 caps="application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)RAW, sampling=(string)YCbCr-4:2:2, depth=(string)8, \
width=(string)720, height=(string)576, colorimetry=(string)BT601-5, payload=(int)96, ssrc=(uint)155528026, timestamp-offset=(uint)2270520902, seqnum-offset=(uint)27437, \
a-framerate=(string)30" ! rtpvrawdepay ! videoconvert ! queue ! xvimagesink sync=false

Results

Latency Results: Test 1
Captured time (s)  Received time (s)  Latency (ms)
0.788              0.346              442
3.537              3.062              475
4.978              4.541              437
5.867              5.377              490
7.559              7.106              453
8.938              8.487              451
9.913              9.428              485
10.660             10.172             488
11.623             11.186             437
13.328             12.886             442

Latency Results: Test 2
Captured time (s)  Received time (s)  Latency (ms)
2.545              2.069              476
3.107              2.632              475
4.934              4.499              435
6.595              6.160              435
7.332              6.811              521
8.328              7.808              520
9.410              8.934              476
10.059             9.584              475
11.239             10.758             481
11.673             11.194             479
14.619             14.186             433

Across these tests the average latency is approximately 470 ms, and the CPU usage on the IMX6 is about 43%. On the x86 platform the CPU usage is between 5% and 15% while receiving the stream.
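The average can be cross-checked from the latency columns of the two tables. The sketch below assumes a POSIX shell with awk available; the helper name mean_ms is illustrative and not part of the original test setup.

```shell
# Latency column (ms) copied from Test 1 and Test 2 above.
test1="442 475 437 490 453 451 485 488 437 442"
test2="476 475 435 435 521 520 476 475 481 479 433"

# mean_ms: print the arithmetic mean of its arguments with one decimal place.
mean_ms() {
    printf '%s\n' "$@" | awk '{ sum += $1; n++ } END { if (n) printf "%.1f\n", sum / n }'
}

# The variables are left unquoted on purpose so each value becomes one argument.
mean_ms $test1          # Test 1 only
mean_ms $test2          # Test 2 only
mean_ms $test1 $test2   # both tests combined
```

The combined mean comes out at about 467 ms, in line with the approximate figure quoted above.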

x86 Transmitter to IMX6 Receiver

x86 Transmitter Pipeline

  • 720x576
gst-launch-1.0 v4l2src ! videoconvert ! videoscale ! videorate ! "video/x-raw,width=720,height=576,format=I420,framerate=25/1"  ! rtpvrawpay ! udpsink host=10.251.101.92 port=5001 sync=false async=false -v 
  • 176x144
gst-launch-1.0 v4l2src ! videoconvert ! videoscale ! "video/x-raw,width=176,height=144,format=I420" ! rtpvrawpay ! udpsink host=10.251.101.92 port=5001 sync=false async=false -v

IMX6 Receiver Pipeline

  • 720x576
gst-launch-1.0 udpsrc buffer-size=622080 port=5001 caps="application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)RAW, sampling=YCbCr-4:2:0,depth=(string)8,width=(string)720, height=(string)576,colorimetry=(string)BT601-5, payload=(int)96, a-framerate=25" ! rtpvrawdepay ! imxg2dvideosink window-width=720 window-height=576 sync=true -v
  • 176x144
gst-launch-1.0 udpsrc buffer-size=38016 port=5001 caps="application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)RAW, sampling=YCbCr-4:2:0,depth=(string)8,width=(string)176, height=(string)144,colorimetry=(string)BT601-5, payload=(int)96, a-framerate=25" ! rtpvrawdepay ! imxg2dvideosink window-width=720 window-height=576 sync=true -v
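The buffer-size values in the receiver pipelines above equal the size of one 8-bit I420 (YCbCr 4:2:0) frame at each resolution. A minimal sketch of that arithmetic, assuming a shell with POSIX arithmetic expansion; the function name i420_frame_bytes is illustrative only.

```shell
# i420_frame_bytes WIDTH HEIGHT: bytes in one 8-bit I420 (YCbCr 4:2:0) frame.
# A full-resolution Y plane plus two quarter-resolution chroma planes
# gives 1.5 bytes per pixel.
i420_frame_bytes() {
    echo $(( $1 * $2 * 3 / 2 ))
}

i420_frame_bytes 720 576   # 622080
i420_frame_bytes 176 144   # 38016
```

For another resolution or pixel format the value would need to be recomputed; for example, a 4:2:2 format uses 2 bytes per pixel instead of 1.5.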

Results

Latency Results 176x144
Captured time (s)  Received time (s)  Latency (ms)
51.730             51.648             82
50.050             49.926             124
48.118             48.039             79
46.225             46.139             86
44.549             44.420             129
42.635             42.542             93
40.593             40.568             25
38.142             38.056             86
35.218             35.132             86
09.632             09.547             85

Latency Results 720x576
Captured time (s)  Received time (s)  Latency (ms)
1.936              1.764              172
2.227              2.056              171
3.727              3.648              79
4.452              4.240              212
5.660              5.487              173
6.123              5.994              126
7.016              6.844              172
8.662              8.533              129
11.107             10.938             169
11.925             11.756             169

Build udpsrc for IMX6

sudo apt-get install gawk wget git-core diffstat unzip texinfo gcc-multilib \
build-essential chrpath socat cpio python python3 python3-pip python3-pexpect \
xz-utils debianutils iputils-ping libsdl1.2-dev xterm

sudo apt-get install autoconf libtool libglib2.0-dev libarchive-dev python-git \
sed cvs subversion coreutils texi2html docbook-utils python-pysqlite2 \
help2man make gcc g++ desktop-file-utils libgl1-mesa-dev libglu1-mesa-dev \
mercurial automake groff curl lzop asciidoc u-boot-tools dos2unix mtd-utils pv \
libncurses5 libncurses5-dev libncursesw5-dev libelf-dev zlib1g-dev

mkdir ~/var-fslc-yocto
cd ~/var-fslc-yocto

repo init -u https://github.com/varigit/variscite-bsp-platform.git -b rocko
repo sync -j4

MACHINE=var-som-mx6 DISTRO=fslc-framebuffer . setup-environment build_fb

bitbake gstreamer1.0-plugins-imx gstreamer1.0-plugins-good

UDP traffic tuning

Linux places restrictive limits on UDP performance by capping the amount of UDP traffic that can be buffered on the receive socket. Since we have high bitrate requirements, we need to raise the socket limits. First check the current UDP/IP receive buffer default and limit on your IMX6:

root@var-som-mx6:~# sysctl net.core.rmem_max
net.core.rmem_max = 163840
root@var-som-mx6:~# sysctl net.core.rmem_default
net.core.rmem_default = 163840

That is around 160 kB, while a single buffer at 720x576 resolution is around 607 kB. I recommend increasing the limit to at least 8 MB; since we are planning to run several streams in the future, it might need to be increased even further. Run the following commands as root (or with sudo) before executing the pipelines:


$ sysctl -w net.core.rmem_max=8388608
$ sysctl -w net.core.wmem_max=8388608
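To put these limits in perspective, the sketch below estimates the payload rate of the 720x576 I420 stream at 25 fps and how many milliseconds of video each buffer size can hold. It is a rough estimate only: RTP/UDP/IP headers and kernel bookkeeping overhead are ignored, and the function names are illustrative.

```shell
# payload_bytes_per_sec WIDTH HEIGHT FPS: raw I420 payload rate in bytes per second.
payload_bytes_per_sec() {
    echo $(( $1 * $2 * 3 / 2 * $3 ))
}

# buffer_depth_ms BUFFER_BYTES WIDTH HEIGHT FPS: milliseconds of video that
# fit in a socket buffer of the given size (integer division, rounds down).
buffer_depth_ms() {
    echo $(( $1 * 1000 / $(payload_bytes_per_sec "$2" "$3" "$4") ))
}

rate=$(payload_bytes_per_sec 720 576 25)
echo "payload rate: $rate bytes/s ($(( rate * 8 / 1000000 )) Mbit/s)"
echo "163840-byte buffer:  $(buffer_depth_ms 163840 720 576 25) ms"
echo "8388608-byte buffer: $(buffer_depth_ms 8388608 720 576 25) ms"
```

Under these assumptions the stream carries roughly 124 Mbit/s of payload, so the default limit holds only about 10 ms of video (less than one frame), while 8 MB holds about half a second.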


Please check that the "quality" issues no longer appear in the received stream when executing the following pipelines:

  • Pipeline on Host-PC
gst-launch-1.0 v4l2src ! videoconvert ! videoscale ! videorate ! "video/x-raw,width=720,height=576,format=I420,framerate=25/1"  ! rtpvrawpay ! udpsink host=<IMX6-IP> port=5001 sync=false async=false -v 
  • Pipeline on IMX6
gst-launch-1.0 udpsrc buffer-size=622080 port=5001 caps="application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)RAW, sampling=YCbCr-4:2:0,depth=(string)8,width=(string)720, height=(string)576,colorimetry=(string)BT601-5, payload=(int)96, a-framerate=25" ! rtpvrawdepay !  imxg2dvideosink window-width=720 window-height=576 sync=true -v

After this change, the CPU usage on the IMX6 jumps to around 74% for a single pipeline, but that is acceptable; the next section describes a way to decrease that value.

UDP pipeline tuning

After the previous change, the CPU usage on the IMX6 jumps from 54% to around 74%. This happens because the socket limits are now correct: packets are no longer dropped and the quality no longer degrades, at the expense of higher CPU usage, since more packets are received. A possible way out is to increase the MTU of the RTP packets created by rtpvrawpay. Each large packet is then split by the network stack of the PC (IP fragmentation) rather than by the payloader, and its reassembly on the IMX6 is likewise handled by the network stack before the packet reaches GStreamer.
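The effect of the MTU on the number of RTP packets that GStreamer has to handle can be estimated as follows. This is a rough lower bound that ignores the RTP and raw-video payload headers. The value 1400 is assumed here to be the payloader's default mtu, and the function name is illustrative.

```shell
# packets_per_frame FRAME_BYTES MTU: lower bound on RTP packets per frame,
# rounding up and ignoring RTP and payload headers.
packets_per_frame() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

frame=622080   # one 720x576 I420 frame
fps=25
for mtu in 1400 6000 60000; do
    n=$(packets_per_frame "$frame" "$mtu")
    echo "mtu=$mtu: $n packets/frame, $(( n * fps )) packets/s"
done
```

Under these assumptions the element-level packet rate drops from about 11125 packets/s at mtu=1400 to 2600 at mtu=6000 and 275 at mtu=60000. The number of frames on the wire does not shrink in the same way, since the large UDP datagrams are still fragmented to fit the link.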


UDP receiving only on IMX6

Reference pipelines without MTU tuning

  • Pipeline on Host-PC
gst-launch-1.0 v4l2src device=/dev/video0 ! videoconvert ! videoscale ! videorate ! "video/x-raw,width=720,height=576,format=I420,framerate=25/1"  !  rtpvrawpay  ! udpsink host=<IMX6-IP>  port=5001 sync=false async=false -v
  • Pipeline on IMX6
gst-launch-1.0 udpsrc buffer-size=622080 port=5001 caps="application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)RAW, sampling=YCbCr-4:2:0,depth=(string)8,width=(string)720, height=(string)576,colorimetry=(string)BT601-5, payload=(int)96, a-framerate=25" ! fakesink -v


On the IMX6 the pipeline has a CPU usage of around 66%.

Reference pipelines after MTU tuning

  • Pipeline on Host-PC
gst-launch-1.0 v4l2src device=/dev/video0 ! videoconvert ! videoscale ! videorate ! "video/x-raw,width=720,height=576,format=I420,framerate=25/1"  !  rtpvrawpay mtu=60000  ! udpsink host=<IMX6-IP>  port=5001 sync=false async=false -v
  • Pipeline on IMX6
gst-launch-1.0 udpsrc buffer-size=622080 port=5001 caps="application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)RAW, sampling=YCbCr-4:2:0,depth=(string)8,width=(string)720, height=(string)576,colorimetry=(string)BT601-5, payload=(int)96, a-framerate=25" ! fakesink -v


The following report, generated by Wireshark on the PC, shows that the network traffic is still fragmented on the wire.

Wireshark packet fragmentation

On the IMX6 the pipeline has a CPU usage of around 15% to 20%.

UDP receiving on IMX6 + depayloader

  • Pipeline on Host-PC
gst-launch-1.0 v4l2src device=/dev/video0 ! videoconvert ! videoscale ! videorate ! "video/x-raw,width=720,height=576,format=I420,framerate=25/1"  !  rtpvrawpay  mtu=6000  ! udpsink host=<IMX6-IP>  port=5001 sync=false async=false -v
  • Pipeline on IMX6
gst-launch-1.0 udpsrc buffer-size=622080 port=5001 caps="application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)RAW, sampling=YCbCr-4:2:0,depth=(string)8,width=(string)720, height=(string)576,colorimetry=(string)BT601-5, payload=(int)96, a-framerate=25" ! rtpvrawdepay !  fakesink -v

On the IMX6 the pipeline has a CPU usage of around 29% to 37%.

UDP receiving on IMX6 + depayloader + display

  • Pipeline on Host-PC
gst-launch-1.0 v4l2src device=/dev/video0 ! videoconvert ! videoscale ! videorate ! "video/x-raw,width=720,height=576,format=I420,framerate=25/1"  !  rtpvrawpay  mtu=6000  ! udpsink host=<IMX6-IP>  port=5001 sync=false async=false -v
  • Pipeline on IMX6
gst-launch-1.0 udpsrc buffer-size=622080 port=5001 caps="application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)RAW, sampling=YCbCr-4:2:0,depth=(string)8,width=(string)720, height=(string)576,colorimetry=(string)BT601-5, payload=(int)96, a-framerate=25" ! rtpvrawdepay !  imxg2dvideosink window-width=720 window-height=576 sync=true -v

On the IMX6 the pipeline has a CPU usage of around 32% to 40%.

Latency Results 720x576
Captured time (s)  Received time (s)  Latency (ms)
12.733             12.606             127
15.038             14.907             131
16.335             16.205             130
17.629             17.466             163
19.414             19.282             132
21.659             21.525             134
28.816             28.686             143
32.593             32.450             143
34.057             33.938             119
35.656             35.544             112

In this preliminary test the average latency is around 133.4 ms. Please confirm these results on your side. It also seems that most of the CPU-bound work is currently spent in rtpvrawdepay (in our previous tests, just adding the depayloader added around 15% of CPU usage), so we recommend checking this element as the next step.
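The per-stage CPU cost can be estimated from the midpoints of the usage ranges reported above. This is only an approximation: the receive-only measurement used mtu=60000 while the other two used mtu=6000, so the stages are not strictly comparable. The helper name stage_cost is illustrative, and awk is assumed to be available.

```shell
# stage_cost LOW_A HIGH_A LOW_B HIGH_B: difference between the midpoints of
# two CPU-usage ranges (B minus A), in percentage points.
stage_cost() {
    awk -v la="$1" -v ha="$2" -v lb="$3" -v hb="$4" \
        'BEGIN { printf "%.1f\n", (lb + hb) / 2 - (la + ha) / 2 }'
}

stage_cost 15 20 29 37   # udpsrc only -> udpsrc + rtpvrawdepay
stage_cost 29 37 32 40   # udpsrc + rtpvrawdepay -> adding imxg2dvideosink
```

With the reported ranges, the depayloader accounts for roughly 15.5 percentage points and the display sink for about 3.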