RidgeRun D3 NVIDIA Partner Showcase Jetson Xavier Multi-Camera AI Demo
Introduction
This demo shows the capabilities of the NVIDIA® Jetson Xavier™ by performing multi-camera capture over FPD-Link III with virtual channel support, displaying each individual camera stream on a grid, applying CUDA video processing filters, running classification and detection inference, performing video stabilization, overlaying pictures and text, and streaming video over the network.
Demo setup
- NVIDIA® Jetson AGX Xavier™ platform.
- JetPack 4.2.0 system image.
- D3 Engineering Jetson AGX Xavier FPD-Link III Interface Card (16 FPD-Link III channels).
- D3RCM-OV10640-953 Rugged Camera Modules (8 units).
- Internet access to the Xavier board.
Demo features
This demo captures from 8x D3RCM-OV10640-953 Rugged Camera Modules at 1280x1080 @30fps using a serialized FPD-Link III interface and virtual channels on the NVIDIA Jetson AGX Xavier platform. The 8 camera streams are downscaled to 480x480 resolution and displayed on a 1920x960 grid. Additional processing is applied to the different camera streams, as summarized below:
- Camera_1: This camera stream has no extra processing; it is a normal camera stream, intended as a point of comparison/reference against the streams with CUDA video processing filters (cameras 2, 3, and 4).
- Camera_2: This camera stream has a Sobel X-axis CUDA video filter applied with the GstCUDA plugin. GstCUDA is a RidgeRun-developed GStreamer plug-in that enables easy integration of CUDA algorithms into GStreamer pipelines. The Sobel filter is used in image processing and computer vision, particularly within edge detection algorithms, where it creates an image emphasizing edges. The X-axis variant detects and enhances vertical edges in the processed image. The output image is the result of convolving the incoming image with a 3x3 kernel (the kernels, and a standalone pipeline sketch, are shown after this list).
- Camera_3: This camera stream has a Border Enhancement CUDA video filter applied with the GstCUDA plugin. The Border Enhancement filter consists of the Sobel X-axis and Sobel Y-axis filters applied to the incoming images. The Sobel operator approximates the image gradient at each pixel by convolving the image with a pair of 3×3 kernels that estimate the gradients in the horizontal (x) and vertical (y) directions; the gradient magnitude is then approximated as the sum of these two gradients.
- Camera_4: This camera stream has a Grayscale CUDA video filter applied with the GstCUDA plugin. The Grayscale filter is a very simple filter that just turns off the chroma (UV) planes of the incoming I420 frames and leaves the luma (Y) plane untouched.
- Camera_5: This camera stream has no extra processing; it is a normal camera stream, intended as a point of comparison/reference against the stream with video stabilization processing (camera 6).
- Camera_6: This camera stream has a video stabilization algorithm applied with the GstNvStabilize plugin. GstNvStabilize is a GStreamer plug-in that performs GPU-accelerated video stabilization on a sequence of images. Specifically, the algorithm uses internal modules provided by the NVIDIA VisionWorks toolkit and OpenVX to compute Harris feature detection and Lucas-Kanade sparse pyramidal optical flow, which provides the inter-frame motion estimation. A standalone sketch of this branch is also shown after this list.
- Camera_7: This camera stream is processed by a classification deep learning network: InceptionV1 classification inference applied with the GstInference plugin using GPU-accelerated TensorFlow. GstInference is an ongoing open-source project from RidgeRun Engineering that allows easy integration of deep learning networks into your existing pipeline. This example classifies each incoming frame into one of 1000 possible classes (the ImageNet classes). Simultaneously, the pipeline displays the captured frames with the associated label in a window.
- Camera_8: This camera stream is processed by a detection deep learning network: TinyYoloV2 detection inference applied with the GstInference plugin using GPU-accelerated TensorFlow. This example detects objects in each buffer; the possible objects correspond to the TinyYoloV2 classes. Simultaneously, the pipeline displays the captured frames with every detected object marked with a label and a bounding box.
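For reference (not spelled out in the original demo documentation), the standard 3×3 Sobel kernels behind cameras 2 and 3, and the magnitude approximation described above, are:

$$G_x = \begin{bmatrix} -1 & 0 & +1 \\ -2 & 0 & +2 \\ -1 & 0 & +1 \end{bmatrix} * I \qquad G_y = \begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ +1 & +2 & +1 \end{bmatrix} * I \qquad |G| \approx |G_x| + |G_y|$$

where I is the input image and * denotes 2D convolution.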
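A minimal single-camera sketch of the CUDA filter branch, assuming the GstCUDA cudafilter element is installed and a compiled filter such as sobelx.so is available in the current directory (the relative path and sensor-id are illustrative, not taken from the demo):

# Capture at 1280x1080, downscale to 480x480 in NVMM memory, apply the
# CUDA filter, and display the result.
gst-launch-1.0 \
  nvarguscamerasrc sensor-id=0 ! \
  'video/x-raw(memory:NVMM),width=1280,height=1080,framerate=30/1,format=NV12' ! \
  nvvidconv ! 'video/x-raw(memory:NVMM),width=480,height=480,format=I420' ! \
  cudafilter in-place=false location=sobelx.so ! \
  'video/x-raw,width=480,height=480,format=I420' ! \
  nvvidconv ! nvoverlaysink sync=false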
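Similarly, a minimal sketch of the stabilization branch from camera 6, assuming the GstNvStabilize plugin and its VisionWorks dependencies are installed (the nvstabilize properties match the demo script; the display sink is illustrative):

# Downscale to 480x480 RGBA in system memory, stabilize, and display.
gst-launch-1.0 \
  nvarguscamerasrc sensor-id=0 ! \
  'video/x-raw(memory:NVMM),width=1280,height=1080,framerate=30/1,format=NV12' ! \
  nvvidconv ! 'video/x-raw,width=480,height=480,format=RGBA' ! \
  nvstabilize crop-margin=0.2 queue-size=6 ! \
  videoconvert ! autovideosink sync=false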
The CUDA filters and video stabilization processing are applied to the downscaled 480x480 streams. The inference branches downscale the video to the resolution each model requires, but the inference result overlay is applied to the full 1280x1080 frames.
The company logo is overlaid on the grid display using the GStreamer emboverlay plugin. Emboverlay is a GStreamer element that can overlay images, text, and/or time and date on video streams or photos without heavy floating-point arithmetic, which is necessary for good performance on embedded devices.
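A minimal sketch of the logo overlay in isolation, using the same emboverlay properties as the demo script below (videotestsrc stands in for the mixed grid; the logo path is illustrative):

# Overlay a PNG logo on the top-right corner of a 1920x960 frame.
gst-launch-1.0 videotestsrc ! 'video/x-raw,width=1920,height=960,format=I420' ! \
  emboverlay logo=RR-D3-logo.png logo-offsetv=0 logo-offseth=1440 \
             logo-transparency=0 logo-enable=true ! \
  videoconvert ! autovideosink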
One camera stream, selected by the user from the demo menu, is streamed over the network using the GstWebRTC plugin and an OpenWebRTC demo application running on a server. This makes the selected camera stream accessible from any device with an internet connection (PC, tablet, smartphone) by simply visiting the following URL: https://webrtc.ridgerun.com:8443/
The multimedia server of the demo was built using GStreamer Daemon as the base of the script. GStreamer Daemon (gstd) is an open-source GStreamer framework that allows building a multimedia server in under 30 minutes, making it a very practical option for multimedia applications.
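As a minimal illustration of the workflow, here are the same gstd-client verbs the demo script uses, applied to a trivial test pipeline:

# Start the daemon, then create and play a pipeline by name.
gstd &
gstd-client pipeline_create testpipe "videotestsrc ! autovideosink"
gstd-client pipeline_play testpipe
# ... later, tear it down:
gstd-client pipeline_stop testpipe
gstd-client pipeline_delete testpipe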
Demo code
This section shows the script used in the demo. Note how, in just a few lines of code, such a complex multimedia server application can be up and running. This is achieved thanks to the GStreamer Daemon framework and the different GStreamer plugins developed by RidgeRun.
#!/bin/bash

echo -e "\n ====== RidgeRun D3 Nvidia Partner Showcase Jetson Xavier Multi-Camera AI Demo ====== \n"

# Configure the system
export DISPLAY=:0
echo > demo.log
echo > nvargus.log
sudo service nvargus-daemon stop
sudo pkill -9 nvargus-daemon
sudo jetson_clocks
sleep 1
sudo enableCamInfiniteTimeout=1 nvargus-daemon &>nvargus.log &
sleep 1

# Configure Demo Variables
ROOM_ID=rrdemo
SERVER_URL='https://webrtc.ridgerun.com:8443'
CAPTURE_CAPS='video/x-raw(memory:NVMM),width=1280,height=1080,framerate=30/1,format=NV12'
DOWNSCALE_CAPS='video/x-raw,width=480,height=480,format=I420,pixel-aspect-ratio=1/1'
DOWNSCALE_NVMM_CAPS='video/x-raw(memory:NVMM),width=480,height=480,format=I420,pixel-aspect-ratio=1/1'
DOWNSCALE_CAPS_RGBA='video/x-raw,width=480,height=480,format=RGBA'
DOWNSCALE_NVMM_CAPS_RGBA='video/x-raw(memory:NVMM),width=480,height=480,format=RGBA'
GSTCUDA_IN_CAPS='video/x-raw(memory:NVMM),width=480,height=480,format=I420,pixel-aspect-ratio=1/1'
GSTCUDA_OUT_CAPS='video/x-raw,width=480,height=480,format=I420,pixel-aspect-ratio=1/1'
DISPLAY_CAPS='video/x-raw(memory:NVMM),width=1920,height=960,format=I420'
NVMM_CAPS='video/x-raw(memory:NVMM)'
TY_MODEL_LOCATION='/home/nvidia/jetson-multi-camera-demo/TinyYoloV2_TensorFlow/graph_tinyyolov2_tensorflow.pb'
TY_INPUT_LAYER='input/Placeholder'
TY_OUTPUT_LAYER='add_8'
TY_LABELS='/home/nvidia/jetson-multi-camera-demo/TinyYoloV2_TensorFlow/labels.txt'
TY_OUT_CAPS='video/x-raw,width=1280,height=1080,format=RGBA'
INC_MODEL_LOCATION='/home/nvidia/jetson-multi-camera-demo/InceptionV1_TensorFlow/graph_inceptionv1_tensorflow.pb'
INC_INPUT_LAYER='input'
INC_OUTPUT_LAYER='InceptionV1/Logits/Predictions/Reshape_1'
INC_LABELS='/home/nvidia/jetson-multi-camera-demo/InceptionV1_TensorFlow/imagenet_labels.txt'
INC_OUT_CAPS='video/x-raw,width=1280,height=1080,format=BGRx'
END_MSJ="!!! Jetson Xavier Multi-Camera AI Demo Finished !!!"

# Graceful cleanup upon CTRL-C
trap \
"killall gstd;"\
"echo -e $END_MSJ; exit" SIGINT

# Launch GStreamer Daemon
echo -e "\n ====> Launching GStreamer Daemon \n"
killall gstd &> /dev/null
sleep 1
gstd &>> demo.log &
sleep 1

# Create pipelines

# Camera_1 Pipeline: No extra processing, just normal camera stream.
CAM1="nvarguscamerasrc sensor-id=0 name=cam1 aelock=true awblock=true wbmode=0 ! $CAPTURE_CAPS ! \
queue max-size-buffers=1 leaky=downstream ! nvvidconv ! $GSTCUDA_OUT_CAPS ! \
interpipesink name=psink_cam1 caps=$GSTCUDA_OUT_CAPS sync=false qos=false async=false \
enable-last-sample=false drop=true max-buffers=2 forward-eos=false forward-events=false \
"

# Camera_2 Pipeline: Sobel in X-axis CUDA video filter applied with GstCUDA plugin.
CAM2="nvarguscamerasrc sensor-id=1 name=cam2 aelock=true awblock=true wbmode=0 ! $CAPTURE_CAPS ! \
queue max-size-buffers=1 leaky=downstream ! nvvidconv ! $GSTCUDA_IN_CAPS ! queue max-size-buffers=1 leaky=downstream ! \
cudafilter in-place=false location=/home/nvidia/jetson-multi-camera-demo/GSTCUDA_FILTERS/sobelx.so ! $GSTCUDA_OUT_CAPS ! \
queue max-size-buffers=1 leaky=downstream ! interpipesink name=psink_cam2 caps=$GSTCUDA_OUT_CAPS sync=false qos=false async=false \
enable-last-sample=false drop=true max-buffers=2 forward-eos=false forward-events=false \
"

# Camera_3 Pipeline: Border Enhancement CUDA video filter applied with GstCUDA plugin.
CAM3="nvarguscamerasrc sensor-id=2 name=cam3 aelock=true awblock=true wbmode=0 ! $CAPTURE_CAPS ! \
queue max-size-buffers=1 leaky=downstream ! nvvidconv ! $GSTCUDA_IN_CAPS ! queue max-size-buffers=1 leaky=downstream ! \
cudafilter in-place=false location=/home/nvidia/jetson-multi-camera-demo/GSTCUDA_FILTERS/border-enhancement.so ! $GSTCUDA_OUT_CAPS ! \
queue max-size-buffers=1 leaky=downstream ! interpipesink name=psink_cam3 caps=$GSTCUDA_OUT_CAPS sync=false qos=false async=false \
enable-last-sample=false drop=true max-buffers=2 forward-eos=false forward-events=false \
"

# Camera_4 Pipeline: Grayscale CUDA video filter applied with GstCUDA plugin.
CAM4="nvarguscamerasrc sensor-id=3 name=cam4 aelock=true awblock=true wbmode=0 ! $CAPTURE_CAPS ! \
queue max-size-buffers=1 leaky=downstream ! nvvidconv ! $GSTCUDA_IN_CAPS ! queue max-size-buffers=1 leaky=downstream ! \
cudafilter in-place=false location=/home/nvidia/jetson-multi-camera-demo/GSTCUDA_FILTERS/gray-scale-filter.so ! $GSTCUDA_OUT_CAPS ! \
queue max-size-buffers=1 leaky=downstream ! interpipesink name=psink_cam4 caps=$GSTCUDA_OUT_CAPS sync=false qos=false async=false \
enable-last-sample=false drop=true max-buffers=2 forward-eos=false forward-events=false \
"

# Camera_5 Pipeline: No extra processing, just normal camera stream. Intended to be used as a
# point of comparison against the stream with video stabilization processing.
CAM5="nvarguscamerasrc sensor-id=4 name=cam5 aelock=true awblock=true wbmode=0 ! $CAPTURE_CAPS ! \
queue max-size-buffers=1 leaky=downstream ! nvvidconv ! $GSTCUDA_OUT_CAPS ! \
interpipesink name=psink_cam5 caps=$GSTCUDA_OUT_CAPS sync=false qos=false async=false \
enable-last-sample=false drop=true max-buffers=2 forward-eos=false forward-events=false \
"

# Camera_6 Pipeline: Video stabilization processing applied with GstNvStabilize plugin.
CAM6="nvarguscamerasrc sensor-id=5 name=cam6 aelock=true awblock=true wbmode=0 ! $CAPTURE_CAPS ! \
queue max-size-buffers=1 leaky=downstream ! nvvidconv ! $DOWNSCALE_CAPS_RGBA ! nvstabilize crop-margin=0.2 queue-size=6 ! \
videoconvert ! $GSTCUDA_OUT_CAPS ! interpipesink name=psink_cam6 caps=$GSTCUDA_OUT_CAPS sync=false qos=false async=false \
enable-last-sample=false drop=true max-buffers=2 forward-eos=false forward-events=false \
"

# Camera_7 Pipeline: InceptionV1 Classification Inference applied with GstInference plugin using GPU accelerated TensorFlow.
CAM7="nvarguscamerasrc sensor-id=6 name=cam7 aelock=true awblock=true wbmode=0 ! $CAPTURE_CAPS ! \
tee name=tee_7 tee_7. ! queue max-size-buffers=1 leaky=downstream ! nvvidconv ! video/x-raw ! inc_7.sink_bypass tee_7. ! \
queue max-size-buffers=1 leaky=downstream ! nvvidconv ! inc_7.sink_model inceptionv1 name=inc_7 backend=tensorflow \
model-location=$INC_MODEL_LOCATION backend::input-layer=$INC_INPUT_LAYER backend::output-layer=$INC_OUTPUT_LAYER inc_7.src_bypass ! \
queue max-size-buffers=1 leaky=downstream ! $INC_OUT_CAPS ! classificationoverlay labels="$(cat $INC_LABELS)" font-scale=4 thickness=4 ! \
queue max-size-buffers=1 leaky=downstream ! nvvidconv ! $DOWNSCALE_NVMM_CAPS ! nvvidconv ! $DOWNSCALE_CAPS ! \
queue max-size-buffers=1 leaky=downstream ! interpipesink name=psink_cam7 caps=$DOWNSCALE_CAPS sync=false qos=false async=false \
enable-last-sample=false drop=true max-buffers=2 forward-eos=false forward-events=false \
"

# Camera_8 Pipeline: TinyYoloV2 Detection Inference applied with GstInference plugin using GPU accelerated TensorFlow.
CAM8="nvarguscamerasrc sensor-id=7 name=cam8 aelock=true awblock=true wbmode=0 ! $CAPTURE_CAPS ! \
tee name=tee_8 tee_8. ! queue max-size-buffers=1 leaky=downstream ! nvvidconv ! video/x-raw ! ty_8.sink_bypass tee_8. ! \
queue max-size-buffers=1 leaky=downstream ! nvvidconv ! ty_8.sink_model tinyyolov2 name=ty_8 backend=tensorflow \
model-location=$TY_MODEL_LOCATION backend::input-layer=$TY_INPUT_LAYER backend::output-layer=$TY_OUTPUT_LAYER ty_8.src_bypass ! \
queue max-size-buffers=1 leaky=downstream ! detectionoverlay labels="$(cat $TY_LABELS)" font-scale=4 thickness=4 ! $TY_OUT_CAPS ! \
queue max-size-buffers=1 leaky=downstream ! nvvidconv ! $DOWNSCALE_NVMM_CAPS ! nvvidconv ! $DOWNSCALE_CAPS ! \
queue max-size-buffers=1 leaky=downstream ! interpipesink name=psink_cam8 caps=$DOWNSCALE_CAPS sync=false qos=false async=false \
enable-last-sample=false drop=true max-buffers=2 forward-eos=false forward-events=false \
"

# Display Grid Pipeline
GRID="interpipesrc name=grid_cam1 listen-to=psink_cam1 format=time max-bytes=0 block=false max-latency=0 is-live=true allow-renegotiation=false \
enable-sync=true accept-events=false accept-eos-event=false ! queue ! \
perf name=perf1 ! textoverlay name=textoverlay1 text=Normal-camera_1 shaded-background=true ! mixer.sink_0 \
interpipesrc name=grid_cam2 listen-to=psink_cam2 format=time max-bytes=0 block=false max-latency=0 is-live=true allow-renegotiation=false \
enable-sync=true accept-events=false accept-eos-event=false ! queue ! \
perf name=perf2 ! textoverlay name=textoverlay2 text=SobelX-Filter-camera_2 shaded-background=true ! mixer.sink_1 \
interpipesrc name=grid_cam3 listen-to=psink_cam3 format=time max-bytes=0 block=false max-latency=0 is-live=true allow-renegotiation=false \
enable-sync=true accept-events=false accept-eos-event=false ! queue ! \
perf name=perf3 ! textoverlay name=textoverlay3 text=Border-Enhancement-Filter-camera_3 shaded-background=true ! mixer.sink_2 \
interpipesrc name=grid_cam4 listen-to=psink_cam4 format=time max-bytes=0 block=false max-latency=0 is-live=true allow-renegotiation=false \
enable-sync=true accept-events=false accept-eos-event=false ! queue ! \
perf name=perf4 ! textoverlay name=textoverlay4 text=Grayscale-Filter-camera_4 shaded-background=true ! mixer.sink_3 \
interpipesrc name=grid_cam5 listen-to=psink_cam5 format=time max-bytes=0 block=false max-latency=0 is-live=true allow-renegotiation=false \
enable-sync=true accept-events=false accept-eos-event=false ! queue ! \
perf name=perf5 ! textoverlay name=textoverlay5 text=Stabilization-Reference-camera_5 shaded-background=true ! mixer.sink_4 \
interpipesrc name=grid_cam6 listen-to=psink_cam6 format=time max-bytes=0 block=false max-latency=0 is-live=true allow-renegotiation=false \
enable-sync=true accept-events=false accept-eos-event=false ! queue ! \
perf name=perf6 ! textoverlay name=textoverlay6 text=Video-Stabilization-camera_6 shaded-background=true ! mixer.sink_5 \
interpipesrc name=grid_cam7 listen-to=psink_cam7 format=time max-bytes=0 block=false max-latency=0 is-live=true allow-renegotiation=false \
enable-sync=true accept-events=false accept-eos-event=false ! queue ! \
perf name=perf7 ! textoverlay name=textoverlay7 text=Classification-Inference-camera_7 shaded-background=true ! mixer.sink_6 \
interpipesrc name=grid_cam8 listen-to=psink_cam8 format=time max-bytes=0 block=false max-latency=0 is-live=true allow-renegotiation=false \
enable-sync=true accept-events=false accept-eos-event=false ! queue ! \
perf name=perf8 ! textoverlay name=textoverlay8 text=Detection-Inference-camera_8 shaded-background=true ! mixer.sink_7 \
videomixer \
name=mixer sink_0::xpos=0 sink_0::ypos=0 sink_1::xpos=480 \
sink_1::ypos=0 sink_2::xpos=960 sink_2::ypos=0 sink_3::xpos=1440 sink_3::ypos=0 sink_4::xpos=0 sink_4::ypos=480 \
sink_5::xpos=480 sink_5::ypos=480 sink_6::xpos=960 sink_6::ypos=480 sink_7::xpos=1440 sink_7::ypos=480 ! \
queue max-size-buffers=1 leaky=downstream ! videoconvert ! \
emboverlay logo=\"RR-D3-logo.png\" logo-offsetv=0 logo-offseth=1440 logo-transparency=0 logo-enable=true ! \
queue max-size-buffers=1 leaky=downstream ! \
nvvidconv ! nvoverlaysink enable-last-sample=false sync=false async=false \
"

# GstWebRTC OpenWebRTC Stream Pipeline
GSTWEBRTC_STREAM_OPENWEBRTC="rrwebrtcbin start-call=true signaler=GstOwrSignaler signaler::server_url=$SERVER_URL signaler::session_id=$ROOM_ID name=web \
interpipesrc name=psrc_webrtc listen-to=psink_cam1 format=time max-bytes=0 block=false max-latency=0 is-live=true allow-renegotiation=true \
enable-sync=true accept-events=false accept-eos-event=false ! queue ! \
nvvidconv ! omxh264enc insert-sps-pps=true ! h264parse ! rtph264pay ! \
capssetter caps=application/x-rtp,profile-level-id=(string)42c01f ! queue ! web.video_sink"

echo -e "\n ====> Creating Pipelines \n"
gstd-client pipeline_create camera1 $CAM1 &>> demo.log
gstd-client pipeline_create camera2 $CAM2 &>> demo.log
gstd-client pipeline_create camera3 $CAM3 &>> demo.log
gstd-client pipeline_create camera4 $CAM4 &>> demo.log
gstd-client pipeline_create camera5 $CAM5 &>> demo.log
gstd-client pipeline_create camera6 $CAM6 &>> demo.log
gstd-client pipeline_create camera7 $CAM7 &>> demo.log
gstd-client pipeline_create camera8 $CAM8 &>> demo.log
gstd-client pipeline_create grid $GRID &>> demo.log
gstd-client pipeline_create stream $GSTWEBRTC_STREAM_OPENWEBRTC &>> demo.log

# Change pipelines to PLAYING STATE
echo -e "\n ====> Starting Pipelines \n"
gstd-client pipeline_play camera1 &>> demo.log
gstd-client pipeline_play camera2 &>> demo.log
gstd-client pipeline_play camera3 &>> demo.log
gstd-client pipeline_play camera4 &>> demo.log
gstd-client pipeline_play camera5 &>> demo.log
gstd-client pipeline_play camera6 &>> demo.log
gstd-client pipeline_play camera7 &>> demo.log
gstd-client pipeline_play camera8 &>> demo.log
gstd-client pipeline_play grid &>> demo.log
gstd-client pipeline_play stream &>> demo.log

echo -e "\n ====> By default camera_1 feed is being streamed to the network using GstWebRTC ! \n"
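Although not shown in the script above, switching which camera is streamed (as the demo menu does) reduces to retargeting the interpipesrc in the stream pipeline. A hedged sketch using the standard gstd-client element_set command and the element names from the script:

# Stream camera 3 instead of camera 1 over WebRTC.
gstd-client element_set stream psrc_webrtc listen-to psink_cam3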
Performance profiling
Below is a summary of the demo's performance stats. It is important to highlight that this demo was not optimized, so better performance results are possible:
- Framerate:
- Normal camera streams, CUDA video processing filter streams, and the video stabilization stream all hold the full 30 fps.
- The InceptionV1 classification inference stream runs at 10-15 fps.
- The TinyYoloV2 detection inference stream runs at 20-25 fps.
- ARM CPU Load:
- Each of the 8 ARM CPU cores shows an average load of 40-50%.
- GPU Load:
- 90-95% average load
RidgeRun products & technologies used in the demo
- Camera drivers and custom board setup support
- GstCUDA
- GstNvStabilize
- GstInference
- GstWebRTC
- GStreamer Daemon
- GstInterpipe
- Fast GStreamer overlay element
- Perf: easy framerate and CPU load profiling tool (GitHub link)
For direct inquiries, please refer to the contact information available on our Contact page. Alternatively, you may complete and submit the form provided at the same link. We will respond to your request at our earliest opportunity.