GstInference and ONNXRT backend
Make sure you also check GstInference's companion project: R2Inference
Microsoft ONNX Runtime is an inference engine focused on performance for ONNX (Open Neural Network Exchange) models.
ONNX Runtime provides the scalability and high performance needed for very heavy workloads, and includes extensibility options for emerging hardware from vendors such as NVIDIA, Intel, Xilinx, and Rockchip. It supports many of the most popular machine learning frameworks (PyTorch, TensorFlow, Keras, or any other framework that can export to the ONNX standard).
Installation
GstInference depends on the C++ API of ONNX Runtime. For installation instructions, follow the steps in the R2Inference/Building the library section.
Enabling the backend
To use the ONNXRT backend with GstInference, make sure to run the R2Inference configuration with the flag -Denable-onnxrt=true. Then, set the property backend=onnxrt on the Gst-Inference plugins. Please refer to this wiki page for more information.
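As a reference, assuming the Meson/Ninja build flow described in the R2Inference wiki, enabling the backend from the R2Inference source tree looks roughly like this (directory names and extra options may differ on your system):
# Minimal sketch; see R2Inference/Building the library for the exact steps
cd r2inference
meson setup build -Denable-onnxrt=true
ninja -C build
sudo ninja -C build install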
Properties
Some documentation of the C/C++ ONNX Runtime API can be found in onnxruntime_c_api.h and onnxruntime_cxx_api.h.
The following syntax is used to change backend options on Gst-Inference plugins:
backend::<property>
As an example, to set the graph-optimization-level option of the ONNX Runtime backend on the inceptionv1 element, you can run a pipeline like this:
gst-launch-1.0 \
inceptionv1 name=net model-location=graph_inceptionv1.onnx backend=onnxrt backend::graph-optimization-level=99 \
filesrc location=video_stream.mp4 ! decodebin ! videoconvert ! videoscale ! queue ! tee name=t \
t. ! queue ! videoconvert ! videoscale ! net.sink_model \
t. ! queue ! videoconvert ! net.sink_bypass \
net.src_model ! fakesink
To learn more about the ONNXRT C++ API, please check the ONNXRT API section on the R2Inference sub wiki.
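If GstInference is installed, you can also list the element-level properties of an inference element (including backend) with gst-inspect-1.0; the backend-specific options themselves are documented per backend in the R2Inference wiki:
# Lists the properties exposed by the inceptionv1 element
gst-inspect-1.0 inceptionv1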
Tuning performance
The properties graph-optimization-level and intra-num-threads can be used to increase the performance of the inference tasks.
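In the benchmark pipelines below, the relevant fragment is simply these two backend properties set on the inference element (model path shortened here for readability):
inceptionv1 backend=onnxrt backend::graph-optimization-level=99 backend::intra-num-threads=4 model-location=graph_inceptionv1.onnx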
The results below were obtained with graph-optimization-level=99 (GraphOptimizationLevel::ORT_ENABLE_ALL) and intra-num-threads=4:
gst-launch-1.0 \
filesrc location=/home/jafet/work/devdirs/ridgerun/benchmark-onnxrt/Test_benchmark_video.mp4 num-buffers=-1 ! decodebin ! videoconvert ! \
perf print-arm-load=true name=inputperf ! tee name=t \
t. ! videoscale ! queue ! net.sink_model \
t. ! queue ! net.sink_bypass \
inceptionv1 backend=onnxrt name=net backend::graph-optimization-level=99 backend::intra-num-threads=4 model-location=/home/jafet/work/devdirs/ridgerun/benchmark-onnxrt/InceptionV1_onnxrt/graph_inceptionv1.onnx \
net.src_bypass ! perf print-arm-load=true name=outputperf ! videoconvert ! fakesink sync=false
Setting pipeline to PAUSED ...
Pipeline is PREROLLING ...
Redistribute latency...
Redistribute latency...
INFO: perf: inputperf; timestamp: 15:43:37.771179466; bps: 0,000; mean_bps: 0,000; fps: 0,000; mean_fps: 0,000; cpu: 6;
INFO: perf: outputperf; timestamp: 15:43:37.803875630; bps: 0,000; mean_bps: 0,000; fps: 0,000; mean_fps: 0,000; cpu: 6;
Pipeline is PREROLLED ...
Setting pipeline to PLAYING ...
New clock: GstSystemClock
INFO: perf: inputperf; timestamp: 15:43:38.780969031; bps: 700927029,287; mean_bps: 700927029,287; fps: 95,069; mean_fps: 95,069; cpu: 54;
INFO: perf: outputperf; timestamp: 15:43:38.811741109; bps: 607166742,736; mean_bps: 607166742,736; fps: 82,352; mean_fps: 82,352; cpu: 53;
INFO: perf: inputperf; timestamp: 15:43:39.783310697; bps: 595801631,577; mean_bps: 648364330,432; fps: 80,811; mean_fps: 87,940; cpu: 53;
INFO: perf: outputperf; timestamp: 15:43:39.819033440; bps: 600192795,472; mean_bps: 603679769,104; fps: 81,406; mean_fps: 81,879; cpu: 53;
Got EOS from element "pipeline0".
Execution ended after 0:00:02.405809585
Setting pipeline to PAUSED ...
Setting pipeline to READY ...
Setting pipeline to NULL ...
Freeing pipeline ...
The results below were obtained with graph-optimization-level=0 (GraphOptimizationLevel::ORT_DISABLE_ALL) and intra-num-threads=1:
gst-launch-1.0 \
filesrc location=/home/jafet/work/devdirs/ridgerun/benchmark-onnxrt/Test_benchmark_video.mp4 num-buffers=-1 ! decodebin ! videoconvert ! \
perf print-arm-load=true name=inputperf ! tee name=t \
t. ! videoscale ! queue ! net.sink_model \
t. ! queue ! net.sink_bypass \
inceptionv1 backend=onnxrt name=net backend::graph-optimization-level=0 backend::intra-num-threads=1 model-location=/home/jafet/work/devdirs/ridgerun/benchmark-onnxrt/InceptionV1_onnxrt/graph_inceptionv1.onnx \
net.src_bypass ! perf print-arm-load=true name=outputperf ! videoconvert ! fakesink sync=false
Setting pipeline to PAUSED ...
Pipeline is PREROLLING ...
Redistribute latency...
Redistribute latency...
INFO: perf: inputperf; timestamp: 15:44:03.245464001; bps: 0,000; mean_bps: 0,000; fps: 0,000; mean_fps: 0,000; cpu: 6;
INFO: perf: outputperf; timestamp: 15:44:03.332774956; bps: 0,000; mean_bps: 0,000; fps: 0,000; mean_fps: 0,000; cpu: 6;
Pipeline is PREROLLED ...
Setting pipeline to PLAYING ...
New clock: GstSystemClock
INFO: perf: inputperf; timestamp: 15:44:04.275339999; bps: 279197884,559; mean_bps: 279197884,559; fps: 37,869; mean_fps: 37,869; cpu: 14;
INFO: perf: outputperf; timestamp: 15:44:04.353027962; bps: 187887513,070; mean_bps: 187887513,070; fps: 25,484; mean_fps: 25,484; cpu: 14;
INFO: perf: inputperf; timestamp: 15:44:05.300288291; bps: 187026800,763; mean_bps: 233112342,661; fps: 25,367; mean_fps: 31,618; cpu: 14;
INFO: perf: outputperf; timestamp: 15:44:05.377882071; bps: 187043988,326; mean_bps: 187465750,698; fps: 25,369; mean_fps: 25,427; cpu: 14;
Got EOS from element "pipeline0".
Execution ended after 0:00:07.932608446
Setting pipeline to PAUSED ...
Setting pipeline to READY ...
Setting pipeline to NULL ...
Freeing pipeline ...