GstInference and ONNXRT backend
Make sure you also check GstInference's companion project: R2Inference
Microsoft ONNX Runtime is an inference engine focused on performance for ONNX (Open Neural Network Exchange) models.
ONNX Runtime provides the scalability and high performance needed for very heavy workloads, and includes extensibility options for emerging hardware from vendors such as NVIDIA, Intel, Xilinx, and Rockchip. It supports many of the most popular machine learning frameworks (PyTorch, TensorFlow, Keras, or any other framework that can export to the ONNX standard).
Installation
GstInference depends on the C++ API of ONNX Runtime. For installation instructions, follow the steps in the R2Inference/Building the library section.
Enabling the backend
To use the ONNXRT backend with GstInference, make sure to run the R2Inference configuration with the flag -Denable-onnxrt=true. Then, set the property backend=onnxrt on the Gst-Inference plugins. Please refer to this wiki page for more information.
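As a reference, assuming the Meson/Ninja build flow described in the R2Inference wiki, enabling the backend from the R2Inference source tree looks roughly like this (directory names and extra options may differ on your system):
# Minimal sketch; see R2Inference/Building the library for the exact steps
cd r2inference
meson setup build -Denable-onnxrt=true
ninja -C build
sudo ninja -C build install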
Properties
Some documentation of the C/C++ ONNX Runtime API can be found in onnxruntime_c_api.h and onnxruntime_cxx_api.h.
The following syntax is used to change backend options on Gst-Inference plugins:
backend::<property>
As an example, to set the graph-optimization-level option of the ONNX Runtime backend on the inceptionv1 element, you can run a pipeline like this:
gst-launch-1.0 \
inceptionv1 name=net model-location=graph_inceptionv1.onnx backend=onnxrt backend::graph-optimization-level=99 \
filesrc location=video_stream.mp4 ! decodebin ! videoconvert ! videoscale ! queue ! tee name=t \
t. ! queue ! videoconvert ! videoscale ! net.sink_model \
t. ! queue ! videoconvert ! net.sink_bypass \
net.src_model ! fakesink
To learn more about the ONNXRT C++ API, please check the ONNXRT API section on the R2Inference sub wiki.
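If GstInference is installed, you can also list the element-level properties of an inference element (including backend) with gst-inspect-1.0; the backend-specific options themselves are documented per backend in the R2Inference wiki:
# Lists the properties exposed by the inceptionv1 element
gst-inspect-1.0 inceptionv1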
Tuning performance
The properties graph-optimization-level and intra-num-threads can be used to increase the performance of the inference tasks.
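In the benchmark pipelines below, the relevant fragment is simply these two backend properties set on the inference element (model path shortened here for readability):
inceptionv1 backend=onnxrt backend::graph-optimization-level=99 backend::intra-num-threads=4 model-location=graph_inceptionv1.onnx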
The results below were obtained with graph-optimization-level=99 (GraphOptimizationLevel::ORT_ENABLE_ALL) and intra-num-threads=4:
gst-launch-1.0 \
filesrc location=/home/jafet/work/devdirs/ridgerun/benchmark-onnxrt/Test_benchmark_video.mp4 num-buffers=-1 ! decodebin ! videoconvert ! \
perf print-arm-load=true name=inputperf ! tee name=t \
t. ! videoscale ! queue ! net.sink_model \
t. ! queue ! net.sink_bypass \
inceptionv1 backend=onnxrt name=net backend::graph-optimization-level=99 backend::intra-num-threads=4 model-location=/home/jafet/work/devdirs/ridgerun/benchmark-onnxrt/InceptionV1_onnxrt/graph_inceptionv1.onnx \
net.src_bypass ! perf print-arm-load=true name=outputperf ! videoconvert ! fakesink sync=false
Setting pipeline to PAUSED ...
Pipeline is PREROLLING ...
Redistribute latency...
Redistribute latency...
INFO: perf: inputperf; timestamp: 15:43:37.771179466; bps: 0,000; mean_bps: 0,000; fps: 0,000; mean_fps: 0,000; cpu: 6;
INFO: perf: outputperf; timestamp: 15:43:37.803875630; bps: 0,000; mean_bps: 0,000; fps: 0,000; mean_fps: 0,000; cpu: 6;
Pipeline is PREROLLED ...
Setting pipeline to PLAYING ...
New clock: GstSystemClock
INFO: perf: inputperf; timestamp: 15:43:38.780969031; bps: 700927029,287; mean_bps: 700927029,287; fps: 95,069; mean_fps: 95,069; cpu: 54;
INFO: perf: outputperf; timestamp: 15:43:38.811741109; bps: 607166742,736; mean_bps: 607166742,736; fps: 82,352; mean_fps: 82,352; cpu: 53;
INFO: perf: inputperf; timestamp: 15:43:39.783310697; bps: 595801631,577; mean_bps: 648364330,432; fps: 80,811; mean_fps: 87,940; cpu: 53;
INFO: perf: outputperf; timestamp: 15:43:39.819033440; bps: 600192795,472; mean_bps: 603679769,104; fps: 81,406; mean_fps: 81,879; cpu: 53;
Got EOS from element "pipeline0".
Execution ended after 0:00:02.405809585
Setting pipeline to PAUSED ...
Setting pipeline to READY ...
Setting pipeline to NULL ...
Freeing pipeline ...
The results below were obtained with graph-optimization-level=0 (GraphOptimizationLevel::ORT_DISABLE_ALL) and intra-num-threads=1:
gst-launch-1.0 \
filesrc location=/home/jafet/work/devdirs/ridgerun/benchmark-onnxrt/Test_benchmark_video.mp4 num-buffers=-1 ! decodebin ! videoconvert ! \
perf print-arm-load=true name=inputperf ! tee name=t \
t. ! videoscale ! queue ! net.sink_model \
t. ! queue ! net.sink_bypass \
inceptionv1 backend=onnxrt name=net backend::graph-optimization-level=0 backend::intra-num-threads=1 model-location=/home/jafet/work/devdirs/ridgerun/benchmark-onnxrt/InceptionV1_onnxrt/graph_inceptionv1.onnx \
net.src_bypass ! perf print-arm-load=true name=outputperf ! videoconvert ! fakesink sync=false
Setting pipeline to PAUSED ...
Pipeline is PREROLLING ...
Redistribute latency...
Redistribute latency...
INFO: perf: inputperf; timestamp: 15:44:03.245464001; bps: 0,000; mean_bps: 0,000; fps: 0,000; mean_fps: 0,000; cpu: 6;
INFO: perf: outputperf; timestamp: 15:44:03.332774956; bps: 0,000; mean_bps: 0,000; fps: 0,000; mean_fps: 0,000; cpu: 6;
Pipeline is PREROLLED ...
Setting pipeline to PLAYING ...
New clock: GstSystemClock
INFO: perf: inputperf; timestamp: 15:44:04.275339999; bps: 279197884,559; mean_bps: 279197884,559; fps: 37,869; mean_fps: 37,869; cpu: 14;
INFO: perf: outputperf; timestamp: 15:44:04.353027962; bps: 187887513,070; mean_bps: 187887513,070; fps: 25,484; mean_fps: 25,484; cpu: 14;
INFO: perf: inputperf; timestamp: 15:44:05.300288291; bps: 187026800,763; mean_bps: 233112342,661; fps: 25,367; mean_fps: 31,618; cpu: 14;
INFO: perf: outputperf; timestamp: 15:44:05.377882071; bps: 187043988,326; mean_bps: 187465750,698; fps: 25,369; mean_fps: 25,427; cpu: 14;
Got EOS from element "pipeline0".
Execution ended after 0:00:07.932608446
Setting pipeline to PAUSED ...
Setting pipeline to READY ...
Setting pipeline to NULL ...
Freeing pipeline ...