R2Inference - ONNXRT
Make sure you also check R2Inference's companion project: GstInference
Description
Microsoft's ONNX Runtime is an open-source project for accelerated training and inference of deep learning models compliant with the ONNX standard. It supports many of the most popular machine learning frameworks (PyTorch, TensorFlow, Keras, or any other framework that supports interoperability with the ONNX standard). It can also run on multiple hardware platforms and supports hardware acceleration through its different execution providers.
Installation
The R2Inference ONNXRT backend depends on the C/C++ ONNX Runtime API. On x86 platforms, ONNX Runtime must be built and installed from source.
x86
The ONNX Runtime C/C++ API can be installed from source code. This section presents instructions to build from the source files, based on the official installation guide.
The following commands summarize the steps from that guide:
# This step is needed if you require to use ONNX tools included in the onnxruntime repo
pip3 install onnx

git clone --recursive https://github.com/Microsoft/onnxruntime -b v1.2.0
cd onnxruntime
sudo -H pip3 install cmake
./build.sh --config RelWithDebInfo --build_shared_lib --parallel
cd build/Linux/RelWithDebInfo
sudo make install
sudo cp libonnxruntime.so.1.2.0 /usr/lib/x86_64-linux-gnu/libonnxruntime.so.1.2.0
sudo ln -s /usr/lib/x86_64-linux-gnu/libonnxruntime.so.1.2.0 /usr/lib/x86_64-linux-gnu/libonnxruntime.so
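To verify the installation, a minimal program that links against the installed library can be compiled and run. The sketch below is not part of the official guide; the file name check_onnxrt.cpp, the compile command, and the header include path are only illustrative and may differ depending on where make install placed the headers.

// check_onnxrt.cpp: prints the version of the installed ONNX Runtime library.
#include <onnxruntime_cxx_api.h>

#include <iostream>

int main() {
  // Report the version string of the linked ONNX Runtime library (C API entry point).
  std::cout << "ONNX Runtime version: "
            << OrtGetApiBase()->GetVersionString() << std::endl;

  // Creating an environment exercises the C++ wrapper and verifies linkage.
  Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "install-check");

  return 0;
}

A possible compile line, assuming the headers were installed under /usr/local/include/onnxruntime, is:

g++ check_onnxrt.cpp -o check_onnxrt -I/usr/local/include/onnxruntime/core/session -lonnxruntime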
Convert existing models for R2I ONNXRT backend
Models are required to have an input shape of: {1, width, height, channels}
TensorFlow
- There is an official tool at https://github.com/onnx/tensorflow-onnx
Requirements
- Python >= 3.6 (tf2onnx-1.5.4 is the last version that supports Python 3.5)
- TensorFlow (Python version)
Example usage
python3 -m tf2onnx.convert --graphdef graph_inceptionv1_tensorflow.pb --output graph_inceptionv1.onnx --inputs input:0 --outputs InceptionV1/Logits/Predictions/Reshape_1:0
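After converting a model, its input shape can be checked against the {1, width, height, channels} requirement noted above. The sketch below uses the ONNX Runtime C++ API directly rather than the R2Inference interface; the model file name is taken from the conversion example and is otherwise arbitrary.

// inspect_input.cpp: prints the shape of the first input of an ONNX model so it
// can be checked against the {1, width, height, channels} requirement.
#include <onnxruntime_cxx_api.h>

#include <iostream>
#include <vector>

int main(int argc, char *argv[]) {
  const char *model_path = (argc > 1) ? argv[1] : "graph_inceptionv1.onnx";

  Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "shape-check");
  Ort::SessionOptions options;
  Ort::Session session(env, model_path, options);

  // Query the type and shape information of the first model input.
  Ort::TypeInfo type_info = session.GetInputTypeInfo(0);
  auto tensor_info = type_info.GetTensorTypeAndShapeInfo();
  std::vector<int64_t> shape = tensor_info.GetShape();

  // Dynamic dimensions are reported as -1.
  std::cout << "Input shape:";
  for (int64_t dim : shape) {
    std::cout << " " << dim;
  }
  std::cout << std::endl;

  return 0;
}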
API
Some documentation of the C/C++ ONNX Runtime API can be found in onnxruntime_c_api.h and onnxruntime_cxx_api.h. R2Inference uses the C++ API, which is mostly a wrapper for the C API. R2Inference provides a high-level abstraction for loading the ONNX model, creating the ONNX Runtime session, and executing the inference of the model. We recommend looking at the examples available for this backend in the r2inference repository to get familiar with the R2Inference interface. R2Inference also abstracts many options available in the ONNX Runtime C++ API through the IParameters class.
The parameters listed below are currently supported:
Property | C++ API Counterpart | Value | Operation | Description |
---|---|---|---|---|
logging-level | OrtLoggingLevel | Integer | R/W and Write before start | Level of log information of the ONNX Runtime session. |
log-id | N/A | String | R/W and Write before start | String identification of the ONNX Runtime session. |
intra-num-threads | SessionOptions::SetIntraOpNumThreads() | Integer | R/W and Write before start | Number of threads to parallelize execution within model nodes. |
graph-optimization-level | GraphOptimizationLevel | Integer | R/W and Write before start | Graph optimization level of the ONNX Runtime session. |
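As a rough illustration of how these properties relate to the underlying library (an assumption about the wrapping, not R2Inference source code), the sketch below sets each of the listed options through the ONNX Runtime C++ API. The model file name and the chosen values are only examples.

// session_options.cpp: illustrates the ONNX Runtime C++ calls behind the
// parameters in the table above.
#include <onnxruntime_cxx_api.h>

int main() {
  // logging-level and log-id map to the environment constructor arguments.
  Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "r2inference-example");

  Ort::SessionOptions options;

  // intra-num-threads: threads used to parallelize execution within model nodes.
  options.SetIntraOpNumThreads(2);

  // graph-optimization-level: how aggressively the graph is optimized before execution.
  options.SetGraphOptimizationLevel(GraphOptimizationLevel::ORT_ENABLE_EXTENDED);

  // The session is created with the configured options (example model file name).
  Ort::Session session(env, "graph_inceptionv1.onnx", options);

  return 0;
}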
Supported execution providers
ONNX Runtime can enable a wide range of execution providers (accelerators) to boost performance on different hardware platforms. The list below enumerates the execution providers currently supported through the R2Inference ONNXRT backend: