GstInference/Supported backends/TensorRT

<noinclude>
{{GstInference/Head|previous=Supported backends/EdgeTPU|next=Supported backends/ONNXRT|metakeywords=GstInference backends,TensorRT,Jetson-TX2,Jetson-TX1,Xavier,NVIDIA,Deep Neural Networks,DNN,DNN Model,Neural Compute API}}
</noinclude>
{{DISPLAYTITLE:GstInference and TensorRT backend|noerror}}

NVIDIA [https://developer.nvidia.com/tensorrt TensorRT™] is an SDK for high-performance deep learning inference. It includes a deep learning inference optimizer and runtime that delivers low latency and high throughput for deep learning inference applications. TensorRT is built on CUDA, NVIDIA's parallel programming model, and enables you to optimize inference for all deep learning frameworks, leveraging libraries, development tools, and technologies in CUDA-X for artificial intelligence, autonomous machines, high-performance computing, and graphics.


To use the TensorRT backend on Gst-Inference, be sure to configure R2Inference with the flag <code>-Denable-tensorrt=true</code>. Then, use the property <code>backend=tensorrt</code> on the Gst-Inference plugins. GstInference depends on the [https://docs.nvidia.com/deeplearning/tensorrt/api/index.html#api C++ API of TensorRT].
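
For reference, a configure-and-build sequence looks roughly like the following. This is a minimal sketch assuming R2Inference is built with Meson from its source directory; the <code>build</code> directory name is arbitrary.

<pre>
# From the R2Inference source directory (directory name assumed)
meson build -Denable-tensorrt=true
ninja -C build
sudo ninja -C build install
</pre>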


==Installation==


GstInference depends on the C++ API of TensorRT. For installation steps, follow the [[R2Inference/Getting_started/Building_the_library|R2Inference/Building the library]] section.
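
As a quick sanity check (assuming a Debian-based system such as NVIDIA JetPack), you can verify that the TensorRT runtime and development packages are present before building:

<pre>
# List the installed TensorRT / libnvinfer packages
dpkg -l | grep -Ei 'tensorrt|nvinfer'
</pre>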


The TensorRT Python API and utilities can be installed following the [https://docs.nvidia.com/deeplearning/tensorrt/install-guide/index.html official guide], but they are not needed by GstInference.
== Enabling the backend ==


To enable TensorRT as a backend for GstInference, you need to install R2Inference with TensorRT support. To do this, use the option <code>-Denable-tensorrt=true</code> while following this [[R2Inference/Getting_started/Building_the_library|wiki]].
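
Once everything is installed, a quick way to confirm that the GstInference elements are available and expose the <code>backend</code> property is <code>gst-inspect-1.0</code> (the element name is taken from the example below; any GstInference element works):

<pre>
# Show the backend-related properties of a GstInference element
gst-inspect-1.0 tinyyolov2 | grep -A 3 -i backend
</pre>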


==Properties==

The [https://docs.nvidia.com/deeplearning/tensorrt/api/index.html#api TensorRT API Reference] has full documentation of the TensorRT C++ API. Gst-Inference uses only the C++ API of TensorRT, and R2Inference takes care of devices and loading the models.

The following syntax is used to change backend options on Gst-Inference plugins:

<pre>
backend::<property>
</pre>

For example, to run TensorRT as the backend with the tinyyolov2 plugin, run a pipeline like this:

<pre>
gst-launch-1.0 \
tinyyolov2 name=net backend=tensorrt model-location=graph_tinyyolov2.trt \
filesrc location=video_stream.mp4 ! decodebin ! nvvidconv ! "video/x-raw" ! tee name=t \
t. ! queue ! videoconvert ! videoscale ! net.sink_model \
t. ! queue ! videoconvert ! "video/x-raw,format=RGB" ! net.sink_bypass \
net.src_bypass ! perf ! queue ! inferencedebug ! inferenceoverlay ! fakesink
</pre>
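
The <code>model-location</code> in the pipeline above points to a <code>.trt</code> file, which suggests a serialized TensorRT engine. Assuming the original model is available in ONNX format, one common way to produce such an engine is NVIDIA's <code>trtexec</code> tool; the file names are placeholders and the <code>trtexec</code> path shown is the usual JetPack location, so adjust both as needed.

<pre>
# Build a serialized TensorRT engine from an ONNX model (file names are placeholders)
/usr/src/tensorrt/bin/trtexec --onnx=tinyyolov2.onnx --saveEngine=graph_tinyyolov2.trt
</pre>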


<noinclude>
{{GstInference/Foot|Supported backends/EdgeTPU|Supported backends/ONNXRT}}
</noinclude>
