NVIDIA Jetson Xavier - Deep Learning with TensorRT

Previous: Deep Learning Index Next: Deep Learning/TensorRT/Tensorflow

This TensorRT wiki demonstrates how to use the C++ and Python APIs to implement the most common deep learning layers. It also provides step-by-step instructions, with examples, for common user tasks such as creating a TensorRT network definition, invoking the TensorRT builder, serializing and deserializing an engine, and feeding the engine with data to perform inference.
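
The sketch below walks through that cycle end to end with the C++ API. It is a minimal illustration, assuming TensorRT 7.x as shipped in JetPack 4.x; the one-layer ReLU network, the "input"/"output" tensor names, and the model.engine file name are illustrative placeholders, not a prescribed layout:

// Minimal TensorRT workflow sketch: define a network, build an engine,
// serialize it, deserialize it, and run inference on it.
// Assumes TensorRT 7.x (JetPack 4.x); the one-layer ReLU network is a
// stand-in for a real model.
#include "NvInfer.h"
#include <cuda_runtime_api.h>
#include <fstream>
#include <iostream>
#include <vector>

using namespace nvinfer1;

// TensorRT requires the application to supply its own logger.
class Logger : public ILogger {
    void log(Severity severity, const char* msg) noexcept override {
        if (severity <= Severity::kWARNING) std::cout << msg << std::endl;
    }
} gLogger;

int main() {
    // 1. Network definition: one explicit-batch input feeding a ReLU layer.
    IBuilder* builder = createInferBuilder(gLogger);
    const uint32_t flags =
        1U << static_cast<uint32_t>(NetworkDefinitionCreationFlag::kEXPLICIT_BATCH);
    INetworkDefinition* network = builder->createNetworkV2(flags);
    ITensor* input = network->addInput("input", DataType::kFLOAT, Dims4{1, 3, 224, 224});
    IActivationLayer* relu = network->addActivation(*input, ActivationType::kRELU);
    relu->getOutput(0)->setName("output");
    network->markOutput(*relu->getOutput(0));

    // 2. Builder: optimize the definition into an engine for this GPU.
    IBuilderConfig* config = builder->createBuilderConfig();
    config->setMaxWorkspaceSize(1 << 28);  // 256 MiB of build scratch space
    ICudaEngine* engine = builder->buildEngineWithConfig(*network, *config);

    // 3. Serialize the engine so later runs can skip the slow build step.
    IHostMemory* plan = engine->serialize();
    std::ofstream planFile("model.engine", std::ios::binary);
    planFile.write(static_cast<const char*>(plan->data()), plan->size());
    planFile.close();

    // 4. Deserialize (normally done by a separate application) and infer.
    IRuntime* runtime = createInferRuntime(gLogger);
    ICudaEngine* runEngine = runtime->deserializeCudaEngine(plan->data(), plan->size());
    IExecutionContext* context = runEngine->createExecutionContext();

    const size_t count = 1 * 3 * 224 * 224;
    std::vector<float> hostIn(count, -1.0f), hostOut(count);
    void* devIn;
    void* devOut;
    cudaMalloc(&devIn, count * sizeof(float));
    cudaMalloc(&devOut, count * sizeof(float));
    cudaMemcpy(devIn, hostIn.data(), count * sizeof(float), cudaMemcpyHostToDevice);

    // Bindings are looked up by tensor name, so the order never matters.
    void* bindings[2];
    bindings[runEngine->getBindingIndex("input")] = devIn;
    bindings[runEngine->getBindingIndex("output")] = devOut;
    context->executeV2(bindings);

    cudaMemcpy(hostOut.data(), devOut, count * sizeof(float), cudaMemcpyDeviceToHost);
    std::cout << "output[0] = " << hostOut[0] << std::endl;  // ReLU(-1.0) == 0
    cudaFree(devIn);
    cudaFree(devOut);
    return 0;  // destroy() calls omitted for brevity
}

On a Jetson board this compiles with something like g++ -o trt_sketch trt_sketch.cpp -lnvinfer -lcudart (the exact include and library paths depend on the JetPack installation).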

Description

TensorRT is a C++ library that facilitates high-performance inference on NVIDIA platforms. It is designed to work with the most popular deep learning frameworks, such as TensorFlow, Caffe, and PyTorch. It focuses specifically on running an already-trained model; for training, other libraries like cuDNN are more suitable. Some frameworks, like TensorFlow, have TensorRT integrated so that it can be used to accelerate inference within the framework itself. For other frameworks, like Caffe, a parser is provided to generate a model that can be imported into TensorRT. Finally, the TensorRT C++ and Python APIs can be used to build a model from the ground up. For a more in-depth analysis of each use case, refer to the following sections:

  1. Using TensorRT integrated with TensorFlow
  2. Parsing a TensorFlow model for TensorRT
  3. Parsing a Caffe model for TensorRT (a minimal parser sketch follows this list)
  4. Building TensorRT API examples
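
To give a flavor of use case 3, the following is a minimal sketch of importing a Caffe model through the parser with the C++ API, again assuming TensorRT 7.x; the deploy.prototxt / model.caffemodel file names and the "prob" output blob are placeholders for the actual model's files and output:

// Sketch: import a Caffe model into TensorRT through the Caffe parser.
// Assumes TensorRT 7.x; the file names and the "prob" blob are placeholders.
#include "NvInfer.h"
#include "NvCaffeParser.h"
#include <iostream>

class Logger : public nvinfer1::ILogger {
    void log(Severity severity, const char* msg) noexcept override {
        if (severity <= Severity::kWARNING) std::cout << msg << std::endl;
    }
} gLogger;

int main() {
    using namespace nvinfer1;
    IBuilder* builder = createInferBuilder(gLogger);
    // The Caffe parser targets the older implicit-batch network style.
    INetworkDefinition* network = builder->createNetworkV2(0U);

    nvcaffeparser1::ICaffeParser* parser = nvcaffeparser1::createCaffeParser();
    const nvcaffeparser1::IBlobNameToTensor* blobs =
        parser->parse("deploy.prototxt",   // network topology
                      "model.caffemodel",  // trained weights
                      *network, DataType::kFLOAT);

    // Caffe models do not declare outputs, so mark the output blob by name.
    network->markOutput(*blobs->find("prob"));

    builder->setMaxBatchSize(1);
    IBuilderConfig* config = builder->createBuilderConfig();
    config->setMaxWorkspaceSize(1 << 28);
    ICudaEngine* engine = builder->buildEngineWithConfig(*network, *config);
    std::cout << (engine ? "engine built" : "build failed") << std::endl;
    return 0;  // destroy() calls omitted for brevity
}

Besides nvinfer, this needs the parser library at link time (libnvparsers in recent JetPack releases).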

This GitHub repo has a great collection of TensorFlow models that work with TensorRT.

Some NVIDIA benchmark results on the Jetson TX2 (inference times):

Model                | Input Size | TensorFlow on TX2 without TensorRT | TensorFlow on TX2 with TensorRT
---------------------|------------|------------------------------------|--------------------------------
inception_v4         | 299x299    | 129 ms                             | 38.5 ms
resnet_v1_50         | 224x224    | 55.1 ms                            | 12.5 ms
resnet_v1_101        | 224x224    | 91.0 ms                            | 20.6 ms
resnet_v1_152        | 224x224    | 124 ms                             | 28.9 ms
mobilenet_v1_1p0_224 | 224x224    | 17.3 ms                            | 11.1 ms

Across these models, the TensorRT path is roughly 1.6x faster for mobilenet_v1_1p0_224 and up to about 4.4x faster for resnet_v1_50 and resnet_v1_101.

Previous: Deep Learning Index Next: Deep Learning/TensorRT/Tensorflow