NVIDIA Jetson Xavier - Building TensorRT API examples

This section demonstrates how to build and use the NVIDIA samples for the TensorRT C++ API and Python API.

C++ API

First you need to build the samples. The TensorRT samples are installed in /usr/src/tensorrt/samples by default. To build all the C++ samples, run:

cd /usr/src/tensorrt/samples
sudo make -j4
cd ../bin
./<sample_name>

After the build completes, the binaries are generated in the /usr/src/tensorrt/bin directory and are named in snake_case. The source code is located in the samples directory, under a second-level directory named like the binary but in camelCase (for example, sample_mnist is built from sampleMNIST). Some samples require extra steps, such as downloading a model or a frozen graph; those steps are enumerated in the README files in each source folder. Each sample binary is described below, together with relevant notes:

sample_mnist
  • Perform basic TensorRT setup and initialization
  • Import a Caffe model using Caffe parser
  • Build an engine
  • Serialize and deserialize the engine
  • Use the engine to perform inference on an input image

The Caffe model was trained with the MNIST data set. To test the engine, this example picks a handwritten digit at random and runs an inference with it. This sample outputs the ASCII rendering of the input image and the most likely digit associated with that image.
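
The overall flow of this sample (parse with the Caffe parser, build an engine, serialize, deserialize, infer) can be summarized with the following minimal sketch. It assumes TensorRT 5.x-era calls (createNetwork, buildCudaEngine, destroy) and uses placeholder file and blob names, so it is an illustration rather than the sample's actual code:

#include "NvInfer.h"
#include "NvCaffeParser.h"
#include <iostream>

using namespace nvinfer1;

// Minimal logger required by the TensorRT builder and runtime.
class Logger : public ILogger
{
    void log(Severity severity, const char* msg) override
    {
        if (severity != Severity::kINFO)
            std::cout << msg << std::endl;
    }
} gLogger;

int main()
{
    // 1. Basic setup: builder and empty network definition.
    IBuilder* builder = createInferBuilder(gLogger);
    INetworkDefinition* network = builder->createNetwork();

    // 2. Import the Caffe model (placeholder paths; "prob" is the output blob).
    auto parser = nvcaffeparser1::createCaffeParser();
    auto blobNameToTensor = parser->parse("mnist.prototxt", "mnist.caffemodel",
                                          *network, DataType::kFLOAT);
    network->markOutput(*blobNameToTensor->find("prob"));

    // 3. Build the engine.
    builder->setMaxBatchSize(1);
    builder->setMaxWorkspaceSize(1 << 20);
    ICudaEngine* engine = builder->buildCudaEngine(*network);

    // 4. Serialize the engine, then deserialize it again with a runtime.
    IHostMemory* blob = engine->serialize();
    IRuntime* runtime = createInferRuntime(gLogger);
    ICudaEngine* deserialized = runtime->deserializeCudaEngine(blob->data(), blob->size(), nullptr);

    // 5. Inference would happen here through an execution context.
    IExecutionContext* context = deserialized->createExecutionContext();
    // context->execute(1, buffers);  // buffers = device pointers for input/output

    // Cleanup (TensorRT 5.x style).
    context->destroy(); deserialized->destroy(); runtime->destroy(); blob->destroy();
    engine->destroy(); parser->destroy(); network->destroy(); builder->destroy();
    return 0;
}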

sample_mnist_api
  • Build a network creating every layer
  • Use the engine to perform inference on an input image

This sample builds a model from scratch using the C++ API. For a more detailed guide on how to do this, refer to the official TensorRT documentation.

This sample does not train the model. It just loads the pre-trained weights.
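
As an illustration of defining layers by hand, the following sketch adds a toy fully-connected classifier to an existing INetworkDefinition. It assumes TensorRT 5.x-era calls; the layer sizes are hypothetical and the weight buffers are placeholders, whereas the real sample loads trained weights from a file and builds the full LeNet-style topology:

#include "NvInfer.h"
#include <vector>

using namespace nvinfer1;

// Adds a toy fully-connected classifier to an existing network definition.
// fcWeights/fcBias are placeholder buffers supplied by the caller.
void buildToyMnistNetwork(INetworkDefinition& network,
                          const std::vector<float>& fcWeights,
                          const std::vector<float>& fcBias)
{
    ITensor* data = network.addInput("data", DataType::kFLOAT, Dims3(1, 28, 28));

    Weights w{DataType::kFLOAT, fcWeights.data(), static_cast<int64_t>(fcWeights.size())};
    Weights b{DataType::kFLOAT, fcBias.data(), static_cast<int64_t>(fcBias.size())};

    IFullyConnectedLayer* fc = network.addFullyConnected(*data, 10, w, b);
    ISoftMaxLayer* prob = network.addSoftMax(*fc->getOutput(0));

    prob->getOutput(0)->setName("prob");
    network.markOutput(*prob->getOutput(0));
}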

sample_uff_mnist
  • Implement a TensorFlow model
  • Create the UFF Parser
  • Use the UFF Parser to get the dimensions and the order of the input tensor
  • Load a trained TensorFlow model converted to UFF
  • Build an engine
  • Use the engine to perform inference

This sample uses a pre-trained TensorFlow model that was frozen and converted to UFF (/usr/src/tensorrt/data/mnist/lenet5.uff). To generate your own UFF files, see Generate the UFF file.

This sample outputs the inference results and ASCII rendering of every digit from 0 to 9.
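
The UFF-specific part of the workflow is registering the graph's input and output tensors before parsing. A minimal sketch, assuming TensorRT 5.x-era calls and placeholder tensor names ("in", "out") that would have to match the names used when the graph was frozen:

#include "NvInfer.h"
#include "NvUffParser.h"

using namespace nvinfer1;

// Builds an engine from a UFF file. Tensor names and input dimensions are
// placeholders and must match the frozen graph.
ICudaEngine* engineFromUff(ILogger& logger, const char* uffFile)
{
    IBuilder* builder = createInferBuilder(logger);
    INetworkDefinition* network = builder->createNetwork();

    auto parser = nvuffparser::createUffParser();
    parser->registerInput("in", Dims3(1, 28, 28), nvuffparser::UffInputOrder::kNCHW);
    parser->registerOutput("out");
    parser->parse(uffFile, *network, DataType::kFLOAT);

    builder->setMaxBatchSize(1);
    builder->setMaxWorkspaceSize(1 << 20);
    ICudaEngine* engine = builder->buildCudaEngine(*network);

    parser->destroy();
    network->destroy();
    builder->destroy();
    return engine;
}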

sample_onnx_mnist
  • Configure the ONNX parser
  • Convert an MNIST network in ONNX format to a TensorRT network
  • Build the engine and run inference using the generated TensorRT network

See the official TensorRT documentation for a detailed ONNX parser configuration guide.
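
The ONNX import path is similar to the other parsers but simpler, since the graph, weights, and tensor names are read directly from the .onnx file. A minimal sketch, assuming TensorRT 5.x-era calls and a logger like the one shown for sample_mnist:

#include "NvInfer.h"
#include "NvOnnxParser.h"

using namespace nvinfer1;

// Builds an engine from an ONNX model file; no registerInput/registerOutput
// calls are needed because the ONNX graph already carries that information.
ICudaEngine* engineFromOnnx(ILogger& logger, const char* onnxFile)
{
    IBuilder* builder = createInferBuilder(logger);
    INetworkDefinition* network = builder->createNetwork();

    auto parser = nvonnxparser::createParser(*network, logger);
    parser->parseFromFile(onnxFile, static_cast<int>(ILogger::Severity::kWARNING));

    builder->setMaxBatchSize(1);
    builder->setMaxWorkspaceSize(1 << 20);
    ICudaEngine* engine = builder->buildCudaEngine(*network);

    parser->destroy();
    network->destroy();
    builder->destroy();
    return engine;
}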

sample_googlenet
  • Use FP16 mode in TensorRT
  • Use TensorRT Half2Mode
  • Use layer-based profiling

See the official TensorRT documentation for details on how to set half-precision mode and enable network profiling.
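
As a rough sketch of those two features, assuming TensorRT 5.x-era APIs (setFp16Mode; older releases used setHalf2Mode), the following shows a layer-time profiler and how it and FP16 mode would be enabled. The builder, context, batchSize, and buffers variables are assumed to come from the usual build/run flow:

#include "NvInfer.h"
#include <iostream>

using namespace nvinfer1;

// Layer-based profiling: when a profiler is attached to an execution context,
// TensorRT calls reportLayerTime() once per layer after each synchronous execute().
struct SimpleProfiler : public IProfiler
{
    void reportLayerTime(const char* layerName, float ms) override
    {
        std::cout << layerName << ": " << ms << " ms" << std::endl;
    }
};

// Usage sketch:
//
//   if (builder->platformHasFastFp16())
//       builder->setFp16Mode(true);        // half-precision kernels where available
//
//   SimpleProfiler profiler;
//   context->setProfiler(&profiler);
//   context->execute(batchSize, buffers);  // per-layer times are reported here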

sample_char_rnn
  • Implement a recurrent neural network based on the char-rnn.

The network is trained for predictive text completion with the Treebank-3 dataset.

sample_int8
  • Perform INT8 calibration
  • Perform INT8 inference
  • Calibrate a network for execution in INT8
  • Cache the output of the calibration to avoid repeating the process

INT8 inference is available only on GPUs with compute capability 6.1 or 7.x. The advantage of using INT8 is that inference is faster, but it requires an investment to determine how best to represent the weights and activations as 8-bit integers.

The sample calibrates for MNIST but can be used to calibrate other networks. Run the sample on MNIST with: ./sample_int8 mnist
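
The calibration cache mentioned above is handled through the read/write methods of the calibrator interface. Below is a minimal cache-only sketch, assuming the TensorRT 5.x IInt8EntropyCalibrator interface; a real calibrator would also feed representative input batches to the device through getBatch():

#include "NvInfer.h"
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

using namespace nvinfer1;

// Cache-only calibrator sketch: getBatch() returns false immediately, so TensorRT
// relies entirely on a previously written calibration cache file.
class CacheOnlyCalibrator : public IInt8EntropyCalibrator
{
public:
    explicit CacheOnlyCalibrator(const std::string& cacheFile) : mCacheFile(cacheFile) {}

    int getBatchSize() const override { return 1; }

    bool getBatch(void* bindings[], const char* names[], int nbBindings) override
    {
        return false; // no calibration data in this sketch
    }

    const void* readCalibrationCache(std::size_t& length) override
    {
        std::ifstream in(mCacheFile, std::ios::binary);
        mCache.assign(std::istreambuf_iterator<char>(in), std::istreambuf_iterator<char>());
        length = mCache.size();
        return mCache.empty() ? nullptr : mCache.data();
    }

    void writeCalibrationCache(const void* cache, std::size_t length) override
    {
        std::ofstream out(mCacheFile, std::ios::binary);
        out.write(static_cast<const char*>(cache), length);
    }

private:
    std::string mCacheFile;
    std::vector<char> mCache;
};

// Usage sketch while building the engine:
//
//   CacheOnlyCalibrator calibrator("CalibrationTable");
//   builder->setInt8Mode(true);
//   builder->setInt8Calibrator(&calibrator);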

sample_plugin
  • Define a Custom layer that supports multiple data formats
  • Define a Custom layer that can be serialized and deserialized
  • Enable a Custom layer in NvCaffeParser

A limiting factor when using the Caffe and TensorFlow parsers is that unsupported layers result in an error. This sample creates a custom layer and adds it to the parser to work around that problem.

The custom layer is a replacement for the FullyConnected layer, using cuBLAS matrix multiplication and cuDNN tensor addition, which makes it a great example of how to integrate other GPU APIs with TensorRT.

sample_nmt
  • Create a seq2seq type NMT inference engine using a checkpoint from TensorFlow

This sample requires more setup to test; you should follow the guide in /usr/src/tensorrt/samples/sampleNMT/README.txt.

For more information about NMT models, this is a great resource.

sample_fasterRCNN
  • Implement the Faster R-CNN network in TensorRT

The model used in this example is too large to be included with the package; to download it, follow the guide in /usr/src/tensorrt/samples/sampleFasterRCNN/README.txt.

The model is based on the original Faster R-CNN paper.

The original Caffe model was modified to include the RPN and ROIPooling layers.

sample_uff_ssd
  • Perform inference on the SSD network in TensorRT

The model used in this example is too large to be included with the package; to download it, follow the guide in /usr/src/tensorrt/samples/sampleUffSSD/README.txt.

sample_movielens
  • Implement a movie recommendation system using Neural Collaborative Filtering (NCF) in TensorRT

Each input of the model consists of a userID and a list of movieIDs. The network predicts the highest rated movie for each user.

The sample uses a set of 32 users with 100 movies each and compares its prediction with the ground truth.

Python API

You can find the Python samples in the /usr/src/tensorrt/samples/python directory. Every Python sample includes a README.md and a requirements.txt file. Running a Python sample typically involves two steps:

python -m pip install -r requirements.txt  # Install the sample requirements
python sample.py                           # Run the sample

The available samples are:

  • introductory_parser_samples
  • end_to_end_tensorflow_mnist
  • network_api_pytorch_mnist
  • fc_plugin_caffe_mnist
  • uff_custom_plugin



