<noinclude>
{{Xavier/Head|previous=Deep Learning‎/TensorRT/Parsing Caffe|next=Deep Learning/Deep Learning Accelerator|metakeywords=TensorRT,samples,examples,nvdia samples,python samples,c++ samples}}
</noinclude>
{{DISPLAYTITLE:NVIDIA Jetson Xavier - Building TensorRT API examples|noerror}}

The following section demonstrates how to build and use the NVIDIA samples for the TensorRT C++ API and Python API.
__TOC__
== C++ API ==
First, you need to build the samples. TensorRT is installed in <code>/usr/src/tensorrt/samples</code> by default. To build all the C++ samples, run:
<syntaxhighlight lang=bash>
cd /usr/src/tensorrt/samples
sudo make -j4
cd ../bin
./<sample_name>
</syntaxhighlight>
After building, the binaries are generated in the <code>/usr/src/tensorrt/bin</code> directory, and they are named in <code>snake_case</code>. The source code, on the other hand, is located in the samples directory under a second-level directory named like the binary but in <code>camelCase</code>. Some samples require extra steps, such as downloading a model or a frozen graph; those steps are enumerated in the README files in the source folder. The following table lists the sample binary names and descriptions:
{| class="wikitable" style="width: 100%;"
{| class="wikitable" style="width: 70%;"
|-
|-
! style="width: 20%"|Sample  
! style="width: 20%"|Sample  
Line 28: Line 31:
*Use the engine to perform inference on an input image
*Use the engine to perform inference on an input image
|  
|  
The Caffe model was trained with the MNIST data set. To test the engine, this example picks a handwritten digit at random and runs an inference with it. This sample outputs the ASCII rendering of the input image and the most likely digit associated to that image.  
The Caffe model was trained with the MNIST data set. To test the engine, this example picks a handwritten digit at random and runs an inference with it. This sample outputs the ASCII rendering of the input image and the most likely digit associated with that image.  
|-
|-
|sample_mnist_api
|sample_mnist_api
Line 48: Line 51:
*Use the engine to perform inference
*Use the engine to perform inference
|  
|  
This sample uses a pre-trained TensorFlow model that was frozen and converted to UFF <code>/usr/src/tensorrt/data/mnist/lenet5.uff</code>. To generate your own UFF files see [[Xavier/JetPack_4.1/Components/TensorRT#Step_3:_Generate_the_UFF_file|Generate the UFF file]]
This sample uses a pre-trained TensorFlow model that was frozen and converted to UFF <code>/usr/src/tensorrt/data/mnist/lenet5.uff</code>. To generate your own UFF files see [[Xavier/Deep_Learning/TensorRT/Parsing_Tensorflow#Step_3:_Generate_the_UFF|Generate the UFF file]]


This sample outputs the inference results and ASCII rendering of every digit from 0 to 9.
This sample outputs the inference results and ASCII rendering of every digit from 0 to 9.
Line 66: Line 69:
*Use layer-based profiling
*Use layer-based profiling
|  
|  
See [https://docs.nvidia.com/deeplearning/sdk/tensorrt-developer-guide/index.html#half2mode this] for details on how to set the half precision mode and network profiling.
See [https://docs.nvidia.com/deeplearning/sdk/tensorrt-developer-guide/index.html#half2mode this] for details on how to set the half-precision mode and network profiling.
|-
|-
|sample_char_rnn
|sample_char_rnn
Line 83: Line 86:
INT8 inference is available only on GPUs with compute capability 6.1 or 7.x. The advantage of using INT8 is that the inference and training are faster, but it requires an investment to determine how best to represent the weights and activations as 8-bit integers.
INT8 inference is available only on GPUs with compute capability 6.1 or 7.x. The advantage of using INT8 is that the inference and training are faster, but it requires an investment to determine how best to represent the weights and activations as 8-bit integers.


The sample calibrates for MNIST, but can be used to calibrate other networks. Run the sample on MNIST with: <code>./sample_int8 mnist</code>
The sample calibrates for MNIST but can be used to calibrate other networks. Run the sample on MNIST with: <code>./sample_int8 mnist</code>
|-
|-
|sample_plugin
|sample_plugin
Line 91: Line 94:
*Enable a Custom layer in NvCaffeParser
*Enable a Custom layer in NvCaffeParser
|  
|  
A limiting factor when using the Caffe and Tensorflow parser is that using not supported layers will result on an error. This sample creates a custom layer and adds it to the parser to counteract that problem.
A limiting factor when using the Caffe and Tensorflow parser is that using not supported layers will result in an error. This sample creates a custom layer and adds it to the parser to counteract that problem.


The custom layer is a replacement for the <code>FullyConnected</code> layer using cuBLAS matrix multiplication and cuDNN tensor addition. So it makes a great example on how to integrate other GPU APIs with TensorRT.
The custom layer is a replacement for the <code>FullyConnected</code> layer using cuBLAS matrix multiplication and cuDNN tensor addition. So it makes a great example of how to integrate other GPU APIs with TensorRT.
|-
|-
|sample_nmt
|sample_nmt
Line 107: Line 110:
*Implement the Faster R-CNN network in TensorRT
*Implement the Faster R-CNN network in TensorRT
|  
|  
The model used on this example is to large to be included with the package, to download it follow the guide on  <code>/usr/src/tensorrt/samples/sampleFasterRNN/README.txt</code>
The model used in this example is too large to be included with the package, to download it follow the guide on  <code>/usr/src/tensorrt/samples/sampleFasterRNN/README.txt</code>


This model is based on [https://arxiv.org/abs/1506.01497 this] paper
This model is based on [https://arxiv.org/abs/1506.01497 this] paper
Line 117: Line 120:
*Perform inference on the SSD network in TensorRT
*Perform inference on the SSD network in TensorRT
|  
|  
The model used on this example is to large to be included with the package, to download it follow the guide on  <code>/usr/src/tensorrt/samples/sampleUffSSD/README.txt</code>
The model used in this example is too large to be included with the package, to download it follow the guide on  <code>/usr/src/tensorrt/samples/sampleUffSSD/README.txt</code>
|-
|-
|sample_movielens
|sample_movielens
Line 123: Line 126:
*Implement a movie recommendation system using Neural Collaborative Filter in TensorRT
*Implement a movie recommendation system using Neural Collaborative Filter in TensorRT
|  
|  
Each input of the model consist of a userID and a list of movieIDs. The network predicts the highest rated movie for each user.
Each input of the model consists of a userID and a list of movieIDs. The network predicts the highest rated movie for each user.


The sample uses a set of 32 users with 100 movies each, and compares its prediction with the ground truth.
The sample uses a set of 32 users with 100 movies each and compares its prediction with the ground truth.
|}
|}
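To make the steps in the table more concrete, here is a minimal sketch of the parse/build/serialize flow that <code>sample_mnist</code> and the other parser-based samples follow. This is not the samples' actual code: it assumes the TensorRT 5.x-era C++ API shipped with JetPack 4.x, and the file names (<code>mnist.prototxt</code>, <code>mnist.caffemodel</code>, <code>mnist.engine</code>) and the output blob name <code>prob</code> are placeholders for your own model.

<syntaxhighlight lang=cpp>
#include <fstream>
#include <iostream>

#include "NvInfer.h"        // Core TensorRT C++ API
#include "NvCaffeParser.h"  // Caffe parser

// Minimal logger required by the TensorRT builder and runtime.
class Logger : public nvinfer1::ILogger
{
    void log(Severity severity, const char* msg) override
    {
        if (severity <= Severity::kWARNING)
            std::cout << msg << std::endl;
    }
};

int main()
{
    Logger logger;

    // 1. Create the builder and an empty network definition.
    nvinfer1::IBuilder* builder = nvinfer1::createInferBuilder(logger);
    nvinfer1::INetworkDefinition* network = builder->createNetwork();

    // 2. Populate the network from a Caffe deploy file and weights
    //    (placeholder paths; the real sample uses the MNIST data folder).
    nvcaffeparser1::ICaffeParser* parser = nvcaffeparser1::createCaffeParser();
    const nvcaffeparser1::IBlobNameToTensor* blobs =
        parser->parse("mnist.prototxt", "mnist.caffemodel", *network, nvinfer1::DataType::kFLOAT);

    // 3. Tell TensorRT which blob is the network output ("prob" is a placeholder name).
    network->markOutput(*blobs->find("prob"));

    // 4. Build the optimized engine.
    builder->setMaxBatchSize(1);
    builder->setMaxWorkspaceSize(16 << 20); // 16 MB of scratch space
    nvinfer1::ICudaEngine* engine = builder->buildCudaEngine(*network);

    // 5. Serialize the engine so it can be reloaded later without rebuilding.
    nvinfer1::IHostMemory* plan = engine->serialize();
    std::ofstream planFile("mnist.engine", std::ios::binary);
    planFile.write(static_cast<const char*>(plan->data()), plan->size());

    // Clean up (TensorRT 5.x objects are released with destroy()).
    plan->destroy();
    engine->destroy();
    parser->destroy();
    network->destroy();
    builder->destroy();
    return 0;
}
</syntaxhighlight>

The real samples then reload the saved plan and run inference with it, which is the pattern shown in the last sketch below.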
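In contrast, <code>sample_mnist_api</code> builds an equivalent network without any parser by adding each layer through the C++ API. The fragment below is only an illustrative sketch under the same TensorRT 5.x assumption: the topology is a simplified LeNet-style graph, and the zero-filled weight buffers are placeholders for the pre-trained weights the real sample loads from its data files.

<syntaxhighlight lang=cpp>
#include <cstdint>
#include <vector>

#include "NvInfer.h"

// Illustrative only: builds a small LeNet-like MNIST graph layer by layer.
void buildMnistNetwork(nvinfer1::INetworkDefinition* network)
{
    // Zero-filled placeholder weights; the real sample loads trained values.
    // They are static because TensorRT only copies them when the engine is built.
    static std::vector<float> convW(20 * 1 * 5 * 5, 0.f), convB(20, 0.f);
    static std::vector<float> fcW(10 * 20 * 12 * 12, 0.f), fcB(10, 0.f);

    nvinfer1::Weights convWeights{nvinfer1::DataType::kFLOAT, convW.data(), static_cast<int64_t>(convW.size())};
    nvinfer1::Weights convBias{nvinfer1::DataType::kFLOAT, convB.data(), static_cast<int64_t>(convB.size())};
    nvinfer1::Weights fcWeights{nvinfer1::DataType::kFLOAT, fcW.data(), static_cast<int64_t>(fcW.size())};
    nvinfer1::Weights fcBias{nvinfer1::DataType::kFLOAT, fcB.data(), static_cast<int64_t>(fcB.size())};

    // Input: one 28x28 grayscale digit.
    nvinfer1::ITensor* data =
        network->addInput("data", nvinfer1::DataType::kFLOAT, nvinfer1::Dims3{1, 28, 28});

    // Convolution (20 filters of 5x5) -> 2x2 max pooling -> fully connected -> softmax.
    nvinfer1::IConvolutionLayer* conv1 =
        network->addConvolution(*data, 20, nvinfer1::DimsHW{5, 5}, convWeights, convBias);
    conv1->setStride(nvinfer1::DimsHW{1, 1});

    nvinfer1::IPoolingLayer* pool1 =
        network->addPooling(*conv1->getOutput(0), nvinfer1::PoolingType::kMAX, nvinfer1::DimsHW{2, 2});
    pool1->setStride(nvinfer1::DimsHW{2, 2});

    nvinfer1::IFullyConnectedLayer* fc1 =
        network->addFullyConnected(*pool1->getOutput(0), 10, fcWeights, fcBias);

    nvinfer1::ISoftMaxLayer* prob = network->addSoftMax(*fc1->getOutput(0));

    // Name and mark the output so the built engine exposes it.
    prob->getOutput(0)->setName("prob");
    network->markOutput(*prob->getOutput(0));
}
</syntaxhighlight>

The weight buffers must remain valid until the engine has been built, which is why this sketch keeps them in static storage.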
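Finally, the "serialize and deserialize the engine" and "use the engine to perform inference" steps that several rows mention follow the pattern below. Again, this is a hedged sketch under the same TensorRT 5.x assumption; the binding names <code>data</code> and <code>prob</code> and the <code>mnist.engine</code> file name are the placeholders used in the earlier sketches.

<syntaxhighlight lang=cpp>
#include <fstream>
#include <iterator>
#include <vector>

#include <cuda_runtime_api.h>
#include "NvInfer.h"

// Reload a serialized engine and run one inference (illustrative sketch).
// 'logger' is the same nvinfer1::ILogger implementation used when building.
std::vector<float> inferDigit(nvinfer1::ILogger& logger, const std::vector<float>& image)
{
    // Read the serialized plan produced by the build sketch above.
    std::ifstream planFile("mnist.engine", std::ios::binary);
    std::vector<char> plan((std::istreambuf_iterator<char>(planFile)),
                           std::istreambuf_iterator<char>());

    nvinfer1::IRuntime* runtime = nvinfer1::createInferRuntime(logger);
    nvinfer1::ICudaEngine* engine =
        runtime->deserializeCudaEngine(plan.data(), plan.size(), nullptr);
    nvinfer1::IExecutionContext* context = engine->createExecutionContext();

    // One 28x28 input image in, ten class scores out (placeholder binding names).
    const int inputIndex = engine->getBindingIndex("data");
    const int outputIndex = engine->getBindingIndex("prob");

    void* buffers[2];
    cudaMalloc(&buffers[inputIndex], image.size() * sizeof(float));
    cudaMalloc(&buffers[outputIndex], 10 * sizeof(float));

    cudaMemcpy(buffers[inputIndex], image.data(), image.size() * sizeof(float),
               cudaMemcpyHostToDevice);
    context->execute(1, buffers); // batch size 1
    std::vector<float> scores(10);
    cudaMemcpy(scores.data(), buffers[outputIndex], 10 * sizeof(float),
               cudaMemcpyDeviceToHost);

    cudaFree(buffers[inputIndex]);
    cudaFree(buffers[outputIndex]);
    context->destroy();
    engine->destroy();
    runtime->destroy();
    return scores; // the highest score corresponds to the most likely digit
}
</syntaxhighlight>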


== Python API ==
You can find the Python samples in the <code>/usr/src/tensorrt/samples/python</code> directory. Every Python sample includes a README.md and a requirements.txt file. To run one of the Python samples, the process typically involves two steps:

<syntaxhighlight lang=bash>
python -m pip install -r requirements.txt # Install the sample requirements
python sample.py                          # Run the sample
</syntaxhighlight>

The available samples are:
*introductory_parser_samples
*end_to_end_tensorflow_mnist
*network_api_pytorch_mnist
*fc_plugin_caffe_mnist
*uff_custom_plugin

{{Ambox
|type=notice
|small=left
|issue='''The Python API isn't supported on Xavier at this time, and the Python API samples are not included with Xavier's TensorRT installation. To get these samples, you need to install TensorRT on the host.'''
|style=width:unset;
}}


<noinclude>
{{Xavier/Foot|Deep Learning‎/TensorRT/Parsing Caffe|Deep Learning/Deep Learning Accelerator}}
</noinclude>
