Pose Estimation using TensorRT on NVIDIA Jetson
This guide is based on NVIDIA's Real-time human pose estimation project on Jetson Nano at 22 FPS and the repository Real-time pose estimation accelerated with NVIDIA TensorRT (trt_pose).
This is an NVIDIA demo that uses a pose estimation model trained in PyTorch and deployed with TensorRT, demonstrating PyTorch-to-TensorRT conversion and pose estimation performance on NVIDIA Jetson platforms.
PyTorch Installation
To install PyTorch on the NVIDIA Jetson TX2 you will need to build it from source and apply a small patch.
First, install pip and CMake (the rest of this guide uses pip3, so install the Python 3 version):
sudo apt-get install python3-pip
sudo apt-get install cmake
Clone the PyTorch repo
This guide uses v1.0.0, because in other versions disabling NCCL in CMakeLists.txt and setup.py wasn't working:
git clone https://github.com/pytorch/pytorch
cd pytorch
git checkout v1.0.0
git submodule update --init --recursive
Install PyTorch prerequisites
sudo -H pip3 install -U setuptools
sudo -H pip3 install -r requirements.txt
Applying Patch
- You will need to disable NCCL (NVIDIA's multi-GPU communication library, intended for desktop and server GPUs) and distributed processing, and link the CUDA runtime library dynamically instead of statically (CUDA_USE_STATIC_CUDA_RUNTIME OFF). Here is the patch:
diff --git a/CMakeLists.txt b/CMakeLists.txt
index 159b15367e..6f7423df4e 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -95,7 +95,7 @@ option(USE_LMDB "Use LMDB" ON)
 option(USE_METAL "Use Metal for iOS build" ON)
 option(USE_MOBILE_OPENGL "Use OpenGL for mobile code" ON)
 option(USE_NATIVE_ARCH "Use -march=native" OFF)
-option(USE_NCCL "Use NCCL" ON)
+option(USE_NCCL "Use NCCL" OFF)
 option(USE_SYSTEM_NCCL "Use system-wide NCCL" OFF)
 option(USE_NNAPI "Use NNAPI" OFF)
 option(USE_NNPACK "Use NNPACK" ON)
@@ -119,7 +119,7 @@ option(USE_TENSORRT "Using Nvidia TensorRT library" OFF)
 option(USE_ZMQ "Use ZMQ" OFF)
 option(USE_ZSTD "Use ZSTD" OFF)
 option(USE_MKLDNN "Use MKLDNN" OFF)
-option(USE_DISTRIBUTED "Use distributed" ON)
+option(USE_DISTRIBUTED "Use distributed" OFF)
 cmake_dependent_option(
     USE_MPI "Use MPI for Caffe2. Only available if USE_DISTRIBUTED is on." ON
     "USE_DISTRIBUTED" OFF)
diff --git a/cmake/public/cuda.cmake b/cmake/public/cuda.cmake
index 849fa07524..9c71bfa027 100644
--- a/cmake/public/cuda.cmake
+++ b/cmake/public/cuda.cmake
@@ -9,6 +9,8 @@ endif()
 # release (3.11.3) yet. Hence we need our own Modules_CUDA_fix to enable sccache.
 list(APPEND CMAKE_MODULE_PATH ${CMAKE_CURRENT_LIST_DIR}/../Modules_CUDA_fix)
 
+SET(CUDA_USE_STATIC_CUDA_RUNTIME OFF CACHE INTERNAL "")
+
 # Find CUDA.
 find_package(CUDA 7.0)
 if(NOT CUDA_FOUND)
diff --git a/setup.py b/setup.py
index 20654625ab..be5191ac63 100644
--- a/setup.py
+++ b/setup.py
@@ -198,6 +198,8 @@ IS_DARWIN = (platform.system() == 'Darwin')
 IS_LINUX = (platform.system() == 'Linux')
 IS_PPC = (platform.machine() == 'ppc64le')
 IS_ARM = (platform.machine() == 'aarch64')
+USE_NCCL = False
+USE_DISTRIBUTED = False
 
 BUILD_PYTORCH = check_env_flag('BUILD_PYTORCH')
 # ppc64le and aarch64 do not support MKLDNN
Build and install PyTorch
(This will take a while.)
sudo python3 setup.py install
cd ..
Other dependencies
Install the required modules and packages with pip:
sudo -H pip3 install Pillow==6.1
sudo -H pip3 install torchvision
sudo -H pip3 install tensorrt
sudo -H pip3 install tqdm
sudo -H pip3 install cython
sudo -H pip3 install pycocotools
sudo apt-get install python3-matplotlib
Install Jetcam
(Jetcam is an NVIDIA utility to access a CSI or USB camera from Python.)
git clone https://github.com/NVIDIA-AI-IOT/jetcam
cd jetcam
sudo python3 setup.py install
cd ..
Jetcam is really simple to use. Once installed, you can access the camera with:
from jetcam.usb_camera import USBCamera
#from jetcam.csi_camera import CSICamera
from jetcam.utils import bgr8_to_jpeg

camera = USBCamera(width=WIDTH, height=HEIGHT, capture_fps=30)
#camera = CSICamera(width=WIDTH, height=HEIGHT, capture_fps=30)
camera.running = True

# The current camera frame.
camera.value

# Attach the execution function whenever a new camera frame is received.
camera.observe(function, names='value')
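The callback passed to camera.observe receives a traitlets-style change dictionary whose 'new' key holds the latest frame; the demo's per-frame function reads change['new']. The following stdlib-only mock illustrates that pattern (MockCamera is a hypothetical stand-in for illustration, not part of Jetcam):

```python
# Minimal stand-in for the traitlets observer pattern Jetcam builds on:
# observers registered on "value" are called with a change dictionary.
class MockCamera:
    def __init__(self):
        self._value = None
        self._observers = []

    def observe(self, fn, names='value'):
        # Register a callback fired whenever a new "frame" is assigned.
        self._observers.append(fn)

    @property
    def value(self):
        return self._value

    @value.setter
    def value(self, frame):
        old, self._value = self._value, frame
        for fn in self._observers:
            fn({'name': 'value', 'old': old, 'new': frame})


frames = []
cam = MockCamera()
# Same shape as the real callback: the new frame arrives as change['new'].
cam.observe(lambda change: frames.append(change['new']), names='value')
cam.value = 'frame-0'   # simulates a new camera frame arriving
print(frames)           # ['frame-0']
```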
Install torch2trt
(torch2trt is an NVIDIA utility that converts PyTorch models to TensorRT.)
git clone https://github.com/NVIDIA-AI-IOT/torch2trt
cd torch2trt
sudo python3 setup.py install
cd ..
Run the real time human pose estimation using TensorRT demo
Clone and install trt_pose repo
git clone https://github.com/NVIDIA-AI-IOT/trt_pose
cd trt_pose
sudo python3 setup.py install
cd ..
Models
Download the models from the following links and place the downloaded weights in tasks/human_pose:
- resnet18_baseline_att_224x224_A
- resnet50_baseline_att_256x256_A
- resnet50_baseline_att_384x384_A
- densenet121_baseline_att_224x224_B
- densenet121_baseline_att_256x256_B
- densenet121_baseline_att_320x320_A
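Alongside the weights, the demo reads the pose topology from tasks/human_pose/human_pose.json to determine the number of keypoints (parts) and skeleton links, which set the network's output channels. A minimal stdlib sketch of that step, using a truncated inline sample instead of the real file (field names follow the COCO convention used by trt_pose):

```python
import json

# Truncated stand-in for tasks/human_pose/human_pose.json.
human_pose = json.loads("""
{
  "keypoints": ["nose", "left_eye", "right_eye", "left_ear", "right_ear"],
  "skeleton": [[1, 2], [1, 3], [2, 4], [3, 5]]
}
""")

num_parts = len(human_pose["keypoints"])  # one confidence map per keypoint
num_links = len(human_pose["skeleton"])   # limb connections between keypoints
print(num_parts, num_links)  # 5 4
```

In the real demo, these two counts are passed to the model constructor, so the downloaded weights must match the topology file.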
Run the demo
Launch the live demo notebook (live_demo.ipynb) with Jupyter:
cd trt_pose/tasks/human_pose
jupyter notebook
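The trt_pose notebooks estimate throughput by timing repeated forward passes and dividing the iteration count by the elapsed time (the 22 FPS figure comes from this kind of measurement). A stdlib-only sketch of that benchmark, with a dummy infer() standing in for the TensorRT model call:

```python
import time

def infer():
    # Hypothetical stand-in for a TensorRT forward pass (~10 ms of work).
    time.sleep(0.01)

N = 20
t0 = time.time()
for _ in range(N):
    infer()
t1 = time.time()

fps = N / (t1 - t0)
print(f"{fps:.1f} FPS")
```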
The expected output is the live camera feed with the detected human pose keypoints and skeleton overlaid in real time.
For direct inquiries, please refer to the contact information available on our Contact page. Alternatively, you may complete and submit the form provided at the same link. We will respond to your request at our earliest opportunity.