Spherical Video PTZ Performance on Jetson AGX Thor platform

From RidgeRun Developer Wiki










Benchmark environment

The measurements are taken considering the following criteria:

  • Average behaviour: measurements considering typical image processing pipelines.

Instruments:

  • GPU: nvidia-smi
  • CPU: nvidia-smi
  • RAM: /proc/<PID>/status
  • Framerate: GstShark

Engine wrapper

As previously mentioned, Spherical Video PTZ provides an engine wrapper designed for application development. This section reports performance measurements taken while processing an input image through the Process method of the Spherical Video PTZ engine wrapper. Each value in the table below is the average over 1000 calls to Process. The tests ran on a Jetson AGX Thor with JetPack 7.1.0, in the MAXN power mode with its 14 cores.

Spherical Video PTZ performance
n   Image type  PTZ used  Input size (px)  Output size (px)  RAM, phys (MiB)  GPU usage (%)  GPU VRAM (MiB)  Avg CPU (%)
1   Image       No        2000x1000        500x500           164.57           0.32           104             1.12
2   Image       No        4000x2000        500x500           215.77           0.57           172             1.16
3   Image       No        4000x2000        1000x1000         217.04           0.79           182             1.05
4   Image       Yes       2000x1000        500x500           164.58           1.26           116             0.89
5   Image       Yes       4000x2000        500x500           206.35           1.51           140             1.12
6   Image       Yes       4000x2000        1000x1000         217.06           4.37           150             1.19
7   CudaImage   No        2000x1000        500x500           164.57           0.37           116             0.88
8   CudaImage   No        4000x2000        500x500           214.12           0.59           104             1.03
9   CudaImage   No        4000x2000        1000x1000         214.12           0.93           150             1.03
10  CudaImage   Yes       2000x1000        500x500           164.57           1.33           116             0.97
11  CudaImage   Yes       4000x2000        500x500           218.98           1.48           104             0.96
12  CudaImage   Yes       4000x2000        1000x1000         214.13           4.50           150             0.88

Please consider the following points:

  • Image type: This indicates whether the input and output images are allocated in GPU memory (lp/allocators/cudaimage.hpp) or in CPU memory (lp/image.hpp). The engine wrapper accepts both types. When given an Image, the engine internally copies it from CPU memory so that the operations can run on the GPU. This copy takes considerable time compared with using an image already allocated in GPU memory.
  • PTZ used: This indicates that the pan, tilt, or zoom property was changed before the input image was processed. The engine wrapper then executes an additional step to calculate the output image, which increases processing time.

GStreamer element: rrpanoramaptz

Average CPU Usage, RAM, and Processing time

The following table summarizes the average CPU Usage, RAM, and Processing time for different resolutions with and without NVMM.

Average CPU, RAM, and Processing time for different resolutions with and without NVMM memory.
Memory type  Resolution (input – output)  CPU (%)  RAM (%)  Processing average (ms)
RAW          2000x1000 – 500x500          0.07     0.15     0.3473
RAW          4000x2000 – 500x500          1.40     0.18     0.5569
RAW          4000x2000 – 1000x1000        1.37     0.17     0.6341
NVMM         2000x1000 – 500x500          0.37     0.10     0.6538
NVMM         4000x2000 – 500x500          0.46     0.22     1.2466
NVMM         4000x2000 – 1000x1000        0.46     0.23     1.356576
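
A processing average like the ones above can be derived from a saved GstShark trace. The sketch below is one hedged way to do it, not necessarily how the figures on this page were produced. It assumes each proctime trace line contains element=(string)NAME and time=(string)H:MM:SS.NNNNNNNNN; check that layout against your own log before relying on it. The function name avg_proctime_ms is hypothetical.

```shell
#!/bin/bash
# Average the proctime samples of one element, in milliseconds.
# Assumed trace line layout (verify against your log):
#   ... proctime, element=(string)NAME, time=(string)H:MM:SS.NNNNNNNNN;
avg_proctime_ms() {
    local element="$1" log="$2"
    grep -F "proctime" "$log" |
        grep -F "element=(string)${element}" |
        # Keep only the duration that follows time=(string)
        sed -n 's/.*time=(string)\([0-9:.]*\).*/\1/p' |
        # Convert H:MM:SS.NNN to milliseconds and average
        awk -F: '{ sum += ($1 * 3600 + $2 * 60 + $3) * 1000; n++ }
                 END { if (n) printf "%.4f\n", sum / n }'
}
```

GStreamer writes debug output to stderr, so appending 2> proctime.log to one of the pipelines below saves the trace; avg_proctime_ms rrpanoramaptz0 proctime.log then prints the mean, where rrpanoramaptz0 is the default name GStreamer would give the first unnamed instance of the element.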


Replicate the results

The following pipelines were used for each section.

Processing without NVMM

2000x1000 input to 500x500 output

GST_DEBUG="GST_TRACER:7" GST_TRACERS="proctime" gst-launch-1.0 videotestsrc is-live=true num-buffers=200 ! queue max-size-buffers=0 max-size-time=0 max-size-bytes=0 ! "video/x-raw,width=2000,height=1000" ! rrpanoramaptz ! "video/x-raw,width=500,height=500" ! perf ! fakesink

4000x2000 input to 500x500 output

GST_DEBUG="GST_TRACER:7" GST_TRACERS="proctime" gst-launch-1.0 videotestsrc is-live=true num-buffers=200 ! queue max-size-buffers=0 max-size-time=0 max-size-bytes=0 ! "video/x-raw,width=4000,height=2000" ! rrpanoramaptz ! "video/x-raw,width=500,height=500" ! perf ! fakesink

4000x2000 input to 1000x1000 output

GST_DEBUG="GST_TRACER:7" GST_TRACERS="proctime" gst-launch-1.0 videotestsrc is-live=true num-buffers=200 ! queue max-size-buffers=0 max-size-time=0 max-size-bytes=0 ! "video/x-raw,width=4000,height=2000" ! rrpanoramaptz ! "video/x-raw,width=1000,height=1000" ! perf ! fakesink
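
The three commands above differ only in their input and output sizes, so they can be generated from one template. The following is a sketch under that observation; the helper names build_pipeline and run_case are hypothetical, and running a case requires gst-launch-1.0 with rrpanoramaptz and GstShark installed.

```shell
#!/bin/bash
# Build the non-NVMM benchmark pipeline for a given input and output size.
build_pipeline() {
    local in_w="$1" in_h="$2" out_w="$3" out_h="$4"
    printf '%s' "videotestsrc is-live=true num-buffers=200 ! queue max-size-buffers=0 max-size-time=0 max-size-bytes=0 ! video/x-raw,width=${in_w},height=${in_h} ! rrpanoramaptz ! video/x-raw,width=${out_w},height=${out_h} ! perf ! fakesink"
}

# Run one case with the proctime tracer enabled.
run_case() {
    # Left unquoted on purpose so the description splits into arguments.
    GST_DEBUG="GST_TRACER:7" GST_TRACERS="proctime" gst-launch-1.0 $(build_pipeline "$@")
}

# The three cases from this section:
# run_case 2000 1000 500 500
# run_case 4000 2000 500 500
# run_case 4000 2000 1000 1000
```

Keeping the pipeline text in a single function makes it harder for the three cases to drift apart when a property such as num-buffers changes.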

PTZ transformations

#!/bin/bash

counter=0

gst-client pipeline_create p1 "videotestsrc is-live=true num-buffers=200 ! queue ! video/x-raw,width=4000,height=2000 ! rrpanoramaptz name=ptz ! video/x-raw,width=1000,height=1000 ! fakesink"

gst-client pipeline_play p1

while [ $counter -lt 180 ]; do
    gst-client --quiet element_set p1 ptz pan ${counter}
    ((counter++))
    sleep 0.02 # pan rate of change
done

gst-client pipeline_stop p1
gst-client pipeline_delete p1

Processing using NVMM

2000x1000 input to 500x500 output

GST_DEBUG="GST_TRACER:7" GST_TRACERS="proctime" gst-launch-1.0 videotestsrc is-live=true num-buffers=200 ! nvvidconv ! queue ! "video/x-raw(memory:NVMM),width=2000,height=1000" ! rrpanoramaptz ! "video/x-raw,width=500,height=500" ! perf ! fakesink

4000x2000 input to 500x500 output

GST_DEBUG="GST_TRACER:7" GST_TRACERS="proctime" gst-launch-1.0 videotestsrc is-live=true num-buffers=200 ! nvvidconv ! queue ! "video/x-raw(memory:NVMM),width=4000,height=2000" ! rrpanoramaptz ! "video/x-raw,width=500,height=500" ! perf ! fakesink


4000x2000 input to 1000x1000 output

GST_DEBUG="GST_TRACER:7" GST_TRACERS="proctime" gst-launch-1.0 videotestsrc is-live=true num-buffers=200 ! nvvidconv ! queue ! "video/x-raw(memory:NVMM),width=4000,height=2000" ! rrpanoramaptz ! "video/x-raw,width=1000,height=1000" ! perf ! fakesink

PTZ transformations

#!/bin/bash

counter=0

gst-client pipeline_create p1 "videotestsrc is-live=true num-buffers=200 ! nvvidconv ! queue ! video/x-raw(memory:NVMM),width=4000,height=2000 ! rrpanoramaptz name=ptz ! video/x-raw,width=1000,height=1000 ! perf ! fakesink"

gst-client pipeline_play p1

while [ $counter -lt 180 ]; do
    gst-client --quiet element_set p1 ptz pan ${counter}
    ((counter++))
    sleep 0.02 # pan rate of change
done

gst-client pipeline_stop p1
gst-client pipeline_delete p1


