GstQtOverlay plugin performance on NVIDIA Platforms
The tests were run under the following conditions:
- Maximum performance mode enabled: all cores online and Jetson clocks enabled
- Base installation
- The GstQtOverlay element is surrounded by queues, so the measurement reflects the actual capability of GstQtOverlay
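The maximum performance condition above can be set up with the standard Jetson tools `nvpmodel` and `jetson_clocks`. The following is a minimal sketch, not the exact procedure used for these measurements. The helper name `enable_max_performance` and the `DRY_RUN` variable are hypothetical, and the default power mode ID `0` is an assumption: the ID of the maximum performance profile differs between boards, so check `nvpmodel -q` first.

```shell
# Sketch: put a Jetson board into its maximum performance mode before benchmarking.
# Assumption: power mode 0 is the maximum performance profile. This is
# board-specific, so verify it with `nvpmodel -q` before relying on it.

# Usage: enable_max_performance [MODE_ID]
# With DRY_RUN=1 the commands are printed instead of executed.
enable_max_performance() {
    mode="${1:-0}"
    for cmd in "nvpmodel -m $mode" "jetson_clocks"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "sudo $cmd"
        else
            # Word splitting of $cmd is intentional here.
            sudo $cmd || return 1
        fi
    done
}

# Print the commands without touching the board.
DRY_RUN=1 enable_max_performance 0
```

Running the function without `DRY_RUN=1` executes both commands with `sudo` and stops at the first one that fails.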
Thor AGX Platform
The Thor AGX was tested with JetPack 7.1.
Format Performance
The following table shows the performance obtained for each format with NVMM support.
| Measurement | RGBA | UYVY | NV12 |
|---|---|---|---|
| NVMM with nvvidconv | 405.327 fps | 467.725 fps | 462.701 fps |
| NVMM with native formats | 405.335 fps | 402.133 fps | 406.673 fps |
CPU Usage
The CPU usage results are the following.
| Measurement | No NVMM | NVMM |
|---|---|---|
| GstQtOverlay | 1% | 1% |
| Rest of pipeline | 7.7% | 10% |
| Total | 8.7% | 11% |
Xavier NX Platform
The Xavier NX was tested with JetPack 4.4 (4.2.1 or earlier is not recommended).
Formats Performance
The following table shows the performance of each format with and without NVMM.
| Measurement | RGBA | UYVY | NV12 |
|---|---|---|---|
| No NVMM | 190 fps | N/A | N/A |
| NVMM with nvvidconv | 227 fps | 200 fps | 177 fps |
| NVMM with native formats | 265 fps | 172 fps | 177 fps |
CPU usage
The following table shows the CPU usage of the GstQtOverlay.
| Measurement | No NVMM | NVMM |
|---|---|---|
| GstQtOverlay | 16.5% | 7.3% |
| Rest of pipeline | 14% | 27.8% |
| Total | 30.5% | 35.1% |
The total CPU consumption of the pipeline is higher in the NVMM case because the pipeline contains more elements (the nvvidconv conversions before and after the overlay). The GstQtOverlay element itself consumes less CPU in NVMM mode.
Nano Platform
The Jetson Nano was tested with JetPack 4.5 (4.2.1 or earlier is not recommended).
Formats Performance
The following table shows the performance with and without NVMM. Without NVMM memory, only the RGBA format is supported, so the other formats are marked N/A.
| Measurement | RGBA | UYVY | NV12 |
|---|---|---|---|
| No NVMM | 57 fps | N/A | N/A |
| NVMM with nvvidconv | 121 fps | 67 fps | 61 fps |
| NVMM with native formats | 123 fps | 67 fps | 61 fps |
Using the native formats may not provide large performance gains here, since the test pipeline still uses nvvidconv to upload to NVMM memory. However, it allows connecting directly to sources that output NV12 in NVMM memory, such as the nvarguscamerasrc element.
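To illustrate the direct camera connection, the sketch below prints a pipeline in which nvarguscamerasrc feeds NV12 buffers in NVMM memory straight into qtoverlay, with no nvvidconv in front of it. It is an illustration under assumptions, not a measured configuration: the helper name `camera_overlay_pipeline`, the 1920x1080 at 30 fps sensor mode, and the reuse of the `main.qml` path from the test pipelines are all hypothetical choices.

```shell
# Sketch: camera pipeline that hands NV12 NVMM buffers directly to qtoverlay.
# Assumptions: a CSI camera supported by nvarguscamerasrc, a 1920x1080 @ 30 fps
# sensor mode, and the same main.qml path used by the test pipelines.

# Usage: camera_overlay_pipeline [QML_PATH]
# Prints the gst-launch-1.0 command instead of running it.
camera_overlay_pipeline() {
    qml="${1:-gst-libs/gst/qt/main.qml}"
    echo "gst-launch-1.0 nvarguscamerasrc ! 'video/x-raw(memory:NVMM), format=NV12, width=1920, height=1080, framerate=30/1' ! queue ! qtoverlay qml=${qml} ! perf ! queue ! nvvidconv ! 'video/x-raw' ! fakesink sync=false"
}

camera_overlay_pipeline
```

The printed command can be pasted into a terminal on a board with a camera attached. The only nvvidconv left is the one after the overlay, which downloads the frames from NVMM memory for fakesink, mirroring the test pipelines.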
CPU usage
The following table shows the results of CPU usage with and without NVMM.
| Measurement | No NVMM |
NVMM |
|---|---|---|
| GstQtOverlay | 2% | 2% |
| Rest of pipeline | 17% | 13% |
| Total | 19% | 15% |
Tests in multiple platforms regarding the resolution
The following results cover several resolutions. The first table shows the average behavior with a 30 fps source, and the second table shows the limit behavior, that is, the maximum framerate each platform reaches. Comparing the limit framerate with the average table shows how much headroom is left at each resolution. Keep in mind that the limit is only a theoretical one, since the imagefreeze element is used to push the hardware to its limit.
Average behavior (30 fps source):
| Platform | Mode | Resolution | Max Framerate (fps) | CPU (%) | RAM (MiB) |
|---|---|---|---|---|---|
| Thor AGX | No NVMM | 720p | 30 | 0.51 | 271.33 |
| Thor AGX | No NVMM | 1080p | 30 | 0.73 | 293.12 |
| Thor AGX | No NVMM | 4K | 30 | 2.30 | 412.97 |
| Thor AGX | NVMM | 720p | 30 | 0.54 | 286.22 |
| Thor AGX | NVMM | 1080p | 30 | 0.84 | 315.99 |
| Thor AGX | NVMM | 4K | 30 | 2.29 | 474.94 |
| Orin Nano | No NVMM | 720p | 30 | 4 | 96 |
| Orin Nano | No NVMM | 1080p | 30 | 6 | 128 |
| Orin Nano | No NVMM | 4K | 11.96 | 7 | 160 |
| Orin Nano | NVMM | 720p | 30 | 7 | 128 |
| Orin Nano | NVMM | 1080p | 30 | 9 | 144 |
| Orin Nano | NVMM | 4K | 25.409 | 22 | 160 |
| Xavier NX | No NVMM | 720p | 30 | 6 | 91 |
| Xavier NX | No NVMM | 1080p | 30 | 7 | 98 |
| Xavier NX | No NVMM | 4K | 11.574 | 8 | 119 |
| Xavier NX | NVMM | 720p | 30 | 10 | 98 |
| Xavier NX | NVMM | 1080p | 30 | 16 | 105 |
| Xavier NX | NVMM | 4K | 25.409 | 23 | 126 |

Limit behavior:
| Platform | Mode | Resolution | Max Framerate (fps) | CPU (%) | RAM (MiB) |
|---|---|---|---|---|---|
| Thor AGX | No NVMM | 720p | 312.529 | 2.60 | 270.07 |
| Thor AGX | No NVMM | 1080p | 250.146 | 2.47 | 292.64 |
| Thor AGX | No NVMM | 4K | 120.175 | 2.77 | 409.79 |
| Thor AGX | NVMM | 720p | 406.038 | 3.46 | 271.73 |
| Thor AGX | NVMM | 1080p | 403.607 | 3.85 | 291.19 |
| Thor AGX | NVMM | 4K | 193.832 | 2.29 | 411.6 |
| Orin Nano | No NVMM | 720p | 118.978 | 6 | 110.04 |
| Orin Nano | No NVMM | 1080p | 89.702 | 6 | 111.06 |
| Orin Nano | No NVMM | 4K | 35.474 | 6 | 117.376 |
| Orin Nano | NVMM | 720p | 301.708 | 14 | 128 |
| Orin Nano | NVMM | 1080p | 185.239 | 13 | 128 |
| Orin Nano | NVMM | 4K | 53.395 | 8 | 136 |
| Xavier NX | No NVMM | 720p | 254 | 7 | 89.141 |
| Xavier NX | No NVMM | 1080p | 202 | 14.1 | 89.141 |
| Xavier NX | No NVMM | 4K | 84.5 | 15.6 | 110.16 |
| Xavier NX | NVMM | 720p | 390 | 6 | 117 |
| Xavier NX | NVMM | 1080p | 228 | 11 | 151 |
| Xavier NX | NVMM | 4K | 73.3 | 13 | 157 |
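One way to relate the high framerate figures to the 30 fps tests is to divide a limit framerate by the 30 fps target, which gives the headroom factor for that resolution. The helper below is a small sketch of that arithmetic. The function name `headroom` is hypothetical, and the inputs are the Thor AGX NVMM values from the table with the high framerate figures.

```shell
# headroom LIMIT_FPS TARGET_FPS
# Prints LIMIT_FPS / TARGET_FPS rounded to two decimals.
headroom() {
    awk -v limit="$1" -v target="$2" 'BEGIN { printf "%.2f\n", limit / target }'
}

# Thor AGX, NVMM, 1080p: limit of 403.607 fps against a 30 fps target.
headroom 403.607 30   # prints 13.45
# Thor AGX, NVMM, 4K: limit of 193.832 fps against a 30 fps target.
headroom 193.832 30   # prints 6.46
```

A factor above 1 means the platform can sustain the 30 fps target at that resolution with capacity to spare. As noted above, the limit is a theoretical one, so the factor is an upper bound rather than a guarantee.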
Reproducing the results
The following pipelines were used for each respective section.
Formats Performance
With the addition of native UYVY and NV12 support for NVMM memory, we compared the performance of each format when converting with nvvidconv versus using the native support. The following pipelines were used:
NVMM using nvvidconv:
gst-launch-1.0 videotestsrc pattern=black ! 'video/x-raw, width=1920, height=1080, format=NV12' ! queue ! nvvidconv ! queue ! 'video/x-raw(memory:NVMM), format=RGBA' ! qtoverlay qml=gst-libs/gst/qt/main.qml ! perf ! queue ! nvvidconv ! 'video/x-raw' ! fakesink sync=false
NVMM with native UYVY and NV12 support:
gst-launch-1.0 videotestsrc pattern=black ! 'video/x-raw, width=1920, height=1080, format=NV12' ! queue ! nvvidconv ! queue ! 'video/x-raw(memory:NVMM), format=NV12' ! qtoverlay qml=gst-libs/gst/qt/main.qml ! perf ! queue ! nvvidconv ! 'video/x-raw' ! fakesink sync=false
No NVMM:
gst-launch-1.0 videotestsrc pattern=black ! 'video/x-raw, width=1920, height=1080' ! queue ! qtoverlay qml=gst-libs/gst/qt/main.qml ! perf ! queue ! fakesink sync=false
CPU Usage
No NVMM:
gst-launch-1.0 videotestsrc is-live=true ! 'video/x-raw, width=1920, height=1080' ! qtoverlay qml=gst-libs/gst/qt/main.qml ! perf print-cpu-load=true ! 'video/x-raw, width=1920, height=1080' ! fakesink
NVMM:
gst-launch-1.0 videotestsrc is-live=true ! 'video/x-raw, width=1920, height=1080' ! nvvidconv ! 'video/x-raw(memory:NVMM), width=1920, height=1080' ! qtoverlay qml=gst-libs/gst/qt/main.qml ! perf print-cpu-load=true ! nvvidconv ! 'video/x-raw, width=1920, height=1080' ! fakesink sync=false
Performance for different resolutions
No NVMM:
For average behavior:
gst-launch-1.0 videotestsrc is-live=1 ! "video/x-raw,width=${W},height=${H},framerate=30/1" ! nvvidconv ! qtoverlay qml=gst-libs/gst/qt/main.qml ! perf print-cpu-load=1 ! fakesink
For limit behavior:
gst-launch-1.0 videotestsrc ! "video/x-raw, width=${W},height=${H}" ! imagefreeze ! nvvidconv ! qtoverlay qml=gst-libs/gst/qt/main.qml ! perf print-cpu-load=1 ! fakesink
NVMM:
For average behavior:
gst-launch-1.0 videotestsrc is-live=1 ! "video/x-raw, width=${W}, height=${H}, framerate=30/1" ! queue ! nvvidconv ! queue ! 'video/x-raw(memory:NVMM)' ! qtoverlay qml=gst-libs/gst/qt/main.qml ! perf print-cpu-load=1 ! queue ! fakesink sync=false
For limit behavior:
gst-launch-1.0 videotestsrc ! "video/x-raw, width=${W}, height=${H}, framerate=30/1" ! queue ! nvvidconv ! queue ! 'video/x-raw(memory:NVMM)' ! qtoverlay qml=gst-libs/gst/qt/main.qml ! perf ! queue ! fakesink sync=false
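The resolution pipelines above expect `W` and `H` to be set in the shell. The sketch below shows one way to expand them for the three resolutions in the tables, using the NVMM average-behavior pipeline as the example. The helper names `resolution_dims` and `nvmm_average_pipeline` are hypothetical, and mapping 4K to 3840x2160 is an assumption, since the exact 4K size used in the tests is not stated.

```shell
# Map a resolution label to "WIDTH HEIGHT".
# Assumption: 4K means 3840x2160 (UHD); the exact size used in the tests is not stated.
resolution_dims() {
    case "$1" in
        720p)  echo "1280 720" ;;
        1080p) echo "1920 1080" ;;
        4K)    echo "3840 2160" ;;
        *)     echo "unknown resolution: $1" >&2; return 1 ;;
    esac
}

# Print the NVMM average-behavior pipeline for one resolution label.
nvmm_average_pipeline() {
    dims="$(resolution_dims "$1")" || return 1
    W="${dims% *}"   # text before the space: width
    H="${dims#* }"   # text after the space: height
    echo "gst-launch-1.0 videotestsrc is-live=1 ! \"video/x-raw, width=${W}, height=${H}, framerate=30/1\" ! queue ! nvvidconv ! queue ! 'video/x-raw(memory:NVMM)' ! qtoverlay qml=gst-libs/gst/qt/main.qml ! perf print-cpu-load=1 ! queue ! fakesink sync=false"
}

# Print one command per resolution; nothing is executed here.
for res in 720p 1080p 4K; do
    nvmm_average_pipeline "$res"
done
```

Each printed line is a complete command. Because the source is live, a pipeline keeps running until it is interrupted, so each run needs to be stopped manually once enough perf samples have been collected.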