GstQtOverlay plugin performance on NVIDIA Platforms
Xavier NX Platform
The tests were run under the following conditions:
- Maximum performance mode enabled: all cores active and Jetson clocks enabled
- JetPack 4.4 (4.2.1 or earlier is not recommended)
- Base installation
- GstQtOverlay is surrounded by queues, so that the measurement reflects the actual capability of GstQtOverlay
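The maximum performance configuration listed above is usually applied with the nvpmodel and jetson_clocks tools shipped with JetPack. The following is a minimal sketch, not the exact procedure used for these measurements. NVP_MODE and PERF_STATUS are hypothetical names, and the value 0 is only a placeholder: the ID of the all-cores power mode differs per board and JetPack release, so check /etc/nvpmodel.conf first.

```shell
#!/bin/sh
# Sketch: put a Jetson board in its maximum performance configuration.
# NVP_MODE is an assumption; 0 is a placeholder, not the right ID on every board.
NVP_MODE="${NVP_MODE:-0}"

if command -v nvpmodel >/dev/null 2>&1 && command -v jetson_clocks >/dev/null 2>&1; then
    sudo nvpmodel -m "$NVP_MODE"   # select the power mode
    sudo jetson_clocks             # raise the clocks to the maximum of that mode
    sudo nvpmodel -q               # print the active mode to confirm
    PERF_STATUS="applied"
else
    # Not on a Jetson board (or the tools are not in PATH): do nothing.
    PERF_STATUS="skipped"
fi
echo "performance setup: $PERF_STATUS"
```

Run it once before launching the benchmark pipelines, for example with NVP_MODE set to the all-cores mode of your board.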
To compare performance with and without NVMM support, we used the following pipelines:
NVMM:
gst-launch-1.0 videotestsrc pattern=black ! 'video/x-raw, width=1920, height=1080' ! queue ! nvvidconv ! queue ! 'video/x-raw(memory:NVMM)' ! qtoverlay qml=gst-libs/gst/qt/main.qml ! perf ! queue ! nvvidconv ! 'video/x-raw' ! fakesink sync=false
No NVMM
gst-launch-1.0 videotestsrc pattern=black ! 'video/x-raw, width=1920, height=1080' ! queue ! qtoverlay qml=gst-libs/gst/qt/main.qml ! perf ! queue ! fakesink sync=false
The results obtained:
Measurement | Jetson NX |
---|---|
No NVMM | 190 fps |
NVMM | 265 fps |
With native UYVY and NV12 support added for NVMM memory, we measured the performance of each format both when converting with nvvidconv and when using the native support. The following pipelines were used:
NVMM using nvvidconv:
gst-launch-1.0 videotestsrc pattern=black ! 'video/x-raw, width=1920, height=1080, format=RGBA' ! queue ! nvvidconv ! queue ! 'video/x-raw(memory:NVMM), format=RGBA' ! qtoverlay qml=gst-libs/gst/qt/main.qml ! perf ! queue ! nvvidconv ! 'video/x-raw' ! fakesink sync=false
NVMM with native UYVY and NV12 support:
gst-launch-1.0 videotestsrc pattern=black ! 'video/x-raw, width=1920, height=1080, format=NV12' ! queue ! nvvidconv ! queue ! 'video/x-raw(memory:NVMM), format=NV12' ! qtoverlay qml=gst-libs/gst/qt/main.qml ! perf ! queue ! nvvidconv ! 'video/x-raw' ! fakesink sync=false
The results obtained:
Measurement | RGBA | UYVY | NV12 |
---|---|---|---|
NVMM with nvvidconv | 227 fps | 200 fps | 177 fps |
NVMM with native formats | 227 fps | 172 fps | 177 fps |
CPU usage
Take the following pipelines as reference:
No NVMM
gst-launch-1.0 videotestsrc is-live=true ! 'video/x-raw, width=1920, height=1080' ! qtoverlay qml=gst-libs/gst/qt/main.qml ! 'video/x-raw, width=1920, height=1080' ! fakesink
NVMM:
gst-launch-1.0 videotestsrc is-live=true ! 'video/x-raw, width=1920, height=1080' ! nvvidconv ! 'video/x-raw(memory:NVMM), width=1920, height=1080' ! qtoverlay qml=gst-libs/gst/qt/main.qml ! nvvidconv ! 'video/x-raw, width=1920, height=1080' ! fakesink sync=false
Results
Measurement | No NVMM | NVMM |
---|---|---|
GstQtOverlay | 16.5% | 7.3% |
Rest of pipeline | 14% | 27.8% |
Total | 30.5% | 35.1% |
The total CPU consumption of the pipeline is higher in the NVMM case because it contains more elements (the two nvvidconv conversions). GstQtOverlay itself consumes less CPU in NVMM mode.
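The trade-off in the table can be checked with a few lines of arithmetic. This sketch only re-derives the totals and the differences from the published figures; the variable names are illustrative.

```shell
#!/bin/sh
# CPU load figures (percent) copied from the Xavier NX CPU usage table.
cpu_report=$(awk -v p_ov=16.5 -v p_rest=14 -v n_ov=7.3 -v n_rest=27.8 'BEGIN {
    printf "total No NVMM: %.1f%%\n", p_ov + p_rest
    printf "total NVMM: %.1f%%\n", n_ov + n_rest
    # Differences are in percentage points, not relative percent.
    printf "overlay saving: %.1f points\n", p_ov - n_ov
    printf "rest-of-pipeline increase: %.1f points\n", n_rest - p_rest
}')
echo "$cpu_report"
```

The overlay saves 9.2 points, but the rest of the pipeline grows by 13.8 points, which is why the NVMM total ends up 4.6 points higher.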
Nano Platform
The tests were run under the following conditions:
- Maximum performance mode enabled: all cores active and Jetson clocks enabled
- JetPack 4.5 (4.2.1 or earlier is not recommended)
- Base installation
- GstQtOverlay is surrounded by queues, so that the measurement reflects the actual capability of GstQtOverlay
To compare performance with and without NVMM support for every format, we used the following pipelines:
NVMM using nvvidconv:
gst-launch-1.0 videotestsrc pattern=black ! 'video/x-raw, width=1920, height=1080, format=NV12' ! queue ! nvvidconv ! queue ! 'video/x-raw(memory:NVMM), format=RGBA' ! qtoverlay qml=gst-libs/gst/qt/main.qml ! perf ! queue ! nvvidconv ! 'video/x-raw' ! fakesink sync=false
NVMM with native UYVY and NV12 support:
gst-launch-1.0 videotestsrc pattern=black ! 'video/x-raw, width=1920, height=1080, format=NV12' ! queue ! nvvidconv ! queue ! 'video/x-raw(memory:NVMM), format=NV12' ! qtoverlay qml=gst-libs/gst/qt/main.qml ! perf ! queue ! nvvidconv ! 'video/x-raw' ! fakesink sync=false
No NVMM
gst-launch-1.0 videotestsrc pattern=black ! 'video/x-raw, width=1920, height=1080' ! queue ! qtoverlay qml=gst-libs/gst/qt/main.qml ! perf ! queue ! fakesink sync=false
The results obtained:
Measurement | RGBA | UYVY | NV12 |
---|---|---|---|
No NVMM | 57 fps | N/A | N/A |
NVMM with nvvidconv | 121 fps | 67 fps | 61 fps |
NVMM with native formats | 123 fps | 67 fps | 61 fps |
Using the native formats may not provide large performance gains here, because these test pipelines still use nvvidconv to upload the frames to NVMM memory. The benefit is that qtoverlay can connect directly to cameras that output NV12 in NVMM memory, for example through the nvarguscamerasrc element.
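As an illustration of that direct connection, the sketch below feeds qtoverlay straight from nvarguscamerasrc, with no nvvidconv upload in between. It is an assumption-laden example and was not part of the measurements: it presumes a CSI camera supported by nvarguscamerasrc that offers a 1920x1080 mode, and it reuses the QML path from the benchmark pipelines. QML, CAPS, CAMERA_CMD and DRY_RUN are hypothetical names.

```shell
#!/bin/sh
# Sketch: camera -> qtoverlay with NV12 in NVMM memory, no upload step.
QML="${QML:-gst-libs/gst/qt/main.qml}"
CAPS='video/x-raw(memory:NVMM), width=1920, height=1080, format=NV12'

# Show the command first; set DRY_RUN=0 to execute it on the board.
CAMERA_CMD="gst-launch-1.0 nvarguscamerasrc ! '$CAPS' ! qtoverlay qml=$QML ! perf ! fakesink sync=false"
echo "$CAMERA_CMD"

if [ "${DRY_RUN:-1}" = "0" ]; then
    gst-launch-1.0 nvarguscamerasrc ! "$CAPS" ! qtoverlay qml="$QML" ! perf ! fakesink sync=false
fi
```

Since the camera already delivers NV12 in NVMM memory, the only format work left is whatever qtoverlay does internally for its native NV12 path.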
CPU usage
Take the following pipelines as reference:
No NVMM
gst-launch-1.0 videotestsrc is-live=true ! 'video/x-raw, width=1920, height=1080' ! qtoverlay qml=gst-libs/gst/qt/main.qml ! perf print-cpu-load=true ! 'video/x-raw, width=1920, height=1080' ! fakesink
NVMM:
gst-launch-1.0 videotestsrc is-live=true ! 'video/x-raw, width=1920, height=1080' ! nvvidconv ! 'video/x-raw(memory:NVMM), width=1920, height=1080' ! qtoverlay qml=gst-libs/gst/qt/main.qml ! perf print-cpu-load=true ! nvvidconv ! 'video/x-raw, width=1920, height=1080' ! fakesink sync=false
Results
Measurement | No NVMM | NVMM |
---|---|---|
GstQtOverlay | 2% | 2% |
Rest of pipeline | 17% | 13% |
Total | 19% | 15% |
Tests on multiple platforms at different resolutions
The following results come from tests at several resolutions with the source set to 30 fps, to cover a range of end-user use cases. The average-behavior tables show what the pipeline achieves with a live source, while the limit-behavior tables show the maximum framerate reached at each resolution. Compare the two to see the margin available at each resolution, but remember that the limit is only theoretical, since it is obtained by using the imagefreeze element to push the hardware to its limit.
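The pipelines in this section read the frame size from the shell variables W and H. A minimal sketch of how they might be set for the three tested resolutions follows. It assumes that 720p, 1080p and 4K mean 1280x720, 1920x1080 and 3840x2160. The num-buffers=300 setting (about 10 seconds at 30 fps) and the DRY_RUN switch are additions for illustration, not part of the original test procedure.

```shell
#!/bin/sh
# Sketch: run the NVMM average-behavior pipeline once per resolution.
QML="${QML:-gst-libs/gst/qt/main.qml}"

for res in 1280x720 1920x1080 3840x2160; do
    W="${res%x*}"   # text before the "x"
    H="${res#*x}"   # text after the "x"
    echo "W=$W H=$H"
    # Set DRY_RUN=0 on the board to actually launch the pipeline.
    if [ "${DRY_RUN:-1}" = "0" ]; then
        gst-launch-1.0 videotestsrc is-live=1 num-buffers=300 ! \
            "video/x-raw, width=${W}, height=${H}, framerate=30/1" ! \
            queue ! nvvidconv ! queue ! 'video/x-raw(memory:NVMM)' ! \
            qtoverlay qml="$QML" ! perf ! queue ! fakesink sync=false
    fi
done
```

The same loop works for the other pipelines: only the gst-launch-1.0 line changes.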
Orin Nano Platform
CPU usage
Take the following pipelines as reference:
No NVMM
For average behavior:
gst-launch-1.0 videotestsrc is-live=1 ! "video/x-raw,width=${W},height=${H},framerate=30/1" ! nvvidconv ! qtoverlay qml=gst-libs/gst/qt/main.qml ! perf print-cpu-load=1 ! fakesink
For limit behavior:
gst-launch-1.0 videotestsrc ! "video/x-raw, width=${W},height=${H}" ! imagefreeze ! nvvidconv ! qtoverlay qml=gst-libs/gst/qt/main.qml ! perf print-cpu-load=1 ! fakesink
Results for average behavior
Resolution | 720p | 1080p | 4K |
---|---|---|---|
Max Framerate (fps) | 30 | 30 | 11.626 |
CPU(%) | 4 | 6 | 7 |
RAM(MiB) | 96 | 128 | 160 |
Results for limit behavior
Resolution | 720p | 1080p | 4K |
---|---|---|---|
Max Framerate (fps) | 118.978 | 89.702 | 35.474 |
CPU(%) | 6 | 6 | 6 |
RAM(MiB) | 110.04 | 111.06 | 117.376 |
NVMM:
For average behavior:
gst-launch-1.0 videotestsrc is-live=1 ! "video/x-raw, width=${W}, height=${H}, framerate=30/1" ! queue ! nvvidconv ! queue ! 'video/x-raw(memory:NVMM)' ! qtoverlay qml=gst-libs/gst/qt/main.qml ! perf ! queue ! fakesink sync=false
For limit behavior:
gst-launch-1.0 videotestsrc ! "video/x-raw, width=${W}, height=${H}, framerate=30/1" ! imagefreeze ! queue ! nvvidconv ! queue ! 'video/x-raw(memory:NVMM)' ! qtoverlay qml=gst-libs/gst/qt/main.qml ! perf print-cpu-load=1 ! fakesink sync=false
Results for average behavior
Resolution | 720p | 1080p | 4K |
---|---|---|---|
Max Framerate (fps) | 30 | 30 | 25.409 |
CPU(%) | 7 | 9 | 22 |
RAM(MiB) | 128 | 144 | 160 |
Results for limit behavior
Resolution | 720p | 1080p | 4K |
---|---|---|---|
Max Framerate (fps) | 301.708 | 185.239 | 53.395 |
CPU(%) | 14 | 13 | 8 |
RAM(MiB) | 128 | 128 | 136 |
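One way to relate the limit figures to the 30 fps average runs is to express each limit as a multiple of the 30 fps target. The sketch below does this for the Orin Nano NVMM limit results; it is plain arithmetic on the published numbers and the row labels are illustrative.

```shell
#!/bin/sh
# Limit framerates (fps) for the Orin Nano NVMM case, from the table above.
# The 30 fps target is the live-source rate of the average-behavior runs.
headroom=$(awk 'BEGIN {
    target = 30
    n = split("720p:301.708 1080p:185.239 4K:53.395", rows, " ")
    for (i = 1; i <= n; i++) {
        split(rows[i], f, ":")
        printf "%s: %.1fx the 30 fps target\n", f[1], f[2] / target
    }
}')
echo "$headroom"
```

A ratio above 1 only indicates that the overlay path is not the bottleneck in the limit setup. It does not guarantee 30 fps with a live source, as the 4K average-behavior result shows.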
Xavier NX Platform
CPU usage
Take the following pipelines as reference:
No NVMM
For average behavior:
gst-launch-1.0 videotestsrc is-live=1 ! "video/x-raw,width=${W},height=${H},framerate=30/1" ! nvvidconv ! qtoverlay qml=gst-libs/gst/qt/main.qml ! perf print-cpu-load=1 ! fakesink
For limit behavior:
gst-launch-1.0 videotestsrc ! "video/x-raw,width=${W},height=${H},framerate=30/1" ! imagefreeze ! nvvidconv ! qtoverlay qml=gst-libs/gst/qt/main.qml ! perf print-cpu-load=1 ! fakesink
Results for average behavior
Resolution | 720p | 1080p | 4K |
---|---|---|---|
Max Framerate (fps) | 30 | 30 | 11.574 |
CPU(%) | 6 | 7 | 8 |
RAM(MiB) | 91 | 98 | 119 |
Results for limit behavior
Resolution | 720p | 1080p | 4K |
---|---|---|---|
Max Framerate (fps) | 254 | 202 | 84.5 |
CPU(%) | 7 | 14.1 | 15.6 |
RAM(MiB) | 89.141 | 89.141 | 110.16 |
NVMM:
For average behavior:
gst-launch-1.0 videotestsrc is-live=1 ! "video/x-raw, width=${W}, height=${H}, framerate=30/1" ! queue ! nvvidconv ! queue ! 'video/x-raw(memory:NVMM)' ! qtoverlay qml=gst-libs/gst/qt/main.qml ! perf print-cpu-load=1 ! queue ! fakesink sync=false
For limit behavior:
gst-launch-1.0 videotestsrc ! "video/x-raw, width=${W}, height=${H}, framerate=30/1" ! queue ! nvvidconv ! queue ! 'video/x-raw(memory:NVMM)' ! qtoverlay qml=gst-libs/gst/qt/main.qml ! perf ! queue ! fakesink sync=false
Results for average behavior
Resolution | 720p | 1080p | 4K |
---|---|---|---|
Max Framerate (fps) | 30 | 30 | 25.025 |
CPU(%) | 10 | 16 | 23 |
RAM(MiB) | 98 | 105 | 126 |
Results for limit behavior
Resolution | 720p | 1080p | 4K |
---|---|---|---|
Max Framerate (fps) | 390 | 228 | 73.3 |
CPU(%) | 6 | 11 | 13 |
RAM(MiB) | 117 | 151 | 157 |