Xavier NX Platform

The tests were run under the following conditions:

Maximum performance mode enabled: all cores active and Jetson clocks enabled
JetPack 4.4 (4.2.1 or earlier is not recommended)
Base installation
GstQtOverlay is surrounded by queues so that the measurement reflects its actual capability
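
The maximum performance setup listed above could be applied with a sketch like the following. `nvpmodel` and `jetson_clocks` are the stock NVIDIA tools on Jetson boards, but the power mode ID that enables all cores differs between boards and JetPack releases, so `MODE_ID` below is a placeholder assumption (check `/etc/nvpmodel.conf` on the target). The `run` helper and the `DRY_RUN` switch are hypothetical names added for illustration; by default the script only prints the commands.

```shell
#!/bin/sh
# Assumption: MODE_ID is board specific; 0 is only a placeholder.
MODE_ID="${MODE_ID:-0}"
# DRY_RUN=1 (default) prints the commands, DRY_RUN=0 executes them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print or execute a command depending on DRY_RUN.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

setup_max_performance() {
    run sudo nvpmodel -m "$MODE_ID"   # select the power mode
    run sudo jetson_clocks            # raise clocks to the maximum for that mode
}

setup_max_performance
```

Run it with `DRY_RUN=0 MODE_ID=<id>` on the board once the right mode ID is known; `sudo nvpmodel -q` reports the mode currently in use.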

To measure and compare performance with and without NVMM support, we used the following pipelines:

NVMM:

gst-launch-1.0 videotestsrc pattern=black ! 'video/x-raw, width=1920, height=1080' ! queue ! nvvidconv ! queue ! 'video/x-raw(memory:NVMM)' ! qtoverlay qml=gst-libs/gst/qt/main.qml ! perf ! queue ! nvvidconv ! 'video/x-raw' ! fakesink sync=false

No NVMM:

gst-launch-1.0 videotestsrc pattern=black ! 'video/x-raw, width=1920, height=1080' ! queue ! qtoverlay qml=gst-libs/gst/qt/main.qml ! perf ! queue ! fakesink sync=false

The results obtained:

Measurement	Jetson Xavier NX
No NVMM	190 fps
NVMM	265 fps
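
To put the two figures in relation, the ratio can be computed directly from the table. The snippet below is plain arithmetic on the reported values; the variable names are only illustrative.

```shell
#!/bin/sh
# Frame rates copied from the results table above (fps).
no_nvmm=190
nvmm=265

# Ratio of the NVMM frame rate to the non-NVMM frame rate, two decimals.
speedup=$(awk -v a="$nvmm" -v b="$no_nvmm" 'BEGIN { printf "%.2f", a / b }')
echo "NVMM frame rate relative to no NVMM: ${speedup}x"
```

On these numbers the NVMM pipeline runs at roughly 1.39 times the frame rate of the non-NVMM one.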

With native UYVY and NV12 support added for NVMM memory, we measured the performance for each format, comparing the nvvidconv path against the native support. The following pipelines were used:

NVMM using nvvidconv:

gst-launch-1.0 videotestsrc pattern=black ! 'video/x-raw, width=1920, height=1080, format=RGBA' ! queue ! nvvidconv ! queue ! 'video/x-raw(memory:NVMM), format=RGBA' ! qtoverlay qml=gst-libs/gst/qt/main.qml ! perf ! queue ! nvvidconv ! 'video/x-raw' ! fakesink sync=false

NVMM with native UYVY and NV12 support:

gst-launch-1.0 videotestsrc pattern=black ! 'video/x-raw, width=1920, height=1080, format=NV12' ! queue ! nvvidconv ! queue ! 'video/x-raw(memory:NVMM), format=NV12' ! qtoverlay qml=gst-libs/gst/qt/main.qml ! perf ! queue ! nvvidconv ! 'video/x-raw' ! fakesink sync=false

The results obtained:

Measurement	RGBA	UYVY	NV12
NVMM with nvvidconv	227 fps	200 fps	177 fps
NVMM with native formats	227 fps	172 fps	177 fps

CPU usage

Take the following pipelines as reference:

No NVMM:

gst-launch-1.0 videotestsrc is-live=true ! 'video/x-raw, width=1920, height=1080'  ! qtoverlay qml=gst-libs/gst/qt/main.qml ! 'video/x-raw, width=1920, height=1080' ! fakesink

NVMM:

gst-launch-1.0 videotestsrc is-live=true ! 'video/x-raw, width=1920, height=1080' ! nvvidconv ! 'video/x-raw(memory:NVMM), width=1920, height=1080' ! qtoverlay qml=gst-libs/gst/qt/main.qml ! nvvidconv ! 'video/x-raw, width=1920, height=1080'  ! fakesink sync=false

Results

Measurement	No NVMM	NVMM
GstQtOverlay	16.5%	7.3%
Rest of pipeline	14%	27.8%
Total	30.5%	35.1%

The total CPU usage of the pipeline is higher in the NVMM case because the pipeline contains more elements (the two nvvidconv conversions). GstQtOverlay itself consumes less CPU in NVMM mode: 7.3% versus 16.5%.

Nano Platform

The tests were run under the following conditions:

Maximum performance mode enabled: all cores active and Jetson clocks enabled
JetPack 4.5 (4.2.1 or earlier is not recommended)
Base installation
GstQtOverlay is surrounded by queues so that the measurement reflects its actual capability

To measure and compare performance with and without NVMM support for every format, we used the following pipelines:

NVMM using nvvidconv:

gst-launch-1.0 videotestsrc pattern=black ! 'video/x-raw, width=1920, height=1080, format=NV12' ! queue ! nvvidconv ! queue ! 'video/x-raw(memory:NVMM), format=RGBA' ! qtoverlay qml=gst-libs/gst/qt/main.qml ! perf ! queue ! nvvidconv ! 'video/x-raw' ! fakesink sync=false

NVMM with native UYVY and NV12 support:

gst-launch-1.0 videotestsrc pattern=black ! 'video/x-raw, width=1920, height=1080, format=NV12' ! queue ! nvvidconv ! queue ! 'video/x-raw(memory:NVMM), format=NV12' ! qtoverlay qml=gst-libs/gst/qt/main.qml ! perf ! queue ! nvvidconv ! 'video/x-raw' ! fakesink sync=false

No NVMM:

gst-launch-1.0 videotestsrc pattern=black ! 'video/x-raw, width=1920, height=1080' ! queue ! qtoverlay qml=gst-libs/gst/qt/main.qml ! perf ! queue ! fakesink sync=false

The results obtained:

Measurement	RGBA	UYVY	NV12
No NVMM	57 fps	N/A	N/A
NVMM with nvvidconv	121 fps	67 fps	61 fps
NVMM with native formats	123 fps	67 fps	61 fps

Using the native formats may not bring large performance gains here, because the test pipeline still uses nvvidconv to upload the frames to NVMM memory. It does, however, allow connecting directly to cameras that output NV12 in NVMM memory, for example through the nvarguscamerasrc element.
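
As an illustration of connecting a camera directly, a pipeline might look like the sketch below. It is an assumption, not a measured configuration: it presumes a CSI camera supported by nvarguscamerasrc that offers 1920x1080 at 30 fps, and it reuses the QML path and the perf element from the pipelines above. The `RUN` switch is a hypothetical name; without `RUN=1` the script only prints the command.

```shell
#!/bin/sh
QML="gst-libs/gst/qt/main.qml"
# Assumed camera mode; adjust to what the sensor actually provides.
CAPS="video/x-raw(memory:NVMM),width=1920,height=1080,format=NV12,framerate=30/1"

# The camera already delivers NV12 in NVMM memory, so no nvvidconv upload
# is placed in front of qtoverlay.
set -- nvarguscamerasrc ! "$CAPS" ! qtoverlay "qml=$QML" ! perf ! fakesink sync=false

if [ "${RUN:-0}" = "1" ]; then
    gst-launch-1.0 "$@"
else
    echo "gst-launch-1.0 $*"
fi
```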

CPU usage

Take the following pipelines as reference:

No NVMM:

gst-launch-1.0 videotestsrc is-live=true ! 'video/x-raw, width=1920, height=1080'  ! qtoverlay qml=gst-libs/gst/qt/main.qml ! perf print-cpu-load=true ! 'video/x-raw, width=1920, height=1080' ! fakesink

NVMM:

gst-launch-1.0 videotestsrc is-live=true ! 'video/x-raw, width=1920, height=1080' ! nvvidconv ! 'video/x-raw(memory:NVMM), width=1920, height=1080' ! qtoverlay qml=gst-libs/gst/qt/main.qml ! perf print-cpu-load=true ! nvvidconv ! 'video/x-raw, width=1920, height=1080'  ! fakesink sync=false

Results

Measurement	No NVMM	NVMM
GstQtOverlay	2%	2%
Rest of pipeline	17%	13%
Total	19%	15%

Tests in multiple platforms regarding the resolution

The following results cover several resolutions with a 30 fps source, to reflect the range of setups an end user may have. The limit tables can be read next to the average tables to see the ceiling for each resolution, but remember that this limit is only theoretical, since the imagefreeze element is used to push the hardware to its limit.

Orin Nano Platform

CPU usage

Take the following pipelines as reference:

No NVMM:

For average behavior:

gst-launch-1.0 videotestsrc is-live=1 ! "video/x-raw,width=${W},height=${H},framerate=30/1" ! nvvidconv ! qtoverlay qml=gst-libs/gst/qt/main.qml ! perf print-cpu-load=1 ! fakesink
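
The `${W}` and `${H}` placeholders in the pipeline above need to be set for each resolution under test. A minimal sketch of such a loop follows; the sizes are assumptions (in particular, 4K is taken as 3840x2160), and the loop only prints the values so that it can be adapted before launching anything.

```shell
#!/bin/sh
# Assumed mapping from resolution name to frame size.
for res in 720p:1280x720 1080p:1920x1080 4K:3840x2160; do
    name="${res%%:*}"    # text before the colon, e.g. 720p
    dims="${res#*:}"     # text after the colon, e.g. 1280x720
    W="${dims%x*}"       # width: text before the x
    H="${dims#*x}"       # height: text after the x
    echo "$name: W=$W H=$H"
    # On the target board, launch the pipeline under test here, for example:
    # gst-launch-1.0 videotestsrc is-live=1 ! "video/x-raw,width=${W},height=${H},framerate=30/1" ! ...
done
```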

For limit behavior:

gst-launch-1.0 videotestsrc ! "video/x-raw, width=${W},height=${H}" ! imagefreeze ! nvvidconv ! qtoverlay qml=gst-libs/gst/qt/main.qml ! perf print-cpu-load=1 ! fakesink

Results for average behavior

Resolution	720p	1080p	4K
Max Framerate (fps)	30	30	11.626
CPU(%)	4	6	7
RAM(MiB)	96	128	160
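
The 4K column above does not reach the 30 fps source rate. Converting frame rates to per-frame times makes the gap easier to read; the snippet below is plain arithmetic on the table values, with a hypothetical helper name `frame_ms`.

```shell
#!/bin/sh
# Convert a frame rate in fps to the time available per frame in milliseconds.
frame_ms() {
    awk -v fps="$1" 'BEGIN { printf "%.1f", 1000 / fps }'
}

budget=$(frame_ms 30)        # time per frame at the 30 fps source rate
measured=$(frame_ms 11.626)  # time per frame at the measured 4K rate

echo "30 fps budget: ${budget} ms per frame"
echo "4K measured:   ${measured} ms per frame"
```

At 4K without NVMM each frame takes about 86.0 ms, against a budget of 33.3 ms for 30 fps.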

Results for limit behavior

Resolution	720p	1080p	4K
Max Framerate (fps)	118.978	89.702	35.474
CPU(%)	6	6	6
RAM(MiB)	110.04	111.06	117.376

NVMM:

For average behavior:

gst-launch-1.0 videotestsrc  is-live=1 ! "video/x-raw, width=${W}, height=${H}, framerate=30/1" ! queue ! nvvidconv ! queue ! 'video/x-raw(memory:NVMM)' ! qtoverlay qml=gst-libs/gst/qt/main.qml ! perf ! queue ! fakesink sync=false

For limit behavior:

gst-launch-1.0 videotestsrc ! "video/x-raw, width=${W}, height=${H}, framerate=30/1" ! imagefreeze ! queue ! nvvidconv ! queue ! 'video/x-raw(memory:NVMM)' ! qtoverlay qml=gst-libs/gst/qt/main.qml ! perf print-cpu-load=1 ! fakesink sync=false

Results for average behavior

Resolution	720p	1080p	4K
Max Framerate (fps)	30	30	25.409
CPU(%)	7	9	22
RAM(MiB)	128	144	160

Results for limit behavior

Resolution	720p	1080p	4K
Max Framerate (fps)	301.708	185.239	53.395
CPU(%)	14	13	8
RAM(MiB)	128	128	136

Xavier NX Platform

CPU usage

Take the following pipelines as reference:

No NVMM:

For average behavior:

gst-launch-1.0 videotestsrc is-live=1 ! "video/x-raw,width=${W},height=${H},framerate=30/1" ! nvvidconv ! qtoverlay qml=gst-libs/gst/qt/main.qml ! perf print-cpu-load=1 ! fakesink

For limit behavior:

gst-launch-1.0 videotestsrc ! "video/x-raw,width=${W},height=${H},framerate=30/1"  ! imagefreeze ! nvvidconv ! qtoverlay qml=gst-libs/gst/qt/main.qml ! perf print-cpu-load=1 ! fakesink

Results for average behavior

Resolution	720p	1080p	4K
Max Framerate (fps)	30	30	11.574
CPU(%)	6	7	8
RAM(MiB)	91	98	119

Results for limit behavior

Resolution	720p	1080p	4K
Max Framerate (fps)	254	202	84.5
CPU(%)	7	14.1	15.6
RAM(MiB)	89.141	89.141	110.16

NVMM:

For average behavior:

gst-launch-1.0 videotestsrc  is-live=1 ! "video/x-raw, width=${W}, height=${H}, framerate=30/1" ! queue ! nvvidconv ! queue ! 'video/x-raw(memory:NVMM)' ! qtoverlay qml=gst-libs/gst/qt/main.qml ! perf print-cpu-load=1  ! queue ! fakesink sync=false

For limit behavior:

gst-launch-1.0 videotestsrc  ! "video/x-raw, width=${W}, height=${H}, framerate=30/1" ! queue ! nvvidconv ! queue ! 'video/x-raw(memory:NVMM)' ! qtoverlay qml=gst-libs/gst/qt/main.qml ! perf ! queue ! fakesink sync=false

Results for average behavior

Resolution	720p	1080p	4K
Max Framerate (fps)	30	30	25.025
CPU(%)	10	16	23
RAM(MiB)	98	105	126

Results for limit behavior

Resolution	720p	1080p	4K
Max Framerate (fps)	390	228	73.3
CPU(%)	6	11	13
RAM(MiB)	117	151	157


GstQtOverlay plugin performance on NVIDIA Platforms
