OpenGL Accelerated HTML Overlay: Performance - NVIDIA Jetson
Library performance
The library has two major components: hardware-accelerated graphical rendering, done by OpenGL, and web rendering, done by the WebKitGTK engine. The following sections report the performance of each component separately.
Graphical Rendering by OpenGL
This section presents the performance of HTML Overlay measured on the following setup:
- Board: NVIDIA Jetson Xavier NX
- Jetpack: 5.1
All the packages and dependencies are retrieved from the default APT repositories.
The following table shows the CPU usage, GPU usage, processing time and FPS.
Board: NVIDIA Jetson Xavier NX
Power Configuration | Operation | 4K CPU usage (%) | 4K GPU usage (%) | 4K Processing time (ms) | 4K FPS | 1080p CPU usage (%) | 1080p GPU usage (%) | 1080p Processing time (ms) | 1080p FPS | 720p CPU usage (%) | 720p GPU usage (%) | 720p Processing time (ms) | 720p FPS |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
10 W Desktop Mode | Upload | 6.34 | 36.99 | 11.46 | 87.24 | 7.25 | 19.64 | 14 | 71.4 | 4.44 | 15.72 | 7.293 | 137 |
10 W Desktop Mode | Draw | 0.16 | 11.96 | 4.329 | 231 | 0.36 | 7.71 | 2.5 | 395 | 0.40 | 5.67 | 2.077 | 481 |
10 W Desktop Mode | Download | 7.70 | 29.08 | 15.053 | 66.43 | 5 | 14.73 | 8.4 | 118 | 3.14 | 6.53 | 4.552 | 220 |
20 W + Jetson Clocks (Max Power) | Upload | 4.25 | 6.40 | 11.357 | 88 | 1.45 | 2.21 | 2.997 | 334 | 0.77 | 1.30 | 1.492 | 670 |
20 W + Jetson Clocks (Max Power) | Draw | 0.09 | 1.59 | 0.773 | 1294 | 0.09 | 0.82 | 0.478 | 2092 | 0.10 | 0.66 | 0.443 | 2309 |
20 W + Jetson Clocks (Max Power) | Download | 2.73 | 6.43 | 6.489 | 154.1 | 0.94 | 2.13 | 1.947 | 514 | 0.57 | 0.84 | 1.071 | 933.7 |
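The FPS values in the table above appear to follow from the processing time as FPS ≈ 1000 / time (ms). This relation is inferred from the numbers rather than stated by the measurements, and a few entries differ slightly from it, presumably due to rounding. A minimal sketch of the conversion, where `ms_to_fps` is a hypothetical helper name:

```shell
#!/bin/sh
# Convert a per-frame processing time in milliseconds to frames per second.
# Assumption: FPS = 1000 / processing_time_ms (inferred from the table).
ms_to_fps() {
    awk -v ms="$1" 'BEGIN { printf "%.2f\n", 1000 / ms }'
}

ms_to_fps 11.46    # 4K upload, 10 W Desktop Mode (table lists 87.24)
ms_to_fps 0.773    # 4K draw, 20 W + Jetson Clocks (table lists 1294)
```

The same helper can be used to sanity-check any other row, since each operation is timed on its own.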
Web rendering by WebKitGTK
The following table shows the CPU usage, GPU usage, processing time and FPS.
Board: NVIDIA Jetson Xavier NX
Power Configuration | Operation | 4K CPU usage (%) | 4K GPU usage (%) | 4K Processing time (ms) | 4K FPS | 1080p CPU usage (%) | 1080p GPU usage (%) | 1080p Processing time (ms) | 1080p FPS | 720p CPU usage (%) | 720p GPU usage (%) | 720p Processing time (ms) | 720p FPS |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
10 W Desktop Mode | Draw | 9.2 | 0 | 215.425 | 4.64 | 9.2 | 0 | 52.525 | 19 | 6.77 | 0 | 23.095 | 43.3 |
20 W + Jetson Clocks (Max Power)[1] | Draw | 6.1 | 0 | 281.714 | 3.55 | 6.3 | 0 | 69.137 | 14.46 | 5.6 | 0 | 28.515 | 35.1 |
- Note: In the 20 W + Jetson Clocks mode the operating frequency is 1.4 GHz, while in the 10 W Desktop mode it is 1.9 GHz. This difference is reflected in the processing times measured in each mode.
- Note: GPU usage is zero because WebKitGTK runs with the following flag, which disables its use of the GPU:
export WEBKIT_DISABLE_COMPOSITING_MODE=1
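Exporting the variable affects every command started later from the same shell. If the flag should apply to a single run only, it can be scoped to that command with `env`. The wrapper name `run_without_gpu_compositing` below is hypothetical, and the `sh -c` call only stands in for the real pipeline command:

```shell
#!/bin/sh
# Run one command with WebKitGTK compositing disabled, leaving the
# environment of the calling shell untouched.
run_without_gpu_compositing() {
    env WEBKIT_DISABLE_COMPOSITING_MODE=1 "$@"
}

# Stand-in command that just shows the variable as the child process sees it.
run_without_gpu_compositing sh -c 'echo "flag=$WEBKIT_DISABLE_COMPOSITING_MODE"'
```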
GStreamer plugin performance
The plugin was tested with an example overlay and a camera, using a Jetson Xavier NX with Jetpack 5.1.1. The measurements were taken with the following pipeline, using gst-perf:
gst-launch-1.0 -ve nvarguscamerasrc num-buffers=300 ! "video/x-raw(memory:NVMM),height=$H,width=$W,framerate=30/1" ! nvvidconv flip-method=2 ! queue ! htmloverlay url="http://0.0.0.0:8000/overlay.html" enable-js=true web-refresh-rate=10 ! perf ! queue ! nvvidconv ! xvimagesink
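The pipeline above takes its frame size from `$H` and `$W`. A minimal sketch of generating the command for each tested resolution follows. The pixel sizes assigned to 720p, 1080p and 4k are the common meanings of those labels and are an assumption, and the function names are hypothetical. The script only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Map a resolution label to "WIDTH HEIGHT". These pixel sizes are the common
# meanings of the labels and are an assumption, not values from the tests.
resolution_size() {
    case "$1" in
        720p)  echo "1280 720" ;;
        1080p) echo "1920 1080" ;;
        4k)    echo "3840 2160" ;;
        *)     echo "unknown resolution: $1" >&2; return 1 ;;
    esac
}

# Print (do not run) the benchmark command for one resolution.
print_pipeline() {
    size=$(resolution_size "$1") || return 1
    W=${size% *}   # text before the space: width
    H=${size#* }   # text after the space: height
    printf '%s "video/x-raw(memory:NVMM),height=%s,width=%s,framerate=30/1" %s\n' \
        'gst-launch-1.0 -ve nvarguscamerasrc num-buffers=300 !' "$H" "$W" \
        '! nvvidconv flip-method=2 ! queue ! htmloverlay url="http://0.0.0.0:8000/overlay.html" enable-js=true web-refresh-rate=10 ! perf ! queue ! nvvidconv ! xvimagesink'
}

for res in 720p 1080p 4k; do
    print_pipeline "$res"
done
```

On the board, a printed command can be executed by piping it to a shell, for example `print_pipeline 1080p | sh`.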
Board: Jetson Xavier NX
Resolution | 720p | 1080p | 4K |
---|---|---|---|
FPS (10W-4core) | 166.6167 | 54.7143 | 13.4852 |
FPS (20W-6core & jetson-clocks)[1] | 120.5662 | 56.2988 | 12.7323 |
Used overlay
The overlay used in the tests (click View Source on the wiki to see the HTML):
The following results cover several tests at different resolutions with a 30 fps source, to show what an end user can expect. The frame rates in the limit tables can be compared with those in the average tables to see the headroom at each resolution. Keep in mind that the limit is only theoretical, since the imagefreeze element is used to push the hardware to its limit.
Orin Nano Platform
CPU usage
Taking the following pipelines as reference:
No GL Memory
For average behavior:
gst-launch-1.0 videotestsrc is-live=1 ! "video/x-raw,framerate=30/1,height=${H},width=${W}" ! queue ! nvvidconv ! videoconvert ! "video/x-raw" ! queue ! htmloverlay url="https://www.clocktab.com/" enable-js=true web-refresh-rate=5 overlay-x=100 ! queue ! "video/x-raw" ! queue ! perf print-cpu-load=true ! fakesink
For limit behavior:
gst-launch-1.0 videotestsrc num-buffers=1 pattern=ball ! "video/x-raw,format=RGBA,height=${H},width=${W}" ! imagefreeze ! queue ! htmloverlay url="https://www.clocktab.com/" enable-js=true web-refresh-rate=5 overlay-x=100 ! queue ! perf print-cpu-load=true ! fakesink
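Both pipelines depend on elements that are not part of a stock GStreamer install, such as htmloverlay and perf. A minimal sketch of a pre-flight check with `gst-inspect-1.0 --exists` follows. The function name `check_elements` and the `GST_INSPECT` override variable are hypothetical, and the element list is taken from the pipelines above.

```shell
#!/bin/sh
# Tool used to query the GStreamer registry; overridable for testing.
GST_INSPECT=${GST_INSPECT:-gst-inspect-1.0}

# Report every element that the registry does not know about.
# Returns 0 when all elements exist, 1 otherwise.
check_elements() {
    missing=0
    for element in "$@"; do
        if ! "$GST_INSPECT" --exists "$element" 2>/dev/null; then
            echo "missing: $element"
            missing=1
        fi
    done
    return "$missing"
}

check_elements videotestsrc imagefreeze nvvidconv videoconvert htmloverlay perf fakesink \
    || echo "install the missing plugins before benchmarking"
```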
Results for average behavior
Resolution | 720p | 1080p | 4K |
---|---|---|---|
Max Framerate (fps) | 30 | 30 | 30 |
CPU(%) | 21 | 23 | 47 |
RAM(MiB) | 568 | 640 | 1040 |
Results for limit behavior
Resolution | 720p | 1080p | 4K |
---|---|---|---|
Max Framerate (fps) | 375.123 | 187.907 | 51.722 |
CPU(%) | 24 | 24 | 31 |
RAM(MiB) | 332.564 | 359.464 | 359.464 |
GL Memory
For average behavior:
gst-launch-1.0 videotestsrc is-live=1 ! "video/x-raw,framerate=30/1,height=${H},width=${W}" ! queue ! nvvidconv ! videoconvert ! glupload ! "video/x-raw(memory:GLMemory)" ! queue ! glhtmloverlay url="https://www.clocktab.com/" enable-js=true web-refresh-rate=5 overlay-x=100 ! queue ! "video/x-raw(memory:GLMemory)" ! gldownload ! queue ! perf print-cpu-load=true ! fakesink
For limit behavior:
gst-launch-1.0 videotestsrc num-buffers=1 pattern=ball ! "video/x-raw,format=RGBA,height=${H},width=${W}" ! imagefreeze ! queue ! glupload ! "video/x-raw(memory:GLMemory)" ! queue ! glhtmloverlay url="https://www.clocktab.com/" enable-js=true web-refresh-rate=5 overlay-x=100 ! queue ! "video/x-raw(memory:GLMemory)" ! gldownload ! queue ! perf print-cpu-load=true ! fakesink
Results for average behavior
Resolution | 720p | 1080p | 4K |
---|---|---|---|
Max Framerate (fps) | 30 | 30 | 24 |
CPU(%) | 9 | 22 | 32 |
RAM(MiB) | 424 | 616 | 848 |
Results for limit behavior
Resolution | 720p | 1080p | 4K |
---|---|---|---|
Max Framerate (fps) | 793.2 | 462.677 | 146.421 |
CPU(%) | 25 | 25 | 36 |
RAM(MiB) | 301.245 | 337.456 | 337.456 |
Xavier NX Platform
CPU usage
Taking the following pipelines as reference:
No GL Memory
For average behavior:
gst-launch-1.0 videotestsrc is-live=1 ! "video/x-raw,framerate=30/1,height=${H},width=${W}" ! queue ! nvvidconv ! videoconvert ! "video/x-raw" ! queue ! htmloverlay url="https://www.clocktab.com/" enable-js=true web-refresh-rate=5 overlay-x=100 ! queue ! "video/x-raw" ! queue ! perf print-cpu-load=true ! fakesink
For limit behavior:
gst-launch-1.0 videotestsrc num-buffers=1 pattern=ball ! "video/x-raw,format=RGBA,height=${H},width=${W}" ! imagefreeze ! queue ! htmloverlay url="https://www.clocktab.com/" enable-js=true web-refresh-rate=5 overlay-x=100 ! queue ! perf print-cpu-load=true ! fakesink
Results for average behavior
Resolution | 720p | 1080p | 4K |
---|---|---|---|
Max Framerate (fps) | 30 | 30 | 30 |
CPU(%) | 14 | 17 | 40 |
RAM(MiB) | 189 | 231 | 469 |
Results for limit behavior
Resolution | 720p | 1080p | 4K |
---|---|---|---|
Max Framerate (fps) | 302.5 | 159.685 | 48.624 |
CPU(%) | 25 | 29 | 39 |
RAM(MiB) | 406 | 455 | 630 |
GL Memory
For average behavior:
gst-launch-1.0 videotestsrc is-live=1 ! "video/x-raw,framerate=30/1,height=${H},width=${W}" ! queue ! nvvidconv ! videoconvert ! glupload ! "video/x-raw(memory:GLMemory)" ! queue ! glhtmloverlay url="https://www.clocktab.com/" enable-js=true web-refresh-rate=5 overlay-x=100 ! queue ! "video/x-raw(memory:GLMemory)" ! gldownload ! queue ! perf print-cpu-load=true ! fakesink
For limit behavior:
gst-launch-1.0 videotestsrc num-buffers=1 pattern=ball ! "video/x-raw,format=RGBA,height=${H},width=${W}" ! imagefreeze ! queue ! glupload ! "video/x-raw(memory:GLMemory)" ! queue ! glhtmloverlay url="https://www.clocktab.com/" enable-js=true web-refresh-rate=5 overlay-x=100 ! queue ! "video/x-raw(memory:GLMemory)" ! gldownload ! queue ! perf print-cpu-load=true ! fakesink
Results for average behavior
Resolution | 720p | 1080p | 4K |
---|---|---|---|
Max Framerate (fps) | 30 | 30 | 23 |
CPU(%) | 20 | 26 | 39 |
RAM(MiB) | 147 | 168 | 301 |
Results for limit behavior
Resolution | 720p | 1080p | 4K |
---|---|---|---|
Max Framerate (fps) | 450.771 | 311.757 | 125.448 |
CPU(%) | 27 | 28 | 37 |
RAM(MiB) | 378 | 448 | 679 |
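One way to read the limit tables is to compare the maximum frame rate with and without GL memory on each platform. The sketch below copies the limit-behavior frame rates from the tables above and divides the GL memory value by the non-GL value. The function name `limit_speedups` and the platform labels are hypothetical.

```shell
#!/bin/sh
# Limit-behavior frame rates copied from the tables above.
# Columns: platform, resolution, fps without GL memory, fps with GL memory.
limit_speedups() {
    awk '{ printf "%s %s %.2fx\n", $1, $2, $4 / $3 }' <<'EOF'
orin-nano 720p 375.123 793.2
orin-nano 1080p 187.907 462.677
orin-nano 4K 51.722 146.421
xavier-nx 720p 302.5 450.771
xavier-nx 1080p 159.685 311.757
xavier-nx 4K 48.624 125.448
EOF
}

limit_speedups
```

With these figures the ratio grows with resolution on both platforms, from about 2.1x at 720p to about 2.8x at 4K on the Orin Nano and from about 1.5x to about 2.6x on the Xavier NX.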