GStreamer Encoding Latency in NVIDIA Jetson Platforms
Problems running the pipelines shown on this page? Please see our GStreamer Debugging guide for help.
Introduction
This wiki evaluates the latency, or processing time, of the GStreamer hardware-accelerated encoders available on Jetson platforms.
The results shown here were obtained on a Jetson TX2 platform. The evaluation was aimed at optimizing Jetpack 3.3.3, but some results using Jetpack 4.5.1 are included to show the improvements in newer releases.
Latency Tests
This evaluation involves two pipelines encoding simultaneously, using the H264 codec. The test cases cover the following resolutions:
- 1920x1080@50FPS
- 1280x720@50FPS
The starting point pipelines are the following:
- 1920x1080@50FPS
gst-launch-1.0 videotestsrc is-live=true ! "video/x-raw,format=I420,width=1920,height=1080,framerate=50/1" ! nvvidconv name=nv0 ! "video/x-raw(memory:NVMM)" ! omxh264enc name=enc0 control-rate=variable bitrate=20000000 profile=main ! video/x-h264,stream-format=byte-stream ! fakesink videotestsrc is-live=true ! "video/x-raw,format=I420,width=1920,height=1080,framerate=50/1" ! nvvidconv name=nv1 ! "video/x-raw(memory:NVMM)" ! omxh264enc name=enc1 control-rate=variable bitrate=20000000 profile=main ! video/x-h264,stream-format=byte-stream ! fakesink
- 1280x720@50FPS
gst-launch-1.0 videotestsrc is-live=true ! "video/x-raw,format=I420,width=1280,height=720,framerate=50/1" ! nvvidconv name=nv0 ! "video/x-raw(memory:NVMM)" ! omxh264enc name=enc0 control-rate=variable bitrate=20000000 profile=main ! video/x-h264,stream-format=byte-stream ! fakesink videotestsrc is-live=true ! "video/x-raw,format=I420,width=1280,height=720,framerate=50/1" ! nvvidconv name=nv1 ! "video/x-raw(memory:NVMM)" ! omxh264enc name=enc1 control-rate=variable bitrate=20000000 profile=main ! video/x-h264,stream-format=byte-stream ! fakesink
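The two starting pipelines differ only in width and height, and each one repeats the same branch twice. As a minimal Python sketch, the same description can be rebuilt from parameters. The helper names `encoder_branch` and `dual_encoder_pipeline` are hypothetical and are not part of GStreamer or of the original test setup.

```python
def encoder_branch(index, width, height, fps=50, bitrate=20000000):
    """Return one videotestsrc -> nvvidconv -> omxh264enc -> fakesink branch."""
    return (
        "videotestsrc is-live=true ! "
        f"video/x-raw,format=I420,width={width},height={height},framerate={fps}/1 ! "
        f"nvvidconv name=nv{index} ! "
        "video/x-raw(memory:NVMM) ! "
        f"omxh264enc name=enc{index} control-rate=variable "
        f"bitrate={bitrate} profile=main ! "
        "video/x-h264,stream-format=byte-stream ! fakesink"
    )


def dual_encoder_pipeline(width, height):
    """Join two identical branches into a single pipeline description."""
    return " ".join(encoder_branch(i, width, height) for i in range(2))


if __name__ == "__main__":
    # Prints the 1080p description used as the starting point above.
    print(dual_encoder_pipeline(1920, 1080))
```

One way to launch the description without a shell is `subprocess.run(["gst-launch-1.0", *shlex.split(description)])`. The caps strings then need no quoting.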
For this analysis we are only interested in the encoders' processing time, so the plots only show the behavior of the encoders.
Jetpack 3.3 OMX (TX2)
Base
Here we can see the following results:
Resolution | Mean Latency | Peak Latency |
---|---|---|
1920x1080 | ~22ms | ~32ms |
1280x720 | ~15ms | ~18ms |
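The tables on this page summarize each run with a mean and a peak processing time. The sketch below shows that reduction, assuming the per-buffer encoder times have already been collected as a list of milliseconds. The function name `summarize_latency` is hypothetical, and the sample values are made up for illustration. They are not measurements from these tests.

```python
from statistics import mean


def summarize_latency(samples_ms):
    """Return (mean, peak) of per-buffer processing times in milliseconds."""
    if not samples_ms:
        raise ValueError("at least one sample is required")
    return mean(samples_ms), max(samples_ms)


if __name__ == "__main__":
    # Made-up sample values, only to show the calculation.
    example = [10.0, 12.0, 11.0, 19.0]
    average, peak = summarize_latency(example)
    print(f"mean={average:.1f} ms, peak={peak:.1f} ms")
```

The peak is reported separately because a single slow buffer can disturb a live stream even when the mean looks comfortable.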
Different Profiles
To check whether the encoder profile affects the latency, we tested the three profiles at the 1080p resolution:
There is no real difference between the baseline and main profiles, but the high profile significantly increases the peak processing time of the encoder (from ~32ms to ~53ms) while the mean stays at ~22ms:
Profile | Mean Latency | Peak Latency |
---|---|---|
Baseline | ~22ms | ~32ms |
Main | ~22ms | ~32ms |
High | ~22ms | ~53ms |
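The profile comparison only changes the `profile` property of the encoder. The sketch below assumes that `omxh264enc` accepts the values `baseline`, `main` and `high`. Only `main` appears in the starting pipelines, so the other two should be confirmed with `gst-inspect-1.0 omxh264enc` on the target. The helper `encoder_with_profile` is hypothetical.

```python
# Assumed property values; only "main" is confirmed by the pipelines above.
PROFILES = ("baseline", "main", "high")


def encoder_with_profile(profile, name="enc0", bitrate=20000000):
    """Return the omxh264enc fragment of the pipeline for one profile."""
    if profile not in PROFILES:
        raise ValueError(f"unknown profile: {profile!r}")
    return (
        f"omxh264enc name={name} control-rate=variable "
        f"bitrate={bitrate} profile={profile}"
    )


if __name__ == "__main__":
    for profile in PROFILES:
        print(encoder_with_profile(profile))
```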
Different Bitrates
Another common question is whether the encoder bitrate affects the latency, so we tried 4 different values at the 1080p resolution:
None of the values makes a significant difference, so we can say that the bitrate does not affect the encoder processing time.
In relation to this, we also tested whether the bitrate control method affects the processing time:
It does not seem to have any effect on latency either.
H265
The H265 codec is increasingly common, so it is important to validate its performance too:
There are fewer peaks in the plot, but the mean processing time is higher than with H264, at ~28ms instead of ~22ms for 1080p and ~25ms instead of ~15ms for 720p:
Resolution | Mean Latency | Peak Latency |
---|---|---|
1920x1080 | ~28ms | ~30ms |
1280x720 | ~25ms | ~28ms |
Jetpack 4.5 OMX (TX2)
The goal of these tests is to verify whether encoding improved in the newer release.
Base
Here we can see the following results:
Resolution | Mean Latency | Peak Latency |
---|---|---|
1920x1080 | ~15ms | ~22ms |
1280x720 | ~12ms | ~20ms |
Compared to 3.3, the 4.5 release lowers the mean latency from ~22ms to ~15ms at 1080p and from ~15ms to ~12ms at 720p.
H265
As for the H265 codec in this newer release, the results are the following:
Resolution | Mean Latency | Peak Latency |
---|---|---|
1920x1080 | ~15ms | ~18ms |
1280x720 | ~15ms | ~19ms |
Here the improvement over the 3.3 release is larger: the mean latency drops from ~28ms to ~15ms at 1080p and from ~25ms to ~15ms at 720p.
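To put the change between releases in numbers, the relative reduction in mean latency can be computed from the approximate values in the OMX tables. The sketch below does only that arithmetic. The helper name `reduction_percent` is illustrative.

```python
# Approximate OMX mean latencies in ms: (Jetpack 3.3, Jetpack 4.5).
MEAN_LATENCY_MS = {
    ("H264", "1920x1080"): (22, 15),
    ("H264", "1280x720"): (15, 12),
    ("H265", "1920x1080"): (28, 15),
    ("H265", "1280x720"): (25, 15),
}


def reduction_percent(old_ms, new_ms):
    """Relative reduction of new_ms with respect to old_ms, in percent."""
    return 100.0 * (old_ms - new_ms) / old_ms


if __name__ == "__main__":
    for (codec, resolution), (old, new) in MEAN_LATENCY_MS.items():
        change = reduction_percent(old, new)
        print(f"{codec} {resolution}: ~{old} ms -> ~{new} ms ({change:.0f}% lower)")
```

Since the inputs are rounded approximations, the percentages are only indicative.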
Jetpack 4.5 V4L2 (TX2)
NVIDIA has reported in several posts that the OMX encoders are deprecated and recommends using the V4L2 encoders nvv4l2h264enc and nvv4l2h265enc in newer releases. The base pipelines for these are the following:
- 1920x1080@50FPS
gst-launch-1.0 videotestsrc is-live=true ! "video/x-raw,format=I420,width=1920,height=1080,framerate=50/1" ! nvvidconv name=nv0 ! "video/x-raw(memory:NVMM)" ! nvv4l2h264enc name=enc0 control-rate=variable_bitrate bitrate=20000000 profile=Main ! video/x-h264,stream-format=byte-stream ! fakesink videotestsrc is-live=true ! "video/x-raw,format=I420,width=1920,height=1080,framerate=50/1" ! nvvidconv name=nv1 ! "video/x-raw(memory:NVMM)" ! nvv4l2h264enc name=enc1 control-rate=variable_bitrate bitrate=20000000 profile=Main ! video/x-h264,stream-format=byte-stream ! fakesink
- 1280x720@50FPS
gst-launch-1.0 videotestsrc is-live=true ! "video/x-raw,format=I420,width=1280,height=720,framerate=50/1" ! nvvidconv name=nv0 ! "video/x-raw(memory:NVMM)" ! nvv4l2h264enc name=enc0 control-rate=variable_bitrate bitrate=20000000 profile=Main ! video/x-h264,stream-format=byte-stream ! fakesink videotestsrc is-live=true ! "video/x-raw,format=I420,width=1280,height=720,framerate=50/1" ! nvvidconv name=nv1 ! "video/x-raw(memory:NVMM)" ! nvv4l2h264enc name=enc1 control-rate=variable_bitrate bitrate=20000000 profile=Main ! video/x-h264,stream-format=byte-stream ! fakesink
Base
Here we can see the following results:
Resolution | Mean Latency | Peak Latency |
---|---|---|
1920x1080 | ~15ms | ~22ms |
1280x720 | ~12ms | ~20ms |
As we can see, they have pretty much the same performance as the OMX encoders.
Maximum Performance
The V4L2 encoders have a property named maxperf-enable that can be set to decrease the processing time:
We can see the following results:
Resolution | Mean Latency | Peak Latency |
---|---|---|
1920x1080 | ~8ms | ~12ms |
1280x720 | ~5ms | ~10ms |
These are the best times among all the encoders and tests, so the maxperf-enable property has a significant effect on the processing time.
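No pipeline is shown for this test, so here is a minimal sketch of how a V4L2 branch might look with the property enabled. It assumes the property is set as `maxperf-enable=true` on the encoder element and that everything else stays as in the base V4L2 pipelines. The helper `v4l2_branch` is hypothetical. The H265 variant swaps in `nvv4l2h265enc` and `video/x-h265` caps under the same assumption, and it omits `profile` because the source shows no value for that encoder.

```python
# Element and caps names per codec; the H265 entries are an assumption
# mirroring the H264 pipeline shown in the base V4L2 section.
ENCODERS = {
    "h264": ("nvv4l2h264enc", "video/x-h264"),
    "h265": ("nvv4l2h265enc", "video/x-h265"),
}


def v4l2_branch(index, width, height, codec="h264", maxperf=True,
                fps=50, bitrate=20000000):
    """Build one V4L2 encoder branch, optionally with maxperf-enable=true."""
    element, caps = ENCODERS[codec]
    props = [f"name=enc{index}", "control-rate=variable_bitrate",
             f"bitrate={bitrate}"]
    if codec == "h264":
        props.append("profile=Main")  # as in the base H264 pipeline
    if maxperf:
        props.append("maxperf-enable=true")
    return (
        "videotestsrc is-live=true ! "
        f"video/x-raw,format=I420,width={width},height={height},framerate={fps}/1 ! "
        f"nvvidconv name=nv{index} ! video/x-raw(memory:NVMM) ! "
        f"{element} {' '.join(props)} ! "
        f"{caps},stream-format=byte-stream ! fakesink"
    )


if __name__ == "__main__":
    # Two simultaneous 1080p H264 branches, as in the other tests.
    print(" ".join(v4l2_branch(i, 1920, 1080) for i in range(2)))
```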
Maximum Performance and H265
The maxperf-enable property is also available on the V4L2 H265 encoder. The results for both resolutions are:
Resolution | Mean Latency | Peak Latency |
---|---|---|
1920x1080 | ~8ms | ~12ms |
1280x720 | ~5ms | ~10ms |
With this property enabled, the performance is the same for the H265 and H264 codecs.
Quality Tests
These tests compare the results of different configurations of the OMX and V4L2 encoders for the H264 and H265 codecs. Quality is hard to measure objectively, so we include some visual results for you to evaluate.
The tests were performed by modifying the same configurations mentioned in the latency section and using the 1080p resolution.
Different Profiles
We modified the profile on both encoders to see whether the quality was significantly affected, but the high profile, which is the next one after main, did not improve the quality much. Additionally, as noted in the latency section, the high profile significantly increases the encoder peak processing time, so it is not a good way to improve quality.
OMX
V4L2
Different Bitrate Control
The bitrate control mode was another relevant property to test, so we set both encoders to variable (VBR) and constant (CBR) bitrate. However, it did not seem to affect the H264 quality much.
H264
OMX
V4L2
H265
On the other hand, for H265 the V4L2 encoder quality seems to be a little more affected by this property, while the OMX encoder does not seem to be affected at all.
OMX
V4L2
Different Bitrates
Another worthwhile test is varying the bitrate, since this parameter is directly associated with the video quality. Since we verified in the latency section that it did not seem to affect the processing time of the encoder, raising it is a good way to improve quality without decreasing performance.
The results show that at lower bitrates the H264 encoder behaves better than H265.
OMX
V4L2
Maximum Performance
The maxperf-enable property gives the V4L2 encoders a considerable boost in terms of latency, so it is important to check whether there is a tradeoff in quality. The results suggest that enabling this property makes no difference in quality for either codec.
OMX vs V4L2
Finally, we compare the OMX and V4L2 encoders. Both give similar results on H264. On H265, however, V4L2 seems to have slightly better quality.
OMX vs V4L2
- gst-omx is a plugin that wraps the available OpenMAX IL components and exposes them as standard GStreamer elements. The OpenMAX IL API allows library and codec implementers to rapidly and effectively use the full acceleration potential of new silicon, regardless of the underlying hardware architecture.
- gst-v4l2 is a plugin that uses the V4L2 kernel framework. Through the framework it creates new V4L2 elements, subscribes to V4L2 events, dequeues events from an element, and sets and gets control values.
- As per the release notes of Jetpack 4.2, the gst-omx plugins are deprecated and are expected to be removed in future releases. Starting with that same release, NVIDIA provides the gst-v4l2 plugin to use instead, which performed better in the latency tests above when maxperf-enable was set.
Conclusions
- The minimum mean time achieved for OMX encoders on Jetpack 3.3.3 at a 1080p resolution is ~22ms. This exceeds the 20ms frame period of a 50 FPS stream, which means that 50 FPS is not viable.
- H265 encoding in Jetpack 3.3 is slower than H264, so it can't be used as an alternative.
- Bitrate does not affect the processing time on OMX encoders, so raising it is a good option to improve quality.
- OMX encoders perform better on Jetpack 4.5.1, but if the upgrade is possible it is recommended to use the V4L2 encoders with the maxperf-enable=true property, which offer the lowest latency among all the results:
Resolution | OMX Jetpack 3.3 Mean Latency | OMX Jetpack 4.5 Mean Latency | V4L2 Jetpack 4.5 Mean Latency (maxperf-enable) |
---|---|---|---|
1920x1080 (H264) | ~22ms | ~15ms | ~8ms |
1280x720 (H264) | ~15ms | ~12ms | ~5ms |
1920x1080 (H265) | ~28ms | ~15ms | ~8ms |
1280x720 (H265) | ~25ms | ~15ms | ~5ms |
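The 50 FPS conclusion rests on a frame budget: at 50 FPS a new frame arrives every 1000 / 50 = 20 ms. The sketch below runs that check over the approximate 1080p mean values reported on this page. It makes the simplifying assumption that an encoder finishes one frame before starting the next, and the function names are illustrative.

```python
def frame_period_ms(fps):
    """Time between consecutive frames, in milliseconds."""
    return 1000.0 / fps


def fits_frame_budget(latency_ms, fps=50):
    """True if one frame is encoded before the next one arrives."""
    return latency_ms <= frame_period_ms(fps)


# Approximate 1080p mean latencies in ms, taken from the results above.
MEAN_1080P_MS = {
    "OMX Jetpack 3.3 (H264)": 22,
    "OMX Jetpack 3.3 (H265)": 28,
    "OMX Jetpack 4.5 (H264)": 15,
    "V4L2 Jetpack 4.5 (H264)": 8,
}

if __name__ == "__main__":
    budget = frame_period_ms(50)
    for label, latency in MEAN_1080P_MS.items():
        verdict = "within" if fits_frame_budget(latency) else "over"
        print(f"{label}: ~{latency} ms, {verdict} the {budget:.0f} ms budget")
```

Peaks matter as well: the ~22ms peak of the Jetpack 4.5 base runs is above the same 20 ms budget even though the mean is below it.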