NVIDIA Jetson TX1 TX2 Video Latency

From RidgeRun Developer Wiki




Problems running the pipelines shown on this page? Please see our GStreamer Debugging guide for help.


NVIDIA® Jetson™ TX1 TX2 Video Latency Introduction

This wiki is intended as a reference for measuring latency with GStreamer pipelines on the Jetson TX1 and TX2 platforms. The tests were done using a modified nvcamerasrc binary provided by NVIDIA, which reduces the minimum allowed value of the queue-size property from 10 to 2 buffers. This binary was built for JetPack 3.0, L4T 24.2.1.

Jetson TX1 IMX274 1080p60 HDMI latency

Below are the details of the test that measures the Jetson TX1 capture-to-display latency with the Sony IMX274 camera sensor in 1080p 60 fps mode.

It is strictly necessary to first run the following command on the TX1 before running the test pipeline:

sudo ~/jetson_clocks.sh

Test pipeline

gst-launch-1.0 nvcamerasrc queue-size=6 sensor-id=0 fpsRange='60 60' ! \
'video/x-raw(memory:NVMM), width=1920, height=1080,format=I420,framerate=60/1' ! \
perf print-arm-load=true ! nvoverlaysink sync=true enable-last-sample=false

Latency = 43 ms
The reported value is the average of 11 samples.

Jetson TX1 IMX274 4Kp60 HDMI latency

Below are the details of the test that measures the Jetson TX1 capture-to-display latency with the Sony IMX274 camera sensor in 4K 60 fps mode.

  • It is strictly necessary to first run the following command on the TX1 before running the test pipeline:
sudo ~/jetson_clocks.sh

Test pipeline

gst-launch-1.0 nvcamerasrc queue-size=6 sensor-id=0 fpsRange='60 60' ! \
'video/x-raw(memory:NVMM), width=3840, height=2160,format=I420,framerate=60/1' ! \
perf print-arm-load=true ! nvoverlaysink sync=true enable-last-sample=false

Latency = 43 ms
The reported value is the average of 11 samples.

Jetson TX2 IMX274 1080p60 RTP latency

This section describes the tests performed to measure the latency of UDP streaming from a GStreamer pipeline on the TX2 to a PC that receives, decodes, and displays the stream.

Setup

The TX2 captured a test pattern and sent it over UDP to the same PC that was showing the test pattern; that PC decoded and displayed the video from the stream. An iPhone 7 in slow-motion mode recorded both computer screens at the same time. We then stepped through the recorded video frame by frame, counting the number of frames between the moment a particular pattern appeared on one screen and the moment it appeared on the other.
Fig. 1 below shows a diagram of the setup.

Figure 1. Diagram of the test setup


Fig. 2 below shows an example frame from the recorded videos; the received stream lags behind the test pattern.


Figure 2. Example of a single frame at a particular time

Test Pipeline

  • TX2
export HOSTIP=<IP of the Host>
DISPLAY=:0 gst-launch-1.0 nvcamerasrc queue-size=6 sensor-id=0 fpsRange='60 60' ! \
'video/x-raw(memory:NVMM), width=1920, height=1080,format=I420,framerate=60/1' ! \
omxh264enc control-rate=2 bitrate=4000000 ! 'video/x-h264, stream-format=(string)byte-stream' ! \
h264parse ! rtph264pay mtu=1400 ! udpsink host=$HOSTIP port=5000 sync=true async=false
  • Host PC
gst-launch-1.0 udpsrc port=5000 ! application/x-rtp,encoding-name=H264,payload=96 ! rtph264depay ! h264parse ! queue ! avdec_h264 ! xvimagesink sync=false async=false -e

Results

The table below shows the results gathered from three tests, where each test is a separate video recording session. In each test, we picked 10 random spots in the video and counted the number of recorded frames it took for the decoded stream to display the same pattern as the strobe video.

Iteration      Test 1 (frames)   Test 2 (frames)   Test 3 (frames)
1              40                45                40
2              42                40                42
3              40                42                42
4              46                42                43
5              43                40                46
6              44                40                40
7              45                40                45
8              46                42                46
9              40                43                40
10             40                40                42
Average        42.6              41.4              42.6
Total Average  42.2

Since the slow-motion camera was recording at 240 fps (4.167 ms per frame), the latency is 4.167 x 42 = 175 ms +/- 8 ms.
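The arithmetic behind the table and the final figure can be reproduced with a short Python sketch. The frame counts are copied from the table above; the function and variable names are only illustrative and are not part of any RidgeRun tool.

```python
from statistics import mean

# Frame counts copied from the results table (10 iterations per test).
FRAME_COUNTS = {
    "Test 1": [40, 42, 40, 46, 43, 44, 45, 46, 40, 40],
    "Test 2": [45, 40, 42, 42, 40, 40, 40, 42, 43, 40],
    "Test 3": [40, 42, 42, 43, 46, 40, 45, 46, 40, 42],
}

SLOW_MOTION_FPS = 240  # recording rate of the slow-motion camera


def summarize(frame_counts, fps):
    """Return per-test averages, the overall average, and the latency in ms."""
    per_test = {name: mean(counts) for name, counts in frame_counts.items()}
    total_average = mean(list(per_test.values()))
    frame_period_ms = 1000.0 / fps  # duration of one recorded frame
    latency_ms = total_average * frame_period_ms
    return per_test, total_average, latency_ms


if __name__ == "__main__":
    per_test, total_average, latency_ms = summarize(FRAME_COUNTS, SLOW_MOTION_FPS)
    for name, average in per_test.items():
        print(f"{name}: {average:.1f} frames")
    print(f"Total average: {total_average:.1f} frames")
    print(f"Latency: {latency_ms:.1f} ms")
```

Using the unrounded total average of 42.2 frames gives about 175.8 ms; the 175 ms quoted above comes from rounding the average down to 42 frames first.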

This measurement includes

  • Latency of the camera on the TX2
  • Latency of the H.264 encoder on the TX2
  • Any GStreamer streaming latency on the TX2
  • H.264 decoding latency on the x86 PC
  • Display latency on the x86 PC
  • HDMI latency on the x86 PC

Reliable latency measurement methods

This wiki section is intended to explain two reliable methods to measure the latency of a system.

The “industry standard” for measuring video latency is nothing short of terrible. It goes something like this: take a portable device, run some kind of stopwatch application on it, point the camera under test at its screen, take a picture that includes both the camera display and the stopwatch, then subtract one reading from the other. This method is very inaccurate because it depends on multiple factors.

Several elements here limit the timing resolution: a freeware stopwatch not written for this purpose (one that merely simulates a live, running stopwatch rather than implementing one with live output), the digital camera's frame exposure time if the two stopwatches are not horizontally aligned (rolling shutter), the refresh rate of the portable device's display, the frame rate of the FPV video system, and so on. Together, these errors can easily exceed the actual latency of an analog FPV system, which is usually around 20 milliseconds, a value that cannot be measured with the stopwatch method.

The most critical factor is the refresh rate of the stopwatch. In most cases, a millisecond-resolution online stopwatch is used or, in the best case, a good-quality physical one. Most online stopwatches have a very poor refresh rate, which makes them useless for a reliable and precise latency measurement. At RidgeRun, we used to use this online stopwatch (http://stopwatch.onlineclock.net/). It refreshes every 43 ms, which is long enough to introduce errors, because no latency below 43 ms can be resolved. Physical stopwatches are the alternative, but one with 1 ms resolution and a good refresh rate is difficult to find and expensive.
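The effect of a slow stopwatch refresh can be illustrated with a small Python simulation. This is a simplified model that assumes the stopwatch display only changes once per refresh period and ignores every other error source; the 20 ms "true" latency is a hypothetical value chosen for illustration.

```python
def displayed_value_ms(true_time_ms, refresh_ms):
    """Value shown by a stopwatch that only redraws every refresh_ms."""
    return (true_time_ms // refresh_ms) * refresh_ms


def stopwatch_method_reading(photo_time_ms, true_latency_ms, refresh_ms):
    """Latency read from a photo taken at photo_time_ms.

    The stopwatch screen shows its current displayed value, while the
    camera under test shows what was displayed true_latency_ms earlier.
    """
    direct = displayed_value_ms(photo_time_ms, refresh_ms)
    delayed = displayed_value_ms(photo_time_ms - true_latency_ms, refresh_ms)
    return direct - delayed


if __name__ == "__main__":
    REFRESH_MS = 43        # refresh period quoted in the text
    TRUE_LATENCY_MS = 20   # hypothetical latency, for illustration only
    readings = {
        stopwatch_method_reading(t, TRUE_LATENCY_MS, REFRESH_MS)
        for t in range(100, 1000)
    }
    print(sorted(readings))
```

Under this model, every photo yields either 0 ms or 43 ms, never the real 20 ms, which is why the refresh period sets a floor on what the stopwatch method can resolve.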

On this wiki, you will find two reliable methods to measure latency on a system.

What is latency?

Latency is the time it takes for something to propagate in a system. The latency being measured here is glass-to-glass latency, defined as the time it takes for an image to go from the glass of a camera to the glass of a display.

Slow Motion Forwarding Frames Method

This method is the simplest and quickest way to measure system latency with reliable values. Its great advantage is that it is completely independent of the refresh rate of the timer: it only takes into account how many frames it takes for a change to be reflected on the screen of the system under measurement.

To apply it, you need a slow-motion camera that captures at 240 fps or more (most recent smartphones, such as the iPhone 7 or Samsung Galaxy S8, have a 240 fps slow-motion mode). The time between consecutive frames of a 240 fps camera is the reciprocal of the frame rate: 1/240 s = 4.167 ms.

The method consists of running a "millisecond resolution online timer" and/or a "Color Strobe Light video" on your PC, then pointing the camera of the system under measurement at the PC screen where the timer and/or video is displayed. Display the live video captured by the system under measurement on a different monitor, next to the PC monitor. Then, record a slow-motion video of the two monitors with the smartphone camera.

Finally, analyze the recorded video in VLC, taking advantage of its frame-by-frame stepping feature. Open the video with VLC and pause it once the video is stable (around 6 s from the beginning). Look at the PC monitor in the video and note the timer value or the color frame of the strobe video. Then step forward frame by frame until the same timer value or color frame appears on the monitor of the system under measurement. To obtain the latency, multiply the number of frames stepped by the frame period of the slow-motion camera. For example, if it takes 12 frames for a change on the PC monitor to be reflected on the system monitor, and the video was recorded with a 240 fps slow-motion camera (4.167 ms per frame), the latency is 4.167 x 12 = 50 ms +/- 4 ms. Keep in mind that the measured values have an uncertainty equal to the frame period of the recorded video, so the higher the frame rate of the slow-motion camera, the lower the uncertainty of the measurements.
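The conversion from a frame count to a latency value, together with its uncertainty, can be written as a small Python helper. This is only a sketch of the arithmetic described above; the function name is illustrative, and the 120 fps and 960 fps frame counts are hypothetical values that correspond to the same 50 ms latency.

```python
def latency_from_frames(frame_count, recording_fps):
    """Return (latency_ms, uncertainty_ms) for a slow-motion recording.

    The uncertainty is taken as one frame period of the recording,
    as stated in the text.
    """
    frame_period_ms = 1000.0 / recording_fps
    latency_ms = frame_count * frame_period_ms
    return latency_ms, frame_period_ms


if __name__ == "__main__":
    # Example from the text: 12 frames at 240 fps.
    # Then hypothetical frame counts for the same latency at other rates.
    for fps, frames in ((240, 12), (120, 6), (960, 48)):
        latency_ms, uncertainty_ms = latency_from_frames(frames, fps)
        print(f"{fps} fps, {frames} frames: "
              f"{latency_ms:.1f} ms +/- {uncertainty_ms:.2f} ms")
```

The frame count scales with the recording rate while the latency stays the same; only the uncertainty shrinks, from about 8.3 ms at 120 fps to about 1 ms at 960 fps.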

The picture below illustrates the test setup:

Slow motion forwarding frames latency measurement method


Below you will find useful links to an online ms timer and color strobe videos:

Sub-frame Resolution Method

This method is more complex to apply and requires more time, money, and resources.

If an event is random, does it happen across the whole screen at once, or is it restricted to a point in space? If it happens across the camera's whole field of view at once, do we take latency to be the time it takes for the event to propagate to the entire receiving screen, or just to a portion of it? The difference between the two might seem small, but it is actually huge.

Consider a 30 fps system, where every frame takes 33 milliseconds. Now consider the camera of that system. To put it simply, every line of the vertical resolution (say, 720 lines) is read in sequence, one at a time. The first line is read and sent to the video processor. 16 milliseconds later, the middle line (line 360) is read and sent to the processor. 33 milliseconds from the start, the same happens to the last line (line 720). What happens when an event occurs on the camera glass at line 360 just after the camera finished reading it? It will take the camera the time of a full frame (33 milliseconds) to even notice that something changed. That information then has to be sent to the processor and down the line to the display, but even supposing all of that to be latency-free (it is not), it takes the time of a full frame, 33 milliseconds, to propagate a random event from a portion of the screen in the worst case.
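The worst-case reasoning above can be checked with a short Python sketch. It assumes a simplified sensor model in which the lines are read at a uniform rate over the whole frame period (no blanking interval), so line 360 of 720 is read about 16.7 ms after the start of the frame. The function names are illustrative only.

```python
def line_read_offset_ms(line, total_lines, fps):
    """Time after frame start at which a line (1-based) is read.

    Simplified model: lines are read uniformly across the frame period.
    """
    frame_period_ms = 1000.0 / fps
    return (line / total_lines) * frame_period_ms


def wait_until_read_ms(event_time_ms, line, total_lines, fps):
    """Time between an event on a line and the next readout of that line."""
    frame_period_ms = 1000.0 / fps
    read_offset = line_read_offset_ms(line, total_lines, fps)
    phase = event_time_ms % frame_period_ms  # position inside the current frame
    return (read_offset - phase) % frame_period_ms


if __name__ == "__main__":
    FPS, LINES, LINE = 30, 720, 360
    read_at = line_read_offset_ms(LINE, LINES, FPS)
    print(f"Line {LINE} is read {read_at:.2f} ms after frame start")
    # Event happening 0.1 ms after the line was read: almost a full frame wait.
    late = wait_until_read_ms(read_at + 0.1, LINE, LINES, FPS)
    # Event happening 0.1 ms before the line is read: almost no wait.
    early = wait_until_read_ms(read_at - 0.1, LINE, LINES, FPS)
    print(f"Event just after readout waits {late:.2f} ms")
    print(f"Event just before readout waits {early:.2f} ms")
```

Depending on when the event falls relative to the readout of its line, the capture delay alone ranges from almost zero to almost a full frame period, before any processing or display latency is added.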

That is what happens in analog systems, where the 60 interlaced fields per second are converted to 30 progressive frames and are affected by this latency. There is no such thing as zero latency. It is just a marketing gimmick, sustained by the difficulty of actually measuring those numbers.

Measuring latency with sub-frame resolution

Or, measuring the propagation time of a random event from and to a small portion of the camera/display, because the other way is wrong (it only partially takes framerate into account).

A simple way of doing it is to use an Arduino, light up an LED, and measure the time it takes for some kind of sensor to detect a difference on the display. The sensor needs to be fast, and the most obvious choice for the job (a photoresistor) is too slow, with some manufacturers quoting as much as 100 milliseconds to detect a change in light. For this, we need something more sophisticated: a photodiode, possibly with an integrated amplifier. A Texas Instruments OPT101P was selected for the job. The diode is very fast at detecting light changes; try putting it below an LED table lamp and you will be able to see the LED switching on and off, something usually measured in microseconds. However, measuring the time between two slightly different light levels on a screen takes some tweaking, and you might be forced to increase the feedback resistance of the integrated op-amp by something like 10 MΩ.
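The measurement logic can be sketched in Python as post-processing of a logged sensor trace. This is not the firmware used for the measurement: it assumes, hypothetically, that the microcontroller logs (timestamp in microseconds, ADC value) pairs from the photodiode and records the time at which it switched the LED on. The trace in the example is synthetic and the threshold is an arbitrary illustrative value.

```python
def detect_latency_ms(samples, led_on_us, threshold):
    """Return the time from LED switch-on to the first sample at or above
    threshold, in milliseconds, or None if the display never reacts.

    samples: iterable of (timestamp_us, adc_value) ordered by time
             (hypothetical log format).
    """
    for timestamp_us, value in samples:
        if timestamp_us >= led_on_us and value >= threshold:
            return (timestamp_us - led_on_us) / 1000.0
    return None


if __name__ == "__main__":
    # Synthetic trace sampled every 500 us: the display brightens
    # 48 ms after the LED is switched on (made-up numbers).
    LED_ON_US = 1_000_000
    trace = [
        (t, 100 if t < LED_ON_US + 48_000 else 600)
        for t in range(0, 2_000_000, 500)
    ]
    latency_ms = detect_latency_ms(trace, LED_ON_US, threshold=350)
    print(f"Measured latency: {latency_ms:.1f} ms")
```

The resolution of such a measurement is limited by the sampling interval of the trace (0.5 ms in this synthetic example) and by the response time of the sensor, not by the frame rate of the video system.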

However, the end result is worth it: You will have a system capable of measuring FPV latency with milliseconds of precision.
