RidgeRun Metadata/Streaming Protocols/SRT

From RidgeRun Developer Wiki


Introduction (SRT)

Secure Reliable Transport (SRT) is a modern low-latency streaming protocol designed for real-time contribution and remote production. It is built on top of UDP but adds mechanisms that UDP alone does not provide, including:

  • ARQ (Automatic Repeat reQuest): retransmissions on demand (NACK-based).
  • Jitter buffering and resequencing: smooths variable delay and restores order of packets.
  • Congestion control: dynamically adapts to available bandwidth.
  • Encryption: optional AES-128/192/256 for content protection.

SRT is widely used in live production where glass-to-glass delay matters. Its design keeps the efficiency of UDP but solves its key weaknesses (loss, jitter, reordering), giving users a tunable latency target in the tens to hundreds of milliseconds range.

Key Characteristics

  • UDP-based + ARQ: Uses selective retransmissions (NACK-driven) within a latency window. Unlike TCP, there is no head-of-line blocking.
  • Configurable latency: Sender and receiver maintain latency buffers (e.g., 60–200 ms) to absorb jitter and allow late packets to be recovered.
  • Jitter handling: SRT resequences out-of-order packets and fills gaps using retransmitted packets if they arrive before the deadline.
  • Congestion control: Bandwidth estimation algorithms adapt to changing network conditions (similar to BBR/UDT).
  • Encryption: AES-128/192/256, configured with a passphrase and a key length (`pbkeylen`).
  • Connection modes: caller, listener, and rendezvous modes allow flexible NAT/firewall traversal.
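
In the pipelines on this page, the connection mode and latency are passed as query parameters on the SRT URI. The following is a minimal sketch of a helper that assembles such a URI. The function name `srt_uri` is hypothetical, and the sketch assumes that the GStreamer SRT elements accept `passphrase` and `pbkeylen` as URI query parameters in the same way they accept `mode` and `latency`; check the element documentation for your GStreamer version before relying on it.

```shell
#!/bin/sh
# Build an SRT URI for srtsrc/srtsink (hypothetical helper, not a RidgeRun tool).
# Usage: srt_uri MODE HOST PORT LATENCY_MS [PASSPHRASE [PBKEYLEN]]
# HOST may be empty for a listener, which gives "srt://:PORT" as used on this page.
# Note: the passphrase is not URL-encoded, so avoid characters such as '&' or '?'.
srt_uri() {
  mode=$1 host=$2 port=$3 latency=$4 passphrase=${5:-} pbkeylen=${6:-16}

  case "$mode" in
    caller|listener|rendezvous) ;;
    *) echo "srt_uri: unknown mode '$mode'" >&2; return 1 ;;
  esac

  uri="srt://${host}:${port}?mode=${mode}&latency=${latency}"

  if [ -n "$passphrase" ]; then
    # libsrt rejects passphrases shorter than 10 characters.
    if [ "${#passphrase}" -lt 10 ]; then
      echo "srt_uri: passphrase must be at least 10 characters" >&2
      return 1
    fi
    # pbkeylen is the key length in bytes: 16, 24 or 32 (AES-128/192/256).
    case "$pbkeylen" in
      16|24|32) ;;
      *) echo "srt_uri: pbkeylen must be 16, 24 or 32" >&2; return 1 ;;
    esac
    uri="${uri}&passphrase=${passphrase}&pbkeylen=${pbkeylen}"
  fi

  printf '%s\n' "$uri"
}

# Unencrypted listener and an AES-256 caller (example passphrase only).
srt_uri listener "" 9000 120
srt_uri caller 127.0.0.1 9000 120 "correct-horse-battery" 32
```

The result can be substituted into a pipeline, for example `srtsink uri="$(srt_uri caller 127.0.0.1 9000 120)"`. When encryption is enabled, both ends need the same passphrase and key length.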

Benefits and Limitations

Benefits

  • Low end-to-end latency: Configurable in the 50–200 ms range for stable links. Lower values can be achieved on clean networks.
  • Resilience to loss and jitter: ARQ and jitter buffers allow smooth playback even with 5–10% packet loss bursts.
  • Integrated security: AES encryption without needing an external VPN or tunnel.
  • Firewall/NAT friendly: Rendezvous mode lets two peers that are both behind NATs or firewalls connect using a single UDP flow, provided the NATs keep a stable port mapping (symmetric NATs can still prevent this).
  • Broad ecosystem support: Open-source reference implementation (Haivision, SRT Alliance) with support in GStreamer, FFmpeg, OBS, and VLC.

Limitations

  • Latency vs recovery trade-off: Higher loss requires larger latency buffers, which increases end-to-end delay.
  • Not suitable for file transfer: SRT is optimized for real-time media, not bulk file delivery.
  • Endpoint requirement: Both sender and receiver must support SRT; standard web browsers cannot play it natively.
  • Tuning required: A latency value that is too small causes drops, and one that is too large adds unnecessary delay.
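
The latency versus recovery trade-off can be made concrete with a little arithmetic. A lost packet is only recovered if the NACK and the retransmitted packet both fit inside the latency window, which costs roughly one round trip per attempt. The sketch below estimates how many attempts fit in a given window and suggests a latency from a measured RTT. The function names are hypothetical, the model ignores processing and queuing delay, and the default multiplier of 4 and the 50 ms floor are assumptions based on a commonly quoted rule of thumb, not values taken from this page.

```shell
#!/bin/sh
# Rough number of retransmission attempts that fit in the latency window,
# assuming each attempt costs about one round trip (NACK out, packet back).
# Usage: srt_retx_attempts LATENCY_MS RTT_MS
srt_retx_attempts() {
  latency_ms=$1 rtt_ms=$2
  if [ "$rtt_ms" -le 0 ]; then
    echo "srt_retx_attempts: RTT must be positive" >&2
    return 1
  fi
  echo $(( latency_ms / rtt_ms ))
}

# Suggest a latency (ms) as MULTIPLIER x RTT, never below MIN_MS.
# MULTIPLIER=4 and MIN_MS=50 are assumed defaults; raise the multiplier on lossy links.
# Usage: srt_suggest_latency RTT_MS [MULTIPLIER [MIN_MS]]
srt_suggest_latency() {
  rtt_ms=$1 multiplier=${2:-4} min_ms=${3:-50}
  latency=$(( rtt_ms * multiplier ))
  if [ "$latency" -lt "$min_ms" ]; then
    latency=$min_ms
  fi
  echo "$latency"
}

srt_retx_attempts 120 30   # 4 attempts fit in a 120 ms window at 30 ms RTT
srt_retx_attempts 50 100   # 0: the window closes before a retransmission can arrive
srt_suggest_latency 30     # 120
srt_suggest_latency 5      # 50 (clamped to the floor)
```

A result of 0 from `srt_retx_attempts` means the configured latency is too small for ARQ to help on that link, which is the "drops if too small" case described above.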

RidgeRun products with SRT (comparison)

The following table summarizes RidgeRun’s metadata solutions and how they integrate with SRT. It highlights the benefits and limitations of each combination.

| Product | Combination with SRT | Benefits | Limitations |
| --- | --- | --- | --- |
| GstSEIMetadata (H.264/H.265) | SEI NAL units embedded in the video bitstream; carried over SRT (typically inside MPEG-TS). | • Frame-accurate metadata aligned with frames/IDR. • Works with seiinject/seiextract. • Standards-friendly with H.264/H.265 + TS. | • Receiver must explicitly parse SEI. • Re-encoding or some middleboxes may strip SEI unless preserved. |
| AV1 + OBU Metadata | AV1 metadata OBUs preserved over SRT when encapsulated (e.g., TS/Matroska) and not transcoded. | • In-band metadata for AV1. • Future-ready for next-gen workflows. | • Limited player support for AV1+OBU today. • AV1 encode/decode is CPU-intensive. |
| MPEG-TS (native metadata) | Private PES, descriptors, or KLV inside TS carried natively over SRT. | • Interoperable in broadcast environments. • Multiplexes A/V/metadata in one stream. | • Additional TS overhead. • Requires demultiplexing to extract the metadata. |

Example pipelines using SRT

1) SEI over SRT (H.264)

Sender (inject)
gst-launch-1.0 -e \
  videotestsrc is-live=true ! x264enc tune=zerolatency key-int-max=60 ! \
  seiinject metadata="HELLO WORLD" ! mpegtsmux ! \
  srtsink uri="srt://127.0.0.1:9000?mode=caller&latency=120" -v
Receiver (extract)
GST_DEBUG=*sei*:MEMDUMP gst-launch-1.0 -e \
  srtsrc uri="srt://:9000?mode=listener&latency=120" ! \
  tsdemux ! seiextract ! fakesink silent=false -v
Output (expected)
0:00:08.957161209 48157 0x580c62870cc0 MEMDUMP           seiextract gstseiextract.c:299:gst_sei_extract_extract_h264_data:<seiextract0> ---------------------------------------------------------------------------
0:00:08.957167675 48157 0x580c62870cc0 MEMDUMP           seiextract gstseiextract.c:299:gst_sei_extract_extract_h264_data:<seiextract0> The extracted data is: 
0:00:08.957174464 48157 0x580c62870cc0 MEMDUMP           seiextract gstseiextract.c:299:gst_sei_extract_extract_h264_data:<seiextract0> 00000000: 48 45 4c 4c 4f 20 57 4f 52 4c 44 00              HELLO WORLD.    
0:00:08.957179354 48157 0x580c62870cc0 MEMDUMP           seiextract gstseiextract.c:299:gst_sei_extract_extract_h264_data:<seiextract0> ---------------------------------------------------------------------------
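
To check a MEMDUMP line like the one above without reading hex by hand, the bytes can be converted back to text. This is a small sketch in POSIX shell; `hex_to_ascii` is a hypothetical helper, not part of GstSEIMetadata, and it skips NUL bytes such as the trailing `00` terminator.

```shell
#!/bin/sh
# Convert space-separated hex bytes (as printed by MEMDUMP) to text.
# NUL bytes (00) are skipped because shell strings cannot hold them.
hex_to_ascii() {
  for byte in "$@"; do
    [ "$byte" = "00" ] && continue
    # printf handles octal escapes portably, so convert hex to octal first.
    printf "\\$(printf '%03o' "0x$byte")"
  done
  printf '\n'
}

# Bytes copied from the expected output above.
hex_to_ascii 48 45 4c 4c 4f 20 57 4f 52 4c 44 00   # prints: HELLO WORLD
```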

2) AV1 + OBU metadata over SRT

Sender (inject OBU)
gst-launch-1.0 -e \
  videotestsrc is-live=true num-buffers=15 ! \
  "video/x-raw,width=320,height=240" ! \
  av1enc cpu-used=8 ! \
  obuinject metadata="Hello OBU World" ! \
  av1parse ! mpegtsmux alignment=7 name=mux ! \
  srtsink uri="srt://127.0.0.1:9000?mode=caller&latency=120&transtype=live" -v
Receiver (extract OBU)
GST_DEBUG=*obu*:MEMDUMP gst-launch-1.0 -e -v \
  srtsrc uri="srt://:9000?mode=listener&latency=120" ! \
  tsdemux ! av1parse ! obuextract ! fakesink silent=false
Output (expected)
0:00:03.412583214  98765 0x55c3b3a2e1e0 MEMDUMP           obuextract gstobuextract.c:286:gst_obu_extract_prepare_output_buffer:<obuextract0> -----------------------------
0:00:03.412592133  98765 0x55c3b3a2e1e0 MEMDUMP           obuextract gstobuextract.c:286:gst_obu_extract_prepare_output_buffer:<obuextract0> The extracted data is
0:00:03.412600942  98765 0x55c3b3a2e1e0 MEMDUMP           obuextract gstobuextract.c:286:gst_obu_extract_prepare_output_buffer:<obuextract0> 00000000: 48 65 6c 6c 6f 20 4f 42 55 20 57 6f 72 6c 64     Hello OBU World
0:00:03.412604499  98765 0x55c3b3a2e1e0 MEMDUMP           obuextract gstobuextract.c:286:gst_obu_extract_prepare_output_buffer:<obuextract0> -----------------------------

Examples: Sending Metadata over SRT

The following examples show how to send and receive metadata using RidgeRun’s metasrc / metasink elements with mpegtsmux over SRT, starting from a video-only baseline. Two terminals are required: one for the sender (Pipe 1) and one for the receiver (Pipe 2).

1) Video stream without metadata

Pipe 1 (Sender)
gst-launch-1.0 -v videotestsrc is-live=true ! \
  x264enc key-int-max=30 tune=zerolatency ! \
  mpegtsmux ! \
  srtsink uri="srt://:8888?mode=listener"
Pipe 2 (Receiver)
GST_DEBUG=WARNING gst-launch-1.0 -v \
  srtsrc uri="srt://127.0.0.1:8888?mode=caller" ! \
  tsdemux ! \
  h264parse ! decodebin ! queue ! videoconvert ! xvimagesink

This setup transmits only video. You should see a test pattern displayed in a video window on the receiver side.

---

2) Metadata example

Pipe 1 (Sender with metadata)
GST_DEBUG=WARNING gst-launch-1.0 \
  metasrc metadata="HELLO SRT WORLD" period=1 ! \
  mpegtsmux ! \
  srtsink uri="srt://:8888?mode=listener" -v
Pipe 2 (Receiver with metasink)
GST_DEBUG=WARNING gst-launch-1.0 -v \
  srtsrc uri="srt://127.0.0.1:8888?mode=caller" ! \
  tsdemux ! \
  metasink

Expected Output (excerpt)

On the receiver side (Pipe 2) you should see log lines like:

0:00:02.134567891  12345 0x55a6f3f10 DEBUG metasink Received metadata buffer
0:00:02.134572134  12345 0x55a6f3f10 INFO  metasink payload = "HELLO SRT WORLD"

UDP vs SRT (Comparison)

| Aspect | UDP (raw) | SRT (built on UDP) |
| --- | --- | --- |
| Loss handling | None → packets lost are unrecoverable (visual artifacts, glitches). | ARQ (NACK) + jitter buffer → packet recovery within latency window. |
| Latency under loss | Spikes, freezes, macroblocking when loss occurs. | Stable latency (e.g., 120 ms) as long as retransmissions arrive before the buffer deadline. |
| Reordering/Jitter | Not handled; late or reordered packets are dropped. | Handled by resequencing and jitter buffer. |
| Security | None. | AES-128/192/256 with passphrase-based key exchange. |
| Firewall/NAT traversal | Manual port opening, often blocked by NAT/firewalls. | Caller/Listener/Rendezvous modes allow connections through NAT/firewall. |

Why use SRT instead of UDP?

SRT is particularly effective when you need:

  • Predictable low latency in live streaming or remote production.
  • Robustness to network impairments such as jitter and packet loss.
  • Secure contribution without adding extra overhead from VPNs.
  • Cross-NAT connectivity without complex network setups.

While raw UDP can be sufficient on pristine networks, it fails under real-world conditions (loss, jitter). SRT provides continuity, stability, and professional-grade resilience at the cost of a small, controlled latency buffer.


SRT vs UDP – Low-Latency and Loss Resilience Demo

This demo shows how SRT maintains smooth playback with low latency under packet loss and jitter, while UDP degrades noticeably. You will run four pipelines (SRT sender/receiver and UDP sender/receiver) and then inject impairments using tc netem.

1) Launch the SRT pair

Receiver (listener, ~50 ms target latency)
gst-launch-1.0 -e \
  srtsrc uri="srt://:9000?mode=listener&latency=50" \
  ! queue max-size-time=2000000000 \
  ! tsdemux name=demux \
  ! h264parse \
  ! avdec_h264 \
  ! fpsdisplaysink video-sink=autovideosink sync=true text-overlay=false
Sender (caller)
gst-launch-1.0 -e \
  videotestsrc is-live=true pattern=18 \
  ! video/x-raw,framerate=30/1 \
  ! x264enc tune=zerolatency speed-preset=ultrafast key-int-max=30 bitrate=4000 \
  ! h264parse config-interval=-1 \
  ! mpegtsmux alignment=7 name=mux \
  ! srtsink uri="srt://127.0.0.1:9000?mode=caller&latency=50&transtype=live"

2) Launch the UDP pair (baseline)

Receiver
gst-launch-1.0 -e \
  udpsrc port=5000 caps="application/x-rtp,media=video,encoding-name=H264,clock-rate=90000,payload=96" \
  ! rtph264depay \
  ! h264parse \
  ! avdec_h264 \
  ! fpsdisplaysink video-sink=autovideosink sync=true text-overlay=false
Sender
gst-launch-1.0 -e \
  videotestsrc is-live=true pattern=18 \
  ! video/x-raw,framerate=30/1 \
  ! x264enc tune=zerolatency speed-preset=ultrafast key-int-max=30 bitrate=4000 \
  ! h264parse config-interval=1 \
  ! rtph264pay pt=96 \
  ! udpsink host=127.0.0.1 port=5000

3) Impair the network (simulate loss and jitter)

Run this while the four pipelines are playing (using the loopback device in this local demo):

# Add 5% loss + 50 ms delay ±10 ms jitter (normal distribution)
sudo tc qdisc add dev lo root netem loss 5% delay 50ms 10ms distribution normal

To restore normal conditions:

sudo tc qdisc del dev lo root
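
Because a leftover netem qdisc keeps impairing all loopback traffic, it can help to wrap the two commands above so that the impairment is always removed. The following is a minimal sketch under stated assumptions: `run` and `netem_window` are hypothetical helper names, `sudo` and `tc` are assumed to be available, and `DRY_RUN=1` only prints the tc commands so the script can be inspected safely before it is used.

```shell
#!/bin/sh
# Apply a netem impairment for a fixed time, then always remove it.
# With DRY_RUN=1 the tc commands are printed instead of executed.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    sudo "$@"
  fi
}

# Usage: netem_window DEV LOSS_PCT DELAY_MS JITTER_MS SECONDS
netem_window() {
  dev=$1 loss=$2 delay=$3 jitter=$4 secs=$5
  run tc qdisc add dev "$dev" root netem \
    loss "${loss}%" delay "${delay}ms" "${jitter}ms" distribution normal || return 1
  # Remove the qdisc even if the script is interrupted with Ctrl+C.
  trap 'run tc qdisc del dev "$dev" root; trap - INT TERM; exit 130' INT TERM
  [ "${DRY_RUN:-0}" = "1" ] || sleep "$secs"
  trap - INT TERM
  run tc qdisc del dev "$dev" root
}

# Same impairment as above, held for 30 seconds. Set DRY_RUN=0 to apply it for real.
DRY_RUN=1
netem_window lo 5 50 10 30
```

In dry-run mode this prints the add and del commands shown above (without `sudo`), which makes it easy to confirm the parameters first.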

Expected behavior

When the four pipelines are running you will see two video windows (one for SRT and one for UDP), each showing the white ball moving in a circle (pattern 18):

  • Left window = SRT (srtsrc) — playback remains much more stable, with continuous motion and steady FPS despite loss/jitter (recovered within the 50 ms latency window).
  • Right window = UDP (udpsrc) — you will observe frame drops, freezes, or macroblocking under the same impairments because RTP/UDP has no loss recovery.

As shown in the following animation, the left window (SRT) remains stable while the right window (UDP) freezes and drops frames under 5% loss and jitter.

Figure: SRT (left) stable playback vs UDP (right) unstable playback

Notes & tips

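  • You can tune latency by changing the latency parameter in the URI query (e.g., 80–200 ms) to trade recovery depth for added buffering. Each retransmission takes roughly one round trip, so the latency should be at least a few times the link RTT.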
  • fpsdisplaysink shows instantaneous FPS; expect SRT to keep FPS closer to 30 under loss, while UDP dips noticeably.
  • If you test across different hosts, replace 127.0.0.1 with the receiver’s IP and apply tc netem on the egress/ingress NIC instead of lo.
  • Keep tune=zerolatency and a small GOP (key-int-max=30) to minimize encoder buffering.
