GStreamer In-Band Metadata for MPEG Transport Stream - MPEG TS Metadata Basics

From RidgeRun Developer Wiki



  Index Next: Getting Started





In-band Metadata Overview

The term In-Band refers to the communication channel used to carry additional data related to a specific source: the additional information travels along with the primary data, sharing the same channel.

As video data moves through a GStreamer pipeline, it can be convenient to attach information related to a specific video frame (such as a GPS location) in a simple manner, so that receivers that understand how to extract the additional metadata can access it in a way that keeps it associated with the correct video data. Similarly, if a receiver doesn't understand in-band metadata, the inclusion of such data will not affect the receiver's operation.

MISP: Motion Imagery Standards Profile

The Motion Imagery Standards Profile, also known as MISP, is an industry standard with recommended practices and engineering guidelines for motion imagery technologies. The key statement in the specification is: "Within the media container, all metadata must be in SMPTE KLV (Key-Length-Value) format." In MISP, the metadata is tagged with a timestamp so that it can be associated with the right video frame. This timestamp is important because GStreamer handles the association slightly differently. To understand the difference, you need to see how MISP combines video frames and metadata so you can compare it to GStreamer. The following diagram is a modified version of one found in the MISB TRM0909 MISP specification.

Metadata Diagram with Motion Imagery

KLV - Key Length Value Metadata

For this discussion, we care about time-stamping and transporting KLV data, not what it means. Stated another way, KLV data is any binary data (plus a length indication) that we need to move from one end to the other while keeping the data associated with the correct video frame. It is up to the user of the video encoding stream and the user of the video decoding stream to understand the meaning and encoding of the KLV data.

To give a concrete KLV encoding example, here is a terse description of the SMPTE 336M-2007 Data Encoding Protocol using Key-Length-Value, which is used by the MISB standards.

Key

Fixed length (1, 2, 4, or 16 bytes), with the size known to both sender and receiver, encoding the key. There are very specific rules on how keys are encoded and how both the sender and the receiver know the meaning of the encoded key.

Length

Fixed (1, 2, or 4 bytes) or variable-length (BER) indication of the number of bytes of data used to encode the value.

Value

Variable-length value whose meaning is agreed upon by both the sender and the receiver.
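The length field's BER option can be sketched in Python. This is an illustration of the short-form/long-form rule used by SMPTE 336M-style BER lengths; the function names are my own:

```python
def ber_encode_length(n: int) -> bytes:
    """Encode a length using BER: short form for n < 128, long form
    (0x80 | byte-count, then big-endian length bytes) otherwise."""
    if n < 128:
        return bytes([n])                       # short form: one byte
    payload = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(payload)]) + payload

def ber_decode_length(buf: bytes) -> tuple[int, int]:
    """Return (length, number of bytes consumed)."""
    first = buf[0]
    if first < 128:
        return first, 1                         # short form
    count = first & 0x7F                        # long form: byte count
    return int.from_bytes(buf[1:1 + count], "big"), 1 + count
```

For example, a length of 300 encodes as 0x82 0x01 0x2C: a marker byte saying "two length bytes follow," then 300 in big-endian.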

As an example (from Wikipedia KLV entry),

  Key     Length   Value
  42      2        0 3

This could be passed as the 4-byte binary blob 0x2A 0x02 0x00 0x03. The transport of the KLV doesn't need to know the actual encoding, just that the data is 4 bytes long.

As another example (not MISB compliant), the length could be 8 and the data 0x46 0x4F 0x4F 0x3D 0x42 0x41 0x52 0x00, which works out to be the NULL-terminated ASCII string FOO=BAR. The transport doesn't care about the encoding, as long as the sender and the receiver are in agreement.
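A minimal Python sketch of how such blobs decompose, assuming a 1-byte key and a 1-byte length (for illustration only; real MISB data uses 16-byte UL keys and BER lengths, and the key used for the second example here is hypothetical):

```python
def parse_klv(blob: bytes) -> tuple[int, bytes]:
    """Parse one KLV triplet with a 1-byte key and a 1-byte length."""
    key = blob[0]
    length = blob[1]
    value = blob[2:2 + length]
    if len(value) != length:
        raise ValueError("truncated KLV")
    return key, value

# The first example: key 42, length 2, value 0x00 0x03
key, value = parse_klv(bytes([0x2A, 0x02, 0x00, 0x03]))

# The string example, with a hypothetical key of 42
key2, value2 = parse_klv(bytes([0x2A, 0x08]) + b"FOO=BAR\x00")
```

The parser never interprets the value bytes; that agreement lives entirely with the sender and receiver.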

Time stamps

In addition to carrying in-band information from the sender to the receiver, the information requires a timestamp mechanism that allows the data to maintain a time relationship with the video frames, which also include timestamps. Both the metadata and the video frame timestamps are generated by the same source clock.

Since both data flows (the metadata and the video frames) can be viewed as data streaming through a pipeline, the maximum accuracy in maintaining the time relationship between the two depends on assigning a timestamp as soon as the data is generated. Any delay or variability in associating the timestamp with either the video frames or the metadata adds error to the time relationship.
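For reference, the MISP Precision Time Stamp (defined in MISB ST 0603) is a 64-bit unsigned count of microseconds since the Unix epoch. A minimal sketch of packing one, assuming the sender samples the clock at the moment the data is generated:

```python
import struct

def misp_timestamp(epoch_seconds: float) -> bytes:
    """Pack a wall-clock time as a big-endian 64-bit count of
    microseconds since the Unix epoch (MISP Precision Time Stamp layout)."""
    return struct.pack(">Q", int(epoch_seconds * 1_000_000))
```

Packing one second past the epoch gives 1,000,000 microseconds, i.e. 0x00000000000F4240 over eight bytes.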

MISB: Motion Imagery Standard Board

The mission of the MISB is to ensure the development, application, and implementation of standards that maintain interoperability, integrity, and quality of motion imagery, associated metadata, audio, and other related systems in the DoD/IC/NSG. It monitors and participates in the development of and changes to adopted standards, it also participates in the North Atlantic Treaty Organization (NATO) Standards Agreement (STANAG) process towards coalition forces interoperability. Some of the defined standards are ST1402 (MPEG-2 Transport Stream for Class 1/Class 2 Motion Imagery, Audio, and Metadata) and ST0601 (UAS Datalink Local Set).

MPEG-2 Transport Stream

The MPEG-2 Transport Stream protocol carries video data, audio data, and metadata, which are termed elementary streams. Each elementary stream is split into packets, and each packet begins with a Transport Stream header. The receiving side uses the header's PID (Packet Identifier) field to de-multiplex the elementary streams. There are many other fields in a Transport Stream header besides the PID field.

For this discussion, the important point is that the Transport Stream protocol definition already supports the notion of including timestamped metadata in a transport stream file.
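As a sketch of the field the de-multiplexing keys on: a TS packet is 188 bytes, starting with the 0x47 sync byte, and the 13-bit PID spans the low 5 bits of byte 1 plus all of byte 2 (the function name is my own):

```python
def ts_pid(packet: bytes) -> int:
    """Extract the 13-bit PID from a 188-byte MPEG-2 TS packet.
    Byte 0 is the 0x47 sync byte; the PID is the low 5 bits of
    byte 1 followed by all 8 bits of byte 2."""
    if len(packet) != 188 or packet[0] != 0x47:
        raise ValueError("not a TS packet")
    return ((packet[1] & 0x1F) << 8) | packet[2]
```

A demuxer such as tsdemux reads this field on every packet to route the payload to the right elementary stream.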

GStreamer tags

GStreamer supports an event called a tag. When an element receives a tag that it doesn't understand, it simply passes it downstream. Tags are either independent of the stream encoding (like the title of the song for an audio stream) or information that affects how the stream is processed (like the stream bitrate).

More information about GStreamer's events and tags can be found in the GStreamer documentation.

GStreamer Core 1.0 - Metadata Support

Since release 1.6, GStreamer has included basic support for muxing KLV metadata into MPEG-TS and demuxing it from MPEG-TS, so no patch is needed for the mpegtsmux and tsdemux elements to accept and stream metadata. You can verify this by running the following commands; notice that tsdemux allows any capability on its source pad.

  • Mpegtsmux

Plugin information using gst-inspect

gst-inspect-1.0 mpegtsmux

Details in the output:

Factory Details:
  Rank                     primary (256)
  Long-name                MPEG Transport Stream Muxer
  Klass                    Codec/Muxer
  Description              Multiplexes media streams into a MPEG Transport Stream
  Author                   Fluendo <contact@fluendo.com>

...

Pad Templates:

  ...

  SINK template: 'sink_%d'
    Availability: On request
      Has request_new_pad() function: 0x7ff826c69980
    Capabilities:

      ...

      meta/x-klv
                 parsed: true
  • Tsdemux

Plugin information from gst-inspect

gst-inspect-1.0 tsdemux
Factory Details:
  Rank                     primary (256)
  Long-name                MPEG transport stream demuxer
  Klass                    Codec/Demuxer
  Description              Demuxes MPEG2 transport streams
  Author                   Zaheer Abbas Merali <zaheerabbas at merali dot org>
                           Edward Hervey <edward.hervey@collabora.co.uk>

...

Pad Templates:

  ...

  SRC template: 'private_%04x'
    Availability: Sometimes
    Capabilities:
      ANY

...

Although this is currently supported out of the box, the implementation does not follow the MISB standard for synchronous metadata. To add this support, the mpegtsmux and tsdemux elements need to be updated. You can contact RidgeRun for more information.

The meta-plugin

For GStreamer Core 1.0 (1.8.0 and newer), RidgeRun developed the meta-plugin, which contains two elements. metasrc injects any kind of metadata directly into a pipeline (similar to the gstreamer-0.10 element) and, as a new feature, can send metadata periodically and provides in-band metadata support with a date-time format. metasink receives the incoming metadata buffers. To inject metadata into a pipeline and/or accept any metadata, the meta plugin must be compiled and installed correctly.

#gst-inspect-1.0 meta

Plugin Details:
  Name                     meta
  Description              Elements used to send and receive metadata
  Filename                 /home/$USER/gst_$VERSION/out/lib/gstreamer-1.0/libgstmeta.so
  Version                  1.0.0
  License                  Proprietary
  Source module            gst-plugin-meta
  Binary package           RidgeRun elements
  Origin URL               http://www.ridgerun.com

  metasrc: Metadata Source
  metasink: Metadata Sink

  2 features:
  +-- 2 elements

GStreamer Application: metasrc and metasink

Adding metadata support to your application involves two main tasks: injecting metadata into the pipeline on the sending side, and extracting the metadata on the receiving side. In this section, we will review how to achieve both.

To inject metadata, you create a GStreamer buffer (referred to below as gstbuf), set the timestamp, set the caps to meta/x-klv, copy your metadata into the buffer, and push the gstbuf to the metadata sink pad of the mpegtsmux element.

To make this easier, you can use RidgeRun's metasrc element. Once the pipeline contains a metasrc element, you simply set its metadata property to the data you want to inject; metasrc creates the gstbuf and sets the timestamp and caps for you.

In python for instance, once the pipeline is created, you simply run:

   metadata = "arbitrary data here"
   metasrc = self.pipeline.get_by_name("metasrc")
   metasrc.set_property("metadata", metadata)

To extract the metadata, you branch out from the demuxer's metadata source pad, setting the caps to meta/x-klv. This pad outputs the metadata as gstbufs from the tsdemux element.

References

MISP Specification

GStreamer's KLV Support

