GStreamer and in-band metadata

From RidgeRun Developer Wiki
Revision as of 23:48, 18 June 2012 by Tfischer

Overview

As video data moves through a GStreamer pipeline, it can be convenient to attach information related to a specific frame of video, such as the GPS location, so that receivers that understand how to extract metadata can access that information while it stays associated with the correct video data. Conversely, if a receiver doesn't understand in-band metadata, the inclusion of such data will not affect that receiver.

References


MISP Motion Imagery Standards Profile

The key statement in the specification is "Within the media container, all metadata must be in SMPTE KLV (Key-Length-Value) format." In MISP, the metadata is tagged with a timestamp so that it can be associated with the right video frame. This is important because GStreamer handles the association slightly differently. To understand the difference, you need to see how MISP combines video frames and metadata so you can compare it to GStreamer. The following diagram is a modified version from the MISB TRM 0909 MISP spec.

KLV Key Length Value Metadata

For this discussion, we care about time stamping and transporting KLV data, not what it means. Stated another way, KLV data is any binary data (plus a length indication) that we need to move from one end to the other while keeping the data associated with the correct video frame. It is up to the user of the video encoding stream and the user of the video decoding stream to understand the meaning and encoding of the KLV data.

To give a concrete KLV encoding example, here is a terse description of SMPTE 336M-2007, Data Encoding Protocol Using Key-Length-Value, which is used by the MISB standard.

Key: Fixed length (1, 2, 4, or 16 bytes), size known to both sender and receiver, encoding the key. There are very specific rules on how keys are encoded and how both the sender and receiver know the meaning of the encoded key.

Length: Fixed or variable length (1, 2, 4, or BER) indication of the number of bytes of data used to encode the value.

Value: Variable length value whose meaning is agreed to by both the sender and the receiver.
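
To make the variable-length (BER) option for the length field more concrete, here is a minimal Python sketch of the usual BER length rules: lengths below 128 fit in one byte, and larger lengths use a first byte of 0x80 plus a byte count, followed by the length in big-endian order. The function names encode_ber_length and decode_ber_length are hypothetical, chosen only for illustration. The sketch does not handle the indefinite form and makes no claim to cover every rule in SMPTE 336M.

```python
from typing import Tuple


def encode_ber_length(n: int) -> bytes:
    """Encode a length using BER short form (< 128) or long form (hypothetical helper)."""
    if n < 0:
        raise ValueError("length must be non-negative")
    if n < 0x80:
        # Short form: the single byte is the length itself.
        return bytes([n])
    # Long form: first byte is 0x80 | number of length bytes that follow,
    # then the length itself in big-endian order.
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body


def decode_ber_length(buf: bytes, offset: int = 0) -> Tuple[int, int]:
    """Return (length, bytes consumed) for a BER length starting at offset."""
    first = buf[offset]
    if first < 0x80:
        return first, 1
    count = first & 0x7F
    if count == 0:
        # 0x80 alone is the BER indefinite form, which this sketch does not support.
        raise ValueError("indefinite length form is not supported")
    raw = buf[offset + 1:offset + 1 + count]
    if len(raw) != count:
        raise ValueError("truncated length field")
    return int.from_bytes(raw, "big"), 1 + count
```

With this sketch, a length of 2 encodes as the single byte 0x02, while a length of 300 encodes as the three bytes 0x82 0x01 0x2C.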

As an example (from the Wikipedia KLV entry):

Key    Length    Value
42     2         0 3

This could be passed in as a 4-byte binary blob: 0x2A 0x02 0x00 0x03. The transport of the KLV doesn't need to understand the encoding; it only needs to know that the blob is 4 bytes long and what those bytes are.
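
The four-byte example above can be reproduced with a short Python sketch. It assumes the simplest layout, a 1-byte key followed by a 1-byte length. The helpers pack_klv and unpack_klv are hypothetical names used only for illustration and are not part of GStreamer or any KLV library.

```python
from typing import Tuple


def pack_klv(key: int, value: bytes) -> bytes:
    """Pack one KLV triplet, assuming a 1-byte key and a 1-byte length."""
    if not 0 <= key <= 0xFF:
        raise ValueError("key must fit in one byte in this sketch")
    if len(value) > 0xFF:
        raise ValueError("value too long for a 1-byte length")
    return bytes([key, len(value)]) + value


def unpack_klv(blob: bytes) -> Tuple[int, int, bytes]:
    """Split a blob produced by pack_klv back into (key, length, value)."""
    if len(blob) < 2:
        raise ValueError("blob too short to hold a key and a length")
    key, length = blob[0], blob[1]
    value = blob[2:2 + length]
    if len(value) != length:
        raise ValueError("blob is shorter than its length field claims")
    return key, length, value


# Key 42, value bytes 0 and 3; the length (2) is derived from the value.
blob = pack_klv(42, bytes([0, 3]))
print(" ".join(f"0x{b:02X}" for b in blob))  # prints: 0x2A 0x02 0x00 0x03
```

Only the sender and receiver need these helpers. The transport in between sees just the four bytes.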

As another example (not MISB compliant), you could have the length be 8 and the data be 0x46 0x4F 0x4F 0x3D 0x42 0x41 0x52 0x00, which works out to be the NUL-terminated ASCII string FOO=BAR. The transport doesn't care about the encoding, as long as the sender and receiver agree on it.
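
As a rough illustration of that split in responsibilities, the Python sketch below builds the FOO=BAR payload on the sending side, shows what the transport sees (a length and opaque bytes), and decodes the string on the receiving side. The names make_payload, transport_view, and read_payload are hypothetical and exist only for this example.

```python
from typing import Tuple


def make_payload(text: str) -> bytes:
    """Sender side: encode text as NUL-terminated ASCII (an agreed convention)."""
    return text.encode("ascii") + b"\x00"


def transport_view(blob: bytes) -> Tuple[int, bytes]:
    """Transport side: only the length and the raw bytes matter."""
    return len(blob), blob


def read_payload(blob: bytes) -> str:
    """Receiver side: undo the sender's convention."""
    if not blob.endswith(b"\x00"):
        raise ValueError("payload is not NUL-terminated")
    return blob[:-1].decode("ascii")


payload = make_payload("FOO=BAR")
length, raw = transport_view(payload)
print(length)                               # prints: 8
print(" ".join(f"0x{b:02X}" for b in raw))  # prints: 0x46 0x4F 0x4F 0x3D 0x42 0x41 0x52 0x00
print(read_payload(raw))                    # prints: FOO=BAR
```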

Time stamps

Metadata and GStreamer