Ridgerun OMX 0.10 plugins for DM81xx platforms

From RidgeRun Developer Wiki

Revision as of 19:37, 2 April 2014

Introduction

To resolve pipeline state change problems with the existing GStreamer OMX plugin for the DM81xx platforms, RidgeRun took a new approach, which led to the development of a new OpenMAX GStreamer plugin. Developers will find the following information useful as they add features to the new plugin.

Downloading source code

The source files for Ridgerun OMX 0.10 plugins are available on GitHub.

git clone https://github.com/RidgeRun/gst-rr-openmax-dm81xx

You can use these plugins together with the old RidgeRun plugins (based on TI's version) by checking out the remove-obsolete-elements branch:

git clone https://github.com/RidgeRun/gst-openmax-dm81xx
cd gst-openmax-dm81xx/
git checkout remove-obsolete-elements

The obsolete elements are still available with a legacy_ prefix (e.g. legacy_omx_scaler):

omx:  omx_mpeg4dec: OpenMAX IL MPEG-4 video decoder
omx:  legacy_omx_h264dec: OpenMAX IL H.264/AVC video decoder
omx:  omx_mjpegdec: OpenMAX IL JPEG/MJPEG decoder
omx:  legacy_omx_mpeg2dec: OpenMAX IL MPEG2 video decoder
omx:  omx_h264enc: OpenMAX IL H.264/AVC video encoder
omx:  omx_vc1dec: OpenMAX IL vc1/WMV video decoder
omx:  omx_aacdec: OpenMAX IL AAC audio decoder
omx:  omx_aacenc: OpenMAX IL AAC audio encoder
omx:  omx_jpegenc: OpenMAX IL MJPEG video encoder
omx:  omx_videosink: OpenMAX IL videosink element
omx:  swcsc:  Image colorconversion
omx:  gstperf:  Performance element
omx:  omxbufferalloc: omxBufferAlloc
omx:  legacy_omx_scaler: OpenMAX IL for OMX.TI.VPSSM3.VFPC.INDTXSCWB component
omx:  legacy_omx_hdeiscaler: OpenMAX IL for OMX.TI.VPSSM3.VFPC.DEIMDUALOUT component
omx:  omx_noisefilter: OpenMAX IL for OMX.TI.VPSSM3.VFPC.NF component
omx:  omx_ctrl: OpenMAX IL Client to control display/capture mode
omx:  omx_tvp: External video decoder control
omx:  omx_camera: Video OMX Camera Source
omx:  priority: TI Priority adjuster
omx:  rr_h264parser: rr_h264parser
omx:  omx_videomixer: OpenMAX IL for OMX.TI.VPSSM3.VFPC.INDTXSCWB component
rromx:  omx_mpeg2dec: OpenMAX MPEG-2 video decoder
rromx:  omx_h264dec: OpenMAX H.264 video decoder
rromx:  omx_scaler: OpenMAX video scaler
rromx:  omx_hdeiscaler: OpenMAX video deiscaler
rromx:  omx_mdeiscaler: OpenMAX video deiscaler

Mixing the old and new elements is not recommended. One exception is v4l2src, which you can use with the old omxbufferalloc.

Development criteria

The Base OMX Class

RrOmxBase is the base class from which all the elements derive. It handles all the common logic, so the subclasses only need to implement the stream-specific logic.

The base class handles:

  • Memory management
  • Buffer scheduling
  • Caps negotiation
  • Allocating and releasing resources
  • Element lifecycle

The subclasses must:

  • Install the appropriate pads
  • Parse the caps to grab useful information
  • Perform port- and component-specific initialization
  • Handle processed buffers (push downstream, discard, etc.)

Element lifecycle

1. The instance is created and the OMX component handle is requested.

2. One of two things may happen next:

2.1 The upstream element requests an empty buffer from the OMX component (by calling buffer_alloc())

 buffer_alloc(caps):
   if (caps are not set yet):
     set_caps (caps)
   if (component not started yet):
     start()
   return free_buffer_from_existing()

2.2 Or set_caps() is called directly (no buffer requested)

 set_caps(caps):
   subclass->parse_caps(caps)
 
   if (already_configured):
     stop ()
 
   subclass->init_ports()

3. Component is started

 start():

   set_component_to_idle()
 
   for each port:
     allocate_buffers()
 
   set_component_to_executing()

4. Start buffer processing

 chain(buffer):
   if (component not started yet):
     if (buffer is OMX):
       share_buffers()
     start()

   if (buffer is OMX):
     mark_as_busy (buffer)
   else:
     copy_into_free_buffer (buffer)
 
   empty_buffer(buffer)

 empty_buffer_callback(buffer):
   mark_buffer_as_free (buffer)

 fill_buffer_callback(buffer):
   mark_as_busy (buffer)
   subclass->fill_buffer (buffer)

5. Component is stopped

 stop():
 
   for each port:
     flush()
 
   set_component_to_idle()
 
   for each port:
     free_buffers()
 
   set_component_to_loaded()

6. Resources are freed
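
The call ordering in the lifecycle above can be modelled as a small state machine. The following Python sketch is only an illustration of that ordering: the class name LifecycleModel, its attributes, and the log list are assumptions made for this example and are not part of the plugin (which is written in C).

```python
class LifecycleModel:
    """Toy model of the base-class lifecycle (hypothetical, not the plugin's API)."""

    def __init__(self, ports=("in", "out"), buffers_per_port=4):
        self.state = "loaded"       # OMX state: loaded -> idle -> executing
        self.ports = ports
        self.buffers_per_port = buffers_per_port
        self.caps = None
        self.configured = False
        self.buffers = {}           # port name -> list of buffer ids
        self.log = []               # records the order of operations

    def set_caps(self, caps):
        self.log.append("parse_caps")      # subclass parses the caps
        if self.configured:
            self.stop()                    # reconfigure on the fly
        self.caps = caps
        self.configured = True
        self.log.append("init_ports")      # subclass configures the ports

    def start(self):
        self.state = "idle"
        for port in self.ports:
            # Buffers are allocated per port while the component is idle.
            self.buffers[port] = [f"{port}-{i}" for i in range(self.buffers_per_port)]
        self.state = "executing"
        self.log.append("start")

    def stop(self):
        self.log.append("flush")           # flush every port first
        self.state = "idle"
        self.buffers.clear()               # free the buffers of every port
        self.state = "loaded"
        self.log.append("stop")

    def buffer_alloc(self, caps):
        if self.caps is None:
            self.set_caps(caps)
        if self.state != "executing":
            self.start()
        return self.buffers["in"][0]       # stands in for "a free buffer"
```

Calling buffer_alloc() on a fresh instance runs set_caps() and start() in that order, and a later set_caps() stops the component before the ports are initialized again.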

Subclassing

Subclassing the GstRrOmxBase class should be done by implementing the following class functions:

Class function | Description
omx_event | NOT IMPLEMENTED: This function forwards any received event to the subclass.
omx_fill_buffer() | Once the component finishes processing a buffer, it calls the fill_callback() of the base class, which forwards the buffer to the subclass by calling this function. The subclass should either push the buffer to its src port(s) or discard it, thus releasing it.
omx_empty_buffer() | NOT IMPLEMENTED: Intended for the case in which the subclass needs to be informed of the empty callback. It is difficult to find a situation in which that happens.
parse_caps() | Once caps have been negotiated, they are forwarded to the subclass so that it can parse all the necessary information: width, height, framerate, etc.
init_ports() | Called when the subclass must initialize the ports with the information retrieved from the caps in parse_caps().
parse_buffer() | DEPRECATED: This was an initial approach to implementing the parser inside the decoder. It is too complicated, especially because demuxers send buffers in chunks, so one must implement an adapter, etc.
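
The division of labour between the base class and a subclass can be sketched as follows. This is a Python model, not the plugin's C code: OmxBaseModel, ScalerModel, the dictionary-shaped caps, and the frame-size formula (two bytes per pixel) are all assumptions made for the illustration. Only the hook names parse_caps(), init_ports(), and omx_fill_buffer() come from the table above.

```python
from abc import ABC, abstractmethod


class OmxBaseModel(ABC):
    """Hypothetical stand-in for the base class; only hook dispatch is modelled."""

    def set_caps(self, caps):
        self.parse_caps(caps)      # the subclass extracts what it needs
        self.init_ports()          # the subclass configures its ports

    def fill_callback(self, buffer):
        # The base class forwards every processed buffer to the subclass.
        return self.omx_fill_buffer(buffer)

    @abstractmethod
    def parse_caps(self, caps): ...

    @abstractmethod
    def init_ports(self): ...

    @abstractmethod
    def omx_fill_buffer(self, buffer): ...


class ScalerModel(OmxBaseModel):
    """Illustrative subclass; the real scaler's port setup is more involved."""

    def __init__(self):
        self.width = self.height = None
        self.port_config = None
        self.pushed = []

    def parse_caps(self, caps):
        self.width, self.height = caps["width"], caps["height"]

    def init_ports(self):
        # Assumed layout: 2 bytes per pixel, chosen only for the example.
        self.port_config = {"frame_size": self.width * self.height * 2}

    def omx_fill_buffer(self, buffer):
        self.pushed.append(buffer)  # stands in for pushing downstream
        return "ok"
```

A subclass that omits any of the three hooks cannot be instantiated, which mirrors the requirement that every element supplies its own stream-specific logic.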

Philosophy

Dear GstRrOmx developer,

  • All elements should be able to transition NULL-PLAYING-NULL-PLAYING... You can easily test this using GStreamer Daemon
  • Follow the naming convention, even for debug and properties.
  • Elements should be able to reconfigure themselves on the fly when new caps are received. If the base class logic inhibits reconfiguration, modify the core.
  • Think from a user's perspective: If my pipeline failed, I want to know why.

Known limitations and issues

  • Debug output from callbacks will segfault. Use GST_DEBUG_FILE=/dev/console to view debug output.
  • Most configuration values, except those that depend on the caps, are hardcoded. You can change this as needed.
  • Placing queues after the last omx element has proven to cause problems. We do not know why this is an issue yet.
  • Do not mix new elements with old omx elements; they will implode and smoke and bad smells will come out of your computer.
  • The number of input ports on Scaler and Deiscaler has to be equal to the number of output buffers on decoder elements.
  • The number of input or output buffers has to be set carefully or some videos will not run properly (for example, decoders usually work properly with four input buffers). For simplicity, just go to the OMX examples and grab the port configuration, as they have been tested and work properly.

Memory Management

The problem

One of the things that makes OMX/GStreamer so challenging is the way buffers, and memory in general, are handled. In the OMX framework, buffers must be registered to the component's ports prior to buffer processing, so only these buffers may be used during the pipeline's lifecycle.

Each port has a list of buffers that will be used during the pipeline's lifecycle.

A buffer not registered in a port cannot be pushed to that port, and buffers cannot be registered on the fly. To append a new buffer to the list, the component must first be stopped (brought to the idle state), which interrupts the stream. A mechanism smart enough to recycle the buffers on each port with the upstream and downstream elements must therefore be implemented.

The Buffer Table

The list of buffers on each port is managed by the buftab. The buftab is a minimalist implementation of a custom buffer table. It is thread safe and ensures that everything is freed properly before shutting itself down.
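
To make the idea of a fixed, thread-safe buffer table concrete, here is a minimal Python model. It is not the buftab implementation: the class name BufTab and the methods acquire(), release(), contains(), and free_count() are hypothetical names chosen for this sketch. What it does share with the description above is that the set of buffers is fixed at creation time and that access is serialized.

```python
import threading


class BufTab:
    """Fixed-size, thread-safe buffer table (illustrative model only)."""

    def __init__(self, count, size):
        self._cond = threading.Condition()
        self._buffers = [bytearray(size) for _ in range(count)]
        self._busy = [False] * count

    def contains(self, buf):
        """True if buf is one of the registered buffers (identity check)."""
        return any(candidate is buf for candidate in self._buffers)

    def acquire(self, timeout=None):
        """Return a free buffer and mark it busy; None if the wait times out."""
        with self._cond:
            if not self._cond.wait_for(lambda: not all(self._busy), timeout):
                return None
            index = self._busy.index(False)
            self._busy[index] = True
            return self._buffers[index]

    def release(self, buf):
        """Mark a registered buffer as free again and wake one waiter."""
        with self._cond:
            for index, candidate in enumerate(self._buffers):
                if candidate is buf:
                    self._busy[index] = False
                    self._cond.notify()
                    return
            raise ValueError("buffer is not registered in this table")

    def free_count(self):
        with self._cond:
            return self._busy.count(False)
```

Because no buffer can be added after construction, a caller that finds the table exhausted can only wait for a release, which is the recycling constraint the previous section describes.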

The GStreamerish solution

GStreamer has a rather loose approach to dealing with elements that have special memory requirements. An element asks its downstream peer for buffers by calling gst_pad_alloc_buffer and friends. The peer element may or may not have a buffer_alloc callback implemented. If it does, it must allocate the buffer from its special memory, with the requested size and caps. If it does not, it must forward the request downstream. GStreamer base classes normally handle this automatically; if you are deriving directly from GstElement, you will need to implement this yourself.
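
The forwarding rule can be pictured as a chain of elements in which the first one that has an allocator answers the request. The Python sketch below is a toy model and does not use the GStreamer API: Element, omx_allocator, and the dictionary-shaped buffers are assumptions for illustration, as is the fallback to an ordinary buffer when nobody in the chain answers.

```python
class Element:
    """Toy pipeline element (hypothetical; not the GStreamer API)."""

    def __init__(self, name, allocator=None, downstream=None):
        self.name = name
        self.allocator = allocator      # callable(size, caps) or None
        self.downstream = downstream

    def buffer_alloc(self, size, caps):
        if self.allocator is not None:
            # This element has special memory: it answers the request itself.
            return self.allocator(size, caps)
        if self.downstream is not None:
            # No callback here: forward the request to the next peer.
            return self.downstream.buffer_alloc(size, caps)
        # End of the chain: assume an ordinary buffer is handed out.
        return {"owner": "default", "data": bytearray(size), "caps": caps}


def omx_allocator(size, caps):
    """Stands in for an element that hands out its own OMX memory."""
    return {"owner": "omx", "data": bytearray(size), "caps": caps}


sink = Element("sink")
scaler = Element("scaler", allocator=omx_allocator, downstream=sink)
passthrough = Element("passthrough", downstream=scaler)
```

Here a request made on passthrough reaches scaler and is served from OMX memory. An element that neither allocates nor forwards would break the chain, which is the failure mode discussed under the limitations of this approach.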

The GstRrOMX plugin leverages this to perform buffer sharing among the different OMX elements. Consider the following diagram:

source     !      rromx_mpeg2dec      !      rromx_scaler peer-alloc=false      !     sink
  V                     A   V                 A                V                       A
  '-- may ask for buff--'   '-- ask for buff--'                '-- don't ask for buff--'

The source element may or may not ask the decoder for buffers; it is up to the source being used. Regular elements don't have the limitation of needing a fixed set of buffers, so if the decoder is going to provide the source with buffers, care must be taken to ensure that it won't run out of buffers to supply to the source, which would probably crash the pipeline. If the decoder input port happens to receive a buffer that isn't in its buffer table, it will memcpy the data into one of its own buffers. This memcpy should emit a log to the GST_PERFORMANCE category.
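
The zero-copy versus memcpy decision on the input port can be sketched like this. It is a Python illustration under stated assumptions: prepare_input(), the table and free_slots arguments, and the "performance" logger (standing in for the GST_PERFORMANCE category) are invented for the example and do not reflect the plugin's actual function names.

```python
import logging

perf_log = logging.getLogger("performance")  # stands in for GST_PERFORMANCE


def prepare_input(buffer, table, free_slots):
    """Return (registered_buffer, copied) for an incoming buffer.

    table is the list of registered bytearrays of the port and free_slots
    is the list of indices in table that are currently free.
    """
    for index, registered in enumerate(table):
        if registered is buffer:
            if index in free_slots:
                free_slots.remove(index)      # mark the shared buffer as busy
            return registered, False          # already registered: no copy
    if not free_slots:
        raise RuntimeError("no free registered buffer available")
    target = table[free_slots.pop(0)]
    if len(buffer) > len(target):
        raise ValueError("incoming buffer is larger than a registered buffer")
    target[:len(buffer)] = buffer             # the memcpy
    perf_log.info("copied %d bytes into a registered buffer", len(buffer))
    return target, True
```

The copy path keeps the pipeline running with foreign buffers, at the cost that the log message is meant to make visible.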

The decoder output port will then ask the scaler input port for buffers. The decoder port will ask for its whole table all at once prior to pipeline processing, so the scaler's input port must have enough buffers to provide all of them simultaneously. Obviously, the caps between the two elements must be compatible. Special care must be taken regarding both components' alignment restrictions. The decoder output and the scaler input will share the same set of buffers during the pipeline lifecycle.

The scaler output port, on the other hand, is configured to avoid asking downstream for buffers. This is in fact desired: it is known that no OMX elements are downstream, and some other element might provide a buffer that is not OMX memory, which would be catastrophic. It is recommended to set peer-alloc=false on the last OMX element.


Limitations of this approach

Even though this is the standard GStreamer way to deal with special memory requirements, the mechanism has well-known limitations. This is probably the reason why it was discontinued in GStreamer 1.0 and replaced by the promising allocator.

  • Buggy elements do not always forward buffer_alloc requests: If one of these buggy elements is placed between the decoder and the scaler, the buffer alloc will not be forwarded and hence no buffer sharing will be performed. This is more common than one would like :(
  • Buffer sharing may only be performed between two elements: Consider the following situation
                          .- rromx_scaler 1
rromx_mpeg2decoder - tee +        
                          '- rromx_scaler 2

See the problem? The tee will only forward the request to one pad (selected by a property of the tee), and hence to one scaler. This means that the decoder will only share its buffers with one scaler. The other one is condemned to memcpy the data for the rest of its life. This can be generalized to all SIMO (single-input, multiple-output) elements.

The TI way

TI's approach in the regular GstOpenmax plugins is the opposite. The OMX element reserves its memory upon the arrival of the first buffer. If this buffer is an OMX_BUFFER_TRANSPORT (a subclass of a regular GstBuffer), it carries the buffer set of an upstream OMX element, and these buffers may be registered to be shared. The element then allocates its output buffers and places them on the OMX_BUFFER_TRANSPORT pushed downstream. The process repeats as needed.

source   !   omx_mpeg2dec   !   omx_scaler  !  sink
              V                         A
              '-- OMX_BUFFER_TRANSPORT--'
                  /       /       /  \
           buffer! buffer! buffer!    buffer!

Limitations of this approach

This approach overcomes the limitations of the GStreamerish way, but presents its very own.

  • Toggling between input branches is problematic: Consider the following configuration
 source - omx_mpeg2dec 1 --.
                            >-- omx_scaler - sink 
 source - omx_mpeg2dec 2 --'

The scaler feed is selected between the two decoders. The scaler will only be able to share buffers with one of them (the one from which it receives the first buffer). This can be generalized to all MISO (multiple-input, single-output) elements. TI's elements currently segfault if this situation occurs. A more robust implementation would memcpy the data into an owned buffer.

  • GstBuffer subclasses may be hidden if a subbuffer is created: Under some special conditions, a subbuffer of the OMX_BUFFER_TRANSPORT is created. This subbuffer is a regular GstBuffer and hence doesn't carry all the cool stuff its parent does. If the scaler receives a subbuffer from the decoder, it won't notice that it is an OMX_BUFFER_TRANSPORT and the elements won't be able to share buffers. This situation typically occurs when a tee is in the pipeline. One possible workaround is to query for both the buffer and its parent (if any), and then retrieve all the OMX stuff from the parent. The implications of doing this for the core are unclear.
  • GstBuffer subclasses are not supported on the GStreamer 1.0 version
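
The parent-lookup workaround mentioned in the list above can be illustrated with a small Python model. Nothing here is GStreamer or TI code: Buffer, OmxTransportBuffer, make_subbuffer(), and find_transport() are hypothetical names, and the model simply assumes that a subbuffer keeps a reference to the buffer it was created from.

```python
class Buffer:
    """Minimal stand-in for a regular buffer (hypothetical model)."""

    def __init__(self, data, parent=None):
        self.data = data
        self.parent = parent        # set only for subbuffers


class OmxTransportBuffer(Buffer):
    """Stand-in for OMX_BUFFER_TRANSPORT: also carries the upstream buffer set."""

    def __init__(self, data, buffer_set):
        super().__init__(data)
        self.buffer_set = buffer_set


def make_subbuffer(buffer, offset, size):
    # As in the situation described above, the result is a plain Buffer,
    # so the subclass information is no longer visible on it.
    return Buffer(buffer.data[offset:offset + size], parent=buffer)


def find_transport(buffer):
    """Walk the parent chain; return the first transport buffer or None."""
    while buffer is not None:
        if isinstance(buffer, OmxTransportBuffer):
            return buffer
        buffer = buffer.parent
    return None
```

With this lookup, an element that receives a subbuffer can still reach the buffer set of the upstream element; whether that is safe for the core is the open question noted above.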

Interlaced buffers

Some OMX components, e.g. the deiscaler, require special memory treatment if the video format is interlaced. The component expects a separate OMX buffer for each frame field. For this reason, the base class has a variable, interlaced, that can be overridden by the subclass. If interlaced is set, it indicates to the base class that each field of the incoming buffers should be sent to the OMX component in a separate OMX buffer.

When interlaced is set, only the TIish memory sharing is allowed. In this case, the base class allocates two OMX buffer headers for each peer buffer. The offset and size of each field buffer are calculated to avoid copies.

Implemented Approach

The GStreamerish and TIish approaches each work better in different situations. For this reason both are available, configurable through a property (peer-alloc). Memory allocation is managed as follows:

  • For the src pads, the GStreamerish way is used. That is, if the peer-alloc property is set, the pad asks the peer for OMX buffers; if not, the pad allocates its own OMX buffers.
  • For the sink pads, a combination of both methods is used. If a pad_alloc call is made on the pad, the OMX buffers are allocated for the pad and shared with the upstream peer. On the other hand, if no pad_alloc has occurred before the first buffer arrives at the chain function, the TIish way is used: if the input buffer is an OMX type, the pad uses the upstream peer's buffers, registering them to the OMX port; if not, the pad allocates its own OMX buffers.
 allocate_buffers(pad, peer_buffers):
   if (pad already has buffers table):
     ignore call
   
   if (peer_buffers):
     if (interlaced):
       divide amount of buffers by 2
       calculate second field offset
     
     get peer_buffers table

   for amount of buffers:
     if (pad is src & peer_alloc):
       omxbuffer = peer_alloc_buffer()
     else if (pad is sink & peer_buffers):
       if (top_field):
         omxbuffer = get buffer from peer table
       else:
         omxbuffer = add offset to previous peer buffer

     if (omxbuffer):
       omx_use_buffer()
     else:
       omx_allocate_buffer()
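
One way to read the pseudocode above is as a function that decides, for every OMX buffer header of a pad, whether existing memory is used or new memory is allocated. The Python sketch below follows that reading. It is an interpretation, not the plugin's code: the dictionary-shaped pad, the (action, source) tuples, the field_offset parameter, and the choice to treat count as the number of OMX headers (so that an interlaced pad consumes count / 2 peer buffers) are all assumptions.

```python
def allocate_buffers(pad, count, peer_alloc=False, peer_table=None,
                     interlaced=False, field_offset=0, peer_alloc_buffer=None):
    """Return one (action, source) tuple per OMX buffer header of the pad."""
    if pad.get("table") is not None:
        return pad["table"]                  # the pad already has a table

    headers = []
    for index in range(count):
        omxbuffer = None
        if pad["direction"] == "src" and peer_alloc and peer_alloc_buffer:
            omxbuffer = peer_alloc_buffer()  # ask the downstream peer
        elif pad["direction"] == "sink" and peer_table:
            if not interlaced:
                omxbuffer = (peer_table[index], 0)
            else:
                # Two headers per peer buffer: top field, then bottom field.
                peer = peer_table[index // 2]
                offset = 0 if index % 2 == 0 else field_offset
                omxbuffer = (peer, offset)
        # "use" registers existing memory, "allocate" creates new memory.
        action = "use" if omxbuffer is not None else "allocate"
        headers.append((action, omxbuffer))

    pad["table"] = headers
    return headers
```

For example, a sink pad with four headers, two peer buffers, and interlaced set produces a top-field and a bottom-field header for each peer buffer, while a src pad without peer-alloc falls back to allocating every buffer itself.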

Elements supported

  • Mpeg2 Decoder
  • H.264 Decoder
  • Scaler
  • Hdeiscaler and Mdeiscaler
  • v4l2src and v4l2sink (the omxbufferalloc to be used with v4l2src is not implemented yet; please see the Downloading source code section for more details)

Examples

The following link provides some examples using the currently supported plugins: