GstCUDA - cudademux

This page describes in detail the cudademux element of the GstCUDA plugin.

Description

Cudademux is a GStreamer video filter element with a single input pad and multiple output pads that allows video frames to be processed on the GPU by a custom CUDA library algorithm. Users can develop their own CUDA processing library and pass it to cudademux, which executes it on the GPU: frames arriving from upstream in the GStreamer pipeline are handed to the GPU, and the modified frames are pushed downstream, one per output pad, to the next elements in the pipeline.

This element executes the CUDA algorithm from a custom CUDA library (an XXX.so file) that is loaded dynamically at run time and specified through an element property. Because the CUDA algorithm is separate from the GStreamer element, the developer can modify the algorithm, recompile the custom CUDA library and run the GStreamer pipeline again to test the changes. This cycle can be repeated as many times as needed to debug a custom CUDA algorithm, which makes cudademux well suited to quick prototyping: it offers flexibility and adapts to many project requirements.
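The run-time loading described above can be illustrated with the standard POSIX dlopen/dlsym calls. This is only a minimal sketch of the general mechanism, not the GstCUDA implementation: the helper name load_symbol and the process_fn signature are hypothetical, since this page does not document the library template that cudademux expects.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical entry-point signature, for illustration only. The real
 * template expected by cudademux is not described on this page. */
typedef int (*process_fn)(const unsigned char *in, unsigned char **outs,
                          size_t n_outs, size_t size);

/* Open the shared object at `path` (NULL means the running program
 * itself) and look up `symbol`. Returns the symbol address, or NULL on
 * failure. On success the library handle is stored in *handle_out so
 * the caller can dlclose() it later; on failure it is set to NULL. */
void *load_symbol(const char *path, const char *symbol, void **handle_out)
{
    *handle_out = dlopen(path, RTLD_NOW);
    if (*handle_out == NULL) {
        const char *err = dlerror();
        fprintf(stderr, "dlopen failed: %s\n", err ? err : "unknown error");
        return NULL;
    }

    dlerror(); /* clear any stale error before dlsym */
    void *sym = dlsym(*handle_out, symbol);
    if (sym == NULL) {
        const char *err = dlerror();
        fprintf(stderr, "dlsym failed: %s\n", err ? err : "symbol is NULL");
        dlclose(*handle_out);
        *handle_out = NULL;
    }
    return sym;
}
```

Under these assumptions, a caller would cast the returned address to process_fn, call it once per frame, and call dlclose() on the handle when the pipeline stops. Since the library is opened each time the pipeline starts, a recompiled .so is picked up on the next run without rebuilding the element. On older glibc versions the program may need to be linked with -ldl.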

One key feature of this element is that the CUDA algorithm used to process incoming frames on the GPU is loaded from an external, compiled custom CUDA library. This keeps the GStreamer element separate from the CUDA algorithm, so the developer does not have to worry about the GStreamer-CUDA interface or complex memory handling; cudademux takes care of that. Instead, the developer can focus on developing the custom CUDA algorithm and test any change made during debugging by recompiling the CUDA library and running the GStreamer pipeline again, without having to modify, recompile and reinstall the GstCUDA plugin. This considerably accelerates the prototyping stage, which is crucial in reducing a project's time to market.

Another crucial feature of cudademux is its single-input/multiple-output pad topology, which makes the element flexible and adaptable to many project requirements. The element has one "Always" sink pad and multiple "On request" source pads. The user is responsible for requesting as many source pads as the custom CUDA algorithm has outputs. Because the element is intended for quick prototyping, it does not detect a mismatch between the number of requested source pads and the number of outputs required by the custom CUDA algorithm. The cudademux element generates an array of outputs based on the number of requested source pads and passes it to the custom CUDA algorithm, according to the expected template of the custom CUDA library. For this reason, it is very important that the user matches the number of requested source pads to the number of outputs defined in the custom CUDA library to avoid an error.
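To make the pad/output matching concrete, the sketch below models the array of outputs in plain C on the CPU. It is an assumption-laden illustration rather than GstCUDA code: the output_set type and its helper functions are hypothetical, and the real element works on GPU-accessible buffers rather than heap allocations.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical container for the output array handed to the library:
 * one buffer per requested source pad. Not the real GstCUDA type. */
typedef struct {
    unsigned char **bufs; /* bufs[i] backs source pad i */
    size_t count;         /* number of requested source pads */
    size_t size;          /* bytes per output buffer */
} output_set;

/* Release every buffer and the container itself. NULL is accepted. */
void output_set_free(output_set *s)
{
    if (s == NULL) return;
    for (size_t i = 0; i < s->count; i++) free(s->bufs[i]);
    free(s->bufs);
    free(s);
}

/* Allocate one zeroed buffer per requested pad. Returns NULL if the
 * arguments are zero or an allocation fails. */
output_set *output_set_new(size_t n_pads, size_t size)
{
    if (n_pads == 0 || size == 0) return NULL;

    output_set *s = malloc(sizeof *s);
    if (s == NULL) return NULL;

    s->bufs = calloc(n_pads, sizeof *s->bufs);
    if (s->bufs == NULL) { free(s); return NULL; }
    s->count = n_pads;
    s->size = size;

    for (size_t i = 0; i < n_pads; i++) {
        s->bufs[i] = calloc(size, 1);
        if (s->bufs[i] == NULL) { output_set_free(s); return NULL; }
    }
    return s;
}

/* Returns 1 only when the number of requested pads equals the number
 * of outputs the library expects to write, 0 otherwise. */
int output_set_matches(const output_set *s, size_t expected_outputs)
{
    return s != NULL && s->count == expected_outputs;
}
```

In this model, a library that writes three outputs while only two pads were requested would index past the end of bufs. That is the kind of error the element does not detect, so a check like output_set_matches is left to the user.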

With its single-input/multiple-output (SIMO) topology, cudademux is a good fit for quick prototyping projects that need to interface GStreamer with a CUDA algorithm that takes one input and produces several outputs, for example a filter bank or the decomposition of an image into planes.
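As an illustration of a one-input, several-output algorithm, the sketch below decomposes a packed RGB frame into three planes. It is written in plain C on the CPU as a stand-in for a CUDA kernel, and the function name split_rgb_planes and its signature are hypothetical; they do not follow the GstCUDA library template, which this page does not document.

```c
#include <assert.h>
#include <stddef.h>

/* CPU stand-in for a one-input, three-output algorithm: copy each
 * channel of a packed RGB frame (R,G,B,R,G,B,...) into its own plane.
 * Returns 0 on success, -1 if an argument is NULL or the number of
 * output planes is not exactly 3. */
int split_rgb_planes(const unsigned char *rgb, size_t n_pixels,
                     unsigned char *const planes[], size_t n_planes)
{
    if (rgb == NULL || planes == NULL || n_planes != 3) return -1;
    for (size_t c = 0; c < 3; c++)
        if (planes[c] == NULL) return -1;

    for (size_t i = 0; i < n_pixels; i++) {
        planes[0][i] = rgb[3 * i];     /* red   */
        planes[1][i] = rgb[3 * i + 1]; /* green */
        planes[2][i] = rgb[3 * i + 2]; /* blue  */
    }
    return 0;
}
```

In a real CUDA library the loop body would typically become a kernel with one thread per pixel, and each plane would be delivered on its own source pad.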

The cudademux element can be viewed as a generic single-input/multiple-output video filter element that executes any custom CUDA algorithm provided by the user. This allows the user to develop several CUDA algorithms in parallel and test them all with the same cudademux element, simply by changing the element property that specifies which CUDA library to load when the pipeline runs.

Key features

Single input/multiple output pads filter element topology.
Dynamic loading of an external, compiled CUDA library containing the CUDA algorithm that runs on the GPU to process the incoming frames.
Independence between the GStreamer element and CUDA algorithm.
Generic GStreamer element that can execute custom CUDA algorithms.
Adaptability to many project requirements.
Ideal for quick prototyping and reducing time to market of project development.
High performance, due to the zero-memory-copy interface between CUDA and GStreamer.
Direct handling of NVMM memory type buffers.

Element properties description

Under construction

Element inspect

Under construction
