GstCUDA - cudamux
This page describes in detail the cudamux element of the GstCUDA plugin.
Description
Cudamux is a GStreamer video filter element with multiple input pads and a single output pad that lets the GPU process video frames with an algorithm from a custom CUDA library. Users develop their own CUDA processing library and pass it to cudamux, which executes it on the GPU: the element hands the upstream frames from each input pad to the GPU and pushes the modified frames downstream to the next element in the GStreamer pipeline.
This element executes the CUDA algorithm from a custom CUDA library (an XXX.so file) that is loaded dynamically at run time and specified through an element property. Because the CUDA algorithm is separate from the GStreamer element, the developer can modify the algorithm, recompile the custom CUDA library, and run the GStreamer pipeline again to test the changes. This cycle can be repeated as many times as needed to debug a custom CUDA algorithm. It makes cudamux well suited to quick prototyping, because the element can be adapted to many project requirements.
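The run-time loading described above can be illustrated with the POSIX dynamic loader. This is a minimal sketch of the general technique, not the code cudamux uses: the helper name `load_symbol`, the `algo_fn` signature, and the choice of the C library's `abs` as a stand-in entry point are all assumptions, since this page does not list the symbols cudamux expects a library to export.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical entry-point signature. abs() from the C library is used as a
 * stand-in; the real cudamux library interface is not described here. */
typedef int (*algo_fn)(int);

/* Open a shared library at run time and resolve one symbol from it.
 * A NULL lib_path refers to the running program and its loaded libraries.
 * Returns NULL if the library or the symbol cannot be found. On success,
 * *handle_out receives the handle to pass to dlclose() when done. */
static algo_fn load_symbol(const char *lib_path, const char *symbol,
                           void **handle_out)
{
    algo_fn fn = NULL;
    void *handle = dlopen(lib_path, RTLD_NOW);

    if (handle == NULL)
        return NULL;

    /* Assign through void ** to avoid a direct object-to-function cast. */
    *(void **)(&fn) = dlsym(handle, symbol);
    if (fn == NULL) {
        dlclose(handle);
        return NULL;
    }

    *handle_out = handle;
    return fn;
}
```

Because the library is resolved only when the program runs, recompiling the .so file and starting the program again is enough to pick up a changed algorithm; the loading code itself is never rebuilt.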
One key feature of this element is that it loads the CUDA algorithm used to process the incoming frames from an external, compiled custom CUDA library. This keeps the GStreamer element separate from the CUDA algorithm, so the developer does not have to worry about the GStreamer-CUDA interface or complex memory handling; cudamux takes care of both. Instead, the developer can focus on the custom CUDA algorithm and test any change made while debugging by recompiling the CUDA library and running the GStreamer pipeline again, without modifying, recompiling, or reinstalling the GstCUDA plugin. This considerably accelerates the prototyping stage and so helps reduce a project's time to market.
Another key feature of cudamux is its multiple-input/single-output pad topology, which makes the element adaptable to many project requirements. The element has one "Always" source pad and multiple "On request" sink pads. The user is responsible for requesting as many sink pads as the custom CUDA algorithm has inputs. Because the element is intended for quick prototyping, it does not detect a mismatch between the number of requested sink pads and the number of inputs the algorithm requires. Cudamux builds an array of inputs from the requested sink pads and passes it to the custom CUDA algorithm according to the template the custom CUDA library is expected to follow. For this reason, the user must make sure that the number of requested sink pads matches the number of inputs defined in the custom CUDA library to avoid errors.
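Since the element does not check the pad count, an algorithm can guard itself if its entry point receives the number of inputs. The sketch below shows that idea on the CPU with plain byte buffers. Everything in it is hypothetical: `input_frame`, `ALGO_NUM_INPUTS`, and `blend_inputs` are illustrative names, and whether the real library template passes an input count at all is an assumption not confirmed by this page.

```c
#include <stddef.h>

/* Hypothetical view of one input frame; not the GstCUDA buffer type. */
typedef struct {
    const unsigned char *data;
    size_t size;
} input_frame;

/* Number of inputs this hypothetical algorithm is written for. */
#define ALGO_NUM_INPUTS 2

/* Average two equally sized inputs byte by byte into out.
 * Returns 0 on success, or -1 if the number of inputs, a pointer,
 * or a buffer size does not match what the algorithm expects. */
static int blend_inputs(const input_frame *inputs, size_t num_inputs,
                        unsigned char *out, size_t out_size)
{
    size_t i;

    /* Reject a mismatch between supplied inputs and expected inputs. */
    if (inputs == NULL || out == NULL || num_inputs != ALGO_NUM_INPUTS)
        return -1;

    for (i = 0; i < num_inputs; i++) {
        if (inputs[i].data == NULL || inputs[i].size != out_size)
            return -1;
    }

    for (i = 0; i < out_size; i++)
        out[i] = (unsigned char)((inputs[0].data[i] + inputs[1].data[i]) / 2);

    return 0;
}
```

With this guard, calling the function with one input instead of two returns -1 instead of reading past the end of the input array.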
With its multiple-input/single-output (MISO) topology, cudamux is a good fit for quick-prototyping projects that need to connect GStreamer to a CUDA algorithm with several inputs and one output, for example image stitching, stereoscopic vision (3D vision), high-dynamic-range imaging (HDRI), and picture-in-picture overlays.
Cudamux can be viewed as a generic multiple-input/single-output video filter element that executes any custom CUDA algorithm provided by the user. This allows the user to develop different CUDA algorithms at the same time and test them with the same cudamux element, simply by changing the element property that specifies which CUDA library to load during pipeline execution.
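As an illustration of switching algorithms through a property, the sketch below builds a pipeline description of the kind accepted by `gst_parse_launch()` or `gst-launch-1.0`, in which only the `location` value and the number of requested `sink_%u` pads change. It is an untested sketch under assumptions: `build_pipeline` is a hypothetical helper, and the use of `videotestsrc`, `nvvidconv`, and `fakesink` around cudamux is a guess at a plausible test pipeline, not one taken from this page.

```c
#include <stdio.h>
#include <string.h>

/* Write a pipeline description with num_inputs test sources, each linked to
 * its own request pad (mux.sink_0, mux.sink_1, ...) on a cudamux instance
 * that loads lib_path through its "location" property.
 * Returns 0 on success, or -1 if buf is too small. */
static int build_pipeline(char *buf, size_t buf_size,
                          const char *lib_path, unsigned num_inputs)
{
    size_t used;
    unsigned i;
    int n;

    n = snprintf(buf, buf_size,
                 "cudamux name=mux location=%s ! fakesink", lib_path);
    if (n < 0 || (size_t)n >= buf_size)
        return -1;

    for (i = 0; i < num_inputs; i++) {
        used = strlen(buf);
        /* The sink pads accept I420 in NVMM memory, so each source is
         * converted before it is linked (elements here are assumptions). */
        n = snprintf(buf + used, buf_size - used,
                     " videotestsrc ! nvvidconv ! "
                     "video/x-raw(memory:NVMM),format=I420 ! mux.sink_%u", i);
        if (n < 0 || (size_t)n >= buf_size - used)
            return -1;
    }

    return 0;
}
```

Calling the helper twice with different library paths produces two descriptions that differ only in the `location` value, which mirrors how one cudamux element can be reused to test several algorithms.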
Key features
- Multiple-input/single-output (MISO) filter element topology.
- Dynamic loading of an external, compiled CUDA library that contains the CUDA algorithm executed on the GPU to process the incoming frames.
- Independence between the GStreamer element and the CUDA algorithm.
- Generic GStreamer element that can execute custom CUDA algorithms.
- Adaptability to many project requirements.
- Ideal for quick prototyping and for reducing a project's time to market.
- High performance, due to the zero-copy memory interface between CUDA and GStreamer.
- Direct handling of NVMM memory type buffers.
Documentation
Element inspect
$ gst-inspect-1.0 cudamux
Factory Details:
  Rank                     none (0)
  Long-name                cudamux
  Klass                    Muxer
  Description              Allows frames to be processed by the GPU using a custom CUDA library algorithm.
                           Multiple input single output topology filter element.
  Author                   Diego Chaverri <diego.chaverri@ridgerun.com>
                           Daniel Garbanzo <daniel.garbanzo@ridgerun.com>
                           Enrique Ramirez <enrique.ramirez@ridgerun.com>
                           Michael Gruner <michael.gruner@ridgerun.com>

Plugin Details:
  Name                     cuda
  Description              Allows frames to be processed by the GPU using a custom CUDA library algorithm
  Filename                 /usr/lib/aarch64-linux-gnu/gstreamer-1.0/libgstcuda.so
  Version                  0.3.1.1
  License                  Proprietary
  Source module            gst-cuda
  Source release date      2018-01-10 17:43 (UTC)
  Binary package           GStreamer CUDA Plug-in
  Origin URL               Unknown package origin

GObject
 +----GInitiallyUnowned
       +----GstObject
             +----GstElement
                   +----GstAggregator
                         +----GstCudaBaseMiso
                               +----GstCudaMux

Pad Templates:
  SINK template: 'sink_%u'
    Availability: On request
      Has request_new_pad() function: gst_aggregator_request_new_pad
    Capabilities:
      video/x-raw(memory:NVMM)
                 format: I420
                  width: [ 1, 2147483647 ]
                 height: [ 1, 2147483647 ]
              framerate: [ 0/1, 2147483647/1 ]

  SRC template: 'src'
    Availability: Always
    Capabilities:
      video/x-raw
                 format: I420
                  width: [ 1, 2147483647 ]
                 height: [ 1, 2147483647 ]
              framerate: [ 0/1, 2147483647/1 ]
      video/x-raw(memory:NVMM)
                 format: I420
                  width: [ 1, 2147483647 ]
                 height: [ 1, 2147483647 ]
              framerate: [ 0/1, 2147483647/1 ]

Element Flags:
  no flags set

Element Implementation:
  Has change_state() function: gst_aggregator_change_state

Element has no clocking capabilities.
Element has no URI handling capabilities.

Pads:
  SRC: 'src'
    Pad Template: 'src'

Element Properties:
  name                : The name of the object
                        flags: readable, writable
                        String. Default: "cudamux0"
  parent              : The parent of the object
                        flags: readable, writable
                        Object of type "GstObject"
  latency             : Additional latency in live mode to allow upstream to take longer to produce buffers for the current position (in nanoseconds)
                        flags: readable, writable
                        Integer64. Range: 0 - 9223372036854775807 Default: 0
  start-time-selection: Decides which start time is output
                        flags: readable, writable
                        Enum "GstAggregatorStartTimeSelection" Default: 0, "zero"
                           (0): zero             - Start at 0 running time (default)
                           (1): first            - Start at first observed input running time
                           (2): set              - Set start time with start-time property
  start-time          : Start time to use if start-time-selection=set
                        flags: readable, writable
                        Unsigned Integer64. Range: 0 - 18446744073709551615 Default: 18446744073709551615
  location            : Location of the CUDA algorithm library to load
                        flags: readable, writable
                        String. Default: null
  in-place            : Use in-place transform mode configuration
                        flags: readable, writable
                        Boolean. Default: false