Qualcomm Robotics RB5/RB6 - Image Processing and Acceleration Software
The Qualcomm Robotics RB5/RB6 platform offers multiple software tools and SDKs for multimedia processing, either on its CPU or accelerated by its dedicated hardware processing units, such as the GPU, DSP, and VPU. For more information on the hardware features and capabilities of these components, see the SoM Overview section. In this section, you will learn about the tools available for multimedia capture, processing, display, and graphics, as well as for machine learning and computer vision use cases.
Multimedia subsystems
Camera capture and encoding[1]
The cameras are interfaced through the Qualcomm Spectra 480 ISP, which can output YUV 4:2:0, Bayer, and RGB. The RB5/RB6 supports V4L2 drivers as well as the CamX platform for communicating directly with the camera and capturing camera sensor streams.
At the user level, the GStreamer source plugin qmmfsrc captures camera frames and can provide encoded (AVC/HEVC) bitstreams as well as raw YUV streams. Hardware encoding is done by the VPU, which supports up to 4K@120fps or 8K@30fps with native support for the H.265 Main 10, H.265 Main, H.264 High, and VP8 codecs. qmmfsrc works as a client of the MMF Server, which runs as a daemon in the system; the MMF Server communicates with Camera HAL3, the interface to CamX, to get the camera streams. An application can also bypass MMF and communicate with HAL3 directly, but in that case encoding is not handled by HAL3.
OMX hardware-accelerated H.264 encoding is also supported in GStreamer through the omxh264enc element.
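As a rough illustration of the capture path described above, the following Python sketch only builds a gst-launch-1.0 pipeline description; it does not run GStreamer. The qmmfsrc element name comes from the text, while h264parse, mp4mux, and filesink are standard GStreamer elements. The caps string, resolution, frame rate, and output path are assumptions chosen for illustration, and whether qmmfsrc negotiates these exact caps depends on the software release installed on the board.

```python
def capture_encode_pipeline(width=1920, height=1080, fps=30,
                            location="/data/capture.mp4"):
    """Return a pipeline description string (assumed caps, not executed here)."""
    # Requesting video/x-h264 caps asks the source for an encoded AVC stream
    # instead of raw YUV frames (assumption: the installed qmmfsrc accepts them).
    caps = f"video/x-h264,width={width},height={height},framerate={fps}/1"
    elements = [
        "qmmfsrc",                        # camera source named in the text
        caps,                             # caps filter selecting the encoded output
        "h264parse",                      # standard H.264 parser
        "mp4mux",                         # standard MP4 muxer
        f"filesink location={location}",  # hypothetical output path
    ]
    return " ! ".join(elements)


def gst_launch_argv(pipeline):
    """Return an argv list suitable for subprocess.run on the target board."""
    # -e makes gst-launch-1.0 send EOS on interrupt so the MP4 is finalized.
    return ["gst-launch-1.0", "-e"] + pipeline.split()


if __name__ == "__main__":
    print(" ".join(gst_launch_argv(capture_encode_pipeline())))
```

On the board, the resulting list could be passed to subprocess.run; on a development host it is only useful for inspecting the command.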
Video decoding[2]
The RB5/RB6 uses its Video Processing Unit (VPU) for hardware decoding at up to 4K@240fps or 8K@60fps, with native support for the H.265 Main 10, H.265 Main, H.264 High, VP9 profile 2, VP8, and MPEG-2 codecs. The GStreamer plugin for decoding is called qtivdec; it uses V4L2 IOCTLs to decode H.264 and H.265 bitstreams. It is mainly used for playback and transcoding. In the playback case, the decoder outputs GBM buffers that can be rendered by the Wayland display framework.
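The playback case can be sketched as a pipeline description built in Python. The qtivdec and waylandsink element names come from this article; filesrc, qtdemux, h264parse, and h265parse are standard GStreamer elements. The sketch assumes the input is an MP4 file (hence qtdemux), and the file path in the usage lines is hypothetical. The string is only constructed here, not executed.

```python
# Standard GStreamer parsers for the two bitstream types qtivdec is described as decoding.
PARSERS = {"h264": "h264parse", "h265": "h265parse"}


def playback_pipeline(location, codec="h264"):
    """Return a decode-and-display pipeline description for an MP4 file."""
    if codec not in PARSERS:
        raise ValueError(f"unsupported codec for this sketch: {codec}")
    elements = [
        f"filesrc location={location}",  # read the container from disk
        "qtdemux",                       # assumption: MP4/QuickTime container
        PARSERS[codec],                  # parse the elementary stream
        "qtivdec",                       # hardware decoder named in the text
        "waylandsink",                   # Wayland display sink named in the text
    ]
    return " ! ".join(elements)


if __name__ == "__main__":
    # Hypothetical clip path, for illustration only.
    print("gst-launch-1.0 " + playback_pipeline("/data/clip.mp4"))
```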
Display and graphics[3]
The display architecture is based on the Wayland protocol. The waylandsink element is the GStreamer plugin for display. It communicates with the Weston Server, a stand-alone process for composition and rendering. Weston Server uses the Graphics Buffer Manager (GBM) to communicate with the Direct Rendering Manager (DRM), a Linux kernel component for interfacing with the GPU. Composition is done mainly with OpenGL ES.
The X11 protocol is supported by this system.
Machine learning and artificial intelligence
Qualcomm Neural Processing SDK for AI[4]
The RB5/RB6 platform offers a software-accelerated, inference-only runtime engine for running deep neural networks. It can be used for the following:
- Running a deep neural network on the hardware the user chooses, such as the CPU, GPU, Hexagon DSP, or Hexagon Tensor Accelerator (HTA).
- Debugging the network's execution.
- Converting models from multiple frameworks, such as Caffe, Caffe2, ONNX, and TensorFlow, to DLC (Deep Learning Container), the SDK's native format.
The DLC files can be optimized with quantization or compression techniques. They can then be used to develop an application with the SDK's C++ or Java API, or be executed in GStreamer by the qtimlesnpe plugin. The plugin uses the SDK to offload the model to the CPU, GPU, DSP, or HTA and, before forwarding the buffers, can do post-processing such as overlays.
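The convert-then-quantize workflow above can be sketched as two command lines assembled in Python. The tool names (snpe-onnx-to-dlc, snpe-dlc-quantize) and their flags are written here as they are commonly documented for the Qualcomm Neural Processing SDK, but they should be treated as assumptions and checked against the SDK version in use. The file names are hypothetical, and nothing is executed by this sketch.

```python
from pathlib import PurePosixPath


def onnx_to_dlc_argv(onnx_path, dlc_path):
    """Command line that converts an ONNX model to DLC (flags are assumptions)."""
    return ["snpe-onnx-to-dlc",
            "--input_network", onnx_path,
            "--output_path", dlc_path]


def quantize_dlc_argv(dlc_path, input_list, quantized_path):
    """Command line that quantizes a DLC using a list of sample inputs."""
    return ["snpe-dlc-quantize",
            "--input_dlc", dlc_path,
            "--input_list", input_list,
            "--output_dlc", quantized_path]


def conversion_plan(onnx_path, input_list):
    """Return both commands, deriving the output names from the ONNX path."""
    src = PurePosixPath(onnx_path)
    dlc = str(src.with_suffix(".dlc"))
    quantized = str(src.with_name(src.stem + "_quantized.dlc"))
    return [onnx_to_dlc_argv(onnx_path, dlc),
            quantize_dlc_argv(dlc, input_list, quantized)]


if __name__ == "__main__":
    # Hypothetical model and input list, for illustration only.
    for argv in conversion_plan("models/net.onnx", "inputs.txt"):
        print(" ".join(argv))
```

Each list can be handed to subprocess.run on a host where the SDK tools are on the PATH.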
TensorFlow[5]
The RB5/RB6 platform supports accelerating TFLite models on the DSP, GPU, and CPU by using a ported NNAPI (Android Neural Networks API). The GStreamer plugin qtimletflite takes a TensorFlow model converted to TFLite and offloads its computation to the DSP, CPU, or GPU; the inference results are then returned to the plugin for post-processing.
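A camera-to-inference-to-display pipeline using the elements named in this article might be assembled as in the sketch below. Only the element names qmmfsrc, qtimletflite, and waylandsink come from the text. The raw NV12 caps are an assumption, and the plugin's properties are deliberately left to the caller because their names vary by release; the model property and its path in the usage lines are hypothetical. The string is only built, not run.

```python
def tflite_pipeline(ml_props=None, width=1280, height=720, fps=30):
    """Return a pipeline description: camera -> TFLite plugin -> display."""
    # Raw YUV 4:2:0 (NV12) caps so the ML plugin receives unencoded frames
    # (assumption: these caps are accepted by the installed qmmfsrc).
    caps = (f"video/x-raw,format=NV12,width={width},"
            f"height={height},framerate={fps}/1")
    # Caller-supplied properties; no property names are assumed by this function.
    props = " ".join(f"{key}={value}" for key, value in (ml_props or {}).items())
    ml_element = f"qtimletflite {props}".strip()
    return " ! ".join(["qmmfsrc", caps, ml_element, "waylandsink"])


if __name__ == "__main__":
    # "model" and the path below are hypothetical; check the plugin's
    # properties with gst-inspect-1.0 on the target.
    print("gst-launch-1.0 " + tflite_pipeline({"model": "/data/detect.tflite"}))
```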
Computer Vision[6]
The RB5/RB6 platform includes the Qualcomm Computer Vision SDK (FastCV), a computer vision library that developers can use in their advanced CV applications. Some of the features it enables are:
- Gesture recognition
- Face detection, tracking, and recognition
- Text recognition and tracking
- Augmented reality
For more information about the features of FastCV, see FastCV API.
DSP
The Qualcomm® Hexagon™ SDK allows developers on the RB5/RB6 platform to access the resources of the Hexagon DSP to make multimedia processing smoother, reduce latency, and improve overall performance in a heterogeneous computing environment[7]. Some of the features the Hexagon SDK provides are the following[8]:
- High-level programming access to the Hexagon Tensor Accelerator through the Hexagon-HTA-NN API, for processing fixed-point deep convolutional neural network models.
- Shared libraries for run-time applications, for example, the FastRPC framework.
- A Compute add-on, which contains libraries and tools for data, image, and video processing.
- An Audio add-on for developing general audio-related modules and applications focused on signal processing.
- A Qualcomm AI Stack add-on with tools to optimize the machine learning runtime, as well as optimizations for the Qualcomm Sensing Hub for low-power, low-memory AI applications.
For more information on the hardware features and capabilities of the DSP, see the SoM Overview section.
References
- Camera Capture/Encode. Retrieved February 8, 2023, from [1]
- Video decode. Retrieved February 8, 2023, from [2]
- Display and Graphics. Retrieved February 8, 2023, from [3]
- Qualcomm Neural Processing SDK. Retrieved February 8, 2023, from [4]
- TensorFlow. Retrieved February 8, 2023, from [5]
- Qualcomm Computer Vision SDK. Retrieved February 8, 2023, from [6]
- Qualcomm Hexagon SDK. Retrieved February 8, 2023, from [7]
- Qualcomm Hexagon SDK Tools and Resources. Retrieved February 13, 2023, from [8]