V4L2 FPGA - Introduction - Overview

From RidgeRun Developer Wiki
Revision as of 15:04, 17 July 2019 by Lleon (talk | contribs) (Making the subsystems more detailed)







Project Structure

This project consists of three subsystems that allow algorithms to be accelerated on custom hardware, as shown in the following image:

V4L2 Data Flow

Frame Sink: allows sending video frames from a user application to an FPGA device connected through PCIe. Architecturally, a design could consist of only a Frame Sink and a hardware accelerator, for example to drive a display or a set of devices that interact with the incoming video stream.

HW accelerator: this subsystem processes frames with complex algorithms. Algorithms implemented in hardware can consume less power, run faster, and exploit massive parallelism. An example of a HW accelerator is a demosaicing accelerator, which involves three convolutions running simultaneously at 8 pixels per clock each.

Frame Grabber: allows capturing frames from an FPGA device connected through PCIe. A design can consist of this subsystem alone, which makes it possible to connect a camera directly to the FPGA and perform deserialization, decoding, and demosaicing without adding any overhead to the system processor, which receives the frames ready for consumption.
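To put the "8 pixels per clock" figure from the HW accelerator example in perspective, the following minimal sketch estimates the frame rate that a fully pipelined accelerator could sustain at that rate. The 100 MHz clock and the 1920x1080 frame size are illustrative assumptions, not values taken from this project.

```python
def frames_per_second(pixels_per_clock, clock_hz, width, height):
    """Upper bound on frame rate for a pipeline that accepts
    `pixels_per_clock` pixels on every clock cycle."""
    pixels_per_second = pixels_per_clock * clock_hz
    return pixels_per_second / (width * height)


# Assumed values, for illustration only.
CLOCK_HZ = 100_000_000        # hypothetical 100 MHz fabric clock
WIDTH, HEIGHT = 1920, 1080    # hypothetical 1080p frame

fps = frames_per_second(8, CLOCK_HZ, WIDTH, HEIGHT)
print(f"{fps:.1f} frames/s")  # prints "385.8 frames/s"
```

This is an ideal upper bound: it ignores blanking, memory stalls, and PCIe transfer time, all of which lower the rate achieved in practice.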

FPGA

A Field Programmable Gate Array (FPGA) is an integrated circuit that can be configured after manufacturing. An FPGA allows computationally heavy algorithms to be described as hardware and offers massive parallelism, which enables multi-pipeline architectures and vector computing.

One of the key strengths of an FPGA is its ability to implement software algorithms in hardware, which can run at one computation per clock cycle or more, depending on the parallelization techniques applied in the hardware description. In the field of image processing, custom applications can be described as hardware on an FPGA. This makes it possible to accelerate algorithms aggressively and achieve better performance in terms of GFLOPS/Watt, resulting in high-performance computation with lower power consumption than GPUs.

Another main advantage of the FPGA is that it can be reconfigured on demand, making it possible to swap accelerators as they are required.

PCIe

PCI Express (PCIe) is a high-speed communication standard. PCIe slots can contain multiple lanes, which increases throughput by transmitting data over the lanes in parallel. PCIe is the common interface for devices with high bandwidth requirements such as GPUs, Wi-Fi cards, solid-state disks and, now, FPGAs.

Version     Bandwidth (per lane)    Bandwidth (x16 slot)
PCIe 1.0    2 Gbit/s                32 Gbit/s
PCIe 2.0    4 Gbit/s                64 Gbit/s
PCIe 3.0    7.877 Gbit/s            126.03 Gbit/s
PCIe 4.0    15.754 Gbit/s           252.06 Gbit/s
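The per-lane figures in the table can be approximated from each generation's raw transfer rate and its line encoding. The sketch below shows that arithmetic. The transfer rates and encoding ratios are not stated in the text above: they are taken from the PCI Express specifications (8b/10b encoding for generations 1.0 and 2.0, 128b/130b for 3.0 and 4.0), and the function names are hypothetical. Results can differ from published tables in the last digits because of rounding.

```python
# version: (raw transfer rate in GT/s, payload bits, total bits per symbol group)
PCIE_GENERATIONS = {
    "PCIe 1.0": (2.5, 8, 10),      # 8b/10b encoding
    "PCIe 2.0": (5.0, 8, 10),      # 8b/10b encoding
    "PCIe 3.0": (8.0, 128, 130),   # 128b/130b encoding
    "PCIe 4.0": (16.0, 128, 130),  # 128b/130b encoding
}


def lane_bandwidth_gbit(version):
    """Usable bandwidth of one lane in Gbit/s, after encoding overhead."""
    rate, payload_bits, total_bits = PCIE_GENERATIONS[version]
    return rate * payload_bits / total_bits


def link_bandwidth_gbit(version, lanes):
    """Usable bandwidth of a link with the given number of lanes."""
    return lane_bandwidth_gbit(version) * lanes


for version in PCIE_GENERATIONS:
    per_lane = lane_bandwidth_gbit(version)
    x16 = link_bandwidth_gbit(version, 16)
    print(f"{version}: {per_lane:.3f} Gbit/s per lane, {x16:.2f} Gbit/s at x16")
```

These numbers cover only the encoding overhead. Packet headers and flow control reduce the throughput that an application actually observes.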

PCIe-compatible cards also come in a Mini Card form factor, which has a more flexible physical specification for connecting to the PCIe bus. One example of these cards is the PicoEVB board, which allows connecting an FPGA to laptops or embedded systems.

V4L2

Video4Linux is a collection of drivers and a common API for supporting real-time video capture on Linux systems.

V4L2 provides a video capture interface, which gets video data from a tuner or camera device, and a video output interface, which sends video images out to a device.

The API also lets applications discover a given device's capabilities and configure the device to operate in the desired manner. The configurable settings include cropping, frame rates, video compression, image parameters, video formats, etc.
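As an illustration of capability discovery, the following minimal sketch issues the standard VIDIOC_QUERYCAP ioctl and decodes a few capability flags. It assumes a Linux system; the `/dev/video0` path is a placeholder, since the device node registered by a given driver may differ. The helper names `parse_capability` and `query_capabilities` are hypothetical and are not part of V4L2 or of this project.

```python
import fcntl
import os
import struct

# ioctl request code: _IOR('V', 0, struct v4l2_capability)
VIDIOC_QUERYCAP = 0x80685600

# struct v4l2_capability: driver[16], card[32], bus_info[32],
# version, capabilities, device_caps, reserved[3]
CAP_FORMAT = "16s32s32sIII3I"

# A subset of the V4L2_CAP_* flags from <linux/videodev2.h>
CAP_FLAGS = {
    0x00000001: "VIDEO_CAPTURE",
    0x00000002: "VIDEO_OUTPUT",
    0x01000000: "READWRITE",
    0x04000000: "STREAMING",
}


def _text(field):
    """Decode a NUL-padded C string field."""
    return field.split(b"\0", 1)[0].decode(errors="replace")


def parse_capability(raw):
    """Decode the bytes of a struct v4l2_capability into a dict."""
    driver, card, bus, _version, caps, *_rest = struct.unpack(CAP_FORMAT, raw)
    return {
        "driver": _text(driver),
        "card": _text(card),
        "bus_info": _text(bus),
        "capabilities": [name for bit, name in CAP_FLAGS.items() if caps & bit],
    }


def query_capabilities(path="/dev/video0"):
    """Ask the driver behind `path` for its capabilities."""
    buf = bytearray(struct.calcsize(CAP_FORMAT))
    fd = os.open(path, os.O_RDWR)
    try:
        fcntl.ioctl(fd, VIDIOC_QUERYCAP, buf)  # the kernel fills buf in place
    finally:
        os.close(fd)
    return parse_capability(bytes(buf))


if __name__ == "__main__" and os.path.exists("/dev/video0"):
    try:
        print(query_capabilities())
    except OSError as err:
        print(f"could not query device: {err}")
```

A device that reports VIDEO_CAPTURE can act as a capture source, and one that reports VIDEO_OUTPUT accepts frames from the application.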

