FPGA Image Signal Processor - FPGA ISP Accelerators - Geometric Transformation Unit

Introduction

The FPGA-ISP Geometric Transformation Unit (GTU) Accelerator is an FPGA accelerator that applies a geometric transformation described by an augmented transformation matrix. With the GTU, it is possible to scale, rotate, and translate the input image. The output image can therefore be larger or smaller than the input image.
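
As an illustration of how scaling, rotation, and translation combine into one augmented matrix, here is a minimal Python sketch. It assumes the augmented matrix is the usual 3x3 homogeneous form for 2D transforms; the helper names (matmul3, scaling, rotation, translation, apply) are hypothetical and are not part of the accelerator's API.

```python
import math

def matmul3(a, b):
    """Multiply two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def scaling(sx, sy):
    return [[sx, 0.0, 0.0], [0.0, sy, 0.0], [0.0, 0.0, 1.0]]

def rotation(theta_rad):
    c, s = math.cos(theta_rad), math.sin(theta_rad)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def translation(tx, ty):
    return [[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]]

def apply(m, x, y):
    """Transform the point (x, y) using homogeneous coordinates (w = 1)."""
    xt = m[0][0] * x + m[0][1] * y + m[0][2]
    yt = m[1][0] * x + m[1][1] * y + m[1][2]
    return xt, yt

# Scale by 2, then rotate by 90 degrees, then translate by (10, 5).
# The rightmost matrix in the product is applied first.
M = matmul3(translation(10.0, 5.0),
            matmul3(rotation(math.pi / 2), scaling(2.0, 2.0)))
```

Because the three operations collapse into a single matrix, only that one matrix needs to be handed over, regardless of how many elementary transforms were chained.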

The interpolation can be performed with one of the following interpolators:

Nearest Neighbor
Bilinear (still under development)
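
To make the difference between the two interpolators concrete, the sketch below samples a small grayscale image at a fractional position. It is a floating-point illustration of the textbook definitions, not the accelerator's implementation; the function names are hypothetical and the coordinates are assumed to lie inside the image.

```python
import math

def sample_nearest(img, x, y):
    """Return the value of the input pixel closest to (x, y)."""
    return img[int(math.floor(y + 0.5))][int(math.floor(x + 0.5))]

def sample_bilinear(img, x, y):
    """Blend the four pixels surrounding (x, y) by their distance weights."""
    h, w = len(img), len(img[0])
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    # Clamp the right/bottom neighbors so border samples stay in range.
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
    bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
    return top * (1 - fy) + bottom * fy

img = [[0, 100],
       [100, 200]]
nn = sample_nearest(img, 0.4, 0.4)   # picks the top-left pixel
bi = sample_bilinear(img, 0.5, 0.5)  # averages all four pixels
```

Nearest Neighbor reads one input pixel per output pixel, whereas Bilinear reads four and needs extra multiplications, which is one plausible reason for the lower Bilinear framerates reported in the benchmarks.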

All computations, including the representation of the transformation matrix, use the fixed-point format Qs20,12.
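
The following sketch shows what working in such a fixed-point format looks like in practice. It assumes that the ",12" in Qs20,12 denotes 12 fractional bits; the total word width is not modeled, and the helper names are hypothetical.

```python
FRAC_BITS = 12            # assumption: 12 fractional bits
ONE = 1 << FRAC_BITS      # fixed-point representation of 1.0 (4096)

def to_fixed(value):
    """Quantize a real number to the nearest representable fixed-point value."""
    return int(round(value * ONE))

def to_float(raw):
    """Convert a raw fixed-point integer back to a Python float."""
    return raw / ONE

def fixed_mul(a, b):
    """Multiply two fixed-point values.

    The raw product carries 2 * FRAC_BITS fractional bits, so it is shifted
    right to return to FRAC_BITS. Python's >> floors, also for negatives.
    """
    return (a * b) >> FRAC_BITS

half = to_fixed(0.5)                                # raw value 2048
three = fixed_mul(to_fixed(1.5), to_fixed(2.0))     # raw value 12288, i.e. 3.0
tenth_error = abs(to_float(to_fixed(0.1)) - 0.1)    # quantization error of 0.1
```

Values such as 0.5 are represented exactly, while values such as 0.1 are rounded to the nearest multiple of 1/4096.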

Supported caps

The FPGA-ISP GTU Accelerator is capable of managing the following image properties:

Input

Min resolution: 8x8
Max resolution¹: 360x360 (GRAY8), 90x90 (ARGB)
Formats: GRAY8 (8-bit Grayscale), ARGB (32-bit Color RGB)

¹ The maximum resolution depends on the RAM size and the number of channels. It is computed as:

MAX_DIMENSION = SQRT(RAM_SIZE) / NUMBER_CHANNELS

Where the dimension is either width or height.
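
As a worked example of this formula, the sketch below evaluates it for one and four channels. The RAM_SIZE value of 129600 is hypothetical: it is chosen only because it reproduces the maximum input resolutions listed above, and the unit of RAM_SIZE is not stated in this page. Rounding down to an integer is also an assumption.

```python
import math

def max_dimension(ram_size, number_channels):
    """MAX_DIMENSION = SQRT(RAM_SIZE) / NUMBER_CHANNELS, rounded down."""
    return math.isqrt(ram_size) // number_channels

RAM_SIZE = 129600  # hypothetical value (360 * 360), for illustration only

gray8_max = max_dimension(RAM_SIZE, 1)  # 360 -> 360x360 for GRAY8
argb_max = max_dimension(RAM_SIZE, 4)   # 90  -> 90x90 for ARGB
```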

Output

Min resolution: 8x8
Max resolution²: 4096x2160 (Nearest Neighbor), 2047x2047 (Bilinear)
Formats (same as input): GRAY8 (8-bit Grayscale), ARGB (32-bit Color RGB)

² Depending on the interpolator: 4096x2160 applies to the Nearest Neighbor interpolator, whereas 2047x2047 applies to the Bilinear interpolator.

Algorithm overview

The FPGA-ISP GTU performs all computations in fixed-point arithmetic in order to reduce the area and the latency of the accelerator. In addition, the GTU can be used as a module whose interpolator is passed as a template parameter.

It is composed of two main modules:

Geometric transformer: Maps each output point back to an input point using the inverse of the transformation matrix. The matrix is passed already inverted, so no FPGA area is spent on the inversion.

Interpolator: Passed as a template parameter. It interpolates the input points requested by the geometric transformer, computes the output pixel value, and sends it through the output stream.
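
Since the matrix must be inverted before it reaches the accelerator, the inversion has to happen on the host side. A minimal sketch of that step is shown below, assuming an affine 3x3 matrix whose last row is [0, 0, 1]; the function name is hypothetical, and converting the result to fixed-point is left out.

```python
def invert_affine(m):
    """Invert a 3x3 affine matrix [[a, b, tx], [c, d, ty], [0, 0, 1]]."""
    a, b, tx = m[0]
    c, d, ty = m[1]
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular and cannot be inverted")
    # Inverse of the 2x2 linear part.
    ia, ib = d / det, -b / det
    ic, id_ = -c / det, a / det
    # Inverse translation is -(A^-1 * t).
    itx = -(ia * tx + ib * ty)
    ity = -(ic * tx + id_ * ty)
    return [[ia, ib, itx], [ic, id_, ity], [0.0, 0.0, 1.0]]

# Forward transform: scale by 2 and translate by (10, 5).
forward = [[2.0, 0.0, 10.0],
           [0.0, 2.0, 5.0],
           [0.0, 0.0, 1.0]]
inverse = invert_affine(forward)
```

For this forward matrix, the inverse scales by 0.5 and translates by (-5, -2.5), so the output point (10, 5) maps back to the input origin.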

The following diagram shows how the different modules interact to form the GTU:

At the accelerator level, the image should be uploaded to the local RAM (if the FPGA doesn't have one, a small RAM is created using Block RAM). Once the image is fully loaded, the geometric transformer starts to request points in order to create the output sequentially. The interpolator receives the points mapped to the input to interpolate the pixel value at the requested position.
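
The data flow described above can be modeled in software as follows. This is a floating-point sketch with a Nearest Neighbor interpolator, not the accelerator's code: the function name, the nested-list image layout, and the fill value used for points that fall outside the input are all assumptions.

```python
import math

def warp_nearest(src, out_w, out_h, inv, fill=0):
    """Build an out_h x out_w image by inverse-mapping every output pixel.

    src is the fully loaded input image (list of rows) and inv is the
    already inverted 3x3 transformation matrix.
    """
    in_h, in_w = len(src), len(src[0])
    out = []
    for yo in range(out_h):          # output pixels are produced sequentially
        row = []
        for xo in range(out_w):
            # Geometric transformer: output coordinate -> input coordinate.
            xi = inv[0][0] * xo + inv[0][1] * yo + inv[0][2]
            yi = inv[1][0] * xo + inv[1][1] * yo + inv[1][2]
            # Interpolator (Nearest Neighbor): pick the closest input pixel.
            xn = int(math.floor(xi + 0.5))
            yn = int(math.floor(yi + 0.5))
            if 0 <= xn < in_w and 0 <= yn < in_h:
                row.append(src[yn][xn])
            else:
                row.append(fill)     # assumption: out-of-range reads give fill
        out.append(row)
    return out

src = [[1, 2],
       [3, 4]]
# Inverse of "translate right by one pixel": subtract 1 from x.
shift_inv = [[1.0, 0.0, -1.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
shifted = warp_nearest(src, 2, 2, shift_inv)
```

Iterating over output pixels rather than input pixels guarantees that every output position receives exactly one value, which is why the inverse matrix is the one that is needed.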

Example pipelines

In combination with RidgeRun's V4L2-FPGA, it is possible to create a V4L2 interface with GStreamer support, which makes it even easier to build computer vision applications for embedded systems. Here are some example pipelines to test the FPGA-ISP GTU Accelerator.

Generator (Accelerator input)

gst-launch-1.0 videotestsrc ! video/x-raw,width=256,height=144 ! v4l2sink device=/dev/video2 -v

Sink (Accelerator output)

gst-launch-1.0 v4l2src device=/dev/video1 ! "video/x-raw, width=640, height=480" ! perf ! videoconvert ! xvimagesink

Benchmarks

Table 1. Typical framerate, in frames per second, of the FPGA-ISP GTU Accelerator. Based on [1]
Resolution	Maximum framerate (GRAY8-NN)	Maximum framerate (GRAY8-BI)	Maximum framerate (ARGB-NN)
4k	30.03	N/A	15.23
1080p	111.23	7.53	56.3
720p	231.45	16.70	121.46

These framerates were measured with the following setup:

System: NVIDIA Jetson Xavier
FPGA: PicoEVB (Artix 7 XC7A50T CSG325 -2l)
OS: Ubuntu 18.04
PCI-e: v2.0 - 1 lane

The following configurations were used:

-- GRAY8 --
Interpolator: Nearest Neighbor (NN) and Bilinear (BI)
Input: 256x144
-- ARGB --
Interpolator: Nearest Neighbor
Input: 72x72

You can reproduce these results by using the following pipelines:

Generator (Accelerator input)

gst-launch-1.0 videotestsrc ! video/x-raw,format=GRAY8,width=256,height=144 ! v4l2sink device=/dev/video2 -v

Sink (Accelerator output)

gst-launch-1.0 v4l2src device=/dev/video1 ! "video/x-raw, width=640, height=480,format=GRAY8" ! perf ! fakesink sync=false

Known issues

1. GStreamer autonegotiation: The caps, such as width, height, and format, must be specified in the pipeline.

2. Numerical precision: Since fixed-point arithmetic has lower precision than floating-point arithmetic, the results might differ from those of a floating-point implementation.
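
A small numerical example of this effect is sketched below. It assumes 12 fractional bits and truncation toward the lower integer when a mapped coordinate is converted to a pixel index; both are assumptions about the arithmetic, and the function names are hypothetical.

```python
import math

FRAC_BITS = 12  # assumption: 12 fractional bits

def map_float(scale, x):
    """Map an integer coordinate with a floating-point coefficient."""
    return math.floor(scale * x)

def map_fixed(scale, x):
    """Map the same coordinate with the coefficient quantized to fixed-point."""
    k = round(scale * (1 << FRAC_BITS))   # 0.7 becomes 2867 / 4096
    return (k * x) >> FRAC_BITS           # keep only the integer part

float_index = map_float(0.7, 4003)   # 2802
fixed_index = map_fixed(0.7, 4003)   # 2801
```

The quantized coefficient is slightly smaller than 0.7, and that error grows with the coordinate until, near the right edge of a wide image, the mapped index lands one pixel away from the floating-point result.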


[1] https://developer.ridgerun.com/wiki/index.php?title=V4L2_FPGA/Examples/Pass_Through