Pruning for AI on FPGA


Introduction

Pruning is a model optimisation technique that removes weights that do not significantly impact the inference result. It reduces the storage and computation required for inference tasks. However, exploiting the resulting sparsity requires specialised data structures and hardware.
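For illustration, a minimal magnitude-pruning sketch in Python (the weight values and the threshold are assumptions made up for the example):

def prune_by_magnitude(weights, threshold):
    """Zero out weights whose absolute value falls below the threshold."""
    return [[w if abs(w) >= threshold else 0.0 for w in row]
            for row in weights]

# Hypothetical 3x3 weight matrix; the small weights are pruned away
weights = [[0.01, 2.0, -0.03],
           [1.0, 0.002, 0.0],
           [-0.04, 0.0, 5.0]]
pruned = prune_by_magnitude(weights, threshold=0.1)
# pruned == [[0.0, 2.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 5.0]]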

This wiki explains sparse data representations, acceleration considerations, and how an FPGA can be used to accelerate sparse operations.

Data Representations

Sparse matrix representations are used for matrix computations on pruned models: rather than storing every element, they encode only the non-zero values together with their row and column positions. However, there are multiple ways to encode the rows, columns and values:

Coordinate (COO)

The coordinate (COO) format encodes a dense matrix as three arrays:

  • Row indices
  • Column indices
  • Values

The three arrays have the same length and are accessed with a shared index: row_indices[i], column_indices[i] and values[i] together describe the i-th non-zero element.
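As a sketch of how the encoding works (plain Python, with illustrative names), a dense matrix can be converted to COO by recording the coordinates of every non-zero entry:

def dense_to_coo(dense):
    """Encode a dense matrix (list of lists) as three COO arrays."""
    row_indices, column_indices, values = [], [], []
    for r, row in enumerate(dense):
        for c, value in enumerate(row):
            if value != 0:  # only non-zero entries are stored
                row_indices.append(r)
                column_indices.append(c)
                values.append(value)
    return row_indices, column_indices, values

dense = [[0, 2, 0],
         [1, 0, 0],
         [0, 0, 5]]
rows, cols, vals = dense_to_coo(dense)
# rows == [0, 1, 2], cols == [1, 0, 2], vals == [2, 1, 5]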

Figure: COO Addressing

To recover the position of the i-th non-zero element in the flattened (row-major) dense matrix, where leading_dimension is the number of columns of the dense matrix, the following pseudocode can be used:

index_1d = row_indices[i] * leading_dimension + column_indices[i]
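For instance, a minimal sketch that applies this addressing to decode the COO arrays from the previous example back into a flat, row-major dense matrix (the helper name is illustrative):

def coo_to_dense(row_indices, column_indices, values, num_rows, num_cols):
    """Decode COO arrays into a dense matrix stored as a flat row-major list."""
    leading_dimension = num_cols  # row-major: stride between consecutive rows
    flat = [0] * (num_rows * num_cols)
    for i in range(len(values)):
        index_1d = row_indices[i] * leading_dimension + column_indices[i]
        flat[index_1d] = values[i]
    return flat

flat = coo_to_dense(rows, cols, vals, num_rows=3, num_cols=3)
# flat == [0, 2, 0, 1, 0, 0, 0, 0, 5]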


RidgeRun Services

RidgeRun has expertise in offloading processing algorithms to FPGAs, from Image Signal Processing to AI acceleration. Our services include:

  • Algorithm Acceleration using FPGAs.
  • Image Signal Processing IP Cores.
  • Linux Device Drivers.
  • Low Power AI Acceleration using FPGAs.
  • Accelerated C++ Applications.

And much more. Contact us at https://www.ridgerun.com/contact.


