NVIDIA Jetson Xavier - Deep Learning Accelerator

From RidgeRun Developer Wiki



Previous: Deep Learning/TensorRT/Building Examples Index Next: Deep Learning/Deep Learning Accelerator/Software Support








The NVIDIA Deep Learning Accelerator (NVDLA) is a fixed function engine used to accelerate inference operations on convolution neural networks (CNNs). You will learn about the software available to work with the Deep Learning Accelerator and how to use it to create your own projects. If you want to learn more about this engine, its hardware, and the open-source project NVDLA please refer to the NVDLA Processor wiki page.

NVDLA software design is grouped into two groups: the compilation tools (model conversion), and the runtime environment (run-time software to load and execute networks on NVDLA). The general flow of this is as shown in the figure below, and each of these is described below.


Front view of NVIDIA Jetson Xavier developer kit

Compilation Tools

Compilation tools include a compiler, parser, and optimizer.

  • Parser – Reads a pre-trained Caffe model and an intermediate representation to pass to the compiler.
  • Compiler – creates a sequence of hardware layers optimized for an NVDLA configuration.

These steps can be performed offline on the host device and then deploy the loadable model on Xavier's NVDLAs. Knowing the specific NVDLA hardware configuration enables the compiler to further optimize inference. However, the specific NVDLA configuration on the Tegra Xavier is not yet available.

Runtime Environment

The runtime environment involves running a model on compatible NVDLA hardware. It is effectively divided into two layers:

  • User Mode Driver – The main interface with user-mode programs. After parsing the neural network compiler compiles the network layer by layer and converts it into a file format called NVDLA Loadable. User mode runtime driver loads this loadable and submits inference job to Kernel Mode Driver
  • Kernel Mode Driver – Consists of drivers and firmware that do the work of scheduling layer operations on NVDLA and programming the NVDLA registers to configure each functional block.

Selecting which DLA to use

You can select which of the two DLAs to use with IBuilder::setDLACore() and IRuntime::setDLACore()



Previous: Deep Learning/TensorRT/Building Examples Index Next: Deep Learning/Deep Learning Accelerator/Software Support