RidgeRun NVIDIA PVA Development Algorithms

From RidgeRun Developer Wiki










PVA Algorithms from LibPVA

RidgeRun has implemented the following image processing algorithms on the PVA. These are foundational for image signal processing (ISP) pipelines and optimized for high efficiency.


Info
Currently, these algorithms are intended only for performance evaluation and not for production use. Stay tuned for more!


All measurements were taken under the following conditions:

  • Platform: Jetson AGX Orin 32GB
  • OS: JetPack 6.2
  • Power Profile: MAXN power mode + Jetson Clocks
  • CPU: all measurements use a single ARM core
  • PVA: all measurements use a single VPS (half of the PVA)
  • Power Measurements: using jetson-stats (a tool based on tegrastats)


Info
The algorithms are still under development and represent a first iteration. PVA execution times are expected to improve further, with an additional speedup of roughly 2-3x.


Bit Shifting (Debayering Resolution Downscaling)

This technique reduces per-pixel resolution (bit depth) through controlled bit shifting during debayering. It is useful for optimizing bandwidth or matching the resolution required by downstream stages.

Average performance measurements are shown in the following table for the most common resolutions. Measurements are shown for an optimized implementation of the algorithm and all results are in milliseconds. Additionally, power consumption measurements are shown in watts. A shift of 10 bits was used for the benchmarks. Performance measurements can also be observed in the attached graph.

Bit Shifting execution time and power consumption

| Resolution | Execution time CPU (ms) | Execution time PVA (ms) | Power consumption CPU only (W) | Power consumption CPU and PVA (W) |
|---|---|---|---|---|
| 1280x720 | 0.265 | 0.1465 | 14.8 | 16.89 |
| 1920x1080 | 0.59 | 0.272 | 15.37 | 17.03 |
| 3840x2160 | 2.364 | 0.96578 | 15.33 | 17.41 |

Fig 1. Bit shifting execution time

In this benchmark, the algorithm downscales a single-channel image from 16-bit to 8-bit samples.
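As a rough CPU-side illustration of the operation (not RidgeRun's PVA implementation), the 16-bit to 8-bit reduction can be sketched as a per-sample right shift with saturation. The function name `shift_to_8bit`, the flat row-major buffer layout, and the saturation step are assumptions made for this sketch; the shift amount is left as a parameter.

```python
def shift_to_8bit(pixels, shift):
    """Right-shift each 16-bit sample and saturate the result to 8 bits."""
    out = bytearray(len(pixels))
    for i, value in enumerate(pixels):
        # Saturation only matters when shift < 8 (assumed behavior).
        out[i] = min(value >> shift, 0xFF)
    return out


# Hypothetical 2x2 single-channel frame, row-major, 16-bit samples.
frame16 = [0, 1024, 32768, 65535]
frame8 = shift_to_8bit(frame16, shift=8)
print(list(frame8))  # [0, 4, 128, 255]
```

A shift of 8 maps the full 16-bit range onto 8 bits; larger shifts, such as the 10 bits used in the benchmarks, discard more low-order bits.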

Radial Lens Shading Correction

Corrects vignetting or intensity falloff from the center to the edges of an image caused by lens characteristics. It’s implemented using radial correction maps that are efficiently processed on the PVA.

Average performance measurements are shown in the following table for the most common resolutions. Measurements are shown for an optimized implementation of the algorithm and all results are in milliseconds. Additionally, power consumption measurements are shown in watts. Performance measurements can also be observed in the attached graph.

Radial Lens Shading correction execution time and power consumption

| Resolution | Execution time CPU (ms) | Execution time PVA (ms) | Power consumption CPU only (W) | Power consumption CPU and PVA (W) |
|---|---|---|---|---|
| 1280x720 | 1.531 | 0.678 | 15.65 | 16.65 |
| 1920x1080 | 3.475 | 1.490 | 15.83 | 16.77 |
| 3840x2160 | 13.837 | 5.832 | 15.85 | 16.62 |

Fig 2. Radial Lens Shading correction execution time

The measurements used:

  • 8-bit fixed-point correction maps (including channels)
  • RGB images (RGB24), 8 bits per channel
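To make the idea of a radial correction map concrete, here is a minimal pure-Python sketch. It is not RidgeRun's implementation: the quadratic gain model `1 + k * (r / r_max)^2`, the Q2.6 fixed-point format (`FRAC_BITS = 6`), the single map shared by all three channels, and the names `build_gain_map` and `correct_rgb24` are all assumptions chosen for illustration.

```python
FRAC_BITS = 6  # assumed Q2.6 format: 64 represents a gain of 1.0


def build_gain_map(width, height, k):
    """Row-major list of 8-bit fixed-point gains, 1 + k * (r / r_max)^2."""
    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0
    r_max_sq = (cx * cx + cy * cy) or 1.0  # avoid dividing by zero for 1x1
    gains = []
    for y in range(height):
        for x in range(width):
            r_sq = (x - cx) ** 2 + (y - cy) ** 2
            gain = 1.0 + k * r_sq / r_max_sq
            # Quantize to 8 bits and saturate at the largest representable gain.
            gains.append(min(round(gain * (1 << FRAC_BITS)), 255))
    return gains


def correct_rgb24(rgb, gains):
    """Scale every channel of an interleaved RGB24 buffer by its pixel's gain."""
    out = bytearray(len(rgb))
    for i, value in enumerate(rgb):
        gain = gains[i // 3]  # one gain per pixel, shared by R, G and B
        out[i] = min((value * gain) >> FRAC_BITS, 255)
    return out


width, height = 3, 3
gains = build_gain_map(width, height, k=0.5)
flat_gray = bytes([100]) * (width * height * 3)
corrected = correct_rgb24(flat_gray, gains)
print(gains)                # center gain is 64 (1.0), corners are 96 (1.5)
print(list(corrected[::3])) # red channel: center stays 100, corners become 150
```

Per-channel correction would use three such maps, one for each of R, G and B, indexed the same way.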

Color Space Conversion (RGBA-Gray)

Transforms image data from one color space to another (e.g., RGB to YUV). It’s essential for encoding, display pipelines, and transmission where non-RGB formats are used.

These implementations showcase how RidgeRun leverages the PVA to create real-time, power-efficient vision pipelines suitable for embedded systems under tight performance constraints.

Average performance measurements are shown in the following table for the most common resolutions. Measurements are shown for an optimized version of the algorithm, and all results are in milliseconds. Additionally, power consumption measurements are shown in watts. In the example measurements, an RGBA to Grayscale conversion was performed. Performance measurements can also be observed in the attached graph.

RGBA to Grayscale conversion execution time and power consumption

| Resolution | Execution time CPU (ms) | Execution time PVA (ms) | Power consumption CPU only (W) | Power consumption CPU and PVA (W) |
|---|---|---|---|---|
| 1280x720 | 1.258 | 1.003 | 14.59 | 14.79 |
| 1920x1080 | 2.836 | 2.202 | 14.71 | 14.68 |
| 3840x2160 | 11.332 | 8.672 | 14.86 | 14.73 |

Fig 3. RGBA to Grayscale conversion execution time

The image formats involved:

  • Input: RGBA32 (8 bits per channel, four channels)
  • Output: Gray8 (8-bit single channel)
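The RGBA32 to Gray8 conversion can be sketched on the CPU as a weighted sum of the color channels. The source does not state which weights are used, so the integer approximation of the BT.601 luma coefficients below (77, 150, 29, summing to 256) is an assumption, as is the function name `rgba_to_gray`; the alpha channel is simply dropped.

```python
def rgba_to_gray(rgba):
    """Convert an interleaved RGBA32 buffer to Gray8, ignoring alpha."""
    gray = bytearray(len(rgba) // 4)
    for i in range(len(gray)):
        r, g, b = rgba[4 * i], rgba[4 * i + 1], rgba[4 * i + 2]
        # Assumed integer BT.601 weights: 77 + 150 + 29 = 256, so >> 8 normalizes.
        gray[i] = (77 * r + 150 * g + 29 * b) >> 8
    return gray


pixels = bytes([
    255, 255, 255, 255,  # white
    255, 0, 0, 255,      # red
    0, 0, 0, 255,        # black
])
print(list(rgba_to_gray(pixels)))  # [255, 76, 0]
```

Using integer weights that sum to a power of two keeps the whole computation in fixed-point arithmetic, which is the kind of operation that maps well to vector hardware.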

Final Remarks

From the energy perspective, using the PVA may increase the instantaneous power consumption of the platform. Nevertheless, since the PVA is faster in most cases, the execution time is lower and so is the overall energy consumption (power multiplied by time).

The power consumption was measured at the entire platform level using the jetson-stats Python library.
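The energy remark can be checked with simple arithmetic on the 1280x720 rows of the three tables: energy is power multiplied by time, so watts times milliseconds gives millijoules. Because the power figures are platform-level, the result is only a rough estimate of the energy the whole board draws while one frame is processed; the helper name `energy_mj` is introduced here for illustration.

```python
def energy_mj(power_w, time_ms):
    """Energy in millijoules: watts multiplied by milliseconds."""
    return power_w * time_ms


# (algorithm, CPU ms, PVA ms, CPU-only W, CPU+PVA W) at 1280x720, from the tables.
rows = [
    ("Bit Shifting", 0.265, 0.1465, 14.8, 16.89),
    ("Radial Lens Shading", 1.531, 0.678, 15.65, 16.65),
    ("RGBA to Gray", 1.258, 1.003, 14.59, 14.79),
]

for name, cpu_ms, pva_ms, cpu_w, pva_w in rows:
    speedup = cpu_ms / pva_ms
    cpu_e = energy_mj(cpu_w, cpu_ms)
    pva_e = energy_mj(pva_w, pva_ms)
    print(f"{name}: {speedup:.2f}x faster, {cpu_e:.2f} mJ -> {pva_e:.2f} mJ per frame")
```

In all three rows the PVA path draws more power but finishes sooner, so the estimated energy per frame is lower.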