NVIDIA Jetson Xavier - HDAV Subsystem Audio Processing Engine

The Audio Processing Engine (APE) is a self-contained unit with dedicated audio clocking that enables Ultra Low Power (ULP) audio processing. It includes a dedicated programmable audio processor (ARM Cortex-A9 with NEON). Alongside the APE, the High Definition Audio (HDA) controller provides a multichannel audio path to the HDMI interface.

The APE can run audio filters in hardware, keeping CPU usage as low as possible. At present, however, there is no way to access the Cortex-A9 from user-space applications for audio processing. NVIDIA plans to add software support for this unit in the future [1].

Features

96 KB Audio RAM
Low latency voice processing
Audio Hub (AHUB)
- 4 x I2S Stereo/TDM I/O
- DMIC
- DSPK
Multi-Channel IN/OUT
Digital Audio Mixer: 10-in/5-out
- Up to eight channels per stream
- Simultaneous Multi-streams
- Flexible stream routing
Multi-band Dynamic Range Compression (DRC)
- Up to three bands
- Customizable DRC curve with tunable knee points
- Up to 192 kHz, 32-bit sample, eight channels
Parametric equalizer: up to 12 bands
Low latency sample rate conversion (SRC) and high-quality asynchronous sample rate conversion (ASRC)
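
To illustrate what a "customizable DRC curve with tunable knee points" means in practice, the following is a minimal Python sketch of a piecewise-linear static compression curve. It is not the APE's programming interface (which, as noted above, is not exposed to user space); the function name, the knee values, and the behavior outside the knee range are all assumptions chosen for illustration.

```python
import bisect

def drc_output_level_db(level_db, knees):
    """Map an input level (dBFS) to an output level (dBFS).

    `knees` is a list of (input_dB, output_dB) knee points sorted by
    input level. Between knees the curve is linearly interpolated.
    Assumptions of this sketch: below the first knee the curve keeps a
    1:1 slope, and above the last knee the output is held constant
    (limiter behavior).
    """
    xs = [x for x, _ in knees]
    if level_db <= xs[0]:
        return knees[0][1] + (level_db - xs[0])
    if level_db >= xs[-1]:
        return knees[-1][1]
    i = bisect.bisect_right(xs, level_db)
    (x0, y0), (x1, y1) = knees[i - 1], knees[i]
    return y0 + (y1 - y0) * (level_db - x0) / (x1 - x0)

# Hypothetical curve: unity gain up to -20 dBFS, then 2:1 compression.
EXAMPLE_KNEES = [(-60.0, -60.0), (-20.0, -20.0), (0.0, -10.0)]
```

With these example knees, an input at -10 dBFS falls halfway along the 2:1 segment and comes out at -15 dBFS. A multi-band DRC would apply one such curve per frequency band.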

Components

NVIDIA Jetson Xavier audio device connections

Inter-IC Sound (I2S) Controller

The I2S controller implements full-duplex, bidirectional, and single-direction point-to-point serial interfaces. It can interface with I2S-compatible products such as compact disc players, digital audio tape devices, digital sound processors, modems, and Bluetooth chips. The Xavier series module supports four I2S audio outputs with I2S/PCM interfaces supporting clock rates up to 24.576 MHz.

Features:

Basic I2S modes (I2S, RJM, LJM, and DSP) supported in both master and slave modes.
PCM mode with short fsync (one bit-clock wide) and long fsync (two bit-clocks wide) in both master and slave modes.
Network (Telephony) mode with independent slot selection for both Tx and Rx.
TDM mode with a flexible number of slots and flexible slot selection.
Capability to drive the line to high-Z outside the slot prescribed for transmission.
Flow control for the external input/output stream
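
As a rough check of which stream formats fit under the 24.576 MHz clock limit quoted above, the bit clock of an I2S/TDM link can be estimated as sample rate × number of slots × slot width. The Python sketch below applies that standard relation; the function names are hypothetical, and the real limits for a given mode should be taken from the module data sheet.

```python
MAX_BCLK_HZ = 24_576_000  # maximum clock rate quoted in this section

def i2s_bit_clock_hz(sample_rate_hz, slots, slot_width_bits):
    """Bit clock needed to carry `slots` slots of `slot_width_bits` per frame."""
    return sample_rate_hz * slots * slot_width_bits

def fits_clock_limit(sample_rate_hz, slots, slot_width_bits):
    """True if the required bit clock does not exceed the quoted maximum."""
    return i2s_bit_clock_hz(sample_rate_hz, slots, slot_width_bits) <= MAX_BCLK_HZ

if __name__ == "__main__":
    # (sample rate in Hz, slots per frame, bits per slot)
    for fmt in [(48_000, 2, 32), (192_000, 8, 16), (192_000, 8, 32)]:
        bclk = i2s_bit_clock_hz(*fmt)
        print(fmt, f"{bclk / 1e6:.3f} MHz", "ok" if fits_clock_limit(*fmt) else "too fast")
```

For example, 48 kHz stereo with 32-bit slots needs 3.072 MHz, and eight 16-bit slots at 192 kHz land exactly on 24.576 MHz, while eight 32-bit slots at 192 kHz would need twice the quoted limit.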

For timing information, see the Jetson AGX Xavier Module Data Sheet.

Digital MIC (DMIC) Controller

The DMIC controller interfaces with PDM-based input devices. It converts Pulse Density Modulation (PDM) signals to Pulse Code Modulation (PCM) signals.

Features:

Sample rate support: 8 – 48 kHz
Input PCM bit width: 16 – 24 bits
Oversampling Ratio: 64, 128, 256
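
The relationship between the figures above can be sketched in a few lines of Python. The PDM clock is the PCM sample rate multiplied by the oversampling ratio, and PDM-to-PCM conversion amounts to low-pass filtering and decimating the 1-bit stream. The decimator below is a deliberately naive boxcar average, used only to show the idea; real hardware typically uses multi-stage filters, and nothing here reflects the actual DMIC implementation. Function names are hypothetical.

```python
def pdm_clock_hz(sample_rate_hz, osr):
    """PDM bit clock for a given PCM sample rate and oversampling ratio."""
    return sample_rate_hz * osr

def pdm_to_pcm(bits, osr):
    """Naive PDM-to-PCM conversion: average each group of `osr` bits.

    `bits` is a sequence of 0/1 values. Each output sample is in the
    range [-1.0, 1.0], where all ones maps to +1.0 and all zeros to -1.0.
    Trailing bits that do not fill a whole group are ignored.
    """
    samples = []
    for start in range(0, len(bits) - osr + 1, osr):
        ones = sum(bits[start:start + osr])
        samples.append((2 * ones - osr) / osr)
    return samples
```

For instance, a 48 kHz stream at an oversampling ratio of 64 requires a 3.072 MHz PDM clock, and a 64-bit group containing equal numbers of ones and zeros decodes to a sample value of 0.0.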

Digital Speaker (DSPK) Controller

The PDM transmit block converts multi-bit PCM audio input to oversampled 1-bit PDM output. The mono or stereo audio is transmitted over a data/clock pair (I2S interface) to an external codec. The block consists of an interpolator followed by a Delta-Sigma Modulator (DSM).

Features:

Sample rate support: 8 – 48 kHz
Input PCM bit-width: 16 – 24 bits
Oversampling Ratio: 64, 128, 256
Passband frequency response: <= 0.5 dB peak-to-peak in 10 Hz – 20 kHz range
THD+N: <= -80 dB @ -10 dBFS
Dynamic Range: >= 105 dB
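
The interpolator-plus-DSM structure described above can be illustrated with a minimal Python sketch. It assumes the simplest possible choices, a zero-order-hold interpolator (each PCM sample repeated once per oversampled bit) and a first-order delta-sigma modulator; the order and filter design of the actual DSPK hardware are not stated here, so treat this purely as a conceptual model with hypothetical names.

```python
def pcm_to_pdm(samples, osr):
    """Convert PCM samples in [-1.0, 1.0] to a 1-bit PDM stream.

    Interpolation: zero-order hold (each sample is held for `osr` bits).
    Modulation: first-order delta-sigma. The integrator accumulates the
    difference between the input and the previous output (+1 or -1),
    and the output bit is the sign of the integrator.
    """
    bits = []
    integrator = 0.0
    feedback = 0.0  # previous quantized output; 0.0 before the first bit
    for sample in samples:
        for _ in range(osr):
            integrator += sample - feedback
            bit = 1 if integrator >= 0 else 0
            feedback = 1.0 if bit else -1.0
            bits.append(bit)
    return bits

def ones_density(bits):
    """Fraction of ones in the stream; roughly (1 + input level) / 2."""
    return sum(bits) / len(bits)
```

The density of ones in the output tracks the input level: a constant input of 0.0 yields about half ones, and an input of 0.5 yields about three quarters ones.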

High Definition Audio (HDA)

Besides the APE, the Xavier series module implements an industry-standard High Definition Audio (HDA) controller. It complies with the "Intel High Definition Audio Specification Revision 1.0a". This controller provides a multi-channel audio path to the HDMI interface. Multiple inputs and output streams are supported.

Features:

Supports HDMI 1.3a and DP1.1
Supports up to four audio streams for use with HDMI/DP
Supports striping of audio out across 1, 2, or 4 SDO lines
Supports DVFS with maximum latency up to 208 μs for eight channels
Supports four internal audio codecs
Audio Format Support
- Uncompressed Audio (LPCM): 16/20/24 bits at 32/44.1/48/88.2/96/176.4/192 kHz
- Compressed Audio format: AC3, DTS5.1, MPEG1, MPEG2, MP3, DD+, MPEG2/4 AAC, TrueHD, DTS-HD
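
To give a sense of scale for the LPCM formats listed above, the raw audio payload rate is simply sample rate × bits per sample × channels. The short Python sketch below computes it; the function name is hypothetical, and the result excludes any link framing or container padding, which this page does not describe.

```python
def lpcm_payload_bps(sample_rate_hz, bits_per_sample, channels):
    """Raw LPCM payload rate in bits per second (no framing overhead)."""
    return sample_rate_hz * bits_per_sample * channels

if __name__ == "__main__":
    # Smallest and largest combinations of the rates and bit depths
    # listed above, for stereo and eight channels respectively.
    for rate, bits, ch in [(32_000, 16, 2), (192_000, 24, 8)]:
        mbps = lpcm_payload_bps(rate, bits, ch) / 1e6
        print(f"{rate} Hz, {bits}-bit, {ch} ch: {mbps:.3f} Mbit/s")
```

At the top of the range, eight channels of 24-bit audio at 192 kHz amount to 36.864 Mbit/s of sample data.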
