NVIDIA Jetson Xavier - HDAV Subsystem Audio Processing Engine

From RidgeRun Developer Wiki










The Audio Processing Engine (APE) is a self-contained unit with dedicated audio clocking that enables Ultra Low Power (ULP) audio processing. It consists of a dedicated programmable audio processor (ARM Cortex A9 with NEON). The High Definition Audio (HDA) controller provides a multichannel audio path to the HDMI interface.

The APE can be used to implement audio filters in hardware, keeping CPU usage as low as possible. However, there is currently no way to access the Cortex A9 from user application space for audio processing. NVIDIA plans to add software support for this unit in the future [1].

Features

  • 96 KB Audio RAM
  • Low latency voice processing
  • Audio Hub (AHUB)
    • 4 x I2S Stereo/TDM I/O
    • DMIC
    • DSPK
  • Multi-Channel IN/OUT
  • Digital Audio Mixer: 10-in/5-out
    • Up to eight channels per stream
    • Simultaneous Multi-streams
    • Flexible stream routing
  • Multi-band Dynamic Range Compression (DRC)
    • Up to three bands
    • Customizable DRC curve with tunable knee points
    • Up to 192 kHz, 32-bit sample, eight channels
  • Parametric equalizer: up to 12 bands
  • Low latency sample rate conversion (SRC) and high-quality asynchronous sample rate conversion (ASRC)
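To illustrate what a "customizable DRC curve with tunable knee points" means, here is a minimal sketch of a single-band static compression curve. It is not NVIDIA's implementation: the function name `drc_output_db`, the knee-point values, and the piecewise-linear interpolation in dB are all assumptions chosen for illustration.

```python
def drc_output_db(level_db, knees):
    """Map an input level (dBFS) to an output level (dBFS).

    `knees` is a list of (input_db, output_db) points; the curve is
    linear in dB between neighbouring points (an illustrative choice).
    """
    pts = sorted(knees)
    if level_db <= pts[0][0]:
        # Below the first knee: unity slope, shifted to meet the first knee.
        return pts[0][1] + (level_db - pts[0][0])
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if level_db <= x1:
            t = (level_db - x0) / (x1 - x0)
            return y0 + t * (y1 - y0)
    # Above the last knee: hold the output level (limiter behaviour).
    return pts[-1][1]


# Hypothetical knee points: unity gain up to -20 dBFS, 2:1 compression above.
KNEES = [(-60.0, -60.0), (-20.0, -20.0), (0.0, -10.0)]

for level in (-40.0, -10.0, 0.0):
    print(level, "->", drc_output_db(level, KNEES))
```

Moving a knee point changes where compression starts and how steep it is. A multi-band compressor would apply one such curve per frequency band.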

Components


NVIDIA Jetson Xavier audio device connections

Inter-IC Sound (I2S) Controller

The I2S controller implements full-duplex, bidirectional, and single direction point-to-point serial interfaces. It can interface with I2S-compatible products, such as compact disc players, digital audio tape devices, digital sound processors, modems, Bluetooth chips, etc. The Xavier series module supports four I2S audio outputs with I2S/PCM interfaces supporting clock rates up to 24.576 MHz.

Features:

  • Basic I2S modes supported (I2S, RJM, LJM, and DSP) in both master and slave modes
  • PCM mode with short (one bit-clock wide) and long (two bit-clocks wide) fsync in both master and slave modes
  • Network (Telephony) mode with independent slot selection for both Tx and Rx
  • TDM mode with flexibility in the number of slots and in slot selection
  • Capability to drive High-Z outside the prescribed slot for transmission
  • Flow control for the external input/output stream

For timing information, check the Jetson AGX Xavier Module Data Sheet.
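As a rough guide to which stream formats fit within the 24.576 MHz clock limit, the sketch below uses the usual relationship bit clock = sample rate × slot width × number of slots. Treating 24.576 MHz as the bit clock ceiling is an assumption here, and the function names are hypothetical; the data sheet remains the authority for actual timing.

```python
MAX_BCLK_HZ = 24_576_000  # clock limit quoted above (assumed to be the bit clock)


def bit_clock_hz(sample_rate_hz, slot_bits, slots):
    """Bit clock needed to carry `slots` slots of `slot_bits` bits per frame."""
    return sample_rate_hz * slot_bits * slots


def fits(sample_rate_hz, slot_bits, slots):
    """True if the required bit clock does not exceed the assumed limit."""
    return bit_clock_hz(sample_rate_hz, slot_bits, slots) <= MAX_BCLK_HZ


# Example formats (illustrative): stereo I2S and two TDM configurations.
for rate, bits, slots in [(48_000, 32, 2), (48_000, 32, 16), (192_000, 32, 8)]:
    print(rate, bits, slots, bit_clock_hz(rate, bits, slots), fits(rate, bits, slots))
```

Under these assumptions, 16 slots of 32 bits at 48 kHz land exactly on 24.576 MHz, while 8 slots of 32 bits at 192 kHz would need twice that.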

Digital MIC (DMIC) Controller

The DMIC Controller is used to interface with PDM based input devices. The DMIC controller converts Pulse Density Modulation (PDM) signals to Pulse Code Modulation (PCM) signals.

Features:

  • Sample rate support: 8 – 48 kHz
  • Input PCM bit width: 16 – 24 bits
  • Oversampling Ratio: 64, 128, 256
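The oversampling ratio links the PCM sample rate to the PDM clock: PDM clock = sample rate × oversampling ratio. The sketch below computes that clock and shows a deliberately naive PDM-to-PCM conversion that averages each window of bits. The real DMIC controller's filter design is not described here, so the boxcar average and the function names are assumptions for illustration only.

```python
def pdm_clock_hz(sample_rate_hz, osr):
    """PDM bit clock for a given PCM sample rate and oversampling ratio."""
    return sample_rate_hz * osr


def pdm_to_pcm(bits, osr):
    """Naive decimation: average each block of `osr` PDM bits.

    The density of ones in a block is mapped to a value in [-1.0, 1.0].
    Production decimators use proper low-pass (e.g. CIC/FIR) filtering.
    """
    pcm = []
    for start in range(0, len(bits) - osr + 1, osr):
        ones = sum(bits[start:start + osr])
        pcm.append(2.0 * ones / osr - 1.0)
    return pcm


print(pdm_clock_hz(48_000, 64))       # PDM clock for 48 kHz at OSR 64
print(pdm_to_pcm([1, 0] * 64, 64))    # 50% ones density decodes to silence
print(pdm_to_pcm([1] * 64, 64))       # all ones decodes to positive full scale
```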

Digital Speaker (DSPK) Controller

The PDM transmit block converts multi-bit PCM audio input to oversampled 1-bit PDM output. The mono or stereo audio is transmitted over a data/clock pair (I2S interface) to an external codec. The block consists of an interpolator followed by a Delta-Sigma Modulator (DSM).

Features:

  • Sample rate support: 8 – 48 kHz
  • Input PCM bit-width: 16 – 24 bits
  • Oversampling Ratio: 64, 128, 256
  • Passband frequency response: <= 0.5 dB peak-to-peak in 10 Hz – 20 kHz range
  • THD+N: <= -80 dB @ -10 dBFS
  • Dynamic Range: >= 105 dB
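The interpolator plus Delta-Sigma Modulator chain described above can be sketched in software. The example below is a simplification and not the DSPK hardware design: it assumes a zero-order hold (sample repetition) in place of a real interpolation filter and a first-order modulator, and the function name is hypothetical.

```python
def delta_sigma_1st_order(samples, osr):
    """Convert PCM samples in [-1.0, 1.0] to a 1-bit PDM stream.

    Each sample is held for `osr` clock ticks (zero-order hold), then a
    first-order delta-sigma loop integrates the error between the input
    and the previous output (initially zero) and emits one bit per tick.
    """
    bits = []
    integrator = 0.0
    feedback = 0.0
    for sample in samples:
        for _ in range(osr):
            integrator += sample - feedback
            bit = 1 if integrator >= 0.0 else 0
            feedback = 1.0 if bit else -1.0
            bits.append(bit)
    return bits


# A constant input of 0.5 should give a ones density of about 75%,
# since density d maps to amplitude 2*d - 1.
stream = delta_sigma_1st_order([0.5], 64)
print(sum(stream) / len(stream))
```

The density of ones in the output tracks the input amplitude, which is what lets a simple low-pass filter on the receiving side recover the audio.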

High Definition Audio (HDA)

Besides the APE, the Xavier series module implements an industry-standard High Definition Audio (HDA) controller. It complies with the "Intel High Definition Audio Specification Revision 1.0a". This controller provides a multi-channel audio path to the HDMI interface. Multiple inputs and output streams are supported.

Features:

  • Supports HDMI 1.3a and DP1.1
  • Supports up to four audio streams for use with HDMI/DP
  • Supports striping of audio out across 1, 2, or 4 [a] SDO lines
  • Supports DVFS with maximum latency up to 208 μs for eight channels
  • Supports four internal audio codecs
  • Audio Format Support
    • Uncompressed Audio (LPCM): 16/20/24 bits at 32/44.1/48/88.2/96/176.4/192 [b] kHz
    • Compressed Audio format: AC3, DTS5.1, MPEG1, MPEG2, MP3, DD+, MPEG2/4 AAC, TrueHD, DTS-HD


