Parakeet TDT (0.6B v3) — ExecuTorch Exports

This repository contains ExecuTorch .pte exports of the Hugging Face ASR model nvidia/parakeet-tdt-0.6b-v3 for:

XNNPACK (CPU): fp32, and 8da4w (int8 dynamic activations with int4 weights)
Metal (macOS): bf16

Each export directory contains:

model.pte — ExecuTorch program
tokenizer.model — SentencePiece tokenizer (needed by the runner)

Artifacts

Variant	Backend	Precision / Quantization	Path
`xnnpack-fp32`	XNNPACK	fp32	`xnnpack/fp32/model.pte`
`xnnpack-int4`	XNNPACK	8da4w (int8 dynamic activations + int4 weights), group size 32	`xnnpack/int4/model.pte`
`metal-bf16`	Metal (MPS)	bf16	`metal/bf16/model.pte`
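
To make "int4 weights, group size 32" concrete, here is a minimal pure-Python sketch of group-wise weight quantization. It assumes symmetric quantization with one floating-point scale per group of 32 weights; the scheme torchao actually applies for 8da4w may differ in rounding, zero points, and scale storage, and the int8 dynamic activation half is not shown. All function names are hypothetical.

```python
def quantize_group(weights):
    """Symmetric int4 quantization of one group.

    Returns (scale, codes) where every code is an integer in [-8, 7].
    Assumption: scale is chosen so the largest magnitude maps to 7.
    """
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 7.0 if max_abs > 0 else 1.0
    codes = [max(-8, min(7, round(w / scale))) for w in weights]
    return scale, codes


def quantize_groupwise(weights, group_size=32):
    """Split a flat weight list into groups and quantize each one separately."""
    if len(weights) % group_size != 0:
        raise ValueError("weight count must be a multiple of group_size")
    return [
        quantize_group(weights[i:i + group_size])
        for i in range(0, len(weights), group_size)
    ]


def dequantize_groupwise(groups):
    """Reconstruct approximate float weights from (scale, codes) pairs."""
    return [code * scale for scale, codes in groups for code in codes]
```

Because each group carries its own scale, an outlier weight only degrades the precision of the 32 weights that share its group, which is the usual motivation for small group sizes.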

Tokenizers:

xnnpack/fp32/tokenizer.model
xnnpack/int4/tokenizer.model
metal/bf16/tokenizer.model
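
The layout above can be captured in a small lookup helper. This is a sketch that assumes the repository has already been downloaded to a local directory; `artifact_paths` and `VARIANT_DIRS` are hypothetical names, and the function only builds paths without checking that the files exist.

```python
from pathlib import Path

# Variant name -> directory relative to the repository root,
# copied from the Artifacts table.
VARIANT_DIRS = {
    "xnnpack-fp32": "xnnpack/fp32",
    "xnnpack-int4": "xnnpack/int4",
    "metal-bf16": "metal/bf16",
}


def artifact_paths(repo_root, variant):
    """Return (model.pte path, tokenizer.model path) for one variant."""
    if variant not in VARIANT_DIRS:
        raise ValueError(
            f"unknown variant {variant!r}; expected one of {sorted(VARIANT_DIRS)}"
        )
    base = Path(repo_root) / VARIANT_DIRS[variant]
    return base / "model.pte", base / "tokenizer.model"
```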

How to run (C++ runner)

Build the Parakeet runner from an ExecuTorch checkout:

./install_executorch.sh
make parakeet-cpu
make parakeet-metal  # macOS only

Then run from the ExecuTorch repo root:

CPU / XNNPACK

./cmake-out/examples/models/parakeet/parakeet_runner \
  --model_path /path/to/model.pte \
  --audio_path /path/to/audio.wav \
  --tokenizer_path /path/to/tokenizer.model

Metal (macOS)

DYLD_LIBRARY_PATH=/usr/lib ./cmake-out/examples/models/parakeet/parakeet_runner \
  --model_path /path/to/model.pte \
  --audio_path /path/to/audio.wav \
  --tokenizer_path /path/to/tokenizer.model
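
For scripted use, the two invocations above can be wrapped from Python. The sketch below assumes it is run from the ExecuTorch repo root with the runner already built; the helper names are hypothetical, and since the runner's output format is not described here, `run_parakeet` simply returns whatever the runner writes to stdout.

```python
import os
import subprocess

# Runner location relative to the ExecuTorch repo root (from the commands above).
RUNNER = "./cmake-out/examples/models/parakeet/parakeet_runner"


def build_runner_invocation(model_path, audio_path, tokenizer_path, metal=False):
    """Return (argv, env) for the runner without executing it."""
    argv = [
        RUNNER,
        "--model_path", str(model_path),
        "--audio_path", str(audio_path),
        "--tokenizer_path", str(tokenizer_path),
    ]
    env = dict(os.environ)
    if metal:
        # Mirrors the DYLD_LIBRARY_PATH prefix used for the Metal export.
        env["DYLD_LIBRARY_PATH"] = "/usr/lib"
    return argv, env


def run_parakeet(model_path, audio_path, tokenizer_path, metal=False):
    """Run the runner and return its raw stdout; raises if it exits non-zero."""
    argv, env = build_runner_invocation(model_path, audio_path, tokenizer_path, metal)
    result = subprocess.run(argv, env=env, capture_output=True, text=True, check=True)
    return result.stdout
```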

Export recipe (ExecuTorch)

Exports were produced with examples/models/parakeet/export_parakeet_tdt.py from ExecuTorch commit efe4f0cce36c4b95ac70eac969c0d55163f23c78 using:

PyTorch 2.11.0.dev20251222
TORCHAO_FORCE_SKIP_LOADING_SO_FILES=1 (export-only workaround for torchao C++ extension loading)

Commands:

# XNNPACK fp32
TORCHAO_FORCE_SKIP_LOADING_SO_FILES=1 python examples/models/parakeet/export_parakeet_tdt.py \
  --backend xnnpack \
  --output-dir ./parakeet_xnnpack_fp32

# XNNPACK int4 (8da4w)
TORCHAO_FORCE_SKIP_LOADING_SO_FILES=1 python examples/models/parakeet/export_parakeet_tdt.py \
  --backend xnnpack \
  --qlinear_encoder 8da4w --qlinear_encoder_group_size 32 \
  --qlinear 8da4w --qlinear_group_size 32 \
  --output-dir ./parakeet_xnnpack_int4

# Metal bf16
TORCHAO_FORCE_SKIP_LOADING_SO_FILES=1 python examples/models/parakeet/export_parakeet_tdt.py \
  --backend metal --dtype bf16 \
  --output-dir ./parakeet_metal_bf16
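
The three commands differ only in their flags, so they can be generated from one table when re-exporting every variant. This is a sketch under the assumption that it runs from the ExecuTorch repo root; the flag values are copied from the commands above, while the function and dictionary names are hypothetical.

```python
import os
import subprocess

EXPORT_SCRIPT = "examples/models/parakeet/export_parakeet_tdt.py"

# Variant -> export flags, copied from the commands above.
EXPORT_FLAGS = {
    "xnnpack_fp32": ["--backend", "xnnpack"],
    "xnnpack_int4": [
        "--backend", "xnnpack",
        "--qlinear_encoder", "8da4w", "--qlinear_encoder_group_size", "32",
        "--qlinear", "8da4w", "--qlinear_group_size", "32",
    ],
    "metal_bf16": ["--backend", "metal", "--dtype", "bf16"],
}


def build_export_command(variant, python="python"):
    """Return the argv list for exporting one variant."""
    flags = EXPORT_FLAGS[variant]
    return [python, EXPORT_SCRIPT, *flags, "--output-dir", f"./parakeet_{variant}"]


def export_env():
    """Copy the current environment and add the torchao export workaround."""
    env = dict(os.environ)
    env["TORCHAO_FORCE_SKIP_LOADING_SO_FILES"] = "1"
    return env


def export_all():
    """Export every variant in turn, stopping at the first failure."""
    for variant in EXPORT_FLAGS:
        subprocess.run(build_export_command(variant), env=export_env(), check=True)
```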

Verification

All three variants were verified to transcribe the test audio below correctly, and the XNNPACK exports also match eager PyTorch output. The model output differs from the dataset reference only in casing and punctuation.

Audio sample: hf-internal-testing/librispeech_asr_dummy (clean, validation[0])
Reference transcript (dataset):
- MISTER QUILTER IS THE APOSTLE OF THE MIDDLE CLASSES AND WE ARE GLAD TO WELCOME HIS GOSPEL
Model output:
- mister Quilter is the apostle of the middle classes, and we are glad to welcome his gospel.
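
Since the reference is upper-case and unpunctuated while the model emits cased, punctuated text, a comparison needs some normalization first. The sketch below shows one simple way to do that (lower-casing, stripping ASCII punctuation, collapsing whitespace); it is an illustration, not necessarily the check that was used for these exports, and `normalize_transcript` is a hypothetical name.

```python
import string


def normalize_transcript(text):
    """Lower-case, drop ASCII punctuation, and collapse runs of whitespace."""
    no_punct = text.translate(str.maketrans("", "", string.punctuation))
    return " ".join(no_punct.lower().split())


REFERENCE = (
    "MISTER QUILTER IS THE APOSTLE OF THE MIDDLE CLASSES "
    "AND WE ARE GLAD TO WELCOME HIS GOSPEL"
)
MODEL_OUTPUT = (
    "mister Quilter is the apostle of the middle classes, "
    "and we are glad to welcome his gospel."
)

# True when the two transcripts agree after normalization.
transcripts_match = normalize_transcript(REFERENCE) == normalize_transcript(MODEL_OUTPUT)
```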

License / Attribution

Upstream model license: CC-BY-4.0 (see nvidia/parakeet-tdt-0.6b-v3).
Please attribute NVIDIA when using these exported artifacts.
