Parakeet TDT (0.6B v3): ExecuTorch Exports
This repository contains ExecuTorch .pte exports of the Hugging Face ASR model nvidia/parakeet-tdt-0.6b-v3 for:
- XNNPACK (CPU): fp32 and INT4 weights (8da4w)
- Metal (macOS): bf16
Each export directory contains:
- model.pte: ExecuTorch program
- tokenizer.model: SentencePiece tokenizer (needed by the runner)
Artifacts
| Variant | Backend | Precision / Quantization | Path |
|---|---|---|---|
| xnnpack-fp32 | XNNPACK | fp32 | xnnpack/fp32/model.pte |
| xnnpack-int4 | XNNPACK | 8da4w (int8 dynamic activations + int4 weights), group size 32 | xnnpack/int4/model.pte |
| metal-bf16 | Metal (MPS) | bf16 | metal/bf16/model.pte |
Tokenizers:
- xnnpack/fp32/tokenizer.model
- xnnpack/int4/tokenizer.model
- metal/bf16/tokenizer.model
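To fetch one variant programmatically, something like the following sketch may help. It assumes the `huggingface_hub` package is installed and that the repository id is `larryliu0820/parakeet-tdt-0.6b-v3-executorch`. The helper names `artifact_files` and `download_variant` are hypothetical; the paths come from the Artifacts table.

```python
from pathlib import Path

# Assumed repository id; adjust if the artifacts live elsewhere.
REPO_ID = "larryliu0820/parakeet-tdt-0.6b-v3-executorch"

# Variant name -> export directory inside the repository (from the Artifacts table).
VARIANT_DIRS = {
    "xnnpack-fp32": "xnnpack/fp32",
    "xnnpack-int4": "xnnpack/int4",
    "metal-bf16": "metal/bf16",
}


def artifact_files(variant: str) -> list[str]:
    """Return the repo-relative paths of model.pte and tokenizer.model."""
    directory = VARIANT_DIRS[variant]
    return [f"{directory}/model.pte", f"{directory}/tokenizer.model"]


def download_variant(variant: str, repo_id: str = REPO_ID) -> list[Path]:
    """Download both files for one variant and return their local paths."""
    # Imported lazily so the path helper works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return [
        Path(hf_hub_download(repo_id=repo_id, filename=name))
        for name in artifact_files(variant)
    ]
```

Calling `download_variant("xnnpack-int4")` should return two local paths that can be passed to the runner as `--model_path` and `--tokenizer_path`.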
How to run (C++ runner)
Build the Parakeet runner from an ExecuTorch checkout:
./install_executorch.sh
make parakeet-cpu
make parakeet-metal # macOS only
Then run from the ExecuTorch repo root:
CPU / XNNPACK
./cmake-out/examples/models/parakeet/parakeet_runner \
--model_path /path/to/model.pte \
--audio_path /path/to/audio.wav \
--tokenizer_path /path/to/tokenizer.model
Metal (macOS)
DYLD_LIBRARY_PATH=/usr/lib ./cmake-out/examples/models/parakeet/parakeet_runner \
--model_path /path/to/model.pte \
--audio_path /path/to/audio.wav \
--tokenizer_path /path/to/tokenizer.model
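For scripted use, the runner invocations above could be wrapped roughly as follows. This is a minimal sketch: `build_runner_command` and `run_runner` are hypothetical helper names, the binary path assumes you run from the ExecuTorch repo root, and the runner's output format is not specified here, so stdout is returned unparsed.

```python
import os
import subprocess

# Runner location relative to the ExecuTorch repo root (from the commands above).
RUNNER = "./cmake-out/examples/models/parakeet/parakeet_runner"


def build_runner_command(model_path, audio_path, tokenizer_path, runner=RUNNER):
    """Assemble the argument list for the parakeet_runner binary."""
    return [
        runner,
        "--model_path", model_path,
        "--audio_path", audio_path,
        "--tokenizer_path", tokenizer_path,
    ]


def run_runner(model_path, audio_path, tokenizer_path, metal=False):
    """Run the runner once and return whatever it wrote to stdout."""
    env = dict(os.environ)
    if metal:
        # Mirrors the DYLD_LIBRARY_PATH prefix used for the Metal command.
        env["DYLD_LIBRARY_PATH"] = "/usr/lib"
    result = subprocess.run(
        build_runner_command(model_path, audio_path, tokenizer_path),
        env=env,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

Keeping command construction separate from execution makes it easy to log or inspect the exact command before running it.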
Export recipe (ExecuTorch)
Exports were produced with examples/models/parakeet/export_parakeet_tdt.py from ExecuTorch commit efe4f0cce36c4b95ac70eac969c0d55163f23c78 using:
- PyTorch 2.11.0.dev20251222
- TORCHAO_FORCE_SKIP_LOADING_SO_FILES=1 (export-only workaround for torchao C++ extension loading)
Commands:
# XNNPACK fp32
TORCHAO_FORCE_SKIP_LOADING_SO_FILES=1 python examples/models/parakeet/export_parakeet_tdt.py \
--backend xnnpack \
--output-dir ./parakeet_xnnpack_fp32
# XNNPACK int4 (8da4w)
TORCHAO_FORCE_SKIP_LOADING_SO_FILES=1 python examples/models/parakeet/export_parakeet_tdt.py \
--backend xnnpack \
--qlinear_encoder 8da4w --qlinear_encoder_group_size 32 \
--qlinear 8da4w --qlinear_group_size 32 \
--output-dir ./parakeet_xnnpack_int4
# Metal bf16
TORCHAO_FORCE_SKIP_LOADING_SO_FILES=1 python examples/models/parakeet/export_parakeet_tdt.py \
--backend metal --dtype bf16 \
--output-dir ./parakeet_metal_bf16
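To give a feel for what a group size of 32 means for the int4 weights, here is a simplified illustration of symmetric per-group quantization in plain Python. It is not torchao's implementation, which may differ in rounding, zero points, and scale dtype, and it does not model the int8 dynamic activation side of 8da4w. The function names are hypothetical.

```python
def quantize_group_int4(weights, group_size=32):
    """Symmetric per-group quantization to the int4 range [-8, 7]."""
    if len(weights) % group_size != 0:
        raise ValueError("weight count must be a multiple of group_size")
    codes, scales = [], []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        max_abs = max(abs(w) for w in group)
        # One scale per group; fall back to 1.0 for an all-zero group.
        scale = max_abs / 7.0 if max_abs > 0 else 1.0
        scales.append(scale)
        codes.extend(max(-8, min(7, round(w / scale))) for w in group)
    return codes, scales


def dequantize_group_int4(codes, scales, group_size=32):
    """Map int4 codes back to floats using each group's scale."""
    return [q * scales[i // group_size] for i, q in enumerate(codes)]


# Toy example: 64 weights form two groups of 32, each with its own scale.
weights = [(i - 32) / 64 for i in range(64)]
codes, scales = quantize_group_int4(weights)
restored = dequantize_group_int4(codes, scales)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

In this scheme the reconstruction error of each weight is bounded by half of its group's scale, which is why smaller groups generally trade extra scale storage for accuracy.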
Verification
All variants were verified to transcribe the same test audio correctly (and match eager PyTorch output for the XNNPACK exports).
- Audio sample: hf-internal-testing/librispeech_asr_dummy (clean, validation[0])
- Reference transcript (dataset): MISTER QUILTER IS THE APOSTLE OF THE MIDDLE CLASSES AND WE ARE GLAD TO WELCOME HIS GOSPEL
- Model output: mister Quilter is the apostle of the middle classes, and we are glad to welcome his gospel.
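The reference transcript is uppercase and unpunctuated, while the model output is cased and punctuated, so the two only match after normalization. One possible way to compare them is sketched below. The exact comparison used for verification is not stated here, and `normalize_transcript` is a hypothetical helper.

```python
import string


def normalize_transcript(text):
    """Uppercase, drop ASCII punctuation, and collapse whitespace."""
    without_punct = text.translate(str.maketrans("", "", string.punctuation))
    return " ".join(without_punct.upper().split())


reference = (
    "MISTER QUILTER IS THE APOSTLE OF THE MIDDLE CLASSES "
    "AND WE ARE GLAD TO WELCOME HIS GOSPEL"
)
hypothesis = (
    "mister Quilter is the apostle of the middle classes, "
    "and we are glad to welcome his gospel."
)

transcripts_match = normalize_transcript(hypothesis) == normalize_transcript(reference)
print(transcripts_match)  # True
```

Note that stripping all punctuation also removes apostrophes, so transcripts with contractions may need a gentler rule.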
License / Attribution
- Upstream model license: CC-BY-4.0 (see nvidia/parakeet-tdt-0.6b-v3).
- Please attribute NVIDIA when using these exported artifacts.
Model tree for larryliu0820/parakeet-tdt-0.6b-v3-executorch: base model nvidia/parakeet-tdt-0.6b-v3