FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language Paper • 2506.20920 • Published Jun 26, 2025 • 78
OWSM v4: Improving Open Whisper-Style Speech Models via Data Scaling and Cleaning Paper • 2506.00338 • Published May 31, 2025 • 10
Hugging Face Changelog Xet is now the default storage option for new users and organizations May 23, 2025 • 76
Article Speeding Up LLM Decoding with Advanced Universal Assisted Generation Techniques jmamou • Mar 24, 2025 • 20
Article FastRTC: The Real-Time Communication Library for Python freddyaboulton, abidlabs • Feb 25, 2025 • 172
Article KV Caching Explained: Optimizing Transformer Inference Efficiency not-lain • Jan 30, 2025 • 329
Article How biased is Whisper ? Evaluating Whisper Models for Robustness to Diverse English Accents Steveeeeeeen • Jan 29, 2025 • 17
Article SmolLM - blazingly fast and remarkably powerful loubnabnl, anton-l, eliebak • Jul 16, 2024 • 455
Article 🇨🇿 BenCzechMark - Can your LLM Understand Czech? mfajcik, hynky, mdocekal, xdolez52, jstetina, Lakoc, popelucha, hales, michal-stefanik, Adamiros, davidadamczyk, JanH, jsedivy • Oct 1, 2024 • 24
Programming Every Example: Lifting Pre-training Data Quality like Experts at Scale Paper • 2409.17115 • Published Sep 25, 2024 • 64