Garreth Lee

garrethlee

AI & ML interests

None yet

Recent Activity

updated a dataset 1 day ago

garrethlee/comprehensive-arithmetic-problems

liked a model about 2 months ago

sarulab-speech/DialogueSidon

liked a Space 7 months ago

HuggingFaceTB/smol-training-playbook

Organizations

upvoted 2 papers 11 months ago

FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language

Paper • 2506.20920 • Published Jun 26, 2025 • 78

OWSM v4: Improving Open Whisper-Style Speech Models via Data Scaling and Cleaning

Paper • 2506.00338 • Published May 31, 2025 • 10

upvoted a changelog 12 months ago

Hugging Face Changelog

Xet is now the default storage option for new users and organizations

May 23, 2025 • 76

upvoted a collection about 1 year ago

Llama 4

Collection

Llama 4 release • 13 items • Updated Apr 29, 2025 • 735

upvoted 2 articles about 1 year ago

Article

Speeding Up LLM Decoding with Advanced Universal Assisted Generation Techniques

jmamou • Mar 24, 2025 • 20

Article

FastRTC: The Real-Time Communication Library for Python

freddyaboulton, abidlabs • Feb 25, 2025 • 172

upvoted 3 articles over 1 year ago

Article

1 Billion Classifications

derek-thomas • Feb 13, 2025 • 45

Article

KV Caching Explained: Optimizing Transformer Inference Efficiency

not-lain • Jan 30, 2025 • 329

Article

How biased is Whisper ? Evaluating Whisper Models for Robustness to Diverse English Accents

Steveeeeeeen • Jan 29, 2025 • 17

upvoted a paper over 1 year ago

Qwen2.5 Technical Report

Paper • 2412.15115 • Published Dec 19, 2024 • 379

upvoted 2 articles over 1 year ago

Article

SmolLM - blazingly fast and remarkably powerful

loubnabnl, anton-l, eliebak • Jul 16, 2024 • 455

Article

🇨🇿 BenCzechMark - Can your LLM Understand Czech?

mfajcik, hynky, mdocekal, xdolez52, jstetina, Lakoc, popelucha, hales, michal-stefanik, Adamiros, davidadamczyk, JanH, jsedivy • Oct 1, 2024 • 24

upvoted a paper over 1 year ago

Programming Every Example: Lifting Pre-training Data Quality like Experts at Scale

Paper • 2409.17115 • Published Sep 25, 2024 • 64
