We Pitted the Cheapest TPU Against an NVIDIA L4. Here's What 6 Experiments Revealed. 19 days ago • 1
1.37x Faster on Alibaba's 80B Code Model: EAGLE3 for Qwen3-Coder-Next 21 days ago
1.7x Faster on a 218B Model: EAGLE3 Speculative Decoding for GLM-4.7 22 days ago • 1
2x Faster on a 229B MoE: EAGLE3 Speculative Decoding for MiniMax-M2.5 27 days ago • 3
Google Released Gemma-4 Four Days Ago. We Already Made It 1.72× Faster. 30 days ago • 2
Speculative Decoding in Practice: How EAGLE3 Makes LLMs Faster Without Changing Their Outputs Apr 3 • 8