Article How we OCR'ed 30,000 papers using Codex, open OCR models and Jobs 18 days ago • 59
TutorBench: A Benchmark To Assess Tutoring Capabilities Of Large Language Models Paper • 2510.02663 • Published Oct 3, 2025 • 2
HY-Embodied-0.5: Embodied Foundation Models for Real-World Agents Paper • 2604.07430 • Published 18 days ago • 185
MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI Paper • 2311.16502 • Published Nov 27, 2023 • 40
Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled Image-Text-to-Text • 28B • Updated 20 days ago • 539k • 2.78k
MathTutorBench: A Benchmark for Measuring Open-ended Pedagogical Capabilities of LLM Tutors Paper • 2502.18940 • Published Feb 26, 2025 • 3
From Problem-Solving to Teaching Problem-Solving: Aligning LLMs with Pedagogy using Reinforcement Learning Paper • 2505.15607 • Published May 21, 2025 • 4