Rethinking OPD Collection This collection includes the models used in the paper "Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recip • 3 items • Updated 4 days ago
MAIC-UI: Making Interactive Courseware with Generative UI Paper • 2604.25806 • Published 10 days ago • 8
DeepPrune: Parallel Scaling without Inter-trace Redundancy Paper • 2510.08483 • Published Oct 9, 2025 • 24
Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe Paper • 2604.13016 • Published 24 days ago • 90
Chaining the Evidence: Robust Reinforcement Learning for Deep Search Agents with Citation-Aware Rubric Rewards Paper • 2601.06021 • Published Jan 9 • 48
Learning beyond Teacher: Generalized On-Policy Distillation with Reward Extrapolation Paper • 2602.12125 • Published Feb 12 • 64
UltraData Collection Ultra Scale, Ultra Quality, Ultra Coverage • 10 items • Updated 20 days ago • 81