Visually-Guided Policy Optimization for Multimodal Reasoning Paper • 2604.09349 • Published 29 days ago • 2
SpatialGenEval Collection [ICLR 2026] Everything in Its Place: Benchmarking Spatial Intelligence of Text-to-Image Models • 1 item • Updated 1 day ago
VGPO-RL Collection [ACL 2026] Visually-Guided Policy Optimization for Multimodal Reasoning • 3 items • Updated 1 day ago
HY-Embodied-0.5: Embodied Foundation Models for Real-World Agents Paper • 2604.07430 • Published about 1 month ago • 187
LLaTiSA: Towards Difficulty-Stratified Time Series Reasoning from Visual Perception to Semantics Paper • 2604.17295 • Published 20 days ago • 84
Elucidating the SNR-t Bias of Diffusion Probabilistic Models Paper • 2604.16044 • Published 22 days ago • 74
SkillClaw: Let Skills Evolve Collectively with Agentic Evolver Paper • 2604.08377 • Published 30 days ago • 289
Omni-WorldBench: Towards a Comprehensive Interaction-Centric Evaluation for World Models Paper • 2603.22212 • Published Mar 23 • 126
Video-CoE: Reinforcing Video Event Prediction via Chain of Events Paper • 2603.14935 • Published Mar 16 • 91
Geometry-Guided Reinforcement Learning for Multi-view Consistent 3D Scene Editing Paper • 2603.03143 • Published Mar 3 • 145
From Scale to Speed: Adaptive Test-Time Scaling for Image Editing Paper • 2603.00141 • Published Feb 24 • 137
MobilityBench: A Benchmark for Evaluating Route-Planning Agents in Real-World Mobility Scenarios Paper • 2602.22638 • Published Feb 26 • 106