Gunnar's picture

Gunnar PRO

grhone

·

grhone

AI & ML interests

None yet

Recent Activity

upvoted a paper about 1 month ago

Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe

upvoted a paper about 1 month ago

Audio-Omni: Extending Multi-modal Understanding to Versatile Audio Generation and Editing

upvoted a paper about 1 month ago

Audio Flamingo Next: Next-Generation Open Audio-Language Models for Speech, Sound, and Music

View all activity

Organizations

upvoted 3 papers about 1 month ago

Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe

Paper • 2604.13016 • Published Apr 14 • 94

Audio-Omni: Extending Multi-modal Understanding to Versatile Audio Generation and Editing

Paper • 2604.10708 • Published Apr 12 • 42

Audio Flamingo Next: Next-Generation Open Audio-Language Models for Speech, Sound, and Music

Paper • 2604.10905 • Published Apr 13 • 28

upvoted 5 papers 3 months ago

Innovator-VL: A Multimodal Large Language Model for Scientific Discovery

Paper • 2601.19325 • Published Jan 27 • 81

DeepSeek-OCR 2: Visual Causal Flow

Paper • 2601.20552 • Published Jan 28 • 69

OCRVerse: Towards Holistic OCR in End-to-End Vision-Language Models

Paper • 2601.21639 • Published Jan 29 • 51

PaddleOCR-VL-1.5: Towards a Multi-Task 0.9B VLM for Robust In-the-Wild Document Parsing

Paper • 2601.21957 • Published Jan 29 • 19

Golden Goose: A Simple Trick to Synthesize Unlimited RLVR Tasks from Unverifiable Internet Text

Paper • 2601.22975 • Published Jan 30 • 111

upvoted 2 papers 6 months ago

VCode: a Multimodal Coding Benchmark with SVG as Symbolic Visual Representation

Paper • 2511.02778 • Published Nov 4, 2025 • 103

Kimi Linear: An Expressive, Efficient Attention Architecture

Paper • 2510.26692 • Published Oct 30, 2025 • 132

upvoted 2 papers 8 months ago

AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning

Paper • 2509.08755 • Published Sep 10, 2025 • 56

A Survey of Reinforcement Learning for Large Reasoning Models

Paper • 2509.08827 • Published Sep 10, 2025 • 193

upvoted 2 papers 9 months ago

GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models

Paper • 2508.06471 • Published Aug 8, 2025 • 211

MeshLLM: Empowering Large Language Models to Progressively Understand and Generate 3D Mesh

Paper • 2508.01242 • Published Aug 2, 2025 • 11

upvoted 4 papers 10 months ago

Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities

Paper • 2507.06261 • Published Jul 7, 2025 • 67

Easy Dataset: A Unified and Extensible Framework for Synthesizing LLM Fine-Tuning Data from Unstructured Documents

Paper • 2507.04009 • Published Jul 5, 2025 • 55

Thinking with Images for Multimodal Reasoning: Foundations, Methods, and Future Frontiers

Paper • 2506.23918 • Published Jun 30, 2025 • 90

Depth Anything at Any Condition

Paper • 2507.01634 • Published Jul 2, 2025 • 49

upvoted a collection 11 months ago

Qwen2.5-VL

Vision-language model series based on Qwen2.5 • 10 items • Updated Mar 2 • 562

upvoted a paper 11 months ago

On Domain-Specific Post-Training for Multimodal Large Language Models

Paper • 2411.19930 • Published Nov 29, 2024 • 30