Blessing Agyei Kyem

Blessing

AI & ML interests

Data Science, Machine learning, Deep learning, Reinforcement learning

Organizations

upvoted a paper 7 months ago

Less is More: Recursive Reasoning with Tiny Networks

Paper • 2510.04871 • Published Oct 6, 2025 • 514

upvoted a collection 8 months ago

MobileCLIP2

Collection

MobileCLIP2: Mobile-friendly image-text models with SOTA zero-shot capabilities trained on DFNDR-2B • 30 items • Updated 19 days ago • 61

upvoted a paper 9 months ago

InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency

Paper • 2508.18265 • Published Aug 25, 2025 • 221

upvoted 2 collections 9 months ago

InternVL3.5

Collection

This collection includes all released checkpoints of InternVL3.5, covering different training stages (e.g., Pretraining, SFT, MPO, Cascade RL). • 45 items • Updated Mar 2 • 109

Qwen2-VL

Collection

Vision-language model series based on Qwen2 • 15 items • Updated Mar 2 • 231

upvoted a paper 9 months ago

MiMo-VL Technical Report

Paper • 2506.03569 • Published Jun 4, 2025 • 80

upvoted 2 articles 9 months ago

Article

Vision Language Model Alignment in TRL ⚡️

sergiopaniego, merve, qgallouedec, kashif, ariG23498 • Aug 7, 2025 • 111

Article

Welcome GPT OSS, the new open-source model family from OpenAI!

reach-vb, pcuenq, lewtun, clem, Rocketknight1, clefourrier, celinah, Wauplin, marcsun13, pagezyhf, ahadnagy, joaogante • Aug 5, 2025 • 513

upvoted an article 10 months ago

Article

seemore: Implement a Vision Language Model from Scratch

AviSoori1x • Jun 23, 2024 • 109

upvoted a paper 10 months ago

Kimi k1.5: Scaling Reinforcement Learning with LLMs

Paper • 2501.12599 • Published Jan 22, 2025 • 130

upvoted an article 10 months ago

Article

🤔👀🎬🖥️📖 Kimi-VL-A3B-Thinking-2506: A Quick Navigation

moonshotai • Jun 21, 2025 • 77

upvoted a paper 10 months ago

GTA1: GUI Test-time Scaling Agent

Paper • 2507.05791 • Published Jul 8, 2025 • 27

upvoted a collection 10 months ago

NuExtract-2.0

Collection

Models specialized in extracting structured information (JSON) from text, PDFs, scans, spreadsheets, etc. • 9 items • Updated Mar 2 • 29

upvoted an article 10 months ago

Article

SmolLM3: smol, multilingual, long-context reasoner

eliebak, cmpatino, anton-l, edbeeching, m-ric, nouamanetazi, akseljoonas, guipenedo, hynky, clefourrier, SaylorTwift, kashif, qgallouedec, hlarcher, glutamatt, Xenova, reach-vb, ngxson, craffel, lewtun, loubnabnl, lvwerra, thomwolf • Jul 8, 2025 • 773

upvoted a paper 10 months ago

How Well Does GPT-4o Understand Vision? Evaluating Multimodal Foundation Models on Standard Computer Vision Tasks

Paper • 2507.01955 • Published Jul 2, 2025 • 36

upvoted 3 collections 10 months ago

upvoted a paper 11 months ago

SpatialVLM: Endowing Vision-Language Models with Spatial Reasoning Capabilities

Paper • 2401.12168 • Published Jan 22, 2024 • 29

upvoted an article 11 months ago

Article

Gemma 3n fully available in the open-source ecosystem!

ariG23498, pcuenq, sergiopaniego, reach-vb, FL33TW00D-HF, Xenova, Steveeeeeeen, kashif • Jun 26, 2025 • 121
