In a Training Loop 🔄


Louis Ulmer

lulmer


AI & ML interests

NLP (semantic search, topic generation), computer vision (object detection), diffusion models

Recent Activity

liked a model 6 days ago

deepseek-ai/DeepSeek-V4-Pro

Text Generation • 862B • Updated 6 days ago • 2.02M • 3.86k

liked a model 7 days ago

poolside/Laguna-XS.2

Text Generation • 33B • Updated 4 days ago • 25.6k • 244

liked a model 4 months ago

zai-org/GLM-4.7-Flash

Text Generation • 31B • Updated Jan 29 • 617k • 1.72k

New activity in stas/openwebtext-10k 4 months ago

Convert dataset to Parquet

#3 opened 4 months ago by lulmer

upvoted a paper 4 months ago

MHLA: Restoring Expressivity of Linear Attention via Token-Level Multi-Head

Paper • 2601.07832 • Published Jan 12 • 52

liked a dataset 5 months ago

khaihernlow/financial-reports-sec

Updated Jan 6, 2023 • 398 • 2

upvoted a paper 8 months ago

A Survey of Reinforcement Learning for Large Reasoning Models

Paper • 2509.08827 • Published Sep 10, 2025 • 193

upvoted a paper 10 months ago

Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation

Paper • 2507.10524 • Published Jul 14, 2025 • 73

upvoted an article 10 months ago

Bringing Fusion Down to Earth: ML for Stellarator Optimization

Article • cgeorgiaw • Jul 2, 2025 • 80

liked a model 11 months ago

black-forest-labs/FLUX.1-dev

Text-to-Image • Updated Jun 27, 2025 • 723k • 12.8k

New activity in Qwen/Qwen2.5-VL-7B-Instruct 11 months ago

Exception: Could not find the transformer layer class to wrap in the model.

👍 4

#2 opened over 1 year ago by atishay-scribe

upvoted an article 11 months ago

🐯 Liger GRPO meets TRL

Article • shisahni, kashif, smohammadi, ShirinYamani, m0m0chen, liberty4321 • May 25, 2025 • 53

liked a Space 11 months ago

🌌 The Ultra-Scale Playbook • 3.83k

The ultimate guide to training LLM on large GPU Clusters

upvoted an article 12 months ago

Falcon-H1: A Family of Hybrid-Head Language Models Redefining Efficiency and Performance

Article • tiiuae • May 21, 2025 • 39

liked 5 datasets about 1 year ago

liked a model about 1 year ago

Qwen/Qwen2.5-Coder-32B-Instruct

Text Generation • 33B • Updated Jan 12, 2025 • 1.05M • 2.02k
