Hugging Face
43.9
TFLOPS
13
10
240
Aunali
Cossale
8 followers · 42 following
https://auna.li?q=hf
XCossale
Aunali321
AI & ML interests
Text2Image and Text2Text generation.
Recent Activity
liked a dataset
5 days ago
fvdfs41/Discord-Unveiled
reacted to qgallouedec's post with 🔥
9 days ago
TRL v1.2 introduces the SSDTrainer

Simple Self-Distillation (SSD) from Apple's paper "Embarrassingly Simple Self-Distillation Improves Code Generation" is now available as an experimental trainer in TRL.

The recipe is as minimal as the name suggests: sample completions from the model itself at a training-time temperature, then fine-tune on those raw, unverified samples with plain cross-entropy.

No reward model. No verifier. No teacher model. No reinforcement learning. Just prompts and the model.

```python
from trl.experimental.ssd import SSDConfig, SSDTrainer

trainer = SSDTrainer(
    model="Qwen/Qwen3-4B-Instruct",
    args=SSDConfig(temperature=0.6, top_k=20, top_p=0.95),
    train_dataset=dataset,
)
trainer.train()
```

v1.2 also ships expanded tool-calling support (LLaMA 3.1 / 3.2, DeepSeek-V3), another round of KTO ↔ DPO alignment getting us closer to promoting KTO to stable, a big GRPO simplification for overlong tool results, deprecation of `use_transformers_paged`, and key fixes for VLM response parsing.

Full release notes: https://github.com/huggingface/trl/releases/tag/v1.2.0
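
The two ingredients the post names, sampling with `temperature`, `top_k`, and `top_p`, then scoring the sampled token with plain cross-entropy, can be illustrated on a single next-token distribution. The following is a minimal standard-library sketch and not TRL's implementation: the function names `filter_distribution`, `sample_token`, and `cross_entropy` are hypothetical, and the order of operations (temperature, then top-k, then top-p over the renormalized top-k mass) is an assumption that real libraries may not share.

```python
import math
import random


def filter_distribution(logits, temperature=0.6, top_k=20, top_p=0.95):
    """Return {token_index: probability} after temperature, top-k and top-p.

    Illustrative only: the ordering of the three filters is an assumption.
    """
    # Temperature scaling; subtract the max before exp for numerical stability.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]

    # Top-k: keep the k highest-weight tokens, most likely first.
    ranked = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    ranked = ranked[:top_k]
    mass = sum(weights[i] for i in ranked)

    # Top-p: keep the shortest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += weights[i] / mass
        if cumulative >= top_p:
            break

    # Renormalize over the surviving tokens.
    norm = sum(weights[i] for i in kept)
    return {i: weights[i] / norm for i in kept}


def sample_token(logits, rng, **sampling_kwargs):
    """Draw one token index from the filtered distribution."""
    dist = filter_distribution(logits, **sampling_kwargs)
    tokens = list(dist)
    return rng.choices(tokens, weights=[dist[t] for t in tokens], k=1)[0]


def cross_entropy(logits, target):
    """Negative log-probability of `target` under softmax(logits)."""
    peak = max(logits)
    log_z = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_z - logits[target]


if __name__ == "__main__":
    rng = random.Random(0)            # fixed seed so the draw is repeatable
    logits = [2.0, 1.0, 0.0, -5.0]    # made-up scores for a 4-token vocabulary
    token = sample_token(logits, rng, temperature=0.6, top_k=20, top_p=0.95)
    # In a self-distillation step, the sampled token becomes the training target.
    print(token, cross_entropy(logits, token))
```

In an actual training run the model samples whole completions and the loss is averaged over their tokens; this sketch only shows the per-token arithmetic.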
liked a model
19 days ago
openbmb/VoxCPM2
Organizations
Cossale's Spaces 2
Runtime error
Agents
OmniGen
🖼
Image generator/identifier/reposer
Sleeping
Agents
1
AI4BharatTranslation
Translate text between Indian languages and English