In a Training Loop 🔄

R PRO

juiceb0xc0de

JuiceB0xC0de

AI & ML interests

destroying heuristic determination in 4 dimensions to flood the engines with diversity and a lot of swear words

Recent Activity

posted an update about 15 hours ago

Okay, I had way too much fun trying to make the unsloth-bot hallucinate incorrect answers about fine-tuning and general machine learning, the way so many frontier models have done to me in the past. Learning to fine-tune LLMs would have been so much simpler if this had been available when I began screwing around with neural networks. 10/10, recommend for beginners. https://huggingface.co/unsloth/unsloth-bot

posted an update 1 day ago

Last week I dropped a new scheduler I created, the Lucky Pick Scheduler, without much explanation of what it was or how it worked. It was just a Modal-ready app that anyone could have launched and troubleshot their way around. I've decided to enter it into the AMD hackathon, and today I started putting together a GitHub repo with a few extra additions to the scheduler itself.

Essentially, it's a training scheduler that randomly drops layers/heads/channels every ~50 steps during fine-tuning, holds the topology frozen, then reshuffles. In theory, the model has to build distributed representations because it never trains through the same compute path for long. And with less gradient memory, bigger models are able to fit on smaller hardware. It's now close to fully capable of automatically configuring itself to any language model.

I've tested it on:
- Qwen-2.5-3b-Instruct
- Falcon-E-3B-Instruct
- SmolLM2-360M
- Ministral-3-3B-Instruct-2512
- Doge-320M
- Llama-3.2-3b
- Gemma-4-e4b
- Phi-4-mini
- OLMo-2-0425-1B
- Phi-tiny-MoE-instruct

Feel free to check it out on GitHub: https://github.com/JuiceB0xC0de/lucky-pick-scheduler.git
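
To make the drop, hold, reshuffle cycle described above concrete, here is a minimal sketch of the timing logic only. It is not the code from the repo: the class name `TopologyReshuffler`, its parameters (`keep_fraction`, `hold_steps`, `seed`), and the helper `run_schedule` are all hypothetical, and it only picks whole layers, not heads or channels.

```python
import random


class TopologyReshuffler:
    """Hypothetical sketch, not the Lucky Pick Scheduler itself.

    Picks a random subset of layer indices to keep active, holds that
    choice for `hold_steps` training steps, then resamples.
    """

    def __init__(self, num_layers, keep_fraction=0.5, hold_steps=50, seed=0):
        if not 0.0 < keep_fraction <= 1.0:
            raise ValueError("keep_fraction must be in (0, 1]")
        self.num_layers = num_layers
        # Always keep at least one layer so there is something to train.
        self.num_keep = max(1, round(num_layers * keep_fraction))
        self.hold_steps = hold_steps
        self.rng = random.Random(seed)
        self.active = frozenset()
        self.reshuffles = 0

    def step(self, global_step):
        """Return the set of active layer indices for this training step."""
        # Resample at the start of every hold window (or on first use).
        if not self.active or global_step % self.hold_steps == 0:
            picked = self.rng.sample(range(self.num_layers), self.num_keep)
            self.active = frozenset(picked)
            self.reshuffles += 1
        return self.active


def run_schedule(num_steps=200, num_layers=12):
    """Drive the sketch for `num_steps` steps and record the active sets."""
    scheduler = TopologyReshuffler(num_layers=num_layers)
    history = [scheduler.step(s) for s in range(num_steps)]
    return scheduler, history
```

In an actual fine-tuning loop, the layers outside the active set would be the ones skipped or excluded from gradient computation for that window; how that is wired into a specific model is left out here because it depends on the architecture.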

liked a model 10 days ago

Jackrong/Qwopus3.5-9B-v3-GGUF

juiceb0xc0de's models (10)

juiceb0xc0de/bella-bartender-8b-llama3.1

Text Generation • 8B • Updated 28 days ago • 1.55k • 4

juiceb0xc0de/bella-bartender-3b

Text Generation • 3B • Updated about 1 month ago • 235 • 2

juiceb0xc0de/bella-bartender-v2-8b

Text Generation • 8B • Updated about 1 month ago • 206 • 3

juiceb0xc0de/bella-bartender-9b-yi

Text Generation • 9B • Updated Mar 24 • 592 • 1

juiceb0xc0de/bella-bartender-heretic-1b

Text Generation • 1B • Updated Mar 21 • 929 • 1

juiceb0xc0de/bella-bartender-1b

Text Generation • 1B • Updated Mar 21 • 624 • 1

juiceb0xc0de/bella-bartender-heretic-3b

Text Generation • 3B • Updated Mar 20 • 763 • 2

juiceb0xc0de/bella-tao-merged-qwen2_5-coder-7b

Text Generation • 8B • Updated Mar 19 • 143

juiceb0xc0de/bella-bartender-v2-moody-8b

Text Generation • 8B • Updated Mar 11 • 78 • 1

juiceb0xc0de/dread-llama-8b-existential

Text Generation • 8B • Updated Mar 9 • 117 • 1