
🚀 Dexter

The Ultimate Ecosystem to Spin Up, Run, and Fine-Tune Large Language Models



Dexter provides a straight line from your first API call to autonomous AI research loops and production endpoints. It is designed for Gemma-4-31B and optimized for low-cost, high-performance cloud GPUs.

🎮 The Dexter OS (Start Here!)

The easiest way to interact with the entire ecosystem, run labs, and manage your cloud swarm is through the interactive Dexter OS CLI.

# Clone the repository
git clone https://github.com/lyffseba/dexter.git
cd dexter

# Boot up the Dexter OS
python3 dexter_cli.py

The OS will guide you through running the local Mojo labs and deploying the cloud inference endpoints automatically.

🗂️ Ecosystem Structure

dexter/
├── scripts/                 # Deployment scripts for Inference (Modular MAX)
└── labs/                    # Interactive Tinkering & Fine-Tuning Labs
    ├── docs/                      # 📖 Dedicated documentation per lab
    ├── 00_getting_started.ipynb   # 🟢 START HERE: Data, Tokenizers & Mojo setup
    ├── 01_inference_test.ipynb    # 💬 API Prompting & Generation
    ├── 02_qlora_finetuning.ipynb  # 🧠 QLoRA fine-tuning 31B models
    └── autoresearch/              # 🤖 Autonomous AI Agent Research Loop (PyTorch -> Mojo port)

🌟 The Best Platform: Modular Cloud

For cost-to-performance, Modular Cloud is currently the best platform. It delivers the fastest inference speeds for Gemma-4 through MAX, Modular's GenAI-native modeling and serving framework, which outperforms vLLM on both NVIDIA and AMD hardware.

While Modular Cloud is ideal for inference and deployment, RunPod (RTX 3090/4090) remains the best value for bare-metal SSH access required for the autonomous PyTorch training labs.


🛠️ Getting Started on Modular Cloud

1. Account Setup

  1. Go to the Modular Console and sign up.
  2. Once signed in, the console takes you from your first API call straight to a production endpoint.

2. Running Gemma-4-31B (Inference)

Inside your cloud environment, install the Modular CLI and MAX framework:

bash scripts/01_setup_inference.sh

Then start the MAX engine server:

bash scripts/02_start_server.sh

Now you can interact with the model from Python or through any OpenAI-compatible client!
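As a minimal sketch of talking to the running server from Python: MAX serves an OpenAI-compatible API, so a plain chat-completions request works. The base URL, port, and model name below are assumptions — adjust them to match what scripts/02_start_server.sh actually starts in your environment.

```python
import json
import urllib.request

# Assumptions: the MAX server started by scripts/02_start_server.sh exposes an
# OpenAI-compatible endpoint at this address, and the model is registered
# under this name. Change both to match your deployment.
BASE_URL = "http://localhost:8000/v1"
MODEL_NAME = "gemma-4-31b"


def build_chat_request(prompt: str, max_tokens: int = 256) -> dict:
    """Build an OpenAI-style chat-completions payload."""
    return {
        "model": MODEL_NAME,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }


def chat(prompt: str) -> str:
    """POST the prompt to the server and return the first completion text."""
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(build_chat_request(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


if __name__ == "__main__":
    # Requires the MAX server to be running locally.
    print(chat("Explain QLoRA in one sentence."))
```

Because the endpoint follows the OpenAI wire format, the same server also works with the official `openai` client or any OpenAI-compatible UI pointed at BASE_URL.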


🔬 Tinkering & Fine-Tuning Labs

Check the labs/ directory for Jupyter Notebooks designed to let you play with the model locally and in the cloud.


Built with ❤️ by lyffseba and the autonomous swarm