Instructions for using prithivMLmods/Muscae-Qwen3-UI-Code-4B with libraries, notebooks, and local apps. The sections below walk through each option.
- Libraries
- Transformers
How to use prithivMLmods/Muscae-Qwen3-UI-Code-4B with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="prithivMLmods/Muscae-Qwen3-UI-Code-4B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("prithivMLmods/Muscae-Qwen3-UI-Code-4B")
model = AutoModelForCausalLM.from_pretrained("prithivMLmods/Muscae-Qwen3-UI-Code-4B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
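For interactive use, the reply can also be streamed to the terminal as it is generated. A minimal sketch reusing the model, tokenizer, and inputs from the snippet above (TextStreamer ships with transformers; max_new_tokens is an arbitrary choice here):
from transformers import TextStreamer

# Stream decoded tokens to stdout as they are generated, skipping the prompt.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
_ = model.generate(**inputs, streamer=streamer, max_new_tokens=200)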
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use prithivMLmods/Muscae-Qwen3-UI-Code-4B with vLLM:
Install from pip and serve the model
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "prithivMLmods/Muscae-Qwen3-UI-Code-4B"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "prithivMLmods/Muscae-Qwen3-UI-Code-4B",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
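Because the server exposes an OpenAI-compatible API, it can also be called from Python. A minimal sketch using the openai client (assumes pip install openai and the default port 8000; vLLM does not require a real API key, so a placeholder value is used):
from openai import OpenAI

# Point the OpenAI client at the local vLLM server (OpenAI-compatible endpoint).
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

completion = client.chat.completions.create(
    model="prithivMLmods/Muscae-Qwen3-UI-Code-4B",
    messages=[
        {"role": "user", "content": "Generate a responsive pricing card with Tailwind CSS."}
    ],
)
print(completion.choices[0].message.content)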
- SGLang
How to use prithivMLmods/Muscae-Qwen3-UI-Code-4B with SGLang:
Install from pip and serve the model
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "prithivMLmods/Muscae-Qwen3-UI-Code-4B" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "prithivMLmods/Muscae-Qwen3-UI-Code-4B",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
Use Docker images
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
    --model-path "prithivMLmods/Muscae-Qwen3-UI-Code-4B" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "prithivMLmods/Muscae-Qwen3-UI-Code-4B",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
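The same OpenAI-compatible endpoint can be called from Python as well. A minimal sketch using the requests package (assumes the SGLang server above is running on port 30000; the prompt is illustrative):
import requests

# Ask the local SGLang server for a small UI component via the OpenAI-compatible API.
response = requests.post(
    "http://localhost:30000/v1/chat/completions",
    json={
        "model": "prithivMLmods/Muscae-Qwen3-UI-Code-4B",
        "messages": [
            {"role": "user", "content": "Create a responsive navbar with Tailwind CSS."}
        ],
    },
    timeout=300,
)
print(response.json()["choices"][0]["message"]["content"])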
- Docker Model Runner
How to use prithivMLmods/Muscae-Qwen3-UI-Code-4B with Docker Model Runner:
docker model run hf.co/prithivMLmods/Muscae-Qwen3-UI-Code-4B
Muscae-Qwen3-UI-Code-4B
Muscae-Qwen3-UI-Code-4B is a web-UI-focused model fine-tuned from UIGEN-T3-4B-Preview (itself built on Qwen3-4B) for controlled Abliterated Reasoning and polished token probabilities, and it is intended exclusively for experimental use. It excels at modern web UI coding tasks, structured component generation, and layout-aware reasoning, making it well suited to frontend developers, UI engineers, and research prototypes exploring structured code generation.
GGUF: https://huggingface.co/prithivMLmods/Muscae-Qwen3-UI-Code-4B-GGUF
Key Features
- UI-Oriented Abliterated Reasoning: Controlled reasoning precision tailored for frontend development and code generation, with polished token distributions ensuring structured, maintainable output.
- Web UI Component Generation: Excels at generating responsive components, semantic HTML, and Tailwind-based layouts with reasoning-aware structure and minimal boilerplate.
- Layout-Aware Structured Logic: Understands UI state flows, component hierarchies, and responsive design patterns, producing logically consistent, production-ready UI code.
- Hybrid Reasoning for Code: Combines symbolic reasoning with probabilistic inference to deliver optimized component logic, conditional rendering, and event-driven UI behavior.
- Structured Output Mastery: Natively outputs in HTML, React, Markdown, JSON, and YAML, making it ideal for UI prototyping, design systems, and documentation generation (see the example prompt after this list).
- Optimized Lightweight Footprint: With a 4B parameter size, it is deployable on mid-range GPUs, offline workstations, or edge devices while retaining strong UI coding capabilities.
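As a quick illustration of the structured-output point above, a prompt like the following (a hypothetical example, not taken from the model card) steers generation toward a machine-readable format:
# Hypothetical chat prompt asking for design tokens as JSON rather than prose.
messages = [
    {"role": "system", "content": "You are a frontend coding assistant. Respond with valid JSON only."},
    {"role": "user", "content": "Produce design tokens (colors, spacing, font sizes) for a minimal dashboard theme as JSON."},
]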
Quickstart with Transformers
from transformers import AutoModelForCausalLM, AutoTokenizer
model_name = "prithivMLmods/Muscae-Qwen3-UI-Code-4B"
model = AutoModelForCausalLM.from_pretrained(
model_name,
torch_dtype="auto",
device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
prompt = "Generate a responsive landing page hero section with Tailwind and semantic HTML."
messages = [
{"role": "system", "content": "You are a frontend coding assistant skilled in UI generation, semantic HTML, and component structuring."},
{"role": "user", "content": prompt}
]
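# Render the chat messages with the model's chat template; tokenization happens in the next step.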
text = tokenizer.apply_chat_template(
messages,
tokenize=False,
add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
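# Generate up to 512 new tokens of UI code.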
generated_ids = model.generate(
**model_inputs,
max_new_tokens=512
)
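# Slice off the prompt tokens so only the newly generated text is decoded.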
generated_ids = [
output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
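Because the expected response for this prompt is an HTML/Tailwind snippet, a convenient way to inspect it is to write it to a file and open it in a browser. A minimal follow-on sketch (assumes the response is raw markup; the filename is arbitrary):
# Save the generated markup for a quick browser preview.
with open("hero_section.html", "w", encoding="utf-8") as f:
    f.write(response)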
Intended Use
- Web UI coding and component generation
- Responsive layout and frontend architecture prototyping
- Semantic HTML, Tailwind, and React code generation
- Research and experimental projects on structured code synthesis
- Design-system-driven development workflows
Limitations
- Experimental model – not optimized for production-critical deployments
- Focused on UI coding – not suitable for general reasoning or creative writing
- May produce inconsistent results with very long prompts or cross-framework tasks
- Prioritizes structure and correctness over stylistic creativity or verbosity