distilbert-intent-sql-creative-general
Fine-tuned distilbert-base-uncased for 3-class intent routing in an LLM inference pipeline.
Purpose
Routes each user prompt to the appropriate vLLM target (the base model or a LoRA adapter) on a Triton Inference Server:
| Label | ID | Routes to |
|---|---|---|
| GENERAL | 0 | Qwen2.5-7B-Instruct (no LoRA) |
| SQL | 1 | sql-expert LoRA adapter |
| CREATIVE | 2 | creative LoRA adapter |
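The routing table can be expressed as a small lookup on the client side. This is a minimal sketch, not the pipeline's actual code: the function name `resolve_adapter` is hypothetical, and it assumes the adapter identifiers are the plain strings `sql-expert` and `creative` shown in the table, with `None` standing for the base model without a LoRA.

```python
from typing import Optional, Tuple

# Label IDs and targets as listed in the table above.
ID2LABEL = {0: "GENERAL", 1: "SQL", 2: "CREATIVE"}

# Assumption: None means "use the base Qwen2.5-7B-Instruct model, no LoRA".
LABEL2ADAPTER = {"GENERAL": None, "SQL": "sql-expert", "CREATIVE": "creative"}


def resolve_adapter(label_id: int) -> Tuple[str, Optional[str]]:
    """Map a predicted class ID to (label, adapter name or None)."""
    label = ID2LABEL[label_id]
    return label, LABEL2ADAPTER[label]


print(resolve_adapter(1))  # ('SQL', 'sql-expert')
print(resolve_adapter(0))  # ('GENERAL', None)
```

How the adapter name is then passed to vLLM depends on the serving setup and is not covered here.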
Training
- Base model: distilbert/distilbert-base-uncased
- Dataset: 84 hand-curated examples (SQL=30, CREATIVE=23, GENERAL=31)
- Epochs: 5
- Learning rate: 2e-5
- Batch size: 16
- Max sequence length: 128
- Optimizer: AdamW (weight_decay=0.01)
- Val split: 20% stratified
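To make the split and batch settings concrete, the sketch below performs a per-class 20% hold-out on label counts matching the dataset above and derives the number of optimizer steps per epoch. It is illustrative only: the helper `stratified_split`, the seed, and the per-class rounding rule are assumptions, and the original training script may have split the data differently.

```python
import math
import random
from collections import defaultdict


def stratified_split(labels, val_fraction=0.2, seed=0):
    """Return (train_idx, val_idx), holding out val_fraction of each class."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    train_idx, val_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Assumption: round each class's validation count to the nearest integer.
        n_val = round(len(indices) * val_fraction)
        val_idx.extend(indices[:n_val])
        train_idx.extend(indices[n_val:])
    return sorted(train_idx), sorted(val_idx)


# Class counts taken from the dataset description above.
labels = ["SQL"] * 30 + ["CREATIVE"] * 23 + ["GENERAL"] * 31
train_idx, val_idx = stratified_split(labels)

batch_size, epochs = 16, 5
steps_per_epoch = math.ceil(len(train_idx) / batch_size)

print(len(train_idx), len(val_idx))  # 67 17
print(steps_per_epoch, steps_per_epoch * epochs)  # 5 25
```

Under these assumptions the model sees roughly 67 training examples and takes about 25 optimizer steps in total, which is worth keeping in mind when interpreting validation scores on 17 examples.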
Deployment
Exported to ONNX (opset 17) via optimum and served as an ONNX Runtime backend model inside NVIDIA Triton Inference Server on GKE Autopilot (NVIDIA L4 GPU).
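When the model is served through ONNX Runtime rather than the `transformers` pipeline, the client typically has to turn the raw output into a label itself. The sketch below assumes the exported model returns one vector of three unnormalized logits per prompt, ordered by the label IDs in the table above; the function name `label_from_logits` is hypothetical, and the actual output tensor name and shape should be checked against the deployed Triton model configuration.

```python
import math

# Assumption: logit order follows the label IDs documented for this model.
ID2LABEL = {0: "GENERAL", 1: "SQL", 2: "CREATIVE"}


def label_from_logits(logits):
    """Apply a numerically stable softmax and return (label, probability)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]


# Made-up logits for illustration, not real model output.
label, score = label_from_logits([0.1, 3.0, -1.0])
print(label, round(score, 2))  # SQL 0.93
```

Subtracting the maximum logit before exponentiating avoids overflow without changing the resulting probabilities.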
Usage
from transformers import pipeline
classifier = pipeline("text-classification", model="xczou/distilbert-intent-sql-creative-general")
classifier("Write a SQL query to find all orders above 100")
# [{'label': 'SQL', 'score': 0.98}]