distilbert-intent-sql-creative-general

Fine-tuned distilbert-base-uncased for 3-class intent routing in an LLM inference pipeline.

Purpose

Routes each user prompt to either the base model or the matching vLLM LoRA adapter served on Triton Inference Server:

| Label | ID | Routes to |
|---|---|---|
| `GENERAL` | 0 | Qwen2.5-7B-Instruct (no LoRA) |
| `SQL` | 1 | `sql-expert` LoRA adapter |
| `CREATIVE` | 2 | `creative` LoRA adapter |
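
The table above can be read as a lookup from predicted label to serving target. A minimal sketch, assuming hypothetical names `ROUTES` and `route_for_label`: the adapter names come from the table, while falling back to the base model for an unrecognized label is an assumption rather than documented behavior.

```python
from typing import Optional

# Label -> LoRA adapter name; None means the base Qwen2.5-7B-Instruct model.
ROUTES: dict[str, Optional[str]] = {
    "GENERAL": None,
    "SQL": "sql-expert",
    "CREATIVE": "creative",
}


def route_for_label(label: str) -> Optional[str]:
    """Return the LoRA adapter name for a predicted label, or None for the base model."""
    # Assumption: labels not in the table fall back to the base model.
    return ROUTES.get(label.upper())
```

A caller would pass the classifier's predicted label to `route_for_label` and attach the returned adapter name, if any, to the downstream generation request.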

Training

Base model: distilbert/distilbert-base-uncased
Dataset: 84 hand-curated examples (SQL=30, CREATIVE=23, GENERAL=31)
Epochs: 5
Learning rate: 2e-5
Batch size: 16
Max sequence length: 128
Optimizer: AdamW (weight_decay=0.01)
Validation split: 20%, stratified by label
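
For a sense of scale, the figures above imply a very small validation set and few optimizer steps. The sketch below works through that arithmetic, assuming the 20% split is taken per class with rounding and that the last partial batch is kept. The actual split code is not part of this card, and `split_sizes` is a hypothetical helper.

```python
import math

CLASS_COUNTS = {"SQL": 30, "CREATIVE": 23, "GENERAL": 31}  # from the dataset line
VAL_FRACTION = 0.2
BATCH_SIZE = 16
EPOCHS = 5


def split_sizes(counts: dict[str, int], val_fraction: float) -> tuple[int, int]:
    """Return (train_size, val_size) for a per-class split with rounding."""
    val = sum(round(n * val_fraction) for n in counts.values())
    return sum(counts.values()) - val, val


train_size, val_size = split_sizes(CLASS_COUNTS, VAL_FRACTION)
steps_per_epoch = math.ceil(train_size / BATCH_SIZE)  # keeps the last partial batch
total_steps = steps_per_epoch * EPOCHS
print(train_size, val_size, steps_per_epoch, total_steps)  # 67 17 5 25
```

Under these assumptions there are 67 training and 17 validation examples, so a single validation example moves accuracy by roughly 6 percentage points.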

Deployment

Exported to ONNX (opset 17) via Optimum and served through the ONNX Runtime backend of NVIDIA Triton Inference Server on GKE Autopilot (NVIDIA L4 GPU).
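
An ONNX export of a sequence-classification model typically returns raw logits rather than the label/score pairs that the transformers pipeline produces, so a client of the served model would need to apply softmax and the ID-to-label mapping itself. A minimal sketch of that post-processing, assuming the logits arrive as a list of three floats in label-ID order. `ID2LABEL` mirrors the routing table, and `decode_logits` is a hypothetical helper, not part of the deployed service.

```python
import math

ID2LABEL = {0: "GENERAL", 1: "SQL", 2: "CREATIVE"}  # IDs from the routing table


def decode_logits(logits: list[float]) -> tuple[str, float]:
    """Convert one row of raw logits into (label, probability)."""
    # Subtract the max logit for numerical stability before exponentiating.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Index of the highest probability is the predicted label ID.
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]


label, score = decode_logits([-1.2, 3.4, -0.7])  # made-up logits for illustration
print(label, round(score, 3))
```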

Usage

from transformers import pipeline

# Load the fine-tuned classifier from the Hugging Face Hub.
classifier = pipeline("text-classification", model="xczou/distilbert-intent-sql-creative-general")
classifier("Write a SQL query to find all orders above 100")
# Example output (exact score will vary): [{'label': 'SQL', 'score': 0.98}]

Weights format: Safetensors
Model size: 67M params
Tensor type: F32