distilbert-intent-sql-creative-general

Fine-tuned distilbert-base-uncased for 3-class intent routing in an LLM inference pipeline.

Purpose

Routes each user prompt either to the base model or to the appropriate vLLM LoRA adapter on a Triton Inference Server:

Label     ID  Routes to
GENERAL   0   Qwen2.5-7B-Instruct (no LoRA)
SQL       1   sql-expert LoRA adapter
CREATIVE  2   creative LoRA adapter
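
The routing table can be read as a lookup from predicted label ID to serving target. The following is a minimal sketch, not the pipeline's actual routing code: the helper name `resolve_adapter` is hypothetical, the adapter strings assume that `sql-expert` and `creative` are the identifiers registered with vLLM, and falling back to GENERAL for an unknown ID is an assumption.

```python
from typing import Optional

# Label IDs as listed in the routing table.
ID2LABEL = {0: "GENERAL", 1: "SQL", 2: "CREATIVE"}

# Assumed adapter identifiers; None means the base model is used without a LoRA adapter.
LABEL2ADAPTER: dict[str, Optional[str]] = {
    "GENERAL": None,
    "SQL": "sql-expert",
    "CREATIVE": "creative",
}


def resolve_adapter(label_id: int) -> Optional[str]:
    """Return the LoRA adapter name for a predicted label ID, or None for the base model."""
    # Assumption: IDs outside the table fall back to GENERAL.
    label = ID2LABEL.get(label_id, "GENERAL")
    return LABEL2ADAPTER[label]
```

Returning `None` for GENERAL keeps the "no LoRA" case explicit, so the caller can decide how to address the base Qwen2.5-7B-Instruct model.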

Training

  • Base model: distilbert/distilbert-base-uncased
  • Dataset: 84 hand-curated examples (SQL=30, CREATIVE=23, GENERAL=31)
  • Epochs: 5
  • Learning rate: 2e-5
  • Batch size: 16
  • Max sequence length: 128
  • Optimizer: AdamW (weight_decay=0.01)
  • Val split: 20% stratified
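
To make the split and step counts concrete, here is a standard-library sketch of a 20% stratified split over the class counts listed above. It is illustrative only: the card does not say which tool produced the split, the per-class rounding rule and the seed are assumptions, and the example texts are placeholders.

```python
import math
import random
from collections import defaultdict


def stratified_split(examples, val_fraction=0.2, seed=0):
    """Split (text, label) pairs so each label keeps roughly the same share in both sets."""
    by_label = defaultdict(list)
    for text, label in examples:
        by_label[label].append((text, label))

    rng = random.Random(seed)
    train, val = [], []
    for label in sorted(by_label):
        items = by_label[label]
        rng.shuffle(items)
        # Assumption: round the validation count per class to the nearest integer.
        n_val = round(len(items) * val_fraction)
        val.extend(items[:n_val])
        train.extend(items[n_val:])
    return train, val


# Placeholder texts with the class counts from the training summary.
counts = {"SQL": 30, "CREATIVE": 23, "GENERAL": 31}
examples = [(f"{label} example {i}", label) for label, n in counts.items() for i in range(n)]

train, val = stratified_split(examples)
steps_per_epoch = math.ceil(len(train) / 16)  # batch size 16
total_steps = steps_per_epoch * 5             # 5 epochs
```

Under these assumptions the split yields 67 training and 17 validation examples, which gives 5 optimizer steps per epoch and 25 steps in total. A different rounding rule could shift the split by an example or two.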

Deployment

Exported to ONNX (opset 17) via optimum and served with the ONNX Runtime backend of NVIDIA Triton Inference Server on GKE Autopilot (NVIDIA L4 GPU).
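
A sequence-classification model exported to ONNX typically returns raw logits rather than label strings, so a client of the served model has to decode them. The sketch below shows one way to do that with the standard library; the helper name `decode_logits` is hypothetical, the logits are made-up values, and the ID-to-label mapping is taken from the routing table.

```python
import math

# Label IDs as listed in the routing table.
ID2LABEL = {0: "GENERAL", 1: "SQL", 2: "CREATIVE"}


def softmax(logits):
    """Convert raw logits to probabilities, subtracting the max for numerical stability."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode_logits(logits):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]


# Made-up logits for a single prompt; a real request would read them from the server response.
label, score = decode_logits([-1.2, 3.4, -0.7])
```

The resulting label and score mirror what the transformers pipeline reports in its `label` and `score` fields.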

Usage

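from transformers import pipeline

# Load the fine-tuned classifier from the Hugging Face Hub.
classifier = pipeline("text-classification", model="xczou/distilbert-intent-sql-creative-general")

# The pipeline returns a list with the top label and its score.
classifier("Write a SQL query to find all orders above 100")
# [{'label': 'SQL', 'score': 0.98}]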