🌿 Bonsai-demo · Dashboard

requests updated
Live
Active: requests processing
Queued: requests waiting
Avg Latency: ms · last 5 min
p90 Latency: ms · last 5 min
Gen Speed: tok / s · avg
Prompt Speed: tok / s · avg

Concurrency — active & queued slots

Generation — tok / s (current)

Prompt processing — tok / s (current)

Historical
Requests — 24h: chat completions
Requests — 7d: chat completions
Requests — Total: since last restart
Tokens Generated: cumulative
Decodes: llama_decode() calls

Requests per hour — last 24h

Tokens generated (cumulative)

GPU Health