Man Cub
mancub
AI & ML interests
None yet
Recent Activity
new activity 9 minutes ago
froggeric/Qwen-Fixed-Chat-Templates: v11/v12 performance considerations with Claude Code?
new activity about 2 hours ago
froggeric/Qwen-Fixed-Chat-Templates: When using Claude Code, tool calls end up broken with this chat template in Qwen3.6-27B
Organizations
None yet
v11/v12 performance considerations with Claude Code?
2
#11 opened about 2 hours ago
by
mancub
When using Claude Code, tool calls end up broken with this chat template in Qwen3.6-27B
6
#6 opened 2 days ago
by
mancub
RuntimeError: expected mat1 and mat2 to have the same dtype, but got: float != c10::Half
#10 opened 3 days ago
by
mancub
Good quant!
12
#1 opened 8 days ago
by
qenme
INT8 version for TP=2 / dual Ampere GPUs?
#6 opened 4 days ago
by
mancub
Does not appear to work with the new google drafter MTP model
#2 opened 5 days ago
by
mancub
Is it supposed to work in vllm?
1
#2 opened 5 days ago
by
mancub
Avg Draft acceptance rate is low.
17
#2 opened 17 days ago
by
fouvy
OOM and context limits reached too soon
1
#5 opened 19 days ago
by
mancub
Unable to run on 3090
1
#1 opened 18 days ago
by
mancub
How to split this model between 2 (3) GPUs and CPU/RAM ?
30
#12 opened about 2 months ago
by
mancub
My personal vLLM launch cmd on my old personal 2x3090 workstation
7
#1 opened 2 months ago
by
tclf90
What was just updated and why?
2
#1 opened about 1 month ago
by
mancub
How to use it with llama-server ?
3
#1 opened about 2 months ago
by
mancub
Poor performance and pretty lobotomized
2
#1 opened about 2 months ago
by
mancub
Love the license, confused by some of the decisions.
15
#15 opened about 2 months ago
by
CyborgPaloma
It's really good.
26
#3 opened 2 months ago
by
Shuasimodo
Increasing the precision of some of the weights when quantizing
57
#2 opened 3 months ago
by
Shuasimodo
A draft model with less parameters, for speculative thinking?
8
#5 opened 3 months ago
by
mancub