Man Cub

mancub

AI & ML interests

None yet

Organizations

None yet

New activity in froggeric/Qwen-Fixed-Chat-Templates 9 minutes ago

v11/v12 performance considerations with Claude Code?

#11 opened about 2 hours ago by

New activity in froggeric/Qwen-Fixed-Chat-Templates about 2 hours ago

When using Claude Code, tool calls end up broken with this chat template in Qwen3.6-27B

#6 opened 2 days ago by

New activity in z-lab/Qwen3.6-27B-DFlash 3 days ago

RuntimeError: expected mat1 and mat2 to have the same dtype, but got: float != c10::Half

#10 opened 3 days ago by

New activity in Minachist/Qwen3.6-27B-INT8-AutoRound 4 days ago

Good quant!

#1 opened 8 days ago by

New activity in Intel/gemma-4-31B-it-int4-AutoRound 4 days ago

INT8 version for TP=2 / dual Ampere GPUs?

#6 opened 4 days ago by

New activity in QuantTrio/gemma-4-31B-it-AWQ 5 days ago

Does not appear to work with the new google drafter MTP model

#2 opened 5 days ago by

New activity in google/gemma-4-31B-it-assistant 5 days ago

Is it supposed to work in vllm?

#2 opened 5 days ago by

New activity in z-lab/Qwen3.6-27B-DFlash 13 days ago

Avg Draft acceptance rate is low.

#2 opened 17 days ago by

New activity in LuffyTheFox/Qwen3.6-35B-A3B-Uncensored-Wasserstein-GGUF 18 days ago

OOM and context limits reached too soon

#5 opened 19 days ago by

New activity in rdtand/Qwen3.6-27B-PrismaQuant-5.5bit-vllm 18 days ago

Unable to run on 3090

#1 opened 18 days ago by

New activity in AesSedai/Qwen3.6-35B-A3B-GGUF 18 days ago

Q6_K?

#1 opened 24 days ago by

New activity in ubergarm/Qwen3.5-122B-A10B-GGUF 18 days ago

How to split this model between 2 (3) GPUs and CPU/RAM ?

#12 opened about 2 months ago by

New activity in QuantTrio/Qwen3.5-27B-AWQ about 1 month ago

My personal vLLM launch cmd on my old personal 2x3090 workstation

#1 opened 2 months ago by

New activity in mudler/gemma-4-26B-A4B-it-APEX-GGUF about 1 month ago

What was just updated and why?

#1 opened about 1 month ago by

New activity in adamjen/Devstral-Small-2-24B-Opus-Reasoning about 2 months ago

How to use it with llama-server ?

#1 opened about 2 months ago by

New activity in noctrex/Mistral-Small-4-119B-2603-MXFP4_MOE-GGUF about 2 months ago

Poor performance and pretty lobotomized

#1 opened about 2 months ago by

New activity in mistralai/Mistral-Small-4-119B-2603 about 2 months ago

Love the license, confused by some of the decisions.

#15 opened about 2 months ago by

New activity in noctrex/Qwen3.5-35B-A3B-MXFP4_MOE-GGUF 2 months ago

It's really good.

#3 opened 2 months ago by

New activity in noctrex/Qwen3-Coder-Next-MXFP4_MOE-GGUF 3 months ago

Increasing the precision of some of the weights when quantizing

#2 opened 3 months ago by

New activity in TeichAI/GLM-4.7-Flash-Claude-Opus-4.5-High-Reasoning-Distill-GGUF 3 months ago

A draft model with less parameters, for speculative thinking?

#5 opened 3 months ago by