Qwen3.6 — 27B
Local / Private AI
Alibaba's latest 27B dense model doesn't just succeed the previous local AI king: it surpasses their own 397B flagship on every major agentic coding benchmark while running on a single consumer GPU. SWE-bench Verified 77.2, Terminal-Bench 2.0 59.3, native vision and video, Apache 2.0. The local inference turning point.
Beats Qwen3.5-397B-A17B (a 397B MoE model) on SWE-bench Verified (77.2), SWE-bench Pro (53.5), Terminal-Bench 2.0 (59.3), and SkillsBench Avg5 (48.2). GPQA Diamond 87.8. Native multimodal with thinking preservation. r/LocalLLaMA calls it "the biggest release of the year" and "a turning point for local inference."
Similar VRAM profile to its predecessor (~17–20 GB in 4-bit); still very new, so quantized builds are only starting to roll out; thinking mode can be verbose on simpler tasks (it is toggleable). Not quite closed-model SOTA on the absolute hardest long-horizon agent runs.
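The ~17–20 GB figure is easy to sanity-check with back-of-the-envelope arithmetic. A minimal sketch; the overhead fractions below are illustrative assumptions (KV cache, activations, runtime buffers vary by engine and context length), not measured numbers:

```python
# Rough VRAM estimate for a 27B dense model quantized to 4 bits per weight.
# Overhead fractions are assumptions for illustration, not measured values.

def estimate_vram_gb(params_b: float, bits_per_weight: float,
                     overhead_frac: float = 0.3) -> float:
    """Weights plus a flat overhead allowance (KV cache, activations, buffers)."""
    weights_gb = params_b * 1e9 * bits_per_weight / 8 / 1e9  # bytes -> GB
    return weights_gb * (1 + overhead_frac)

weights_only = estimate_vram_gb(27, 4, overhead_frac=0.0)   # 13.5 GB of raw weights
low = estimate_vram_gb(27, 4, overhead_frac=0.25)           # ~16.9 GB
high = estimate_vram_gb(27, 4, overhead_frac=0.45)          # ~19.6 GB
print(f"weights only: {weights_only:.1f} GB, "
      f"with overhead: {low:.1f}-{high:.1f} GB")
```

27B weights at 4 bits are 13.5 GB on their own, so a modest runtime overhead lands squarely in the quoted 17–20 GB range, i.e. within reach of a single 24 GB consumer GPU.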