I’m specifically suggesting support for the Qwen3.5-35B-A3B and 27B models, both of which are VL models and extremely capable for their size. The current large model is powerful but clunky for most tasks.
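Until there's an official recipe, here's a minimal sketch of serving one of these checkpoints with vLLM's offline Python API. The Hugging Face repo ID `Qwen/Qwen3.5-35B-A3B` is an assumption on my part, based on the model naming in this thread, so substitute whatever ID actually ships:

```python
# Minimal sketch: text-only generation against an assumed
# Qwen3.5-35B-A3B checkpoint using vLLM's offline API.
from vllm import LLM, SamplingParams

llm = LLM(
    model="Qwen/Qwen3.5-35B-A3B",  # assumed repo ID; not yet confirmed
    max_model_len=32768,           # cap context to keep the KV cache in memory
    trust_remote_code=True,        # Qwen releases often ship custom model code
)

params = SamplingParams(temperature=0.7, max_tokens=512)
outputs = llm.generate(
    ["Summarize what a vision-language model adds over a text-only LLM."],
    params,
)
print(outputs[0].outputs[0].text)
```

For the vision side, vLLM exposes the same model through its OpenAI-compatible server, so image inputs can go through standard chat-completions `image_url` content parts once the architecture is supported.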