Nice job!! Looking forward to implementing the model myself. I really like the short TTFT of the Qwen3 Next Instruct 80B model compared with Nemotron Nano etc.; it fits my workflow better than waiting 30-45s in thinking mode.
Getting to 60-120 toks/sec from my baseline of 42 toks/sec using regular images is very impressive.