Why 200 tok/s is the new normal: TP=2 Does Scale After All

Qwen 3.5 122B (INT4, AutoRound) after 1 h of devops and code tree analysis: 10% of the experts were never used.
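For anyone wondering how a "never used" figure can be measured: the post does not say how it was done, so this is only a minimal sketch. It assumes the 10% refers to MoE experts (going by the linked thread title) and that you can log the top-k expert ids the router picks per token. `unused_experts` and the toy `trace` are made up for illustration and are not vLLM APIs.

```python
from collections import Counter
from typing import Iterable, Sequence


def unused_experts(routed: Iterable[Sequence[int]], num_experts: int) -> list[int]:
    """Return ids of experts that never appear in any token's top-k routing.

    `routed` is assumed to be a log of per-token expert ids (hypothetical format).
    """
    hits: Counter[int] = Counter()
    for topk in routed:
        hits.update(topk)  # count every expert chosen for this token
    return [e for e in range(num_experts) if hits[e] == 0]


# Toy trace: 4 tokens, top-2 routing over 10 hypothetical experts.
trace = [(0, 3), (3, 7), (1, 0), (7, 8)]
idle = unused_experts(trace, num_experts=10)
print(idle, f"{len(idle) / 10:.0%} never used")
```

In a real model the count would be kept per MoE layer, since an expert that is idle in one layer may be busy in another, and the result only holds for the workload that was traced.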

full story: [https://forums.developer.nvidia.com/t/why-you-should-rip-it-yourself-live-moe-expert-pruning-in-vllm/]