Don’t use cyankiwi/GLM-4.7-AWQ-4bit - it produces random garbage as output. So far, the NVFP4 model from my original post is the only one that works on dual Sparks (at least in vLLM).