GLM-4.7-Flash-NVFP4 was just released, but for Transformers 5.0 + vLLM 0.14...?

FYI. I released the images at New pre-built vLLM Docker Images for NVIDIA DGX Spark ; I don’t think GLM-4.7-Flash works on the latest image, but I was planning to release a new version tomorrow coinciding with pytorch 2.10 release. I don’t think vllm 0.14.0 release will work; I think a key commit came in right after the 0.14.0 release–but I might include it in the updated vllm 0.14.0 release image tomorrow given that GLM-4.7-Flash is pretty key…

3 Likes