vLLM >= 0.12 on DGX Spark?

Hi all! I'm running vLLM on a DGX Spark and need vLLM version 0.12.0 or later to get support for the new Ministral 3 model family (Ministral-3-14B-Instruct, etc.).

The official NGC container (nvcr.io/nvidia/vllm:25.11-py3) currently supports vLLM only up to version 0.11, which predates the required model support.

Given that the DGX Spark is a newer platform and often requires the NVIDIA-optimized container for correct memory and hardware handling (specifically the sm_121a architecture), are there plans to release an updated official NGC container with vLLM >= 0.12.0 soon?

If an official image is not immediately available, could you provide the recommended steps or a canonical guide for building vLLM >= 0.12.0 from source specifically for the DGX Spark? I'm not sure how to do this without losing the platform-specific optimizations you have set in the container.
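For reference, the rough approach I had in mind is a sketch along these lines, building vLLM from source inside the NGC container so its prebuilt PyTorch/CUDA stack is reused rather than replaced. This is only an assumption on my part, not a verified recipe: the base image tag, the vLLM version tag, and the `TORCH_CUDA_ARCH_LIST` value for sm_121a are all guesses I'd want confirmed.

```dockerfile
# Hypothetical sketch, NOT a tested build for DGX Spark.
# Assumptions: the 25.11 NGC image as base, vLLM's documented
# "use existing torch" source-build flow, and "12.1a" as the
# arch-list spelling for sm_121a.
FROM nvcr.io/nvidia/vllm:25.11-py3

WORKDIR /opt
RUN git clone --branch v0.12.0 https://github.com/vllm-project/vllm.git
WORKDIR /opt/vllm

# Reuse the container's NVIDIA-optimized PyTorch instead of the
# pinned upstream wheel, then build vLLM's kernels against it.
RUN python use_existing_torch.py \
 && pip install -r requirements/build.txt \
 && TORCH_CUDA_ARCH_LIST="12.1a" pip install --no-build-isolation -e .
```

If someone from NVIDIA can confirm whether this preserves the container's platform-specific handling, or point out what it breaks, that would already answer most of my question.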

Thank you for your assistance!

I am using the wonderful work from @eugr here: GitHub - eugr/spark-vllm-docker (Docker configuration for running vLLM on dual DGX Sparks).

@johnny_nv teased that there will be a new NVIDIA NGC release next week, which should also advance the state of the art: Run VLLM in Spark - #102 by johnny_nv

