I'd like to learn how to use the latest vLLM on DGX Spark

The nvcr.io/nvidia/vllm Docker image fails to load the Qwen3-Next-80B-A3B-Instruct-FP8 model on a DGX Spark machine. Installing vLLM from the GitHub repository results in an AssertionError: VLLM_USE_PRECOMPILED is only supported for CUDA builds.

(zh) qichen@spark-2b79:~/zh/vllm$ VLLM_USE_PRECOMPILED=1 uv pip install --editable .
Using Python 3.12.12 environment at: /home/qichen/miniconda3/envs/zh
  × Failed to build `vllm @ file:///home/qichen/zh/vllm`
  ├─▶ The build backend returned an error
  ╰─▶ Call to `setuptools.build_meta.build_editable` failed (exit status: 1)

      [stderr]
      /home/qichen/.cache/uv/builds-v0/.tmpYuxtb2/lib/python3.12/site-packages/torch/_subclasses/functional_tensor.py:279: UserWarning: Failed to initialize NumPy: No module named 'numpy' (Triggered internally at
      /pytorch/torch/csrc/utils/tensor_numpy.cpp:84.)
        cpu = _conversion_method_template(device=torch.device("cpu"))
      Traceback (most recent call last):
        File "<string>", line 14, in <module>
        File "/home/qichen/.cache/uv/builds-v0/.tmpYuxtb2/lib/python3.12/site-packages/setuptools/build_meta.py", line 473, in get_requires_for_build_editable
          return self.get_requires_for_build_wheel(config_settings)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/home/qichen/.cache/uv/builds-v0/.tmpYuxtb2/lib/python3.12/site-packages/setuptools/build_meta.py", line 331, in get_requires_for_build_wheel
          return self._get_build_requires(config_settings, requirements=[])
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/home/qichen/.cache/uv/builds-v0/.tmpYuxtb2/lib/python3.12/site-packages/setuptools/build_meta.py", line 301, in _get_build_requires
          self.run_setup()
        File "/home/qichen/.cache/uv/builds-v0/.tmpYuxtb2/lib/python3.12/site-packages/setuptools/build_meta.py", line 317, in run_setup
          exec(code, locals())
        File "<string>", line 661, in <module>
      AssertionError: VLLM_USE_PRECOMPILED is only supported for CUDA builds

      hint: This usually indicates a problem with the package or the build environment.
(zh) qichen@spark-2b79:~/zh/vllm$ python -c "import torch; print(f'CUDA available: {torch.cuda.is_available()}'); print(f'CUDA version: {torch.version.cuda}')"
CUDA available: True
CUDA version: 12.8

When I use uv pip install vllm, I encounter compilation errors. Is this caused by the Spark system? Can I install a stock version of Ubuntu on Spark?

      [stderr]
      /home/qichen/.cache/uv/builds-v0/.tmpJYCmuM/lib/python3.12/site-packages/torch/_subclasses/functional_tensor.py:276: UserWarning: Failed to initialize NumPy: No module named 'numpy' (Triggered internally at
      /pytorch/torch/csrc/utils/tensor_numpy.cpp:81.)
        cpu = _conversion_method_template(device=torch.device("cpu"))
      /home/qichen/.cache/uv/builds-v0/.tmpJYCmuM/lib/python3.12/site-packages/setuptools_scm/_integration/version_inference.py:51: UserWarning: version of None already set
        warnings.warn(self.message)
      listing git files failed - pretending there aren't any
      CMake Error at CMakeLists.txt:14 (project):
        Running

         '/home/qichen/.cache/uv/builds-v0/.tmpjVsjf2/bin/ninja' '--version'

        failed with:

         no such file or directory


      Traceback (most recent call last):
        File "<string>", line 11, in <module>
        File "/home/qichen/.cache/uv/builds-v0/.tmpJYCmuM/lib/python3.12/site-packages/setuptools/build_meta.py", line 432, in build_wheel
          return _build(['bdist_wheel'])
                 ^^^^^^^^^^^^^^^^^^^^^^^
        File "/home/qichen/.cache/uv/builds-v0/.tmpJYCmuM/lib/python3.12/site-packages/setuptools/build_meta.py", line 423, in _build
          return self._build_with_temp_dir(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/home/qichen/.cache/uv/builds-v0/.tmpJYCmuM/lib/python3.12/site-packages/setuptools/build_meta.py", line 404, in _build_with_temp_dir
          self.run_setup()
        File "/home/qichen/.cache/uv/builds-v0/.tmpJYCmuM/lib/python3.12/site-packages/setuptools/build_meta.py", line 317, in run_setup
          exec(code, locals())
        File "<string>", line 673, in <module>
        File "/home/qichen/.cache/uv/builds-v0/.tmpJYCmuM/lib/python3.12/site-packages/setuptools/__init__.py", line 117, in setup
          return distutils.core.setup(**attrs)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/home/qichen/.cache/uv/builds-v0/.tmpJYCmuM/lib/python3.12/site-packages/setuptools/_distutils/core.py", line 186, in setup
          return run_commands(dist)
                 ^^^^^^^^^^^^^^^^^^
        File "/home/qichen/.cache/uv/builds-v0/.tmpJYCmuM/lib/python3.12/site-packages/setuptools/_distutils/core.py", line 202, in run_commands
          dist.run_commands()
        File "/home/qichen/.cache/uv/builds-v0/.tmpJYCmuM/lib/python3.12/site-packages/setuptools/_distutils/dist.py", line 1002, in run_commands
          self.run_command(cmd)
        File "/home/qichen/.cache/uv/builds-v0/.tmpJYCmuM/lib/python3.12/site-packages/setuptools/dist.py", line 1104, in run_command
          super().run_command(command)
        File "/home/qichen/.cache/uv/builds-v0/.tmpJYCmuM/lib/python3.12/site-packages/setuptools/_distutils/dist.py", line 1021, in run_command
          cmd_obj.run()
        File "/home/qichen/.cache/uv/builds-v0/.tmpJYCmuM/lib/python3.12/site-packages/setuptools/command/bdist_wheel.py", line 370, in run
          self.run_command("build")
        File "/home/qichen/.cache/uv/builds-v0/.tmpJYCmuM/lib/python3.12/site-packages/setuptools/_distutils/cmd.py", line 357, in run_command
          self.distribution.run_command(command)
        File "/home/qichen/.cache/uv/builds-v0/.tmpJYCmuM/lib/python3.12/site-packages/setuptools/dist.py", line 1104, in run_command
          super().run_command(command)
        File "/home/qichen/.cache/uv/builds-v0/.tmpJYCmuM/lib/python3.12/site-packages/setuptools/_distutils/dist.py", line 1021, in run_command
          cmd_obj.run()
        File "/home/qichen/.cache/uv/builds-v0/.tmpJYCmuM/lib/python3.12/site-packages/setuptools/_distutils/command/build.py", line 135, in run
          self.run_command(cmd_name)
        File "/home/qichen/.cache/uv/builds-v0/.tmpJYCmuM/lib/python3.12/site-packages/setuptools/_distutils/cmd.py", line 357, in run_command
          self.distribution.run_command(command)
        File "/home/qichen/.cache/uv/builds-v0/.tmpJYCmuM/lib/python3.12/site-packages/setuptools/dist.py", line 1104, in run_command
          super().run_command(command)
        File "/home/qichen/.cache/uv/builds-v0/.tmpJYCmuM/lib/python3.12/site-packages/setuptools/_distutils/dist.py", line 1021, in run_command
          cmd_obj.run()
        File "<string>", line 270, in run
        File "/home/qichen/.cache/uv/builds-v0/.tmpJYCmuM/lib/python3.12/site-packages/setuptools/command/build_ext.py", line 99, in run
          _build_ext.run(self)
        File "/home/qichen/.cache/uv/builds-v0/.tmpJYCmuM/lib/python3.12/site-packages/setuptools/_distutils/command/build_ext.py", line 368, in run
          self.build_extensions()
        File "<string>", line 232, in build_extensions
        File "<string>", line 210, in configure
        File "/home/qichen/miniconda3/envs/zh/lib/python3.12/subprocess.py", line 413, in check_call
          raise CalledProcessError(retcode, cmd)
      subprocess.CalledProcessError: Command '['cmake', '/home/qichen/.cache/uv/sdists-v9/index/e367fd55faf540ee/vllm/0.10.1.1/yK_7JcUAK-RK1qQ5Mssg4/src', '-G',
      'Ninja', '-DCMAKE_BUILD_TYPE=RelWithDebInfo', '-DVLLM_TARGET_DEVICE=cuda', '-DVLLM_PYTHON_EXECUTABLE=/home/qichen/.cache/uv/builds-v0/.tmpJYCmuM/bin/python',
      '-DVLLM_PYTHON_PATH=/home/qichen/miniconda3/envs/zh/lib/python312.zip:/home/qichen/miniconda3/envs/zh/lib/python3.12:/home/qichen/miniconda3/envs/zh/lib/python3.12/lib-dynload:/home/qichen/.cache/uv/builds-v0/.tmpJYCmuM/lib/python3.12/site-packages:/home/qichen/.cache/uv/builds-v0/.tmpJYCmuM/lib/python3.12/site-packages/setuptools/_vendor',
      '-DFETCHCONTENT_BASE_DIR=/home/qichen/.cache/uv/sdists-v9/index/e367fd55faf540ee/vllm/0.10.1.1/yK_7JcUAK-RK1qQ5Mssg4/src/.deps', '-DNVCC_THREADS=1', '-DCMAKE_JOB_POOL_COMPILE:STRING=compile',
      '-DCMAKE_JOB_POOLS:STRING=compile=20', '-DCMAKE_CUDA_COMPILER=/usr/local/cuda/bin/nvcc']' returned non-zero exit status 1.

      hint: This usually indicates a problem with the package or the build environment.
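The CMake failure in the log above (`'ninja' '--version'` … `no such file or directory`) usually just means one of the build tools isn't visible to the build environment. A minimal sketch to check what's missing before retrying — the tool list is my assumption based on vLLM's CMake/CUDA build, not an exhaustive requirement list:

```python
import shutil

# Tools a CMake + Ninja + nvcc build typically invokes; a missing
# "ninja" produces exactly the "no such file or directory" error above.
required = ["cmake", "ninja", "gcc", "g++", "nvcc"]
missing = [tool for tool in required if shutil.which(tool) is None]
print("missing build tools:", missing or "none")
```

If `ninja` shows up as missing, installing it into the build environment (e.g. `uv pip install ninja cmake`) before rebuilding may help; installing `numpy` there as well should silence the warning at the top of the log.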

Please check Run VLLM in Spark - #21 by johnny_nv


As far as I can tell, the official vLLM does not support the DGX Spark. I've opened an issue requesting that support be added.

In the meantime, NVIDIA has released its own vLLM container, which runs vLLM v0.10.2.

Hi, please check out our playbook on how to run vLLM on Spark

That container is outdated and can't run the Qwen3-Next model the OP wants to use. It's better to follow the other thread on compiling vLLM from source.


By the way, I was able to run nvidia/Nemotron-Nano-12B-v2-VL-BF16 on the Spark using vLLM.


I have already tried that tutorial and that container, but they cannot run the new model architecture. Both that container and the sglang container (lmsysorg/sglang:spark) run into the same issue: the Triton kernels do not support models with fp8e4nv precision.

Hello, I'd like to ask: are you using the vLLM Docker image or building from source?

Building from source.

Did you only have to build vLLM from source, or PyTorch with CUDA as well? It seems like aarch64-compatible PyTorch isn't available.
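For what it's worth, a quick way to confirm which platform the installer is resolving wheels for (on DGX Spark this should report an aarch64 target) — a generic sketch, not specific to vLLM or PyTorch:

```python
import platform
import sysconfig

# The machine architecture and wheel platform tag determine which
# prebuilt PyTorch/vLLM wheels, if any, pip or uv can install.
print("machine:", platform.machine())          # "aarch64" on DGX Spark
print("platform tag:", sysconfig.get_platform())
```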