Cosmos-Reason1-7B, Blackwell PRO 6000, Invalid Argument Error on vLLM 0.10.2

Hardware Platform (GPU model and numbers)

  • GPU Model: NVIDIA Blackwell Pro 6000
  • GPU Count: 1

System Memory

  • RAM: 256 GB

Ubuntu Version

  • Version: 24.04

NVIDIA GPU Driver Version (valid for GPU only)

  • 580.65.06

Issue Type

  • Type: Bug

Issue Description

When attempting to launch a vLLM (version 0.10.2) container using a Hugging Face model, the container fails to initialize the engine core.
This issue occurs specifically with the model nvidia/Cosmos-Reason1-7B.

The same setup works on an ADA6000 GPU but fails on the Blackwell Pro 6000 GPU, suggesting a potential compatibility issue between this model and the newer GPU architecture.

The same behavior and error have also been reported in GitHub Issue #65.

Below is the exact error message observed in the container logs:

(EngineCore_0 pid=6357)
CUDA error (/__w/xformers/xformers/third_party/flash-attention/hopper/flash_fwd_launch_template.h:188): invalid argument
Traceback (most recent call last):
...
RuntimeError: Engine core initialization failed. See root cause above. Failed core proc(s): {}
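
To pull the root-cause line out of a long container log, a small filter can help. This is only a minimal sketch: the function name first_cuda_error is hypothetical (it is not part of vLLM or Docker), and it assumes the root cause is logged on a line containing the text "CUDA error", as in the excerpt above.

```shell
#!/bin/sh
# first_cuda_error: read a log on stdin and print the first line that
# mentions a CUDA error. Hypothetical helper name, used for illustration.
# Exit status is non-zero when no such line is found.
first_cuda_error() {
    grep -m 1 'CUDA error'
}

# Usage (assumes the exited container has not been removed yet):
#   docker logs my_vllm_container 2>&1 | first_cuda_error
```

Redirecting stderr with 2>&1 matters here because the engine may write the failure to either stream.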

How To Reproduce

1. Run the following command:

# Deploy with docker on Linux:
docker run --runtime nvidia --gpus all \
    --name my_vllm_container \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HUGGING_FACE_HUB_TOKEN=<secret>" \
    -p 8000:8000 \
    --ipc=host \
    vllm/vllm-openai:latest \
    --model nvidia/Cosmos-Reason1-7B

2. Observe that the container exits with the CUDA "invalid argument" error shown above.
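
The report names vLLM 0.10.2, while the command above pulls the latest tag, which moves over time. For anyone trying to reproduce this later, pinning the image tag may help. The following is a sketch under assumptions: VLLM_TAG, vllm_image, and run_vllm are hypothetical names introduced here, and the tag v0.10.2 is assumed to be how that release is published on Docker Hub.

```shell
#!/bin/sh
# Image tag to test. v0.10.2 is an assumed tag name; override as needed.
VLLM_TAG="${VLLM_TAG:-v0.10.2}"

# Print the full image reference for the selected tag.
vllm_image() {
    printf 'vllm/vllm-openai:%s\n' "$VLLM_TAG"
}

# Same options as the original command, with the image tag pinned.
# "--env HUGGING_FACE_HUB_TOKEN" without a value passes the variable
# through from the host environment, so the token stays out of the script.
run_vllm() {
    docker run --runtime nvidia --gpus all \
        --name my_vllm_container \
        -v "$HOME/.cache/huggingface:/root/.cache/huggingface" \
        --env HUGGING_FACE_HUB_TOKEN \
        -p 8000:8000 \
        --ipc=host \
        "$(vllm_image)" \
        --model nvidia/Cosmos-Reason1-7B
}

# Nothing is started until run_vllm is called explicitly.
```

Calling run_vllm with different VLLM_TAG values on the same machine is one way to check whether the failure is tied to a particular release.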

Expected Behavior

The vLLM container should successfully load the Cosmos-Reason1-7B model and start serving inference normally.

Are you using VSS-2.4.0?

If yes, try adding export COSMOS_REASON1_USE_TRT=false to .env and then redeploy.
VSS uses vllm==0.9.0rc1+1958ee56.nv25.6.cu129; platform_machine == "x86_64" to deploy ngc:nim/nvidia/cosmos-reason1-7b:1.1-fp8-dynamic.
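
The .env change suggested above can be applied idempotently, so that repeated runs do not leave duplicate or conflicting lines. This is a sketch only: set_env_var is a hypothetical helper, and it assumes the .env file uses plain NAME=VALUE or export NAME=VALUE lines.

```shell
#!/bin/sh
# set_env_var FILE NAME VALUE
# Hypothetical helper: make FILE contain exactly one "export NAME=VALUE"
# line, replacing any earlier assignment of NAME.
set_env_var() {
    file=$1
    name=$2
    value=$3
    touch "$file"
    # Copy every line except existing assignments of NAME (with or without
    # "export"). grep exits non-zero when nothing is copied, hence "|| true".
    grep -v -E "^(export[[:space:]]+)?${name}=" "$file" > "$file.tmp" || true
    printf 'export %s=%s\n' "$name" "$value" >> "$file.tmp"
    mv "$file.tmp" "$file"
}

# Usage, from the directory that holds the VSS .env file:
#   set_env_var .env COSMOS_REASON1_USE_TRT false
```

The new line is appended at the end of the file, and other settings are left in their original order.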

You can check the version of vLLM in nvcr.io/nvidia/blueprint/vss-engine:2.4.0 image

If not, this doesn’t seem to be a VSS issue. Please discuss it on the Cosmos GitHub.

There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks.