vLLM 0.12.x Container for Jetson Thor

The currently available NVIDIA vLLM container for Jetson AGX Thor is based on vLLM 0.11.x, which does not support speculative decoding for vision models, in particular EAGLE-3 with the Qwen family.

However, vLLM 0.12.x introduces support for speculative decoding for vision models, which is required for our use case.

Could you please clarify:

  1. Is there an official NVIDIA container release planned for Jetson AGX Thor that includes vLLM 0.12.x?

  2. If so, what is the expected timeline for this release?

  3. In the meantime, what is the recommended and supported approach to upgrade from vLLM 0.11.x to vLLM 0.12.x on Jetson AGX Thor?

Hi,

The vLLM container is released monthly.
Please wait for a newer release.

Thanks.

nvcr.io/nvidia/vllm:25.12.post1-py3

pip show vllm
Name: vllm
Version: 0.12.0+35a9f223.nv25.12.post1.cu131
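If you want to confirm this from a script rather than by reading the `pip show` output, here is a minimal sketch. The helper names (`release_tuple`, `meets_minimum`, `installed_vllm_version`) are made up for illustration and are not part of vLLM. It assumes the version string has the shape shown above, with an NVIDIA local suffix after the `+`, and it only compares the leading numeric part.

```python
import re
from importlib import metadata


def release_tuple(version: str) -> tuple:
    """Return the leading numeric release, e.g. '0.12.0+abc.nv25.12' -> (0, 12, 0)."""
    public = version.split("+", 1)[0]  # drop the local suffix after '+'
    match = re.match(r"\d+(?:\.\d+)*", public)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(version: str, minimum: tuple = (0, 12)) -> bool:
    """True if the numeric release is at least `minimum`.

    Pre-release tags such as 'rc1' are ignored, so '0.12.0rc1' counts as 0.12.0.
    """
    return release_tuple(version) >= minimum


def installed_vllm_version():
    """Return the installed vLLM version string, or None if vLLM is not installed."""
    try:
        return metadata.version("vllm")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_vllm_version()
    if found is None:
        print("vllm is not installed in this environment")
    elif meets_minimum(found):
        print(f"vllm {found} is 0.12 or newer")
    else:
        print(f"vllm {found} is older than 0.12")
```

Run it inside the container. It only checks the version number, so it does not tell you whether speculative decoding actually works with your model.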

@whitesscott
Thanks for pointing this out.

It looks like vLLM 0.12 is available now.
