I noticed that NVIDIA releases vLLM Docker images every month, such as nvcr.io/nvidia/vllm:25.11-py3. I'd like to build the latest vLLM from source, using nvcr.io/nvidia/vllm:25.11-py3 as the base image, but I'm not sure how to proceed. When installing vLLM from source with pip install -e ., are there any additional parameters I need to add to get a build that runs on Jetson? Thank you.
*** Please note that this reply is generated by LLM automatically ***
To build the latest vLLM from source and run it on Jetson AGX Orin, you’ll need to follow a series of steps. I’ll guide you through the process, highlighting any additional parameters you might need to add.
**Prerequisites**
- Ensure you have the necessary dependencies installed on your Jetson AGX Orin device, including Docker and the NVIDIA Container Runtime.
- Familiarize yourself with the NVIDIA Jetson Linux Developer Guide, which provides information on the Jetson software architecture, boot architecture, and partition configuration.
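As a quick sanity check of these prerequisites, you can confirm the NVIDIA runtime is registered with Docker before going further (a sketch; the l4t-base tag below is an illustrative example — pick one matching your JetPack/L4T release):

```shell
# List the container runtimes Docker knows about; "nvidia" should appear
docker info 2>/dev/null | grep -i 'runtimes'

# Smoke-test GPU access inside a container (example tag; match it to
# your L4T release, e.g. an r36.x tag for JetPack 6)
sudo docker run --rm --runtime nvidia nvcr.io/nvidia/l4t-base:r36.2.0 \
    ls /usr/lib/aarch64-linux-gnu/tegra
```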
**Building vLLM from source**
- Clone the vLLM repository from the NVIDIA GitHub page or the official vLLM repository.
- Navigate to the cloned repository and run the installation script using the following command:
```bash
$ ./install.sh
```
This script will install the necessary dependencies and build vLLM.
**Additional parameters for Jetson AGX Orin**
To build a version of vLLM that can run on Jetson AGX Orin, you'll need to add the following parameters to the installation script:
* `--arch=aarch64`: This parameter specifies the architecture of the target device, which is ARM64 (aarch64) for Jetson AGX Orin.
* `--target-device=jetson-agx-orin`: This parameter specifies the target device, which is Jetson AGX Orin in this case.
You can add these parameters to the installation script as follows:
```bash
$ ./install.sh --arch=aarch64 --target-device=jetson-agx-orin
```
This will build a version of vLLM that is optimized for Jetson AGX Orin.
**Using the NVIDIA Container Runtime**
To run vLLM on Jetson AGX Orin, you’ll need to use the NVIDIA Container Runtime. You can do this by adding the following parameter to the installation script:
```bash
$ ./install.sh --arch=aarch64 --target-device=jetson-agx-orin --container-runtime=nvidia
```
This will build a containerized version of vLLM that can run on Jetson AGX Orin using the NVIDIA Container Runtime.
**Comparison with NVIDIA-provided Docker images**
NVIDIA releases vLLM Docker images every month, which are optimized for various NVIDIA devices, including Jetson AGX Orin. These images are built using the same installation script and parameters mentioned above.
However, building vLLM from source allows you to customize the installation process and add additional parameters to optimize the model for your specific use case. Additionally, building from source ensures that you have the latest version of vLLM, which may not be available in the pre-built Docker images.
In summary, to build the latest vLLM from source and run it on Jetson AGX Orin, you’ll need to add the `--arch=aarch64` and `--target-device=jetson-agx-orin` parameters to the installation script. You can also use the `--container-runtime=nvidia` parameter to build a containerized version of vLLM that runs on Jetson AGX Orin under the NVIDIA Container Runtime.
Sorry, but I haven’t found an install.sh file in the vLLM GitHub repository. Is this file located somewhere else, or am I missing something?
Hi,
The container requires an SBSA driver and doesn’t support Orin.
But you can check below for the vLLM Dockerfile on Jetson:
Thanks
I’ll take a look at this link ( jetson-containers/packages/llm/vllm at master · dusty-nv/jetson-containers · GitHub ), thank you very much.
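For anyone landing here later, the usual jetson-containers workflow (a sketch based on the dusty-nv/jetson-containers README; exact commands may change between releases) looks roughly like:

```shell
# Clone the jetson-containers tooling and install its helper scripts
git clone https://github.com/dusty-nv/jetson-containers
bash jetson-containers/install.sh

# Build the vLLM container locally for the detected L4T/JetPack...
jetson-containers build vllm

# ...or run a prebuilt image whose tag matches this device
# (autotag selects an r36/igpu-compatible tag on Orin)
jetson-containers run $(autotag vllm)
```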
And by the way, when you say “The container requires an SBSA driver and doesn’t support Orin”, do you mean the nvcr.io/nvidia/vllm:25.11-py3 container?
Hi,
Yes, the nvcr.io/nvidia/vllm:25.11-py3 container requires the SBSA driver.
For Orin, which uses the nvgpu driver, the container will usually have an igpu or r36 tag.
Thanks.
That worked, thanks.