Launch vLLM on Jetson AGX Xavier

Hi!

I need to launch the vLLM framework on Jetson AGX Xavier.
My environment:

Linux for Tegra version: R35.6;
CUDA version: 11.4;

I read in the thread "Suitable VLLM Container for Jetson Xavier NX with JetPack 5.1.4" that there are no prebuilt containers for R35 and that building the container manually is recommended.

However, when I run:

jetson-containers build vllm:0.15.1

I get the following error:

jetson-containers build failed after 3.1 seconds (0.1 minutes)
[20:41:04] Error: "couldn't find package:  triton" 

The 0.15.1 tag is the only one listed by jetson-containers build --list-packages.

The native Python version for R35.6 is Python 3.8, but triton is only available starting from Python 3.10. I updated the python3 symlink to point to Python 3.11 and installed triton, but it had no effect: I still get the same error, and the Python version reported by the build remains unchanged:

 > jetson-containers build vllm:0.15.1:
....
┌───────────────────────┬────────────────────────┐
│ L4T_VERSION   35.6.4  │ JETPACK_VERSION  6.2   │
│ CUDA_VERSION  11.4    │ PYTHON_VERSION   3.8   │
│ SYSTEM_ARCH   aarch64 │ LSB_RELEASE      20.04 │
└───────────────────────┴────────────────────────┘ 
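
For reference, the banner above still reports Python 3.8 even though the host symlink points to 3.11, so the build does not appear to use the host's python3. A minimal sketch for checking what a given interpreter actually provides is shown below; it can be run on the host and inside a container to compare the two. The helper name check_env and the 3.10 threshold are assumptions for illustration (the threshold is taken from the statement above), and none of this is part of jetson-containers.

```python
import importlib.util
import sys

# Assumption from the post above: triton needs Python 3.10 or newer.
MIN_TRITON_PYTHON = (3, 10)


def check_env(version_info=sys.version_info, module="triton"):
    """Report the interpreter version and whether `module` can be found.

    Hypothetical helper for illustration only.
    """
    major_minor = tuple(version_info[:2])
    return {
        "python": "%d.%d" % major_minor,
        # True if this interpreter is new enough for triton
        "python_ok": major_minor >= MIN_TRITON_PYTHON,
        # True if the module is importable from this interpreter
        "module_found": importlib.util.find_spec(module) is not None,
    }


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print(check_env())
```

If the result differs between the host and the container, the host-side changes (symlink, pip install) are not reaching the build environment.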

Questions

  1. Is there a way to fix the couldn't find package: triton error when building the vLLM container on Jetson AGX Xavier?
  2. More generally, is there any way to run the vLLM framework on this device, or are their architectures incompatible, as in the case of local_llm?

Thanks!

Hi,

vLLM 0.15.1 is a recent release, so it may not run on JetPack 5.
But we have some containers that support r35; you can check whether they meet your requirements:

Thanks.

Thank you for the feedback. I’ll take a look at llama.cpp.