vLLM on Jetson AGX Orin

Hi,

I’m trying to install vLLM on my Jetson AGX Orin Developer Kit.

I’m using the following image: nvcr.io/nvidia/l4t-pytorch:r35.2.1-pth2.0-py3
and I get this error when I run pip install vllm:

root@jetson:/workspace# pip install vllm
Collecting vllm
  Downloading vllm-0.5.0.post1.tar.gz (743 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 743.2/743.2 kB 12.6 MB/s eta 0:00:00
  Installing build dependencies ... done
  Getting requirements to build wheel ... error
  error: subprocess-exited-with-error

  × Getting requirements to build wheel did not run successfully.
  │ exit code: 1
  ╰─> [16 lines of output]
      Traceback (most recent call last):
        File "/usr/local/lib/python3.8/dist-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
          main()
        File "/usr/local/lib/python3.8/dist-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
          json_out['return_val'] = hook(**hook_input['kwargs'])
        File "/usr/local/lib/python3.8/dist-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 118, in get_requires_for_build_wheel
          return hook(config_settings)
        File "/tmp/pip-build-env-d1bct981/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 325, in get_requires_for_build_wheel
          return self._get_build_requires(config_settings, requirements=['wheel'])
        File "/tmp/pip-build-env-d1bct981/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 295, in _get_build_requires
          self.run_setup()
        File "/tmp/pip-build-env-d1bct981/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 311, in run_setup
          exec(code, locals())
        File "<string>", line 415, in <module>
        File "<string>", line 341, in get_vllm_version
      RuntimeError: Unknown runtime environment
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

Note the error message Unknown runtime environment.
I figured out that this is thrown here: vllm/setup.py at main · vllm-project/vllm · GitHub, due to torch.version.cuda being None.
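
For reference, the version-detection logic there looks roughly like this (a paraphrase of the idea, not the exact source):

import torch

def get_vllm_version():
    # torch.version.cuda is a string like "11.4" for a CUDA-enabled torch build
    # and None for a CPU-only wheel; torch.version.hip covers ROCm builds
    if torch.version.cuda is not None:
        return "0.5.0.post1"
    if getattr(torch.version, "hip", None) is not None:
        return "0.5.0.post1+rocm"
    raise RuntimeError("Unknown runtime environment")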

However, when I open a python3 prompt and check CUDA availability,

root@jetson:/workspace# python3
Python 3.8.10 (default, Nov 14 2022, 12:59:47)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> print(torch.version.cuda)
11.4
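
For completeness, here is a fuller runtime check I can run inside the container (just listing the commands; output omitted):

import torch

# sanity checks beyond torch.version.cuda
print(torch.__version__)                    # the torch build shipped in the l4t-pytorch image
print(torch.cuda.is_available())            # whether the CUDA runtime actually initializes
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))    # should report the Orin's integrated GPU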

Any help would be appreciated. Thanks

Hi,

Is the error triggered by the line below?

If yes, please also check if the CUDA version meets the requirement.

Thanks.

That’s correct. I’ve already noted that in the original post.

I also tried building vLLM from source (pip install -e .) and inserted a print statement of torch.version.cuda at vllm/setup.py at main · vllm-project/vllm · GitHub. It prints None.

Whereas it prints 11.4 within a python3 prompt (as shown in the original post).

So, to be clear, the problem is not that the CUDA version doesn’t meet the requirement, but that torch does not correctly report the CUDA version during the installation.
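
One thing I still want to rule out (just a guess, based on the /tmp/pip-build-env-… path in the traceback) is that pip’s build isolation imports a different torch than the one shipped in the container. Something like this near the top of setup.py should show which copy is used during the build:

import torch

# if this path points into /tmp/pip-build-env-.../, the build is importing torch from pip's
# isolated build environment rather than the container's CUDA-enabled install (unverified guess)
print("torch imported from:", torch.__file__)
print("torch.version.cuda :", torch.version.cuda)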

@pcha were you able to find a resolution for this ?

My guess is that vLLM’s requirements.txt or pyproject.toml uninstalls the built-in version of PyTorch (the one built with CUDA enabled) in favor of a different version of PyTorch from PyPI (one that wasn’t built with CUDA). Or perhaps it explicitly needs to be run with pip3 instead of pip. Regardless, it is for reasons like this that I use jetson-containers to make sure that the right versions get installed, stay installed, and are continuously tested for CUDA functionality and performance.
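
A quick way to check whether the container’s PyTorch survived the install attempt is to compare the version info before and after (a rough sketch; the exact tag on the JetPack wheel is an assumption on my part):

import torch

# the l4t/JetPack torch wheels typically carry an NVIDIA-specific version suffix (assumed here);
# a plain upstream version string together with torch.version.cuda == None would suggest that
# pip replaced it with a CPU-only wheel from PyPI
print(torch.__version__)
print(torch.version.cuda)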

I had previously tried to get vLLM building on JetPack to no avail - and IMO it is more geared towards batching and server/cloud. The inferencing APIs/containers we have managed to get working through jetson-containers, like MLC, have near-optimal performance on Jetson - Benchmarks - NVIDIA Jetson AI Lab

That said, if you do manage to get it working, I would be happy to add it to the jetson-containers build system and redistribute the images for everyone to use.

@ramitpahwa No, still working on it.

@dusty_nv
Using pip3 didn’t help.
I don’t think vLLM is uninstalling the existing PyTorch, as I’m not seeing any log messages related to that, and the PyTorch version remains the same.

Although it’d be great if I could get vLLM working, I’m also interested in using the MLC container you mentioned. I found this repo. However, the image there seems to have a pretty old version of MLC, and I can’t follow the instructions from MLC, nor can I find instructions for that version.

  1. Is there a usage reference that I can follow for that specific version?
  2. I’m using the image dustynv/mlc:51fb0f4-builder-r35.4.1. Do the r36 images have an up-to-date version of MLC? If so, could you also provide a guide for upgrading my system from r35 to r36?

@pcha the dustynv/mlc:0.1.1-r36.3.0 container has a more recent version of MLC, from after they transitioned to the mlc_llm convert_weight builder from the mlc_llm.build way - note that in my test script for the MLC container, I still support both methods.

BTW those 0.1 and 0.1.1 versions were numbers I made up, because the MLC project is basically unversioned and I needed a better way of tracking it than GitHub SHAs. You can also use jetson-containers to build more recent MLC; however, I apply patches to each build (which you can find under the patches directory). These patches are mostly to enable Orin’s sm87 in the dependencies for all the CUDA kernels that get compiled.

At some point months back it stopped building on older Python and JetPack 5, so going forward I only build the newer MLC versions for JetPack 6. You could attempt to apt-upgrade nvidia-jetpack on your device, but I would just re-flash it fresh with SDK Manager and get a clean start with it (after backing up your work).

Thanks a lot.

Although I’d love to upgrade to JetPack 6, it seems impossible to do with my M1 Mac.
