How to Run NVILA-8B Model with NanoLLM on Jetson AGX Orin?

Hello

I’m trying to use the NVILA-8B model from the Efficient-Large-Model repository (Efficient-Large-Model/NVILA-8B · Hugging Face) on a Jetson AGX Orin with NanoLLM. However, when I run the following command, the model fails to start:

jetson-containers run $(autotag nano_llm) \
  python3 -m nano_llm.chat --api=mlc \
    --model Efficient-Large-Model/NVILA-8b

I suspect there isn’t a Docker image that currently includes support for NVILA-8B out of the box.
Does anyone know if there is a prebuilt Docker image that can run NVILA-8B on Jetson AGX Orin, or how to build/configure one so that NVILA-8B can be used with the MLC back end? Any help or instructions would be greatly appreciated.
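For reference, this is roughly what I would try for building the nano_llm container from source, following the standard jetson-containers workflow (I don’t know whether the resulting image actually pulls in NVILA-8B support, which is exactly what I’m unsure about):

# Sketch of building the nano_llm container from source with jetson-containers
# (standard jetson-containers setup; NVILA-8B support in the result is unverified)
git clone https://github.com/dusty-nv/jetson-containers
bash jetson-containers/install.sh
jetson-containers build nano_llm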

Thank you!

Hi,

Could you share the error message with us?
There is a known docker issue due to the recent docker 28.0.0 release.
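If it turns out to be that issue, one commonly suggested workaround is to pin Docker back to a 27.x release until a fix is available (the exact version string depends on your Ubuntu release, so please check what apt offers first):

# Check which docker-ce versions are available for your distro
apt-cache madison docker-ce
# Downgrade to a 27.x build (the version string below is only a placeholder)
sudo apt-get install --allow-downgrades docker-ce=<27.x-version> docker-ce-cli=<27.x-version>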

You can find more info in the comment below:

Thanks.

Thank you for your response. Here is an overview of the error.

Error Overview

Inside the jetson-containers environment, I ran the following command:


python3 -m nano_llm.chat --api=mlc --model Efficient-Large-Model/NVILA-8b

Steps and Issues Encountered:

1. An error occurred stating that mlc_llm.build does not support the qwen2 model.

2. I resolved this by upgrading mlc_llm from version 0.1.0 to 0.19.0:

pip install mlc-llm --upgrade

3. After that, I encountered an incompatibility error between mlc_llm, awq, and tvm, so I upgraded those as well:


pip install awq --upgrade

pip install tvm --upgrade

4. Then I ran the same command again:


python3 -m nano_llm.chat --api=mlc --model Efficient-Large-Model/NVILA-8b

5. However, this time I received an error stating that the mlc_llm.build command was not found.

6. Upon checking mlc_llm version 0.19.0, I noticed that build.py is no longer present, and I am unsure how to build NVILA-8b with the new version.
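From what I can tell, newer mlc_llm releases replace mlc_llm.build with separate convert_weight / gen_config / compile subcommands. Something like the sketch below is what I have been trying to piece together, but the paths, quantization mode, and conv template are guesses on my part, and I don’t know whether the new CLI recognizes NVILA’s qwen2-based config at all:

# Rough sketch of the newer mlc_llm workflow (replaces the old mlc_llm.build);
# all paths, the quantization mode, and the conv template are my assumptions
mlc_llm convert_weight /data/models/NVILA-8B --quantization q4f16_1 -o /data/models/mlc/NVILA-8B
mlc_llm gen_config /data/models/NVILA-8B --quantization q4f16_1 --conv-template chatml -o /data/models/mlc/NVILA-8B
mlc_llm compile /data/models/mlc/NVILA-8B/mlc-chat-config.json --device cuda -o /data/models/mlc/NVILA-8B/NVILA-8B-cuda.so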

Request:

Could you provide guidance on how to build NVILA-8b with the new version?

Alternatively, is there an updated Docker image available for it?

Thanks.

Did you find a solution for this? Or did you manage to run NVILA in another way?

No, I don’t have any solutions yet.

Hi,

We tested NVILA with nano_llm and hit a model-support error, as shown below:

# python3 -m nano_llm.chat --model Efficient-Large-Model/NVILA-8B --api=mlc 
...
07:55:59 | INFO | running MLC quantization:

python3 -m mlc_llm.build --model /data/models/mlc/dist/models/NVILA-8B --quantization q4f16_ft --target cuda --use-cuda-graph --use-flash-attn-mqa --sep-embed --max-seq-len 32768 --artifact-path /data/models/mlc/dist/NVILA-8B/ctx32768 --use-safetensors 


Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.10/dist-packages/mlc_llm/build.py", line 47, in <module>
    main()
  File "/usr/local/lib/python3.10/dist-packages/mlc_llm/build.py", line 41, in main
    parsed_args = core._parse_args(parsed_args)  # pylint: disable=protected-access
  File "/usr/local/lib/python3.10/dist-packages/mlc_llm/core.py", line 444, in _parse_args
    parsed = _setup_model_path(parsed)
  File "/usr/local/lib/python3.10/dist-packages/mlc_llm/core.py", line 494, in _setup_model_path
    validate_config(args.model_path)
  File "/usr/local/lib/python3.10/dist-packages/mlc_llm/core.py", line 538, in validate_config
    config["model_type"] in utils.supported_model_types
AssertionError: Model type qwen2 not supported.

We will test this with the latest mlc_llm release and provide more info to you later.
Thanks.


Hi, all

Thanks for your patience.
For NVILA, please try the server.py included in this container image: dustynv/vila:r36.4.0-cu128-24.04
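A rough way to get started with that image (the location and arguments of server.py inside the container are not shown here, so please check the scripts in the image for the exact usage):

# Pull the image mentioned above and open a shell in it;
# from there, locate server.py and check its --help for the expected arguments
docker pull dustynv/vila:r36.4.0-cu128-24.04
jetson-containers run dustynv/vila:r36.4.0-cu128-24.04 /bin/bash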

Thanks.
