Nano_LLM or nanollm for Python package?

The NanoLLM documentation at https://www.jetson-ai-lab.com/tutorial_nano-llm.html references nano_llm as a Python package. I couldn’t install it via PyCharm or with pip. There is an installable Python package named nanollm, but it does not provide NanoLLM as an entry point. Is there an update to the documentation?
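For reference, the installs I attempted were along these lines (package names exactly as I tried them):

$ pip3 install nano_llm    # the name the tutorial uses; pip could not install it
$ pip3 install nanollm     # installs, but does not expose a NanoLLM entry point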

Also, I fully understand the resource limitations involved in keeping documentation current, especially given the ever-changing models on Hugging Face. Still, there are some challenges (perhaps owing to my ignorance) in getting access to the documented models, mainly the later versions of Gemma and Llama. My Hugging Face token works fine, and Meta and Google have routinely granted me permission to their respective models, so I can only assume that the documentation at Jetson AI Lab needs corresponding updates.

I’ve cloned jetson-containers and try my best to keep it current, so, setting aside the occasional typo on my end, I feel my experience may give the technical writers of these tutorials some pointers on how to clarify comprehension issues for beginners. Thanks.

Regards.

Hi,

Have you tried the dustynv/mlc:r36.4.0 container?
We have verified that Gemma 2 and Llama 3.2/3.1 can run on Orin Nano with MLC.

You can find the detailed commands on the model page below:

Thanks.

Based on the information at the link you provided, I ran the following command:

$ docker run -it --rm \
  --name llm_server \
  --gpus all \
  -p 9000:9000 \
  -e DOCKER_PULL=always --pull always \
  -e HF_HUB_CACHE=/root/.cache/huggingface \
  -v /mnt/nvme/cache:/root/.cache \
  dustynv/mlc:r36.4.0 \
    sudonim serve \
      --model dusty-nv/Llama-3.2-1B-Instruct-q4f16_ft-MLC \
      --quantization q4f16_ft \
      --max-batch-size 1 \
      --host 0.0.0.0 \
      --port 9000
r36.4.0: Pulling from dustynv/mlc
Digest: sha256:bee28a47c30e022fac458299844c2dd1cb2de26b768d7c104881e1bb1b7082d2
Status: Image is up to date for dustynv/mlc:r36.4.0
docker: Error response from daemon: failed to set up container networking: driver failed programming external connectivity on endpoint llm_server (cf685c1a813b376c2adb550bbdd42e957e4e16abf9fd715fe16a69e289145bdb): Unable to enable DIRECT ACCESS FILTERING - DROP rule:  (iptables failed: iptables --wait -t raw -A PREROUTING -p tcp -d 172.17.0.2 --dport 9000 ! -i docker0 -j DROP: iptables v1.8.7 (legacy): can't initialize iptables table `raw': Table does not exist (do you need to insmod?)
Perhaps iptables or your kernel needs to be upgraded.
 (exit status 3))

Run 'docker run --help' for more information
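Two possible workarounds I’m aware of for the missing `raw' iptables table (sketches only; I haven’t verified either here). The first assumes the kernel ships the raw-table module and it simply isn’t loaded; the second sidesteps the port-publishing rule entirely by using host networking, in which case -p is dropped and the server is reached directly on the host’s own IP:

$ sudo modprobe iptable_raw    # fails if the Jetson kernel was built without it

$ docker run -it --rm \
  --name llm_server \
  --gpus all \
  --network=host \
  -e DOCKER_PULL=always --pull always \
  -e HF_HUB_CACHE=/root/.cache/huggingface \
  -v /mnt/nvme/cache:/root/.cache \
  dustynv/mlc:r36.4.0 \
    sudonim serve \
      --model dusty-nv/Llama-3.2-1B-Instruct-q4f16_ft-MLC \
      --quantization q4f16_ft \
      --max-batch-size 1 \
      --host 0.0.0.0 \
      --port 9000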
$ ifconfig -a

docker0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        inet 172.17.0.1  netmask 255.255.0.0  broadcast 172.17.255.255
        inet6 fe80::bc2d:fbff:fe05:fae8  prefixlen 64  scopeid 0x20<link>
        ether be:2d:fb:05:fa:e8  txqueuelen 0  (Ethernet)
        RX packets 2  bytes 56 (56.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 5  bytes 566 (566.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

enP8p1s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.16.0.32  netmask 255.255.254.0  broadcast 172.16.1.255
        inet6 fe80::6aa7:c933:f327:66da  prefixlen 64  scopeid 0x20<link>
        ether 48:b0:2d:f7:54:9d  txqueuelen 1000  (Ethernet)
        RX packets 32713413  bytes 47931526712 (47.9 GB)
        RX errors 0  dropped 65971  overruns 0  frame 0
        TX packets 8802537  bytes 2467886371 (2.4 GB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
        device interrupt 20  base 0xd000  

... <deleted entries>

wlP1p1s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.16.1.139  netmask 255.255.254.0  broadcast 172.16.1.255
        inet6 fe80::72a6:3e62:7340:934  prefixlen 64  scopeid 0x20<link>
        ether 60:ff:9e:25:25:03  txqueuelen 1000  (Ethernet)
        RX packets 10615  bytes 238572419 (238.5 MB)
        RX errors 0  dropped 62815  overruns 0  frame 0
        TX packets 392  bytes 2704827 (2.7 MB)
        TX errors 0  dropped 151 overruns 0  carrier 0  collisions 0

Two observations:

  • I was originally invoking the Meta versions of the Llama models (from Hugging Face) rather than the dusty-nv ones; thanks for the correction
  • The dustynv/mlc:r36.4.0 container pulls without any issue, but my choice of model(s) remains a problem (see the quick check below)
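A quick way to confirm that a given Hugging Face repo id exists and is accessible with my token (a sketch using the huggingface_hub API; the model id is the one from the command above):

$ python3 -c "from huggingface_hub import model_info; print(model_info('dusty-nv/Llama-3.2-1B-Instruct-q4f16_ft-MLC').id)"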

So I need some newbie guidance on which specific model(s) to run for my first successful test.
Docker’s bridge (docker0) is on 172.17.0.1.
The Nano is reachable on 172.16.0.32 (enP8p1s0, wired) and 172.16.1.139 (wlP1p1s0, wireless).

I’ll try the Gemma 2 models shortly.

Thanks.

Regards.

P.S.
Oops!
I need to run the commands again. My mistake: I don’t have /mnt/nvme/cache. Sorry, let me fix that and run again.
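For the record, the fix should just be creating the host-side directory before bind-mounting it (assuming the NVMe is in fact mounted at /mnt/nvme):

$ sudo mkdir -p /mnt/nvme/cache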

Hi,

Just to double-confirm: are you able to run it with the suggestion shared in the topic below?

Thanks.

No, I ran into an error from the basic torchvision test:

$ docker -v
Docker version 27.5.1, build 9f9e405
$ jetson-containers build text-generation-webui
... 
Successfully installed pillow-11.2.1 torchvision-0.21.0
++ lsb_release -rs
+ '[' 22.04 = 20.04 ']'
 ---> Removed intermediate container 3c1f2f43288b
 ---> e1c2af11c8a6
Successfully built e1c2af11c8a6
Successfully tagged text-generation-webui:r36.4-cu126-22.04-torchvision
...
docker run -t --rm --network=host --runtime=nvidia \
  --volume /ssd/projects/jetson-containers/packages/pytorch/torchvision:/test \
  --volume /ssd/projects/jetson-containers/data:/data \
  --workdir /test \
  text-generation-webui:r36.4-cu126-22.04-torchvision \
    /bin/bash -c 'python3 test.py'


testing torchvision...
Traceback (most recent call last):
  File "/test/test.py", line 12, in <module>
    import torchvision
  File "/usr/local/lib/python3.10/dist-packages/torchvision/__init__.py", line 10, in <module>
    from torchvision import _meta_registrations, datasets, io, models, ops, transforms, utils  # usort:skip
  File "/usr/local/lib/python3.10/dist-packages/torchvision/_meta_registrations.py", line 164, in <module>
    def meta_nms(dets, scores, iou_threshold):
  File "/usr/local/lib/python3.10/dist-packages/torch/library.py", line 828, in register
    use_lib._register_fake(op_name, func, _stacklevel=stacklevel + 1)
  File "/usr/local/lib/python3.10/dist-packages/torch/library.py", line 198, in _register_fake
    handle = entry.fake_impl.register(func_to_register, source)
  File "/usr/local/lib/python3.10/dist-packages/torch/_library/fake_impl.py", line 31, in register
    if torch._C._dispatch_has_kernel_for_dispatch_key(self.qualname, "Meta"):
RuntimeError: operator torchvision::nms does not exist
[21:58:05] Failed building:  text-generation-webui

Traceback (most recent call last):
  File "/ssd/projects/jetson-containers/jetson_containers/build.py", line 129, in <module>
    build_container(**vars(args))
  File "/ssd/projects/jetson-containers/jetson_containers/container.py", line 244, in build_container
    test_container(container_name, pkg, simulate)
  File "/ssd/projects/jetson-containers/jetson_containers/container.py", line 431, in test_container
    status = subprocess.run(cmd.replace(_NEWLINE_, ' '), executable='/bin/bash', shell=True, check=True)
  File "/usr/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command 'docker run -t --rm --network=host --runtime=nvidia   --volume /ssd/projects/jetson-containers/packages/pytorch/torchvision:/test   --volume /ssd/projects/jetson-containers/data:/data   --workdir /test   text-generation-webui:r36.4-cu126-22.04-torchvision     /bin/bash -c 'python3 test.py' 2>&1 | tee /ssd/projects/jetson-containers/logs/20250514_214139/test/text-generation-webui_r36.4-cu126-22.04-torchvision_test.py.txt; exit ${PIPESTATUS[0]}' returned non-zero exit status 1.

I’ve backtracked a bit, trying to understand all the options available when invoking jetson-containers, so I can work around this error:

RuntimeError: operator torchvision::nms does not exist
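From what I’ve read, this error typically means torchvision’s compiled ops were built against a different torch than the one actually installed. A quick in-container sanity check (a sketch; the image tag is taken from the build log above, and the torchvision version is read from package metadata since importing torchvision is exactly what crashes):

$ docker run -t --rm --runtime=nvidia \
  text-generation-webui:r36.4-cu126-22.04-torchvision \
  python3 -c "import torch; from importlib.metadata import version; print('torch:', torch.__version__, '| cuda:', torch.cuda.is_available(), '| torchvision:', version('torchvision'))"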

Regards.

P.S.
The stable-diffusion container (suggested in the original thread) builds, but the server.py script does not appear to run successfully, since port 7860 never opens.
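A generic way to confirm whether anything is actually listening on that port on the host:

$ ss -tlnp | grep 7860 || echo "nothing listening on 7860"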

Currently, the installed pytorch and torchvision version numbers are:

>>> import torch
>>> torch.__version__
'2.7.0+cpu'
>>> import torchvision
>>> torchvision.__version__
'0.22.0'

These were installed during the jetson-containers build text-generation-webui step.
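One thing that stands out to me: the +cpu suffix denotes the CPU-only PyTorch wheel from PyPI rather than a CUDA-enabled build, which is easy to confirm (and would matter on a Jetson, where the GPU is the whole point):

$ python3 -c "import torch; print(torch.__version__, '| cuda available:', torch.cuda.is_available())"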

Regards.

Basic check:

$ python3 ./check_nms.py 
Your installed torchvision version 0.22.0 supports nms.
Example nms output: tensor([0, 1])
$ 
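For completeness, here is a minimal stand-in for what that script checks (the exact contents of my check_nms.py aren’t shown above; the boxes and scores here are made up, two non-overlapping boxes so both indices survive and the output is tensor([0, 1])):

$ python3 - <<'EOF'
import torch
from torchvision.ops import nms
boxes  = torch.tensor([[0., 0., 10., 10.], [20., 20., 30., 30.]])
scores = torch.tensor([0.9, 0.8])
print("Example nms output:", nms(boxes, scores, iou_threshold=0.5))
EOF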

Hi,

By default, it should download the package from the link below:

Does it work if you manually update the package?
Thanks.