Nano_LLM or nanollm for Python package?

The NanoLLM documentation at https://www.jetson-ai-lab.com/tutorial_nano-llm.html references nano_llm as a Python package. I couldn’t install it via PyCharm or with pip. There is an installable Python package named nanollm, but it does not provide NanoLLM as an entry point. Is there an update to the documentation?
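For reference, the installs I attempted were along these lines (package names exactly as I tried them):

$ pip3 install nano_llm    # the name the tutorial uses; pip could not install it
$ pip3 install nanollm     # installs, but does not expose a NanoLLM entry point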

Also, I fully understand the resource limitations involved in keeping documentation current, especially given the ever-changing models on Hugging Face. Still, there are some challenges (perhaps owing to my ignorance) in getting access to the documented models, mainly the later versions of Gemma and Llama. My Hugging Face token works fine, and Meta and Google have routinely granted me permission to their respective models, so I can only assume that the documentation at Jetson AI Lab needs corresponding updates.

I’ve cloned jetson-containers and try my best to keep it current, so, setting aside the occasional typo on my end, I feel my experience may give the technical writers of these tutorials some pointers on how to clarify comprehension issues for beginners. Thanks.

Regards.

Hi,

Have you tried the dustynv/mlc:r36.4.0 container?
We have verified that Gemma 2 and Llama 3.2/3.1 can run on Orin Nano with MLC.

You can find the detailed commands on the model page below:

Thanks.

Based on the information at the link you provided, I ran the following command:

$ docker run -it --rm \
  --name llm_server \
  --gpus all \
  -p 9000:9000 \
  -e DOCKER_PULL=always --pull always \
  -e HF_HUB_CACHE=/root/.cache/huggingface \
  -v /mnt/nvme/cache:/root/.cache \
  dustynv/mlc:r36.4.0 \
    sudonim serve \
      --model dusty-nv/Llama-3.2-1B-Instruct-q4f16_ft-MLC \
      --quantization q4f16_ft \
      --max-batch-size 1 \
      --host 0.0.0.0 \
      --port 9000
r36.4.0: Pulling from dustynv/mlc
Digest: sha256:bee28a47c30e022fac458299844c2dd1cb2de26b768d7c104881e1bb1b7082d2
Status: Image is up to date for dustynv/mlc:r36.4.0
docker: Error response from daemon: failed to set up container networking: driver failed programming external connectivity on endpoint llm_server (cf685c1a813b376c2adb550bbdd42e957e4e16abf9fd715fe16a69e289145bdb): Unable to enable DIRECT ACCESS FILTERING - DROP rule:  (iptables failed: iptables --wait -t raw -A PREROUTING -p tcp -d 172.17.0.2 --dport 9000 ! -i docker0 -j DROP: iptables v1.8.7 (legacy): can't initialize iptables table `raw': Table does not exist (do you need to insmod?)
Perhaps iptables or your kernel needs to be upgraded.
 (exit status 3))

Run 'docker run --help' for more information
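Two possible workarounds I’m aware of for the missing `raw' iptables table (sketches only; I haven’t verified either here). The first assumes the kernel ships the raw-table module and it simply isn’t loaded; the second sidesteps the port-publishing rule entirely by using host networking, in which case -p is dropped and the server is reached directly on the host’s own IP:

$ sudo modprobe iptable_raw    # fails if the Jetson kernel was built without it

$ docker run -it --rm \
  --name llm_server \
  --gpus all \
  --network=host \
  -e DOCKER_PULL=always --pull always \
  -e HF_HUB_CACHE=/root/.cache/huggingface \
  -v /mnt/nvme/cache:/root/.cache \
  dustynv/mlc:r36.4.0 \
    sudonim serve \
      --model dusty-nv/Llama-3.2-1B-Instruct-q4f16_ft-MLC \
      --quantization q4f16_ft \
      --max-batch-size 1 \
      --host 0.0.0.0 \
      --port 9000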
$ ifconfig -a

docker0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        inet 172.17.0.1  netmask 255.255.0.0  broadcast 172.17.255.255
        inet6 fe80::bc2d:fbff:fe05:fae8  prefixlen 64  scopeid 0x20<link>
        ether be:2d:fb:05:fa:e8  txqueuelen 0  (Ethernet)
        RX packets 2  bytes 56 (56.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 5  bytes 566 (566.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

enP8p1s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.16.0.32  netmask 255.255.254.0  broadcast 172.16.1.255
        inet6 fe80::6aa7:c933:f327:66da  prefixlen 64  scopeid 0x20<link>
        ether 48:b0:2d:f7:54:9d  txqueuelen 1000  (Ethernet)
        RX packets 32713413  bytes 47931526712 (47.9 GB)
        RX errors 0  dropped 65971  overruns 0  frame 0
        TX packets 8802537  bytes 2467886371 (2.4 GB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
        device interrupt 20  base 0xd000  

... <deleted entries>

wlP1p1s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.16.1.139  netmask 255.255.254.0  broadcast 172.16.1.255
        inet6 fe80::72a6:3e62:7340:934  prefixlen 64  scopeid 0x20<link>
        ether 60:ff:9e:25:25:03  txqueuelen 1000  (Ethernet)
        RX packets 10615  bytes 238572419 (238.5 MB)
        RX errors 0  dropped 62815  overruns 0  frame 0
        TX packets 392  bytes 2704827 (2.7 MB)
        TX errors 0  dropped 151 overruns 0  carrier 0  collisions 0

Two observations:

  • I was originally invoking the Meta versions of the Llama models (from Hugging Face) rather than the dusty-nv ones; thanks for the correction
  • The dustynv/mlc:r36.4.0 container pulls without any issue, but my choice of model(s) remains a problem (see the quick check below)
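A quick way to confirm that a given Hugging Face repo id exists and is accessible with my token (a sketch using the huggingface_hub API; the model id is the one from the command above):

$ python3 -c "from huggingface_hub import model_info; print(model_info('dusty-nv/Llama-3.2-1B-Instruct-q4f16_ft-MLC').id)"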

So I need some newbie guidance on which specific model(s) to run for my first successful test.
Docker’s bridge (docker0) is on 172.17.0.1.
The Nano is reachable on 172.16.0.32 (enP8p1s0, wired) and 172.16.1.139 (wlP1p1s0, wireless).

I’ll try the Gemma 2 models shortly.

Thanks.

Regards.

P.S.
Oops!
I need to run the commands again. My mistake: I don’t have /mnt/nvme/cache. Sorry, let me fix that and run again.
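For the record, the fix should just be creating the host-side directory before bind-mounting it (assuming the NVMe is in fact mounted at /mnt/nvme):

$ sudo mkdir -p /mnt/nvme/cache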

Hi,

Just to double-confirm: are you able to run it with the suggestion shared in the topic below?

Thanks.

No, I ran into an error from the basic torchvision test:

$ docker -v
Docker version 27.5.1, build 9f9e405
$ jetson-containers build text-generation-webui
... 
Successfully installed pillow-11.2.1 torchvision-0.21.0
++ lsb_release -rs
+ '[' 22.04 = 20.04 ']'
 ---> Removed intermediate container 3c1f2f43288b
 ---> e1c2af11c8a6
Successfully built e1c2af11c8a6
Successfully tagged text-generation-webui:r36.4-cu126-22.04-torchvision
...
docker run -t --rm --network=host --runtime=nvidia \
  --volume /ssd/projects/jetson-containers/packages/pytorch/torchvision:/test \
  --volume /ssd/projects/jetson-containers/data:/data \
  --workdir /test \
  text-generation-webui:r36.4-cu126-22.04-torchvision \
    /bin/bash -c 'python3 test.py'


testing torchvision...
Traceback (most recent call last):
  File "/test/test.py", line 12, in <module>
    import torchvision
  File "/usr/local/lib/python3.10/dist-packages/torchvision/__init__.py", line 10, in <module>
    from torchvision import _meta_registrations, datasets, io, models, ops, transforms, utils  # usort:skip
  File "/usr/local/lib/python3.10/dist-packages/torchvision/_meta_registrations.py", line 164, in <module>
    def meta_nms(dets, scores, iou_threshold):
  File "/usr/local/lib/python3.10/dist-packages/torch/library.py", line 828, in register
    use_lib._register_fake(op_name, func, _stacklevel=stacklevel + 1)
  File "/usr/local/lib/python3.10/dist-packages/torch/library.py", line 198, in _register_fake
    handle = entry.fake_impl.register(func_to_register, source)
  File "/usr/local/lib/python3.10/dist-packages/torch/_library/fake_impl.py", line 31, in register
    if torch._C._dispatch_has_kernel_for_dispatch_key(self.qualname, "Meta"):
RuntimeError: operator torchvision::nms does not exist
[21:58:05] Failed building:  text-generation-webui

Traceback (most recent call last):
  File "/ssd/projects/jetson-containers/jetson_containers/build.py", line 129, in <module>
    build_container(**vars(args))
  File "/ssd/projects/jetson-containers/jetson_containers/container.py", line 244, in build_container
    test_container(container_name, pkg, simulate)
  File "/ssd/projects/jetson-containers/jetson_containers/container.py", line 431, in test_container
    status = subprocess.run(cmd.replace(_NEWLINE_, ' '), executable='/bin/bash', shell=True, check=True)
  File "/usr/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command 'docker run -t --rm --network=host --runtime=nvidia   --volume /ssd/projects/jetson-containers/packages/pytorch/torchvision:/test   --volume /ssd/projects/jetson-containers/data:/data   --workdir /test   text-generation-webui:r36.4-cu126-22.04-torchvision     /bin/bash -c 'python3 test.py' 2>&1 | tee /ssd/projects/jetson-containers/logs/20250514_214139/test/text-generation-webui_r36.4-cu126-22.04-torchvision_test.py.txt; exit ${PIPESTATUS[0]}' returned non-zero exit status 1.

I’ve backtracked a bit, trying to understand all the options available when invoking jetson-containers, so I can work around this error:

RuntimeError: operator torchvision::nms does not exist
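From what I’ve read, this error typically means torchvision’s compiled ops were built against a different torch than the one actually installed. A quick in-container sanity check (a sketch; the image tag is taken from the build log above, and the torchvision version is read from package metadata since importing torchvision is exactly what crashes):

$ docker run -t --rm --runtime=nvidia \
  text-generation-webui:r36.4-cu126-22.04-torchvision \
  python3 -c "import torch; from importlib.metadata import version; print('torch:', torch.__version__, '| cuda:', torch.cuda.is_available(), '| torchvision:', version('torchvision'))"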

Regards.

P.S.
The stable-diffusion container (suggested in the original thread) builds, but the server.py script does not appear to run successfully, since port 7860 never opens.
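A generic way to confirm whether anything is actually listening on that port on the host:

$ ss -tlnp | grep 7860 || echo "nothing listening on 7860"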

Currently, the installed pytorch and torchvision version numbers are:

>>> import torch
>>> torch.__version__
'2.7.0+cpu'
>>> import torchvision
>>> torchvision.__version__
'0.22.0'

These were installed during the jetson-containers build text-generation-webui step.
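One thing that stands out to me: the +cpu suffix denotes the CPU-only PyTorch wheel from PyPI rather than a CUDA-enabled build, which is easy to confirm (and would matter on a Jetson, where the GPU is the whole point):

$ python3 -c "import torch; print(torch.__version__, '| cuda available:', torch.cuda.is_available())"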

Regards.

Basic check:

$ python3 ./check_nms.py 
Your installed torchvision version 0.22.0 supports nms.
Example nms output: tensor([0, 1])
$ 
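For completeness, here is a minimal stand-in for what that script checks (the exact contents of my check_nms.py aren’t shown above; the boxes and scores here are made up, two non-overlapping boxes so both indices survive and the output is tensor([0, 1])):

$ python3 - <<'EOF'
import torch
from torchvision.ops import nms
boxes  = torch.tensor([[0., 0., 10., 10.], [20., 20., 30., 30.]])
scores = torch.tensor([0.9, 0.8])
print("Example nms output:", nms(boxes, scores, iou_threshold=0.5))
EOF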

Hi,

By default, it should download the package from the link below:

Does it work if you manually update the package?
Thanks.