Errors in the NanoVLM tutorial

I followed the tutorial “NanoVLM - Efficient Multimodal Pipeline” (NanoVLM - NVIDIA Jetson AI Lab) on a Jetson Orin Nano (8GB) with a 128GB SD card.
I ran the following command for each of three models (changing only the “--model” flag), and each produced a different error.

I want to know how to solve each error.

jetson-containers run $(autotag nano_llm) \
  python3 -m nano_llm.chat --api=mlc \
    --model Efficient-Large-Model/VILA1.5-3b \
    --max-context-len 256 \
    --max-new-tokens 32
  1. VILA1.5-3b (--model Efficient-Large-Model/VILA1.5-3b)
    After the model download completed, an error occurred during the MLC quantization step: “Exception: The model config should contain information about maximum sequence length.”
    Details are as follows (a quick config check is sketched after the log).
seongkyu@ubuntu:~$ jetson-containers run $(autotag nano_llm) \
>   python3 -m nano_llm.chat --api=mlc \
>     --model Efficient-Large-Model/VILA1.5-3b \
>     --max-context-len 256 \
>     --max-new-tokens 32
Namespace(disable=[''], output='/tmp/autotag', packages=['nano_llm'], prefer=['local', 'registry', 'build'], quiet=False, user='dustynv', verbose=False)
-- L4T_VERSION=35.5.0  JETPACK_VERSION=5.1  CUDA_VERSION=11.4
-- Finding compatible container image for ['nano_llm']
dustynv/nano_llm:r35.4.1
[sudo] password for seongkyu:
localuser:root being added to access control list
+ docker run --runtime nvidia -it --rm --network host --volume /tmp/argus_socket:/tmp/argus_socket --volume /etc/enctune.conf:/etc/enctune.conf --volume /etc/nv_tegra_release:/etc/nv_tegra_release --volume /tmp/nv_jetson_model:/tmp/nv_jetson_model --volume /var/run/dbus:/var/run/dbus --volume /var/run/avahi-daemon/socket:/var/run/avahi-daemon/socket --volume /var/run/docker.sock:/var/run/docker.sock --volume /home/seongkyu/jetson-containers/data:/data --device /dev/snd --device /dev/bus/usb -e DISPLAY=:1 -v /tmp/.X11-unix/:/tmp/.X11-unix -v /tmp/.docker.xauth:/tmp/.docker.xauth -e XAUTHORITY=/tmp/.docker.xauth --device /dev/video0 --device /dev/video1 --device /dev/i2c-0 --device /dev/i2c-1 --device /dev/i2c-2 --device /dev/i2c-3 --device /dev/i2c-4 --device /dev/i2c-5 --device /dev/i2c-6 --device /dev/i2c-7 --device /dev/i2c-8 --device /dev/i2c-9 dustynv/nano_llm:r35.4.1 python3 -m nano_llm.chat --api=mlc --model Efficient-Large-Model/VILA1.5-3b --max-context-len 256 --max-new-tokens 32
/usr/lib/python3/dist-packages/requests/__init__.py:89: RequestsDependencyWarning: urllib3 (1.26.18) or chardet (3.0.4) doesn't match a supported version!
  warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
/usr/local/lib/python3.8/dist-packages/transformers/utils/hub.py:124: FutureWarning: Using `TRANSFORMERS_CACHE` is deprecated and will be removed in v5 of Transformers. Use `HF_HOME` instead.
  warnings.warn(
Fetching 13 files: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 13/13 [00:00<00:00, 122.77it/s]
Fetching 17 files:   0%|                                                                                                                                                            | 0/17 [00:00<?, ?it/s]
llm/model-00001-of-00002.safetensors: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4.97G/4.97G [1:12:51<00:00, 721kB/s]
Fetching 17 files: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 17/17 [1:12:51<00:00, 257.17s/it]
09:01:17 | INFO | loading /data/models/huggingface/models--Efficient-Large-Model--VILA1.5-3b/snapshots/699b413ed13620957e955bd7fb938852afa258fc with MLC
09:01:20 | INFO | backing up original model config to /data/models/huggingface/models--Efficient-Large-Model--VILA1.5-3b/snapshots/699b413ed13620957e955bd7fb938852afa258fc/config.json.backup
09:01:20 | INFO | patching model config with {'model_type': 'llama'}
09:01:20 | INFO | running MLC quantization:

python3 -m mlc_llm.build --model /data/models/mlc/dist/models/VILA1.5-3b --quantization q4f16_ft --target cuda --use-cuda-graph --use-flash-attn-mqa --sep-embed --max-seq-len 256 --artifact-path /data/models/mlc/dist/VILA1.5-3b-ctx256


Using path "/data/models/mlc/dist/models/VILA1.5-3b" for model "VILA1.5-3b"
Target configured: cuda -keys=cuda,gpu -arch=sm_87 -max_num_threads=1024 -max_shared_memory_per_block=49152 -max_threads_per_block=1024 -registers_per_block=65536 -thread_warp_size=32
Traceback (most recent call last):
  File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.8/dist-packages/mlc_llm/build.py", line 47, in <module>
    main()
  File "/usr/local/lib/python3.8/dist-packages/mlc_llm/build.py", line 43, in main
    core.build_model_from_args(parsed_args)
  File "/usr/local/lib/python3.8/dist-packages/mlc_llm/core.py", line 834, in build_model_from_args
    mod, param_manager, params, model_config = model_generators[args.model_category].get_model(
  File "/usr/local/lib/python3.8/dist-packages/mlc_llm/relax_model/llama.py", line 1333, in get_model
    raise Exception(
Exception: The model config should contain information about maximum sequence length.
Traceback (most recent call last):
  File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/opt/NanoLLM/nano_llm/chat/__main__.py", line 29, in <module>
    model = NanoLLM.from_pretrained(
  File "/opt/NanoLLM/nano_llm/nano_llm.py", line 71, in from_pretrained
    model = MLCModel(model_path, **kwargs)
  File "/opt/NanoLLM/nano_llm/models/mlc.py", line 59, in __init__
    quant = MLCModel.quantize(model_path, self.config, method=quantization, max_context_len=max_context_len, **kwargs)
  File "/opt/NanoLLM/nano_llm/models/mlc.py", line 278, in quantize
    subprocess.run(cmd, executable='/bin/bash', shell=True, check=True)  
  File "/usr/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command 'python3 -m mlc_llm.build --model /data/models/mlc/dist/models/VILA1.5-3b --quantization q4f16_ft --target cuda --use-cuda-graph --use-flash-attn-mqa --sep-embed --max-seq-len 256 --artifact-path /data/models/mlc/dist/VILA1.5-3b-ctx256 ' returned non-zero exit status 1.
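My guess is that the VILA1.5-3b config.json simply lacks whichever field mlc_llm reads for the maximum sequence length. A quick check I am considering running inside the container (the snapshot path is copied from the log above; the field names are my assumption about what mlc_llm looks for):

import json

# Downloaded snapshot path, copied from the "loading ... with MLC" log line above
cfg_path = ("/data/models/huggingface/models--Efficient-Large-Model--VILA1.5-3b"
            "/snapshots/699b413ed13620957e955bd7fb938852afa258fc/config.json")

with open(cfg_path) as f:
    cfg = json.load(f)

# Field names are an assumption about what mlc_llm's llama builder reads
for key in ("max_sequence_length", "max_position_embeddings"):
    print(key, "=", cfg.get(key, "<missing>"))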
  2. Obsidian-3B (--model NousResearch/Obsidian-3B-V0.5)
    I suspected the previous problem occurred because only the ‘VILA1.5-3b’ model files on Hugging Face were missing the maximum-sequence-length information, so I switched models and tried again.
    However, the download from Hugging Face stalled partway through, and the following error occurred.



    I think we need access to the “force_download” and “resume_download” parameters inside nano_llm, but the tutorial doesn’t say how (see the sketch below for what I mean).
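Something like the following is what I have in mind: pre-fetching the weights into the mounted data directory so the interrupted download can be resumed before nano_llm runs (the cache_dir is my assumption, based on the /data volume mount shown in the docker command above):

from huggingface_hub import snapshot_download

# Resume (or force-restart) the interrupted download into the same cache nano_llm uses;
# /data/models/huggingface is assumed from the /data volume mount shown above.
snapshot_download(
    repo_id="NousResearch/Obsidian-3B-V0.5",
    cache_dir="/data/models/huggingface",
    resume_download=True,   # pick up partially downloaded files
    force_download=False,   # set True to discard them and start over
)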

  3. Llava-7b (--model liuhaotian/llava-v1.6-vicuna-7b)
    I suspected problem 2 was caused by an unstable connection to the Hugging Face server during the download, so I switched models once again.
    llava-7b made it through the model download and quantization without issue.
    However, after entering the path of an image file at the prompt and then asking a question about it, the following InternalError occurred (a rough size calculation is sketched after the log).

seongkyu@ubuntu:~$ jetson-containers run $(autotag nano_llm) \
>   python3 -m nano_llm.chat --api=mlc \
>     --model liuhaotian/llava-v1.6-vicuna-7b \
>     --max-context-len 256 \
>     --max-new-tokens 32 \
>     --prompt /data/prompts/images.json
Namespace(disable=[''], output='/tmp/autotag', packages=['nano_llm'], prefer=['local', 'registry', 'build'], quiet=False, user='dustynv', verbose=False)
-- L4T_VERSION=35.5.0  JETPACK_VERSION=5.1  CUDA_VERSION=11.4
-- Finding compatible container image for ['nano_llm']
dustynv/nano_llm:r35.4.1
[sudo] password for seongkyu:
!Sorry, try again.
[sudo] password for seongkyu:
Sorry, try again.
[sudo] password for seongkyu:
localuser:root being added to access control list
+ docker run --runtime nvidia -it --rm --network host --volume /tmp/argus_socket:/tmp/argus_socket --volume /etc/enctune.conf:/etc/enctune.conf --volume /etc/nv_tegra_release:/etc/nv_tegra_release --volume /tmp/nv_jetson_model:/tmp/nv_jetson_model --volume /var/run/dbus:/var/run/dbus --volume /var/run/avahi-daemon/socket:/var/run/avahi-daemon/socket --volume /var/run/docker.sock:/var/run/docker.sock --volume /home/seongkyu/jetson-containers/data:/data --device /dev/snd --device /dev/bus/usb -e DISPLAY=:1 -v /tmp/.X11-unix/:/tmp/.X11-unix -v /tmp/.docker.xauth:/tmp/.docker.xauth -e XAUTHORITY=/tmp/.docker.xauth --device /dev/video0 --device /dev/video1 --device /dev/i2c-0 --device /dev/i2c-1 --device /dev/i2c-2 --device /dev/i2c-3 --device /dev/i2c-4 --device /dev/i2c-5 --device /dev/i2c-6 --device /dev/i2c-7 --device /dev/i2c-8 --device /dev/i2c-9 dustynv/nano_llm:r35.4.1 python3 -m nano_llm.chat --api=mlc --model liuhaotian/llava-v1.6-vicuna-7b --max-context-len 256 --max-new-tokens 32 --prompt /data/prompts/images.json
/usr/lib/python3/dist-packages/requests/__init__.py:89: RequestsDependencyWarning: urllib3 (1.26.18) or chardet (3.0.4) doesn't match a supported version!
  warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
/usr/local/lib/python3.8/dist-packages/transformers/utils/hub.py:124: FutureWarning: Using `TRANSFORMERS_CACHE` is deprecated and will be removed in v5 of Transformers. Use `HF_HOME` instead.
  warnings.warn(
04:45:14 | INFO | loading prompts from /data/prompts/images.json
Fetching 10 files: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:00<00:00, 59493.67it/s]
Fetching 13 files: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 13/13 [00:00<00:00, 101.75it/s]
04:45:15 | INFO | loading /data/models/huggingface/models--liuhaotian--llava-v1.6-vicuna-7b/snapshots/deae57a8c0ccb0da4c2661cc1891cc9d06503d11 with MLC
04:45:19 | INFO | device=cuda(0), name=Orin, compute=8.7, max_clocks=624000, multiprocessors=8, max_thread_dims=[1024, 1024, 64], api_version=11040, driver_version=None
04:45:19 | INFO | loading llava-v1.6-vicuna-7b from /data/models/mlc/dist/llava-v1.6-vicuna-7b-ctx256/llava-v1.6-vicuna-7b-q4f16_ft/llava-v1.6-vicuna-7b-q4f16_ft-cuda.so
04:45:28 | WARNING | model library /data/models/mlc/dist/llava-v1.6-vicuna-7b-ctx256/llava-v1.6-vicuna-7b-q4f16_ft/llava-v1.6-vicuna-7b-q4f16_ft-cuda.so was missing metadata
04:46:23 | INFO | loading clip vision model openai/clip-vit-large-patch14-336
<class 'transformers.models.clip.image_processing_clip.CLIPImageProcessor'> openai/clip-vit-large-patch14-336 CLIPImageProcessor {
  "_valid_processor_keys": [
    "images",
    "do_resize",
    "size",
    "resample",
    "do_center_crop",
    "crop_size",
    "do_rescale",
    "rescale_factor",
    "do_normalize",
    "image_mean",
    "image_std",
    "do_convert_rgb",
    "return_tensors",
    "data_format",
    "input_data_format"
  ],
  "crop_size": {
    "height": 336,
    "width": 336
  },
  "do_center_crop": true,
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "CLIPImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "shortest_edge": 336
  }
}

<class 'transformers.models.clip.modeling_clip.CLIPVisionModelWithProjection'> openai/clip-vit-large-patch14-336 CLIPVisionModelWithProjection(
  (vision_model): CLIPVisionTransformer(
    (embeddings): CLIPVisionEmbeddings(
      (patch_embedding): Conv2d(3, 1024, kernel_size=(14, 14), stride=(14, 14), bias=False)
      (position_embedding): Embedding(577, 1024)
    )
    (pre_layrnorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
    (encoder): CLIPEncoder(
      (layers): ModuleList(
        (0-23): 24 x CLIPEncoderLayer(
          (self_attn): CLIPAttention(
            (k_proj): Linear(in_features=1024, out_features=1024, bias=True)
            (v_proj): Linear(in_features=1024, out_features=1024, bias=True)
            (q_proj): Linear(in_features=1024, out_features=1024, bias=True)
            (out_proj): Linear(in_features=1024, out_features=1024, bias=True)
          )
          (layer_norm1): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
          (mlp): CLIPMLP(
            (activation_fn): QuickGELUActivation()
            (fc1): Linear(in_features=1024, out_features=4096, bias=True)
            (fc2): Linear(in_features=4096, out_features=1024, bias=True)
          )
          (layer_norm2): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
        )
      )
    )
    (post_layernorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
  )
  (visual_projection): Linear(in_features=1024, out_features=768, bias=False)
)
┌──────────────┬───────────────────────────────────┐
│ name         │ openai/clip-vit-large-patch14-336 │
├──────────────┼───────────────────────────────────┤
│ input_shape  │ (336, 336)                        │
├──────────────┼───────────────────────────────────┤
│ output_shape │ torch.Size([1, 768])              │
└──────────────┴───────────────────────────────────┘
04:47:44 | INFO | loading mm_projector weights from /data/models/huggingface/models--liuhaotian--llava-v1.6-vicuna-7b/snapshots/deae57a8c0ccb0da4c2661cc1891cc9d06503d11/mm_projector.bin
mm_projector Sequential(
  (0): Linear(in_features=1024, out_features=4096, bias=True)
  (1): GELU(approximate='none')
  (2): Linear(in_features=4096, out_features=4096, bias=True)
)
┌────────────────────────────┬────────────────────────────────────────────────────────────────┐
│ _name_or_path              │ ./checkpoints/vicuna-7b-v1-5                                   │
├────────────────────────────┼────────────────────────────────────────────────────────────────┤
│ architectures              │ ['LlavaLlamaForCausalLM']                                      │
├────────────────────────────┼────────────────────────────────────────────────────────────────┤
│ attention_bias             │ False                                                          │
├────────────────────────────┼────────────────────────────────────────────────────────────────┤
│ attention_dropout          │ 0.0                                                            │
├────────────────────────────┼────────────────────────────────────────────────────────────────┤
│ bos_token_id               │ 1                                                              │
├────────────────────────────┼────────────────────────────────────────────────────────────────┤
│ eos_token_id               │ 2                                                              │
├────────────────────────────┼────────────────────────────────────────────────────────────────┤
│ freeze_mm_mlp_adapter      │ False                                                          │
├────────────────────────────┼────────────────────────────────────────────────────────────────┤
│ freeze_mm_vision_resampler │ False                                                          │
├────────────────────────────┼────────────────────────────────────────────────────────────────┤
│ hidden_act                 │ silu                                                           │
├────────────────────────────┼────────────────────────────────────────────────────────────────┤
│ hidden_size                │ 4096                                                           │
├────────────────────────────┼────────────────────────────────────────────────────────────────┤
│ image_aspect_ratio         │ anyres                                                         │
├────────────────────────────┼────────────────────────────────────────────────────────────────┤
│ image_crop_resolution      │ 224                                                            │
├────────────────────────────┼────────────────────────────────────────────────────────────────┤
│ image_grid_pinpoints       │ [[336, 672], [672, 336], [672, 672], [1008, 336], [336, 1008]] │
├────────────────────────────┼────────────────────────────────────────────────────────────────┤
│ image_split_resolution     │ 224                                                            │
├────────────────────────────┼────────────────────────────────────────────────────────────────┤
│ initializer_range          │ 0.02                                                           │
├────────────────────────────┼────────────────────────────────────────────────────────────────┤
│ intermediate_size          │ 11008                                                          │
├────────────────────────────┼────────────────────────────────────────────────────────────────┤
│ max_position_embeddings    │ 4096                                                           │
├────────────────────────────┼────────────────────────────────────────────────────────────────┤
│ mm_hidden_size             │ 1024                                                           │
├────────────────────────────┼────────────────────────────────────────────────────────────────┤
│ mm_patch_merge_type        │ spatial_unpad                                                  │
├────────────────────────────┼────────────────────────────────────────────────────────────────┤
│ mm_projector_lr            │                                                                │
├────────────────────────────┼────────────────────────────────────────────────────────────────┤
│ mm_projector_type          │ mlp2x_gelu                                                     │
├────────────────────────────┼────────────────────────────────────────────────────────────────┤
│ mm_resampler_type          │                                                                │
├────────────────────────────┼────────────────────────────────────────────────────────────────┤
│ mm_use_im_patch_token      │ False                                                          │
├────────────────────────────┼────────────────────────────────────────────────────────────────┤
│ mm_use_im_start_end        │ False                                                          │
├────────────────────────────┼────────────────────────────────────────────────────────────────┤
│ mm_vision_select_feature   │ patch                                                          │
├────────────────────────────┼────────────────────────────────────────────────────────────────┤
│ mm_vision_select_layer     │ -2                                                             │
├────────────────────────────┼────────────────────────────────────────────────────────────────┤
│ mm_vision_tower            │ openai/clip-vit-large-patch14-336                              │
├────────────────────────────┼────────────────────────────────────────────────────────────────┤
│ mm_vision_tower_lr         │ 2e-06                                                          │
├────────────────────────────┼────────────────────────────────────────────────────────────────┤
│ model_type                 │ llama                                                          │
├────────────────────────────┼────────────────────────────────────────────────────────────────┤
│ num_attention_heads        │ 32                                                             │
├────────────────────────────┼────────────────────────────────────────────────────────────────┤
│ num_hidden_layers          │ 32                                                             │
├────────────────────────────┼────────────────────────────────────────────────────────────────┤
│ num_key_value_heads        │ 32                                                             │
├────────────────────────────┼────────────────────────────────────────────────────────────────┤
│ pad_token_id               │ 0                                                              │
├────────────────────────────┼────────────────────────────────────────────────────────────────┤
│ pretraining_tp             │ 1                                                              │
├────────────────────────────┼────────────────────────────────────────────────────────────────┤
│ rms_norm_eps               │ 1e-05                                                          │
├────────────────────────────┼────────────────────────────────────────────────────────────────┤
│ rope_scaling               │                                                                │
├────────────────────────────┼────────────────────────────────────────────────────────────────┤
│ rope_theta                 │ 10000.0                                                        │
├────────────────────────────┼────────────────────────────────────────────────────────────────┤
│ tie_word_embeddings        │ False                                                          │
├────────────────────────────┼────────────────────────────────────────────────────────────────┤
│ tokenizer_model_max_length │ 4096                                                           │
├────────────────────────────┼────────────────────────────────────────────────────────────────┤
│ tokenizer_padding_side     │ right                                                          │
├────────────────────────────┼────────────────────────────────────────────────────────────────┤
│ torch_dtype                │ bfloat16                                                       │
├────────────────────────────┼────────────────────────────────────────────────────────────────┤
│ transformers_version       │ 4.36.2                                                         │
├────────────────────────────┼────────────────────────────────────────────────────────────────┤
│ tune_mm_mlp_adapter        │ False                                                          │
├────────────────────────────┼────────────────────────────────────────────────────────────────┤
│ tune_mm_vision_resampler   │ False                                                          │
├────────────────────────────┼────────────────────────────────────────────────────────────────┤
│ unfreeze_mm_vision_tower   │ True                                                           │
├────────────────────────────┼────────────────────────────────────────────────────────────────┤
│ use_cache                  │ True                                                           │
├────────────────────────────┼────────────────────────────────────────────────────────────────┤
│ use_mm_proj                │ True                                                           │
├────────────────────────────┼────────────────────────────────────────────────────────────────┤
│ vocab_size                 │ 32000                                                          │
├────────────────────────────┼────────────────────────────────────────────────────────────────┤
│ name                       │ llava-v1.6-vicuna-7b                                           │
├────────────────────────────┼────────────────────────────────────────────────────────────────┤
│ api                        │ mlc                                                            │
├────────────────────────────┼────────────────────────────────────────────────────────────────┤
│ quant                      │ q4f16_ft                                                       │
├────────────────────────────┼────────────────────────────────────────────────────────────────┤
│ type                       │ llama                                                          │
├────────────────────────────┼────────────────────────────────────────────────────────────────┤
│ max_length                 │ 256                                                            │
├────────────────────────────┼────────────────────────────────────────────────────────────────┤
│ prefill_chunk_size         │ -1                                                             │
├────────────────────────────┼────────────────────────────────────────────────────────────────┤
│ load_time                  │ 150.4796289320002                                              │
├────────────────────────────┼────────────────────────────────────────────────────────────────┤
│ params_size                │ 3232.7265625                                                   │
└────────────────────────────┴────────────────────────────────────────────────────────────────┘

04:47:46 | INFO | using chat template 'vicuna-v1' for model llava-v1.6-vicuna-7b
04:47:46 | INFO | model 'llava-v1.6-vicuna-7b', chat template 'vicuna-v1' stop tokens:  ['</s>'] -> [2]
>> PROMPT: /data/images/dogs.jpg

>> PROMPT: What breeds of dogs are in the image?

Exception in thread Thread-2:
Traceback (most recent call last):
  File "/usr/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/opt/NanoLLM/nano_llm/models/mlc.py", line 523, in _run
    self._generate(stream)
  File "/opt/NanoLLM/nano_llm/models/mlc.py", line 458, in _generate
    output = self._prefill(input,  # prefill_with_embed
  File "tvm/_ffi/_cython/./packed_func.pxi", line 332, in tvm._ffi._cy3.core.PackedFuncBase.__call__
  File "tvm/_ffi/_cython/./packed_func.pxi", line 277, in tvm._ffi._cy3.core.FuncCall
  File "tvm/_ffi/_cython/./base.pxi", line 182, in tvm._ffi._cy3.core.CHECK_CALL
  File "/usr/local/lib/python3.8/dist-packages/tvm/_ffi/base.py", line 481, in raise_last_ffi_error
    raise py_err
tvm.error.InternalError: Traceback (most recent call last):
  [bt] (8) /usr/local/lib/python3.8/dist-packages/tvm/libtvm.so(tvm::runtime::relax_vm::VirtualMachineImpl::InvokeBytecode(long, std::vector<tvm::runtime::TVMRetValue, std::allocator<tvm::runtime::TVMRetValue> > const&)+0x230) [0xfffebf9ac6c8]
  [bt] (7) /usr/local/lib/python3.8/dist-packages/tvm/libtvm.so(tvm::runtime::relax_vm::VirtualMachineImpl::RunLoop()+0x210) [0xfffebf9aad58]
  [bt] (6) /usr/local/lib/python3.8/dist-packages/tvm/libtvm.so(tvm::runtime::relax_vm::VirtualMachineImpl::RunInstrCall(tvm::runtime::relax_vm::VMFrame*, tvm::runtime::relax_vm::Instruction)+0x5e4) [0xfffebf9ab5bc]
  [bt] (5) /usr/local/lib/python3.8/dist-packages/tvm/libtvm.so(tvm::runtime::relax_vm::VirtualMachineImpl::InvokeClosurePacked(tvm::runtime::ObjectRef const&, tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)+0x7c) [0xfffebf9a99fc]
  [bt] (4) /usr/local/lib/python3.8/dist-packages/tvm/libtvm.so(tvm::runtime::PackedFuncObj::Extractor<tvm::runtime::PackedFuncSubObj<tvm::runtime::TypedPackedFunc<tvm::runtime::NDArray (tvm::runtime::memory::Storage, long, tvm::runtime::ShapeTuple, DLDataType)>::AssignTypedLambda<tvm::runtime::Registry::set_body_method<tvm::runtime::memory::Storage, tvm::runtime::memory::StorageObj, tvm::runtime::NDArray, long, tvm::runtime::ShapeTuple, DLDataType, void>(tvm::runtime::NDArray (tvm::runtime::memory::StorageObj::*)(long, tvm::runtime::ShapeTuple, DLDataType))::{lambda(tvm::runtime::memory::Storage, long, tvm::runtime::ShapeTuple, DLDataType)#1}>(tvm::runtime::Registry::set_body_method<tvm::runtime::memory::Storage, tvm::runtime::memory::StorageObj, tvm::runtime::NDArray, long, tvm::runtime::ShapeTuple, DLDataType, void>(tvm::runtime::NDArray (tvm::runtime::memory::StorageObj::*)(long, tvm::runtime::ShapeTuple, DLDataType))::{lambda(tvm::runtime::memory::Storage, long, tvm::runtime::ShapeTuple, DLDataType)#1}, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)::{lambda(tvm::runtime::TVMArgs const&, tvm::runtime::TVMRetValue*)#1}> >::Call(tvm::runtime::PackedFuncObj const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, tvm::runtime::TVMRetValue)+0x10) [0xfffebf977638]
  [bt] (3) /usr/local/lib/python3.8/dist-packages/tvm/libtvm.so(tvm::runtime::TypedPackedFunc<tvm::runtime::NDArray (tvm::runtime::memory::Storage, long, tvm::runtime::ShapeTuple, DLDataType)>::AssignTypedLambda<tvm::runtime::Registry::set_body_method<tvm::runtime::memory::Storage, tvm::runtime::memory::StorageObj, tvm::runtime::NDArray, long, tvm::runtime::ShapeTuple, DLDataType, void>(tvm::runtime::NDArray (tvm::runtime::memory::StorageObj::*)(long, tvm::runtime::ShapeTuple, DLDataType))::{lambda(tvm::runtime::memory::Storage, long, tvm::runtime::ShapeTuple, DLDataType)#1}>(tvm::runtime::Registry::set_body_method<tvm::runtime::memory::Storage, tvm::runtime::memory::StorageObj, tvm::runtime::NDArray, long, tvm::runtime::ShapeTuple, DLDataType, void>(tvm::runtime::NDArray (tvm::runtime::memory::StorageObj::*)(long, tvm::runtime::ShapeTuple, DLDataType))::{lambda(tvm::runtime::memory::Storage, long, tvm::runtime::ShapeTuple, DLDataType)#1}, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)::{lambda(tvm::runtime::TVMArgs const&, tvm::runtime::TVMRetValue*)#1}::operator()(tvm::runtime::TVMArgs const, tvm::runtime::TVMRetValue) const+0x27c) [0xfffebf977374]
  [bt] (2) /usr/local/lib/python3.8/dist-packages/tvm/libtvm.so(tvm::runtime::memory::StorageObj::AllocNDArray(long, tvm::runtime::ShapeTuple, DLDataType)+0x3a8) [0xfffebf9268c8]
  [bt] (1) /usr/local/lib/python3.8/dist-packages/tvm/libtvm.so(tvm::runtime::detail::LogFatal::Entry::Finalize()+0x78) [0xfffebd57af58]
  [bt] (0) /usr/local/lib/python3.8/dist-packages/tvm/libtvm.so(tvm::runtime::Backtrace[abi:cxx11]()+0x30) [0xfffebf9236f0]
  File "/opt/mlc-llm/3rdparty/tvm/src/runtime/memory/memory_manager.cc", line 108
InternalError: Check failed: (offset + needed_size <= this->buffer.size) is false: storage allocation failure, attempted to allocate 15360000 at offset 0 in region that is 11272192bytes
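
My rough reading of the failed allocation, assuming the hidden_size of 4096 from the config table above and 2-byte fp16 values (this interpretation may be wrong), is that the image produces far more embedding tokens than the 256-token context I configured:

# Back-of-the-envelope check of the failed allocation (my interpretation; may be wrong)
hidden_size = 4096          # from the config table above
bytes_per_value = 2         # fp16
failed_alloc = 15_360_000   # bytes requested, from the InternalError message
region_size = 11_272_192    # bytes available in the region

print(failed_alloc / (hidden_size * bytes_per_value))   # 1875.0 tokens requested
print(region_size / (hidden_size * bytes_per_value))    # 1376.0 tokens available

Both numbers are far above the --max-context-len of 256 I used, so I wonder whether llava-v1.6 (which tiles the image via its anyres setting) simply needs a larger context length than the tutorial command shows.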

Hi @ygoongood12, I just tried re-running the quantization with the same command as yours on JetPack 5.1.2 / L4T R35 and did not hit the issue. Can you try pulling the latest nano_llm container on your end?

sudo docker pull dustynv/nano_llm:r35.4.1

Thank you for your reply.
I pulled the latest nano_llm container as you suggested, but I got the same error message.
Also, as described in this issue, I changed the model twice and got a different error message each time.
Could you take another look and reply again?

Hi @ygoongood12, sorry about that; I had only seen those error messages in older versions of the container. Given the model download problems you had, I recommend deleting your /data/models/huggingface/models--Efficient-Large-Model--VILA1.5-3b directory.
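For reference, removing the cached download amounts to deleting that directory, roughly like this (the host-side path is assumed from the /data volume mount in your docker command; adjust it if your jetson-containers checkout lives elsewhere):

import shutil

# Delete the partially/incorrectly downloaded VILA snapshot so it gets re-fetched
shutil.rmtree(
    "/home/seongkyu/jetson-containers/data/models/huggingface/models--Efficient-Large-Model--VILA1.5-3b",
    ignore_errors=True,
)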

It may also be related to being on JetPack 5 instead of the latest JetPack 6, although these model tests did pass on JetPack 5. I believe this tag is actually the newer one for JetPack 5 at present: dustynv/nano_llm:24.5.1-r35.4.1
