Get error message when converting Qwen to int4-GPTQ in TensorRT-LLM on AGX Orin

It's good news that Jetson can use TensorRT-LLM, as described at the following link: TensorRT-LLM 🆕 - NVIDIA Jetson AI Lab.

However, I ran into a problem when running TensorRT-LLM on AGX Orin.

I have tried both the TensorRT-LLM container and the wheel installation.
TensorRT-LLM cannot work in the container because the TensorRT installed inside it is incomplete.
With the tensorrt_llm 0.12.0 wheel for Jetson, converting the Qwen2.5 model fails with "KeyError: 'model.layers.0.self_attn.q_proj.qweight'". Please see the details below.

# run convert command
python3 /home/chat/TensorRT-LLM/examples/qwen/convert_checkpoint.py \
  --model_dir /home/chat/Downloads/qwen-7b \
  --output_dir /home/chat/Downloads/tllm_checkpoint_1gpu_gptq \
  --dtype float16 \
  --use_weight_only \
  --weight_only_precision int4_gptq \
  --per_group

# message displayed when running the command
[TensorRT-LLM] TensorRT-LLM version: 0.12.0
0.12.0
Loading checkpoint shards: 100%|██████████| 4/4 [00:07<00:00, 1.94s/it]
loading weight in each layer...: 0%| | 0/28 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/home/chat/TensorRT-LLM/examples/qwen/convert_checkpoint.py", line 308, in <module>
    main()
  File "/home/chat/TensorRT-LLM/examples/qwen/convert_checkpoint.py", line 300, in main
    convert_and_save_hf(args)
  File "/home/chat/TensorRT-LLM/examples/qwen/convert_checkpoint.py", line 256, in convert_and_save_hf
    execute(args.workers, [convert_and_save_rank] * world_size, args)
  File "/home/chat/TensorRT-LLM/examples/qwen/convert_checkpoint.py", line 263, in execute
    f(args, rank)
  File "/home/chat/TensorRT-LLM/examples/qwen/convert_checkpoint.py", line 246, in convert_and_save_rank
    qwen = QWenForCausalLM.from_hugging_face(
  File "/home/chat/.venv/lib/python3.10/site-packages/tensorrt_llm/models/qwen/model.py", line 313, in from_hugging_face
    weights = load_weights_from_hf_gptq_model(hf_model, config)
  File "/home/chat/.venv/lib/python3.10/site-packages/tensorrt_llm/models/qwen/convert.py", line 1365, in load_weights_from_hf_gptq_model
    comp_part = model_params[prefix + key_list[0] + comp + suf]
KeyError: 'model.layers.0.self_attn.q_proj.qweight'
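
For reference, the traceback shows load_weights_from_hf_gptq_model looking up a model.layers.0.self_attn.q_proj.qweight tensor, so the int4_gptq conversion path expects a checkpoint that already contains GPTQ tensors (*.qweight, *.qzeros, *.scales). Below is a minimal diagnostic sketch, assuming the checkpoint is stored as .safetensors shards and the safetensors package is installed, to check whether those tensors are actually present under --model_dir:

# Minimal sketch (not from the original post): list tensor names in the
# checkpoint and look for GPTQ tensors (qweight/qzeros/scales).
import glob
import os
from safetensors import safe_open

model_dir = "/home/chat/Downloads/qwen-7b"  # same directory passed as --model_dir

gptq_tensors = []
for shard in sorted(glob.glob(os.path.join(model_dir, "*.safetensors"))):
    with safe_open(shard, framework="pt") as f:
        gptq_tensors += [name for name in f.keys()
                         if name.endswith((".qweight", ".qzeros", ".scales"))]

if gptq_tensors:
    print(f"Found {len(gptq_tensors)} GPTQ tensors, e.g. {gptq_tensors[0]}")
else:
    print("No qweight/qzeros/scales tensors found; the checkpoint does not look "
          "GPTQ-quantized, which would explain the KeyError above.")

If nothing is found, the directory holds an unquantized model, and a GPTQ-quantized Qwen checkpoint (or quantizing one first, for example with AutoGPTQ as suggested later in this thread) would be needed before using --weight_only_precision int4_gptq.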

We don't have a TensorRT-LLM release available for the DevZone release.
Please see Can Drive Orin support TensorRT-LLM? - #2 by SivaRamaKrishnaNV

This forum is exclusively for developers who are part of the NVIDIA DRIVE® AGX SDK Developer Program. To post in the forum, please use an account associated with your corporate or university email address.
This helps us ensure that the forum remains a platform for verified members of the developer program.

Hi, thanks a lot for your reply.
Please also check the other ticket that mentions TensorRT-LLM on AGX Orin.

This might resolve your problem.

The MaziyarPanahi/Meta-Llama-3-8B-Instruct-GPTQ repo lists its requirements, and AutoGPTQ is the only one of those packages not already provided with TensorRT-LLM.

git clone https://github.com/AutoGPTQ/AutoGPTQ

If you aren't using conda, edit setup.py and change this line to: conda_cuda_include_dir = "/usr/local/cuda/include"

export BUILD_CUDA_EXT=1
export TORCH_CUDA_ARCH_LIST="8.7"
export COMPILE_MARLIN=1
MAX_JOBS=10 python -m pip wheel . --no-build-isolation -w dist --no-clean
pip install dist/auto_gptq-0.8.0.dev0+cu126-cp310-cp310-linux_aarch64.whl --user
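
After installing the wheel, a quick sanity check (a minimal sketch, assuming the module is importable as auto_gptq and exposes __version__, and that PyTorch with CUDA support is installed):

python3 -c "import torch, auto_gptq; print(auto_gptq.__version__, torch.cuda.is_available())"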
