TensorRT LLM

I tried to run the TensorRT LLM container on Jetson Thor, but I encountered this error:
ImportError: libnvinfer.so.10: cannot open shared object file: No such file or directory
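A quick way to confirm whether the dynamic loader can resolve the library at all (independent of Python package imports) is a small ctypes probe. This is a diagnostic sketch, not from the quickstart guide:

```python
import ctypes

def can_load(libname: str) -> bool:
    """Return True if the dynamic loader can resolve libname."""
    try:
        ctypes.CDLL(libname)
        return True
    except OSError:
        return False

# On a broken install this prints False; inside a container where
# TensorRT is correctly installed it should print True.
print(can_load("libnvinfer.so.10"))
```

If this prints False, the TensorRT runtime libraries are either missing from the image or not on the loader's search path (LD_LIBRARY_PATH / ldconfig).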

I’m following the quickstart guide.

This is in my docker compose file:

  tensorrt-llm:
    image: nvcr.io/nvidia/tensorrt-llm/release:1.2.0rc1
    ports:
      - 8000:8000
    ulimits:
      memlock: 1
      stack: 67108864
    gpus: all
    shm_size: 8gb
    ipc: host
    env_file:
      - .env  
    volumes:
      - $HOME/.cache:/root/.cache
      - $PWD:/workspace
    entrypoint:
      [
        "trtllm-serve",
        "--port", "8000",
        "--host", "0.0.0.0",
        "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
      ]  

Hi,

Based on the release notes, Thor’s GPU architecture (sm_110) hasn’t been added yet.
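Conceptually, the runtime gates on the device's compute capability against a built-in supported list. The sketch below illustrates this check; the architecture set here is illustrative only (the authoritative list is in the TensorRT-LLM release notes), with (11, 0) standing for Thor's sm_110:

```python
# Illustrative supported-architecture set -- NOT the real list from
# TensorRT-LLM; consult the release notes for the actual support matrix.
SUPPORTED_CC = {(8, 0), (8, 6), (8, 7), (8, 9), (9, 0), (10, 0)}

def is_supported(cc: tuple[int, int]) -> bool:
    """Return True if the (major, minor) compute capability is in the list."""
    return cc in SUPPORTED_CC

print(is_supported((11, 0)))  # Thor, sm_110 -> False
print(is_supported((9, 0)))   # Hopper, sm_90 -> True
```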
We will check with our internal team for more information and share it with you.

Thanks.

Hi,

TensorRT LLM doesn’t support Thor.
Please use vLLM or SGLang instead.

Thanks.

Thank you so much. That wasn’t clear to me.

I’ll use vLLM for now.
(You mention SGLang, but from this pinned topic I understand that there’s no SGLang container yet.)

Do you expect that support will be added to TensorRT LLM for Thor’s GPU architecture (sm_110)?

I just pulled this container nvcr.io/nvidia/tensorrt-llm/release:1.2.0rc1

It is version 1.2.0rc1 and was published October 21, 2025. env shows lots of stuff, including sm_110. It is arm64. I presume it is using torch and/or Triton as the backend; it has both installed.
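Note that seeing sm_110 mentioned in the container environment does not by itself mean the shipped wheels contain kernels for it. A small sketch for checking a TORCH_CUDA_ARCH_LIST-style value (the value below is hypothetical; read the real one from env inside the container):

```python
def parse_arch_list(value: str) -> set[str]:
    """Parse a TORCH_CUDA_ARCH_LIST-style string such as
    "8.7;9.0;11.0+PTX" into a set of bare compute capabilities."""
    archs = set()
    for item in value.split(";"):
        item = item.strip().removesuffix("+PTX")
        if item:
            archs.add(item)
    return archs

# Hypothetical value for illustration only.
env_value = "8.7;9.0;11.0+PTX"
print("11.0" in parse_arch_list(env_value))  # True
```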

I was attempting to run ./examples/…/llama, but it failed, probably because of my old llama3.1_instruct checkpoint, and I had already turned off my Thor. I’ll try more tomorrow.

python convert_checkpoint.py \
    --model_dir /root/.cache/huggingface/hub/models-…-llama3.1 \
    --output_dir ./tllm_checkpoint_1gpu_tp1 \
    --dtype float16 \
    --tp_size 1

trtllm-build --checkpoint_dir ./tllm_checkpoint_1gpu_tp1 \
    --output_dir ./tmp/llama/8B/trt_engines/fp16/1-gpu/ \
    --gemm_plugin auto

Hi,

SGLang is coming soon.
You can find some examples for vLLM in the link below:

Just to be clear, TensorRT-LLM won’t be available on Jetson.
However, we will support TensorRT Edge-LLM on Thor in an upcoming release.

Thanks.

