I am trying to deploy multiple ONNX models on Triton Inference Server running on my Jetson AGX Orin.
Triton starts successfully, but no models are being loaded, even though the server itself is running fine.
Below is the log output:
This is my folder structure:
triton_models/
│
├── ppekit/
│ ├── config.pbtxt
│ └── 1/
│ └── model.onnx
│
├── gloves_shoes/
│ ├── config.pbtxt
│ └── 1/
│ └── model.onnx
│
└── ifr/
├── config.pbtxt
└── 1/
└── model.onnx
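For reference, each config.pbtxt is roughly along these lines (a sketch only; the input/output names and dims below are placeholders and must match whatever each ONNX model actually exposes):

name: "ppekit"
platform: "onnxruntime_onnx"
max_batch_size: 1
input [
  {
    name: "images"        # placeholder; use the real input name from the ONNX model
    data_type: TYPE_FP32
    dims: [ 3, 640, 640 ]
  }
]
output [
  {
    name: "output0"       # placeholder; use the real output name and shape from the ONNX model
    data_type: TYPE_FP32
    dims: [ -1, -1 ]
  }
]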
I followed this as a reference to deploy my models: Triton Inference Server with Ultralytics YOLO11.
Please help me solve this issue and deploy my models correctly to Triton Inference Server.
Y-T-G
December 9, 2025, 9:53am
You need to use the correct Docker image. For Jetson, that is the igpu image:
nvcr.io/nvidia/tritonserver:24.09-py3-igpu
Y-T-G
December 9, 2025, 9:55am
Also, what command did you use to start the Docker container? Did you pass --gpus? On Jetson you should pass --runtime nvidia, not --gpus.
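For example, a run command along these lines should work (the host model path and port mappings here are just illustrative placeholders):

# Run the Jetson (iGPU) Triton image with the NVIDIA container runtime instead of --gpus
docker run --rm -it \
  --runtime nvidia \
  -p 8000:8000 -p 8001:8001 -p 8002:8002 \
  -v /path/to/triton_models:/models \
  nvcr.io/nvidia/tritonserver:24.09-py3-igpu \
  tritonserver --model-repository=/models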
Hi @ffc0927e4460cf300fd61fb77,
This is not a TensorRT issue. You should post this on the NVIDIA Jetson Forums or the Triton Inference Server GitHub.
What I can see from the logs:
CUDA driver version is insufficient for CUDA runtime version
You are running Triton 24.09 (which requires CUDA 12/JetPack 6).
Reference: TensorFlow Release 24.09 - NVIDIA Docs
Your Jetson is likely still on JetPack 5 (which uses CUDA 11 by default).
See here: JetPack SDK | NVIDIA Developer.
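You can confirm what you are actually running with something like this (run on the Jetson host itself, not inside the container):

# JetPack meta-package version (if installed via apt)
sudo apt-cache show nvidia-jetpack | grep Version
# L4T / BSP release string
cat /etc/nv_tegra_release
# CUDA toolkit version, if the toolkit is installed on the host
nvcc --version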
So switching to a Triton container version that matches your JetPack release could work. I recommend checking with the forums/channels above for the exact version to use.
I’ll keep this thread open for others to have their opinions.
Thank you.