Triton Inference Server not loading YOLO11 models

I am trying to deploy multiple ONNX models on Triton Inference Server running on my Jetson AGX Orin.
Triton starts successfully, but no models are being loaded, even though the server itself is running fine.

Below is the log output:

This is my folder structure:

triton_models/
├── ppekit/
│   ├── config.pbtxt
│   └── 1/
│       └── model.onnx
├── gloves_shoes/
│   ├── config.pbtxt
│   └── 1/
│       └── model.onnx
└── ifr/
    ├── config.pbtxt
    └── 1/
        └── model.onnx
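
For anyone reproducing this, a config.pbtxt for one of these models looks roughly like the sketch below. This is only illustrative: the tensor names and dims are assumptions based on a default Ultralytics YOLO11 ONNX export (640x640, static batch of 1) and have to match what the actual model.onnx exposes. Recent Triton releases can also auto-complete most of these fields for ONNX models if they are omitted.

name: "ppekit"
platform: "onnxruntime_onnx"
max_batch_size: 0          # 0 = no Triton-managed batching; dims below include the batch dim
input [
  {
    name: "images"         # assumption: default input name of an Ultralytics YOLO11 ONNX export
    data_type: TYPE_FP32
    dims: [ 1, 3, 640, 640 ]
  }
]
output [
  {
    name: "output0"        # assumption: default output name of an Ultralytics YOLO11 ONNX export
    data_type: TYPE_FP32
    dims: [ 1, -1, -1 ]    # variable dims, since the class count differs per model
  }
]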

I followed this as a reference to deploy my models: Triton Inference Server with Ultralytics YOLO11

Please help me solve this issue and deploy my models correctly to Triton Inference Server.

You need to use the correct Docker image. For Jetson, you need the igpu Docker image:

nvcr.io/nvidia/tritonserver:24.09-py3-igpu

Also, what command did you use to start the Docker container? Did you pass --gpus? On Jetson you should pass only --runtime nvidia, not --gpus.
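
For reference, a typical launch on Jetson looks roughly like this. The host path ~/triton_models is an assumption; adjust it to wherever your model repository actually lives:

# use the NVIDIA container runtime (not --gpus) and mount the model repository
docker run -it --rm --runtime nvidia \
  -p 8000:8000 -p 8001:8001 -p 8002:8002 \
  -v ~/triton_models:/models \
  nvcr.io/nvidia/tritonserver:24.09-py3-igpu \
  tritonserver --model-repository=/models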

Hi @ffc0927e4460cf300fd61fb77 ,

This is not a TensorRT issue. You should post this on the NVIDIA Jetson Forums or the Triton Inference Server GitHub.

What I can see from the logs:
CUDA driver version is insufficient for CUDA runtime version

You are running Triton 24.09 (which requires CUDA 12/JetPack 6).
Reference: TensorFlow Release 24.09 - NVIDIA Docs
Your Jetson is likely still on JetPack 5 (which uses CUDA 11 by default).
See here: JetPack SDK | NVIDIA Developer.
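
If you want to confirm what the device is actually running, you can check directly on the Jetson (the nvidia-jetpack package and nvcc are only present if the JetPack components are installed):

# L4T / JetPack release string
cat /etc/nv_tegra_release

# JetPack meta-package version, if installed
apt-cache show nvidia-jetpack | grep Version

# CUDA toolkit version on the host, if installed
nvcc --version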

So downgrading to a Triton container release that is compatible with your JetPack version could work. I recommend checking the above forums/channels for the exact resolution.

I’ll keep this thread open for others to share their opinions.

Thank you.
