Issue with onnxruntime when using CUDAExecutionProvider

Hello,

I am working on inference with an ONNX model. The code runs successfully on my laptop, but I have an issue when running it on a Jetson.

My environment:
- Jetson AGX Orin
- JetPack 5.1.4
- onnx 1.17.0
- onnx-graphsurgeon 0.3.12
- onnxruntime-gpu 1.16.3 (installed using the wheel from the Jetson Zoo)
- CUDA 11.4

Issue:
I checked the available providers, and all of them are listed: ['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'CPUExecutionProvider'].

When I set the provider to 'CPUExecutionProvider', it works normally. However, when I switch to 'CUDAExecutionProvider', it gets stuck at the line "outputs = ort_sess.run(None, ort_inputs)" and only shows the inference results 2-3 minutes later. While it is stuck, I check the GPU status with "tegrastats", which shows the GPU load at around 99%.
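For reference, here is a minimal sketch of how I set up the session (the model path and the input shape are placeholders, not my actual files):

```python
import time
import numpy as np
import onnxruntime as ort

# Confirm which execution providers this build of onnxruntime offers.
print(ort.get_available_providers())
# ['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'CPUExecutionProvider']

# "model.onnx" is a placeholder for my object detection model.
ort_sess = ort.InferenceSession("model.onnx",
                                providers=["CUDAExecutionProvider"])

# Dummy input; the 1x3x640x640 shape is an example, not my real input.
inp = ort_sess.get_inputs()[0]
ort_inputs = {inp.name: np.random.rand(1, 3, 640, 640).astype(np.float32)}

# This is the call that hangs for 2-3 minutes on the Jetson with
# CUDAExecutionProvider, while tegrastats shows ~99% GPU load.
start = time.time()
outputs = ort_sess.run(None, ort_inputs)
print(f"first run: {time.time() - start:.2f} s")
```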

Are there any recommendations to solve this issue? I would appreciate any suggestions. Thank you.

Dear @nrodwarna ,
Could you share the model, the repro code, and the steps?
Also, which GPU do you have on x86? What is the inference time on x86?

Does that mean the inference time on CPU is lower than on GPU?

Thank you for your response.

The device I'm using is a Jetson AGX Orin Developer Kit, and the link below contains the object detection model and code.
I did not record the inference time on the GPU, as it did not produce any result. I believe there is an issue with the setup or a bottleneck somewhere.

https://drive.google.com/drive/folders/1BZdSHiikksJg0f_VD3uaZ2fnKQ1-6iYK?usp=drive_link

May I know which GPU is used on the x86 machine?

NVIDIA Tegra Orin (nvgpu)/integrated
OpenGL Version: 4.6.0 NVIDIA 35.5.0

Dear @nrodwarna ,
I am asking about the GPU used in your laptop to get an idea of the expected performance on the laptop vs. the Jetson.

Sorry for the misunderstanding.

It's an NVIDIA GeForce RTX 3060 6 GB.

Dear @nrodwarna ,
Does this issue still need support? Is it possible to test the issue on the latest release?

Comparing the specs of the GPU and the Jetson devkit, the 3060 is more powerful (roughly 2x the CUDA cores).

Yes, I haven't tried the latest release yet, but I will.
By the way, it now works with the TensorRT model, but not with the ONNX one.