Description
I've been trying to run a model with onnxruntime-gpu on a Jetson AGX Orin Developer Kit running JetPack 5.0.1. I followed the guide at “faxu dot github dot io slash onnxinference” (sorry, I can't post links as a new account) to build ONNX Runtime from source with CUDA and TensorRT support.
This is the build command I used:
./build.sh --config Release --update --build --parallel --build_wheel \
  --use_tensorrt --use_cuda --cuda_home /usr/local/cuda --cudnn_home /usr/lib/aarch64-linux-gnu \
  --tensorrt_home /usr/lib/aarch64-linux-gnu
When I run the model with ONNX Runtime, the TensorRT execution provider works fine, but the CUDA execution provider does not.
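In case it's useful, here is the sanity check I use to confirm which execution providers the wheel was actually built with (standard onnxruntime API; the expected output in the comment is an assumption based on my build flags):
import onnxruntime as ort
# Should list the providers compiled into this wheel, e.g.
# ['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'CPUExecutionProvider']
print(ort.get_available_providers())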
Environment
Device: Jetson AGX Orin Developer Kit
Operating System + Version: Ubuntu 20.04
Python Version (if applicable): 3.8
Container: l4t-ml:r34.1.1-py3
CUDA Version: 11.4
TensorRT Version: 8.4
Relevant Files
onnxruntime_gpu-1.13.0-cp38-cp38-linux_aarch64.whl (24.9 MB)
Steps To Reproduce
On my Jetson AGX Orin with JetPack 5 installed, I launch a Docker container with this command:
docker run -it --rm --runtime nvidia --network host -v test:/opt/test l4t-ml:r34.1.1-py3
Here is a snippet of the code I run in a notebook inside the container:
# Install the onnxruntime-gpu wheel I built; you can find it attached to this post
!pip install onnxruntime_gpu-1.13.0-cp38-cp38-linux_aarch64.whl
import onnxruntime as ort
import numpy as np
providers = [
('TensorrtExecutionProvider', {
'trt_fp16_enable': True,
}),
('CUDAExecutionProvider', {
'device_id': 0,
'arena_extend_strategy': 'kNextPowerOfTwo',
'gpu_mem_limit': 2 * 1024 * 1024 * 1024,
'cudnn_conv_algo_search': 'EXHAUSTIVE',
'do_copy_in_default_stream': True,
})
]
image = np.zeros((3, 640, 640)).astype(np.float32)
session_trt = ort.InferenceSession("my_onnx_model.onnx", providers=providers)
ort_inputs = {session_trt.get_inputs()[0].name: image[None, :, :, :]}
out = session_trt.run(None, ort_inputs)
and I get this exception:
2022-07-26 16:16:12.161594665 [E:onnxruntime:, sequential_executor.cc:368 Execute] Non-zero status code returned while running Sigmoid node. Name:'Sigmoid_36' Status Message: CUDA error cudaErrorNoKernelImageForDevice:no kernel image is available for execution on the device
---------------------------------------------------------------------------
Fail Traceback (most recent call last)
Input In [16], in <cell line: 3>()
1 session_trt = ort.InferenceSession("my_onnx_model.onnx", providers=providers)
2 ort_inputs = {session_trt.get_inputs()[0].name: image[None, :, :, :]}
----> 3 out = session_trt.run(None, ort_inputs)
File /usr/local/lib/python3.8/dist-packages/onnxruntime/capi/onnxruntime_inference_collection.py:200, in Session.run(self, output_names, input_feed, run_options)
198 output_names = [output.name for output in self._outputs_meta]
199 try:
--> 200 return self._sess.run(output_names, input_feed, run_options)
201 except C.EPFail as err:
202 if self._enable_fallback:
Fail: [ONNXRuntimeError] : 1 : FAIL : Non-zero status code returned while running Sigmoid node. Name:'Sigmoid_36' Status Message: CUDA error cudaErrorNoKernelImageForDevice:no kernel image is available for execution on the device
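To isolate the CUDA EP from TensorRT, the repro can be reduced to just the CUDA provider (a sketch using the same model and dummy input as above, not something I've verified beyond the failure described):
# Same session, but registering only the CUDA execution provider
session_cuda = ort.InferenceSession("my_onnx_model.onnx", providers=['CUDAExecutionProvider'])
out = session_cuda.run(None, {session_cuda.get_inputs()[0].name: image[None, :, :, :]})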
I run the same code on a Jetson Nano with JetPack 4.6 and onnxruntime-gpu 1.11 downloaded from Jetson_Zoo#ONNX_Runtime (sorry, can't post the link since I'm a new user), and everything works fine.
I'd like to be able to use the CUDA execution provider to test more ONNX models and compare their performance across the different execution providers on my Jetson Orin.
If I run my PyTorch model on CUDA, it works fine, and torch.cuda.is_available() returns True.
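For completeness, this is the PyTorch sanity check I ran (the get_device_name line is just an extra confirmation that the GPU is visible):
import torch
print(torch.cuda.is_available())      # returns True on my Orin
print(torch.cuda.get_device_name(0))  # extra check: prints the GPU name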
Any idea why the CUDA execution provider is not working in ONNX Runtime?