Onnxruntime error

harsha.tejas2002 · July 2, 2021, 11:45am

Hi,
I trained a Faster R-CNN ResNet-50 object detection model on a custom dataset with PyTorch and converted it to an ONNX model.
When I ran onnxruntime against the ONNX model, it loaded the model. But when the model tried to predict on an image with session.run(), it returned the following error:

[E:onnxruntime:Default, cuda_call.cc:117 CudaCall] CUDNN failure 4: CUDNN_STATUS_INTERNAL_ERROR ; GPU=0 ; hostname=harshatejas ; expr=cudnnFindConvolutionForwardAlgorithmEx( s_.handle, s_.x_tensor, s_.x_data, s_.w_desc, s_.w_data, s_.conv_desc, s_.y_tensor, s_.y_data, 1, &algo_count, &perf, algo_search_workspace.get(), max_ws_size);
2021-07-02 16:59:04.808417944 [E:onnxruntime:, sequential_executor.cc:339 Execute] Non-zero status code returned while running Conv node. Name:‘Conv_441’ Status Message: CUDNN error executing cudnnFindConvolutionForwardAlgorithmEx( s_.handle, s_.x_tensor, s_.x_data, s_.w_desc, s_.w_data, s_.conv_desc, s_.y_tensor, s_.y_data, 1, &algo_count, &perf, algo_search_workspace.get(), max_ws_size)
terminate called after throwing an instance of ‘onnxruntime::OnnxRuntimeException’
what(): /home/onnxruntime/onnxruntime-py36/onnxruntime/core/providers/cuda/cuda_call.cc:121 bool onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*) [with ERRTYPE = cudaError; bool THRW = true] /home/onnxruntime/onnxruntime-py36/onnxruntime/core/providers/cuda/cuda_call.cc:115 bool onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*) [with ERRTYPE = cudaError; bool THRW = true] CUDA failure 702: the launch timed out and was terminated ; GPU=0 ; hostname=harshatejas ; expr=cudaEventDestroy(read_event_);

Aborted (core dumped)

I changed the runtime to run on the CPU with session.set_providers(['CPUExecutionProvider'])
BOOM!! The code ran successfully and predicted perfectly on the image, but it took a very long time to run the prediction. I want the prediction done on the GPU so that it takes less time.
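
For readers following along, here is a minimal sketch of choosing the execution provider when the session is created, rather than switching it afterwards. The helper names `pick_providers` and `make_session`, the preference order, and the `model.onnx` path are assumptions for illustration; `get_available_providers()` and the `providers=` argument are part of the onnxruntime Python API. Note that this only selects providers at session creation. It would not recover from a CUDA failure that aborts the process inside session.run(), like the one shown above.

```python
from typing import List, Sequence

# Assumed preference order: try the GPU first, keep the CPU as a fallback.
PREFERRED = ("CUDAExecutionProvider", "CPUExecutionProvider")


def pick_providers(available: Sequence[str],
                   preferred: Sequence[str] = PREFERRED) -> List[str]:
    """Return the preferred providers that this build offers, in preference order."""
    chosen = [p for p in preferred if p in available]
    if not chosen:
        raise RuntimeError(
            f"none of {list(preferred)} found in {list(available)}")
    return chosen


def make_session(model_path: str, force_cpu: bool = False):
    """Create an InferenceSession; model_path is a placeholder such as 'model.onnx'."""
    # Imported lazily so pick_providers can be used without onnxruntime installed.
    import onnxruntime as ort

    preferred = ("CPUExecutionProvider",) if force_cpu else PREFERRED
    providers = pick_providers(ort.get_available_providers(), preferred)
    return ort.InferenceSession(model_path, providers=providers)
```

Calling `make_session("model.onnx", force_cpu=True)` would correspond to the CPU-only workaround described above.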

Thank You.

AastaLLL · July 5, 2021, 3:24am

Hi,

How did you install the onnxruntime package?
Did you use the prebuilt package from the link below?
https://www.elinux.org/Jetson_Zoo#ONNX_Runtime

Also, could you validate the model in a desktop onnxruntime environment?
Please help verify whether this issue is specific to the Jetson device.

Thanks.

harsha.tejas2002 · July 5, 2021, 3:11pm

Hey,

Thanks for the reply, and yes, I installed the onnxruntime package from Jetson Zoo - eLinux.org.
The onnxruntime version is 1.8.0.

The model ran successfully in the Colab onnxruntime environment with a GPU. Here’s a screenshot of the result

AastaLLL · July 9, 2021, 5:34am

Hi,

We want to reproduce this error in our environment.
Could you share the model (model.onnx) with us?

Thanks.

harsha.tejas2002 · July 9, 2021, 9:04am

Hi,

Here’s a folder that contains model.onnx, create_onnx.py, predict_onnx.py and the image:
https://drive.google.com/drive/folders/103hJwjMvpcT_F87jOw1F5gt3tMuZ4uny?usp=sharing

And this is my GitHub repo, where you can find the data, train.py and the PyTorch model: harshatejas/pytorch_custom_object_detection (Training PyTorch Faster-RCNN on custom dataset)

Thank You.

AastaLLL · July 20, 2021, 5:46am

Hi,

Thanks for sharing the files.
We can confirm that the CUDA 702 error also reproduces in our environment.

We are checking this internally.
Will get back to you later.

Thanks.

AastaLLL · July 22, 2021, 8:35am

Hi,

The package is built by the ONNX team.
Could you file an issue on their GitHub for help?

Thanks.

harsha.tejas2002 · July 22, 2021, 8:38am

Hi,

Is it possible to run a Faster R-CNN ONNX model on the Jetson Nano?

Thank you.

AastaLLL · August 3, 2021, 4:22am

Hi,

YES. It’s recommended to use TensorRT instead.
TensorRT is our library for fast inference and is also optimized for the Jetson platform.

You can test it with the following command directly:

/usr/src/tensorrt/bin/trtexec --onnx=[onnx/model]
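
If you want to script that step, the following is a rough sketch that assembles the trtexec command line in Python. The helper name `build_trtexec_cmd` and the file names `model.onnx` and `model.engine` are hypothetical. `--saveEngine` and `--fp16` are standard trtexec options, but whether FP16 suits this particular model is an assumption you would need to check.

```python
from typing import List, Optional

# Path quoted in the reply above; adjust if TensorRT is installed elsewhere.
TRTEXEC = "/usr/src/tensorrt/bin/trtexec"


def build_trtexec_cmd(onnx_path: str,
                      engine_path: Optional[str] = None,
                      fp16: bool = False) -> List[str]:
    """Build the argument list for a trtexec run on an ONNX model."""
    cmd = [TRTEXEC, f"--onnx={onnx_path}"]
    if engine_path:
        # Ask trtexec to serialize the built engine so it can be reused later.
        cmd.append(f"--saveEngine={engine_path}")
    if fp16:
        # Allow FP16 kernels; accuracy should be re-checked if this is enabled.
        cmd.append("--fp16")
    return cmd


if __name__ == "__main__":
    cmd = build_trtexec_cmd("model.onnx", engine_path="model.engine", fp16=True)
    print(" ".join(cmd))
    # On the Jetson itself you could then run:
    #   import subprocess; subprocess.run(cmd, check=True)
```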

Thanks.

system · October 10, 2021, 4:41am

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.
