trtexec Caffe to TensorRT conversion: deserializeCudaEngine segfault

GPU type: RTX Titan
NVIDIA driver version: 415.27
CUDA version:
cuDNN version: 7.3
TensorRT version:


For the past few weeks I have been trying to convert my custom Caffe model to TensorRT. Doing this with trtexec works perfectly fine and achieves a very satisfying speedup in the trtexec profiler. However, when I try to deserialize the PLAN file in C++ using deserializeCudaEngine, I always run into a segfault.

What could be causing this segfault? The stack trace isn't too helpful, as I cannot step into the deserializeCudaEngine function.

What I have tried so far:

  1. Ensure that trtexec is run on the same machine and GPU that the C++ code runs on (I know the optimizations are device-specific).
  2. Ensure the TensorRT versions used by trtexec and by my C++ code are EXACTLY equal. Still segfaults.
  3. Try a simpler Caffe prototxt with only an input, one convolution, and an output. Still segfaults.
  4. Try the conversion using a manual Caffe-to-TensorRT C++ executable that my colleague wrote some time ago instead of trtexec. deserializeCudaEngine works correctly for my super simple input/conv/output network in this case. Sadly, my colleague's tool lacks support for some layers in the custom network that I actually want to convert. trtexec does support these layers, so I still want to use it if possible.
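For completeness, the trtexec invocation for a Caffe model looks roughly like the following. The paths and the output blob name are placeholders, and the exact flag for saving the engine has varied between TensorRT releases, so treat this as a sketch rather than a verbatim command:

```shell
# Sketch of a Caffe -> TensorRT conversion with trtexec.
# model.prototxt / model.caffemodel / "prob" / model.plan are placeholders.
trtexec --deploy=model.prototxt \
        --model=model.caffemodel \
        --output=prob \
        --batch=1 \
        --engine=model.plan   # older trtexec; newer releases use --saveEngine
```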

Because of the above points I am reasonably certain my C++ code is not the issue (the plan file created by my colleague's code works).
Phrased differently: what pitfalls are there to keep in mind when generating the plan file via trtexec and loading it later with deserializeCudaEngine? Could there, for example, be some version conflict in CUDA instead, even if the TensorRT versions are equal?

Not that I think it will be too helpful, but this is the prototxt of the super simple model I used for testing:
layer {
  name: "data_1"
  type: "Input"
  top: "data_1"
  input_param {
    shape { dim: 1 dim: 3 dim: 1440 dim: 1920 }
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data_1"
  top: "conv1"
  param {
    name: "conv1_w"
  }
  convolution_param {
    num_output: 16
    bias_term: false
    pad: 0
    kernel_size: 1
    stride: 1
  }
}

I have managed to solve this. My code is part of a large codebase with integrated third-party libraries, and it turned out that cuDNN 7.3.0 was used there. I ran trtexec outside of this environment, causing it to use the system-installed version of cuDNN: 7.3.1.

Conclusion: versions are important, not only for TensorRT itself, but also for the supporting libraries.