Unknown error when using custom plugins in TensorRT

I am using a custom Plugin class with TensorRT 4.0 and have implemented the kernel calls myself. I wrote my own CUDA code for ReLU and upsample, and I call it from my TensorRT C++ code. I get intermittent segmentation faults (sometimes it runs perfectly), which is very confusing. I used gdb to trace the segmentation fault, but I can’t understand why it occurs. Any help would be appreciated.
The kernel calls include two Relu calls and two Upsample plugin calls. I still have to add more Relu plugin calls, but it already gives a segmentation fault.

Plugin call: auto relu_plugin = std::unique_ptr<Leaky_Relu>(new Leaky_Relu());

(gdb) where

#0  0x00007fffedd12454 in nvinfer1::cudnn::PluginLayer::execute(nvinfer1::cudnn::CommonContext const&, nvinfer1::cudnn::ExecutionParameters const&) () from /usr/lib/x86_64-linux-gnu/libnvinfer.so.4

#1  0x00007fffedcbdaa3 in nvinfer1::cudnn::ExecutionContext::enqueue(int, void**, CUstream_st*, CUevent_st**) ()
   from /usr/lib/x86_64-linux-gnu/libnvinfer.so.4

#2  0x00000000004041bc in doInference (context=warning: RTTI symbol not found for class 'nvinfer1::cudnn::ExecutionContext'
..., input=input@entry=0x7fffffb61540, output_82=output_82@entry=0x7fffffa8ee70, 
    output_93=output_93@entry=0x7fffffab8fd0, output_106=output_106@entry=0x7fffffd5c540, batchSize=batchSize@entry=1)
    at ./sampleCmodel.cpp:3201

#3  0x00000000004036ee in main (argc=<optimized out>, argv=<optimized out>) at ./sampleCmodel.cpp:3282

I don’t see where the segfault is happening. Might be the same, but can you share the backtrace (bt) from gdb?

You mentioned this doesn’t fault every time, which implies there’s a transient variable involved. Are you running the engine in multiple threads? How often does the fault occur? If you run the engine in a loop, how many iterations does it take for the segfault to happen? Just some debug ideas.

Also, recommend upgrading to TRT5.x if possible. We have made many improvements and fixes since TRT4.

That is the gdb backtrace already included in the question. I am using TRT4 because I am on a TX2, which does not support TRT5. When I run the inference code in a loop, the first iteration works fine but the second iteration gives a segmentation fault. The engine runs on a single thread.

I think it has to do with the tactic TensorRT chooses during optimization. Sometimes the same executable segfaults and at other times it does not. Is there a flag or similar option that keeps the optimizer from selecting the tactic that segfaults?

(gdb) where
#0  0x00007fffedd125ae in nvinfer1::cudnn::PluginLayer::releaseResources() () from /usr/lib/x86_64-linux-gnu/libnvinfer.so.4

#1  0x00007fffedd125ec in nvinfer1::cudnn::PluginLayer::~PluginLayer() () from /usr/lib/x86_64-linux-gnu/libnvinfer.so.4

#2  0x00007fffedd12709 in nvinfer1::cudnn::PluginLayer::~PluginLayer() () from /usr/lib/x86_64-linux-gnu/libnvinfer.so.4

#3  0x00007fffedcbb04e in nvinfer1::cudnn::Engine::~Engine() () from /usr/lib/x86_64-linux-gnu/libnvinfer.so.4

#4  0x00007fffedcbb231 in nvinfer1::cudnn::Engine::destroy() () from /usr/lib/x86_64-linux-gnu/libnvinfer.so.4

#5  0x0000000000403716 in main (argc=<optimized out>, argv=<optimized out>) at ./sampleCmodel.cpp:3288

I also get this segmentation fault.
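The backtrace above crashes in releaseResources while the engine is being destroyed. One possible explanation, which nothing in this thread confirms, is that the plugin object held by the std::unique_ptr is destroyed before the engine that still points to it. The sketch below only illustrates that ordering rule in plain C++; FakePlugin, FakeEngine and runSafeOrder are hypothetical stand-ins, not TensorRT types, and the assumption that the engine calls back into the plugin on teardown is mine.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a user plugin; NOT a TensorRT type.
struct FakePlugin {
    explicit FakePlugin(std::vector<std::string>& log) : log_(log) {}
    ~FakePlugin() { log_.push_back("plugin destroyed"); }
    void terminate() { log_.push_back("plugin terminate"); }
    std::vector<std::string>& log_;
};

// Hypothetical stand-in for an engine that keeps a non-owning plugin pointer.
struct FakeEngine {
    FakeEngine(FakePlugin* p, std::vector<std::string>& log)
        : plugin_(p), log_(log) {}
    ~FakeEngine() {
        // Assumption: teardown calls back into the plugin. This would be a
        // use-after-free if the plugin had already been destroyed.
        plugin_->terminate();
        log_.push_back("engine destroyed");
    }
    FakePlugin* plugin_;
    std::vector<std::string>& log_;
};

// Locals are destroyed in reverse order of declaration, so declaring the
// plugin first keeps it alive until after the engine has been torn down.
std::vector<std::string> runSafeOrder() {
    std::vector<std::string> log;
    {
        auto plugin = std::unique_ptr<FakePlugin>(new FakePlugin(log));
        auto engine = std::unique_ptr<FakeEngine>(
            new FakeEngine(plugin.get(), log));
    }  // engine is destroyed here first, then plugin
    return log;
}
```

If the two declarations were swapped, or the plugin pointer went out of scope inside a helper function while the engine lived on, the engine destructor would touch freed memory, which can crash on some runs and not others.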

It’s interesting that you say “the first iteration works fine but the second iteration gives a segmentation fault”. That usually points to reusing a resource/handle that’s not cleaned up or appropriately re-initialized on subsequent loops.
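As a minimal sketch of what "appropriately re-initialized" can look like, the class below makes its setup and teardown safe to call repeatedly. ScratchOwner and runCycles are hypothetical names, and std::malloc/std::free stand in for cudaMalloc/cudaFree so the example runs without a GPU; it is not the poster's plugin code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical plugin-like resource owner. std::malloc/std::free are
// stand-ins for cudaMalloc/cudaFree (an assumption for illustration).
class ScratchOwner {
public:
    ScratchOwner() : buffer_(nullptr), bytes_(0) {}
    ~ScratchOwner() { terminate(); }
    ScratchOwner(const ScratchOwner&) = delete;
    ScratchOwner& operator=(const ScratchOwner&) = delete;

    // Returns 0 on success. Safe to call again: any buffer left over from
    // a previous iteration is released first.
    int initialize(std::size_t bytes) {
        terminate();
        buffer_ = std::malloc(bytes);
        if (buffer_ == nullptr) return 1;
        bytes_ = bytes;
        return 0;
    }

    // Safe to call more than once: the pointer is nulled after the free,
    // so a second call cannot free the same memory twice.
    void terminate() {
        if (buffer_ != nullptr) {
            std::free(buffer_);
            buffer_ = nullptr;
            bytes_ = 0;
        }
    }

    bool ready() const { return buffer_ != nullptr; }
    std::size_t size() const { return bytes_; }

private:
    void* buffer_;
    std::size_t bytes_;
};

// Runs several initialize/terminate cycles on one object and returns how
// many of them succeeded.
int runCycles(int iterations, std::size_t bytes) {
    ScratchOwner owner;
    int ok = 0;
    for (int i = 0; i < iterations; ++i) {
        if (owner.initialize(bytes) == 0 && owner.ready()) ++ok;
        owner.terminate();
        owner.terminate();  // a repeated call must be harmless
    }
    return ok;
}
```

A plugin that frees a device buffer without resetting the pointer, or that allocates only in its constructor but frees on every teardown, would pass the first loop iteration and fail on the second, which matches the symptom described.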