Error while exporting TensorRT model

Whenever I export my custom TensorFlow model with TensorRT 32-bit or 16-bit optimizations, I get the following error:

I0815 11:30:50.225565  239551 gpu_device.cc:1098] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10586 MB memory) -> physical GPU (device: 0, name: TITAN V, pci bus id: 0000:02:00.0, compute capability: 7.0)
I0815 11:30:50.780259  239551 convert_graph.cc:828] MULTIPLE tensorrt candidate conversion: 9
I0815 11:30:50.780498  239551 convert_nodes.cc:2923] Segment @scope '', converted to graph
E0815 11:30:50.780515  239551 convert_graph.cc:415] Can't find a device placement for the op!
I0815 11:30:50.784586  239551 convert_nodes.cc:2923] Segment @scope '', converted to graph
E0815 11:30:50.784605  239551 convert_graph.cc:415] Can't find a device placement for the op!
I0815 11:30:50.792328  239551 convert_nodes.cc:2923] Segment @scope 'resnet_v1_50/', converted to graph
E0815 11:30:50.792375  239551 convert_graph.cc:415] Can't find a device placement for the op!
I0815 11:30:50.833066  239551 convert_nodes.cc:2923] Segment @scope 'resnet_v1_50/block4/unit_1/bottleneck_v1/conv2/', converted to graph
E0815 11:30:50.833114  239551 convert_graph.cc:415] Can't find a device placement for the op!
I0815 11:30:50.854989  239551 convert_nodes.cc:2923] Segment @scope 'resnet_v1_50/block4/', converted to graph
E0815 11:30:50.855037  239551 convert_graph.cc:415] Can't find a device placement for the op!
I0815 11:30:50.891869  239551 convert_nodes.cc:2923] Segment @scope 'resnet_v1_50/block4/unit_2/bottleneck_v1/conv2/', converted to graph
E0815 11:30:50.891915  239551 convert_graph.cc:415] Can't find a device placement for the op!
I0815 11:30:50.928116  239551 convert_nodes.cc:2923] Segment @scope 'resnet_v1_50/block4/', converted to graph
E0815 11:30:50.928162  239551 convert_graph.cc:415] Can't find a device placement for the op!
I0815 11:30:50.962199  239551 convert_nodes.cc:2923] Segment @scope 'resnet_v1_50/block4/unit_3/bottleneck_v1/conv2/', converted to graph
E0815 11:30:50.962249  239551 convert_graph.cc:415] Can't find a device placement for the op!
I0815 11:30:51.000802  239551 convert_nodes.cc:2923] Segment @scope 'resnet_v1_50/', converted to graph
E0815 11:30:51.000845  239551 convert_graph.cc:415] Can't find a device placement for the op!
W0815 11:30:51.035388  239551 convert_graph.cc:799] Cluster is set but device '' is not found in the cluster
W0815 11:30:51.035430  239551 convert_graph.cc:916] Can't identify the cuda device. Running on device 0 
I0815 11:30:51.224386  239551 convert_graph.cc:927] Engine my_trt_op_0 creation for segment 0, composed of 4 nodes succeeded.
W0815 11:30:51.224439  239551 convert_graph.cc:799] Cluster is set but device '' is not found in the cluster
W0815 11:30:51.224457  239551 convert_graph.cc:916] Can't identify the cuda device. Running on device 0 
I0815 11:30:51.393236  239551 convert_graph.cc:927] Engine my_trt_op_1 creation for segment 1, composed of 5 nodes succeeded.
W0815 11:30:51.393291  239551 convert_graph.cc:799] Cluster is set but device '' is not found in the cluster
W0815 11:30:51.393305  239551 convert_graph.cc:916] Can't identify the cuda device. Running on device 0 
E0815 11:30:51.510846  239551 trt_logger.cc:38] DefaultLogger runtime.cpp (16) - Cuda Error in allocate: 2
E0815 11:30:51.512978  239551 trt_logger.cc:38] DefaultLogger runtime.cpp (16) - Cuda Error in allocate: 2
W0815 11:30:51.514583  239551 convert_graph.cc:933] Engine resnet_v1_50/my_trt_op_2 creation for segment 2, composed of 195 nodes failed: Internal: Failed to build TensorRT engine. Skipping...
W0815 11:30:51.514606  239551 convert_graph.cc:799] Cluster is set but device '' is not found in the cluster
W0815 11:30:51.514611  239551 convert_graph.cc:916] Can't identify the cuda device. Running on device 0 
I0815 11:30:51.686117  239551 convert_graph.cc:927] Engine resnet_v1_50/block4/unit_1/bottleneck_v1/conv2/my_trt_op_3 creation for segment 3, composed of 5 nodes succeeded.
W0815 11:30:51.686174  239551 convert_graph.cc:799] Cluster is set but device '' is not found in the cluster
W0815 11:30:51.686190  239551 convert_graph.cc:916] Can't identify the cuda device. Running on device 0 
F0815 11:30:51.789828  239551 logging.cc:79] assert.h assertion failed at customWinogradConvActLayer.cpp:48 in nvinfer1::cudnn::WinogradConvActLayer::WinogradConvActLayer(const string&, const EngineTensors&, const EngineTensors&, const nvinfer1::ConvolutionParameters&, bool, const std::vector<float>&): matchNbDims(inputs[0], outputs[0]) && (inputs.size() == 1 || inputs[1].extent == outputs[0].extent)
*** Check failure stack trace: ***
    @     0x557d8737c846  base_logging::LogMessage::SendToLog()
    @     0x557d8737cfee  base_logging::LogMessage::Flush()
    @     0x557d8737f549  base_logging::LogMessageFatal::~LogMessageFatal()
    @     0x557d8737acf6  __assert_fail
    @     0x557d7c4617c2  nvinfer1::cudnn::WinogradConvActLayer::WinogradConvActLayer()
    @     0x7f2afa3feef0  (unknown)
*** SIGABRT received by PID 239551 (TID 239551) from PID 239551; ***

The model exports just fine with the default TensorFlow optimizations. Any help is appreciated.

Hello,

I have the same problem with TensorRT 4, using a custom converter that I wrote for MXNet → TensorRT that incorporates my custom operators. In my case the error occurs after graph optimization, during the timing and algorithm selection for each individual layer.

Me too.