Whenever I export my custom TensorFlow model with TensorRT optimizations (32-bit or 16-bit precision), I get the following error:
I0815 11:30:50.225565 239551 gpu_device.cc:1098] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10586 MB memory) -> physical GPU (device: 0, name: TITAN V, pci bus id: 0000:02:00.0, compute capability: 7.0)
I0815 11:30:50.780259 239551 convert_graph.cc:828] MULTIPLE tensorrt candidate conversion: 9
I0815 11:30:50.780498 239551 convert_nodes.cc:2923] Segment @scope '', converted to graph
E0815 11:30:50.780515 239551 convert_graph.cc:415] Can't find a device placement for the op!
I0815 11:30:50.784586 239551 convert_nodes.cc:2923] Segment @scope '', converted to graph
E0815 11:30:50.784605 239551 convert_graph.cc:415] Can't find a device placement for the op!
I0815 11:30:50.792328 239551 convert_nodes.cc:2923] Segment @scope 'resnet_v1_50/', converted to graph
E0815 11:30:50.792375 239551 convert_graph.cc:415] Can't find a device placement for the op!
I0815 11:30:50.833066 239551 convert_nodes.cc:2923] Segment @scope 'resnet_v1_50/block4/unit_1/bottleneck_v1/conv2/', converted to graph
E0815 11:30:50.833114 239551 convert_graph.cc:415] Can't find a device placement for the op!
I0815 11:30:50.854989 239551 convert_nodes.cc:2923] Segment @scope 'resnet_v1_50/block4/', converted to graph
E0815 11:30:50.855037 239551 convert_graph.cc:415] Can't find a device placement for the op!
I0815 11:30:50.891869 239551 convert_nodes.cc:2923] Segment @scope 'resnet_v1_50/block4/unit_2/bottleneck_v1/conv2/', converted to graph
E0815 11:30:50.891915 239551 convert_graph.cc:415] Can't find a device placement for the op!
I0815 11:30:50.928116 239551 convert_nodes.cc:2923] Segment @scope 'resnet_v1_50/block4/', converted to graph
E0815 11:30:50.928162 239551 convert_graph.cc:415] Can't find a device placement for the op!
I0815 11:30:50.962199 239551 convert_nodes.cc:2923] Segment @scope 'resnet_v1_50/block4/unit_3/bottleneck_v1/conv2/', converted to graph
E0815 11:30:50.962249 239551 convert_graph.cc:415] Can't find a device placement for the op!
I0815 11:30:51.000802 239551 convert_nodes.cc:2923] Segment @scope 'resnet_v1_50/', converted to graph
E0815 11:30:51.000845 239551 convert_graph.cc:415] Can't find a device placement for the op!
W0815 11:30:51.035388 239551 convert_graph.cc:799] Cluster is set but device '' is not found in the cluster
W0815 11:30:51.035430 239551 convert_graph.cc:916] Can't identify the cuda device. Running on device 0
I0815 11:30:51.224386 239551 convert_graph.cc:927] Engine my_trt_op_0 creation for segment 0, composed of 4 nodes succeeded.
W0815 11:30:51.224439 239551 convert_graph.cc:799] Cluster is set but device '' is not found in the cluster
W0815 11:30:51.224457 239551 convert_graph.cc:916] Can't identify the cuda device. Running on device 0
I0815 11:30:51.393236 239551 convert_graph.cc:927] Engine my_trt_op_1 creation for segment 1, composed of 5 nodes succeeded.
W0815 11:30:51.393291 239551 convert_graph.cc:799] Cluster is set but device '' is not found in the cluster
W0815 11:30:51.393305 239551 convert_graph.cc:916] Can't identify the cuda device. Running on device 0
E0815 11:30:51.510846 239551 trt_logger.cc:38] DefaultLogger runtime.cpp (16) - Cuda Error in allocate: 2
E0815 11:30:51.512978 239551 trt_logger.cc:38] DefaultLogger runtime.cpp (16) - Cuda Error in allocate: 2
W0815 11:30:51.514583 239551 convert_graph.cc:933] Engine resnet_v1_50/my_trt_op_2 creation for segment 2, composed of 195 nodes failed: Internal: Failed to build TensorRT engine. Skipping...
W0815 11:30:51.514606 239551 convert_graph.cc:799] Cluster is set but device '' is not found in the cluster
W0815 11:30:51.514611 239551 convert_graph.cc:916] Can't identify the cuda device. Running on device 0
I0815 11:30:51.686117 239551 convert_graph.cc:927] Engine resnet_v1_50/block4/unit_1/bottleneck_v1/conv2/my_trt_op_3 creation for segment 3, composed of 5 nodes succeeded.
W0815 11:30:51.686174 239551 convert_graph.cc:799] Cluster is set but device '' is not found in the cluster
W0815 11:30:51.686190 239551 convert_graph.cc:916] Can't identify the cuda device. Running on device 0
F0815 11:30:51.789828 239551 logging.cc:79] assert.h assertion failed at customWinogradConvActLayer.cpp:48 in nvinfer1::cudnn::WinogradConvActLayer::WinogradConvActLayer(const string&, const EngineTensors&, const EngineTensors&, const nvinfer1::ConvolutionParameters&, bool, const std::vector<float>&): matchNbDims(inputs[0], outputs[0]) && (inputs.size() == 1 || inputs[1].extent == outputs[0].extent)
*** Check failure stack trace: ***
@ 0x557d8737c846 base_logging::LogMessage::SendToLog()
@ 0x557d8737cfee base_logging::LogMessage::Flush()
@ 0x557d8737f549 base_logging::LogMessageFatal::~LogMessageFatal()
@ 0x557d8737acf6 __assert_fail
@ 0x557d7c4617c2 nvinfer1::cudnn::WinogradConvActLayer::WinogradConvActLayer()
@ 0x7f2afa3feef0 (unknown)
*** SIGABRT received by PID 239551 (TID 239551) from PID 239551; ***
The model exports just fine with the default TensorFlow optimizations. Any help would be appreciated.