TensorRT: Can't find device placement for op

Here is the output I get when converting a TensorFlow model to a TensorRT one:

2018-11-14 17:08:10.134966: I tensorflow/core/grappler/devices.cc:51] Number of eligible GPUs (core count >= 8): 1
2018-11-14 17:08:10.135321: I tensorflow/core/grappler/clusters/single_machine.cc:359] Starting new session
2018-11-14 17:08:10.135842: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1490] Adding visible gpu devices: 0
2018-11-14 17:08:10.135865: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-11-14 17:08:10.135868: I tensorflow/core/common_runtime/gpu/gpu_device.cc:977]      0 
2018-11-14 17:08:10.135872: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 0:   N 
2018-11-14 17:08:10.136017: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1103] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 7377 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:41:00.0, compute capability: 6.1)
2018-11-14 17:08:10.928804: I tensorflow/contrib/tensorrt/convert/convert_graph.cc:853] MULTIPLE tensorrt candidate conversion: 2
2018-11-14 17:08:10.945702: I tensorflow/contrib/tensorrt/convert/convert_nodes.cc:2952] Segment @scope 'scene_model/', converted to graph
2018-11-14 17:08:10.945749: E tensorflow/contrib/tensorrt/convert/convert_graph.cc:418] Can't find a device placement for the op!
2018-11-14 17:08:11.030595: I tensorflow/contrib/tensorrt/convert/convert_nodes.cc:2952] Segment @scope 'scene_model/res_net/', converted to graph
2018-11-14 17:08:11.030629: E tensorflow/contrib/tensorrt/convert/convert_graph.cc:418] Can't find a device placement for the op!
2018-11-14 17:08:36.361605: I tensorflow/contrib/tensorrt/convert/convert_graph.cc:952] Engine scene_model/my_trt_op_0 creation for segment 0, composed of 495 nodes succeeded.
2018-11-14 17:08:36.398381: I tensorflow/contrib/tensorrt/convert/convert_graph.cc:952] Engine scene_model/res_net/my_trt_op_1 creation for segment 1, composed of 4 nodes succeeded.
2018-11-14 17:08:37.650622: W tensorflow/contrib/tensorrt/convert/trt_optimization_pass.cc:185] TensorRTOptimizer is probably called on funcdef! This optimizer must *NOT* be called on function objects.
2018-11-14 17:08:37.865859: W tensorflow/contrib/tensorrt/convert/trt_optimization_pass.cc:185] TensorRTOptimizer is probably called on funcdef! This optimizer must *NOT* be called on function objects.
2018-11-14 17:08:38.104241: W tensorflow/contrib/tensorrt/convert/trt_optimization_pass.cc:185] TensorRTOptimizer is probably called on funcdef! This optimizer must *NOT* be called on function objects.
2018-11-14 17:08:38.197657: W tensorflow/contrib/tensorrt/convert/trt_optimization_pass.cc:185] TensorRTOptimizer is probably called on funcdef! This optimizer must *NOT* be called on function objects.
2018-11-14 17:08:38.230393: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:404] Optimization results for grappler item: tf_graph
2018-11-14 17:08:38.230440: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:406]   constant folding: Graph size after: 644 nodes (-333), 672 edges (-333), time = 361.354ms.
2018-11-14 17:08:38.230445: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:406]   layout: Graph size after: 662 nodes (18), 690 edges (18), time = 85.437ms.
2018-11-14 17:08:38.230448: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:406]   TensorRTOptimizer: Graph size after: 165 nodes (-497), 177 edges (-513), time = 25720.6074ms.
2018-11-14 17:08:38.230451: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:406]   constant folding: Graph size after: 165 nodes (0), 177 edges (0), time = 282.826ms.
2018-11-14 17:08:38.230454: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:406]   TensorRTOptimizer: Graph size after: 165 nodes (0), 177 edges (0), time = 456.375ms.
2018-11-14 17:08:38.230457: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:404] Optimization results for grappler item: scene_model/my_trt_op_0_native_segment
2018-11-14 17:08:38.230460: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:406]   constant folding: Graph size after: 496 nodes (0), 511 edges (0), time = 201.775ms.
2018-11-14 17:08:38.230462: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:406]   layout: Graph size after: 496 nodes (0), 511 edges (0), time = 114.335ms.
2018-11-14 17:08:38.230465: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:406]   TensorRTOptimizer: Graph size after: 496 nodes (0), 511 edges (0), time = 31.937ms.
2018-11-14 17:08:38.230468: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:406]   constant folding: Graph size after: 496 nodes (0), 511 edges (0), time = 183.275ms.
2018-11-14 17:08:38.230470: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:406]   TensorRTOptimizer: Graph size after: 496 nodes (0), 511 edges (0), time = 32.684ms.
2018-11-14 17:08:38.230473: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:404] Optimization results for grappler item: scene_model/res_net/my_trt_op_1_native_segment
2018-11-14 17:08:38.230476: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:406]   constant folding: Graph size after: 5 nodes (0), 4 edges (0), time = 76.418ms.
2018-11-14 17:08:38.230479: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:406]   layout: Graph size after: 5 nodes (0), 4 edges (0), time = 46.055ms.
2018-11-14 17:08:38.230482: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:406]   TensorRTOptimizer: Graph size after: 5 nodes (0), 4 edges (0), time = 15.898ms.
2018-11-14 17:08:38.230484: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:406]   constant folding: Graph size after: 5 nodes (0), 4 edges (0), time = 77.477ms.
2018-11-14 17:08:38.230487: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:406]   TensorRTOptimizer: Graph size after: 5 nodes (0), 4 edges (0), time = 14.097ms.

A few questions:

  1. What does “Can't find a device placement for the op!” mean?

  2. The generated TensorRT graph is much bigger than the original graph. Is that normal? I am using FP32, by the way.

  3. Inference is also much slower, roughly 20 times slower than with the original graph.
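For anyone comparing graph sizes from a log like the one above, here is a minimal sketch that pulls the per-pass node counts out of the meta_optimizer lines. It assumes the log format shown in this post; the function name `summarize_grappler_log` and the `SAMPLE_LOG` excerpt are only illustrative, and the script counts nodes, not file size on disk.

```python
import re

# Matches "... Optimization results for grappler item: tf_graph"
ITEM_RE = re.compile(r"Optimization results for grappler item: (\S+)")
# Matches "...]   layout: Graph size after: 662 nodes (18), ..."
PASS_RE = re.compile(r"\]\s+([\w ]+): Graph size after: (\d+) nodes \((-?\d+)\)")

# Short excerpt copied from the log in this post (illustrative input only).
SAMPLE_LOG = """\
2018-11-14 17:08:38.230393: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:404] Optimization results for grappler item: tf_graph
2018-11-14 17:08:38.230440: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:406]   constant folding: Graph size after: 644 nodes (-333), 672 edges (-333), time = 361.354ms.
2018-11-14 17:08:38.230445: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:406]   layout: Graph size after: 662 nodes (18), 690 edges (18), time = 85.437ms.
2018-11-14 17:08:38.230448: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:406]   TensorRTOptimizer: Graph size after: 165 nodes (-497), 177 edges (-513), time = 25720.6074ms.
"""


def summarize_grappler_log(log_text):
    """Return {grappler_item: [(pass_name, nodes_after, node_delta), ...]}."""
    summary = {}
    current = None
    for line in log_text.splitlines():
        item = ITEM_RE.search(line)
        if item:
            # A new grappler item starts; later pass lines belong to it.
            current = item.group(1)
            summary[current] = []
            continue
        step = PASS_RE.search(line)
        if step and current is not None:
            summary[current].append(
                (step.group(1).strip(), int(step.group(2)), int(step.group(3)))
            )
    return summary


if __name__ == "__main__":
    for item, passes in summarize_grappler_log(SAMPLE_LOG).items():
        print(item)
        for name, nodes, delta in passes:
            print("  {}: {} nodes ({:+d})".format(name, nodes, delta))
```

Run on the full log, this lists each grappler item separately (tf_graph and the two native segments), which makes it easier to see which pass changed the node count and by how much.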

Hello,

Can you provide details of your environment?

  • TensorFlow version?
  • GPU type?
  • Can you share the TF graph and the source used to convert it?

tensorflow-gpu 1.11.0
GPU: P40

Model: BERT

I have the same problem. Have you fixed this bug?
My environment: tensorflow 1.12.0
GPU: P100
Model: FaceNet