Here’s the output I get when I try to convert a TensorFlow model to a TensorRT one:
2018-11-14 17:08:10.134966: I tensorflow/core/grappler/devices.cc:51] Number of eligible GPUs (core count >= 8): 1
2018-11-14 17:08:10.135321: I tensorflow/core/grappler/clusters/single_machine.cc:359] Starting new session
2018-11-14 17:08:10.135842: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1490] Adding visible gpu devices: 0
2018-11-14 17:08:10.135865: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-11-14 17:08:10.135868: I tensorflow/core/common_runtime/gpu/gpu_device.cc:977] 0
2018-11-14 17:08:10.135872: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 0: N
2018-11-14 17:08:10.136017: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1103] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 7377 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:41:00.0, compute capability: 6.1)
2018-11-14 17:08:10.928804: I tensorflow/contrib/tensorrt/convert/convert_graph.cc:853] MULTIPLE tensorrt candidate conversion: 2
2018-11-14 17:08:10.945702: I tensorflow/contrib/tensorrt/convert/convert_nodes.cc:2952] Segment @scope 'scene_model/', converted to graph
2018-11-14 17:08:10.945749: E tensorflow/contrib/tensorrt/convert/convert_graph.cc:418] Can't find a device placement for the op!
2018-11-14 17:08:11.030595: I tensorflow/contrib/tensorrt/convert/convert_nodes.cc:2952] Segment @scope 'scene_model/res_net/', converted to graph
2018-11-14 17:08:11.030629: E tensorflow/contrib/tensorrt/convert/convert_graph.cc:418] Can't find a device placement for the op!
2018-11-14 17:08:36.361605: I tensorflow/contrib/tensorrt/convert/convert_graph.cc:952] Engine scene_model/my_trt_op_0 creation for segment 0, composed of 495 nodes succeeded.
2018-11-14 17:08:36.398381: I tensorflow/contrib/tensorrt/convert/convert_graph.cc:952] Engine scene_model/res_net/my_trt_op_1 creation for segment 1, composed of 4 nodes succeeded.
2018-11-14 17:08:37.650622: W tensorflow/contrib/tensorrt/convert/trt_optimization_pass.cc:185] TensorRTOptimizer is probably called on funcdef! This optimizer must *NOT* be called on function objects.
2018-11-14 17:08:37.865859: W tensorflow/contrib/tensorrt/convert/trt_optimization_pass.cc:185] TensorRTOptimizer is probably called on funcdef! This optimizer must *NOT* be called on function objects.
2018-11-14 17:08:38.104241: W tensorflow/contrib/tensorrt/convert/trt_optimization_pass.cc:185] TensorRTOptimizer is probably called on funcdef! This optimizer must *NOT* be called on function objects.
2018-11-14 17:08:38.197657: W tensorflow/contrib/tensorrt/convert/trt_optimization_pass.cc:185] TensorRTOptimizer is probably called on funcdef! This optimizer must *NOT* be called on function objects.
2018-11-14 17:08:38.230393: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:404] Optimization results for grappler item: tf_graph
2018-11-14 17:08:38.230440: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:406] constant folding: Graph size after: 644 nodes (-333), 672 edges (-333), time = 361.354ms.
2018-11-14 17:08:38.230445: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:406] layout: Graph size after: 662 nodes (18), 690 edges (18), time = 85.437ms.
2018-11-14 17:08:38.230448: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:406] TensorRTOptimizer: Graph size after: 165 nodes (-497), 177 edges (-513), time = 25720.6074ms.
2018-11-14 17:08:38.230451: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:406] constant folding: Graph size after: 165 nodes (0), 177 edges (0), time = 282.826ms.
2018-11-14 17:08:38.230454: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:406] TensorRTOptimizer: Graph size after: 165 nodes (0), 177 edges (0), time = 456.375ms.
2018-11-14 17:08:38.230457: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:404] Optimization results for grappler item: scene_model/my_trt_op_0_native_segment
2018-11-14 17:08:38.230460: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:406] constant folding: Graph size after: 496 nodes (0), 511 edges (0), time = 201.775ms.
2018-11-14 17:08:38.230462: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:406] layout: Graph size after: 496 nodes (0), 511 edges (0), time = 114.335ms.
2018-11-14 17:08:38.230465: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:406] TensorRTOptimizer: Graph size after: 496 nodes (0), 511 edges (0), time = 31.937ms.
2018-11-14 17:08:38.230468: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:406] constant folding: Graph size after: 496 nodes (0), 511 edges (0), time = 183.275ms.
2018-11-14 17:08:38.230470: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:406] TensorRTOptimizer: Graph size after: 496 nodes (0), 511 edges (0), time = 32.684ms.
2018-11-14 17:08:38.230473: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:404] Optimization results for grappler item: scene_model/res_net/my_trt_op_1_native_segment
2018-11-14 17:08:38.230476: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:406] constant folding: Graph size after: 5 nodes (0), 4 edges (0), time = 76.418ms.
2018-11-14 17:08:38.230479: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:406] layout: Graph size after: 5 nodes (0), 4 edges (0), time = 46.055ms.
2018-11-14 17:08:38.230482: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:406] TensorRTOptimizer: Graph size after: 5 nodes (0), 4 edges (0), time = 15.898ms.
2018-11-14 17:08:38.230484: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:406] constant folding: Graph size after: 5 nodes (0), 4 edges (0), time = 77.477ms.
2018-11-14 17:08:38.230487: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:406] TensorRTOptimizer: Graph size after: 5 nodes (0), 4 edges (0), time = 14.097ms.
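The per-pass node counts are easy to lose in a log like this, so here is a minimal sketch of a helper that pulls them out of the grappler summary lines. The function name `summarize` and both regular expressions are illustrative assumptions matched against the line format shown above; they are not part of TensorFlow and may need adjusting for other versions.

```python
import re

# Matches grappler summary lines of the form seen in the log above, e.g.
#   "...] TensorRTOptimizer: Graph size after: 165 nodes (-497), 177 edges (-513), time = 25720.6074ms."
PASS_RE = re.compile(
    r"\]\s+(?P<name>[\w ]+?): Graph size after: "
    r"(?P<nodes>\d+) nodes \((?P<dn>-?\d+)\), "
    r"(?P<edges>\d+) edges \((?P<de>-?\d+)\), "
    r"time = (?P<ms>[\d.]+)ms"
)
# Matches the header line that names the grappler item being optimized.
ITEM_RE = re.compile(r"Optimization results for grappler item: (?P<item>\S+)")


def summarize(log_text):
    """Return {grappler_item: [(pass_name, nodes_after, node_delta, time_ms), ...]}."""
    summary = {}
    current = None
    for line in log_text.splitlines():
        item = ITEM_RE.search(line)
        if item:
            current = item.group("item")
            summary[current] = []
            continue
        match = PASS_RE.search(line)
        if match and current is not None:
            summary[current].append((
                match.group("name"),
                int(match.group("nodes")),
                int(match.group("dn")),
                float(match.group("ms")),
            ))
    return summary
```

Feeding the log text to `summarize` groups the passes under `tf_graph` and the two `*_native_segment` items, which makes it easier to compare the node counts before and after the TensorRTOptimizer pass.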
A couple of questions:
- What does the error “Can’t find a device placement for the op!” mean?
- The generated TensorRT graph is much bigger than the original graph. Is that normal? I’m using FP32, by the way.
- Inference also takes much longer, around 20 times longer than with the original graph.
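To make the timing comparison reproducible, a minimal sketch of a benchmark harness follows. It assumes a hypothetical zero-argument callable `run_inference` that wraps a single inference call (for example a `sess.run(...)` on either graph); the name `benchmark` and the default `warmup` and `iters` values are illustrative choices, not anything taken from TensorFlow or TensorRT.

```python
import time


def benchmark(run_inference, warmup=5, iters=50):
    """Return the mean latency in milliseconds of `run_inference`.

    `run_inference` is a hypothetical zero-argument callable supplied by
    the caller. The first `warmup` calls are executed but not timed, so
    one-time startup work does not count toward the average.
    """
    for _ in range(warmup):
        run_inference()

    start = time.perf_counter()
    for _ in range(iters):
        run_inference()
    elapsed = time.perf_counter() - start

    return elapsed / iters * 1000.0
```

Running the same harness against the original graph and the converted graph, with identical inputs, gives two numbers that can be compared directly.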