Create Inference Graph interpretation

marc.schnaebele · March 21, 2019, 1:16pm

Hello,

I’ve just created an inference graph using the TensorRT provided by JetPack 4 on the Jetson Xavier. It seems to work, but I’m not sure how to interpret the results (full logs at the bottom).

The first confusing thing is the version mismatch:

Compiled against version 5.0.6, but loaded 5.0.3. Things may not work

I loaded the meta graph and checkpoint, converted them to a frozen graph, and then passed it to the create_inference_graph method. The TensorRT installation was never changed. How is it possible to have different versions? And is the problem below due to this version difference?
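The steps described above can be sketched roughly as follows. This is only a minimal sketch, assuming TensorFlow 1.x with the `tensorflow.contrib.tensorrt` module. The function name `build_trt_graph`, its parameters, and the workspace size and precision values are hypothetical placeholders, not values taken from this post.

```python
def build_trt_graph(meta_path, ckpt_path, output_names, max_batch_size=1):
    """Freeze a checkpointed TF 1.x graph, then run the TF-TRT conversion on it.

    meta_path, ckpt_path and output_names are hypothetical inputs: the path to
    the .meta file, the checkpoint prefix, and the list of output node names.
    """
    # Imported lazily so that this file can be loaded without TensorFlow 1.x.
    import tensorflow as tf
    import tensorflow.contrib.tensorrt as trt

    with tf.Graph().as_default(), tf.Session() as sess:
        # Rebuild the graph from the .meta file and restore the trained weights.
        saver = tf.train.import_meta_graph(meta_path, clear_devices=True)
        saver.restore(sess, ckpt_path)
        # Turn variables into constants, keeping only what the outputs need.
        frozen = tf.graph_util.convert_variables_to_constants(
            sess, sess.graph.as_graph_def(), output_names)

    # Ask TF-TRT to replace supported subgraphs with TensorRT engine ops.
    return trt.create_inference_graph(
        input_graph_def=frozen,
        outputs=output_names,
        max_batch_size=max_batch_size,
        max_workspace_size_bytes=1 << 28,  # illustrative value only
        precision_mode="FP32",             # illustrative value only
    )
```

The returned GraphDef can then be imported with `tf.import_graph_def` and run like any other frozen graph.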

There are 1486 ops of 23 different types in the graph that are not converted to TensorRT: SaveV2, Add, RestoreV2, RandomUniform, Mul, StridedSlice, Identity, ExpandDims, Placeholder, ScalarSummary, MatMul, Conv2D, BiasAdd, Sub, Const, MaxPool, AvgPool, Reshape, NoOp, Relu, LRN, Concat, Softmax

The second point is about interpreting the conversion results. What do the following lines mean? Do they mean that TensorRT could “only” optimize 535 edges and that the graph ends up with 2 fewer ops? Is that a lot or not? And is it normal to see no optimizations on the other lines?

Number of TensorRT candidate segments: 0
constant folding: Graph size after: 1484 nodes (0), 1357 edges (-535), time = 2391.59106ms.
layout: Graph size after: 1484 nodes (0), 1357 edges (0), time = 353.034ms.
constant folding: Graph size after: 1484 nodes (0), 1357 edges (0), time = 550.076ms.
TensorRTOptimizer: Graph size after: 1484 nodes (0), 1357 edges (0), time = 438.321ms.
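To make the summary lines above easier to compare, here is a small sketch that parses them into numbers. It assumes that the value in parentheses is the change made by that pass relative to the graph it received, so the size before the pass is the reported size minus that change. The names `PASS_RE`, `parse_grappler_summary`, and `SAMPLE_LOG` are made up for this example.

```python
import re

# One Grappler summary line, e.g.
#   constant folding: Graph size after: 1484 nodes (0), 1357 edges (-535), time = 2391.59106ms.
PASS_RE = re.compile(
    r"(?P<name>[\w ]+?): Graph size after: "
    r"(?P<nodes>\d+) nodes \((?P<dn>-?\d+)\), "
    r"(?P<edges>\d+) edges \((?P<de>-?\d+)\), "
    r"time = (?P<ms>[\d.]+)ms"
)


def parse_grappler_summary(text):
    """Return one dict per optimizer pass found in the given log text."""
    rows = []
    for line in text.splitlines():
        m = PASS_RE.search(line)
        if m is None:
            continue  # not a per-pass summary line
        nodes, dn = int(m.group("nodes")), int(m.group("dn"))
        edges, de = int(m.group("edges")), int(m.group("de"))
        rows.append({
            "name": m.group("name").strip(),
            "nodes_after": nodes,
            "node_delta": dn,
            "nodes_before": nodes - dn,  # assumes the delta is after minus before
            "edges_after": edges,
            "edge_delta": de,
            "edges_before": edges - de,
            "time_ms": float(m.group("ms")),
        })
    return rows


SAMPLE_LOG = """\
constant folding: Graph size after: 1484 nodes (0), 1357 edges (-535), time = 2391.59106ms.
layout: Graph size after: 1484 nodes (0), 1357 edges (0), time = 353.034ms.
constant folding: Graph size after: 1484 nodes (0), 1357 edges (0), time = 550.076ms.
TensorRTOptimizer: Graph size after: 1484 nodes (0), 1357 edges (0), time = 438.321ms.
"""

rows = parse_grappler_summary(SAMPLE_LOG)
for row in rows:
    print(row["name"], row["node_delta"], row["edge_delta"])
```

Read this way, only the first constant folding pass changes the graph, and the TensorRTOptimizer line reports no change in nodes or edges, which matches the "Number of TensorRT candidate segments: 0" message.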

FULL LOGS

WARNING:tensorflow:TensorRT mismatch. Compiled against version 5.0.6, but loaded 5.0.3. Things may not work
2019-03-21 12:19:54.658209: I tensorflow/core/grappler/devices.cc:57] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 1
2019-03-21 12:19:54.658968: I tensorflow/core/grappler/clusters/single_machine.cc:359] Starting new session
2019-03-21 12:19:54.661732: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
2019-03-21 12:19:54.661919: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-03-21 12:19:54.661984: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990]      0 
2019-03-21 12:19:54.662036: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0:   N 
2019-03-21 12:19:54.662236: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 9374 MB memory) -> physical GPU (device: 0, name: Xavier, pci bus id: 0000:00:00.0, compute capability: 7.2)
2019-03-21 12:20:01.782960: W tensorflow/contrib/tensorrt/convert/trt_optimization_pass.cc:219] Configured batch size 1 is less than input batch size 1056 adjusting maximum batch size to match input batch size
2019-03-21 12:20:02.113407: I tensorflow/contrib/tensorrt/segment/segment.cc:461] There are 1486 ops of 23 different types in the graph that are not converted to TensorRT: SaveV2, Add, RestoreV2, RandomUniform, Mul, StridedSlice, Identity, ExpandDims, Placeholder, ScalarSummary, MatMul, Conv2D, BiasAdd, Sub, Const, MaxPool, AvgPool, Reshape, NoOp, Relu, LRN, Concat, Softmax, (For more information see https://docs.nvidia.com/deeplearning/dgx/integrate-tf-trt/index.html#support-ops).
2019-03-21 12:20:02.118502: I tensorflow/contrib/tensorrt/convert/convert_graph.cc:928] Number of TensorRT candidate segments: 0
2019-03-21 12:20:02.312790: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:616] Optimization results for grappler item: tf_graph
2019-03-21 12:20:02.313191: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:618]   constant folding: Graph size after: 1484 nodes (0), 1357 edges (-535), time = 2391.59106ms.
2019-03-21 12:20:02.313361: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:618]   layout: Graph size after: 1484 nodes (0), 1357 edges (0), time = 353.034ms.
2019-03-21 12:20:02.313475: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:618]   constant folding: Graph size after: 1484 nodes (0), 1357 edges (0), time = 550.076ms.
2019-03-21 12:20:02.313640: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:618]   TensorRTOptimizer: Graph size after: 1484 nodes (0), 1357 edges (0), time = 438.321ms.