I am trying to optimize a frozen graph (FasterRCNN-ResNet101) with TensorRT and run inference. However, with TF 1.14.0 there is no visible improvement in inference time after optimization (about 250 ms). The number of trt_engine_ops is 0 after conversion, and I don't understand why no TRT operation was created.
import time

import numpy as np
import tensorflow as tf
from PIL import Image
from tensorflow.python.compiler.tensorrt import trt_convert as trt

# Load the test image and add a batch dimension: (H, W, C) -> (1, H, W, C).
im = Image.open("image_test_resized.png")
np_image = np.array(im)
im = np.expand_dims(np_image, axis=0)

OUTPUT_NODES = ["detection_boxes", "detection_scores", "detection_classes", "num_detections"]

with tf.Session() as sess:
    # Read the frozen graph from disk.
    with tf.gfile.GFile("frozen_inference_graph.pb", "rb") as f:
        frozen_graph = tf.GraphDef()
        frozen_graph.ParseFromString(f.read())

    # Convert with TF-TRT; the output nodes are excluded from conversion.
    converter = trt.TrtGraphConverter(
        is_dynamic_op=True,
        input_graph_def=frozen_graph,
        nodes_blacklist=OUTPUT_NODES,
        precision_mode='FP16')
    trt_graph = converter.convert()

    # Import the converted graph and feed the image as its input.
    output_node = tf.import_graph_def(
        trt_graph,
        input_map={'image_tensor:0': im},
        return_elements=OUTPUT_NODES)

    # Count how many TensorRT engine nodes the conversion produced.
    trt_engine_ops = len([1 for n in trt_graph.node if str(n.op) == 'TRTEngineOp'])
    print("numb. of trt_engine_ops in trt_graph", trt_engine_ops)

    start = time.time()
    sess.run(output_node)
    end = time.time()
    print("Executed TF Detection on image in {0} seconds".format(end - start))
The result I get:
WARNING:tensorflow:From pb_to_TRT2.py:18: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.
2020-01-28 22:11:40.055861: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcuda.so.1
2020-01-28 22:11:40.089333: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties:
name: GeForce RTX 2080 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.545
pciBusID: 0000:19:00.0
2020-01-28 22:11:40.089432: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcudart.so.10.0'; dlerror: libcudart.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/compat/lib:/usr/local/nvidia/lib:/usr/local/nvidia/lib64
2020-01-28 22:11:40.089469: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcublas.so.10.0'; dlerror: libcublas.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/compat/lib:/usr/local/nvidia/lib:/usr/local/nvidia/lib64
2020-01-28 22:11:40.089502: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcufft.so.10.0'; dlerror: libcufft.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/compat/lib:/usr/local/nvidia/lib:/usr/local/nvidia/lib64
2020-01-28 22:11:40.089552: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcurand.so.10.0'; dlerror: libcurand.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/compat/lib:/usr/local/nvidia/lib:/usr/local/nvidia/lib64
2020-01-28 22:11:40.089586: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcusolver.so.10.0'; dlerror: libcusolver.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/compat/lib:/usr/local/nvidia/lib:/usr/local/nvidia/lib64
2020-01-28 22:11:40.089619: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcusparse.so.10.0'; dlerror: libcusparse.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/compat/lib:/usr/local/nvidia/lib:/usr/local/nvidia/lib64
2020-01-28 22:11:40.092587: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2020-01-28 22:11:40.092607: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1663] Cannot dlopen some GPU libraries. Skipping registering GPU devices...
2020-01-28 22:11:40.092862: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
2020-01-28 22:11:40.209303: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x51d4eb0 executing computations on platform CUDA. Devices:
2020-01-28 22:11:40.209360: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): GeForce RTX 2080 Ti, Compute Capability 7.5
2020-01-28 22:11:40.232089: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3300000000 Hz
2020-01-28 22:11:40.235084: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x5887d80 executing computations on platform Host. Devices:
2020-01-28 22:11:40.235130: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): <undefined>, <undefined>
2020-01-28 22:11:40.235236: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-01-28 22:11:40.235257: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187]
WARNING:tensorflow:From pb_to_TRT2.py:22: The name tf.gfile.GFile is deprecated. Please use tf.io.gfile.GFile instead.
WARNING:tensorflow:From pb_to_TRT2.py:24: The name tf.GraphDef is deprecated. Please use tf.compat.v1.GraphDef instead.
2020-01-28 22:11:42.626568: I tensorflow/core/grappler/devices.cc:55] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 1
2020-01-28 22:11:42.627180: I tensorflow/core/grappler/clusters/single_machine.cc:359] Starting new session
2020-01-28 22:11:42.631030: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties:
name: GeForce RTX 2080 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.545
pciBusID: 0000:19:00.0
2020-01-28 22:11:42.631139: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcudart.so.10.0'; dlerror: libcudart.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/compat/lib:/usr/local/nvidia/lib:/usr/local/nvidia/lib64
2020-01-28 22:11:42.631174: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcublas.so.10.0'; dlerror: libcublas.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/compat/lib:/usr/local/nvidia/lib:/usr/local/nvidia/lib64
2020-01-28 22:11:42.631221: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcufft.so.10.0'; dlerror: libcufft.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/compat/lib:/usr/local/nvidia/lib:/usr/local/nvidia/lib64
2020-01-28 22:11:42.631252: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcurand.so.10.0'; dlerror: libcurand.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/compat/lib:/usr/local/nvidia/lib:/usr/local/nvidia/lib64
2020-01-28 22:11:42.631282: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcusolver.so.10.0'; dlerror: libcusolver.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/compat/lib:/usr/local/nvidia/lib:/usr/local/nvidia/lib64
2020-01-28 22:11:42.631312: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcusparse.so.10.0'; dlerror: libcusparse.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/compat/lib:/usr/local/nvidia/lib:/usr/local/nvidia/lib64
2020-01-28 22:11:42.631320: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2020-01-28 22:11:42.631327: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1663] Cannot dlopen some GPU libraries. Skipping registering GPU devices...
2020-01-28 22:11:42.631340: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-01-28 22:11:42.631346: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187] 0
2020-01-28 22:11:42.631353: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0: N
2020-01-28 22:11:44.567833: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:716] Optimization results for grappler item: tf_graph
2020-01-28 22:11:44.567868: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718] constant folding: Graph size after: 11171 nodes (0), 19239 edges (1), time = 939.437ms.
2020-01-28 22:11:44.567873: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718] layout: layout did nothing. time = 12.066ms.
2020-01-28 22:11:44.567879: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718] constant folding: Graph size after: 11171 nodes (0), 19239 edges (0), time = 365.909ms.
numb. of trt_engine_ops in trt_graph 0
2020-01-28 22:11:49.478312: W tensorflow/compiler/jit/mark_for_compilation_pass.cc:1412] (One-time warning): Not using XLA:CPU for cluster because envvar TF_XLA_FLAGS=--tf_xla_cpu_global_jit was not set. If you want XLA:CPU, either set that envvar, or use experimental_jit_scope to enable XLA:CPU. To confirm that XLA is active, pass --vmodule=xla_compilation_cache=1 (as a proper command-line flag, not via TF_XLA_FLAGS) or set the envvar XLA_FLAGS=--xla_hlo_profile.
Executed TF Detection on image in 3.685523271560669 seconds
Executed TF Detection on image in 0.23265981674194336 seconds
Executed TF Detection on image in 0.2277967929840088 seconds
Executed TF Detection on image in 0.22919106483459473 seconds
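The log above reports "Could not dlopen library" for several CUDA 10.0 libraries, followed by "Skipping registering GPU devices", which suggests the libraries this TensorFlow build expects are not on LD_LIBRARY_PATH. A minimal sketch for checking this independently of TensorFlow follows; the helper name `missing_libraries` is hypothetical, and the library list is copied from the log lines rather than from any official requirements list.

```python
import ctypes

# Library names copied from the "Could not dlopen library" lines in the log.
CUDA_10_0_LIBS = [
    "libcudart.so.10.0",
    "libcublas.so.10.0",
    "libcufft.so.10.0",
    "libcurand.so.10.0",
    "libcusolver.so.10.0",
    "libcusparse.so.10.0",
]


def missing_libraries(names):
    """Return the subset of `names` that the dynamic loader cannot open."""
    missing = []
    for name in names:
        try:
            ctypes.CDLL(name)  # uses the same loader search path as dlopen
        except OSError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    for name in missing_libraries(CUDA_10_0_LIBS):
        print("cannot load:", name)
```

If this prints any of the names, the GPU would presumably stay unregistered in TensorFlow regardless of the TF-TRT settings used in the script.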
I also tried TF 1.15.0. There the conversion appears to work, producing 13 trt_engine_ops:
2020-01-28 22:14:16.341746: I tensorflow/compiler/tf2tensorrt/segment/segment.cc:460] There are 5772 ops of 53 different types in the graph that are not converted to TensorRT: TopKV2, NonMaxSuppressionV2, CropAndResize, Fill, Split, Transpose, Gather, Where, Equal, Tile, Reshape, Assert, Const, Exit, NoOp, Pack, LoopCond, Merge, ZerosLike, Range, Less, TensorArraySizeV3, Placeholder, TensorArrayV3, TensorArrayScatterV3, Cast, Shape, Minimum, Switch, TensorArrayReadV3, StridedSlice, Maximum, RealDiv, Slice, LogicalAnd, Mul, Round, TensorArrayWriteV3, GreaterEqual, Max, Size, Greater, Sub, ConcatV2, Unpack, NextIteration, Identity, ExpandDims, ResizeBilinear, Enter, Squeeze, Add, TensorArrayGatherV3, (For more information see https://docs.nvidia.com/deeplearning/frameworks/tf-trt-user-guide/index.html#supported-ops).
2020-01-28 22:14:16.443724: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:633] Number of TensorRT candidate segments: 13
2020-01-28 22:14:16.514867: E tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:101] Could not find any TF GPUs
2020-01-28 22:14:16.514898: W tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:723] Can't identify the cuda device. Running on device 0
2020-01-28 22:14:16.514969: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:734] TensorRT node ClipToWindow/TRTEngineOp_0 added for segment 0 consisting of 8 nodes succeeded.
2020-01-28 22:14:16.515007: E tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:101] Could not find any TF GPUs
2020-01-28 22:14:16.515013: W tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:723] Can't identify the cuda device. Running on device 0
2020-01-28 22:14:16.515042: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:734] TensorRT node TRTEngineOp_1 added for segment 1 consisting of 18 nodes succeeded.
2020-01-28 22:14:16.515089: E tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:101] Could not find any TF GPUs
2020-01-28 22:14:16.515095: W tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:723] Can't identify the cuda device. Running on device 0
2020-01-28 22:14:16.515121: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:734] TensorRT node TRTEngineOp_2 added for segment 2 consisting of 18 nodes succeeded.
2020-01-28 22:14:16.515165: E tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:101] Could not find any TF GPUs
2020-01-28 22:14:16.515170: W tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:723] Can't identify the cuda device. Running on device 0
2020-01-28 22:14:16.515197: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:734] TensorRT node TRTEngineOp_3 added for segment 3 consisting of 18 nodes succeeded.
2020-01-28 22:14:16.515241: E tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:101] Could not find any TF GPUs
2020-01-28 22:14:16.515246: W tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:723] Can't identify the cuda device. Running on device 0
2020-01-28 22:14:16.515271: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:734] TensorRT node TRTEngineOp_4 added for segment 4 consisting of 18 nodes succeeded.
2020-01-28 22:14:16.515307: E tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:101] Could not find any TF GPUs
2020-01-28 22:14:16.515312: W tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:723] Can't identify the cuda device. Running on device 0
2020-01-28 22:14:16.515347: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:734] TensorRT node TRTEngineOp_5 added for segment 5 consisting of 519 nodes succeeded.
2020-01-28 22:14:16.516951: E tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:101] Could not find any TF GPUs
2020-01-28 22:14:16.516972: W tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:723] Can't identify the cuda device. Running on device 0
2020-01-28 22:14:16.517018: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:734] TensorRT node TRTEngineOp_6 added for segment 6 consisting of 4 nodes succeeded.
2020-01-28 22:14:16.517043: E tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:101] Could not find any TF GPUs
2020-01-28 22:14:16.517048: W tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:723] Can't identify the cuda device. Running on device 0
2020-01-28 22:14:16.517078: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:734] TensorRT node TRTEngineOp_7 added for segment 7 consisting of 3 nodes succeeded.
2020-01-28 22:14:16.517101: E tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:101] Could not find any TF GPUs
2020-01-28 22:14:16.517106: W tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:723] Can't identify the cuda device. Running on device 0
2020-01-28 22:14:16.517134: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:734] TensorRT node GridAnchorGenerator/TRTEngineOp_8 added for segment 8 consisting of 8 nodes succeeded.
2020-01-28 22:14:16.517165: E tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:101] Could not find any TF GPUs
2020-01-28 22:14:16.517170: W tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:723] Can't identify the cuda device. Running on device 0
2020-01-28 22:14:16.517196: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:734] TensorRT node TRTEngineOp_9 added for segment 9 consisting of 55 nodes succeeded.
2020-01-28 22:14:16.517335: E tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:101] Could not find any TF GPUs
2020-01-28 22:14:16.517341: W tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:723] Can't identify the cuda device. Running on device 0
2020-01-28 22:14:16.517369: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:734] TensorRT node TRTEngineOp_10 added for segment 10 consisting of 7 nodes succeeded.
2020-01-28 22:14:16.517401: E tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:101] Could not find any TF GPUs
2020-01-28 22:14:16.517406: W tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:723] Can't identify the cuda device. Running on device 0
2020-01-28 22:14:16.517430: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:734] TensorRT node TRTEngineOp_11 added for segment 11 consisting of 7 nodes succeeded.
2020-01-28 22:14:16.517466: E tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:101] Could not find any TF GPUs
2020-01-28 22:14:16.517472: W tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:723] Can't identify the cuda device. Running on device 0
2020-01-28 22:14:16.517496: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:734] TensorRT node SecondStagePostprocessor/TRTEngineOp_12 added for segment 12 consisting of 5 nodes succeeded.
2020-01-28 22:14:16.921990: W tensorflow/compiler/tf2tensorrt/convert/trt_optimization_pass.cc:183] TensorRTOptimizer is probably called on funcdef! This optimizer must *NOT* be called on function objects.
2020-01-28 22:14:16.956974: W tensorflow/compiler/tf2tensorrt/convert/trt_optimization_pass.cc:183] TensorRTOptimizer is probably called on funcdef! This optimizer must *NOT* be called on function objects.
2020-01-28 22:14:16.983557: W tensorflow/compiler/tf2tensorrt/convert/trt_optimization_pass.cc:183] TensorRTOptimizer is probably called on funcdef! This optimizer must *NOT* be called on function objects.
2020-01-28 22:14:17.135528: W tensorflow/compiler/tf2tensorrt/convert/trt_optimization_pass.cc:183] TensorRTOptimizer is probably called on funcdef! This optimizer must *NOT* be called on function objects.
2020-01-28 22:14:17.216288: W tensorflow/compiler/tf2tensorrt/convert/trt_optimization_pass.cc:183] TensorRTOptimizer is probably called on funcdef! This optimizer must *NOT* be called on function objects.
2020-01-28 22:14:17.229627: W tensorflow/compiler/tf2tensorrt/convert/trt_optimization_pass.cc:183] TensorRTOptimizer is probably called on funcdef! This optimizer must *NOT* be called on function objects.
2020-01-28 22:14:17.244907: W tensorflow/compiler/tf2tensorrt/convert/trt_optimization_pass.cc:183] TensorRTOptimizer is probably called on funcdef! This optimizer must *NOT* be called on function objects.
2020-01-28 22:14:17.260996: W tensorflow/compiler/tf2tensorrt/convert/trt_optimization_pass.cc:183] TensorRTOptimizer is probably called on funcdef! This optimizer must *NOT* be called on function objects.
2020-01-28 22:14:17.275068: W tensorflow/compiler/tf2tensorrt/convert/trt_optimization_pass.cc:183] TensorRTOptimizer is probably called on funcdef! This optimizer must *NOT* be called on function objects.
2020-01-28 22:14:17.288798: W tensorflow/compiler/tf2tensorrt/convert/trt_optimization_pass.cc:183] TensorRTOptimizer is probably called on funcdef! This optimizer must *NOT* be called on function objects.
2020-01-28 22:14:17.302031: W tensorflow/compiler/tf2tensorrt/convert/trt_optimization_pass.cc:183] TensorRTOptimizer is probably called on funcdef! This optimizer must *NOT* be called on function objects.
2020-01-28 22:14:17.314723: W tensorflow/compiler/tf2tensorrt/convert/trt_optimization_pass.cc:183] TensorRTOptimizer is probably called on funcdef! This optimizer must *NOT* be called on function objects.
2020-01-28 22:14:17.329713: W tensorflow/compiler/tf2tensorrt/convert/trt_optimization_pass.cc:183] TensorRTOptimizer is probably called on funcdef! This optimizer must *NOT* be called on function objects.
2020-01-28 22:14:17.376069: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:786] Optimization results for grappler item: tf_graph
2020-01-28 22:14:17.376098: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] constant_folding: Graph size after: 11171 nodes (0), 19239 edges (1), time = 844.105ms.
2020-01-28 22:14:17.376102: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] layout: layout did nothing. time = 11.896ms.
2020-01-28 22:14:17.376106: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] constant_folding: Graph size after: 11171 nodes (0), 19239 edges (0), time = 337.749ms.
2020-01-28 22:14:17.376110: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] TensorRTOptimizer: Graph size after: 10496 nodes (-675), 18508 edges (-731), time = 673.054ms.
2020-01-28 22:14:17.376115: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] constant_folding: Graph size after: 10496 nodes (0), 18508 edges (0), time = 272.011ms.
2020-01-28 22:14:17.376118: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:786] Optimization results for grappler item: ClipToWindow/TRTEngineOp_0_native_segment
2020-01-28 22:14:17.376122: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] constant_folding: Graph size after: 14 nodes (0), 16 edges (0), time = 0.745ms.
2020-01-28 22:14:17.376126: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] layout: layout did nothing. time = 0.014ms.
2020-01-28 22:14:17.376129: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] constant_folding: Graph size after: 14 nodes (0), 16 edges (0), time = 0.653ms.
2020-01-28 22:14:17.376133: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] TensorRTOptimizer: Graph size after: 14 nodes (0), 16 edges (0), time = 0.048ms.
2020-01-28 22:14:17.376137: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] constant_folding: Graph size after: 14 nodes (0), 16 edges (0), time = 0.681ms.
2020-01-28 22:14:17.376141: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:786] Optimization results for grappler item: TRTEngineOp_9_native_segment
2020-01-28 22:14:17.376145: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] constant_folding: Graph size after: 57 nodes (0), 59 edges (0), time = 13.999ms.
2020-01-28 22:14:17.376151: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] layout: layout did nothing. time = 0.078ms.
2020-01-28 22:14:17.376154: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] constant_folding: Graph size after: 57 nodes (0), 59 edges (0), time = 10.911ms.
2020-01-28 22:14:17.376158: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] TensorRTOptimizer: Graph size after: 57 nodes (0), 59 edges (0), time = 0.161ms.
2020-01-28 22:14:17.376162: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] constant_folding: Graph size after: 57 nodes (0), 59 edges (0), time = 9.882ms.
2020-01-28 22:14:17.376171: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:786] Optimization results for grappler item: TRTEngineOp_10_native_segment
2020-01-28 22:14:17.376176: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] constant_folding: Graph size after: 9 nodes (0), 8 edges (0), time = 3.315ms.
2020-01-28 22:14:17.376182: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] layout: layout did nothing. time = 0.035ms.
2020-01-28 22:14:17.376194: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] constant_folding: Graph size after: 9 nodes (0), 8 edges (0), time = 3.15ms.
2020-01-28 22:14:17.376202: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] TensorRTOptimizer: Graph size after: 9 nodes (0), 8 edges (0), time = 0.103ms.
2020-01-28 22:14:17.376208: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] constant_folding: Graph size after: 9 nodes (0), 8 edges (0), time = 3.211ms.
2020-01-28 22:14:17.376211: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:786] Optimization results for grappler item: TRTEngineOp_5_native_segment
2020-01-28 22:14:17.376217: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] constant_folding: Graph size after: 524 nodes (0), 553 edges (0), time = 52.455ms.
2020-01-28 22:14:17.376223: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] layout: layout did nothing. time = 0.956ms.
2020-01-28 22:14:17.376227: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] constant_folding: Graph size after: 524 nodes (0), 553 edges (0), time = 45.113ms.
2020-01-28 22:14:17.376235: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] TensorRTOptimizer: Graph size after: 524 nodes (0), 553 edges (0), time = 4.344ms.
2020-01-28 22:14:17.376244: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] constant_folding: Graph size after: 524 nodes (0), 553 edges (0), time = 52.893ms.
2020-01-28 22:14:17.376254: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:786] Optimization results for grappler item: TRTEngineOp_7_native_segment
2020-01-28 22:14:17.376261: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] constant_folding: Graph size after: 5 nodes (0), 4 edges (0), time = 0.614ms.
2020-01-28 22:14:17.376269: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] layout: layout did nothing. time = 0.011ms.
2020-01-28 22:14:17.376278: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] constant_folding: Graph size after: 5 nodes (0), 4 edges (0), time = 0.604ms.
2020-01-28 22:14:17.376283: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] TensorRTOptimizer: Graph size after: 5 nodes (0), 4 edges (0), time = 0.045ms.
2020-01-28 22:14:17.376293: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] constant_folding: Graph size after: 5 nodes (0), 4 edges (0), time = 0.592ms.
2020-01-28 22:14:17.376297: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:786] Optimization results for grappler item: TRTEngineOp_11_native_segment
2020-01-28 22:14:17.376301: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] constant_folding: Graph size after: 9 nodes (0), 8 edges (0), time = 2.138ms.
2020-01-28 22:14:17.376306: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] layout: layout did nothing. time = 0.03ms.
2020-01-28 22:14:17.376311: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] constant_folding: Graph size after: 9 nodes (0), 8 edges (0), time = 2.149ms.
2020-01-28 22:14:17.376315: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] TensorRTOptimizer: Graph size after: 9 nodes (0), 8 edges (0), time = 0.093ms.
2020-01-28 22:14:17.376321: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] constant_folding: Graph size after: 9 nodes (0), 8 edges (0), time = 2.074ms.
2020-01-28 22:14:17.376324: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:786] Optimization results for grappler item: TRTEngineOp_1_native_segment
2020-01-28 22:14:17.376331: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] constant_folding: Graph size after: 24 nodes (0), 27 edges (0), time = 2.147ms.
2020-01-28 22:14:17.376337: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] layout: layout did nothing. time = 0.036ms.
2020-01-28 22:14:17.376343: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] constant_folding: Graph size after: 24 nodes (0), 27 edges (0), time = 2.229ms.
2020-01-28 22:14:17.376347: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] TensorRTOptimizer: Graph size after: 24 nodes (0), 27 edges (0), time = 0.115ms.
2020-01-28 22:14:17.376354: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] constant_folding: Graph size after: 24 nodes (0), 27 edges (0), time = 2.131ms.
2020-01-28 22:14:17.376357: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:786] Optimization results for grappler item: TRTEngineOp_3_native_segment
2020-01-28 22:14:17.376362: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] constant_folding: Graph size after: 24 nodes (0), 27 edges (0), time = 2.171ms.
2020-01-28 22:14:17.376369: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] layout: layout did nothing. time = 0.036ms.
2020-01-28 22:14:17.376373: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] constant_folding: Graph size after: 24 nodes (0), 27 edges (0), time = 2.18ms.
2020-01-28 22:14:17.376378: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] TensorRTOptimizer: Graph size after: 24 nodes (0), 27 edges (0), time = 0.11ms.
2020-01-28 22:14:17.376384: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] constant_folding: Graph size after: 24 nodes (0), 27 edges (0), time = 2.062ms.
2020-01-28 22:14:17.376389: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:786] Optimization results for grappler item: TRTEngineOp_4_native_segment
2020-01-28 22:14:17.376392: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] constant_folding: Graph size after: 24 nodes (0), 27 edges (0), time = 1.616ms.
2020-01-28 22:14:17.376398: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] layout: layout did nothing. time = 0.026ms.
2020-01-28 22:14:17.376403: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] constant_folding: Graph size after: 24 nodes (0), 27 edges (0), time = 1.494ms.
2020-01-28 22:14:17.376407: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] TensorRTOptimizer: Graph size after: 24 nodes (0), 27 edges (0), time = 0.078ms.
2020-01-28 22:14:17.376411: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] constant_folding: Graph size after: 24 nodes (0), 27 edges (0), time = 1.556ms.
2020-01-28 22:14:17.376417: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:786] Optimization results for grappler item: GridAnchorGenerator/TRTEngineOp_8_native_segment
2020-01-28 22:14:17.376423: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] constant_folding: Graph size after: 11 nodes (0), 12 edges (0), time = 1.682ms.
2020-01-28 22:14:17.376429: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] layout: layout did nothing. time = 0.025ms.
2020-01-28 22:14:17.376433: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] constant_folding: Graph size after: 11 nodes (0), 12 edges (0), time = 1.779ms.
2020-01-28 22:14:17.376438: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] TensorRTOptimizer: Graph size after: 11 nodes (0), 12 edges (0), time = 0.086ms.
2020-01-28 22:14:17.376443: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] constant_folding: Graph size after: 11 nodes (0), 12 edges (0), time = 1.733ms.
2020-01-28 22:14:17.376447: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:786] Optimization results for grappler item: SecondStagePostprocessor/TRTEngineOp_12_native_segment
2020-01-28 22:14:17.376453: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] constant_folding: Graph size after: 7 nodes (0), 6 edges (0), time = 1.665ms.
2020-01-28 22:14:17.376458: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] layout: layout did nothing. time = 0.021ms.
2020-01-28 22:14:17.376462: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] constant_folding: Graph size after: 7 nodes (0), 6 edges (0), time = 1.597ms.
2020-01-28 22:14:17.376467: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] TensorRTOptimizer: Graph size after: 7 nodes (0), 6 edges (0), time = 0.074ms.
2020-01-28 22:14:17.376472: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] constant_folding: Graph size after: 7 nodes (0), 6 edges (0), time = 1.659ms.
2020-01-28 22:14:17.376476: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:786] Optimization results for grappler item: TRTEngineOp_6_native_segment
2020-01-28 22:14:17.376482: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] constant_folding: Graph size after: 6 nodes (0), 5 edges (0), time = 1.515ms.
2020-01-28 22:14:17.376494: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] layout: layout did nothing. time = 0.019ms.
2020-01-28 22:14:17.376500: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] constant_folding: Graph size after: 6 nodes (0), 5 edges (0), time = 1.481ms.
2020-01-28 22:14:17.376505: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] TensorRTOptimizer: Graph size after: 6 nodes (0), 5 edges (0), time = 0.066ms.
2020-01-28 22:14:17.376510: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] constant_folding: Graph size after: 6 nodes (0), 5 edges (0), time = 1.712ms.
2020-01-28 22:14:17.376515: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:786] Optimization results for grappler item: TRTEngineOp_2_native_segment
2020-01-28 22:14:17.376522: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] constant_folding: Graph size after: 24 nodes (0), 27 edges (0), time = 2.185ms.
2020-01-28 22:14:17.376528: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] layout: layout did nothing. time = 0.037ms.
2020-01-28 22:14:17.376533: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] constant_folding: Graph size after: 24 nodes (0), 27 edges (0), time = 2.221ms.
2020-01-28 22:14:17.376537: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] TensorRTOptimizer: Graph size after: 24 nodes (0), 27 edges (0), time = 0.108ms.
2020-01-28 22:14:17.376543: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] constant_folding: Graph size after: 24 nodes (0), 27 edges (0), time = 2.335ms.
numb. of trt_engine_ops in trt_graph 13
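To see how the 13 engines are distributed, the "TensorRT node ... added for segment" lines of the conversion log can be tallied. The sketch below is only a log-parsing helper (the function name `summarize_segments` is hypothetical, and the regular expression assumes the exact message format shown in this log); the sample contains three lines copied from above.

```python
import re

# Matches messages such as:
# "TensorRT node TRTEngineOp_5 added for segment 5 consisting of 519 nodes succeeded."
SEGMENT_RE = re.compile(
    r"TensorRT node (\S+) added for segment (\d+) consisting of (\d+) nodes succeeded"
)


def summarize_segments(log_text):
    """Return (engine_name, node_count) pairs, largest segment first."""
    found = [(m.group(1), int(m.group(3))) for m in SEGMENT_RE.finditer(log_text)]
    return sorted(found, key=lambda item: item[1], reverse=True)


sample = """\
convert_graph.cc:734] TensorRT node ClipToWindow/TRTEngineOp_0 added for segment 0 consisting of 8 nodes succeeded.
convert_graph.cc:734] TensorRT node TRTEngineOp_5 added for segment 5 consisting of 519 nodes succeeded.
convert_graph.cc:734] TensorRT node TRTEngineOp_9 added for segment 9 consisting of 55 nodes succeeded.
"""

for name, nodes in summarize_segments(sample):
    print(name, nodes)
```

Summing all 13 segments in the log gives 688 nodes, which matches the reported graph change of -675 nodes (688 nodes replaced by 13 engine ops). Most of them (519) sit in TRTEngineOp_5.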
However, I then get this error during inference:
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1365, in _do_call
return fn(*args)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1348, in _run_fn
self._extend_graph()
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1388, in _extend_graph
tf_session.ExtendSession(self._session)
tensorflow.python.framework.errors_impl.InvalidArgumentError: No OpKernel was registered to support Op 'TRTEngineOp' used by {{node import/TRTEngineOp_5}}with these attrs: [output_shapes=[], workspace_size_bytes=404994176, max_cached_engines_count=1, segment_func=TRTEngineOp_5_native_segment[], segment_funcdef_name="", use_calibration=false, fixed_input_size=true, input_shapes=[], OutT=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], precision_mode="FP16", static_engine=false, serialized_segment="", cached_engine_batches=[], InT=[DT_FLOAT], calibration_data=""]
Registered devices: [CPU, XLA_CPU, XLA_GPU]
Registered kernels:
device='GPU'
[[import/TRTEngineOp_5]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "pb_to_TRT2.py", line 54, in <module>
sess.run(output_node)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 956, in run
run_metadata_ptr)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1180, in _run
feed_dict_tensor, options, run_metadata)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1359, in _do_run
run_metadata)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1384, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: No OpKernel was registered to support Op 'TRTEngineOp' used by node import/TRTEngineOp_5 (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py:1748) with these attrs: [output_shapes=[], workspace_size_bytes=404994176, max_cached_engines_count=1, segment_func=TRTEngineOp_5_native_segment[], segment_funcdef_name="", use_calibration=false, fixed_input_size=true, input_shapes=[], OutT=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], precision_mode="FP16", static_engine=false, serialized_segment="", cached_engine_batches=[], InT=[DT_FLOAT], calibration_data=""]
Registered devices: [CPU, XLA_CPU, XLA_GPU]
Registered kernels:
device='GPU'
[[import/TRTEngineOp_5]]
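The error lists the registered devices as CPU, XLA_CPU and XLA_GPU, while the only kernel registered for TRTEngineOp is for device='GPU'. A small sketch of that check is given below; `has_plain_gpu` is a hypothetical helper, and the optional part at the end assumes the TF 1.x internal module `tensorflow.python.client.device_lib` is importable in the environment.

```python
def has_plain_gpu(device_types):
    """True only if a device of type exactly 'GPU' is present (XLA_GPU does not count)."""
    return "GPU" in device_types


# Device list copied from the "Registered devices" line of the error.
registered = ["CPU", "XLA_CPU", "XLA_GPU"]
print("plain GPU registered:", has_plain_gpu(registered))

if __name__ == "__main__":
    # Optional live check; skipped when TensorFlow is not installed.
    try:
        from tensorflow.python.client import device_lib
    except ImportError:
        device_lib = None
    if device_lib is not None:
        local = [d.device_type for d in device_lib.list_local_devices()]
        print("local devices:", local, "-> plain GPU:", has_plain_gpu(local))
```

For the device list in the error this reports no plain GPU device, which is consistent with the earlier "Skipping registering GPU devices" warning.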
Is this pipeline correct? Which version of TF should I use to get the best inference-time optimization without errors at inference?
Ubuntu: 1.18
TensorRT Container: 19.07