TF-TRT no engine generated

Description

TF-TRT conversion on Jetson AGX Xavier completes without errors, but no TensorRT engine appears in the output directory; the four files written there are all 0 bytes.

Environment

TensorRT Version: 7.1.3
GPU Type: AGX Xavier
Nvidia Driver Version: JETPACK 4.5
CUDA Version: 10.2
CUDNN Version: 8.0
Operating System + Version: Ubuntu 18.04
Python Version (if applicable): 3.6.9
TensorFlow Version (if applicable): 2.4.0
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):

Relevant Files

trtnew.py (997 Bytes)
saved_model

Steps To Reproduce

I was trying to use TF-TRT to generate a TensorRT engine on Xavier. I ran the script with python3 trtnew.py and got the log below, but no TensorRT engine was generated in my output directory 'my_model/converted_model'. There are four files there, but they are all 0 bytes. Is there anything wrong with my script or trained model?
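This is how I checked that the output files are empty (plain Python, nothing TF-specific; 'my_model/converted_model' is the output directory from my script):

```python
import os

def empty_files(model_dir):
    """Return the paths of all zero-byte files under model_dir."""
    empty = []
    for root, _dirs, files in os.walk(model_dir):
        for name in files:
            path = os.path.join(root, name)
            if os.path.getsize(path) == 0:
                empty.append(path)
    return sorted(empty)

if __name__ == "__main__":
    for path in empty_files("my_model/converted_model"):
        print("empty:", path)
```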
Log:

dewei@dewei-desktop:~/Documents$ python3 trtnew.py
2021-03-23 14:10:14.430063: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.2
2021-03-23 14:10:18.424296: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-03-23 14:10:18.427041: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcuda.so.1
2021-03-23 14:10:18.430794: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] ARM64 does not support NUMA - returning NUMA node zero
2021-03-23 14:10:18.430967: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Found device 0 with properties: 
pciBusID: 0000:00:00.0 name: Xavier computeCapability: 7.2
coreClock: 1.377GHz coreCount: 8 deviceMemorySize: 15.45GiB deviceMemoryBandwidth: 82.08GiB/s
2021-03-23 14:10:18.431051: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.2
2021-03-23 14:10:18.433870: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.10
2021-03-23 14:10:18.434018: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.10
2021-03-23 14:10:18.436362: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2021-03-23 14:10:18.436923: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2021-03-23 14:10:18.439887: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.10
2021-03-23 14:10:18.442150: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.10
2021-03-23 14:10:18.442580: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2021-03-23 14:10:18.442821: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] ARM64 does not support NUMA - returning NUMA node zero
2021-03-23 14:10:18.443093: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] ARM64 does not support NUMA - returning NUMA node zero
2021-03-23 14:10:18.443165: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1888] Adding visible gpu devices: 0
2021-03-23 14:10:18.443233: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.2
2021-03-23 14:10:19.914406: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1287] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-03-23 14:10:19.914585: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1293]      0 
2021-03-23 14:10:19.914632: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1306] 0:   N 
2021-03-23 14:10:19.915310: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] ARM64 does not support NUMA - returning NUMA node zero
2021-03-23 14:10:19.915732: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] ARM64 does not support NUMA - returning NUMA node zero
2021-03-23 14:10:19.915991: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] ARM64 does not support NUMA - returning NUMA node zero
2021-03-23 14:10:19.916173: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 5275 MB memory) -> physical GPU (device: 0, name: Xavier, pci bus id: 0000:00:00.0, compute capability: 7.2)
2021-03-23 14:10:19.939951: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libnvinfer.so.7
2021-03-23 14:10:43.702242: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-03-23 14:10:43.702663: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] ARM64 does not support NUMA - returning NUMA node zero
2021-03-23 14:10:43.702828: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Found device 0 with properties: 
pciBusID: 0000:00:00.0 name: Xavier computeCapability: 7.2
coreClock: 1.377GHz coreCount: 8 deviceMemorySize: 15.45GiB deviceMemoryBandwidth: 82.08GiB/s
2021-03-23 14:10:43.702931: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.2
2021-03-23 14:10:43.703008: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.10
2021-03-23 14:10:43.703057: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.10
2021-03-23 14:10:43.703099: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2021-03-23 14:10:43.703139: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2021-03-23 14:10:43.703214: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.10
2021-03-23 14:10:43.703259: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.10
2021-03-23 14:10:43.703326: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2021-03-23 14:10:43.703507: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] ARM64 does not support NUMA - returning NUMA node zero
2021-03-23 14:10:43.703659: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] ARM64 does not support NUMA - returning NUMA node zero
2021-03-23 14:10:43.703714: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1888] Adding visible gpu devices: 0
2021-03-23 14:10:43.704764: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-03-23 14:10:43.705016: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] ARM64 does not support NUMA - returning NUMA node zero
2021-03-23 14:10:43.705138: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Found device 0 with properties: 
pciBusID: 0000:00:00.0 name: Xavier computeCapability: 7.2
coreClock: 1.377GHz coreCount: 8 deviceMemorySize: 15.45GiB deviceMemoryBandwidth: 82.08GiB/s
2021-03-23 14:10:43.705185: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.2
2021-03-23 14:10:43.705240: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.10
2021-03-23 14:10:43.705281: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.10
2021-03-23 14:10:43.705320: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2021-03-23 14:10:43.705360: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2021-03-23 14:10:43.705420: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.10
2021-03-23 14:10:43.705460: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.10
2021-03-23 14:10:43.705502: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2021-03-23 14:10:43.705621: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] ARM64 does not support NUMA - returning NUMA node zero
2021-03-23 14:10:43.705758: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] ARM64 does not support NUMA - returning NUMA node zero
2021-03-23 14:10:43.705831: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1888] Adding visible gpu devices: 0
2021-03-23 14:10:44.558311: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1287] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-03-23 14:10:44.558409: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1293]      0 
2021-03-23 14:10:44.558439: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1306] 0:   N 
2021-03-23 14:10:44.558840: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] ARM64 does not support NUMA - returning NUMA node zero
2021-03-23 14:10:44.559141: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] ARM64 does not support NUMA - returning NUMA node zero
2021-03-23 14:10:44.559241: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 5275 MB memory) -> physical GPU (device: 0, name: Xavier, pci bus id: 0000:00:00.0, compute capability: 7.2)
2021-03-23 14:12:23.116985: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] ARM64 does not support NUMA - returning NUMA node zero
2021-03-23 14:12:23.117149: I tensorflow/core/grappler/devices.cc:69] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 1
2021-03-23 14:12:23.117397: I tensorflow/core/grappler/clusters/single_machine.cc:356] Starting new session
2021-03-23 14:12:23.118027: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-03-23 14:12:23.118256: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] ARM64 does not support NUMA - returning NUMA node zero
2021-03-23 14:12:23.118351: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Found device 0 with properties: 
pciBusID: 0000:00:00.0 name: Xavier computeCapability: 7.2
coreClock: 1.377GHz coreCount: 8 deviceMemorySize: 15.45GiB deviceMemoryBandwidth: 82.08GiB/s
2021-03-23 14:12:23.118403: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.2
2021-03-23 14:12:23.118499: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.10
2021-03-23 14:12:23.118546: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.10
2021-03-23 14:12:23.118607: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2021-03-23 14:12:23.118671: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2021-03-23 14:12:23.118723: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.10
2021-03-23 14:12:23.118765: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.10
2021-03-23 14:12:23.118805: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2021-03-23 14:12:23.118976: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] ARM64 does not support NUMA - returning NUMA node zero
2021-03-23 14:12:23.119127: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] ARM64 does not support NUMA - returning NUMA node zero
2021-03-23 14:12:23.119182: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1888] Adding visible gpu devices: 0
2021-03-23 14:12:23.119251: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1287] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-03-23 14:12:23.119275: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1293]      0 
2021-03-23 14:12:23.119307: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1306] 0:   N 
2021-03-23 14:12:23.119447: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] ARM64 does not support NUMA - returning NUMA node zero
2021-03-23 14:12:23.119647: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] ARM64 does not support NUMA - returning NUMA node zero
2021-03-23 14:12:23.119752: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 5275 MB memory) -> physical GPU (device: 0, name: Xavier, pci bus id: 0000:00:00.0, compute capability: 7.2)
2021-03-23 14:12:23.120859: W tensorflow/core/platform/profile_utils/cpu_utils.cc:116] Failed to find bogomips or clock in /proc/cpuinfo; cannot determine CPU frequency
2021-03-23 14:12:25.657738: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:954] Optimization results for grappler item: graph_to_optimize
  function_optimizer: Graph size after: 4997 nodes (4509), 10438 edges (9943), time = 386.766ms.
  function_optimizer: function_optimizer did nothing. time = 4.548ms.

2021-03-23 14:12:56.356576: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] ARM64 does not support NUMA - returning NUMA node zero
2021-03-23 14:12:56.356765: I tensorflow/core/grappler/devices.cc:69] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 1
2021-03-23 14:12:56.356973: I tensorflow/core/grappler/clusters/single_machine.cc:356] Starting new session
2021-03-23 14:12:56.357372: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-03-23 14:12:56.357602: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] ARM64 does not support NUMA - returning NUMA node zero
2021-03-23 14:12:56.357721: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Found device 0 with properties: 
pciBusID: 0000:00:00.0 name: Xavier computeCapability: 7.2
coreClock: 1.377GHz coreCount: 8 deviceMemorySize: 15.45GiB deviceMemoryBandwidth: 82.08GiB/s
2021-03-23 14:12:56.357775: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.2
2021-03-23 14:12:56.357844: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.10
2021-03-23 14:12:56.357889: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.10
2021-03-23 14:12:56.358000: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2021-03-23 14:12:56.358046: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2021-03-23 14:12:56.358092: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.10
2021-03-23 14:12:56.358154: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.10
2021-03-23 14:12:56.358202: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2021-03-23 14:12:56.358328: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] ARM64 does not support NUMA - returning NUMA node zero
2021-03-23 14:12:56.358489: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] ARM64 does not support NUMA - returning NUMA node zero
2021-03-23 14:12:56.358554: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1888] Adding visible gpu devices: 0
2021-03-23 14:12:56.358627: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1287] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-03-23 14:12:56.358652: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1293]      0 
2021-03-23 14:12:56.358698: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1306] 0:   N 
2021-03-23 14:12:56.358910: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] ARM64 does not support NUMA - returning NUMA node zero
2021-03-23 14:12:56.359117: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] ARM64 does not support NUMA - returning NUMA node zero
2021-03-23 14:12:56.359218: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 5275 MB memory) -> physical GPU (device: 0, name: Xavier, pci bus id: 0000:00:00.0, compute capability: 7.2)
2021-03-23 14:13:02.241299: I tensorflow/compiler/tf2tensorrt/segment/segment.cc:790] There are 621 ops of 52 different types in the graph that are not converted to TensorRT: Sum, GreaterEqual, Where, Reciprocal, ResizeBilinear, Split, TensorListGetItem, Cast, StopGradient, Pad, Slice, Mul, LogicalAnd, Range, Less, Merge, NextIteration, Switch, Select, Exit, LoopCond, Pack, NoOp, Size, Greater, GatherV2, ExpandDims, Identity, Assert, NonMaxSuppressionV5, Squeeze, Enter, TensorListFromTensor, AddV2, TensorListSetItem, Placeholder, TensorListStack, TensorListReserve, Const, Sub, Reshape, Transpose, Minimum, Shape, Maximum, StridedSlice, Fill, Unpack, ConcatV2, Exp, Equal, TopKV2, (For more information see https://docs.nvidia.com/deeplearning/frameworks/tf-trt-user-guide/index.html#supported-ops).
2021-03-23 14:13:04.137642: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:757] Number of TensorRT candidate segments: 4
2021-03-23 14:13:04.879147: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:851] Replaced segment 0 consisting of 4 nodes by StatefulPartitionedCall/Postprocessor/BatchMultiClassNonMaxSuppression/MultiClassNonMaxSuppression/Area/TRTEngineOp_0_0.
2021-03-23 14:13:04.879443: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:851] Replaced segment 1 consisting of 4 nodes by StatefulPartitionedCall/Postprocessor/BatchMultiClassNonMaxSuppression/MultiClassNonMaxSuppression/ClipToWindow/Area/TRTEngineOp_0_1.
2021-03-23 14:13:04.879687: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:851] Replaced segment 2 consisting of 87 nodes by StatefulPartitionedCall/Postprocessor/BatchMultiClassNonMaxSuppression/TRTEngineOp_0_2.
2021-03-23 14:13:04.880164: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:851] Replaced segment 3 consisting of 933 nodes by StatefulPartitionedCall/TRTEngineOp_0_3.
2021-03-23 14:13:06.515463: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:954] Optimization results for grappler item: tf_graph
  constant_folding: Graph size after: 2233 nodes (-2459), 7326 edges (-2725), time = 2068.16895ms.
  layout: Graph size after: 2267 nodes (34), 7360 edges (34), time = 460.172ms.
  constant_folding: Graph size after: 2267 nodes (0), 7360 edges (0), time = 240.303ms.
  TensorRTOptimizer: Graph size after: 1247 nodes (-1020), 1707 edges (-5653), time = 3366.40698ms.
  constant_folding: Graph size after: 1243 nodes (-4), 1707 edges (0), time = 133.495ms.
Optimization results for grappler item: StatefulPartitionedCall/Postprocessor/BatchMultiClassNonMaxSuppression/TRTEngineOp_0_2_native_segment
  constant_folding: Graph size after: 105 nodes (0), 104 edges (0), time = 5.242ms.
  layout: Graph size after: 105 nodes (0), 104 edges (0), time = 5.332ms.
  constant_folding: Graph size after: 105 nodes (0), 104 edges (0), time = 5.189ms.
  TensorRTOptimizer: Graph size after: 105 nodes (0), 104 edges (0), time = 0.256ms.
  constant_folding: Graph size after: 105 nodes (0), 104 edges (0), time = 5.106ms.
Optimization results for grappler item: StatefulPartitionedCall/Postprocessor/BatchMultiClassNonMaxSuppression/MultiClassNonMaxSuppression/Area/TRTEngineOp_0_0_native_segment
  constant_folding: Graph size after: 9 nodes (0), 8 edges (0), time = 0.809ms.
  layout: Graph size after: 9 nodes (0), 8 edges (0), time = 0.608ms.
  constant_folding: Graph size after: 9 nodes (0), 8 edges (0), time = 0.576ms.
  TensorRTOptimizer: Graph size after: 9 nodes (0), 8 edges (0), time = 0.028ms.
  constant_folding: Graph size after: 9 nodes (0), 8 edges (0), time = 0.634ms.
Optimization results for grappler item: StatefulPartitionedCall/Postprocessor/BatchMultiClassNonMaxSuppression/MultiClassNonMaxSuppression/ClipToWindow/Area/TRTEngineOp_0_1_native_segment
  constant_folding: Graph size after: 9 nodes (0), 8 edges (0), time = 0.727ms.
  layout: Graph size after: 9 nodes (0), 8 edges (0), time = 0.682ms.
  constant_folding: Graph size after: 9 nodes (0), 8 edges (0), time = 0.623ms.
  TensorRTOptimizer: Graph size after: 9 nodes (0), 8 edges (0), time = 0.029ms.
  constant_folding: Graph size after: 9 nodes (0), 8 edges (0), time = 0.634ms.
Optimization results for grappler item: StatefulPartitionedCall/TRTEngineOp_0_3_native_segment
  constant_folding: Graph size after: 937 nodes (0), 966 edges (0), time = 179.573ms.
  layout: Graph size after: 937 nodes (0), 966 edges (0), time = 381.25ms.
  constant_folding: Graph size after: 937 nodes (0), 966 edges (0), time = 179.629ms.
  TensorRTOptimizer: Graph size after: 937 nodes (0), 966 edges (0), time = 36.968ms.
  constant_folding: Graph size after: 937 nodes (0), 966 edges (0), time = 182.959ms.

2021-03-23 14:33:15.708868: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:116] None of the MLIR optimization passes are enabled (registered 2)
2021-03-23 14:33:17.822496: I tensorflow/compiler/tf2tensorrt/common/utils.cc:58] Linked TensorRT version: 7.1.3
2021-03-23 14:33:17.823162: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libnvinfer.so.7
2021-03-23 14:33:17.824186: I tensorflow/compiler/tf2tensorrt/common/utils.cc:60] Loaded TensorRT version: 7.1.3
2021-03-23 14:33:17.844918: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libnvinfer_plugin.so.7
WARNING:absl:Found untraced functions such as restored_function_body, restored_function_body, restored_function_body, restored_function_body, restored_function_body while saving (showing 5 of 315). These functions will not be directly callable after loading.
WARNING:absl:Found untraced functions such as restored_function_body, restored_function_body, restored_function_body, restored_function_body, restored_function_body while saving (showing 5 of 315). These functions will not be directly callable after loading.

Hi,
We recommend that you check the links below, as they may answer your question:
https://docs.nvidia.com/deeplearning/frameworks/tf-trt-user-guide/index.html#samples
https://docs.nvidia.com/deeplearning/tensorrt/archives/tensorrt-722/quick-start-guide/index.html#framework-integration
https://docs.nvidia.com/deeplearning/frameworks/tf-trt-user-guide/index.html#integrate-ovr
https://docs.nvidia.com/deeplearning/frameworks/tf-trt-user-guide/index.html#usingtftrt

If the issue persists, please share the model and script so that we can try to reproduce it on our end.
Thanks!

Hi,
I do need help reproducing the issue on your end. The model and script are attached in the first post. Thanks!

Hi @wdw0908,

The conversion script you provided works for us: it converts the network and generates the engine files. A few things to note:

  • There are multiple engine files, and some nodes in the computation graph are not converted but left as TF nodes. This means that TensorFlow is needed to run the converted model; you cannot run TRT-only inference, since that would require a complete conversion.

  • Try deleting the output folder and re-running the conversion.

  • The result of the script can depend on the TF version and on the target GPU. I used the latest TF nightly for the conversion and got around a 2x speedup on a V100.

  • While it is not an error, using maximum_cached_engines=1000 is not recommended. Prefer a low value.

  • Again not an error, but you may want to define the input function like this:

      def my_input_fn():
          inp1 = np.random.normal(size=(1, 640, 640, 3)).astype(np.uint8)
          yield (inp1,)  # <- Note the comma! TF-TRT expects a list or tuple of input tensors; here we give a tuple with one element.
    
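As a self-contained sanity check of that input function (NumPy only; the 1x640x640x3 uint8 shape is taken from the example above and is an assumption about your model's actual input signature):

```python
import numpy as np

def my_input_fn():
    # One dummy batch; dtype and shape must match the model's input signature.
    inp1 = np.random.normal(size=(1, 640, 640, 3)).astype(np.uint8)
    # TF-TRT expects a list or tuple of input tensors, so yield a 1-tuple.
    yield (inp1,)

batch = next(my_input_fn())
print(type(batch).__name__, batch[0].shape, batch[0].dtype)
```

In TF 2.x you can then pass this function to the converter, e.g. converter.build(input_fn=my_input_fn) before converter.save(...), so that the TRT engines are built at conversion time rather than lazily at the first inference.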

Thank you.