TF v2 to TensorRT on Xavier: no execution time improvement

Hi, I want to optimise my model using TensorRT. I tried FP32, FP16, and INT8, but I get the same execution time with no improvement.
Here is the code (INT8):

import os
import cv2
import tensorflow as tf
from tensorflow.python.compiler.tensorrt import trt_convert as trt

batch_size = 30
count = 0

# Load the images used for INT8 calibration.
batched_input = []
listimg = os.listdir("./img/")
for i in listimg:
    img = cv2.imread("./img/" + i)
    batched_input.append(tf.image.convert_image_dtype(img, dtype=tf.uint8, saturate=False))
print('*****batched_input shape: ', len(batched_input))

# Configure the conversion: 4 GB workspace, INT8 precision.
conversion_params = trt.DEFAULT_TRT_CONVERSION_PARAMS
conversion_params = conversion_params._replace(max_workspace_size_bytes=(1 << 32))
conversion_params = conversion_params._replace(precision_mode="INT8")
converter = trt.TrtGraphConverterV2(input_saved_model_dir='./saved', conversion_params=conversion_params)

# Generator that feeds the calibration data to the converter.
def calibration_input_fn():
    yield (batched_input, )

converter.convert(calibration_input_fn=calibration_input_fn)
converter.save("./v2trt")
When I run it, I get:

2020-04-17 15:50:12.302106: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:948] ARM64 does not support NUMA - returning NUMA node zero
2020-04-17 15:50:12.302321: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:948] ARM64 does not support NUMA - returning NUMA node zero
2020-04-17 15:50:12.302415: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 5154 MB memory) -> physical GPU (device: 0, name: Xavier, pci bus id: 0000:00:00.0, compute capability: 7.2)
2020-04-17 15:50:12.723667: I tensorflow/compiler/tf2tensorrt/segment/segment.cc:460] There are 57 ops of 23 different types in the graph that are not converted to TensorRT: TensorArrayGatherV3, Exit, TensorArrayReadV3, Reshape, TensorArrayScatterV3, Shape, Enter, Cast, TensorArrayWriteV3, TensorArraySizeV3, NoOp, StridedSlice, Range, Merge, Less, LogicalAnd, LoopCond, Switch, NextIteration, Placeholder, TensorArrayV3, Identity, Add, (For more information see https://docs.nvidia.com/deeplearning/frameworks/tf-trt-user-guide/index.html#supported-ops).
2020-04-17 15:50:12.733872: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:636] Number of TensorRT candidate segments: 4
2020-04-17 15:50:12.773356: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:737] Replaced segment 0 consisting of 7 nodes by net/TRTEngineOp_0.
2020-04-17 15:50:12.773666: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:737] Replaced segment 1 consisting of 139 nodes by net/TRTEngineOp_1.
2020-04-17 15:50:12.774443: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:737] Replaced segment 2 consisting of 12 nodes by net/TRTEngineOp_2.
2020-04-17 15:50:12.774662: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:737] Replaced segment 3 consisting of 14 nodes by net/TRTEngineOp_3.
2020-04-17 15:50:12.955839: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:841] Optimization results for grappler item: tf_graph
2020-04-17 15:50:12.955988: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:843] constant_folding: Graph size after: 233 nodes (-68), 257 edges (-68), time = 126.916ms.
2020-04-17 15:50:12.956026: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:843] layout: Graph size after: 245 nodes (12), 269 edges (12), time = 49.924ms.
2020-04-17 15:50:12.956048: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:843] constant_folding: Graph size after: 245 nodes (0), 269 edges (0), time = 28.724ms.
2020-04-17 15:50:12.956096: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:843] TensorRTOptimizer: Graph size after: 77 nodes (-168), 94 edges (-175), time = 84.048ms.
2020-04-17 15:50:12.956164: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:843] constant_folding: Graph size after: 77 nodes (0), 94 edges (0), time = 6.785ms.
2020-04-17 15:50:12.956210: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:841] Optimization results for grappler item: net/TRTEngineOp_0_native_segment
2020-04-17 15:50:12.956251: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:843] constant_folding: Graph size after: 9 nodes (0), 9 edges (0), time = 1.093ms.
2020-04-17 15:50:12.956275: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:843] layout: Graph size after: 9 nodes (0), 9 edges (0), time = 0.805ms.
2020-04-17 15:50:12.956314: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:843] constant_folding: Graph size after: 9 nodes (0), 9 edges (0), time = 0.834ms.
2020-04-17 15:50:12.956352: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:843] TensorRTOptimizer: Graph size after: 9 nodes (0), 9 edges (0), time = 0.051ms.
2020-04-17 15:50:12.956375: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:843] constant_folding: Graph size after: 9 nodes (0), 9 edges (0), time = 0.857ms.
2020-04-17 15:50:12.956433: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:841] Optimization results for grappler item: net/TRTEngineOp_2_native_segment
2020-04-17 15:50:12.956463: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:843] constant_folding: Graph size after: 15 nodes (0), 14 edges (0), time = 1.545ms.
2020-04-17 15:50:12.956506: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:843] layout: Graph size after: 15 nodes (0), 14 edges (0), time = 1.109ms.
2020-04-17 15:50:12.956542: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:843] constant_folding: Graph size after: 15 nodes (0), 14 edges (0), time = 1.142ms.
2020-04-17 15:50:12.956577: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:843] TensorRTOptimizer: Graph size after: 15 nodes (0), 14 edges (0), time = 0.048ms.
2020-04-17 15:50:12.956609: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:843] constant_folding: Graph size after: 15 nodes (0), 14 edges (0), time = 1.132ms.
2020-04-17 15:50:12.956642: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:841] Optimization results for grappler item: net/TRTEngineOp_3_native_segment
2020-04-17 15:50:12.956675: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:843] constant_folding: Graph size after: 17 nodes (0), 16 edges (0), time = 27.411ms.
2020-04-17 15:50:12.956706: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:843] layout: Graph size after: 17 nodes (0), 16 edges (0), time = 9.74ms.
2020-04-17 15:50:12.956740: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:843] constant_folding: Graph size after: 17 nodes (0), 16 edges (0), time = 16.963ms.
2020-04-17 15:50:12.956771: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:843] TensorRTOptimizer: Graph size after: 17 nodes (0), 16 edges (0), time = 0.957ms.
2020-04-17 15:50:12.956805: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:843] constant_folding: Graph size after: 17 nodes (0), 16 edges (0), time = 10.231ms.
2020-04-17 15:50:12.956837: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:841] Optimization results for grappler item: net/TRTEngineOp_1_native_segment
2020-04-17 15:50:12.956869: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:843] constant_folding: Graph size after: 141 nodes (0), 146 edges (0), time = 12.162ms.
2020-04-17 15:50:12.956892: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:843] layout: Graph size after: 141 nodes (0), 146 edges (0), time = 14.287ms.
2020-04-17 15:50:12.956923: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:843] constant_folding: Graph size after: 141 nodes (0), 146 edges (0), time = 13.005ms.
2020-04-17 15:50:12.956956: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:843] TensorRTOptimizer: Graph size after: 141 nodes (0), 146 edges (0), time = 1.093ms.
2020-04-17 15:50:12.956978: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:843] constant_folding: Graph size after: 141 nodes (0), 146 edges (0), time = 12.634ms.
2020-04-17 15:51:09.517092: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer.so.6
2020-04-17 15:51:09.746094: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer_plugin.so.6
2020-04-17 15:51:16.434764: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-04-17 15:51:16.887875: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0

Hi,

TF-TRT automatically falls back to the TensorFlow implementation for any layer that TensorRT does not support.
The precision mode does not affect those fallback layers.

Based on your log:

There are 57 ops of 23 different types in the graph that are not converted to TensorRT: TensorArrayGatherV3, Exit, TensorArrayReadV3, Reshape, TensorArrayScatterV3, Shape, Enter, Cast, TensorArrayWriteV3, TensorArraySizeV3, NoOp, StridedSlice, Range, Merge, Less, LogicalAnd, LoopCond, Switch, NextIteration, Placeholder, TensorArrayV3, Identity, Add,

There are a lot of fallback layers in your model.
This may be the reason why you didn’t see an obvious improvement when changing the precision mode.
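To put numbers on this, here is a minimal sketch that summarises a TF-TRT conversion log: how many ops fell back to TensorFlow, and how many nodes each TRTEngineOp replaced. The function name `summarize_conversion_log` and the regular expressions are my own, written against the message wording in the log above; other TensorFlow versions may word these lines differently. The sample text is the message part of the relevant lines from your log, without timestamps.

```python
import re

# Patterns written against the log wording shown in this thread (an assumption:
# other TensorFlow versions may phrase these messages differently).
FALLBACK = re.compile(
    r"There are (\d+) ops of (\d+) different types in the graph that are "
    r"not converted to TensorRT: (.*?)(?:\(For more|$)"
)
SEGMENT = re.compile(r"Replaced segment (\d+) consisting of (\d+) nodes by (\S+)")


def summarize_conversion_log(log_text):
    """Return fallback op count/types and per-engine node counts from a TF-TRT log."""
    summary = {"fallback_ops": 0, "fallback_types": [], "segments": {}}
    for line in log_text.splitlines():
        fallback = FALLBACK.search(line)
        if fallback:
            summary["fallback_ops"] = int(fallback.group(1))
            summary["fallback_types"] = [
                name.strip() for name in fallback.group(3).split(",") if name.strip()
            ]
        segment = SEGMENT.search(line)
        if segment:
            engine_name = segment.group(3).rstrip(".")
            summary["segments"][engine_name] = int(segment.group(2))
    summary["converted_nodes"] = sum(summary["segments"].values())
    return summary


# Message text copied from the log in this thread (timestamps removed).
SAMPLE_LOG = """\
There are 57 ops of 23 different types in the graph that are not converted to TensorRT: TensorArrayGatherV3, Exit, TensorArrayReadV3, Reshape, TensorArrayScatterV3, Shape, Enter, Cast, TensorArrayWriteV3, TensorArraySizeV3, NoOp, StridedSlice, Range, Merge, Less, LogicalAnd, LoopCond, Switch, NextIteration, Placeholder, TensorArrayV3, Identity, Add, (For more information see https://docs.nvidia.com/deeplearning/frameworks/tf-trt-user-guide/index.html#supported-ops).
Replaced segment 0 consisting of 7 nodes by net/TRTEngineOp_0.
Replaced segment 1 consisting of 139 nodes by net/TRTEngineOp_1.
Replaced segment 2 consisting of 12 nodes by net/TRTEngineOp_2.
Replaced segment 3 consisting of 14 nodes by net/TRTEngineOp_3.
"""

summary = summarize_conversion_log(SAMPLE_LOG)
print("Fallback ops:", summary["fallback_ops"])
print("Fallback op types:", len(summary["fallback_types"]))
print("Nodes replaced by TensorRT engines:", summary["converted_nodes"])
for engine_name, node_count in summary["segments"].items():
    print(f"  {engine_name}: {node_count} nodes")
```

Note that these are node counts only. They show how the graph was split between TensorRT and TensorFlow, not how much of the execution time each part takes.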

Thanks.