TF-TRT5: Could not find tensor InputPH_0 in tensorScales

Provide details on the platforms you are using:
Linux distro and version: Ubuntu 16.04.5 LTS (GNU/Linux 4.4.0-133-generic x86_64)
GPU type: Tesla P4
nvidia driver version: 384.81
CUDA version: 9.0
CUDNN version: 7.4
Python version [if using python]: python 3.5
Tensorflow version: 1.10
TensorRT version: TensorRT 5.0.0 RC / Container image 18.10-py3
If Jetson, OS, hw versions: n/a

Describe the problem
I took the TF-TRT integration sample script tftrt_sample.py and adapted it to optimize inference for my custom model (based on resnetv2_50). Optimized inference works in the native, fp32, and fp16 precision modes, but fails in int8. See the results below (a sketch of the INT8 workflow follows them):
native:

images/s : 38.7 +/- 0.6, s/batch: 0.02582 +/- 0.00042
RES, Native, 1, 38.73, 0.63, 0.02582, 0.00042

fp32:

images/s : 122.6 +/- 2.5, s/batch: 0.00816 +/- 0.00017
RES, TRT-FP32, 1, 122.59, 2.50, 0.00816, 0.00017

fp16: (the poor performance is because the P4 doesn't have fast fp16 support)

images/s : 107.9 +/- 10.1, s/batch: 0.00927 +/- 0.00086
RES, TRT-FP16, 1, 107.89, 10.10, 0.00927, 0.00086

int8:

Running calibration: ok
Creating inference graph:

DefaultLogger Tensor resnet_model/Relu_47 is uniformly zero; network calibration failed.
terminate called after throwing an instance of 'std::runtime_error'
  what():  Could not find tensor InputPH_0 in tensorScales
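For context, here is a minimal sketch of the INT8 path the adapted script follows, using the TF 1.10 contrib API. The graph file and tensor names are taken from the repro command line further down; the random calibration batches are only stand-ins for real preprocessed images:

import numpy as np
import tensorflow as tf
import tensorflow.contrib.tensorrt as trt

# Load the frozen graph named in the repro command line.
frozen_graph_def = tf.GraphDef()
with tf.gfile.GFile("frozen_graph_1541777429.pb", "rb") as f:
    frozen_graph_def.ParseFromString(f.read())

# Step 1: build a calibration graph (TRT segments wrapped with calibrators).
calib_graph = trt.create_inference_graph(
    input_graph_def=frozen_graph_def,
    outputs=["my_sigmoid_tensor"],
    max_batch_size=1,
    max_workspace_size_bytes=1 << 30,
    precision_mode="INT8")

# Step 2: run representative data through the calibration graph so TensorRT
# can record the dynamic range of every intermediate tensor. Random data is
# only a placeholder; unrepresentative inputs can leave an activation
# (e.g. a ReLU output) uniformly zero, which breaks calibration.
with tf.Graph().as_default() as g:
    tf.import_graph_def(calib_graph, name="")
    inp = g.get_tensor_by_name("input_tensor:0")
    out = g.get_tensor_by_name("my_sigmoid_tensor:0")
    with tf.Session(graph=g) as sess:
        for _ in range(10):
            batch = np.random.random_sample([1, 224, 224, 3]).astype(np.float32)
            sess.run(out, feed_dict={inp: batch})

# Step 3: convert the calibrated graph to an INT8 inference graph. This is
# the call that aborts with "Could not find tensor InputPH_0 in tensorScales".
int8_graph = trt.calib_graph_to_infer_graph(calib_graph)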

Files

Include any logs, source, models (uff, pb) that would be helpful to diagnose the problem.

INFO:tensorflow:Running against TensorRT version 5.0.0
2018-11-14 00:15:55.819363: I tensorflow/core/grappler/devices.cc:51] Number of eligible GPUs (core count >= 8): 6
2018-11-14 00:15:57.247483: I tensorflow/contrib/tensorrt/convert/convert_graph.cc:756] MULTIPLE tensorrt candidate conversion: 2
2018-11-14 00:15:57.256576: I tensorflow/contrib/tensorrt/convert/convert_nodes.cc:2936] Segment @scope 'resnet_model/', converted to graph
2018-11-14 00:15:57.296673: I tensorflow/contrib/tensorrt/convert/convert_nodes.cc:2936] Segment @scope 'resnet_model/dense/', converted to graph
2018-11-14 00:15:57.313203: W tensorflow/contrib/tensorrt/convert/convert_graph.cc:724] Can't determine the device, constructing an allocator at device 0
2018-11-14 00:17:09.410657: W tensorflow/contrib/tensorrt/convert/convert_graph.cc:724] Can't determine the device, constructing an allocator at device 0
Running Calibration
INFO:tensorflow:Starting execution
2018-11-14 00:17:14.533960: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1484] Adding visible gpu devices: 0, 1, 2, 3, 4, 5
2018-11-14 00:17:14.536051: I tensorflow/core/common_runtime/gpu/gpu_device.cc:965] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-11-14 00:17:14.536208: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971]      0 1 2 3 4 5
2018-11-14 00:17:14.536271: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] 0:   N Y Y Y Y Y
2018-11-14 00:17:14.536325: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] 1:   Y N Y Y Y Y
2018-11-14 00:17:14.536377: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] 2:   Y Y N Y Y Y
2018-11-14 00:17:14.536431: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] 3:   Y Y Y N Y Y
2018-11-14 00:17:14.536484: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] 4:   Y Y Y Y N Y
2018-11-14 00:17:14.536525: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] 5:   Y Y Y Y Y N
2018-11-14 00:17:14.547597: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1097] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3803 MB memory) -> physical GPU (device: 0, name: Tesla P4, pci bus id: 0000:21:00.0, compute capability: 6.1)
2018-11-14 00:17:14.549030: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1097] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 3803 MB memory) -> physical GPU (device: 1, name: Tesla P4, pci bus id: 0000:41:00.0, compute capability: 6.1)
2018-11-14 00:17:14.550100: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1097] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:2 with 3803 MB memory) -> physical GPU (device: 2, name: Tesla P4, pci bus id: 0000:61:00.0, compute capability: 6.1)
2018-11-14 00:17:14.551248: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1097] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:3 with 3803 MB memory) -> physical GPU (device: 3, name: Tesla P4, pci bus id: 0000:81:00.0, compute capability: 6.1)
2018-11-14 00:17:14.552287: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1097] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:4 with 3803 MB memory) -> physical GPU (device: 4, name: Tesla P4, pci bus id: 0000:a1:00.0, compute capability: 6.1)
2018-11-14 00:17:14.553390: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1097] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:5 with 3803 MB memory) -> physical GPU (device: 5, name: Tesla P4, pci bus id: 0000:c1:00.0, compute capability: 6.1)
INFO:tensorflow:Starting Warmup cycle
2018-11-14 00:17:35.825911: I tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:567] Starting calibration thread on device 0, Calibration Resource @ 0x7efc44001110
2018-11-14 00:17:41.620515: I tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:567] Starting calibration thread on device 0, Calibration Resource @ 0x7efc1400e120
INFO:tensorflow:Warmup done. Starting real timing
iter  0   7.145742897987366
Comparison= False
INFO:tensorflow:Timing loop done!
Creating inference graph
2018-11-14 00:25:57.053860: I tensorflow/contrib/tensorrt/convert/convert_graph.cc:153] Starting Calib Conversion
2018-11-14 00:25:57.306475: W tensorflow/contrib/tensorrt/convert/convert_graph.cc:159] Construction of static int8 engine is not implemented yet!. Dynamic engine will be constructed
2018-11-14 00:27:35.888619: E tensorflow/contrib/tensorrt/log/trt_logger.cc:38] DefaultLogger Tensor resnet_model/Relu_47 is uniformly zero; network calibration failed.
terminate called after throwing an instance of 'std::runtime_error'
  what():  Could not find tensor InputPH_0 in tensorScales.
/home/tftrt_sample_custom.py: line 149: 606 Aborted (core dumped)

Command line to reproduce the test case:

python3 tftrt_sample_custom.py --native --FP32 --FP16 --INT8 --num_loops 10 --topN 5 --batch_size 1 --log_file log.txt --network frozen_graph_1541777429.pb --input_node input_tensor --output_nodes my_sigmoid_tensor --img_size 224 --img_file image.jpg --labellist labellist_custom.json
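For reference, a quick diagnostic sketch (not part of the sample script; it assumes resnet_model/Relu_47 is an op name in the original frozen graph and elides the real preprocessing for image.jpg) that checks whether that activation really is uniformly zero on the calibration input:

import numpy as np
import tensorflow as tf

# Load the original (unconverted) frozen graph.
graph_def = tf.GraphDef()
with tf.gfile.GFile("frozen_graph_1541777429.pb", "rb") as f:
    graph_def.ParseFromString(f.read())

with tf.Graph().as_default() as g:
    tf.import_graph_def(graph_def, name="")
    inp = g.get_tensor_by_name("input_tensor:0")
    # Op name taken from the calibration error message; this assumes the
    # same name exists in the original graph.
    relu = g.get_tensor_by_name("resnet_model/Relu_47:0")
    with tf.Session(graph=g) as sess:
        # Placeholder input; substitute the real preprocessing for image.jpg.
        img = np.random.random_sample([1, 224, 224, 3]).astype(np.float32)
        act = sess.run(relu, feed_dict={inp: img})
        # A max of 0.0 here would reproduce the "uniformly zero" complaint.
        print("max activation:", float(act.max()))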

Any recommendations?

Hello,

Can you share a repro containing tftrt_sample_custom.py, frozen_graph_1541777429.pb, the input images, and the labellist that exhibits the symptoms you are seeing?

Hi NVES, I have sent you the link with the files. Thanks!

Hello,

the repro is missing chexnet_frozen_graph_1541777429.pb.

Hi NVES, I have sent you the link to download it; please let me know if you get it. Thanks!

Got it and repro’d it locally. Triaging and will keep you updated.

Thanks! Please let me know if you need the source files used to train the model and generate the frozen graph.

Hello,

Our engineers have committed a fix for this, and it should be available in a future TensorRT version. I apologize for the inconvenience, and I'm sorry that I cannot share more information about future releases here.
Please watch for the release announcement here.