I recently upgraded from TensorRT 6 to TensorRT 7 and noticed that the output tensor data produced from the same input data were slightly off (about a 2% difference from the TensorRT 6 results and 4% off from the expected values). After enabling the debug logger in TensorRT 7 and comparing against TensorRT 6, I found that the primary difference was that two of my Caffe model layers (a convolution and an eltwise) were being fused together during the optimisation phase.
I therefore explicitly marked this convolution layer's output for capture, which made my TensorRT network skip the fusion and gave me results similar to TensorRT 6.
I am confused as to why fusing two layers would yield such different results.
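For what it's worth, one plausible explanation (an assumption on my part, not confirmed behaviour of the fused kernel) is that a fused conv+eltwise kernel may accumulate partial sums in a different order or intermediate precision than the two separate kernels, and floating-point arithmetic is not associative, so the results can drift. A minimal float32 illustration:

```python
import numpy as np

# Floating-point addition is not associative: regrouping the same three
# values changes the float32 result. A kernel that reorders accumulation
# can therefore diverge slightly from an unfused computation.
a = np.float32(1e8)
b = np.float32(-1e8)
c = np.float32(1.0)

left = (a + b) + c   # -> 1.0
right = a + (b + c)  # -> 0.0 (b + c rounds back to -1e8 in float32)
print(left, right)
```

Differences of this kind are normally tiny, though, so a 2% deviation may also involve a different tactic or precision being selected for the fused kernel.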
Environment: TensorRT 7, CUDA 10.2, cuDNN 220.127.116.11-1, Ubuntu 18.04 64-bit
Perfect, I will wait for TRT 7.1 on Ubuntu and test it. I will post the logs afterwards if I still experience similar issues. For now I can at least bypass the fusion by marking the output being fused for capture.
Do you have a rough idea of when 7.1 EA will be available?