I’m converting a TensorFlow graph to a TensorRT engine. For the same input, the TensorFlow graph and the TensorRT engine produce identical results up to a tf.nn.sigmoid op. But the output of the sigmoid op differs slightly between the TF graph and the TensorRT engine. I wonder if this is to be expected.
For example, given an input:
[8.879764 -8.724520 -10.623482 -11.822342 -12.868923 -11.805139 -13.092369 -11.573037 -11.112819 -11.025951]
I think this is normal (unless you have disabled reduced precision), as one of the optimizations TensorRT performs is ‘FP16 and INT8 reduced precision calibration’, and hence you see truncated, less precise results with TensorRT than with TF, which uses full precision (FP32).
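To illustrate the kind of difference reduced precision can introduce, here is a minimal numpy sketch (purely illustrative, not what TensorRT actually executes): it computes the sigmoid in FP32 and round-trips the result through FP16, showing the truncation you would expect from reduced precision.

```python
import numpy as np

# Illustrative sketch only: round-tripping FP32 sigmoid outputs through FP16
# shows the kind of truncation that reduced precision introduces. This is not
# the actual TensorRT kernel behavior.
x = np.array([8.879764, -8.724520, -10.623482, -11.822342], dtype=np.float32)

sig_fp32 = 1.0 / (1.0 + np.exp(-x))                         # full-precision reference
sig_fp16 = sig_fp32.astype(np.float16).astype(np.float32)   # FP16 truncation

print(sig_fp32)
print(sig_fp16)
print(np.abs(sig_fp32 - sig_fp16))                          # small per-element drift
```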
But in this case, I’m not using reduced precision. I’m creating the TensorRT engine using the buildCudaEngine() function in the C++ API. If I’m not mistaken, that builds the TensorRT engine in 32-bit precision.
I see; in that case, it does look abnormal. Maybe you can query the builder to confirm that the engine is built in FP32. Otherwise, I think someone from NVIDIA might have more information for you.
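For reference, here is a rough sketch of that check. The original poster is using the C++ API (buildCudaEngine()), so this uses the TensorRT Python API of the same era purely for illustration; fp16_mode and int8_mode default to off, so unless they were explicitly enabled the engine should be built in FP32.

```python
import tensorrt as trt

# Hedged sketch: querying the builder's precision flags before building.
# (The thread uses the C++ API; this is the Python-API analogue for illustration.)
logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)

print("Platform has fast FP16:", builder.platform_has_fast_fp16)
print("FP16 mode enabled:     ", builder.fp16_mode)   # expect False for an FP32 engine
print("INT8 mode enabled:     ", builder.int8_mode)   # expect False for an FP32 engine

# After populating a network definition `network`, the engine would then be
# built with: engine = builder.build_cuda_engine(network)
```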
We have tested the tf.nn.sigmoid() op with TensorRT but cannot reproduce this issue.
Please remember to convert the input to np.float32 to preserve full precision.
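For reference, here is a minimal sketch of such an isolated test (not the exact sample code from this thread, just a reconstruction of the idea): feed the reported input through tf.nn.sigmoid as np.float32 and compare the printed values against the TensorRT engine’s output for the same tensor. It is written in TF 1.x graph style, since the issue concerns a frozen-graph conversion.

```python
import numpy as np
import tensorflow as tf

# Hedged sketch of an isolated sigmoid test: run the reported input through
# tf.nn.sigmoid in FP32 so the result can be compared element-wise against
# the TensorRT engine's output for the same tensor.
x_np = np.array([8.879764, -8.724520, -10.623482, -11.822342, -12.868923,
                 -11.805139, -13.092369, -11.573037, -11.112819, -11.025951],
                dtype=np.float32)  # np.float32 keeps the TF reference in full FP32

x = tf.placeholder(tf.float32, shape=[10])
y = tf.nn.sigmoid(x)

with tf.Session() as sess:
    tf_out = sess.run(y, feed_dict={x: x_np})

print(tf_out)
# Compare against the TensorRT engine output for the same input, e.g.
# (trt_out is a hypothetical array holding the engine's result):
# np.testing.assert_allclose(tf_out, trt_out, rtol=0, atol=0)
```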
Thank you for looking into this and providing the sample code. I can confirm that I get identical results in the isolated test case you provided. However, when the sigmoid layer is part of a larger network, I still see a difference between TensorFlow and TensorRT, even though the output of the preceding conv layer is identical between the two. Let me see if I can extract a test case that exhibits the problem.
It would really help if you could extract a simple test case that reproduces the issue you are seeing.
We will wait for your update and discuss with our internal team for further suggestions.