I’m using TensorRT’s FP16 precision mode to optimize my deep learning model, and I run the optimized model on a Jetson TX2. While testing, I observed that the TensorRT inference engine is not deterministic: for the same input images, my optimized model yields FPS values ranging from 40 to 120.
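For reference, here is roughly how I measure FPS (a simplified sketch; `infer` stands in for my actual TensorRT execution call, which I assume blocks until the GPU work is finished, e.g. via a stream synchronize):

```python
import statistics
import time

def measure_fps(infer, inputs, warmup=10):
    """Time each inference call and report (min, median, max) per-frame FPS.

    `infer` is a placeholder for the real TensorRT execution call;
    it must not return until the result is actually ready, otherwise
    the timings are meaningless.
    """
    # Warm-up runs: the first calls often include lazy initialization
    # and GPU clock ramp-up, which would skew the measurement.
    for x in inputs[:warmup]:
        infer(x)

    fps_values = []
    for x in inputs:
        start = time.perf_counter()
        infer(x)  # must block until the output is available
        elapsed = time.perf_counter() - start
        fps_values.append(1.0 / elapsed)

    return min(fps_values), statistics.median(fps_values), max(fps_values)

# Dummy stand-in for a real engine call, used here only for illustration.
def fake_infer(x):
    time.sleep(0.001)

lo, mid, hi = measure_fps(fake_infer, list(range(50)))
```

Even with warm-up runs, I still see the large spread described above.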
I started to suspect that the source of the non-determinism is floating-point operations after seeing this comment about CUDA:
Does the precision mode (FP16, FP32, or INT8) affect the determinism of TensorRT on the Jetson TX2? Or could something else be responsible?
Do you have any thoughts?