Output changes for the same input when the neural net is run several times in a row

The problem occurs even with a simple 4-layer residual network with a single output. Given the same UFF file, the part of the C++ code that parses it and builds the network seems to produce a different result every time the program is run.
Once the network is built, the code runs fine: I've added a loop around execute to show that, after the UFF file has been loaded, the outputs stay the same from one call to the next.
So I suspect that something is wrong with the UFF file, and that it causes the parser to parse it differently on every run.
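Since the difference shows up between separate runs of the program rather than inside the execute loop, one way to check it is to dump the output buffer of each run to disk and compare the files afterwards. Below is a minimal sketch of that idea. It assumes the network output has already been copied back to the host into a `std::vector<float>`; the helper names `saveOutputs` and `loadOutputs` and the file path are made up for illustration and are not part of TensorRT or of the attached code.

```cpp
#include <cstddef>
#include <fstream>
#include <ios>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: write the raw float outputs of one run to a binary file.
void saveOutputs(const std::string& path, const std::vector<float>& out) {
    std::ofstream f(path, std::ios::binary);
    if (!f) throw std::runtime_error("cannot open " + path + " for writing");
    f.write(reinterpret_cast<const char*>(out.data()),
            static_cast<std::streamsize>(out.size() * sizeof(float)));
    if (!f) throw std::runtime_error("write failed: " + path);
}

// Hypothetical helper: read back the outputs saved by an earlier run.
std::vector<float> loadOutputs(const std::string& path) {
    std::ifstream f(path, std::ios::binary | std::ios::ate);
    if (!f) throw std::runtime_error("cannot open " + path + " for reading");
    const std::streamoff bytes = f.tellg();  // opened at end, so this is the file size
    if (bytes < 0 || static_cast<std::size_t>(bytes) % sizeof(float) != 0)
        throw std::runtime_error("unexpected file size: " + path);
    std::vector<float> out(static_cast<std::size_t>(bytes) / sizeof(float));
    f.seekg(0);
    f.read(reinterpret_cast<char*>(out.data()), static_cast<std::streamsize>(bytes));
    if (!f) throw std::runtime_error("read failed: " + path);
    return out;
}
```

Each run would call `saveOutputs` with a different file name after the first execute, and a later run (or a small separate tool) would call `loadOutputs` on two of those files to see whether the values match.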

Please help me with this.
Can you reproduce the error?
When the program is run 10 times in a row, the outputs are not the same at the 4th decimal place, and a difference at the 4th decimal place cannot be due to rounding errors.
Is anyone familiar with this error? Are there any solutions?
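To put a number on "not the same at the 4th decimal place", the outputs of two runs can be compared element by element. This is only a sketch: `maxAbsDiff` and `outputsMatch` are hypothetical names, and the default tolerance of `1e-5` is an arbitrary choice for illustration, not a value taken from TensorRT.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Largest element-wise absolute difference between the outputs of two runs.
float maxAbsDiff(const std::vector<float>& a, const std::vector<float>& b) {
    if (a.size() != b.size())
        throw std::invalid_argument("output sizes differ");
    float worst = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i)
        worst = std::max(worst, std::fabs(a[i] - b[i]));
    return worst;
}

// True when the two runs agree within tol (assumed tolerance, adjust as needed).
bool outputsMatch(const std::vector<float>& a, const std::vector<float>& b,
                  float tol = 1e-5f) {
    return maxAbsDiff(a, b) <= tol;
}
```

Printing `maxAbsDiff` for each pair of runs would show how large the discrepancy actually is and whether it grows as more layers are added.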

The problem gets worse as we add more layers.

The required C++ and UFF files are here: https://github.com/hdpoorna/nv_forum
They are also attached.
Uff_mcts.cpp (8.93 KB)
files.zip (215 KB)

System

Ubuntu 16.04
NVIDIA GeForce GTX 1080 Ti
Driver 384
CUDA 9.0.176
cuDNN 7.1.4
Python 3.5
TensorFlow 1.8
TensorRT 4

This post is a duplicate of https://devtalk.nvidia.com/default/topic/1038634/tensorrt/output-changes-for-the-same-input-when-the-neural-net-has-been-run-for-several-times