Output is not stable

m18637959467 · March 3, 2020, 2:04am

TensorRT 6, NVIDIA T4
When GPU-Util is high, the inference output is not stable. Why?

SunilJB · March 3, 2020, 4:30am

Hi,

Can you provide more details about the issue along with the following information so we can better help?
Provide details on the platforms you are using:
o Linux distro and version
o GPU type
o NVIDIA driver version
o CUDA version
o cuDNN version
o Python version [if using Python]
o TensorFlow version
o TensorRT version

Also, if possible please share the repro script and model file to reproduce the issue.

Thanks

m18637959467 · March 20, 2020, 7:33am

CentOS Linux release 7.4.1708 (Core)
NVIDIA-SMI 418.87.00 Driver Version: 418.87.00 CUDA Version: 10.1
libcudnn.so.7.6.3
TensorRT-6.0.1.5

SunilJB · March 20, 2020, 8:12am

Hi,
How large is the difference in output results?
Can you try running the code on the latest TRT 7?
Also, could you please share the repro script & model file to debug this issue?

Thanks

m18637959467 · March 21, 2020, 6:11am

Sorry, I can't share the model and script. The differences:

With the FP16 model:
0.0527116694 vs 0.0525396504

With the FP32 model:
-0.0244360045 vs -0.0244359840

TRT 7 is not currently available in our product.

SunilJB · March 23, 2020, 6:37am

Hi,

The difference in the output is very small:
FP16: about 0.0002, which I think is acceptable.
FP32: about 0.00000002, which is quite small.
Different kernels may perform the calculation differently, which can cause small differences like these. This can happen when GPU utilization is high: the load may affect kernel timing and lead to different kernels being chosen.
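
To put these numbers in context, here is a minimal Python sketch that computes the absolute and relative differences for the values quoted in this thread. The helper name `diff_report` and the tolerance constants are hypothetical, chosen only for illustration; suitable tolerances depend on the model and the application.

```python
import math


def diff_report(a, b):
    """Return (absolute difference, relative difference) of two outputs."""
    abs_diff = abs(a - b)
    scale = max(abs(a), abs(b))
    # Guard against division by zero when both values are exactly 0.
    rel_diff = abs_diff / scale if scale > 0.0 else 0.0
    return abs_diff, rel_diff


# Output pairs quoted earlier in this thread.
fp16_abs, fp16_rel = diff_report(0.0527116694, 0.0525396504)
fp32_abs, fp32_rel = diff_report(-0.0244360045, -0.0244359840)

print(f"FP16: abs={fp16_abs:.3e}  rel={fp16_rel:.3e}")
print(f"FP32: abs={fp32_abs:.3e}  rel={fp32_rel:.3e}")

# Hypothetical absolute tolerances (assumptions, not NVIDIA guidance).
FP16_ABS_TOL = 1e-3
FP32_ABS_TOL = 1e-6

fp16_ok = math.isclose(0.0527116694, 0.0525396504,
                       rel_tol=0.0, abs_tol=FP16_ABS_TOL)
fp32_ok = math.isclose(-0.0244360045, -0.0244359840,
                       rel_tol=0.0, abs_tol=FP32_ABS_TOL)
print("FP16 within tolerance:", fp16_ok)
print("FP32 within tolerance:", fp32_ok)
```

Comparing against a tolerance rather than testing for exact equality is the usual way to check results that can vary slightly between runs or kernels.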

Thanks

m18637959467 · March 23, 2020, 8:04am

Thank you very much.