Trained model gives slightly different values when tested on P100 and V100. Is there a way to make it consistent?

Hi all,
I trained a model on a V100 GPU, but my inference device is a P100. When I run it on the test dataset, the two GPUs produce slightly different values, which reduces my overall score on the P100. I tried computing the difference between the two sets of values and using its mean to scale my P100 predictions; this helped a little, but it does not solve the problem.
I am using PyTorch 1.8.0 for this work. Is there a better way to address this difference in results caused by the hardware mismatch between the training and inference environments?

I am using the native PyTorch stack for inference; I am not using any optimization such as TensorRT.
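One thing worth checking is whether nondeterministic kernels are contributing to the drift. PyTorch 1.8 exposes flags that force deterministic algorithms and disable cuDNN autotuning. A caveat: these flags make runs reproducible on the *same* device, but exact bitwise agreement across different GPU architectures (P100 vs. V100) is generally not guaranteed, because floating-point reduction orders differ per architecture. A minimal sketch (using a stand-in `nn.Linear` in place of your model):

```python
import torch
import torch.nn as nn

# Reduce nondeterminism at inference time (PyTorch 1.8+).
# Note: for CUDA, use_deterministic_algorithms may also require setting
# the CUBLAS_WORKSPACE_CONFIG environment variable (see the PyTorch
# reproducibility notes).
torch.manual_seed(0)
torch.backends.cudnn.benchmark = False      # disable autotuned kernel selection
torch.backends.cudnn.deterministic = True   # force deterministic cuDNN kernels
torch.use_deterministic_algorithms(True)    # raise an error on nondeterministic ops

model = nn.Linear(4, 2)  # stand-in for your trained model
model.eval()

x = torch.randn(1, 4)
with torch.no_grad():
    out1 = model(x)
    out2 = model(x)

# With these settings, repeated runs on the same device match exactly.
print(torch.equal(out1, out2))
```

Even with this in place, some per-architecture numerical difference can remain; the usual advice is to make the model robust to small output perturbations (e.g. avoid hard thresholds tuned to one device) rather than trying to force bit-exact outputs across GPUs.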

Please see my post about TensorRT.