High Latency Variance During Inference

Yes, I already posted my problem there last week (High Latency Variance During Inference - deployment - PyTorch Forums). Since I found out that the problem also exists using ONNX Runtime, I figured it might not be related to PyTorch at all and decided to post here. I also stumbled across this post (Inconsistent kernel execution times, and affected by Nsight Systems), which sounds similar.
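
For what it's worth, here is a minimal sketch of how such latency variance can be measured with ONNX Runtime, independent of PyTorch. The model path, input shape, and execution provider are placeholders and need to be adapted to the actual setup:

```python
# Minimal sketch: measure per-inference latency variance with ONNX Runtime.
# "model.onnx" and the input shape are placeholders for the real model.
import time
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("model.onnx", providers=["CUDAExecutionProvider"])
inp = sess.get_inputs()[0]
x = np.random.rand(1, 3, 224, 224).astype(np.float32)  # adjust to the model's input

# Warm up so one-time initialization cost doesn't skew the statistics.
for _ in range(50):
    sess.run(None, {inp.name: x})

latencies = []
for _ in range(1000):
    t0 = time.perf_counter()
    sess.run(None, {inp.name: x})
    latencies.append((time.perf_counter() - t0) * 1e3)  # milliseconds

lat = np.array(latencies)
print(f"mean={lat.mean():.2f} ms  std={lat.std():.2f} ms  "
      f"p50={np.percentile(lat, 50):.2f} ms  p99={np.percentile(lat, 99):.2f} ms")
```

A large gap between the p50 and p99 latencies from a loop like this is what suggests the variance comes from below the framework layer rather than from PyTorch itself.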