I am working with a Jetson Xavier NX. Whenever I launch a neural network in a real-time application, using the official Torch C++ library on the GPU with a model loaded via TorchScript, the first few forward passes take several seconds instead of about 50 ms. Only after about 5-6 passes does the runtime settle at the expected 50 ms.
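For reference, here is a minimal sketch of how per-pass timings like the ones above can be collected. It is not my actual application code: `time_passes` and `first_pass_within` are hypothetical helper names, and `forward` is a stand-in callable for one model pass, so the snippet depends only on the standard library.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <vector>

// Calls `forward` n_passes times and returns each call's wall-clock
// latency in milliseconds. `forward` is a hypothetical stand-in for
// one model pass; it must block until the pass has really finished.
std::vector<double> time_passes(const std::function<void()>& forward, int n_passes) {
    assert(n_passes >= 0);
    std::vector<double> latencies_ms;
    latencies_ms.reserve(static_cast<std::size_t>(n_passes));
    for (int i = 0; i < n_passes; ++i) {
        const auto start = std::chrono::steady_clock::now();
        forward();
        const auto stop = std::chrono::steady_clock::now();
        latencies_ms.push_back(
            std::chrono::duration<double, std::milli>(stop - start).count());
    }
    return latencies_ms;
}

// Returns the 0-based index of the first pass whose latency is at or
// below threshold_ms, or -1 if no pass was that fast.
int first_pass_within(const std::vector<double>& latencies_ms, double threshold_ms) {
    for (std::size_t i = 0; i < latencies_ms.size(); ++i) {
        if (latencies_ms[i] <= threshold_ms) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

In the real application the callable would wrap the TorchScript module's `forward` call followed by a GPU synchronization (for example `torch::cuda::synchronize()`), since CUDA work is launched asynchronously and the timer would otherwise stop too early.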
Is there a way to speed up this warm-up phase, or to perform it only once and reuse the result across restarts? Currently, every time I restart the application I have to go through it again.
Thank you very much!