Long Warm-Up Time of GPU


I am working with a Jetson Xavier NX. In a real-time application that runs a neural network on the GPU through the official Torch C++ library (the model is loaded via TorchScript), the first few forward passes take multiple seconds instead of the usual ~50 ms. Only after about 5-6 passes does the expected runtime of 50 ms settle in.

Is there a way to speed this process up, or to do it only once? At the moment I have to go through the warm-up again every time I restart the application.
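A common mitigation is to pay the warm-up cost once at startup, before the real-time loop begins, by running a few dummy forward passes right after loading the model. Below is a minimal sketch; the model path `model.pt` and the `1x3x224x224` input shape are placeholders and must match your actual model:

```cpp
#include <torch/script.h>
#include <torch/cuda.h>

int main() {
  // Load the TorchScript model directly onto the GPU.
  torch::jit::script::Module module =
      torch::jit::load("model.pt", torch::kCUDA);  // placeholder path
  module.eval();

  // Warm-up: run a few dummy passes so one-time costs (CUDA context
  // creation, allocator pool growth, cuDNN algorithm selection) are
  // paid here rather than on the first real frames.
  torch::NoGradGuard no_grad;
  auto dummy = torch::randn({1, 3, 224, 224},  // placeholder shape
                            torch::TensorOptions().device(torch::kCUDA));
  for (int i = 0; i < 6; ++i) {
    module.forward({dummy});
  }
  torch::cuda::synchronize();  // wait until all warm-up work has finished

  // ... real-time inference loop starts here ...
  return 0;
}
```

This does not shorten the warm-up itself, but it moves it out of the latency-critical path.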

Thank you very much!


Could you use a profiler to check which jobs occupy the GPU during the initial iterations?

For example, if JIT compilation of CUDA kernels is the cause, you can build the app for the correct GPU architecture to avoid it.
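As a sketch of the two suggestions above: Nsight Systems can show what runs on the GPU during the first iterations, and building explicitly for the Xavier NX's compute capability (7.2) avoids PTX JIT compilation at startup. The binary name `my_app` is a placeholder:

```shell
# Profile the first iterations to see what occupies the GPU
# (Nsight Systems ships with JetPack).
nsys profile --stats=true ./my_app

# If kernels are being JIT-compiled from PTX at launch, compile
# for the Xavier NX's architecture (sm_72) explicitly:
nvcc -gencode arch=compute_72,code=sm_72 ...

# When building PyTorch / LibTorch extensions, the equivalent is:
export TORCH_CUDA_ARCH_LIST="7.2"
```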


This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.