We are running several AI algorithms on a single NVIDIA GPU (a PC with a GeForce card, or an NVIDIA Jetson AGX) and we would like to improve runtime performance.
We have a few questions about this.
Q1. Does the GPU keep the memory of all the AI programs resident in GPU memory at all times, or does it reload it whenever we switch from running one AI algorithm to another?
Q2. How can we tell which takes more time: running the AI algorithm itself, or transferring data to/from GPU memory?
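Regarding Q2, one common way to separate compute time from transfer time is to bracket each phase with CUDA events. Below is a minimal sketch in plain CUDA; `dummyKernel` is a hypothetical stand-in for an inference call, not part of any real AI workload. A profiler such as NVIDIA Nsight Systems can give a similar breakdown without code changes.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical placeholder for the real AI computation.
__global__ void dummyKernel(float* data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = data[i] * 2.0f + 1.0f;
}

int main() {
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);
    float* hostBuf = (float*)malloc(bytes);
    float* devBuf = nullptr;
    cudaMalloc(&devBuf, bytes);

    // Events mark the boundaries between copy-in, compute, and copy-out.
    cudaEvent_t start, afterH2D, afterKernel, afterD2H;
    cudaEventCreate(&start);
    cudaEventCreate(&afterH2D);
    cudaEventCreate(&afterKernel);
    cudaEventCreate(&afterD2H);

    cudaEventRecord(start);
    cudaMemcpy(devBuf, hostBuf, bytes, cudaMemcpyHostToDevice);
    cudaEventRecord(afterH2D);
    dummyKernel<<<(n + 255) / 256, 256>>>(devBuf, n);
    cudaEventRecord(afterKernel);
    cudaMemcpy(hostBuf, devBuf, bytes, cudaMemcpyDeviceToHost);
    cudaEventRecord(afterD2H);
    cudaEventSynchronize(afterD2H);  // wait until all phases finish

    float h2dMs = 0, kernelMs = 0, d2hMs = 0;
    cudaEventElapsedTime(&h2dMs, start, afterH2D);
    cudaEventElapsedTime(&kernelMs, afterH2D, afterKernel);
    cudaEventElapsedTime(&d2hMs, afterKernel, afterD2H);
    printf("H2D copy: %.3f ms, kernel: %.3f ms, D2H copy: %.3f ms\n",
           h2dMs, kernelMs, d2hMs);

    cudaFree(devBuf);
    free(hostBuf);
    return 0;
}
```

Comparing the three timings shows whether the workload is transfer-bound or compute-bound.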
Q3. Are there known strategies for improving runtime performance when running several TensorRT-based AI algorithms together?