Simple question: On a single NVIDIA GPU, how many different trained networks can I run for inference?
For example, I would like to perform several different kinds of object detection, so I train several DNNs, one per kind of detection. Using TensorRT, I would then build an inference engine for each network and load all of these engines onto the GPU. At runtime, I would submit every input image to all of the inference engines.
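To make the intended fan-out pattern concrete, here is a minimal Python sketch. The `Engine` class is a hypothetical stand-in for a deserialized TensorRT engine plus its execution context, not the real TensorRT API; the point is only the dispatch structure (one image, N engines).

```python
class Engine:
    """Hypothetical stand-in for a TensorRT engine + execution context."""

    def __init__(self, name):
        self.name = name

    def infer(self, image):
        # Real code would copy `image` to device memory and execute the
        # engine's context; here we just tag the input with the engine name.
        return (self.name, len(image))


def detect_all(engines, image):
    """Submit the same input image to every loaded engine."""
    return [engine.infer(image) for engine in engines]


# One engine per kind of object detection, as described above.
engines = [Engine(n) for n in ("cars", "faces", "signs")]
results = detect_all(engines, [0] * 100)  # dummy "image" of 100 pixels
```

In real TensorRT code, each engine would typically get its own execution context and CUDA stream so the inferences can overlap on the GPU rather than run strictly one after another.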
I assume I can do all of this with one GPU. Roughly how many such networks can I run with TensorRT on a single GPU? I expect this differs between GPUs, so let’s say a Volta V100.