How does TensorRT utilize GPU resources?

If I have 1 instance of a YOLO model (FP16) using ~2 GB of VRAM and only 50% of the GPU while running at 40 FPS, then logically I should be able to run a second instance, with both running at 40 FPS, 100% GPU utilization, and ~4 GB of VRAM in total. However, this does not seem to be the case in my testing so far, so I would like to know more about how exactly GPU resources are used when running deep learning models like YOLO.
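For reference, the linear-scaling assumption I'm making can be written out as a quick back-of-the-envelope model (the numbers are my single-instance measurements; this is just the naive mental model, not a claim about how the GPU actually schedules work):

```python
def naive_scaling_estimate(single_fps, single_util_pct, n_instances):
    """Naive model: each instance keeps its full FPS as long as the
    summed utilization stays at or below 100%; beyond that, the
    instances are assumed to split the GPU evenly."""
    total_util = single_util_pct * n_instances
    if total_util <= 100:
        return single_fps * n_instances
    # past 100%, aggregate throughput is capped by the GPU
    return single_fps * (100 / single_util_pct)

# My measured numbers: one instance at 40 FPS, ~50% GPU utilization.
print(naive_scaling_estimate(40, 50, 1))  # 40  (measured)
print(naive_scaling_estimate(40, 50, 2))  # 80  (expected aggregate, but not what I observe)
```

In other words, under this model two instances should give 80 FPS aggregate (40 FPS each), which is not what I see in practice.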