Same model, same TensorRT version, on Windows 10:
350 MB of GPU memory is required on a GTX 1060,
700 MB of GPU memory is required on an RTX 2070,
1 GB of GPU memory is required on an RTX 3060.
My guess is that the GPU architecture has a large impact on GPU memory consumption with TensorRT. Is that correct?
Environment
TensorRT Version: 7.2.3.4
GPU Type: GTX 1060 / RTX 2070 / RTX 3060
Nvidia Driver Version: 456.71
CUDA Version: 11.1
CUDNN Version: 8.1
Operating System + Version: win10
Python Version (if applicable):
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):
You can refer to the link below for the full list of supported operators; if an operator is not supported, you need to create a custom plugin for that operation.
Also, please share your model and script if you have not already, so that we can help you better.
It is expected that TensorRT's GPU memory utilization varies across different GPU architectures.
The CUDA compute capability differs between architectures, and newer architectures add units such as Tensor Cores, which let us develop kernels that use more memory to speed up your network.
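If you want to compare the footprint across your three GPUs, one way is to check free device memory before and after deserializing the engine on each card. Below is a minimal Python sketch of that idea, assuming the TensorRT Python bindings and pycuda are installed; the engine path "model.engine" is only a placeholder.

```python
import tensorrt as trt
import pycuda.driver as cuda
import pycuda.autoinit  # creates a CUDA context on the default GPU

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

def engine_memory_footprint(engine_path):
    # Free device memory before loading the engine
    free_before, total = cuda.mem_get_info()

    with open(engine_path, "rb") as f, trt.Runtime(TRT_LOGGER) as runtime:
        engine = runtime.deserialize_cuda_engine(f.read())
        context = engine.create_execution_context()  # allocates per-context device memory

        # Free device memory while the engine and execution context exist
        free_after, _ = cuda.mem_get_info()
        print(f"Engine + context footprint: {(free_before - free_after) / 1024**2:.0f} MB "
              f"(total device memory: {total / 1024**2:.0f} MB)")

if __name__ == "__main__":
    engine_memory_footprint("model.engine")  # placeholder path
```

Running the same script on each GPU (with the engine built on that GPU) should show the per-architecture difference you are seeing.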
Hi @spolisetty, can we limit the workspace size during inference? For the same model, the memory consumed when loading the engine built for each architecture differs across GPUs.
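Not an official answer, but as far as I know the workspace limit is set at build time through IBuilderConfig, not at inference time. A rough Python sketch of capping it when building from ONNX with TensorRT 7.x (the file path and the 256 MB limit are only examples):

```python
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

def build_engine(onnx_path, workspace_mb=256):
    """Build an engine with a capped workspace; tactics needing more scratch memory are skipped."""
    explicit_batch = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    with trt.Builder(TRT_LOGGER) as builder, \
         builder.create_network(explicit_batch) as network, \
         trt.OnnxParser(network, TRT_LOGGER) as parser, \
         builder.create_builder_config() as config:

        # Cap the scratch memory TensorRT may use for layer tactics (TensorRT 7.x API)
        config.max_workspace_size = workspace_mb * (1 << 20)

        with open(onnx_path, "rb") as f:
            if not parser.parse(f.read()):
                for i in range(parser.num_errors):
                    print(parser.get_error(i))
                return None

        return builder.build_engine(network, config)

engine = build_engine("model.onnx")  # placeholder path
```

Note that the workspace limit only bounds the temporary scratch memory used by layer tactics; the persistent memory for weights and activations still depends on the model and the GPU, which may be part of the per-architecture difference you are measuring.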