The same model consumes different amounts of GPU memory on different GPUs


Same model, same TensorRT version, on Windows 10:
350 MB of GPU memory is required with a GTX 1060,
700 MB of GPU memory is required with an RTX 2070,
1 GB of GPU memory is required with an RTX 3060.

My guess is that the GPU architecture has a great impact on TensorRT's GPU memory consumption. Is that correct?


TensorRT Version:
GPU Type: GTX 1060, RTX 2070, RTX 3060
Nvidia Driver Version: 456.71
CUDA Version: 11.1
CUDNN Version: 8.1
Operating System + Version: win10
Python Version (if applicable):
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):

Can you try running your model with the trtexec command and share the --verbose log in case the issue persists?
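For illustration, a typical invocation might look like this (the ONNX model path is a placeholder; trtexec must be on your PATH and requires a TensorRT installation and an NVIDIA GPU):

```shell
# Build and profile the engine, capturing the full verbose log to a file.
# Replace model.onnx with the actual model file.
trtexec --onnx=model.onnx --verbose > trtexec_verbose.log 2>&1
```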

You can refer to the link below for the list of all supported operators. If any operator is not supported, you need to create a custom plugin to support that operation.

Also, please share your model and script, if not shared already, so that we can help you better.


1060_trtexec.txt (13.1 KB)
2070_trtexec.txt (16.3 KB)

Same model, same software, run by trtexec:
414 MB of GPU memory is required with the GTX 1060,
788 MB of GPU memory is required with the RTX 2070.

@NVES Hi, I have provided the info.


  • It is expected that TensorRT GPU memory utilization varies across different GPU architectures.
  • CUDA compute capability differs between GPU architectures. Newer architectures also support new units such as Tensor Cores, which allow us to develop kernels that use more memory to speed up your network.

Thank you.


If I want less memory consumption, and lower speed is acceptable, can I control this?


We can restrict or extend memory consumption using the trtexec --workspace flag.
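As an example of capping the builder workspace at engine-build time (the model path and the 256 MiB size are placeholders; a smaller workspace may exclude memory-hungry tactics, trading speed for lower GPU memory use):

```shell
# Build the engine with the builder workspace limited to 256 MiB.
# Replace model.onnx with the actual model file.
trtexec --onnx=model.onnx --workspace=256
```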

Thank you.

Hi @spolisetty, can the workspace size be limited at inference time? The amount of GPU memory loaded for the same model's engine differs across GPU architectures.


Hope the following samples may help you.
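As a rough, unofficial sketch (not the linked samples) of how the workspace cap is set when building an engine through the TensorRT 7.x-era Python API; the model path and the 256 MiB limit are placeholder assumptions, and this requires a TensorRT installation with an NVIDIA GPU:

```python
import tensorrt as trt

# Create the builder and an explicit-batch network definition.
logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))

# Parse the ONNX model (placeholder path).
parser = trt.OnnxParser(network, logger)
with open("model.onnx", "rb") as f:
    parser.parse(f.read())

# Cap the builder workspace; tactics needing more scratch memory are skipped.
config = builder.create_builder_config()
config.max_workspace_size = 256 << 20  # 256 MiB

engine = builder.build_engine(network, config)
```

The workspace limit constrains the scratch memory available during tactic selection at build time; the engine's own activation and weight memory at inference is determined by the network itself and is not directly controlled by this flag.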

Thank you.