TensorRT Memory Utilization

Description

I am using TensorRT on an NVIDIA 1080 GPU and am trying to find out how much GPU memory the inference engine uses.
I followed the approach described in https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#faq, which roughly says the memory usage is: serialized engine size + 2 * binding sizes + device engine memory size. However, the GPU memory actually allocated by the whole process is quite different.
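For reference, the estimate described above can be written out as a small calculation. This is only a sketch of the arithmetic as summarized in this post: the function name, the parameter names, and the example byte counts are hypothetical, not part of the TensorRT API.

```python
def estimate_engine_memory(serialized_engine_bytes, bindings_bytes, device_memory_bytes):
    """Estimate GPU memory use per the formula quoted above.

    serialized_engine_bytes: size of the serialized engine
    bindings_bytes: combined size of the input/output binding buffers
    device_memory_bytes: device memory size reported for the engine
    All arguments are in bytes; all names here are illustrative.
    """
    return serialized_engine_bytes + 2 * bindings_bytes + device_memory_bytes


# Hypothetical sizes, only to show the calculation.
estimate = estimate_engine_memory(
    serialized_engine_bytes=50 * 1024**2,
    bindings_bytes=8 * 1024**2,
    device_memory_bytes=120 * 1024**2,
)
print(f"estimated usage: {estimate / 1024**2:.0f} MiB")
```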

The memory consumption reported by nvidia-smi (nvidia-smi --query-compute-apps=pid,used_memory --format=csv) does not match this estimate. To be exact, it is 270 MB greater, and the difference is the same across different engines.
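To compare the two numbers repeatedly, the CSV output of that command can be parsed per process. The sketch below assumes the output has one header line followed by rows of the form "pid, N MiB"; the helper name and the sample text are illustrative, not captured output.

```python
import csv
import io


def parse_compute_apps(csv_text):
    """Map each PID to its reported GPU memory, taken from the first number in the second column."""
    rows = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
    next(rows, None)  # skip the header line
    usage = {}
    for row in rows:
        if len(row) != 2:
            continue  # ignore blank or malformed lines
        pid, used = row
        try:
            usage[int(pid)] = int(used.split()[0])
        except (ValueError, IndexError):
            continue  # value not reported as a number
    return usage


# Illustrative sample in the assumed format.
sample = "pid, used_gpu_memory [MiB]\n1234, 527 MiB\n"
print(parse_compute_apps(sample))
```

On a machine with the driver installed, the text could be obtained with subprocess.run([...], capture_output=True, text=True).stdout and passed to the same function.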

I don’t know why the values differ. What is missing from this memory computation?

Hi @whanafy,
Please refer to the FAQ section of the TensorRT documentation at the link below.
https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#faq
Note: The CUDA infrastructure and device code also consume device memory. The amount of memory will vary by platform, device, and TensorRT version. Use cudaMemGetInfo to determine the total amount of device memory in use.
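The CUDA runtime call for this query is cudaMemGetInfo, which returns the free and total device memory in bytes. A minimal sketch of calling it from Python through ctypes follows; it assumes a CUDA runtime library can be loaded under the name "libcudart.so" (the name varies by platform and CUDA version), and the helper names are hypothetical. Note that the value covers everything allocated on the device, not only the calling process.

```python
import ctypes


def used_bytes(free_b, total_b):
    """Device memory in use, given the free and total byte counts."""
    return total_b - free_b


def query_device_memory(cudart_name="libcudart.so"):
    """Return (free, total) device memory in bytes via cudaMemGetInfo.

    cudart_name is an assumption; adjust it to the installed CUDA runtime.
    """
    cudart = ctypes.CDLL(cudart_name)
    free_b = ctypes.c_size_t()
    total_b = ctypes.c_size_t()
    status = cudart.cudaMemGetInfo(ctypes.byref(free_b), ctypes.byref(total_b))
    if status != 0:  # 0 is cudaSuccess
        raise RuntimeError(f"cudaMemGetInfo failed with error code {status}")
    return free_b.value, total_b.value
```

Calling query_device_memory() before and after creating the engine and execution context, and subtracting the two used_bytes results, gives a rough view of how much memory the CUDA context and TensorRT add beyond the formula's estimate.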
Thanks!