I am using TensorRT on an NVIDIA GTX 1080 GPU, and I am trying to determine the GPU memory usage of an inference engine.
I followed the approach described in the TensorRT developer guide FAQ (https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#faq), which roughly says the memory usage is:
serialized engine size + 2 * total binding sizes + device engine memory size.
However, the GPU memory actually allocated by the whole process is quite different.
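To make the comparison concrete, here is a minimal sketch of how I compute the expected figure from that formula. All the byte sizes below are made-up placeholders; in practice they would come from the serialized engine file, the binding shapes/dtypes, and the engine's reported device memory size:

```python
def expected_engine_memory(serialized_bytes, binding_sizes_bytes, device_mem_bytes):
    """Expected GPU usage per the TensorRT FAQ formula:
    serialized engine + 2 * sum(binding sizes) + device engine memory."""
    return serialized_bytes + 2 * sum(binding_sizes_bytes) + device_mem_bytes

MB = 1024 ** 2

# Hypothetical sizes for illustration only
serialized = 45 * MB          # size of the serialized engine
bindings = [3 * MB, 1 * MB]   # per-binding (input/output) buffer sizes
device_mem = 20 * MB          # the engine's device memory size

print(expected_engine_memory(serialized, bindings, device_mem) / MB, "MB")
```

The number this yields is consistently about 270 MB below what nvidia-smi reports for the process.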
The memory consumption reported by the nvidia-smi command
nvidia-smi --query-compute-apps=pid,used_memory --format=csv
is not the same; to be exact, it is 270 MB greater (and the difference is the same value across different engines).
I don't understand why there is a difference between the two values, or what is missing from this memory computation.