Description
I am using TensorRT on an NVIDIA GTX 1080 GPU and am trying to find out how much GPU memory the inference engine uses.
I tried the approach described in Developer Guide :: NVIDIA Deep Learning TensorRT Documentation, which roughly says the memory usage is: serialized engine size + 2 * total bindings size + device engine memory size. However, the GPU memory actually allocated by the whole process is quite different.
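For reference, this is roughly how I sum the three terms from the guide. It is only a minimal sketch with the TensorRT Python API, assuming a deserialized ICudaEngine named engine, static input shapes, and the pre-8.5 bindings API (num_bindings, get_binding_shape, get_binding_dtype):

```python
# Minimal sketch (my assumption of how to tally the Developer Guide formula):
# serialized engine + 2 * bindings sizes + device engine memory.
import numpy as np
import tensorrt as trt

def estimate_engine_memory(engine: trt.ICudaEngine) -> int:
    """Rough estimate in bytes, assuming static shapes (no -1 dims)."""
    serialized_size = engine.serialize().nbytes          # size of the serialized plan
    bindings_size = 0
    for i in range(engine.num_bindings):                 # every input and output binding
        shape = engine.get_binding_shape(i)
        itemsize = np.dtype(trt.nptype(engine.get_binding_dtype(i))).itemsize
        bindings_size += trt.volume(shape) * itemsize
    return serialized_size + 2 * bindings_size + engine.device_memory_size
```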
The memory consumption reported by the nvidia-smi command nvidia-smi --query-compute-apps=pid,used_memory --format=csv is not the same; to be exact, it is 270 MB greater (and this extra 270 MB is the same across different engines).
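This is roughly how I read the per-process number from nvidia-smi for the comparison (again just a sketch; the noheader/nounits flags and matching on the current PID are my assumptions about the setup):

```python
# Minimal sketch: query nvidia-smi for this process's GPU memory usage.
import os
import subprocess

out = subprocess.check_output(
    ["nvidia-smi", "--query-compute-apps=pid,used_memory",
     "--format=csv,noheader,nounits"],
    text=True,
)
for line in out.strip().splitlines():
    pid, used_mib = (field.strip() for field in line.split(","))
    if int(pid) == os.getpid():
        print(f"nvidia-smi reports {used_mib} MiB for this process")
```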
I don’t understand where this difference comes from. What is missing from this memory computation?