Understanding memory.used of nvidia-smi

Running nvidia-smi in Colab as described here, I see a maximum value for memory.used of 4910MiB. However, even after execution has finished (so no process is using the GPU anymore), nvidia-smi still reports that value. Does this mean memory.used reports the peak value reached (which I would then use as a lower bound for my requirements)?

memory.used is the total memory used by all applications running on the GPU. Are you sure that the process has exited or that you’ve run nvidia-smi again?
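Since memory.used is an instantaneous reading rather than a peak, one way to approximate the peak yourself is to poll it while the workload runs. A sketch using nvidia-smi's query flags (the 1-second interval is an arbitrary choice; this obviously requires a machine with an NVIDIA GPU):

```shell
# Poll GPU memory usage once per second; memory.used is reported in MiB.
# -l 1 repeats the query every second until interrupted with Ctrl-C.
nvidia-smi --query-gpu=timestamp,memory.used,memory.total --format=csv -l 1
```

Redirect the output to a file and take the maximum of the memory.used column to get an estimate of the peak during your run.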

So it is a current value, not a peak.
I'm quite sure: I ran this test in Colab, executing a single cell with nvidia-smi after the cell containing my model prediction had finished. The catch is that PyTorch uses a caching memory allocator, as described here.
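That caching allocator explains the observation: PyTorch keeps freed blocks reserved on the device instead of returning them to the driver, so nvidia-smi continues to count them under memory.used even after your tensors are gone. A minimal sketch (assumes torch is installed and a CUDA device is available; the tensor size is arbitrary) showing the difference between allocated and cached memory:

```python
import torch

assert torch.cuda.is_available(), "requires a CUDA GPU"

# Allocate roughly 1 GiB of float32 on the GPU.
x = torch.empty(1024, 1024, 256, device="cuda")
print(torch.cuda.memory_allocated())      # bytes currently held by live tensors
print(torch.cuda.memory_reserved())       # bytes held by the caching allocator

del x
print(torch.cuda.memory_allocated())      # drops back toward zero
print(torch.cuda.memory_reserved())       # stays high: freed blocks are cached,
                                          # so nvidia-smi still counts them

torch.cuda.empty_cache()                  # release cached blocks to the driver
print(torch.cuda.memory_reserved())       # now nvidia-smi's memory.used drops too

# For the original question (a lower bound on requirements), PyTorch also
# tracks the peak directly:
print(torch.cuda.max_memory_allocated())  # peak tensor usage, in bytes
```

Note that `torch.cuda.max_memory_allocated()` gives you the peak directly, which is closer to what the question was after than reading nvidia-smi.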