Different CUDA memory usage between nvidia-smi and cudaMemGetInfo

While checking the available CUDA memory, I found that the usage reported by the nvidia-smi command does not match the free/total memory returned by the CUDA function cudaMemGetInfo. What's more, I usually monitor my GPU usage with a program called GPU-Z, and its readings do not match the results returned by cudaMemGetInfo either.

nvidia-smi
Mon Sep 16 12:18:44 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 416.34       Driver Version: 416.34       CUDA Version: 10.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 106... WDDM  | 00000000:01:00.0 Off |                  N/A |
| 28%   35C    P2    27W / 120W |   1163MiB /  6144MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      2356      C   ...v2_tested\build\darknet\x64\darknet.exe N/A      |
+-----------------------------------------------------------------------------+

Which means 1163 MiB ≈ 1219.494 MB are used.

(The only process running on my computer that consumes CUDA memory is darknet.exe, and I took the measurement while execution was stopped at a breakpoint just before the code I provide below.)

Using the following code, which is inside darknet.exe:

double free_m, total_m; // double, because float cannot hold every byte count exactly
size_t free_t, total_t;
cudaMemGetInfo(&free_t, &total_t); // both values are returned in bytes
free_m = (double)free_t;
total_m = (double)total_t;
printf("Free CUDA memory = %f\n", free_m); // available CUDA memory in bytes
printf("Total CUDA memory = %f\n", total_m); // total CUDA memory in bytes

I got these numbers (in Bytes):

Free CUDA memory = 4270666496.0
Total CUDA memory = 6442450944.0

Which means 2,171,784,448 bytes ≈ 2071 MiB (about 2172 MB) are used, versus the 1163 MiB reported by nvidia-smi.

Why is there such a difference?