While checking the available CUDA memory, I found that the memory usage reported by nvidia-smi does not match the free/total values returned by the CUDA function cudaMemGetInfo. What's more, I usually monitor my GPU usage with a program called GPU-Z, and its readings don't match the results from cudaMemGetInfo either.
nvidia-smi
Mon Sep 16 12:18:44 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 416.34       Driver Version: 416.34       CUDA Version: 10.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 106... WDDM  | 00000000:01:00.0 Off |                  N/A |
| 28%   35C    P2    27W / 120W |   1163MiB /  6144MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      2356      C   ...v2_tested\build\darknet\x64\darknet.exe N/A      |
+-----------------------------------------------------------------------------+
Which means 1163 MiB = 1219.494 MB are used.
(The only process running on my computer that consumes CUDA memory is darknet.exe, and I took this measurement with execution stopped at a breakpoint just before the code I provide below.)
I used the following code, which is inside darknet.exe:
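size_t free_t, total_t;
cudaError_t err = cudaMemGetInfo(&free_t, &total_t); // both values are in bytes
if (err != cudaSuccess) printf("cudaMemGetInfo failed: %s\n", cudaGetErrorString(err));
double free_m = (double)free_t;   // double represents these byte counts exactly
double total_m = (double)total_t;
printf("Free CUDA memory = %f\n", free_m);   // available CUDA memory in bytes
printf("Total CUDA memory = %f\n", total_m); // total CUDA memory in bytes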
I got these numbers (in bytes):
Free CUDA memory = 4270666496.0
Total CUDA memory = 6442450944.0
Which means 2,171,784,448 bytes = 2071 MiB (about 2172 MB) are used.
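To make sure I am comparing both readings in the same unit, here is a minimal sketch of the arithmetic in plain C. The helper names used_bytes, bytes_to_mib and bytes_to_mb are hypothetical (they are not part of CUDA or darknet); the only assumption is that the inputs are the free/total byte counts that cudaMemGetInfo filled in.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: bytes in use, given the free/total pair
   reported by cudaMemGetInfo. */
static uint64_t used_bytes(uint64_t free_b, uint64_t total_b)
{
    assert(free_b <= total_b); /* free can never exceed total */
    return total_b - free_b;
}

/* Hypothetical helper: binary mebibytes, 1 MiB = 1024 * 1024 bytes
   (the unit nvidia-smi prints). */
static double bytes_to_mib(uint64_t bytes)
{
    return (double)bytes / (1024.0 * 1024.0);
}

/* Hypothetical helper: decimal megabytes, 1 MB = 1000 * 1000 bytes. */
static double bytes_to_mb(uint64_t bytes)
{
    return (double)bytes / 1.0e6;
}
```

With my numbers, used_bytes(4270666496, 6442450944) gives 2171784448 bytes, and bytes_to_mib of that is about 2071.2 MiB, against the 1163 MiB shown by nvidia-smi. The totals do agree: bytes_to_mib(6442450944) is exactly 6144 MiB. So the gap of roughly 908 MiB is in the used/free figure, not in a unit mix-up.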
Why is there such a difference?