I am using CUDA 5.5 and valgrind 3.8.1. When I execute a one-line program with the call
cudaGetDeviceCount()
valgrind is reporting a number of ‘possibly lost’ bytes. More generally, the first CUDA call made by my program (e.g., cudaGetDeviceCount(), cudaGetDevice(), cudaSetDevice()) generates a number of ‘possibly lost’ bytes. I also noticed that cublasCreate() always generates a number of ‘possibly lost’ bytes, even if it is not the first call. I would expect a clean valgrind output with such simple programs.
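For reference, a minimal reproducer along these lines (the file name and error handling are my own additions) would look like:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    // The first CUDA runtime call triggers cuInit() in the driver,
    // which is where valgrind attributes the 'possibly lost' blocks.
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceCount failed: %s\n",
                cudaGetErrorString(err));
        return 1;
    }
    printf("%d CUDA device(s)\n", count);
    return 0;
}
```

Compiled with `nvcc repro.cu -o repro` and run as `valgrind --leak-check=full ./repro`, this is enough to produce the reports described above.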
Has anyone else seen the same issue?
Do you have insight into valgrind's exact definition of "possibly lost" memory? Personally, I treat only "definitely lost" messages as clear evidence of an actual memory leak, and thus as actionable.
The valgrind manual says: "possibly lost" means your program is leaking memory, unless you're doing unusual things with pointers that could cause them to point into the middle of an allocated block.
So it could be a leak or not, depending on the underlying implementation. If cudaGetDeviceCount() is the first CUDA call I do in my program, here is a typical error reported by valgrind:
==14030== 544 bytes in 4 blocks are possibly lost in loss record 42 of 51
==14030== at 0x4C2677B: calloc (vg_replace_malloc.c:593)
==14030== by 0xEDAF4FE: ??? (in /usr/lib64/libcuda.so.319.37)
==14030== by 0xEE218B1: ??? (in /usr/lib64/libcuda.so.319.37)
==14030== by 0xEE1EFC5: ??? (in /usr/lib64/libcuda.so.319.37)
==14030== by 0xEDD4CE5: ??? (in /usr/lib64/libcuda.so.319.37)
==14030== by 0xEDA045F: ??? (in /usr/lib64/libcuda.so.319.37)
==14030== by 0xECD5EFC: ??? (in /usr/lib64/libcuda.so.319.37)
==14030== by 0xECAE7D2: cuInit (in /usr/lib64/libcuda.so.319.37)
==14030== by 0x9B6F013: ??? (in /usr/local/cuda-5.5/lib64/libcudart.so.5.5.22)
==14030== by 0x9B6F0B9: ??? (in /usr/local/cuda-5.5/lib64/libcudart.so.5.5.22)
==14030== by 0x9B6F1FA: ??? (in /usr/local/cuda-5.5/lib64/libcudart.so.5.5.22)
==14030== by 0x9B8A289: cudaGetDeviceCount (in /usr/local/cuda-5.5/lib64/libcudart.so.5.5.22)
To be on the safe side, it would be best to file a bug, so the CUBLAS team can double-check whether there is an actual memory leak or whether this is a case of unusual pointer operations tripping up valgrind. In my limited experience, older versions of valgrind tended to flag some false positives in CUDA programs, but it is impossible to tell from the logs alone what is going on here.
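If the reports turn out to be benign, a common workaround is a valgrind suppression file. A sketch based on the stack trace above (the suppression name is my own; the frame patterns simply mirror the log, and `match-leak-kinds` is omitted because it requires valgrind 3.9 or later):

```
{
   cuda_driver_init_possible_leak
   Memcheck:Leak
   fun:calloc
   ...
   obj:*libcuda.so*
}
```

This matches any calloc-rooted leak record whose call stack passes through libcuda.so; save it as, e.g., `cuda.supp` and run with `valgrind --suppressions=cuda.supp ./repro`. Note that this silences the reports rather than resolving them, so filing the bug is still worthwhile.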