Hi,
I'm trying to get my CUDA program Valgrind-clean. While hunting down the last remaining memory leaks, I isolated some of them to the following simple driver API code:
#include <cuda.h>

int main()
{
    cuInit(0);

    CUdevice device;
    cuDeviceGet(&device, 0);

    CUcontext context;
    cuCtxCreate(&context, 0, device);

    CUmodule module;
    cuModuleLoad(&module, "test_kernel.cubin");

    cuModuleUnload(module);
    cuCtxDestroy(context);
    return 0;
}
The code above produces the following Valgrind output:
==29965== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 8 from 1)
==29965== malloc/free: in use at exit: 19,084 bytes in 27 blocks.
==29965== malloc/free: 1,728 allocs, 1,701 frees, 8,043,787 bytes allocated.
==29965== For counts of detected errors, rerun with: -v
==29965== searching for pointers to 27 not-freed blocks.
==29965== checked 672,056 bytes.
==29965==
==29965== 0 bytes in 8 blocks are definitely lost in loss record 1 of 12
==29965== at 0x4C22FAB: malloc (vg_replace_malloc.c:207)
==29965== by 0x4E51C99: (within /usr/lib/libcuda.so.190.16)
==29965== by 0x4E65D08: (within /usr/lib/libcuda.so.190.16)
==29965== by 0x4E58FDF: (within /usr/lib/libcuda.so.190.16)
==29965== by 0x4E6A52A: (within /usr/lib/libcuda.so.190.16)
==29965== by 0x4E4F93F: (within /usr/lib/libcuda.so.190.16)
==29965== by 0x4E48A89: (within /usr/lib/libcuda.so.190.16)
==29965== by 0x4EA3186: cuCtxCreate (in /usr/lib/libcuda.so.190.16)
==29965== by 0x40E6CE: main (in /home/users/laan/apa/hw/cuda/common/x86_64/a.out)
==29965==
==29965==
==29965== 64 bytes in 1 blocks are definitely lost in loss record 3 of 12
==29965== at 0x4C22FAB: malloc (vg_replace_malloc.c:207)
==29965== by 0x4E5BE37: (within /usr/lib/libcuda.so.190.16)
==29965== by 0x4E53E42: (within /usr/lib/libcuda.so.190.16)
==29965== by 0x4E65704: (within /usr/lib/libcuda.so.190.16)
==29965== by 0x4E58FDF: (within /usr/lib/libcuda.so.190.16)
==29965== by 0x4E5962F: (within /usr/lib/libcuda.so.190.16)
==29965== by 0x4E4C911: (within /usr/lib/libcuda.so.190.16)
==29965== by 0x4EA23C2: cuModuleLoad (in /usr/lib/libcuda.so.190.16)
==29965== by 0x40E6DC: main (in /home/users/laan/apa/hw/cuda/common/x86_64/a.out)
==29965==
==29965==
==29965== 620 bytes in 1 blocks are definitely lost in loss record 9 of 12
==29965== at 0x4C22FAB: malloc (vg_replace_malloc.c:207)
==29965== by 0x4EAC879: (within /usr/lib/libcuda.so.190.16)
==29965== by 0x4E58C6A: (within /usr/lib/libcuda.so.190.16)
==29965== by 0x4E5962F: (within /usr/lib/libcuda.so.190.16)
==29965== by 0x4E4C911: (within /usr/lib/libcuda.so.190.16)
==29965== by 0x4EA23C2: cuModuleLoad (in /usr/lib/libcuda.so.190.16)
==29965== by 0x40E6DC: main (in /home/users/laan/apa/hw/cuda/common/x86_64/a.out)
==29965==
==29965== LEAK SUMMARY:
==29965== definitely lost: 684 bytes in 10 blocks.
==29965== possibly lost: 0 bytes in 0 blocks.
==29965== still reachable: 18,400 bytes in 17 blocks.
==29965== suppressed: 0 bytes in 0 blocks.
==29965== Reachable blocks (those to which a pointer was found) are not shown.
==29965== To see them, rerun with: --leak-check=full --show-reachable=yes
Did I forget to deallocate something, or are these leaks something that (without wanting to sound picky) should be fixed in the driver library itself?
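If these do turn out to be driver-side, a suppression file would at least keep them out of my reports in the meantime. Below is a minimal sketch, not something I have verified against this run. The suppression name and the file name cuda.supp are made up for illustration. It relies on the fact that all three loss records above show malloc being called directly from libcuda.so.

```
# cuda.supp (hypothetical file name)
# Suppress leak records whose allocation was made by malloc
# called directly from inside the CUDA driver library.
{
   libcuda_internal_malloc_leak
   Memcheck:Leak
   fun:malloc
   obj:*/libcuda.so*
}
```

It would be passed to Valgrind with something like: valgrind --leak-check=full --suppressions=cuda.supp ./a.out. The pattern is deliberately broad, so it would hide any leak allocated directly inside libcuda, not only the three records shown here.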
/L