Memory leaks in libcudart 4.2.9 or misuse?

fhellman · June 5, 2012, 9:26am

Hi!

I have the following program:

#include <cuda.h>

__global__ void noop()
{
}

int main()
{
  cudaSetDevice(0);
  noop<<<1,1>>>();
  cudaDeviceSynchronize();
  cudaDeviceReset();
  return 0;
}

I compile this with CUDA toolkit 4.2.9 to binary “noop” and run it on a machine with NVIDIA driver 295.49 through valgrind, using the following command:

$ valgrind --leak-check=full ./noop
==7962== Memcheck, a memory error detector
==7962== Copyright (C) 2002-2009, and GNU GPL'd, by Julian Seward et al.
==7962== Using Valgrind-3.5.0-Debian and LibVEX; rerun with -h for copyright info
==7962== Command: ./empty
==7962== 
==7962== 
==7962== HEAP SUMMARY:
==7962==     in use at exit: 30,053 bytes in 47 blocks
==7962==   total heap usage: 2,389 allocs, 2,342 frees, 898,204 bytes allocated
==7962== 
==7962== 16 bytes in 1 blocks are definitely lost in loss record 4 of 33
==7962==    at 0x4C2596C: operator new(unsigned long) (vg_replace_malloc.c:220)
==7962==    by 0x4E45CD2: ??? (in /usr/local/lib/libcudart.so.4.2.9)
==7962==    by 0x4E7406A: ??? (in /usr/local/lib/libcudart.so.4.2.9)
==7962==    by 0x4E54A1F: ??? (in /usr/local/lib/libcudart.so.4.2.9)
==7962==    by 0x4E3043E: ??? (in /usr/local/lib/libcudart.so.4.2.9)
==7962==    by 0x4E74360: ??? (in /usr/local/lib/libcudart.so.4.2.9)
==7962==    by 0x5869C11: exit (exit.c:78)
==7962==    by 0x584FAC3: (below main) (libc-start.c:252)
==7962== 
==7962== LEAK SUMMARY:
==7962==    definitely lost: 16 bytes in 1 blocks
==7962==    indirectly lost: 0 bytes in 0 blocks
==7962==      possibly lost: 0 bytes in 0 blocks
==7962==    still reachable: 30,037 bytes in 46 blocks
==7962==         suppressed: 0 bytes in 0 blocks
==7962== Reachable blocks (those to which a pointer was found) are not shown.
==7962== To see them, rerun with: --leak-check=full --show-reachable=yes
==7962== 
==7962== For counts of detected and suppressed errors, rerun with: -v
==7962== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 10 from 7)

(I noticed that cudaDeviceReset() was necessary to get rid of some “possibly lost” bytes; that is why it is there, quite unconventionally.)

Is there a memory leak in libcudart, or am I responsible for cleaning up something that I have missed?

Best regards,

Fredrik

sjiagc · June 6, 2012, 5:56am

Hi fhellman,

Thanks for the information! We have reproduced this issue and submitted a bug; the developers will look into what is going on.

Best regards!

kalman · June 7, 2012, 6:59pm


Rerun it with --leak-check=full --show-reachable=yes and post what you get. Also consider that some libraries allocate memory once for the whole run and leave it to the OS to free it at exit.

It is only a problem if multiple cudaDeviceReset calls each allocate a new chunk of memory. Try issuing cudaDeviceReset twice and see whether the “leaked memory” increases.

Gaetano
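
One way to run the experiment suggested above is sketched below. It is a variation, not something posted in the thread: instead of calling cudaDeviceReset twice in a row, it repeats the whole set-device / launch / synchronize / reset cycle a configurable number of times, so that each reset has a live context to tear down. The helper names run_once and check, and the cycle count passed as a command-line argument, are assumptions made for illustration.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

__global__ void noop() {}

// Hypothetical helper: print a message and abort if a runtime call failed.
static void check(cudaError_t err, const char *what)
{
    if (err != cudaSuccess) {
        std::fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        std::exit(EXIT_FAILURE);
    }
}

// One full cycle: create a context, run the empty kernel, destroy the context.
static void run_once()
{
    check(cudaSetDevice(0), "cudaSetDevice");
    noop<<<1, 1>>>();
    check(cudaGetLastError(), "kernel launch");
    check(cudaDeviceSynchronize(), "cudaDeviceSynchronize");
    check(cudaDeviceReset(), "cudaDeviceReset");
}

int main(int argc, char **argv)
{
    // Number of cycles to run; defaults to 1 (the original program's behaviour).
    int cycles = (argc > 1) ? std::atoi(argv[1]) : 1;
    for (int i = 0; i < cycles; ++i) {
        run_once();
    }
    return 0;
}
```

Assuming the file is saved as noop_cycles.cu (a made-up name) and built with nvcc -o noop_cycles noop_cycles.cu, it can be run under valgrind --leak-check=full --show-reachable=yes once with argument 1 and once with argument 2. If the “definitely lost” figure is the same in both runs, the allocation is most likely a one-time one; if it grows with the cycle count, that points to a per-reset leak.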