Double free when trying to interface with Python

I am trying to interface a very simple CUDA shared library with some Python code using ctypes.

Here is the library (test.cu):
#include <cutil.h>

extern "C"
int test()
{
return 5;
}

Here is my Python script (simple.py):
from ctypes import *

libCuda = cdll.LoadLibrary("test.so")
print libCuda.test()
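
For reference, here is a Python 3 sketch of the same ctypes call with the return type spelled out. The helper name bind_int_function is made up for illustration and is not part of ctypes or CUDA. ctypes already assumes an int return value, so this only makes the signature explicit; it is not expected to change the crash described below.

```python
import ctypes


def bind_int_function(library_path, function_name):
    """Load a shared library and return one of its no-argument int functions.

    bind_int_function is a hypothetical helper, not part of ctypes or CUDA.
    """
    library = ctypes.CDLL(library_path)         # equivalent to cdll.LoadLibrary(library_path)
    function = getattr(library, function_name)  # raises AttributeError if the symbol is missing
    function.restype = ctypes.c_int             # explicit, although c_int is the ctypes default
    function.argtypes = []                      # int test() takes no arguments
    return function
```

With the library from this post it could be used as `test = bind_int_function("./test.so", "test")` followed by `print(test())`. The `./` prefix is an assumption about where test.so lives; a bare name such as "test.so" is only found if the dynamic loader's search path covers that directory.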

Compile commands:
nvcc -shared -c test.cu -Xcompiler "-fPIC" -I. -I/usr/local/cuda/include -I/home/thor/NVIDIA_CUDA_SDK/common/inc -I -DUNIX -L/usr/local/cuda/lib -L/home/thor/NVIDIA_CUDA_SDK/lib -lcuda -lcudart

gcc -o test.so -shared test.o -I. -I/usr/local/cuda/include -I/home/thor/NVIDIA_CUDA_SDK/common/inc -I -DUNIX -L/usr/local/cuda/lib -L/home/thor/NVIDIA_CUDA_SDK/lib -lcuda -lcudart

Everything compiles nicely.

But when I run the Python script, here is what I get:
python simple.py
5
*** glibc detected *** python: double free or corruption (fasttop): 0x00000000007c1570 ***

The program doesn’t quit; it freezes.

The valgrind output:
(…)
==12162== Invalid free() / delete / delete[]
==12162== at 0x4C20390: operator delete(void*) (vg_replace_malloc.c:244)
==12162== by 0x65FC5A4: (within /usr/local/cuda/lib/libcudart.so.1.0)
==12162== by 0x65EDCD5: __cudaUnregisterFatBinary (in /usr/local/cuda/lib/libcudart.so.1.0)
==12162== by 0x604ADD8: __cudaUnregisterBinaryUtil (in /tmp/test.so)
==12162== by 0x604ADE3: __cudaUnregisterBinary (in /tmp/test.so)
==12162== by 0x60423E1: (within /tmp/test.so)
==12162== by 0x604AE50: (within /tmp/test.so)
==12162== by 0x56FBA14: exit (in /lib/libc-2.5.so)
==12162== by 0x56E58EA: (below main) (in /lib/libc-2.5.so)
==12162== Address 0x5BFF938 is 0 bytes inside a block of size 48 free'd
==12162== at 0x4C20390: operator delete(void*) (vg_replace_malloc.c:244)
==12162== by 0x65EBEAC: (within /usr/local/cuda/lib/libcudart.so.1.0)
==12162== by 0x56FBA14: exit (in /lib/libc-2.5.so)
==12162== by 0x56E58EA: (below main) (in /lib/libc-2.5.so)
==12162==
==12162== Invalid read of size 8
==12162== at 0x65EDCE7: __cudaUnregisterFatBinary (in /usr/local/cuda/lib/libcudart.so.1.0)
==12162== by 0x604ADD8: __cudaUnregisterBinaryUtil (in /tmp/test.so)
==12162== by 0x604ADE3: __cudaUnregisterBinary (in /tmp/test.so)
==12162== by 0x60423E1: (within /tmp/test.so)
==12162== by 0x604AE50: (within /tmp/test.so)
==12162== by 0x56FBA14: exit (in /lib/libc-2.5.so)
==12162== by 0x56E58EA: (below main) (in /lib/libc-2.5.so)
==12162== Address 0x5BFF8B8 is 0 bytes inside a block of size 80 free'd
==12162== at 0x4C20390: operator delete(void*) (vg_replace_malloc.c:244)
==12162== by 0x65EBE57: (within /usr/local/cuda/lib/libcudart.so.1.0)
==12162== by 0x56FBA14: exit (in /lib/libc-2.5.so)
==12162== by 0x56E58EA: (below main) (in /lib/libc-2.5.so)
==12162==
==12162== Invalid write of size 8
==12162== at 0x65F587B: (within /usr/local/cuda/lib/libcudart.so.1.0)
==12162== by 0x65EDCF0: __cudaUnregisterFatBinary (in /usr/local/cuda/lib/libcudart.so.1.0)
==12162== by 0x604ADD8: __cudaUnregisterBinaryUtil (in /tmp/test.so)
==12162== by 0x604ADE3: __cudaUnregisterBinary (in /tmp/test.so)
==12162== by 0x60423E1: (within /tmp/test.so)
==12162== by 0x604AE50: (within /tmp/test.so)
==12162== by 0x56FBA14: exit (in /lib/libc-2.5.so)
==12162== by 0x56E58EA: (below main) (in /lib/libc-2.5.so)
==12162== Address 0x5BFF8B8 is 0 bytes inside a block of size 80 free'd
==12162== at 0x4C20390: operator delete(void*) (vg_replace_malloc.c:244)
==12162== by 0x65EBE57: (within /usr/local/cuda/lib/libcudart.so.1.0)
==12162== by 0x56FBA14: exit (in /lib/libc-2.5.so)
==12162== by 0x56E58EA: (below main) (in /lib/libc-2.5.so)
==12162==
==12162== Invalid free() / delete / delete[]
==12162== at 0x4C20390: operator delete(void*) (vg_replace_malloc.c:244)
==12162== by 0x65EDCF0: __cudaUnregisterFatBinary (in /usr/local/cuda/lib/libcudart.so.1.0)
==12162== by 0x604ADD8: __cudaUnregisterBinaryUtil (in /tmp/test.so)
==12162== by 0x604ADE3: __cudaUnregisterBinary (in /tmp/test.so)
==12162== by 0x60423E1: (within /tmp/test.so)
==12162== by 0x604AE50: (within /tmp/test.so)
==12162== by 0x56FBA14: exit (in /lib/libc-2.5.so)
==12162== by 0x56E58EA: (below main) (in /lib/libc-2.5.so)
==12162== Address 0x5BFF8B8 is 0 bytes inside a block of size 80 free'd
==12162== at 0x4C20390: operator delete(void*) (vg_replace_malloc.c:244)
==12162== by 0x65EBE57: (within /usr/local/cuda/lib/libcudart.so.1.0)
==12162== by 0x56FBA14: exit (in /lib/libc-2.5.so)
==12162== by 0x56E58EA: (below main) (in /lib/libc-2.5.so)
==12162==
==12162== Invalid free() / delete / delete[]
==12162== at 0x4C20390: operator delete(void*) (vg_replace_malloc.c:244)
==12162== by 0x65EDCFA: __cudaUnregisterFatBinary (in /usr/local/cuda/lib/libcudart.so.1.0)
==12162== by 0x604ADD8: __cudaUnregisterBinaryUtil (in /tmp/test.so)
==12162== by 0x604ADE3: __cudaUnregisterBinary (in /tmp/test.so)
==12162== by 0x60423E1: (within /tmp/test.so)
==12162== by 0x604AE50: (within /tmp/test.so)
==12162== by 0x56FBA14: exit (in /lib/libc-2.5.so)
==12162== by 0x56E58EA: (below main) (in /lib/libc-2.5.so)
==12162== Address 0x5BFF7C8 is 0 bytes inside a block of size 8 free'd
==12162== at 0x4C20390: operator delete(void*) (vg_replace_malloc.c:244)
==12162== by 0x65EBE48: (within /usr/local/cuda/lib/libcudart.so.1.0)
==12162== by 0x56FBA14: exit (in /lib/libc-2.5.so)
==12162== by 0x56E58EA: (below main) (in /lib/libc-2.5.so)

(…)

Does anybody have an idea of what’s going wrong?
The funny thing is that with the previous SDK (0.8 or 0.9) everything was working fine.
Now that I have upgraded to the 64-bit 1.0 SDK, it doesn’t work anymore.

Thanks

Ben

It is a known bug, fixed in 1.1, which will be out soon.

When is the next version coming out?

soon…
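
Until 1.1 is available, one possible stopgap, judging only from the valgrind trace above (the invalid frees all happen under exit(), in the library teardown that calls __cudaUnregisterFatBinary), is to leave the Python process without running exit handlers at all. The sketch below is Python 3 and untested against CUDA 1.0; exit_skipping_handlers is a made-up name, and whether this avoids the freeze is an assumption.

```python
import os
import sys


def exit_skipping_handlers(status=0):
    """Terminate immediately, skipping atexit handlers and library teardown.

    Hypothetical helper; that it avoids the CUDA 1.0 crash is an untested assumption.
    """
    sys.stdout.flush()  # os._exit does not flush Python's buffered streams
    sys.stderr.flush()
    os._exit(status)    # exits via _exit(): no atexit handlers, no shared-library destructors
```

It would be called as the last statement of simple.py, after the result of test() has been printed. Anything that relies on normal interpreter shutdown (atexit callbacks, unflushed files) is skipped too, so this is only reasonable for a short script like the one in this thread.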