I have a problem in a program I'm developing at the moment. It's a simple benchmark that tests several CUDA allocation types.
I can allocate and copy memory without any problem, use the matrices/vectors in my kernels, and retrieve the results. The problem arises when I try to free the memory: the cudaFree calls fail with the beloved “unspecified launch failure”. Can anybody take a look and tell me what's wrong?
To make it easier to follow: the program is really simple. It runs 4 tests (only tests 1 and 3 are implemented so far), and each test measures allocation time, copy time, and kernel execution time. There are 3 kernels, but only 2 are implemented so far. At the start of the program you must supply the maxSize of the vector (just type 100 or 1000 for a quick test).
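In case it helps to see the shape of a test without opening the archive, here is a minimal sketch of the pattern each one follows (allocate, copy, launch, time, free). It is not the actual code from the attachment: `scaleKernel`, the `CHECK` macro, and the sizes are placeholder names assumed for illustration. The sketch checks the return code of every call, because kernel launches are asynchronous and an error raised inside a kernel is only reported by a later runtime call.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper (not from the attachment): abort with a message
// if a CUDA runtime call returns anything other than cudaSuccess.
#define CHECK(call)                                               \
    do {                                                          \
        cudaError_t err_ = (call);                                \
        if (err_ != cudaSuccess) {                                \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,    \
                    cudaGetErrorString(err_));                    \
            exit(EXIT_FAILURE);                                   \
        }                                                         \
    } while (0)

// Hypothetical kernel standing in for the benchmark kernels.
__global__ void scaleKernel(float *v, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)              // bounds check: the grid may cover more than n threads
        v[i] *= 2.0f;
}

int main()
{
    const int n = 1000;     // plays the role of maxSize
    const size_t bytes = n * sizeof(float);

    float *h = (float *)malloc(bytes);
    if (h == NULL) return EXIT_FAILURE;
    for (int i = 0; i < n; ++i) h[i] = 1.0f;

    cudaEvent_t start, stop;
    CHECK(cudaEventCreate(&start));
    CHECK(cudaEventCreate(&stop));

    float *d = NULL;
    CHECK(cudaMalloc((void **)&d, bytes));
    CHECK(cudaMemcpy(d, h, bytes, cudaMemcpyHostToDevice));

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    CHECK(cudaEventRecord(start));
    scaleKernel<<<blocks, threads>>>(d, n);
    CHECK(cudaGetLastError());          // reports launch-time errors (e.g. bad configuration)
    CHECK(cudaEventRecord(stop));
    CHECK(cudaEventSynchronize(stop));  // waits for the kernel; execution errors surface here

    float ms = 0.0f;
    CHECK(cudaEventElapsedTime(&ms, start, stop));
    printf("kernel time: %f ms\n", ms);

    CHECK(cudaMemcpy(h, d, bytes, cudaMemcpyDeviceToHost));
    CHECK(cudaFree(d));                 // should succeed if every earlier step did

    CHECK(cudaEventDestroy(start));
    CHECK(cudaEventDestroy(stop));
    free(h);
    return EXIT_SUCCESS;
}
```

With checks like these in place, a failure would be reported at the first call that notices it rather than at cudaFree, which should narrow down where the error really comes from.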
Thank you for your help.
mytest.tar (30 KB)