Hi guys,
I have a problem with my kernel code. I often (but not always) get the error “unspecified launch failure”. After some research, I used cuda-memcheck and cuda-gdb to find the source of the error. With cuda-memcheck I get the following output:
========= CUDA-MEMCHECK
========= Invalid __global__ write of size 1
========= at 0x00000840 in kernel_nomm.cu:792:ExactKernel
========= by thread (34,0,0) in block (120,0)
========= Address 0x00000023 is out of bounds
=========
========= ERROR SUMMARY: 1 error
It seems that I am making an illegal access to global memory. The reported line number isn’t correct, though: there is only a “}” on that line. The cuda-memcheck manual says that I should run cuda-gdb with the CUDA memcheck option enabled to get the right line. When I do, I get:
Start Kernel with: 157 blocks, 64 threads per block
[Launch of CUDA Kernel 0 (MemKernel) on Device 0]
[Termination of CUDA Kernel 0 (MemKernel) on Device 0]
[Launch of CUDA Kernel 1 (ExactKernel) on Device 0]
Program received signal CUDA_EXCEPTION_1, Lane Illegal Address.
[Switching to CUDA Kernel 1 (<<<(108,0),(63,0,0)>>>)]
0x000000001698bc00 in ?? ()
And on the next run:
Start Kernel with: 125 blocks, 64 threads per block
[Launch of CUDA Kernel 0 (MemKernel) on Device 0]
[Termination of CUDA Kernel 0 (MemKernel) on Device 0]
[Launch of CUDA Kernel 1 (ExactKernel) on Device 0]
Program received signal CUDA_EXCEPTION_1, Lane Illegal Address.
[Switching to CUDA Kernel 1 (<<<(100,0),(16,0,0)>>>)]
0x00000000135f3970 in ?? ()
Since I used the same test data in both runs, I’m sure that the error isn’t related to the input data.
I tried to get rid of this error with extensive error checking in the source: I check the index range on every global memory access (but not on constant memory accesses), and I still can’t find the mistake.
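For readers following along, here is a minimal sketch of what such a range check could look like. It is not the attached kernel: `guardedWrite`, `firstOutOfRange`, the buffer size `n` and the error-flag scheme are all hypothetical names and choices made for illustration. It also assumes a toolkit that provides `cudaDeviceSynchronize` (CUDA 4.0 or later; older toolkits used `cudaThreadSynchronize`).

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: each thread writes one char, but only if its index
// is inside [0, n). An offending index is stored in *badIdx, which the host
// initialises to -1 before the launch.
__global__ void guardedWrite(char *out, int n, int *badIdx)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < 0 || idx >= n) {
        // Keep one offending index (whichever thread gets here first).
        atomicCAS(badIdx, -1, idx);
        return;  // skip the write instead of faulting
    }
    out[idx] = 1;
}

// Hypothetical host helper: launches the kernel and returns an out-of-range
// index if any thread produced one, -1 if none did, or -2 on a CUDA error.
int firstOutOfRange(int n, int blocks, int threads)
{
    char *dOut = nullptr;
    int *dBad = nullptr;
    int hBad = -1;

    if (cudaMalloc(&dOut, n) != cudaSuccess) return -2;
    if (cudaMalloc(&dBad, sizeof(int)) != cudaSuccess) {
        cudaFree(dOut);
        return -2;
    }
    cudaMemcpy(dBad, &hBad, sizeof(int), cudaMemcpyHostToDevice);

    guardedWrite<<<blocks, threads>>>(dOut, n, dBad);

    // cudaGetLastError reports launch problems; the synchronize call
    // reports faults that happen while the kernel is running.
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) err = cudaDeviceSynchronize();

    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        hBad = -2;
    } else {
        cudaMemcpy(&hBad, dBad, sizeof(int), cudaMemcpyDeviceToHost);
    }

    cudaFree(dOut);
    cudaFree(dBad);
    return hBad;
}
```

With 157 blocks of 64 threads (10048 threads in total) and a hypothetical buffer of 10000 elements, `firstOutOfRange(10000, 157, 64)` would report some index between 10000 and 10047, while a buffer of 10048 elements would give -1. A check like this only covers indices computed from the thread ID; a corrupted pointer or an index read from memory would need its own guard.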
[b]
Does anyone know a way to determine which source line causes the error?
Is it right that the value being written/read seems to be a char? (I inferred this from “size 1”; char is the only data type of this size.)
Can this error also appear if an illegal constant memory read is made?
[/b]
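On the “size 1” question, a small sketch may help frame it: the size in the cuda-memcheck report is the width of the memory access in bytes, so it narrows the candidates to one-byte types rather than to `char` specifically. The compile-time checks below list a few such types; the `bool` line assumes the usual host compilers used with nvcc, where `bool` is one byte.

```cuda
#include <cuda_runtime.h>

// Each of these types occupies exactly one byte, so a store to any of
// them would show up as a 1-byte access.
static_assert(sizeof(char) == 1, "char is one byte by definition");
static_assert(sizeof(signed char) == 1, "signed char is one byte");
static_assert(sizeof(unsigned char) == 1, "unsigned char is one byte");
static_assert(sizeof(bool) == 1, "assumed: bool is one byte on this platform");
static_assert(sizeof(char1) == 1, "CUDA vector type char1 is one byte");
static_assert(sizeof(uchar1) == 1, "CUDA vector type uchar1 is one byte");
```

If the file compiles, all of the listed types are one byte wide on that platform, so any global array of these types is a candidate for the faulting write.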
I have attached my source file in case anyone wants to take a look. It would be great if someone has a suggestion for me.
Thx,
mic