So I have a kernel that just copies the CUDA addresses of some constant memory variables to another portion of global memory. There’s just a slight problem: when I get the first element, the address is 0x0, which I treat as a NULL address, so I’m causing myself to error out. I’d like to verify that I am in fact getting the address of an existing variable rather than one that doesn’t exist.
So if I had a
if (var == var1) { copy &var1 }
else if (var == var2) { copy &var2 }
else { throw cuda error }
inside the kernel, I could check for the error after the kernel executes and be sure that the variable didn’t exist.
My question is whether anyone knows how to throw an error from within a kernel. I’d like it to be as un-hackish as possible too, if such a way exists. I couldn’t find anything in the CUDA reference.
I never code anything like *(int*)0=0 in my kernels on purpose. One reason I find this practice counterproductive is that you cannot distinguish a deliberate failure induced by the above statement from a genuine programming error, such as a buffer overrun.
Instead, I maintain an array of data structures, one structure per block, which store the status information for that block. The host code clears the structures and copies them to device memory just before launching a kernel. The kernel code writes error information (if any) into the structure associated with the block that discovered the error, or leaves the structure untouched if there is no error. After the kernel returns, the host code copies the structures back from device memory and analyzes them. In the simplest case the structure can degenerate into a single error code, but my implementation also records the file name and line number of the place in the kernel code where the error was discovered. It would be great for my purposes if CUDA supported sprintf() in device code, but it doesn’t (as of SDK v2.2), so the error reporting is somewhat limited. I still find it very helpful for debugging complex kernels.
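A minimal sketch of the per-block status scheme described above, not the poster's actual implementation. The names BlockStatus, reportError, KERNEL_ERROR and scaleKernel are hypothetical, the error condition (a negative input) is made up for illustration, and the code assumes a reasonably recent CUDA toolkit and a device that supports atomicCAS on global memory.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical per-block status record (all names are illustrative).
struct BlockStatus {
    int  code;      // 0 means "no error"
    int  line;      // source line where the error was detected
    char file[64];  // source file name, truncated to fit
};

// Device-side string copy into a fixed-size buffer.
__device__ void copyName(char *dst, const char *src, int cap) {
    int i = 0;
    for (; i < cap - 1 && src[i] != '\0'; ++i) dst[i] = src[i];
    dst[i] = '\0';
}

// Record an error in the structure belonging to the calling block.
// Only the first thread of a block to report gets to fill in the details.
__device__ void reportError(BlockStatus *status, int code,
                            const char *file, int line) {
    BlockStatus *s = &status[blockIdx.x];
    if (atomicCAS(&s->code, 0, code) == 0) {
        s->line = line;
        copyName(s->file, file, (int)sizeof(s->file));
    }
}

#define KERNEL_ERROR(status, code) \
    reportError((status), (code), __FILE__, __LINE__)

// Example kernel: doubles each element, treating negative input as an error.
__global__ void scaleKernel(float *data, int n, BlockStatus *status) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    if (data[i] < 0.0f) {          // hypothetical error condition
        KERNEL_ERROR(status, 1);
        return;
    }
    data[i] *= 2.0f;
}

int main() {
    const int n = 256, threads = 64, blocks = n / threads;
    std::vector<float> h(n, 1.0f);
    h[130] = -1.0f;                // index 130 is handled by block 2

    float *dData = nullptr;
    BlockStatus *dStatus = nullptr;
    cudaMalloc(&dData, n * sizeof(float));
    cudaMalloc(&dStatus, blocks * sizeof(BlockStatus));
    cudaMemcpy(dData, h.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    // Host clears the structures and copies them to the device before launch.
    std::vector<BlockStatus> hStatus(blocks);  // value-initialized to zero
    cudaMemcpy(dStatus, hStatus.data(), blocks * sizeof(BlockStatus),
               cudaMemcpyHostToDevice);

    scaleKernel<<<blocks, threads>>>(dData, n, dStatus);
    cudaDeviceSynchronize();

    // Host pulls the structures back and analyzes them.
    cudaMemcpy(hStatus.data(), dStatus, blocks * sizeof(BlockStatus),
               cudaMemcpyDeviceToHost);
    for (int b = 0; b < blocks; ++b)
        if (hStatus[b].code != 0)
            printf("block %d: error %d at %s:%d\n", b, hStatus[b].code,
                   hStatus[b].file, hStatus[b].line);

    cudaFree(dData);
    cudaFree(dStatus);
    return 0;
}
```

Because the kernel finishes normally, an error recorded this way cannot be confused with a crash such as a buffer overrun, which would surface through cudaGetLastError() instead.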
It is probably best if I clear up my scenario a bit. I agree, though: I don’t like that solution one bit and I think it’s horrible practice. The kernel that I want to throw an error in is called for one purpose only: to retrieve the address of a constant memory variable and store it in global memory. It’s essentially a single-thread kernel invocation: one block with a block size of 1. I do almost exactly as you recommended for my more complex kernels, but in initialization, the approach I was using before required setting a value in global memory, updating it in the kernel if an error occurred, and then checking that global variable on return. Since I already check whether the temporary global memory for the address of the constant memory variable exists, I don’t see any other way my kernel would ever error out.
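A minimal sketch of that flag-in-global-memory approach for the single-thread kernel, under stated assumptions: the constant variables var1 and var2, the VarId enum and the kernel name getConstAddress are all hypothetical, and the variable is selected by an integer id here because the original selection logic is not shown. The separate flag means a legitimate address of 0x0 is never mistaken for "variable not found".

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical constant memory variables.
__constant__ float var1[4];
__constant__ float var2[4];

enum VarId { VAR1 = 0, VAR2 = 1 };

// Single-thread kernel: store the address of the requested constant
// variable in global memory, or raise the error flag if the id is unknown.
__global__ void getConstAddress(int id, const void **outAddr, int *errFlag) {
    if (id == VAR1)      *outAddr = var1;
    else if (id == VAR2) *outAddr = var2;
    else                 *errFlag = 1;  // report instead of crashing
}

int main() {
    const void **dAddr = nullptr;
    int *dErr = nullptr;
    cudaMalloc(&dAddr, sizeof(void *));
    cudaMalloc(&dErr, sizeof(int));
    cudaMemset(dAddr, 0, sizeof(void *));
    cudaMemset(dErr, 0, sizeof(int));   // flag starts out as "no error"

    // Request an id that does not correspond to any variable.
    getConstAddress<<<1, 1>>>(7, dAddr, dErr);
    cudaDeviceSynchronize();

    int hErr = 0;
    cudaMemcpy(&hErr, dErr, sizeof(int), cudaMemcpyDeviceToHost);
    if (hErr != 0)
        printf("requested variable does not exist\n");
    else
        printf("address stored in global memory\n");

    cudaFree(dAddr);
    cudaFree(dErr);
    return 0;
}
```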
In this case, with this kernel, I would never encounter a deliberate failure, which was the motivation for erroring out of the kernel to determine whether the variable existed or not. I like your debugging method though and thanks for that info.