The CUDA reference manual says that the return value of cudaLaunch is one of the following:
cudaSuccess
cudaErrorInvalidDeviceFunction
cudaErrorInvalidConfiguration
My question is: what can cause cudaErrorInvalidDeviceFunction as a result of a kernel call?
I wrote multi-GPU code that works on every CUDA-capable PC except one.
On the “WinXP, 2x GTX 280” machine, the program exits with cudaErrorInvalidDeviceFunction.
Does anyone know where my problem might be?
I did. So the reason I get cudaErrorInvalidDeviceFunction as a result of a kernel call is that the device is “busy”: some other code is running on the device, or even my own. Right?
Or at least, this is one of the possible reasons why I can get this error.
That person seems to say that an incorrect call corrupted the GPU’s memory, possibly messing up the kernel’s code or just scrambling the nvidia driver’s state. Possibly, it could be an ordinary out-of-bounds access that is scrambling driver memory. I don’t know if this is what you’re seeing, but overwriting-memory bugs tend to be semi-non-deterministic and may manifest themselves in one configuration but not another.
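To make the out-of-bounds idea concrete, here is a minimal sketch of the usual guard against it. The kernel name `scale`, the array size and the block size are made up for illustration, and `cudaDeviceSynchronize` assumes a reasonably recent toolkit (older ones used `cudaThreadSynchronize`). Without the `if (i < n)` check, the last 24 threads would write past the end of the allocation.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: multiplies each element by a factor.
// The bounds check keeps surplus threads from touching memory past the array.
__global__ void scale(float* data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] *= factor;
    }
}

int main() {
    const int n = 1000;  // deliberately not a multiple of the block size
    float* d_data = NULL;

    cudaError_t err = cudaMalloc((void**)&d_data, n * sizeof(float));
    if (err != cudaSuccess) {
        std::printf("cudaMalloc failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    cudaMemset(d_data, 0, n * sizeof(float));

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;  // 4 blocks, 1024 threads in total
    scale<<<blocks, threads>>>(d_data, n, 2.0f);

    err = cudaGetLastError();           // errors detected at launch time
    if (err == cudaSuccess) {
        err = cudaDeviceSynchronize();  // errors raised while the kernel ran
    }
    std::printf("kernel status: %s\n", cudaGetErrorString(err));

    cudaFree(d_data);
    return err == cudaSuccess ? 0 : 1;
}
```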
For the sake of completeness, what are the other systems on which your code does work?
Sorry, I was misinformed. In emu mode everything works fine.
The code is always recompiled on every machine before it is run.
I also tried running an empty kernel in place of the existing one, calling it with a <<<dim3(1,1,1), dim3(1,1,1)>>> configuration. Same result: the error occurs on the FIRST kernel call.
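For anyone who wants to repeat the empty-kernel experiment, a minimal sketch could look like the following. It assumes a toolkit that provides `cudaDeviceSynchronize` and lets a single host thread switch devices with `cudaSetDevice` (CUDA 4.0 or later; older toolkits needed `cudaThreadSynchronize` and one host thread per GPU). The kernel name `emptyKernel` is made up. Printing each device's compute capability is an extra step, not something from the posts above; it lets you compare the hardware against the architecture the binary was built for, which is one thing worth ruling out.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Empty kernel, launched with a single thread in a single block.
__global__ void emptyKernel() {}

int main() {
    int deviceCount = 0;
    cudaError_t err = cudaGetDeviceCount(&deviceCount);
    if (err != cudaSuccess) {
        std::printf("cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    for (int dev = 0; dev < deviceCount; ++dev) {
        cudaDeviceProp prop;
        err = cudaGetDeviceProperties(&prop, dev);
        if (err != cudaSuccess) {
            std::printf("device %d: cudaGetDeviceProperties failed: %s\n",
                        dev, cudaGetErrorString(err));
            continue;
        }
        std::printf("device %d: %s, compute capability %d.%d\n",
                    dev, prop.name, prop.major, prop.minor);

        err = cudaSetDevice(dev);
        if (err != cudaSuccess) {
            std::printf("device %d: cudaSetDevice failed: %s\n",
                        dev, cudaGetErrorString(err));
            continue;
        }

        emptyKernel<<<dim3(1, 1, 1), dim3(1, 1, 1)>>>();

        err = cudaGetLastError();           // launch-time errors show up here
        if (err == cudaSuccess) {
            err = cudaDeviceSynchronize();  // execution-time errors show up here
        }
        std::printf("device %d: launch status: %s\n", dev, cudaGetErrorString(err));
    }
    return 0;
}
```

If the empty kernel fails on one GPU only, or on both, that narrows things down before touching the real code.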
Recompiling is what typically makes out-of-bounds bugs surface or hide. You can try compiling the code on a Windows machine where it works, and see if it’ll run on the machine where it doesn’t. Although this won’t really tell you much.
Btw, is your Windows machine with 4GB using an x64 OS?
Sorry, I thought you’d said you were using cuLaunch.
And are the machines on which it works 32-bit? If so, this is an important point you should have mentioned from the start. Your error is probably in how the code deals with the x64 architecture. E.g., you use sizeof(long) instead of sizeof(void*) somewhere, or something similar. This causes an out-of-bounds access.
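A small sketch of the sizeof pitfall, in case it helps. The table of device pointers (`d_table`, `h_table`, `n`) is a hypothetical example, not taken from the original program. The point is only that a buffer meant to hold pointers has to be sized with a pointer type.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    // On 64-bit Windows long stays 4 bytes while pointers are 8 bytes;
    // on 32-bit Windows both are 4 bytes, so the bug stays hidden there.
    std::printf("sizeof(long) = %d, sizeof(void*) = %d\n",
                (int)sizeof(long), (int)sizeof(void*));

    const int n = 4;         // hypothetical number of buffers
    float** d_table = NULL;  // device array meant to hold n pointers

    // Buggy sizing would be n * sizeof(long): too small wherever long is
    // narrower than a pointer, so the copy below would overrun the allocation.
    // Sizing by the element type that is actually stored avoids that.
    cudaError_t err = cudaMalloc((void**)&d_table, n * sizeof(float*));
    if (err != cudaSuccess) {
        std::printf("cudaMalloc failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    float* h_table[n] = {NULL, NULL, NULL, NULL};
    err = cudaMemcpy(d_table, h_table, n * sizeof(float*), cudaMemcpyHostToDevice);
    std::printf("cudaMemcpy status: %s\n", cudaGetErrorString(err));

    cudaFree(d_table);
    return err == cudaSuccess ? 0 : 1;
}
```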
What I meant about “surface and hide” is that an out-of-bounds bug will sometimes appear during one compile, and not appear on another. That is how it’s not quite deterministic. Sometimes you may change a completely unrelated line of code, and it will toggle the bug.