Any detail on cudaErrorInvalidValue(11)?

I tried to launch my kernel with blocks(5214,1,1), threads(34,30,1), and 10 KB of dynamic shared memory; it returns cudaErrorInvalidValue(11).

So I cut it down to one block and it still returns cudaErrorInvalidValue(11) in the Visual Studio debugger.

If I launch it in Nsight, I can trace through the code, but I see that my checks cause it to return early.

Is there any detail of cudaErrorInvalidValue(11) reported anywhere?

The code appears to behave as expected in Nsight, so how can I figure out why it’s returning cudaErrorInvalidValue(11)?

Are you compiling with an architecture switch that properly specifies a cc 2.0 or later architecture? (e.g. -arch=sm_20)
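
For reference, a bare-bones command line would look something like this (the file and output names here are placeholders, not from your project):

    nvcc -arch=sm_20 -o myapp mykernel.cu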

see if this helps:

http://www.cs.cmu.edu/afs/cs/academic/class/15668-s11/www/cuda-doc/html/group__CUDART__TYPES_g3f51e3575c2178246db0a94a430e0038.html

or Google "NVIDIA CUDA library: cudaError"; it should bring you to the same page

I’m compiling for arch 3.5 (sm_35).

Thanks, but I’ve seen that.

I was looking for detail that identifies the offending parameter.

All but 3 of the parameters are valid GPU memory addresses. The other 3 are legit int values.

Interestingly, if I re-launch the kernel in the debugger by resetting the current instruction (same parameters), it returns cudaSuccess.

On a kernel launch, there are only four invalid-value causes I can think of (see the sketch after this list):

  • blocks dim3 variable
  • threads dim3 variable
  • dynamic shared memory amount
  • stream ID
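
Here’s a minimal, self-contained sketch that makes all four launch-configuration variables explicit, using your reported sizes (the kernel name and its argument are hypothetical):

    #include <cstdio>
    #include <cuda_runtime.h>

    // Hypothetical kernel, for illustration only.
    __global__ void myKernel(int n)
    {
        extern __shared__ char smem[];   // dynamic shared memory
        (void)smem; (void)n;
    }

    int main()
    {
        dim3 blocks(5214, 1, 1);          // grid dimensions
        dim3 threads(34, 30, 1);          // 34*30 = 1020 threads/block (<= 1024 on cc 3.5)
        size_t sharedBytes = 10 * 1024;   // dynamic shared memory per block
        cudaStream_t stream = 0;          // default stream

        myKernel<<<blocks, threads, sharedBytes, stream>>>(0);
        cudaError_t err = cudaGetLastError();   // query launch-configuration errors
        if (err != cudaSuccess)
            printf("launch error: %s\n", cudaGetErrorString(err));
        return 0;
    }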

Perhaps you should show your kernel invocation along with the lines of code used to set up those four launch-configuration variables. And reconfirm that the module is actually being compiled for cc 3.5. If you’re in Visual Studio, the arch compile switches can be tricky: file-specific settings are applied after the global settings, so if you have any, they silently override the global ones, and it’s not obvious that this is happening unless you inspect the actual compile command being issued.

Since you’ve already cut it down to one block, you should be able to rule out the first variable (barring a code goof-up). You could try something similar with the other three and see which one finally allows the kernel to launch.

Apparently the error was left over from a previous cudaMemcpy call.
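
In other words, an unchecked error from an earlier runtime call can surface when you query the error state after a later, perfectly valid launch. A contrived sketch of the effect (the kernel and pointer names are made up):

    #include <cstdio>
    #include <cuda_runtime.h>

    __global__ void emptyKernel() {}

    int main()
    {
        // A memcpy that fails with cudaErrorInvalidValue; return value ignored.
        float *d_ptr;
        cudaMalloc(&d_ptr, 16);
        cudaMemcpy(d_ptr, nullptr, 16, cudaMemcpyHostToDevice);

        emptyKernel<<<1, 1>>>();                  // this launch is fine
        cudaError_t err = cudaGetLastError();     // but the stale error surfaces here
        printf("%s\n", cudaGetErrorString(err));  // "invalid argument"
        return 0;
    }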

Lesson learned: when you’re having trouble with CUDA code, don’t skimp on error checking. Do it on every CUDA API call and every kernel launch, or you will waste time scratching your head over problems like this one.
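
One common pattern for this (my sketch, not something from this thread) is a macro that wraps every runtime call and checks each launch separately:

    #include <cstdio>
    #include <cstdlib>
    #include <cuda_runtime.h>

    #define CUDA_CHECK(call)                                          \
        do {                                                          \
            cudaError_t err_ = (call);                                \
            if (err_ != cudaSuccess) {                                \
                fprintf(stderr, "CUDA error %d (%s) at %s:%d\n",      \
                        (int)err_, cudaGetErrorString(err_),          \
                        __FILE__, __LINE__);                          \
                exit(EXIT_FAILURE);                                   \
            }                                                         \
        } while (0)

    // Usage: check memcpys too, so a stale error can't be mistaken
    // for a kernel-launch failure later.
    //   CUDA_CHECK(cudaMemcpy(dst, src, bytes, cudaMemcpyHostToDevice));
    //   myKernel<<<blocks, threads>>>(...);
    //   CUDA_CHECK(cudaGetLastError());       // launch-configuration errors
    //   CUDA_CHECK(cudaDeviceSynchronize());  // asynchronous execution errors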