Unexplained "invalid argument" error


I’m getting a strange “invalid argument” error on a kernel launch. It doesn’t happen in the emulator, and the failing invocation takes only 0.1 ms, so it can’t be a runtime error inside the kernel or a timeout. In fact, I still get the error even if I make the first line of the kernel “if ( threadIdx.x < 512 ) return;”

With 180 or 192 threads per block it worked fine. When I increased to 256 threads (or even 193), I started getting the error. I can make the error go away by commenting out a few lines at the end of the kernel’s inner loop (the kernel is 150 lines long and takes 32 bytes of arguments). Reducing shared memory usage has no effect.

Any ideas or suggestions? What is this error supposed to mean?


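Your kernel probably uses too many registers per thread to be launched with 256 threads per block or more. The registers for all threads of a block have to fit on one multiprocessor, so the launch is rejected before any kernel code runs. That is why it fails so quickly and why an early return in the kernel doesn’t help.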

We’ll improve our error messages in the next release.
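To make the register limit above concrete, here is a minimal sketch in plain C of the arithmetic involved. The helper names and the budget of 8192 registers used in the usage note are assumptions for illustration only; real hardware also allocates registers in coarser granules, so the actual limit can be somewhat lower than this estimate.

```c
#include <assert.h>

/* Rough upper bound on threads per block for a kernel that needs
   regs_per_thread registers, given regs_available registers in total.
   Hypothetical helper: it ignores the hardware's allocation granularity. */
int max_threads_per_block(int regs_per_thread, int regs_available)
{
    assert(regs_per_thread > 0 && regs_available >= 0);
    return regs_available / regs_per_thread;
}

/* Nonzero if a block of `threads` threads fits within the register budget. */
int launch_fits(int regs_per_thread, int threads, int regs_available)
{
    return threads <= max_threads_per_block(regs_per_thread, regs_available);
}
```

With an assumed budget of 8192 registers, a kernel using 32 registers per thread would fit 256 threads per block, while one using 40 registers per thread would not. Commenting out a few lines can shift a kernel from one case to the other.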


Thanks. Are there any tricks for reducing the number of registers used? Is the compiler doing everything it can to reuse registers or do I have to limit the scope of variables manually?

Yes, the compiler minimizes the number of registers as much as possible.

To find the number N of registers a particular kernel compiles to, compile it with nvcc’s -cubin option (see the nvcc manual for more details). This outputs a .cubin file that you can open in a text editor: search for the name of your kernel and look just below it for the line of the form “reg = N”.
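If you would rather not search the file by hand, the lookup can be sketched in a few lines of C. This is only an illustration: the function name is hypothetical, and it assumes nothing about the .cubin layout beyond what is described above, namely that the kernel's name appears in the text and a line containing "reg = N" follows it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns the N from the first "reg = N" that follows the first occurrence
   of `kernel` in the cubin text, or -1 if either one is not found.
   Hypothetical helper; it uses plain substring matching, so a kernel whose
   name is a prefix of another kernel's name may match the wrong entry. */
int cubin_reg_count(const char *cubin_text, const char *kernel)
{
    assert(cubin_text != NULL && kernel != NULL);

    /* Find the kernel's name first. */
    const char *entry = strstr(cubin_text, kernel);
    if (entry == NULL)
        return -1;

    /* Then find the register line below it. */
    const char *reg = strstr(entry, "reg = ");
    if (reg == NULL)
        return -1;

    return atoi(reg + strlen("reg = "));
}
```

Read the .cubin file into a string and pass it in together with the kernel's name as it appears in the file. For C++ code this is likely to be the mangled name.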

Keep in mind that the number of registers depends on the compiler version (we’ll keep improving the compiler) and on the particular hardware you run on. Any change to your C code that reduces this number for one compiler version and one piece of hardware might become moot with future versions and/or hardware.