Encountering cudaErrorInvalidValue (error 11) although parameters for kernel call seem fine

I have written CUDA C code that works fine for a small dataset (about 15 KB). For a larger dataset (about 800 MB), the same code fails to launch even the first kernel and returns the cudaErrorInvalidValue error.

I am launching the kernel with 50 blocks of 1 thread each (this is inefficient, but I would like to get the code running first and then optimize it progressively). I am passing 3 parameters to the kernel, each of which points to a separate C structure allocated in device memory.

I noticed in many other answers that error 11 could be a result of a failed preceding memcpy call. In my case, the cudaMemcpy function call preceding the kernel call is successful.
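One way to confirm that the memcpy really succeeded, and to separate launch-time errors from errors raised while the kernel runs, is to check every runtime call and to clear any stale error just before the launch. The following is only a minimal sketch: CUDA_CHECK and demoKernel are hypothetical names used for illustration, and the kernel is a placeholder, not the code from this question.

```cuda
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

/* Hypothetical helper macro: report file/line and exit if a runtime call fails. */
#define CUDA_CHECK(call)                                                  \
    do {                                                                  \
        cudaError_t err_ = (call);                                        \
        if (err_ != cudaSuccess) {                                        \
            fprintf(stderr, "%s:%d: %s failed: %s\n", __FILE__, __LINE__, \
                    #call, cudaGetErrorString(err_));                     \
            exit(EXIT_FAILURE);                                           \
        }                                                                 \
    } while (0)

/* Placeholder kernel standing in for the real one. */
__global__ void demoKernel(const int *in, int *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[i] + 1;
}

int main(void)
{
    const int n = 50;
    int host[50];
    for (int i = 0; i < n; ++i) host[i] = i;

    int *dIn = NULL, *dOut = NULL;
    CUDA_CHECK(cudaMalloc((void **)&dIn, n * sizeof(int)));
    CUDA_CHECK(cudaMalloc((void **)&dOut, n * sizeof(int)));
    CUDA_CHECK(cudaMemcpy(dIn, host, n * sizeof(int), cudaMemcpyHostToDevice));

    /* Discard any error left behind by an earlier unchecked call, so the
       check after the launch reflects the launch itself. */
    (void)cudaGetLastError();

    demoKernel<<<50, 1>>>(dIn, dOut, n);   /* 50 blocks of 1 thread, as in the question */
    CUDA_CHECK(cudaGetLastError());        /* errors detected at launch time */
    CUDA_CHECK(cudaDeviceSynchronize());   /* errors raised while the kernel executed */

    CUDA_CHECK(cudaMemcpy(host, dOut, n * sizeof(int), cudaMemcpyDeviceToHost));
    printf("host[0] = %d, host[%d] = %d\n", host[0], n - 1, host[n - 1]);

    CUDA_CHECK(cudaFree(dIn));
    CUDA_CHECK(cudaFree(dOut));
    return 0;
}
```

If the check immediately after the launch is the one that fails, the problem lies with the launch itself (configuration or resources) rather than with the preceding copy.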

Strangely, when I reduce the computation done in the kernel by removing its device function calls (which run complex nested loops), the code runs, which suggests that the kernel parameters themselves are valid.

I am running this code on a Quadro P5000 with 16 GB of VRAM, so it has sufficient memory to hold the data that I am passing.

Could someone point me to the areas in which I should look for the error, or to tools that I could use to debug this problem? I have not explicitly declared any local or shared memory. Could the nested loops in the device functions be exhausting local or shared memory?
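The per-thread resource usage of a kernel can be inspected at run time with cudaFuncGetAttributes, and the per-thread stack limit with cudaDeviceGetLimit. The sketch below only shows how to read these values; demoKernel is again a hypothetical placeholder, and the numbers it reports do not by themselves identify the cause of the failure.

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

/* Placeholder kernel; substitute the kernel that fails to launch. */
__global__ void demoKernel(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

int main(void)
{
    struct cudaFuncAttributes attr;
    cudaError_t err = cudaFuncGetAttributes(&attr, (const void *)demoKernel);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaFuncGetAttributes: %s\n", cudaGetErrorString(err));
        return 1;
    }

    /* Resources the compiler assigned to this kernel. */
    printf("registers per thread    : %d\n", attr.numRegs);
    printf("local memory per thread : %zu bytes\n", attr.localSizeBytes);
    printf("static shared memory    : %zu bytes\n", attr.sharedSizeBytes);
    printf("max threads per block   : %d\n", attr.maxThreadsPerBlock);

    /* Current per-thread stack limit on the device. */
    size_t stackSize = 0;
    err = cudaDeviceGetLimit(&stackSize, cudaLimitStackSize);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaDeviceGetLimit: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("stack size per thread   : %zu bytes\n", stackSize);
    return 0;
}
```

Compiling with `nvcc -Xptxas -v` prints similar register and memory figures at build time, which makes it possible to compare the full kernel against the reduced one that launches successfully.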

Here is the cuda-memcheck error output:

========= Program hit cudaErrorInvalidValue (error 11) due to "invalid argument" on CUDA API call to cudaLaunch.

Can you provide a little more info, specifically:

  • A minimal test case; e.g. a complete, standalone program that is as small as possible and has few/no dependencies on other software.
  • The entire sequence of commands/actions that reproduce the bug (using the minimal test case).
  • The unabbreviated outputs of those commands/actions including files, stdout/stderr and logs.
  • A description of the environment where the problem occurs (type and version of the hardware, OS, kernel, compiler and any other relevant software).