Heavy usage of macros causes warning: Cuda API error detected: cudaLaunch returned (0x2)

iliak · November 17, 2015, 3:48pm

I have encountered a new problem:
when I add and use the following macro in the code

_I_DEBUG(2,"%s",stringParam.c_str());

it causes the error "cudaLaunch returned (0x2)" when running the kernel.
The definition of the macro:

#define _I_DEBUG(y, ...) if (y <= _I_DLEVEL) {printf("I: "); printf(__VA_ARGS__);}

When I comment out that line, the kernel builds and launches fine, and uses 39 registers.

Any idea what could cause it?

The code is large.
I have another problem that might be connected to this one, described in my previous post: https://devtalk.nvidia.com/default/topic/889014/cuda-programming-and-performance/gdb-error-regmap_max_entries-failed/
I am using the latest CUDA, 7.5.

njuffa · November 17, 2015, 4:30pm

There is way too little information here to help diagnose the problem. When seeking help with debugging run-time failures, it is highly advisable to post buildable and runnable, self-contained code that reproduces the problem, so others can experiment with the code. The smaller the code the better. Also you would want to mention how the code is compiled (exact nvcc command line) and on what GPU and OS platform you are running the code.

In any case, make sure that your code checks the status return of every API call, and every kernel launch, otherwise it can easily happen that the source of the problem is far away from the point of failure, and much harder to find.
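
As a minimal sketch of that kind of checking: the `CUDA_CHECK` macro name and the placeholder kernel `dummyKernel` below are illustrative assumptions, not taken from the original code. The idea is simply to wrap every runtime API call, and to query the error state right after each kernel launch and again after synchronizing.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper (name is an assumption): report file/line and abort
// if a CUDA runtime call returns anything other than cudaSuccess.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "CUDA error %d (%s) at %s:%d\n",          \
                    (int)err_, cudaGetErrorString(err_),              \
                    __FILE__, __LINE__);                              \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Placeholder kernel standing in for the real one.
__global__ void dummyKernel(int *out)
{
    out[threadIdx.x] = (int)threadIdx.x;
}

int main()
{
    int *d_out = NULL;
    CUDA_CHECK(cudaMalloc(&d_out, 32 * sizeof(int)));

    dummyKernel<<<1, 32>>>(d_out);
    CUDA_CHECK(cudaGetLastError());       // reports launch failures
    CUDA_CHECK(cudaDeviceSynchronize());  // reports errors raised during execution

    CUDA_CHECK(cudaFree(d_out));
    return 0;
}
```

With checks like these in place, the first failing call is reported with its file and line, rather than surfacing later at an unrelated launch.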

iliak · November 17, 2015, 4:41pm

I know it is best to post code, but that is not possible.
I tried to reproduce it in a small test case, but the error did not occur.

The nvcc command line is the default; I made no changes except for a few include paths.
3.16.0-31-generic #43~14.04.1-Ubuntu SMP Tue Mar 10 20:13:38 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
i3
2x GTX 980 Ti

I have done that; there are no previous errors.

The CUDA output file is a bit large: 16.6 MB.