I recently ran into a problem with nvcc. While experimenting with compiler options to get the most efficient code, I tried the --gpu-name and --gpu-code flags. After several days of debugging I finally found the problem: I had a bug in my code that compiled cleanly when the compiler should have rejected it. My kernel code looked like this:
__global__ void myKernel(float *src, float *hist, int len, int number) //, const float minval, const float width)
__shared__ unsigned int s_hist[MAX_USABLE_SHARED];
I forgot to divide MAX_USABLE_SHARED by 4 (the size in bytes of an unsigned int), so the array overflowed the available shared memory.
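For illustration, here is a minimal sketch of how this kind of sizing mistake could be caught at compile time, regardless of which nvcc flags are used. It assumes that MAX_USABLE_SHARED is a byte budget. The value 16000 and the names kSharedBytesPerBlock, elementsThatFit and kHistBins are hypothetical and do not come from the original code. It also assumes a C++11-capable compiler, which is newer than the toolchain described in this post.

```cpp
#include <cstddef>

// Compute capability 1.x devices provide 16 KB of shared memory per multiprocessor.
constexpr std::size_t kSharedBytesPerBlock = 16 * 1024;

// Hypothetical byte budget reserved for the histogram (assumed value, not from the post).
// It is kept below 16 KB because kernel arguments also occupy shared memory on 1.x devices.
constexpr std::size_t MAX_USABLE_SHARED = 16000;

// Number of elements of type T that fit into a budget given in bytes.
template <typename T>
constexpr std::size_t elementsThatFit(std::size_t bytes) {
    return bytes / sizeof(T);
}

// Element count for the shared array: the byte budget divided by the element size.
constexpr std::size_t kHistBins = elementsThatFit<unsigned int>(MAX_USABLE_SHARED);

// Fails the build if the array would not fit, instead of failing at launch time.
static_assert(kHistBins * sizeof(unsigned int) <= kSharedBytesPerBlock,
              "s_hist does not fit into shared memory");

// Inside the kernel the declaration would then read:
//   __shared__ unsigned int s_hist[kHistBins];
```

Using MAX_USABLE_SHARED directly as the element count, as in the buggy declaration, would make the same static_assert fail, so the mistake would show up as a compile error rather than as an opaque exception at run time.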
If I compile this code with only the "--gpu-name compute_13" switch, it compiles fine all the way to a binary .obj file, which then links into a working executable. The executable runs fine until the first CUDA function call, which reports the following error:
"First-chance exception at 0x77d4dd10 in myProg.exe: Microsoft C++ exception: cudaError_enum at memory location 0x0012f800…" and no CUDA error message, since this message is reported by Windows.
From this error message it is very difficult to tell that the real problem was a shared memory allocation that was too large.
I tested other combinations of flags (such as adding the "--gpu-code sm_13" flag) and found that they would either fail to compile and report an error about my shared memory allocation, or simply not produce any .obj file even though the output states "creating new library". So the problem really only occurs when the --gpu-name flag is used alone.
I have seen other people on the forums with the same problem, but I could never find an answer to it, so I hope this information is helpful.
By the way, I'm using the latest CUDA SDK (2.0 non-beta). This is the nvcc command line with which my program compiles fine when it shouldn't:
"C:\CUDA\bin\nvcc.exe" --ptxas-options="-v" --machine 64 --gpu-name compute_13 -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 8\VC\bin" -c -D_DEBUG -DD_DEBUG -D-DWIN32 -D-D_CONSOLE -D-D_MBCS -Xcompiler "/EHsc /W3 /nologo /Od /Zi /RTC1 /MTd /Wp64" -I"C:\CUDA\include" -I"C:\Program Files (x86)\NVIDIA Corporation\NVIDIA CUDA SDK\common\inc" -o Debug\CUDA.obj CUDA.vcproj