compiler bugs, caveats

Hello!
While developing several projects with CUDA I encountered many problems: some compiler bugs, some different-than-C-compiler behavior, some completely untraceable bugs.
I had a hard time figuring them out, as I sometimes didn’t know where and what to search for. It was sometimes a real pain and took a lot of time.
In this thread I would like the CUDA community to post every_bug_out_there. Every “wtf”, every odd behaviour, etc.
Let it be documented in one place.
Please include code examples that cause the bug, what the code should do but doesn’t (or what it does but shouldn’t), and how to fix or work around it…
Or post a thread where you posted your bug before.
Let’s help new devs ;)

To get something started:
[topic=“166687”]unexpected loop unrolling[/topic]
[topic=“166681”]compiler problems with 64bit datatype and logical instructions[/topic]

Uh… in my one-year experience with CUDA I stumbled on many, many strange errors. Some are funny, some caused me a headache for several days or even weeks, completely stopping my progress…
Here is my list (from newest to oldest).
Unfortunately, in most cases I have no idea about the progress in resolving the problem - is anyone working on it, or was it completely ignored? … I am worried that the latter is the case.
Anyway, have fun reading!

unresolved bug: compiler crash — Run out of registers in predicate (CUDA 3.0)
http://forums.nvidia.com/index.php?showtopic=167175

bug + partial workaround: compile failure - __device__ function cannot be called from __host__ __device__ function despite __CUDA_ARCH__ guarding (CUDA 3.0)
http://forums.nvidia.com/index.php?showtopic=166868

bug + workaround: Compiler’s stack overflow. (CUDA 2.3, maybe not in 3.0 anymore)
http://forums.nvidia.com/index.php?showtopic=161012

bug: cudaMemcpyToSymbol working differently when in inline function and when in macro
http://forums.nvidia.com/index.php?showtopic=160898

bug: nvcc or VS issue? Inline function not included in partial compilation. No response at all :(
http://forums.nvidia.com/index.php?showtopic=160802

bug: Compile error with an array of objects. Not tested with CUDA 3.0
http://forums.nvidia.com/index.php?showtopic=160502

bug: Unnecessary local memory usage with int4 type. Not much response…
http://forums.nvidia.com/index.php?showtopic=157594

driver? bug: Inefficient block scheduling. Ridiculously off-topic response from tmurray :(
http://forums.nvidia.com/index.php?showtopic=150567

bug: Incorrect assignment to global variable in a small conditional branch
http://forums.nvidia.com/index.php?showtopic=156449

compiler or hardware bug: Branch that is never taken causes __syncthreads() de-synchronisation. Not much response :(
http://forums.nvidia.com/index.php?showtopic=152070

strange behaviour: “White gaps”, the GPU staying idle for no reason. The number of variables seems to have an impact on it.
http://forums.nvidia.com/index.php?showtopic=101476

problem: Using textures in several .cu files, even if each texture is being used only in one file
http://forums.nvidia.com/index.php?showtopic=99951

mild bug: Loop not unrolling. Manual unroll necessary
http://forums.nvidia.com/index.php?showtopic=98674

compiler bug: Variable visible before its declaration
http://forums.nvidia.com/index.php?showtopic=97758
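
For anyone not familiar with the __CUDA_ARCH__ guarding mentioned in the second entry of the list, here is a minimal sketch of what that pattern usually looks like. This is not the code from the linked thread. The names HOST_DEVICE, fast_exp_device and portable_exp are made up for illustration, and the fallback macros are only there so the same file also compiles with a plain host C++ compiler.

```cpp
#include <cmath>

// Assumption: nvcc defines __CUDACC__; any other compiler sees an empty qualifier.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

#ifdef __CUDACC__
// Hypothetical device-only helper. __expf is a CUDA device intrinsic,
// so this function must never be called from host code.
__device__ inline float fast_exp_device(float x) { return __expf(x); }
#endif

// One function, two bodies: __CUDA_ARCH__ is defined only while nvcc is
// compiling the device side, so the host side never sees the device-only call.
HOST_DEVICE inline float portable_exp(float x) {
#ifdef __CUDA_ARCH__
    return fast_exp_device(x);  // device compilation pass
#else
    return std::exp(x);         // host compilation pass (or non-CUDA compiler)
#endif
}
```

On the host, portable_exp(0.0f) simply goes through std::exp. The bug report above is about this kind of guarded call still being rejected by the compiler, so treat the sketch as a description of the pattern, not as a statement about what CUDA 3.0 accepts.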

From my experience, every single bug I’ve opened via the online bug report site (not from the forums) was handled in a very professional way.

Most of the time nVidia’s QA wanted a simple repro, but once there was a repro, they checked and indeed fixed the problem in the very next version.

I got updates about the progress (going to QA, going to R&D, check in QA, closed/fixed in next version…) via emails.

Maybe you should try to open those bugs in their bug tracking system and not here in the forums.

my 1 cent

eyal

Probably you are right… I thought both ways are more-or-less equal.