computing with double precision results in launch failure


I have written some code using single precision floating point calculations. Since I now have a GTX 280, I wanted to switch to double precision. Simply replacing “float” with “double” led to an “unspecified launch failure”.

I modified the template from the SDK to produce a minimal example that still reproduces the problem, but now I’m stuck. I would appreciate any help on this.

Here is the output I get:

If I change the TYPE define to float, the error doesn’t occur. Likewise, if I remove the cudaThreadSynchronize() in the double precision version, the error doesn’t occur either.
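For reference, here is a minimal sketch of how the error can be localized. Kernel launches are asynchronous, so a failure during kernel execution is only reported by a later synchronizing call. Removing cudaThreadSynchronize() therefore probably hides the error rather than fixing it. The kernel `scaleKernel` and the helpers `checkCuda` and `runCheck` are hypothetical stand-ins, not the attached code.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define TYPE double  // as in the post; change to float to compare

// Hypothetical stand-in kernel (not the attached code): doubles each element.
__global__ void scaleKernel(TYPE *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] = data[i] * (TYPE)2;
    }
}

// Prints the CUDA error string and returns false if err is not cudaSuccess.
static bool checkCuda(cudaError_t err, const char *what)
{
    if (err != cudaSuccess) {
        fprintf(stderr, "%s: %s\n", what, cudaGetErrorString(err));
        return false;
    }
    return true;
}

// Returns 0 if allocation, launch, and execution all succeed, 1 otherwise.
int runCheck(void)
{
    const int n = 256;
    TYPE *d_data = NULL;

    if (!checkCuda(cudaMalloc((void **)&d_data, n * sizeof(TYPE)), "cudaMalloc")) {
        return 1;
    }
    bool ok = checkCuda(cudaMemset(d_data, 0, n * sizeof(TYPE)), "cudaMemset");

    scaleKernel<<<(n + 127) / 128, 128>>>(d_data, n);

    // Errors in the launch itself (e.g. a bad configuration) show up here.
    ok = checkCuda(cudaGetLastError(), "kernel launch") && ok;

    // Errors during execution only show up once the host waits for the kernel.
    ok = checkCuda(cudaThreadSynchronize(), "kernel execution") && ok;

    cudaFree(d_data);
    return ok ? 0 : 1;
}
```

Calling `runCheck()` from `main` prints which stage failed. cudaThreadSynchronize() is used here because the post uses it; newer toolkits deprecate it in favor of cudaDeviceSynchronize().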

I use the makefile from the SDK, but added the line “NVCCFLAGS += '--gpu-architecture=sm_13'”. The makefile and code are attached to this post.
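As a sanity check that the device really offers double precision, the compute capability can be queried at run time. Double precision needs compute capability 1.3 or higher, which is what sm_13 targets. This is only a sketch; `supportsDouble` and `reportDoubleSupport` are hypothetical helper names.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Double precision requires compute capability 1.3 or higher.
static bool supportsDouble(int major, int minor)
{
    return major > 1 || (major == 1 && minor >= 3);
}

// Returns 1 if the device supports double precision, 0 if not, -1 on error.
int reportDoubleSupport(int device)
{
    cudaDeviceProp prop;
    cudaError_t err = cudaGetDeviceProperties(&prop, device);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceProperties: %s\n", cudaGetErrorString(err));
        return -1;
    }

    bool ok = supportsDouble(prop.major, prop.minor);
    printf("%s: compute capability %d.%d, double precision %s\n",
           prop.name, prop.major, prop.minor,
           ok ? "supported" : "not supported");
    return ok ? 1 : 0;
}
```

Calling `reportDoubleSupport(0)` from `main` reports on the first device.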

Please note that the current kernel doesn’t do anything meaningful. My real code needs both loops and the rest of the structure, but I removed large parts to simplify it.

Any idea what’s wrong?


nvcc version: release 2.1, V0.2.1221

OS: suse 11.1