I am using CUDA with gcc 4.3 on FC9. I wrote my first CUDA program a couple of days ago. It’s the simplest thing one can think of: it allocates a floating-point array on the GTX 280, initializes the array to 1.1, multiplies it by 2.0, and copies the result back to the host, where it is compared with the CPU version. All is well when I use single precision (type specifier “float”), but all hell breaks loose when I use “double”: the program compiles and runs, but all I get back from the card is zeros. I modified my kernel to account for the change in type, and I believe I changed everything consistently.
Any guesses at what the problem may be? Do I need to change something when using double instead of float? Is the size of double different on the GTX 280 compared to the CPU? I don’t have access to my kernel code right now, but I will post it later.
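For reference, the program described above can be sketched roughly as follows. This is not the actual kernel, only a minimal reconstruction from the description: the names `init_and_scale` and `scale_matches_cpu`, the block size of 256, and the comparison tolerance are all hypothetical. The note about the build flag is an assumption about toolchains of that era and may not apply to other setups.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cmath>
#include <cuda_runtime.h>

// Assumption: double precision needs a device of compute capability 1.3 or
// higher (the GTX 280 is one) and, on toolkits of that era, a build such as
//   nvcc -arch=sm_13 file.cu
// Without a suitable -arch flag, nvcc may demote double to float in device code.

// Hypothetical kernel: set each element to 1.1, then multiply it by 2.0.
__global__ void init_and_scale(double *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] = 1.1;
        data[i] *= 2.0;
    }
}

// Runs the kernel on n elements and returns true only if every element
// copied back from the device matches the value computed on the CPU.
bool scale_matches_cpu(int n)
{
    if (n <= 0) return false;

    size_t bytes = (size_t)n * sizeof(double);
    double *h_data = (double *)malloc(bytes);
    if (h_data == NULL) return false;

    double *d_data = NULL;
    if (cudaMalloc((void **)&d_data, bytes) != cudaSuccess) {
        free(h_data);
        return false;
    }

    int threads = 256;                          // assumed block size
    int blocks = (n + threads - 1) / threads;   // enough blocks to cover n
    init_and_scale<<<blocks, threads>>>(d_data, n);

    // Check for launch errors first, then copy back; the blocking copy
    // also reports errors raised while the kernel was running.
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess)
        err = cudaMemcpy(h_data, d_data, bytes, cudaMemcpyDeviceToHost);

    bool ok = (err == cudaSuccess);
    if (!ok)
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));

    // CPU version of the same computation.
    double expected = 1.1;
    expected *= 2.0;

    for (int i = 0; ok && i < n; ++i) {
        if (fabs(h_data[i] - expected) > 1e-12) {   // assumed tolerance
            fprintf(stderr, "mismatch at %d: gpu=%g cpu=%g\n",
                    i, h_data[i], expected);
            ok = false;
        }
    }

    cudaFree(d_data);
    free(h_data);
    return ok;
}
```

Calling `scale_matches_cpu(1024)` from `main` and printing the result is enough to reproduce the comparison. Checking the return values of the CUDA calls, as the sketch does, helps distinguish a failed launch from a kernel that ran but wrote unexpected values.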