CUDA and double precision

Hello everyone,

I am a newbie to CUDA and am still trying to get my head around this whole new desktop supercomputing revolution!

I would like to use CUDA to run some simulations, and of course single precision floating point arithmetic is not enough for that.

Is double precision arithmetic available on all CUDA hardware, or do you need the latest and greatest cards for that to work? I have an 8600M GT in my laptop and I am guessing I cannot expect double precision arithmetic on that.

Also, are there limits on how many double precision variables I can have? Of course the number of register variables must be limited, but are there any other restrictions on declaring double precision variables?

Hope for some answers.

Thank you,

xarg

welcome

No, it's not available on all hardware. Only the new GT200 hardware supports it (compute capability 1.3 graphics cards: the GTX 200 series and the GT200-based Tesla cards). See the programming guide appendix for more info.
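One way to check this on your own machine is to query each device's compute capability at runtime. The following is a minimal sketch using the CUDA runtime API; the helper name `supports_double` is made up for this example, and the 1.3 threshold is the one given above.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: true if the compute capability is at least 1.3,
// the first revision with double precision support.
bool supports_double(int major, int minor)
{
    return major > 1 || (major == 1 && minor >= 3);
}

int main()
{
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        std::printf("cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Report the compute capability of every CUDA device in the system.
    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        if (cudaGetDeviceProperties(&prop, dev) != cudaSuccess)
            continue;
        std::printf("Device %d: %s, compute %d.%d, double precision: %s\n",
                    dev, prop.name, prop.major, prop.minor,
                    supports_double(prop.major, prop.minor) ? "yes" : "no");
    }
    return 0;
}
```

An 8600M GT should report a compute capability below 1.3, so the last column would read "no" for it.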

Doubles take up twice the register space (two 32-bit registers for each double precision register variable). There are also 2-way bank conflicts when accessing doubles in shared memory. For now these can be avoided by splitting each variable into hi and lo parts (see the programming guide) and accessing those separately, but it's a pain.
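The hi/lo split can be sketched roughly as follows, using the `__double2hiint`, `__double2loint` and `__hiloint2double` device intrinsics. The kernel name, the block size of 64 and the reversal pattern are arbitrary choices for illustration, not something taken from the programming guide.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define BLOCK 64  // threads per block; an arbitrary choice for this sketch

// Reverses one block of doubles through shared memory. Each double is
// stored as two 32-bit halves in separate int arrays, so every shared
// memory access is a 32-bit access.
__global__ void reverse_hilo(const double *in, double *out)
{
    __shared__ int s_lo[BLOCK];
    __shared__ int s_hi[BLOCK];

    int tid = threadIdx.x;
    int gid = blockIdx.x * BLOCK + tid;

    double v = in[gid];
    s_lo[tid] = __double2loint(v);  // low 32 bits
    s_hi[tid] = __double2hiint(v);  // high 32 bits
    __syncthreads();

    int src = BLOCK - 1 - tid;
    out[gid] = __hiloint2double(s_hi[src], s_lo[src]);
}

// Hypothetical host-side check: runs one block and verifies that the
// values come back reversed and bit-exact.
bool run_hilo_roundtrip()
{
    double h_in[BLOCK], h_out[BLOCK];
    for (int i = 0; i < BLOCK; ++i)
        h_in[i] = 1.0 / (i + 1);

    double *d_in = 0, *d_out = 0;
    size_t bytes = BLOCK * sizeof(double);
    if (cudaMalloc((void **)&d_in, bytes) != cudaSuccess)
        return false;
    if (cudaMalloc((void **)&d_out, bytes) != cudaSuccess) {
        cudaFree(d_in);
        return false;
    }

    cudaMemcpy(d_in, h_in, bytes, cudaMemcpyHostToDevice);
    reverse_hilo<<<1, BLOCK>>>(d_in, d_out);
    cudaError_t err = cudaMemcpy(h_out, d_out, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    if (err != cudaSuccess)
        return false;

    for (int i = 0; i < BLOCK; ++i)
        if (h_out[i] != h_in[BLOCK - 1 - i])
            return false;
    return true;
}

int main()
{
    std::printf("hi/lo round trip %s\n", run_hilo_roundtrip() ? "ok" : "FAILED");
    return 0;
}
```

With the toolkits of this generation the file needs to be compiled for a double-capable target (for example `nvcc -arch=sm_13`); otherwise the compiler demotes doubles to floats.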

Also, peak performance in double precision is approximately 12 times lower than in single precision. For the Tesla C1060, peak double precision performance is ~78 GFLOPS.
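As a rough check of where those figures come from, here is the arithmetic under the commonly quoted C1060 numbers, which are assumptions not stated in this thread: 240 single precision cores, 30 double precision units (one per multiprocessor), a 1.296 GHz shader clock, 3 flops per cycle per core in single precision (multiply-add plus multiply) and 2 flops per cycle per unit in double precision (fused multiply-add).

```latex
\begin{align*}
P_{\text{SP}} &= 240 \times 1.296\ \text{GHz} \times 3\ \text{flops/cycle} \approx 933\ \text{GFLOPS} \\
P_{\text{DP}} &= 30 \times 1.296\ \text{GHz} \times 2\ \text{flops/cycle} \approx 78\ \text{GFLOPS} \\
P_{\text{SP}} / P_{\text{DP}} &\approx 12
\end{align*}
```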

Hence double precision should be used only if you really need it.

Thanks

Thanks!

That’s a shame. Thanks for the reply. It clarifies a lot of things.

That said, you can still get impressive speed-ups (I have seen up to 60x) over most CPU double precision implementations. It is just trickier and more restrictive to code.