Single-precision arithmetic on Fermi architectures: nvcc flags?

ardisschool10 · August 5, 2011, 4:42pm

How can I use single-precision arithmetic for floating-point operations on a GTX 480?

When I compile, I use the flag -arch sm_20. Without it, my CUDA application will not run on the GTX 480.

I was wondering whether some special function I am using in my code
makes double-precision arithmetic mandatory.

Does anyone have any idea how to go about that?
I would like to use single-precision arithmetic to see whether my application runs faster that way.

Thank you in advance for the answers!

zkoza · August 5, 2011, 8:37pm

What happens exactly? Any system message? Do you check the error codes returned by all CUDA function calls and by your kernel launches?

Have you tried to debug the program step-by-step?
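
To illustrate the kind of error monitoring asked about above, here is a minimal sketch. The CUDA_CHECK macro, the scale kernel and the buffer size are hypothetical names and values chosen for illustration, not taken from the original poster's code; cudaMalloc, cudaMemset, cudaGetLastError, cudaDeviceSynchronize and cudaGetErrorString are standard CUDA runtime calls (cudaDeviceSynchronize assumes CUDA 4.0 or later).

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: report file, line and the runtime's message, then abort.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "%s:%d: %s failed: %s\n",            \
                         __FILE__, __LINE__, #call,                   \
                         cudaGetErrorString(err_));                   \
            std::exit(EXIT_FAILURE);                                  \
        }                                                             \
    } while (0)

// Illustrative single-precision kernel: multiplies every element by factor.
__global__ void scale(float *data, int n, float factor)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main()
{
    const int n = 1 << 20;
    float *d_data = NULL;

    // The first runtime call also creates the context, so driver or
    // toolkit mismatches tend to surface here.
    CUDA_CHECK(cudaMalloc((void **)&d_data, n * sizeof(float)));
    CUDA_CHECK(cudaMemset(d_data, 0, n * sizeof(float)));

    scale<<<(n + 255) / 256, 256>>>(d_data, n, 2.0f);
    CUDA_CHECK(cudaGetLastError());        // errors from the launch itself
    CUDA_CHECK(cudaDeviceSynchronize());   // errors raised while the kernel ran

    CUDA_CHECK(cudaFree(d_data));
    std::printf("all CUDA calls succeeded\n");
    return 0;
}
```

With a wrapper like this, the message printed for the failing call usually says more than a bare crash does.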

ardisschool10 · August 6, 2011, 1:12am

The program fails at the first cudaMalloc call it encounters.

zkoza · August 6, 2011, 2:27pm

Try updating your NVIDIA driver to the current one, e.g. from http://www.nvidia.co…aspx?lang=en-us
Which CUDA version do you use?

tera · August 6, 2011, 6:37pm

For test purposes you can compile with -arch=compute_12 -code=compute_12, which will demote double to float and then force a dynamic recompilation if executed on a compute 2.x device.
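
As a rough sketch of how the flags above fit together with source code that stays in single precision: the file name single.cu, the damped function and the damped_wave kernel are hypothetical, made up for this example. The point about literals is a general C/C++ rule (an unsuffixed constant such as 0.5 has type double), not something specific to the original poster's program.

```cuda
// Possible build commands (single.cu is a hypothetical file name):
//   nvcc -arch=sm_20 single.cu -o single
//   nvcc -arch=compute_12 -code=compute_12 single.cu -o single   (test build from the post above)
// One way to look for leftover double-precision instructions:
//   nvcc -arch=sm_20 -ptx single.cu -o single.ptx   (then search the file for ".f64")
#include <cmath>
#include <cstdio>
#include <cuda_runtime.h>

// Every operand here is float: the f suffix on the literal and the
// sinf/expf variants make the single-precision intent explicit.
// Writing 0.5 instead of 0.5f would promote the product to double.
__host__ __device__ float damped(float t)
{
    return 0.5f * sinf(t) * expf(-t);
}

__global__ void damped_wave(float *out, int n, float dt)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = damped(i * dt);
}

int main()
{
    const int n = 1024;
    float *d_out = NULL;
    float h_out[4];

    if (cudaMalloc((void **)&d_out, n * sizeof(float)) != cudaSuccess) {
        std::fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }

    damped_wave<<<(n + 255) / 256, 256>>>(d_out, n, 0.01f);

    // cudaMemcpy waits for the kernel, so it also reports kernel failures.
    cudaError_t err = cudaMemcpy(h_out, d_out, sizeof(h_out),
                                 cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "kernel or copy failed: %s\n",
                     cudaGetErrorString(err));
        cudaFree(d_out);
        return 1;
    }

    std::printf("out[1] = %f\n", h_out[1]);
    cudaFree(d_out);
    return 0;
}
```

If the generated PTX for an sm_20 build contains no .f64 instructions, the kernel is already running in single precision and the compute_12 test build should not change its speed much.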