How can I use single precision arithmetic for floating point operations on the GTX 480 architecture?
When I compile, I use the flag -arch sm_20. Without it my CUDA application will not run on the GTX 480.
I was wondering whether some special function I am using in my code
requires double precision arithmetic.
Does anyone have any idea how to go about that?
I would like to use single precision arithmetic in order to observe if my application would be faster that way.
For test purposes you can compile with -arch=compute_12 -code=compute_12. Compute capability 1.2 has no double precision support, so the compiler will demote double to float, and the resulting PTX will be dynamically recompiled (JIT) when executed on a compute 2.x device.
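If the goal is to find where double precision creeps in, unsuffixed floating point literals are a common source. Below is a minimal sketch in plain host C++ (it needs C++11); the same promotion rules should apply inside a kernel, where the functions would additionally need a __device__ qualifier. The helper names half_promoted and half_single are made up for illustration and are not from your code.

```cpp
#include <type_traits>

// Hypothetical helpers for illustration; neither appears in the original post.

// 0.5 without a suffix is a double literal, so x is promoted to double and
// the multiplication is carried out in double precision.
constexpr double half_promoted(float x) { return x * 0.5; }

// 0.5f is a float literal, so the multiplication stays in single precision.
constexpr float half_single(float x) { return x * 0.5f; }

// The type of each expression shows which precision the compiler picks.
static_assert(std::is_same<decltype(1.0f * 0.5), double>::value,
              "mixed float/double expression is evaluated as double");
static_assert(std::is_same<decltype(1.0f * 0.5f), float>::value,
              "all-float expression stays float");
```

So it may be worth checking your kernels for literals without the f suffix, and for math calls that take or return double, before concluding that a particular function requires double precision.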