I saw in the CUDA/parallel programming course notes that each multiprocessor has two special function units. What operations do these do?

These perform the more involved floating-point operations, like reciprocal square root and the transcendental functions. See this paper:


section 3.2


Slide 31 gives you an idea of the SFUs' functions. Wen-Mei did mention that the native precision is naturally lower than that of an x86 implementation, so that's worth bearing in mind.

Yeah, the sin(), cos(), and exp() device functions compile down to code that does argument reduction and some other handling to improve the accuracy of the hardware special functions, which can be directly accessed with __sinf(), __cosf(), etc.
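
To see the difference yourself, here is a minimal sketch that runs both versions on the same inputs and prints them side by side. The kernel name compareSin and the sample inputs are just made up for illustration, and I'm assuming you build with plain nvcc and without -use_fast_math (that flag makes sinf() compile to __sinf(), so both columns would match). The gap generally grows with the size of the argument, since the intrinsic skips the full argument reduction.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: evaluates sin(x) with the library function and with
// the hardware intrinsic so the two results can be compared on the host.
__global__ void compareSin(const float *x, float *accurate, float *fast, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        accurate[i] = sinf(x[i]);    // library version with extra accuracy handling
        fast[i]     = __sinf(x[i]);  // direct access to the hardware special function
    }
}

int main()
{
    const int n = 4;
    // Sample inputs chosen for illustration: small to large magnitudes.
    const float h_x[n] = {0.5f, 3.0f, 100.0f, 10000.0f};
    float h_acc[n], h_fast[n];

    float *d_x, *d_acc, *d_fast;
    cudaMalloc(&d_x,    n * sizeof(float));
    cudaMalloc(&d_acc,  n * sizeof(float));
    cudaMalloc(&d_fast, n * sizeof(float));

    cudaMemcpy(d_x, h_x, n * sizeof(float), cudaMemcpyHostToDevice);

    compareSin<<<1, n>>>(d_x, d_acc, d_fast, n);
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel launch failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // These blocking copies also wait for the kernel to finish.
    cudaMemcpy(h_acc,  d_acc,  n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaMemcpy(h_fast, d_fast, n * sizeof(float), cudaMemcpyDeviceToHost);

    for (int i = 0; i < n; ++i) {
        float diff = h_acc[i] - h_fast[i];
        if (diff < 0.0f) diff = -diff;
        printf("x = %10g  sinf = % .7f  __sinf = % .7f  |diff| = %.3e\n",
               h_x[i], h_acc[i], h_fast[i], diff);
    }

    cudaFree(d_x);
    cudaFree(d_acc);
    cudaFree(d_fast);
    return 0;
}
```

If the extra accuracy doesn't matter for your kernel, calling the intrinsics directly (or building with -use_fast_math) trades precision for speed; otherwise stick with the plain sinf()/cosf()/expf() calls.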

Thanks seibert, wasn't aware of that. It's mentioned on page 62 of the CUDA 1.1 guide if anyone’s interested.