Number of Cosine and Sine Units in a K40

Dear All

Does anyone know how many cosine and sine processing units exist per SM in a K40? And how many are there for double precision versus single precision?


Luis Gonçalves

The device function intrinsics __sinf() and __cosf() map to short instruction sequences that ultimately use the hardware’s MUFU (multi-function unit) SIN and COS instructions. You can see the details if you dump the machine code with cuobjdump --dump-sass. MUFU is named SFU (special function unit) in older architectures. If you need to compute both sine and cosine of the same angle, you would want to use __sincosf(). Note that the intrinsics return approximate results with specified absolute rather than relative error bounds, since they are computed by fixed-point table interpolation. See the CUDA C Programming Guide for details on the error bounds.

The regular single-precision math functions sinf(), cosf(), sincosf() as well as double-precision math functions sin(), cos(), sincos() are simply inline functions whose source you can find in the CUDA-supplied header files math_functions.h and math_functions_dbl_ptx3.h.

I don’t recall the number of MUFUs per SM in the K40; maybe someone else can supply that number.

Calling sincos() gives this error. Any clue for the solution?

error: no instance of overloaded function “sincos” matches the argument list
argument types are: (double *, double *, double *)

The call is
sincos(&const5, &const8, &const9);

with the variables declared as
double const5, const8, const9;

The function signature is (double, double *, double *). The first argument is the angle itself, passed by value; only the two outputs are pointers.

double s,c;
double theta =…;
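For reference, here is a minimal sketch of a call that matches that signature. It is written as host code so it can be tried without a GPU, assuming a glibc-based toolchain (g++ or nvcc on Linux), where the host sincos() has the same (double, double *, double *) signature as the CUDA device function. The SinCos struct and the sin_cos_of() helper are hypothetical names made up for this illustration, not part of CUDA or the original poster's code.

```cpp
#include <cassert>
#include <cmath>   // on glibc, g++ and nvcc declare sincos() via this header

// Hypothetical container for the two results (illustration only).
struct SinCos {
    double s;  // sine of the angle
    double c;  // cosine of the angle
};

// Hypothetical helper: computes sin and cos of theta with one sincos() call.
SinCos sin_cos_of(double theta) {
    double s, c;
    // The angle goes in by value; only the two outputs are passed by address.
    sincos(theta, &s, &c);
    // Sanity check: sin^2 + cos^2 should be 1 up to rounding error.
    assert(std::fabs(s * s + c * c - 1.0) < 1e-12);
    return {s, c};
}
```

Applied to the failing call above, this would correspond to passing const5 itself rather than &const5 as the first argument, assuming const5 holds the angle.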
Each Kepler SMX has 192 single-precision CUDA cores, 64 double-precision units, 32 special function units (SFUs), and 32 load/store units (LD/ST).
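To turn the per-SMX figure into a device-wide number, a rough back-of-the-envelope sketch follows. It rests on assumptions that are not stated in this thread: that a K40 has 15 SMX units (2880 CUDA cores / 192 per SMX), that the base clock is 745 MHz, and that each SFU can issue one single-precision MUFU result per clock. The result is an upper bound on raw MUFU.SIN/COS issue rate only; __sinf() and __cosf() need additional instructions around the MUFU operation, and the double-precision sin() and cos() are software routines rather than SFU operations.

```cpp
// Per-SMX SFU count quoted in the thread for Kepler.
constexpr int kSfuPerSmx = 32;
// Assumption: number of SMX units on a Tesla K40 (2880 cores / 192 per SMX).
constexpr int kSmxPerK40 = 15;
// Assumption: K40 base clock in Hz (boost clocks are higher).
constexpr double kBaseClockHz = 745e6;

// Total SFUs on the device under the assumptions above.
constexpr int total_sfus() { return kSfuPerSmx * kSmxPerK40; }

// Upper bound on MUFU results per second, assuming one result
// per SFU per clock at the base clock.
constexpr double peak_mufu_ops_per_second() {
    return total_sfus() * kBaseClockHz;
}

static_assert(total_sfus() == 480, "32 SFUs x 15 SMX");
```

Under these assumptions the device has 480 SFUs and a peak of about 357.6 billion MUFU results per second at base clock.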