Number of Cosine and Sine in a K40

luisgo · December 16, 2014, 4:10pm

Dear All

Does anyone know how many cosine and sine processing units exist per SM in a K40? And how many are there for double precision and for single precision?

Thanks

Luis Gonçalves

njuffa · December 16, 2014, 4:57pm

The device function intrinsics __sinf() and __cosf() map to short instruction sequences that ultimately use the hardware’s MUFU (multi-function unit) SIN and COS instructions. You can see the details if you dump the machine code with cuobjdump --dump-sass. MUFU is named SFU (special function unit) in older architectures. If you need to compute both sine and cosine of the same angle, you would want to use __sincosf(). Note that the intrinsics return approximate results with specified absolute rather than relative error bounds, since they are computed by fixed-point table interpolation. See the CUDA C Programming Guide for details on the error bounds.

The regular single-precision math functions sinf(), cosf(), sincosf() as well as double-precision math functions sin(), cos(), sincos() are simply inline functions whose source you can find in the CUDA-supplied header files math_functions.h and math_functions_dbl_ptx3.h.
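To see the difference between the intrinsic and the regular function in practice, a minimal sketch along these lines may help. The kernel name compare_sincos and the host helper max_abs_diff_sincos are made up for illustration; only __sincosf(), sincosf() and the CUDA runtime calls are real API. It evaluates both variants for one angle on the device and reports the largest absolute difference.

```cuda
#include <cmath>
#include <cuda_runtime.h>

// Evaluate sine and cosine of x twice: once with the fast intrinsic
// (which uses the MUFU hardware) and once with the regular function.
__global__ void compare_sincos(float x, float *out)
{
    float s_fast, c_fast, s_ref, c_ref;
    __sincosf(x, &s_fast, &c_fast);  // approximate intrinsic
    sincosf(x, &s_ref, &c_ref);      // regular single-precision function
    out[0] = s_fast;
    out[1] = c_fast;
    out[2] = s_ref;
    out[3] = c_ref;
}

// Hypothetical host helper: returns the larger of |sin difference| and
// |cos difference| for angle x, or -1.0f if a CUDA call fails.
float max_abs_diff_sincos(float x)
{
    float *d_out = nullptr;
    float h[4];
    if (cudaMalloc(&d_out, sizeof(h)) != cudaSuccess) return -1.0f;
    compare_sincos<<<1, 1>>>(x, d_out);
    // cudaMemcpy waits for the kernel and also reports launch errors
    cudaError_t err = cudaMemcpy(h, d_out, sizeof(h), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    if (err != cudaSuccess) return -1.0f;
    return fmaxf(fabsf(h[0] - h[2]), fabsf(h[1] - h[3]));
}
```

Calling max_abs_diff_sincos(1.0f) from main() should give a small non-negative number. Note that if the file is compiled with -use_fast_math, sincosf() is itself mapped to the intrinsic, so the difference would be zero.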

I don’t recall the number of MUFUs per SM in K40, maybe someone else can supply that number.

luisgo · December 16, 2014, 9:18pm

Calling sincos() gives me this error. Any clue how to fix it?

error: no instance of overloaded function "sincos" matches the argument list
argument types are: (double *, double *, double *)

The call is
sincos(&const5, &const8, &const9);

where

double const5, const8, const9;

mfatica · December 16, 2014, 9:26pm

The function signature is (double, double *, double *): the angle is passed by value, and only the two results are passed as pointers.

double s,c;
double theta =…;
sincos(theta,&s,&c);
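For completeness, here is a sketch of how the corrected call might look inside a full kernel. The names sincos_kernel and device_sincos are hypothetical and only wrap the call for illustration; the point is that the first argument is the angle itself, not its address.

```cuda
#include <cmath>
#include <cuda_runtime.h>

// Compute sine and cosine of theta on the device with one sincos() call.
__global__ void sincos_kernel(double theta, double *out)
{
    double s, c;
    sincos(theta, &s, &c);  // angle by value, results through pointers
    out[0] = s;
    out[1] = c;
}

// Hypothetical host helper: runs the kernel for one angle and copies the
// results back. Returns false if any CUDA call fails.
bool device_sincos(double theta, double *s, double *c)
{
    double *d_out = nullptr;
    double h[2];
    if (cudaMalloc(&d_out, sizeof(h)) != cudaSuccess) return false;
    sincos_kernel<<<1, 1>>>(theta, d_out);
    // cudaMemcpy waits for the kernel and also reports launch errors
    cudaError_t err = cudaMemcpy(h, d_out, sizeof(h), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    if (err != cudaSuccess) return false;
    *s = h[0];
    *c = h[1];
    return true;
}
```

With the variables from the question, the equivalent fix would be sincos(const5, &const8, &const9), which stores the sine in const8 and the cosine in const9.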

njuffa · December 16, 2014, 10:08pm

SMX: 192 single‐precision CUDA cores, 64 double‐precision units, 32 special function units (SFU), and 32 load/store units (LD/ST).
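The per-SMX figure above can be turned into a device-wide estimate by multiplying by the number of SMs the runtime reports. The sketch below does that; estimate_sfu_count is a hypothetical helper, and the constant 32 is simply the number quoted above, assumed to apply only to Kepler-class devices (compute capability 3.x).

```cuda
#include <cuda_runtime.h>

// Hypothetical helper: estimate the total number of special function units
// on a device. Assumes 32 SFUs per SM, the figure quoted for Kepler SMX.
// Returns -1 if the query fails or the device is not compute capability 3.x,
// since the per-SM figure is not known to hold for other architectures.
int estimate_sfu_count(int device)
{
    const int kSfuPerSmx = 32;  // assumption taken from the SMX figure
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, device) != cudaSuccess) return -1;
    if (prop.major != 3) return -1;
    return prop.multiProcessorCount * kSfuPerSmx;
}
```

A K40 exposes 15 SMX units, so on that card the estimate would come out as 15 × 32 = 480 SFUs.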