Dear All
Does anyone know how many cosine and sine processing units exist per SM in a K40? And how many (sin and cos) are there for double precision and for single precision?
Thanks
Luis Gonçalves
The device function intrinsics __sinf() and __cosf() map to short instruction sequences that ultimately use the hardware’s MUFU (multi-function unit) SIN and COS instructions. You can see the details if you dump the machine code with cuobjdump --dump-sass. MUFU is named SFU (special function unit) in older architectures. If you need to compute both the sine and the cosine of the same angle, you would want to use __sincosf(). Note that the intrinsics return approximate results with specified absolute rather than relative error bounds, since they are computed by fixed-point table interpolation. See the CUDA C Programming Guide for details on the error bounds.
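To make the difference between the intrinsic and the regular function concrete, here is a minimal sketch that evaluates a few angles both ways and prints the results side by side. The kernel name compare_sincos, the sample angles, and the single managed buffer are illustrative choices, not anything from the toolkit. It assumes the file is compiled without -use_fast_math, since that flag would map sincosf() to the intrinsic and make both columns identical.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: evaluates each angle with the fast intrinsic and with
// the regular single-precision function so the results can be compared.
__global__ void compare_sincos(const float* angles,
                               float* fast_s, float* fast_c,
                               float* ref_s, float* ref_c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        __sincosf(angles[i], &fast_s[i], &fast_c[i]);  // approximate intrinsic
        sincosf(angles[i], &ref_s[i], &ref_c[i]);      // regular math function
    }
}

int main()
{
    const int n = 4;
    const float samples[n] = {0.0f, 0.5f, 1.0f, 3.0f};  // arbitrary test angles

    // One managed allocation holding five arrays of n floats each.
    float* buf = nullptr;
    if (cudaMallocManaged(&buf, 5 * n * sizeof(float)) != cudaSuccess) {
        fprintf(stderr, "cudaMallocManaged failed\n");
        return 1;
    }
    float* angles = buf;
    float* fast_s = buf + 1 * n;
    float* fast_c = buf + 2 * n;
    float* ref_s  = buf + 3 * n;
    float* ref_c  = buf + 4 * n;
    for (int i = 0; i < n; ++i) angles[i] = samples[i];

    compare_sincos<<<1, n>>>(angles, fast_s, fast_c, ref_s, ref_c, n);
    if (cudaDeviceSynchronize() != cudaSuccess) {
        fprintf(stderr, "kernel execution failed\n");
        cudaFree(buf);
        return 1;
    }

    for (int i = 0; i < n; ++i) {
        printf("x=%g  sin: fast=%.8f ref=%.8f  cos: fast=%.8f ref=%.8f\n",
               angles[i], fast_s[i], ref_s[i], fast_c[i], ref_c[i]);
    }
    cudaFree(buf);
    return 0;
}
```

Running cuobjdump --dump-sass on the resulting binary should show the MUFU instructions generated for the intrinsic call.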
The regular single-precision math functions sinf(), cosf(), sincosf() as well as double-precision math functions sin(), cos(), sincos() are simply inline functions whose source you can find in the CUDA-supplied header files math_functions.h and math_functions_dbl_ptx3.h.
I don’t recall the number of MUFUs per SM in the K40; maybe someone else can supply that number.
Compiling my code gives the error below. Any clue for the solution?
error: no instance of overloaded function "sincos" matches the argument list
argument types are: (double *, double *, double *)
The call is
sincos(&const5, &const8, &const9);
where
double const5, const8, const9;
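The function signature is sincos(double, double *, double *): the angle is passed by value, and only the two results are passed by pointer. For example: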
double s,c;
double theta =…;
sincos(theta,&s,&c);
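Applied to the call from the question, the fix is to drop the & on the first argument. Below is a minimal, self-contained sketch of that. It assumes const5 holds the angle and const8 and const9 receive the sine and cosine, which the original post does not state; the kernel name sincos_demo and the angle value 0.5 are likewise made up for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel reusing the variable names from the question.
// Assumption: const5 is the input angle, const8/const9 are the outputs.
__global__ void sincos_demo(double angle, double* out)
{
    double const5 = angle;
    double const8, const9;

    // Angle by value, results through pointers.
    sincos(const5, &const8, &const9);

    out[0] = const8;  // sine
    out[1] = const9;  // cosine
}

int main()
{
    double* out = nullptr;
    if (cudaMallocManaged(&out, 2 * sizeof(double)) != cudaSuccess) {
        fprintf(stderr, "cudaMallocManaged failed\n");
        return 1;
    }

    sincos_demo<<<1, 1>>>(0.5, out);
    if (cudaDeviceSynchronize() != cudaSuccess) {
        fprintf(stderr, "kernel execution failed\n");
        cudaFree(out);
        return 1;
    }

    printf("sin = %.15f, cos = %.15f\n", out[0], out[1]);
    cudaFree(out);
    return 0;
}
```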
SMX: 192 single‐precision CUDA cores, 64 double‐precision units, 32 special function units (SFU), and 32 load/store units (LD/ST).
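To turn the per-SMX figure into a total for a given card, one can multiply it by the multiprocessor count reported by the runtime. The sketch below does that. As far as I know the runtime API does not report the number of SFUs, so the value 32 is hard-coded from the SMX figures quoted above and is only meaningful if the queried device actually uses that Kepler SMX design; the constant name kSfuPerSmx is made up for this example.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    // Assumption: 32 SFUs per multiprocessor, taken from the quoted SMX
    // figures. This number is not queried from the device.
    const int kSfuPerSmx = 32;

    const int device = 0;  // first CUDA device in the system
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, device) != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceProperties failed\n");
        return 1;
    }

    printf("%s: compute capability %d.%d, %d multiprocessors\n",
           prop.name, prop.major, prop.minor, prop.multiProcessorCount);
    printf("SFUs in total, assuming %d per multiprocessor: %d\n",
           kSfuPerSmx, kSfuPerSmx * prop.multiProcessorCount);
    return 0;
}
```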