Don’t NVIDIA’s GPUs have a single-precision floating-point arithmetic unit and an FMAD unit?
So, for example,
the single-precision floating-point arithmetic unit does a+b,
and the FMAD unit does c*d.
Do these execute at the same time?
Also, does an FMAD unit execute only multiplication or only addition?
I’m sorry for my poor English.
An FMAD unit executes a*b+c, so it executes both multiplication and addition. Each “core” in current Nvidia speak corresponds to one FMAD unit and has a throughput of one FMAD per cycle (it can also do just a*b or a+b at one op per cycle). Compute capability 1.x devices could, in addition to this, execute another multiplication per cycle in the SFU (special function unit), provided the core is not starved for register bandwidth (which seems to be the case quite often).
Note that the core has a throughput of one FMAD (a*b+c) per cycle, but a latency of 16 to 24 cycles.
Thanks for your answer.
What does “floating-point add” mean? Is it FMAD?
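“Floating-point add” is just a+b. FMAD stands for “floating-point multiply-add” and does a*b+c in a single instruction. On compute capability 2.x this is a true fused multiply-add (FMA), with no rounding of the intermediate product; on compute capability 1.x the intermediate product is truncated before the addition.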
I’m very sorry for my bad English.
Which unit executes “floating-point add”?
Oh, sorry for misunderstanding the question. Yes, floating point add is executed in the same unit as FMAD.
Thank you very much!