Is the bandwidth (GB/s) obtained with a device-to-device memcpy the maximum that can be attained?
On the Quadro FX 5600, the specs report a memory bandwidth of 76.8 GB/s, but the maximum I have seen so far using memcpy is ~66 GB/s.
The same holds for the Tesla C1060: the reported memory bandwidth is 102 GB/s, and the maximum I have observed using memcpy is ~77 GB/s.
That is a considerable discrepancy: the spec figure is about 16% (Quadro) and 32% (C1060) higher than what I measure.
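For reference, here is a rough sketch of how those percentages come out, using only the spec and measured figures quoted above. The helper name `bandwidth_gap` is just something made up for this post. Note that the 16% and 32% figures are relative to the measured value; expressed as a fraction of the spec, the shortfall is smaller.

```python
def bandwidth_gap(spec_gbs, measured_gbs):
    """Return (efficiency %, shortfall % of spec, excess % over measured)."""
    # Fraction of the spec bandwidth that memcpy actually reaches.
    efficiency = 100.0 * measured_gbs / spec_gbs
    # How far below the spec the measurement falls, relative to the spec.
    shortfall = 100.0 - efficiency
    # How much higher the spec is, relative to the measurement.
    excess = 100.0 * (spec_gbs / measured_gbs - 1.0)
    return efficiency, shortfall, excess


# (spec GB/s, measured GB/s) as quoted in this post.
cards = {
    "Quadro FX 5600": (76.8, 66.0),
    "Tesla C1060": (102.0, 77.0),
}

for name, (spec, measured) in cards.items():
    eff, short, exc = bandwidth_gap(spec, measured)
    print(f"{name}: {eff:.1f}% of spec, {short:.1f}% below spec, "
          f"spec is {exc:.1f}% above measured")
```

So memcpy reaches roughly 86% of the spec on the Quadro and roughly 75% on the C1060.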
So, is the reported bandwidth purely theoretical (calculated from the memory speed and so on) and not really achievable, or can it actually be reached?
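My guess at how a spec number like this is calculated is sketched below: bus width in bytes times the memory data rate. The bus widths and clocks here are assumptions on my part (384-bit at 800 MHz for the Quadro, 512-bit at 800 MHz for the C1060, both double data rate) and are not taken from anything above; `peak_bandwidth_gbs` is a hypothetical helper.

```python
def peak_bandwidth_gbs(bus_width_bits, memory_clock_mhz, transfers_per_clock=2):
    """Theoretical peak memory bandwidth in GB/s (1 GB = 1e9 bytes)."""
    # Bytes moved across the memory bus in one transfer.
    bytes_per_transfer = bus_width_bits / 8
    # Transfers per second; DDR memory moves data twice per clock.
    transfers_per_second = memory_clock_mhz * 1e6 * transfers_per_clock
    return bytes_per_transfer * transfers_per_second / 1e9


# Assumed hardware figures, not taken from the measurements in this post.
print("Quadro FX 5600:", peak_bandwidth_gbs(384, 800), "GB/s")
print("Tesla C1060:", peak_bandwidth_gbs(512, 800), "GB/s")
```

If those assumptions are right, this gives about 76.8 GB/s and 102.4 GB/s, which matches the quoted specs and would suggest they are computed peaks and not measured rates.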
Any thoughts? (From NVIDIA?)