I was reading the example code bandwidthTest.cu in the SDK. The following section of the function testDeviceToDeviceTransfer(unsigned int memSize) puzzles me a little. Where does the factor 2.0f come from?
    //calculate bandwidth in MB/s
    bandwidthInMBs = 2.0f * (1e3f * memSize * (float)MEMCOPY_ITERATIONS) / (elapsedTimeInMs * (float)(1 << 20));
If you check the corresponding sections of the other functions, which measure device-to-host and host-to-device memory bandwidth, they don't have that factor of two …
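To make the difference concrete, here is a minimal host-only sketch of the two calculations side by side. It assumes the host/device functions use the same expression minus the 2.0f, as described above. The function names and the value of kIterations are made up for illustration; the sample defines its own MEMCOPY_ITERATIONS.

```cpp
// Hypothetical stand-in for the sample's MEMCOPY_ITERATIONS (assumed value).
static const int kIterations = 10;

// Bandwidth in MB/s as described for the host<->device functions:
// bytes copied per second, divided by 2^20, with no factor of two.
float oneWayBandwidthMBs(unsigned int memSize, float elapsedTimeInMs)
{
    return (1e3f * memSize * (float)kIterations) /
           (elapsedTimeInMs * (float)(1 << 20));
}

// Same expression with the 2.0f factor from testDeviceToDeviceTransfer.
float deviceToDeviceBandwidthMBs(unsigned int memSize, float elapsedTimeInMs)
{
    return 2.0f * oneWayBandwidthMBs(memSize, elapsedTimeInMs);
}
```

For a 1 MiB buffer copied 10 times in 10 ms, the first function gives 1000 MB/s and the second gives 2000 MB/s, so for the same size and timing the device-to-device figure is exactly double.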