I ran the bandwidthTest sample from the SDK, and also wrote my own code, to measure host/device memory bandwidth with pinned memory on two systems with slightly different architectures.
The first system is a dual-socket server connected to two C1060 cards through one IOH (Tylersburg). The second one is the same, but with two IOHs.
On the first system I get ~6 GB/s for both Host to Device and Device to Host transfers. On the second system, the Host to Device bandwidth is the same, but Device to Host drops to only 3.5 GB/s. I have no idea where such a big difference could come from. Has anyone faced the same problem? Does anyone have any idea?
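For reference, a pinned-memory bandwidth measurement of this kind can be sketched roughly as below. This is a minimal sketch, not the exact code behind the numbers above: the 32 MiB buffer size, the repetition count, the helper names `time_copy_ms` and `bandwidth_gb_per_s`, and taking the device index from the command line are all assumptions made for illustration.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <cuda_runtime.h>

// Abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,   \
                         cudaGetErrorString(err_));                   \
            std::exit(EXIT_FAILURE);                                  \
        }                                                             \
    } while (0)

// Convert a byte count and a time in milliseconds to GB/s (1 GB = 1e9 bytes).
double bandwidth_gb_per_s(size_t bytes, float ms) {
    return (static_cast<double>(bytes) / 1e9) / (ms / 1e3);
}

// Time `reps` blocking copies of `bytes` bytes; returns the average ms per copy.
float time_copy_ms(void* dst, const void* src, size_t bytes,
                   cudaMemcpyKind kind, int reps) {
    cudaEvent_t start, stop;
    CHECK(cudaEventCreate(&start));
    CHECK(cudaEventCreate(&stop));

    CHECK(cudaMemcpy(dst, src, bytes, kind));  // warm-up copy, not timed

    CHECK(cudaEventRecord(start, 0));
    for (int i = 0; i < reps; ++i) {
        CHECK(cudaMemcpy(dst, src, bytes, kind));
    }
    CHECK(cudaEventRecord(stop, 0));
    CHECK(cudaEventSynchronize(stop));

    float ms = 0.0f;
    CHECK(cudaEventElapsedTime(&ms, start, stop));
    CHECK(cudaEventDestroy(start));
    CHECK(cudaEventDestroy(stop));
    return ms / reps;
}

int main(int argc, char** argv) {
    // Assumption: device index is passed as the first argument (default 0).
    const int device = (argc > 1) ? std::atoi(argv[1]) : 0;
    const size_t bytes = 32u << 20;  // 32 MiB per transfer (arbitrary choice)
    const int reps = 10;             // arbitrary repetition count

    CHECK(cudaSetDevice(device));

    void* h_buf = nullptr;
    void* d_buf = nullptr;
    CHECK(cudaMallocHost(&h_buf, bytes));  // pinned (page-locked) host memory
    CHECK(cudaMalloc(&d_buf, bytes));
    std::memset(h_buf, 0, bytes);

    const float h2d_ms =
        time_copy_ms(d_buf, h_buf, bytes, cudaMemcpyHostToDevice, reps);
    const float d2h_ms =
        time_copy_ms(h_buf, d_buf, bytes, cudaMemcpyDeviceToHost, reps);

    std::printf("device %d: H2D %.2f GB/s, D2H %.2f GB/s\n", device,
                bandwidth_gb_per_s(bytes, h2d_ms),
                bandwidth_gb_per_s(bytes, d2h_ms));

    CHECK(cudaFreeHost(h_buf));
    CHECK(cudaFree(d_buf));
    return 0;
}
```

Built with `nvcc` (for example `nvcc -o bw bw.cu`, where the file name is just a placeholder) and run once per device index, this reports the two directions separately for each C1060.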
Thanks in advance for your help.