I am working on a system with 4 x Tesla C1060 cards. When I run bandwidthTest (from the CUDA SDK), I get results like these (in MB/s).
For two of the cards:
Host -> Device: 5300
Device -> Host: 4670
For the other two:
Host -> Device: 4750
Device -> Host: 3150
So I have a couple of questions. First, the Device -> Device bandwidth looks a bit slow. The Tesla is advertised at 102 GB/s, and I get 90,000+ MB/s with a GTX 260 on my home PC.
However, I can't find any published bandwidthTest results for a Tesla. If you have one, I would be grateful if you could post your results for comparison.
Second, why would two of the cards be slower than the other two for Host -> Device and Device -> Host transfers? Is that expected?
This is a Supermicro system with an X8DTG-QF motherboard, which has 6 physical PCIe 2.0 x16 slots (4 wired with 16 lanes, 2 wired with 4 lanes). The OS is Linux (CentOS 5.3).
Thanks in advance.
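For reference, here is a rough sketch of the theoretical per-direction PCIe 2.0 ceiling for each slot type, compared against the numbers above. It assumes the standard PCIe 2.0 figures of 5 GT/s per lane with 8b/10b encoding; those figures and the helper name `pcie2_peak_mb_s` are my own assumptions for illustration, not something measured on this system.

```python
# Theoretical one-direction PCIe 2.0 payload bandwidth per slot.
# Assumptions (standard PCIe 2.0 figures, not measured on this system):
#   5 GT/s per lane, 8b/10b encoding (8 payload bits per 10 bits transferred).
TRANSFERS_PER_S_PER_LANE = 5_000_000_000


def pcie2_peak_mb_s(lanes):
    """Return the peak payload rate in MB/s (1 MB = 10**6 bytes) for one direction."""
    payload_bits_per_s = TRANSFERS_PER_S_PER_LANE * 8 // 10 * lanes
    return payload_bits_per_s // 8 // 10**6


# Measured bandwidthTest results from this post, in MB/s.
measured = {
    "fast pair, Host -> Device": 5300,
    "fast pair, Device -> Host": 4670,
    "slow pair, Host -> Device": 4750,
    "slow pair, Device -> Host": 3150,
}

if __name__ == "__main__":
    for lanes in (16, 4):
        peak = pcie2_peak_mb_s(lanes)
        print(f"x{lanes} theoretical peak: {peak} MB/s")
        for label, mb_s in measured.items():
            # Express each measurement as a percentage of the slot's ceiling.
            print(f"  {label}: {mb_s} MB/s = {100 * mb_s / peak:.0f}% of peak")
```

If those assumptions hold, all four cards measure above what a 4-lane link could deliver in either direction, so the slower pair is presumably not simply sitting in the 4-lane slots.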