Hi,
I have a server with two Tesla M2050 cards, and one of the cards seems to have a problem. This is what the bandwidthTest program from the SDK samples reports:
Device 0: Tesla M2050
 Quick Mode

 Host to Device Bandwidth, 1 Device(s), Paged memory
   Transfer Size (Bytes)        Bandwidth(MB/s)
   33554432                     2829.0

 Device to Host Bandwidth, 1 Device(s), Paged memory
   Transfer Size (Bytes)        Bandwidth(MB/s)
   33554432                     2225.8

 Device to Device Bandwidth, 1 Device(s)
   Transfer Size (Bytes)        Bandwidth(MB/s)
   33554432                     85649.8

Device 1: Tesla M2050
 Quick Mode

 Host to Device Bandwidth, 1 Device(s), Paged memory
   Transfer Size (Bytes)        Bandwidth(MB/s)
   33554432                     386.6

 Device to Host Bandwidth, 1 Device(s), Paged memory
   Transfer Size (Bytes)        Bandwidth(MB/s)
   33554432                     396.9

 Device to Device Bandwidth, 1 Device(s)
   Transfer Size (Bytes)        Bandwidth(MB/s)
   33554432                     85610.2
As you can see, the host-to-device bandwidth of the second card is roughly seven times lower than the first card's (386.6 vs. 2829.0 MB/s), with device-to-host similarly degraded. I've tried running nvidia-settings to see which PCIe interfaces the cards are connected to, but haven't managed to, since the server doesn't run an X server (the display is driven by a separate Matrox card). Both cards should be in x16 PCIe 2.0 slots, since the server uses the Intel 5500/5520 chipset. The server runs 64-bit Arch Linux with the 260.19.29 NVIDIA driver and CUDA 3.2.16, and both GPUs are set to compute-exclusive mode.
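In case it helps: since I can't use nvidia-settings, my plan is to map each CUDA device ordinal to its PCI address from code and then check the negotiated link width with lspci instead. Here is a rough sketch of what I mean; it relies on the pciBusID/pciDeviceID fields of cudaDeviceProp, which as far as I know were added in CUDA 3.2:

#include <stdio.h>
#include <cuda_runtime.h>

int main(void)
{
    int count = 0;
    cudaGetDeviceCount(&count);
    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, dev);
        /* lspci shows bus/device numbers in hex, so print them that way */
        printf("Device %d (%s): PCI address %02x:%02x.0\n",
               dev, prop.name, prop.pciBusID, prop.pciDeviceID);
    }
    return 0;
}

With those addresses I should be able to run lspci -vv -s <bus>:<dev>.0 as root and look at the LnkSta line, which (if I'm reading the lspci documentation right) shows the actually negotiated link width, so I can see whether the second card really trained at x16.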
If anyone can offer any help in this matter, I would be very grateful. The server is used for research, and consistent results are quite important.
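For reference, this is roughly the cross-check I intend to run myself: timing pinned-memory copies on each device, since the bandwidthTest numbers above are for paged memory. It's just a sketch, not a replacement for bandwidthTest; the repeat count is arbitrary, and the cudaThreadExit() call is there because, as I understand it, under CUDA 3.2 a host thread can't switch devices while it still holds a context:

#include <stdio.h>
#include <cuda_runtime.h>

#define REPS 10

int main(void)
{
    const size_t bytes = 33554432;  /* same 32 MB transfer size as bandwidthTest */
    int count = 0;
    cudaGetDeviceCount(&count);
    for (int dev = 0; dev < count; ++dev) {
        cudaSetDevice(dev);
        char *h = 0, *d = 0;
        cudaMallocHost((void **)&h, bytes);  /* pinned host buffer */
        cudaMalloc((void **)&d, bytes);
        cudaMemcpy(d, h, bytes, cudaMemcpyHostToDevice);  /* warm-up copy */

        cudaEvent_t start, stop;
        cudaEventCreate(&start);
        cudaEventCreate(&stop);
        cudaEventRecord(start, 0);
        for (int i = 0; i < REPS; ++i)
            cudaMemcpy(d, h, bytes, cudaMemcpyHostToDevice);
        cudaEventRecord(stop, 0);
        cudaEventSynchronize(stop);

        float ms = 0.0f;
        cudaEventElapsedTime(&ms, start, stop);
        printf("Device %d: H2D pinned %.1f MB/s\n", dev,
               REPS * (bytes / (1024.0 * 1024.0)) / (ms / 1000.0));

        cudaEventDestroy(start);
        cudaEventDestroy(stop);
        cudaFree(d);
        cudaFreeHost(h);
        cudaThreadExit();  /* tear down the context so cudaSetDevice() can move to the next card */
    }
    return 0;
}

If the pinned numbers for device 1 come out just as low, I'd take that as pointing at the slot or link itself rather than anything on the host-memory side.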