I’ve used CUDA and OpenCL on a couple of cards, on openSUSE and CentOS, but I am unable to get a new Tesla C1060 working correctly on CentOS 5.4.
I have installed:
NVIDIA-Linux-x86_64-190.53-pkg2.run
cudatoolkit_2.3_linux_64_rhel5.3.run
cudasdk_2.3_linux.run
I have also added the script described in “Accelerated Computing: Installing Nvidia Tesla C1060 on a Mac Pro with CentOS” (http://acceleratedcomputing.blogspot.com/2...060-on-mac.html) and changed xorg.conf to use the GPU on the motherboard. When I reboot, everything comes up as expected.
Running deviceQuery from the SDK gives normal results, as does lspci. Everything looks just fine.
But when I run bandwidthTest, it hangs with one of the CPU cores at 100%. Since we only have one device, I wondered whether the “device to device transfer” was asking the impossible, but I see similar problems with other tests (for example, “transpose” prints nothing), and I am 99% certain I have run that test before with other (single) cards without anything hanging. Nothing extra is printed if I compile the SDK examples with “dbg=1” and run the code from the debug directory.
There are no errors in /var/log/messages. This is a 64-bit machine. I had no problems at all with OpenCL on this machine with a random 9800 card and the OpenCL-compatible driver.
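To narrow down where the hang happens, a stripped-down program that exercises only a device-to-device copy and one trivial kernel launch might help. The following is a minimal sketch, not taken from the SDK: the file name d2d_check.cu, the CHECK macro and the addOne kernel are hypothetical names chosen for illustration, and it assumes the Tesla is device 0, as deviceQuery and bandwidthTest report. It prints and flushes a line before each step, so the last line shown identifies the call that never returns.

```cuda
// d2d_check.cu (hypothetical file name): isolate the device-to-device copy
// and a trivial kernel launch, printing progress before every step.
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

// Abort with a readable message if a CUDA runtime call fails.
#define CHECK(call)                                              \
    do {                                                         \
        cudaError_t err = (call);                                \
        if (err != cudaSuccess) {                                \
            fprintf(stderr, "%s failed: %s\n", #call,            \
                    cudaGetErrorString(err));                    \
            exit(EXIT_FAILURE);                                  \
        }                                                        \
    } while (0)

// Print a step label and flush so it is visible even if the next call hangs.
static void step(const char *msg)
{
    printf("%s\n", msg);
    fflush(stdout);
}

// Trivial kernel: each thread adds 1 to one element.
__global__ void addOne(int *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1;
}

int main(void)
{
    const int n = 1 << 20;                 // 1M ints, 4 MB per buffer
    const size_t bytes = n * sizeof(int);
    int *host = (int *)malloc(bytes);
    int *devA = NULL, *devB = NULL;
    int i, errors = 0;

    if (host == NULL) return EXIT_FAILURE;
    for (i = 0; i < n; ++i) host[i] = i;

    step("1: cudaSetDevice(0)");           // assumption: the Tesla is device 0
    CHECK(cudaSetDevice(0));

    step("2: cudaMalloc x2");
    CHECK(cudaMalloc((void **)&devA, bytes));
    CHECK(cudaMalloc((void **)&devB, bytes));

    step("3: host -> device copy");
    CHECK(cudaMemcpy(devA, host, bytes, cudaMemcpyHostToDevice));

    step("4: device -> device copy");      // the step where bandwidthTest hangs
    CHECK(cudaMemcpy(devB, devA, bytes, cudaMemcpyDeviceToDevice));

    step("5: kernel launch");
    addOne<<<(n + 255) / 256, 256>>>(devB, n);
    CHECK(cudaGetLastError());             // catches launch configuration errors

    step("6: device -> host copy");        // blocks until the kernel has finished
    CHECK(cudaMemcpy(host, devB, bytes, cudaMemcpyDeviceToHost));

    for (i = 0; i < n; ++i)
        if (host[i] != i + 1) ++errors;
    printf("done, %d mismatches\n", errors);

    CHECK(cudaFree(devA));
    CHECK(cudaFree(devB));
    free(host);
    return errors ? EXIT_FAILURE : EXIT_SUCCESS;
}
```

This should build with `nvcc d2d_check.cu -o d2d_check`. If it stops after printing step 4, the problem is in the device-to-device copy itself. If it stops after step 5 or 6, kernel execution is the more likely culprit, which would also fit the silent hang in transpose.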
Any ideas? We were hoping the Tesla would give us a significant speedup, but so far it has simply failed to work at all :( Happy to provide more data, and apologies if this is explained somewhere (I checked the release notes and couldn’t find anything with a search).
Thanks,
Andrew
PS Output from some tests:
[andrew@schist debug]$ ./bandwidthTest
Running on…
device 0:Tesla C1060
Quick Mode
Host to Device Bandwidth for Pageable memory
.
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 3857.0
Quick Mode
Device to Host Bandwidth for Pageable memory
.
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 3050.9
Quick Mode
Device to Device Bandwidth
[hangs here]
[andrew@schist debug]$ ./transpose
[hangs here]