Hello,
I hope this is the right forum. We have a computer with 2 Tesla cards running Ubuntu 10.04, and we have a problem with one of them. deviceQuery reports some strange numbers and then crashes:
[deviceQuery] starting...
./deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
Found 2 CUDA Capable device(s)
Device 0: "Tesla C2070"
CUDA Driver Version / Runtime Version 4.1 / 4.0
CUDA Capability Major/Minor version number: 2.0
( 0) Multiprocessors x (32) CUDA Cores/MP: 0 CUDA Cores
Max Texture Dimension Size (x,y,z) 1D=(65535), 2D=(2048,2048), 3D=(0,512,0)
Max Layered Texture Size (dim) x layers 1D=(1) x 6, 2D=(0,0) x 0
Concurrent copy and execution: No with -725995520 copy engine(s)
...
Device PCI Bus ID / PCI location ID: -722875584 / 32669
Compute Mode:
Segmentation fault
As you can see, it detects 2 cards, but for device 0 it reports 0 multiprocessors, texture sizes of 0, a negative number of copy engines, and a garbled PCI bus ID. We have CUDA toolkit 4.1 and NVIDIA driver x86_64-285.05.33.
It is also weird that my programs sometimes run on device 0 and sometimes do not.
We did not seem to have this problem a few days ago, before upgrading from 4.0 to 4.1. Is it just the driver, or is the card burned out?
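
In case it helps anyone compare against their own output, here is a rough Python sketch of the sanity check I am doing by eye on the deviceQuery text. The function name find_suspicious_lines and its two rules (a negative integer, or a multiprocessor count of 0) are my own assumptions for illustration and are not part of the CUDA toolkit. It only reads pasted text and does not talk to the driver or the card.

```python
import re


def find_suspicious_lines(report):
    """Return the deviceQuery lines that contain implausible values.

    Hypothetical helper: the two rules below are illustrative assumptions,
    not an official validity check.
    """
    suspicious = []
    for line in report.splitlines():
        text = line.strip()
        # Rule 1: a standalone negative integer makes no sense for a count
        # or a bus ID. The lookbehind skips hyphens inside version strings
        # such as "x86_64-285.05.33".
        has_negative = re.search(r"(?<![\w.])-\d+", text) is not None
        # Rule 2: "( 0) Multiprocessors" means no multiprocessors reported.
        zero_mps = re.search(r"\(\s*0\)\s*Multiprocessors", text) is not None
        if has_negative or zero_mps:
            suspicious.append(text)
    return suspicious


# Lines copied from the output above.
sample = """\
Device 0: "Tesla C2070"
CUDA Capability Major/Minor version number: 2.0
( 0) Multiprocessors x (32) CUDA Cores/MP: 0 CUDA Cores
Concurrent copy and execution: No with -725995520 copy engine(s)
Device PCI Bus ID / PCI location ID: -722875584 / 32669
"""

for line in find_suspicious_lines(sample):
    print("suspicious:", line)
```

On the lines above this flags the multiprocessor, copy engine, and PCI bus ID lines and leaves the device name and compute capability lines alone.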