Hi all,
I’ve got an x86_64 CentOS 4.8 (RHEL-compatible) machine connected to half (2 GPUs) of an S1070. I previously had CUDA 2.3 installed and working fine, and today I installed CUDA 3.1.
Everything compiles fine, and my previously compiled CUDA 2.3 programs continue to work.
When trying the examples:
[codebox]# /usr/local/cudasdk31/C/bin/linux/release/clock
cudaSafeCall() Runtime API error : all CUDA-capable devices are busy or unavailable.[/codebox]
I then tried deviceQuery:
[codebox]# /usr/local/cudasdk31/C/bin/linux/release/deviceQuery
[/codebox]
… prints nothing at all, and exits when I press Enter.
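To narrow down which runtime call is actually failing, I've been using a minimal probe along these lines (just a sketch, built with nvcc from the 3.1 toolkit; `cudaGetDeviceCount` is where "all CUDA-capable devices are busy or unavailable" would surface if the runtime can't see the GPUs):

```cuda
// probe_runtime.cu -- minimal check of the CUDA runtime API.
// Build (assuming the 3.1 toolkit is on PATH): nvcc -o probe_runtime probe_runtime.cu
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        // If the runtime itself cannot enumerate devices, the error
        // string should say so explicitly.
        printf("cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("runtime sees %d device(s)\n", count);

    for (int i = 0; i < count; ++i) {
        cudaDeviceProp prop;
        err = cudaGetDeviceProperties(&prop, i);
        if (err != cudaSuccess) {
            printf("device %d: %s\n", i, cudaGetErrorString(err));
            continue;
        }
        printf("device %d: %s (compute %d.%d)\n",
               i, prop.name, prop.major, prop.minor);
    }
    return 0;
}
```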
deviceQueryDrv, on the other hand, gives:
[codebox]# /usr/local/cudasdk31/C/bin/linux/release/deviceQueryDrv
CUDA Device Query (Driver API) statically linked version
There are 2 devices supporting CUDA
Device 0: "Tesla T10 Processor"
CUDA Driver Version: 3.10
CUDA Capability Major revision number: 1
CUDA Capability Minor revision number: 3
Total amount of global memory: 4294770688 bytes
Number of multiprocessors: 30
Number of cores: 240
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 16384 bytes
Total number of registers available per block: 16384
Warp size: 32
Maximum number of threads per block: 512
Maximum sizes of each dimension of a block: 512 x 512 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
Maximum memory pitch: 2147483647 bytes
Texture alignment: 256 bytes
Clock rate: 1.30 GHz
Concurrent copy and execution: Yes
Run time limit on kernels: No
Integrated: No
Support host page-locked memory mapping: Yes
Concurrent kernel execution: No
Device has ECC support enabled: No
Device 1: "Tesla T10 Processor"
CUDA Driver Version: 3.10
CUDA Capability Major revision number: 1
CUDA Capability Minor revision number: 3
Total amount of global memory: 4294770688 bytes
Number of multiprocessors: 30
Number of cores: 240
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 16384 bytes
Total number of registers available per block: 16384
Warp size: 32
Maximum number of threads per block: 512
Maximum sizes of each dimension of a block: 512 x 512 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
Maximum memory pitch: 2147483647 bytes
Texture alignment: 256 bytes
Clock rate: 1.30 GHz
Concurrent copy and execution: Yes
Run time limit on kernels: No
Integrated: No
Support host page-locked memory mapping: Yes
Concurrent kernel execution: No
Device has ECC support enabled: No
PASSED
Press ENTER to exit…
[/codebox]
It's very strange. I've already checked the usual causes: I've rebooted the system, verified the permissions on /dev/nvidia* (all are a+rw), and confirmed the GPUs are not in compute-exclusive mode:
[codebox]# ls -al /dev
total 0
[…snip…]
crw-rw-rw- 1 root root 195, 0 Jul 14 15:28 nvidia0
crw-rw-rw- 1 root root 195, 1 Jul 14 15:28 nvidia1
crw-rw-rw- 1 root root 195, 2 Jul 14 15:28 nvidia2
crw-rw-rw- 1 root root 195, 3 Jul 14 15:28 nvidia3
crw-rw-rw- 1 root root 195, 4 Jul 14 15:28 nvidia4
crw-rw-rw- 1 root root 195, 255 Jul 14 15:28 nvidiactl
[…snip]
# nvidia-smi -s
COMPUTE mode rules for GPU 0: 0
COMPUTE mode rules for GPU 1: 0
[/codebox]
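Since the driver API path (deviceQueryDrv) clearly works, I can also confirm the compute mode programmatically from that side. A sketch of what I mean (driver API only, linked with -lcuda; the printed values use the same 0/1/2 encoding as nvidia-smi -s):

```cuda
// probe_compute_mode.cu -- read each GPU's compute mode via the driver API.
// Build: nvcc -o probe_compute_mode probe_compute_mode.cu -lcuda
#include <cstdio>
#include <cuda.h>

int main() {
    if (cuInit(0) != CUDA_SUCCESS) {
        printf("cuInit failed\n");
        return 1;
    }
    int count = 0;
    cuDeviceGetCount(&count);
    for (int i = 0; i < count; ++i) {
        CUdevice dev;
        cuDeviceGet(&dev, i);
        int mode = 0;
        // 0 = default (shared), 1 = exclusive, 2 = prohibited
        cuDeviceGetAttribute(&mode, CU_DEVICE_ATTRIBUTE_COMPUTE_MODE, dev);
        printf("GPU %d compute mode: %d\n", i, mode);
    }
    return 0;
}
```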
Any ideas on what might be causing this behaviour? Any help is much appreciated.
– Alf