In short, deviceQuery tells me “There is 1 device supporting CUDA” but matrixMul says “no CUDA-capable device is available.”
The CUDA SDK binaries are freshly compiled, and no conflicting older kernel drivers are installed. Detailed specs and program output are below.
Older versions of the kernel driver worked with my GeForce 8600M GT. I suspect the newer driver or CUDA release is incompatible with my rather ancient hardware, but the “Getting Started” document suggests that CUDA 2.2 works with GeForce 8-series hardware.
I would like to develop code that uses features only available in CUDA 2.2.
Any suggestions?
Thanks,
Jeff
=========================================
cat /proc/version
Linux version 2.6.27.21-0.1-default (geeko@buildhost) (gcc version 4.3.2 [gcc-4_3-branch revision 141291] (SUSE Linux) ) #1 SMP 2009-03-31 14:50:44 +0200
=========================================
cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module 185.18.08 Thu Apr 30 15:48:49 PDT 2009
GCC version: gcc version 4.3.2 [gcc-4_3-branch revision 141291] (SUSE Linux)
=========================================
…/…/bin/linux/release/deviceQuery
CUDA Device Query (Runtime API) version (CUDART static linking)
There is 1 device supporting CUDA
Device 0: “GeForce 8600M GT”
CUDA Capability Major revision number: 1
CUDA Capability Minor revision number: 1
Total amount of global memory: 268107776 bytes
Number of multiprocessors: 4
Number of cores: 32
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 16384 bytes
Total number of registers available per block: 8192
Warp size: 32
Maximum number of threads per block: 512
Maximum sizes of each dimension of a block: 512 x 512 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
Maximum memory pitch: 262144 bytes
Texture alignment: 256 bytes
Clock rate: 0.95 GHz
Concurrent copy and execution: Yes
Run time limit on kernels: Yes
Integrated: No
Support host page-locked memory mapping: No
Compute mode: Default (multiple host threads can use this device simultaneously)
Test PASSED
Press ENTER to exit…
=========================================
…/…/bin/linux/release/deviceQueryDrv
CUDA Device Query (Driver API) statically linked version
There is 1 device supporting CUDA
Device 0: “GeForce 8600M GT”
CUDA Capability Major revision number: 1
CUDA Capability Minor revision number: 1
Total amount of global memory: 268107776 bytes
Number of multiprocessors: 4
Number of cores: 32
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 16384 bytes
Total number of registers available per block: 8192
Warp size: 32
Maximum number of threads per block: 512
Maximum sizes of each dimension of a block: 512 x 512 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
Maximum memory pitch: 262144 bytes
Texture alignment: 256 bytes
Clock rate: 0.95 GHz
Concurrent copy and execution: Yes
Run time limit on kernels: Yes
Integrated: No
Support host page-locked memory mapping: No
Compute mode: Default (multiple host threads can use this device simultaneously)
Test PASSED
Press ENTER to exit…
=========================================
…/…/bin/linux/release/matrixMul
cudaSafeCall() Runtime API error in file <matrixMul.cu>, line 108 : no CUDA-capable device is available.
=========================================
…/…/bin/linux/release/matrixMulDrv
Using device 0: GeForce 8600M GT
cuSafeCallNoSync() Driver API error = 0002 from file <matrixMulDrv.cpp>, line 96.
=========================================
…/…/bin/linux/release/simpleCUBLAS
simpleCUBLAS test running…
!!! CUBLAS initialization error
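=========================================
For what it's worth, the numeric code in the matrixMulDrv output above can be looked up against the CUresult enum in cuda.h. The following is a minimal sketch, not a diagnosis. It assumes the enum values as I understand them from the CUDA 2.x header (only a subset is listed, and they are worth checking against the installed cuda.h), and it assumes the SDK helper prints the code in decimal. The names `CURESULT_NAMES` and `decode_driver_error` are hypothetical helpers, not part of the SDK.

```python
import re

# Subset of CUresult values, as I understand them from the CUDA 2.x cuda.h.
# Assumption: verify these against the header that ships with your toolkit.
CURESULT_NAMES = {
    0: "CUDA_SUCCESS",
    1: "CUDA_ERROR_INVALID_VALUE",
    2: "CUDA_ERROR_OUT_OF_MEMORY",
    3: "CUDA_ERROR_NOT_INITIALIZED",
    4: "CUDA_ERROR_DEINITIALIZED",
    100: "CUDA_ERROR_NO_DEVICE",
    101: "CUDA_ERROR_INVALID_DEVICE",
}

# Matches the message format seen in the log, e.g.
# "cuSafeCallNoSync() Driver API error = 0002 from file <...>, line 96."
_ERROR_RE = re.compile(r"Driver API error = (\d+)")


def decode_driver_error(log_line):
    """Return (code, name) for a Driver API error line, or None if no code is found."""
    match = _ERROR_RE.search(log_line)
    if match is None:
        return None
    # Assumption: the code is printed in decimal with zero padding.
    code = int(match.group(1), 10)
    return code, CURESULT_NAMES.get(code, "UNKNOWN")


if __name__ == "__main__":
    line = ("cuSafeCallNoSync() Driver API error = 0002 "
            "from file <matrixMulDrv.cpp>, line 96.")
    print(decode_driver_error(line))
```

If that table is right, code 0002 corresponds to CUDA_ERROR_OUT_OF_MEMORY rather than a missing-device error. That would point toward context creation failing for lack of free device memory, which might be plausible on a card with about 256 MB that is also driving the display, but I have not confirmed this.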