Hi,
I recently configured a Fedora 10 Linux system for research into GPU computing. The system has an NVIDIA Tesla C1060 and a GeForce 9500 GT. I ran the tests suggested in the CUDA computing guide to check that everything was configured correctly, and I found a couple of strange things:
- deviceQuery returned no GPUs
- deviceQueryDrv returned 2 GPUs.
However, when I run bandwidthTest, it crashes as follows:
[kadambi@janaka release]$ ./bandwidthTest
Running on…
device 0:��~Z�
Quick Mode
Host to Device Bandwidth for Pageable memory
Segmentation fault
Output of deviceQuery and deviceQueryDrv:
[kadambi@janaka release]$ ./deviceQuery
CUDA Device Query (Runtime API) version (CUDART static linking)
There is no device supporting CUDA
Test PASSED
Press ENTER to exit…
[kadambi@janaka release]$ ./deviceQueryDrv
CUDA Device Query (Driver API) statically linked version
There are 2 devices supporting CUDA
Device 0: “Tesla C1060”
CUDA Driver Version: 2.20
CUDA Capability Major revision number: 1
CUDA Capability Minor revision number: 3
Total amount of global memory: 4294705152 bytes
Number of multiprocessors: 30
Number of cores: 240
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 16384 bytes
Total number of registers available per block: 16384
Warp size: 32
Maximum number of threads per block: 512
Maximum sizes of each dimension of a block: 512 x 512 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
Maximum memory pitch: 262144 bytes
Texture alignment: 256 bytes
Clock rate: 1.30 GHz
Concurrent copy and execution: Yes
Run time limit on kernels: No
Integrated: No
Support host page-locked memory mapping: Yes
Compute mode: Default (multiple host threads can use this device simultaneously)
Device 1: “GeForce 9500 GT”
CUDA Driver Version: 2.20
CUDA Capability Major revision number: 1
CUDA Capability Minor revision number: 3
Total amount of global memory: 536150016 bytes
Number of multiprocessors: 4
Number of cores: 32
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 16384 bytes
Total number of registers available per block: 8192
Warp size: 32
Maximum number of threads per block: 512
Maximum sizes of each dimension of a block: 512 x 512 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
Maximum memory pitch: 262144 bytes
Texture alignment: 256 bytes
Clock rate: 1.40 GHz
Concurrent copy and execution: Yes
Run time limit on kernels: Yes
Integrated: No
Support host page-locked memory mapping: No
Compute mode: Default (multiple host threads can use this device simultaneously)
Test PASSED
Press ENTER to exit…
Any help is much appreciated.
R