Hi all,
I am trying to understand the reduction sample program. However, when I try to run it, I get an "unspecified driver error". I am on a 64-bit machine running 32-bit Fedora 10, with the devdriver_3.0_linux_32_195.36.15 driver installed for an NVIDIA GeForce 8600 GT.
What is the problem here?
[student@localhost ~]$ cd ~/NVIDIA_GPU_Computing_SDK/C/bin/linux/release/
[student@localhost release]$ ./reduction
./reduction Starting...
Using Device 0: GeForce 8600 GT
Reducing array of type int
16777216 elements
256 threads (max)
64 blocks
reduction.cpp(476) : cudaSafeCallNoSync() Runtime API error : unspecified driver error.
[student@localhost release]$ uname -a
Linux localhost.localdomain 2.6.27.5-117.fc10.i686.PAE #1 SMP Tue Nov 18 12:08:10 EST 2008 i686 i686 i386 GNU/Linux
[student@localhost release]$ lspci -nn | grep 'VGA\|NV'
01:00.0 VGA compatible controller [0300]: nVidia Corporation GeForce 8600 GT [10de:0402] (rev a1)
However, I later tried this:
[student@localhost release]$ ./vectorAdd
Vector addition
vectorAdd.cu(71) : cudaSafeCall() Runtime API error : unspecified driver error.
[student@localhost release]$ cd ..
[student@localhost linux]$ cd ..
[student@localhost bin]$ cd ..
[student@localhost C]$ cd src/
[student@localhost src]$ cd deviceQuery
[student@localhost deviceQuery]$ ls
deviceQuery.cpp Makefile
[student@localhost deviceQuery]$ make
deviceQuery.cpp:120:11: warning: extra tokens at end of #else directive
deviceQuery.cpp:129:11: warning: extra tokens at end of #else directive
deviceQuery.cpp: In function ‘int main(int, const char**)’:
deviceQuery.cpp:121: warning: format ‘%d’ expects type ‘int’, but argument 3 has type ‘const char*’
deviceQuery.cpp:121: warning: too many arguments for format
[student@localhost deviceQuery]$ cd ..
[student@localhost src]$ cd ..
[student@localhost C]$ cd bin/linux/release/
[student@localhost release]$ ls
deviceQuery reduction reduction.txt vectorAdd
[student@localhost release]$ ./deviceQuery
./deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
There is 1 device supporting CUDA
Device 0: "GeForce 8600 GT"
CUDA Driver Version: 3.0
CUDA Runtime Version: 3.0
CUDA Capability Major revision number: 1
CUDA Capability Minor revision number: 1
Total amount of global memory: 536150016 bytes
Number of multiprocessors: 4
Number of cores: 32
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 16384 bytes
Total number of registers available per block: 8192
Warp size: 32
Maximum number of threads per block: 512
Maximum sizes of each dimension of a block: 512 x 512 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
Maximum memory pitch: 262144 bytes
Texture alignment: 256 bytes
Clock rate: 1.19 GHz
Concurrent copy and execution: Yes
Run time limit on kernels: Yes
Integrated: No
Support host page-locked memory mapping: No
Compute mode: Default (multiple host threads can use this device simultaneously)
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 134564155, CUDA Runtime Version = 3.0, NumDevs = 1, Device = GeForce 8600 GT
PASSED
Press <Enter> to Quit...
-----------------------------------------------------------
[student@localhost release]$ cd ..
[student@localhost linux]$ cd ..
[student@localhost bin]$ cd ..
[student@localhost C]$ cd src/deviceQueryDrv/
[student@localhost deviceQueryDrv]$ ls
deviceQueryDrv.cpp Makefile
[student@localhost deviceQueryDrv]$ make
deviceQueryDrv.cpp: In function ‘int main(int, char**)’:
deviceQueryDrv.cpp:44: warning: unused variable ‘err’
[student@localhost deviceQueryDrv]$ cd ..
[student@localhost src]$ cd ..
[student@localhost C]$ cd bin/linux/release/
[student@localhost release]$ ls
deviceQuery deviceQueryDrv deviceQuery.txt reduction reduction.txt SdkMasterLog.csv vectorAdd
[student@localhost release]$ ./deviceQueryDrv
CUDA Device Query (Driver API) statically linked version
There is 1 device supporting CUDA
Device 0: "GeForce 8600 GT"
CUDA Driver Version: 3.0
CUDA Capability Major revision number: 1
CUDA Capability Minor revision number: 1
Total amount of global memory: 536150016 bytes
Number of multiprocessors: 4
Number of cores: 32
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 16384 bytes
Total number of registers available per block: 8192
Warp size: 32
Maximum number of threads per block: 512
Maximum sizes of each dimension of a block: 512 x 512 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
Maximum memory pitch: 262144 bytes
Texture alignment: 256 bytes
Clock rate: 1.19 GHz
Concurrent copy and execution: Yes
Run time limit on kernels: Yes
Integrated: No
Support host page-locked memory mapping: No
Compute mode: Default (multiple host threads can use this device simultaneously)
PASSED
Press ENTER to exit...
The deviceQuery program runs fine.
The runtime API and the driver API seem to be working fine, if I understand them correctly. So, what’s the problem?
Thanks and Regards,
kg