I can’t get any of the shipped SDK samples to run, other than maybe deviceQuery. All of the samples that actually launch a kernel crash with:
“Runtime API error : all CUDA-capable devices are busy or unavailable.”
Using the 3.1 the samples run fine, and using 3.2 on a different card (295) is also successful. Oh, I’m using driver 260.99, but it doesn’t work with 260.93 either.
Maybe there’s something obvious I’m overlooking? Any advice is appreciated.
Here’s the deviceQuery output
deviceQuery.exe Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
There is 1 device supporting CUDA
Device 0: "GeForce GTX 480"
CUDA Driver Version: 3.20
CUDA Runtime Version: 3.20
CUDA Capability Major/Minor version number: 2.0
Total amount of global memory: 1576468480 bytes
Multiprocessors x Cores/MP = Cores: 15 (MP) x 32 (Cores/MP) = 480 (Cores)
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 32768
Warp size: 32
Maximum number of threads per block: 1024
Maximum sizes of each dimension of a block: 1024 x 1024 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Clock rate: 0.81 GHz
Concurrent copy and execution: Yes
Run time limit on kernels: Yes
Integrated: No
Support host page-locked memory mapping: Yes
Compute mode: Default (multiple host threads can use this device simultaneously)
Concurrent kernel execution: Yes
Device has ECC support enabled: No
Device is using TCC driver mode: No
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 3.20, CUDA Runtime Version = 3.20, NumDevs = 1, Device = GeForce GTX 480
PASSED
Thanks