one CUDA card unrecognized in 64bit Win7

educnq · April 13, 2011, 3:46am

I have two CUDA cards installed: a C1060 (device 0) and a GT430. Both work in Linux. In 64-bit Win7, however, only the C1060 is recognized, as shown in deviceQuery’s results.

C:\ProgramData\NVIDIA Corporation\NVIDIA GPU Computing SDK 4.0\C\bin\win64\Release\deviceQuery.exe Starting…

CUDA Device Query (Runtime API) version (CUDART static linking)

There is 1 device supporting CUDA

Device 0: “Tesla C1060”

CUDA Driver Version / Runtime Version 4.0 / 4.0

CUDA Capability Major/Minor version number: 1.3

Total amount of global memory: 4096 MBytes (4294770688 bytes)

(30) Multiprocessors x ( 8) CUDA Cores/MP: 240 CUDA Cores

GPU Clock Speed: 1.30 GHz

Memory Clock rate: 800.00 Mhz

Memory Bus Width: 512-bit

Max Texture Dimension Size (x,y,z) 1D=(8192), 2D=(65536,32768), 3D=(2048,2048,2048)

Max Layered Texture Size (dim) x layers 1D=(8192) x 512, 2D=(8192,8192) x 512

Total amount of constant memory: 65536 bytes

Total amount of shared memory per block: 16384 bytes

Total number of registers available per block: 16384

Warp size: 32

Maximum number of threads per block: 512

Maximum sizes of each dimension of a block: 512 x 512 x 64

Maximum sizes of each dimension of a grid: 65535 x 65535 x 1

Maximum memory pitch: 2147483647 bytes

Texture alignment: 256 bytes

Concurrent copy and execution: Yes with 1 copy engine(s)

Run time limit on kernels: No

Integrated GPU sharing Host Memory: No

Support host page-locked memory mapping: Yes

Concurrent kernel execution: No

Alignment requirement for Surfaces: Yes

Device has ECC support enabled: No

Device is using TCC driver mode: Yes

Device supports Unified Addressing (UVA): No

Device PCI Bus ID / PCI location ID: 131 / 0

Compute Mode:
 < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 4.0, CUDA Runtime Version = 4.0, NumDevs = 1, Device = Tesla C1060

Note that the C1060 is in TCC mode. Does anyone know how to fix this? Thanks.

Edit: Both cards can be seen in Device Manager.

educnq · April 13, 2011, 3:51am

Here is the Linux result:

[./deviceQuery] starting…

./deviceQuery Starting…

CUDA Device Query (Runtime API) version (CUDART static linking)

There are 2 devices supporting CUDA

Device 0: “Tesla C1060”

CUDA Driver Version / Runtime Version 4.0 / 4.0

CUDA Capability Major/Minor version number: 1.3

Total amount of global memory: 4096 MBytes (4294770688 bytes)

(30) Multiprocessors x ( 8) CUDA Cores/MP: 240 CUDA Cores

GPU Clock Speed: 1.30 GHz

Memory Clock rate: 800.00 Mhz

Memory Bus Width: 512-bit

Max Texture Dimension Size (x,y,z) 1D=(8192), 2D=(65536,32768), 3D=(2048,2048,2048)

Max Layered Texture Size (dim) x layers 1D=(8192) x 512, 2D=(8192,8192) x 512

Total amount of constant memory: 65536 bytes

Total amount of shared memory per block: 16384 bytes

Total number of registers available per block: 16384

Warp size: 32

Maximum number of threads per block: 512

Maximum sizes of each dimension of a block: 512 x 512 x 64

Maximum sizes of each dimension of a grid: 65535 x 65535 x 1

Maximum memory pitch: 2147483647 bytes

Texture alignment: 256 bytes

Concurrent copy and execution: Yes with 1 copy engine(s)

Run time limit on kernels: No

Integrated GPU sharing Host Memory: No

Support host page-locked memory mapping: Yes

Concurrent kernel execution: No

Alignment requirement for Surfaces: Yes

Device has ECC support enabled: No

Device is using TCC driver mode: No

Device supports Unified Addressing (UVA): No

Device PCI Bus ID / PCI location ID: 131 / 0

Compute Mode:
 < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
Device 1: “GeForce GT 430”

CUDA Driver Version / Runtime Version 4.0 / 4.0

CUDA Capability Major/Minor version number: 2.1

Total amount of global memory: 1024 MBytes (1073414144 bytes)

( 2) Multiprocessors x (48) CUDA Cores/MP: 96 CUDA Cores

GPU Clock Speed: 1.40 GHz

Memory Clock rate: 667.00 Mhz

Memory Bus Width: 128-bit

L2 Cache Size: 131072 bytes

Max Texture Dimension Size (x,y,z) 1D=(65536), 2D=(65536,65535), 3D=(2048,2048,2048)

Max Layered Texture Size (dim) x layers 1D=(16384) x 2048, 2D=(16384,16384) x 2048

Total amount of constant memory: 65536 bytes

Total amount of shared memory per block: 49152 bytes

Total number of registers available per block: 32768

Warp size: 32

Maximum number of threads per block: 1024

Maximum sizes of each dimension of a block: 1024 x 1024 x 64

Maximum sizes of each dimension of a grid: 65535 x 65535 x 65535

Maximum memory pitch: 2147483647 bytes

Texture alignment: 512 bytes

Concurrent copy and execution: Yes with 2 copy engine(s)

Run time limit on kernels: No

Integrated GPU sharing Host Memory: No

Support host page-locked memory mapping: Yes

Concurrent kernel execution: Yes

Alignment requirement for Surfaces: Yes

Device has ECC support enabled: No

Device is using TCC driver mode: No

Device supports Unified Addressing (UVA): Yes

Device PCI Bus ID / PCI location ID: 2 / 0

Compute Mode:
 < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 4.0, CUDA Runtime Version = 4.0, NumDevs = 2, Device = Tesla C1060, Device = GeForce GT 430

[./deviceQuery] test results…

PASSED

On both systems, I’m using the latest 4.0 RC2 software and developer drivers.
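For anyone comparing runs like these, the last line of each deviceQuery listing already summarizes what was detected. Below is a minimal Python sketch that parses that summary line and reports which devices are missing from one run relative to another. The helper names `parse_summary` and `missing_devices` are made up for illustration, and the regular expressions assume the summary format shown in the listings above (4.0 SDK); other SDK versions may format it differently.

```python
import re

# Summary lines copied from the Win7 and Linux deviceQuery runs above.
WINDOWS_SUMMARY = ("deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 4.0, "
                   "CUDA Runtime Version = 4.0, NumDevs = 1, Device = Tesla C1060")
LINUX_SUMMARY = ("deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 4.0, "
                 "CUDA Runtime Version = 4.0, NumDevs = 2, Device = Tesla C1060, "
                 "Device = GeForce GT 430")


def parse_summary(line):
    """Return (num_devs, device_names) from a deviceQuery summary line."""
    num_match = re.search(r"NumDevs\s*=\s*(\d+)", line)
    if num_match is None:
        raise ValueError("no NumDevs field found")
    # Each detected card appears as its own "Device = <name>" entry.
    names = [name.strip() for name in re.findall(r"Device\s*=\s*([^,]+)", line)]
    return int(num_match.group(1)), names


def missing_devices(reference_line, observed_line):
    """List devices named in reference_line but absent from observed_line."""
    _, reference = parse_summary(reference_line)
    _, observed = parse_summary(observed_line)
    return [name for name in reference if name not in observed]


if __name__ == "__main__":
    print(parse_summary(WINDOWS_SUMMARY))
    print(missing_devices(LINUX_SUMMARY, WINDOWS_SUMMARY))
```

Run against the two summary lines from this thread, the second call lists the card that Linux sees and Win7 does not.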

educnq · April 13, 2011, 2:19pm

Actually, I have a third video card (Matrox G200eW) in the same workstation, which is not a CUDA card and is used for display. Could that be the problem?

brano · April 14, 2011, 8:03am

Yes indeed, that could be the problem.

I had the same problem on my machine. Try plugging the monitor into the GT 430 card and see if it detects both.

The Tesla card works because it is in TCC mode. If you disable TCC on the Tesla, not even that card will be recognized as a CUDA device.

educnq · April 15, 2011, 7:25am

Thanks, brano. Using your method, I fixed the problem (see the new Win7 results below). Before plugging the monitor into the GT430 card, though, some steps are needed: disabling the onboard card’s driver in safe mode, and setting the boot video card priority in the BIOS [BIOS → Advanced → PCI/PnP Configuration → Boots Graphic Adapter Priority → Select “Slot 6” instead of the previous “Onboard VGA” (the NVIDIA GT430 card is in slot 6)].

C:\ProgramData\NVIDIA Corporation\NVIDIA GPU Computing SDK 4.0\C\bin\win64\Release\deviceQuery.exe Starting…

CUDA Device Query (Runtime API) version (CUDART static linking)

There are 2 devices supporting CUDA

Device 0: “Tesla C1060”

CUDA Driver Version / Runtime Version 4.0 / 4.0

CUDA Capability Major/Minor version number: 1.3

Total amount of global memory: 4096 MBytes (4294770688 bytes)

(30) Multiprocessors x ( 8) CUDA Cores/MP: 240 CUDA Cores

GPU Clock Speed: 1.30 GHz

Memory Clock rate: 800.00 Mhz

Memory Bus Width: 512-bit

Max Texture Dimension Size (x,y,z) 1D=(8192), 2D=(65536,32768), 3D=(2048,2048,2048)

Max Layered Texture Size (dim) x layers 1D=(8192) x 512, 2D=(8192,8192) x 512

Total amount of constant memory: 65536 bytes

Total amount of shared memory per block: 16384 bytes

Total number of registers available per block: 16384

Warp size: 32

Maximum number of threads per block: 512

Maximum sizes of each dimension of a block: 512 x 512 x 64

Maximum sizes of each dimension of a grid: 65535 x 65535 x 1

Maximum memory pitch: 2147483647 bytes

Texture alignment: 256 bytes

Concurrent copy and execution: Yes with 1 copy engine(s)

Run time limit on kernels: No

Integrated GPU sharing Host Memory: No

Support host page-locked memory mapping: Yes

Concurrent kernel execution: No

Alignment requirement for Surfaces: Yes

Device has ECC support enabled: No

Device is using TCC driver mode: Yes

Device supports Unified Addressing (UVA): No

Device PCI Bus ID / PCI location ID: 131 / 0

Compute Mode:
 < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
Device 1: “GeForce GT 430”

CUDA Driver Version / Runtime Version 4.0 / 4.0

CUDA Capability Major/Minor version number: 2.1

Total amount of global memory: 993 MBytes (1041694720 bytes)

( 2) Multiprocessors x (48) CUDA Cores/MP: 96 CUDA Cores

GPU Clock Speed: 1.40 GHz

Memory Clock rate: 667.00 Mhz

Memory Bus Width: 128-bit

L2 Cache Size: 131072 bytes

Max Texture Dimension Size (x,y,z) 1D=(65536), 2D=(65536,65535), 3D=(2048,2048,2048)

Max Layered Texture Size (dim) x layers 1D=(16384) x 2048, 2D=(16384,16384) x 2048

Total amount of constant memory: 65536 bytes

Total amount of shared memory per block: 49152 bytes

Total number of registers available per block: 32768

Warp size: 32

Maximum number of threads per block: 1024

Maximum sizes of each dimension of a block: 1024 x 1024 x 64

Maximum sizes of each dimension of a grid: 65535 x 65535 x 65535

Maximum memory pitch: 2147483647 bytes

Texture alignment: 512 bytes

Concurrent copy and execution: Yes with 2 copy engine(s)

Run time limit on kernels: Yes

Integrated GPU sharing Host Memory: No

Support host page-locked memory mapping: Yes

Concurrent kernel execution: Yes

Alignment requirement for Surfaces: Yes

Device has ECC support enabled: No

Device is using TCC driver mode: No

Device supports Unified Addressing (UVA): No

Device PCI Bus ID / PCI location ID: 2 / 0

Compute Mode:
 < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 4.0, CUDA Runtime Version = 4.0, NumDevs = 2, Device = Tesla C1060, Device = GeForce GT 430
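In the listing above, the GT430 now reports "Run time limit on kernels: Yes" and is not in TCC mode, while the C1060 stays in TCC mode with no run time limit. A small Python sketch for pulling those per-device fields out of saved deviceQuery output follows. The function name `parse_device_query` is hypothetical, and the parsing assumes the "Device N: “name”" headers and "key: value" lines seen in this thread; `SAMPLE` is an abbreviated excerpt of the listing, not the full output.

```python
import re

# 'Device 0: “Tesla C1060”' style header (straight or curly quotes).
HEADER = re.compile(r'^Device (\d+):\s*["“](.+?)["”]\s*$')
# 'key: value' lines; the colon must be followed by whitespace and a value.
FIELD = re.compile(r'^(.+?):\s+(.+)$')


def parse_device_query(text):
    """Map device index -> dict of the 'key: value' fields listed under it."""
    devices = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = HEADER.match(line)
        if header:
            current = {"name": header.group(2)}
            devices[int(header.group(1))] = current
            continue
        field = FIELD.match(line)
        # Lines before the first device header are ignored.
        if field and current is not None:
            current[field.group(1).strip()] = field.group(2).strip()
    return devices


# Abbreviated excerpt of the Win7 listing above.
SAMPLE = """\
Device 0: “Tesla C1060”
Run time limit on kernels: No
Device is using TCC driver mode: Yes
Device 1: “GeForce GT 430”
Run time limit on kernels: Yes
Device is using TCC driver mode: No
"""

if __name__ == "__main__":
    for index, props in sorted(parse_device_query(SAMPLE).items()):
        print(index, props["name"],
              "| TCC:", props.get("Device is using TCC driver mode"),
              "| run time limit:", props.get("Run time limit on kernels"))
```

The same function can be fed the whole saved output; lines that are not in "key: value" form, such as the texture size lines, are simply skipped.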

educnq · April 15, 2011, 9:15am

If the monitor is not plugged into the card, the card is not recognized. This sounds like a bug. In any case, Linux does not have this problem. NVIDIA may want to fix it.
