GTX295 Specifications & CUDA

Hi!

I have started working with CUDA on my new PC. The PC specifications are:

As can be seen at the previous URL, the GPU consists of 480 (240x2) processor cores and 1792MB (896x2) of memory.
However, if I run the “deviceQuery.exe” program that comes with the NVIDIA GPU Computing SDK, the result is as follows:

So, it reports that the GPU is a “GeForce GTX 295”, but only 240 processor cores and 869MB of memory appear! Indeed, if I try to load more than 869MB onto the GPU, the program crashes.

So the question is: why doesn’t CUDA detect the correct specifications of the GTX295? Is it a bug? Is there any solution?

Thanks a lot!

P.S. I’m sorry for my English :)

What you should see is two such devices. Basically, it seems your second device is not detected. I don't have much experience with Windows in these matters, but I think I remember that if you activate SLI, CUDA cannot detect both devices (probably with SLI enabled, Windows thinks there is only one device, so CUDA can only find one). CUDA cannot use the SLI bridge.
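To check this from your own program rather than from deviceQuery, something like the following should work. It is only a minimal sketch using the CUDA runtime API: it lists every device CUDA can see and tries a test allocation on each one. The 256 MiB test size is an arbitrary value I picked for illustration. Each half of a GTX 295 shows up as its own device with its own memory, so a single allocation has to fit into one device's memory, not the combined 1792MB.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    // Ask the runtime how many CUDA-capable devices it can see.
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaGetDeviceCount failed: %s\n",
                     cudaGetErrorString(err));
        return 1;
    }
    std::printf("CUDA devices found: %d\n", count);

    // Arbitrary test size (assumption for illustration): 256 MiB.
    const size_t testBytes = static_cast<size_t>(256) << 20;

    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        err = cudaGetDeviceProperties(&prop, dev);
        if (err != cudaSuccess) {
            std::fprintf(stderr, "Device %d: cannot read properties: %s\n",
                         dev, cudaGetErrorString(err));
            continue;
        }
        std::printf("Device %d: %s, %d multiprocessors, %zu bytes global memory\n",
                    dev, prop.name, prop.multiProcessorCount,
                    prop.totalGlobalMem);

        // Make this device current, then allocate on it.
        err = cudaSetDevice(dev);
        if (err != cudaSuccess) {
            std::fprintf(stderr, "Device %d: cudaSetDevice failed: %s\n",
                         dev, cudaGetErrorString(err));
            continue;
        }

        void* buf = nullptr;
        err = cudaMalloc(&buf, testBytes);
        if (err != cudaSuccess) {
            // Checking the return code reports the failure instead of
            // letting a later use of an invalid pointer crash the program.
            std::fprintf(stderr, "Device %d: cudaMalloc of %zu bytes failed: %s\n",
                         dev, testBytes, cudaGetErrorString(err));
            continue;
        }
        std::printf("Device %d: allocated %zu bytes\n", dev, testBytes);
        cudaFree(buf);
    }
    return 0;
}
```

If this prints only one device on your machine, the second GPU is hidden from CUDA before your program even starts, which would match what deviceQuery shows you.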

Cheers

Ceearem

P.S. This is what the output looks like on one of our Linux machines with two GTX295s and an 8400GS for the X server (though right now none is running…)

/usr/app-soft/NVIDIA_GPU_Computing_SDK/C/bin/linux/release/deviceQuery Starting...

CUDA Device Query (Runtime API) version (CUDART static linking)

There are 5 devices supporting CUDA

Device 0: "GeForce GTX 295"

  CUDA Driver Version:						   3.20

  CUDA Runtime Version:						  3.20

  CUDA Capability Major revision number:		 1

  CUDA Capability Minor revision number:		 3

  Total amount of global memory:				 939327488 bytes

  Number of multiprocessors:					 30

  Number of cores:							   240

  Total amount of constant memory:			   65536 bytes

  Total amount of shared memory per block:	   16384 bytes

  Total number of registers available per block: 16384

  Warp size:									 32

  Maximum number of threads per block:		   512

  Maximum sizes of each dimension of a block:	512 x 512 x 64

  Maximum sizes of each dimension of a grid:	 65535 x 65535 x 1

  Maximum memory pitch:						  2147483647 bytes

  Texture alignment:							 256 bytes

  Clock rate:									1.38 GHz

  Concurrent copy and execution:				 Yes

  Run time limit on kernels:					 No

  Integrated:									No

  Support host page-locked memory mapping:	   Yes

  Compute mode:								  Exclusive (only one host thread at a time can use this device)

  Concurrent kernel execution:				   No

  Device has ECC support enabled:				No

Device 1: "GeForce 8400 GS"

  CUDA Driver Version:						   3.20

  CUDA Runtime Version:						  3.20

  CUDA Capability Major revision number:		 1

  CUDA Capability Minor revision number:		 1

  Total amount of global memory:				 536150016 bytes

  Number of multiprocessors:					 1

  Number of cores:							   8

  Total amount of constant memory:			   65536 bytes

  Total amount of shared memory per block:	   16384 bytes

  Total number of registers available per block: 8192

  Warp size:									 32

  Maximum number of threads per block:		   512

  Maximum sizes of each dimension of a block:	512 x 512 x 64

  Maximum sizes of each dimension of a grid:	 65535 x 65535 x 1

  Maximum memory pitch:						  2147483647 bytes

  Texture alignment:							 256 bytes

  Clock rate:									1.62 GHz

  Concurrent copy and execution:				 No

  Run time limit on kernels:					 No

  Integrated:									No

  Support host page-locked memory mapping:	   Yes

  Compute mode:								  Default (multiple host threads can use this device simultaneously)

  Concurrent kernel execution:				   No

  Device has ECC support enabled:				No

Device 2: "GeForce GTX 295"

  CUDA Driver Version:						   3.20

  CUDA Runtime Version:						  3.20

  CUDA Capability Major revision number:		 1

  CUDA Capability Minor revision number:		 3

  Total amount of global memory:				 939327488 bytes

  Number of multiprocessors:					 30

  Number of cores:							   240

  Total amount of constant memory:			   65536 bytes

  Total amount of shared memory per block:	   16384 bytes

  Total number of registers available per block: 16384

  Warp size:									 32

  Maximum number of threads per block:		   512

  Maximum sizes of each dimension of a block:	512 x 512 x 64

  Maximum sizes of each dimension of a grid:	 65535 x 65535 x 1

  Maximum memory pitch:						  2147483647 bytes

  Texture alignment:							 256 bytes

  Clock rate:									1.38 GHz

  Concurrent copy and execution:				 Yes

  Run time limit on kernels:					 No

  Integrated:									No

  Support host page-locked memory mapping:	   Yes

  Compute mode:								  Exclusive (only one host thread at a time can use this device)

  Concurrent kernel execution:				   No

  Device has ECC support enabled:				No

Device 3: "GeForce GTX 295"

  CUDA Driver Version:						   3.20

  CUDA Runtime Version:						  3.20

  CUDA Capability Major revision number:		 1

  CUDA Capability Minor revision number:		 3

  Total amount of global memory:				 939327488 bytes

  Number of multiprocessors:					 30

  Number of cores:							   240

  Total amount of constant memory:			   65536 bytes

  Total amount of shared memory per block:	   16384 bytes

  Total number of registers available per block: 16384

  Warp size:									 32

  Maximum number of threads per block:		   512

  Maximum sizes of each dimension of a block:	512 x 512 x 64

  Maximum sizes of each dimension of a grid:	 65535 x 65535 x 1

  Maximum memory pitch:						  2147483647 bytes

  Texture alignment:							 256 bytes

  Clock rate:									1.38 GHz

  Concurrent copy and execution:				 Yes

  Run time limit on kernels:					 No

  Integrated:									No

  Support host page-locked memory mapping:	   Yes

  Compute mode:								  Exclusive (only one host thread at a time can use this device)

  Concurrent kernel execution:				   No

  Device has ECC support enabled:				No

Device 4: "GeForce GTX 295"

  CUDA Driver Version:						   3.20

  CUDA Runtime Version:						  3.20

  CUDA Capability Major revision number:		 1

  CUDA Capability Minor revision number:		 3

  Total amount of global memory:				 939327488 bytes

  Number of multiprocessors:					 30

  Number of cores:							   240

  Total amount of constant memory:			   65536 bytes

  Total amount of shared memory per block:	   16384 bytes

  Total number of registers available per block: 16384

  Warp size:									 32

  Maximum number of threads per block:		   512

  Maximum sizes of each dimension of a block:	512 x 512 x 64

  Maximum sizes of each dimension of a grid:	 65535 x 65535 x 1

  Maximum memory pitch:						  2147483647 bytes

  Texture alignment:							 256 bytes

  Clock rate:									1.38 GHz

  Concurrent copy and execution:				 Yes

  Run time limit on kernels:					 No

  Integrated:									No

  Support host page-locked memory mapping:	   Yes

  Compute mode:								  Exclusive (only one host thread at a time can use this device)

  Concurrent kernel execution:				   No

  Device has ECC support enabled:				No

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 3.20, CUDA Runtime Version = 3.20, NumDevs = 5, Device = GeForce GTX 295, Device = GeForce 8400 GS

PASSED

Press <Enter> to Quit...

-----------------------------------------------------------

I can’t remember exactly what the option is called, but there’s something you need to set in the NVIDIA dialog in Control Panel. Something about acceleration or PhysX, if I remember correctly (my old development machine had two GTX295 boards in it, and I ran into the same problem).
