G210, GT220 deviceQuery?

Now that the G210 and GT220 are about to hit retail, would somebody care to post the output of ‘deviceQuery’ for those cards?

Are these going to be the first compute 1.2 capable cores in the wild, or are they just G9x cores shrunk onto TSMC’s 40nm process?

From gpugrid.net forums

02/09/2009 19:28:49 CUDA device: GeForce GT 220 (driver version 19062, compute capability 1.2, 1024MB, est. 23GFLOPS)

Voila, it’s Compute 1.2

I am pretty dissatisfied with nVidia’s information policy regarding the newer laptop and OEM/mid-range chips. It’s really hard to get any reliable information before product launch day.

Christian

OK, so the GT220 is compute 1.2 and therefore has more registers. That’s great! How about “Support host page-locked memory mapping” and “Concurrent copy and execution”, then?

The G210 has 16 shaders (rather than an expected 24), which would be consistent with a die shrink of the 9400GT and compute 1.1, but … what is it actually?

All compute 1.2 devices should support zero-copy and async memcpys. (I haven’t actually checked this, I’m going off memory, but I’m pretty sure I’m right.)
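Rather than inferring these features from the compute capability number, you can read the flags straight out of cudaGetDeviceProperties(). A minimal sketch, assuming a CUDA 2.2 or newer toolkit (the canMapHostMemory field first appeared in 2.2):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Query the zero-copy and copy/exec-overlap flags directly,
// instead of guessing from the compute capability.
int main() {
    cudaDeviceProp prop;
    cudaError_t err = cudaGetDeviceProperties(&prop, 0);  // device 0
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceProperties failed: %s\n",
                cudaGetErrorString(err));
        return 1;
    }
    printf("Compute capability:       %d.%d\n", prop.major, prop.minor);
    printf("Can map host memory:      %s\n", prop.canMapHostMemory ? "Yes" : "No");
    printf("Concurrent copy and exec: %s\n", prop.deviceOverlap ? "Yes" : "No");
    return 0;
}
```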

Well, could you? I am not asking for a big release with trumpets and elephants, not even a candlelight moment - just a deviceQuery so the card won’t be buried in total silence.

I’d like to ask for the fanfares and trumpets and for the detailed CUDA compute specs to be included on the product page of each chip.

Just stating the DirectX capability, number of shaders, and clock rates is not enough for us coders. I’d be eternally grateful if you could forward this request to marketing and/or the web design guys.

Add a small “CUDA” section to the Technical Specifications page.

Christian

And more importantly - will current Linux and Windows CUDA drivers recognize the PCI device IDs of these cards?

Or will one have to install unofficial or beta drivers?

Searching the nVidia driver download page for a driver supporting the GT220 on 32-bit Linux comes up empty, even when including beta drivers. Sigh

Christian

GT220 is supported in 190.36 and has VDPAU feature set ‘C’:

GeForce GT 220 0x0A20 C

GeForce GT 230M 0x0A2A C

GeForce GT 240M 0x0A34 C

GeForce G210 0x0A60 C

GeForce G210M 0x0A74 C

GeForce GTS 260M 0x0CA8 C

GeForce GTS 250M 0x0CA9 C

ftp://download.nvidia.com/XFree86/Linux-x…appendix-a.html

Just bought a cheap GT220 model with 1GB of DDR2 memory for 58 Euros. Will post the deviceQuery string tomorrow or so.

Christian

Taking a bullet for the team? :thumbup:

Purely for selfish reasons. I am running out of registers on the G80/G92 architecture and wanted to get compute 1.2 on the cheap.

The GT220 hasn’t exactly been setting the world on fire in the gaming benchmark stakes, but it should make a great CUDA development card. Its 48 CUDA cores across 6 compute capability 1.2 multiprocessors should perform pretty well on compute-bound tasks, with all that extra register file space and niceties like shared memory atomics.
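As a toy illustration of those shared memory atomics: compute 1.1 only offers atomics on global memory, while from compute 1.2 on you can atomicAdd into __shared__ arrays. A hypothetical 8-bin histogram kernel (compile with nvcc -arch=sm_12), just to sketch the idea:

```cuda
#include <cuda_runtime.h>

// Toy histogram over the low 3 bits of each byte.
// The per-block bins live in shared memory and are updated with
// shared memory atomics, a compute 1.2 feature.
__global__ void hist8(const unsigned char *in, int n, unsigned int *out) {
    __shared__ unsigned int bins[8];
    if (threadIdx.x < 8) bins[threadIdx.x] = 0;
    __syncthreads();

    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n;
         i += gridDim.x * blockDim.x)
        atomicAdd(&bins[in[i] & 7], 1u);   // shared memory atomic

    __syncthreads();
    // fold the per-block bins into the global result
    if (threadIdx.x < 8) atomicAdd(&out[threadIdx.x], bins[threadIdx.x]);
}
```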

Dear nVidia, you’ve got to be kidding me. The card has been in retail channels all week. We need working Linux drivers, please.

(II) NVIDIA dlloader X Driver  190.36  Wed Sep 23 07:47:56 PDT 2009

(II) NVIDIA Unified Driver for all Supported NVIDIA GPUs

(II) Primary Device is: 

(EE) No devices detected.

UPDATE:

Dear nVidia, I apologize.

xorg expected me to specify a BusID in the device section, probably because I had two cards in the machine and it could not determine on its own which one to use. The driver works now.
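For anyone hitting the same “(EE) No devices detected” with two cards installed: an explicit BusID in the Device section of xorg.conf fixes it. The bus address below is an example; substitute the one reported by lspci for your card (Xorg expects it in decimal, as PCI:bus:device:function):

```
Section "Device"
    Identifier "GT220"
    Driver     "nvidia"
    BusID      "PCI:1:0:0"   # example; adjust to your lspci output
EndSection
```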

Device 0: "GeForce GT 220"

  CUDA Driver Version:						   2.30

  CUDA Runtime Version:						  2.30

  CUDA Capability Major revision number:		 1

  CUDA Capability Minor revision number:		 2

  Total amount of global memory:				 1073414144 bytes

  Number of multiprocessors:					 6

  Number of cores:							   48

  Total amount of constant memory:			   65536 bytes

  Total amount of shared memory per block:	   16384 bytes

  Total number of registers available per block: 16384

  Warp size:									 32

  Maximum number of threads per block:		   512

  Maximum sizes of each dimension of a block:	512 x 512 x 64

  Maximum sizes of each dimension of a grid:	 65535 x 65535 x 1

  Maximum memory pitch:						  262144 bytes

  Texture alignment:							 256 bytes

  Clock rate:									1.36 GHz

  Concurrent copy and execution:				 Yes

  Run time limit on kernels:					 Yes

  Integrated:									No

  Support host page-locked memory mapping:	   Yes

  Compute mode:								  Default (multiple host threads can use this device simultaneously)

In comparison, here is my nVidia 8500 GT on a PCI bus (not PCI Express), which is now the secondary CUDA device. Only the fields that differ are shown. I should probably make the slower card drive the display, so that the faster device gets an unlimited run time for kernels.

CUDA Capability Minor revision number:		 1

  Total amount of global memory:				 268173312 bytes

  Number of multiprocessors:					 2

  Number of cores:							   16

  Total number of registers available per block: 8192

  Clock rate:									0.92 GHz

  Run time limit on kernels:					 No

  Support host page-locked memory mapping:	   No
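If you’d rather not physically swap the cards, you can at least select the device without the watchdog at runtime. A small sketch using the kernelExecTimeoutEnabled field (available from CUDA 2.2 on), picking the last device that reports no run-time limit:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Prefer a CUDA device with no kernel run-time limit,
// i.e. one that is not driving a display with the watchdog active.
int main() {
    int count = 0;
    cudaGetDeviceCount(&count);
    int best = 0;
    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, dev);
        printf("Device %d: %s, run time limit: %s\n", dev, prop.name,
               prop.kernelExecTimeoutEnabled ? "Yes" : "No");
        if (!prop.kernelExecTimeoutEnabled) best = dev;
    }
    cudaSetDevice(best);  // subsequent CUDA calls use this device
    return 0;
}
```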

And don’t get a card equipped with DDR2 memory (16 GB/sec). At minimum you want GDDR2 (25 GB/sec) or even DDR3 (32 GB/sec).

Device to Device Bandwidth


Transfer Size (Bytes)   Bandwidth(MB/s)

 33554432			   13844.9

Argh ;) That’s what I get for buying cheap.

OK! Ordering a Gigabyte GT220 with DDR3

Thanks!

EDIT: It looks like DDR3 is 25 GB/sec as well. It is instead GDDR3 that maxes out at 32 GB/sec (and also draws a lot more power).

Let me know if anyone finds a model that is equipped with GDDR3 and offers the 32GB/sec throughput.

For memory bandwidth limited apps, this indeed makes a difference.

UPDATE: It seems that I can squeeze about 50 GFLOPS out of my card with my application.

Christian

Wait a second now … IIRC bandwidthTest measures a[n] = b[n], where both a and b are in global memory. So each transfer involves one load as well as one store, meaning the raw bandwidth is twice what is reported. To convince yourself that this is so, you could try a += b[n], which should run twice as fast and return a figure more in line with what marketing would like to put in their press releases.

EDIT: Or maybe not? In any case, my current theoretical bandwidth is 8 GB/sec, but bandwidthTest reports only 4.8 …

The new card should arrive some time next week, so - if you can manage - hold your horses until then and I’ll post a measure you can compare to.

UPDATE: Device-to-device bandwidth for the Gigabyte GT220-OC is indeed about 24 GB/sec. That will do for my purposes. I am not sure why the card even has a fan, as it runs very cool and very quiet (as in “unnoticeable”), so no complaints.

bandwidthTest already multiplies by a factor of 2 when computing the rate, to account for the read and the write. Most of my GTX 200-series cards get about 80% of their theoretical device-to-device bandwidth in bandwidthTest. You’re only getting 60%, so I wonder if this is related to the ratio of MPs to memory bandwidth or some other factor.
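For reference, the device-to-device measurement boils down to timing a cudaMemcpy with CUDA events and counting every byte twice, once for the read and once for the write. A hedged sketch of the idea, not the actual SDK source:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Time a repeated device-to-device cudaMemcpy and report the rate,
// applying the factor of 2 (each byte is both read and written).
int main() {
    const size_t bytes = 32 << 20;  // 32 MB, like bandwidthTest's default
    char *src, *dst;
    cudaMalloc((void**)&src, bytes);
    cudaMalloc((void**)&dst, bytes);

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    const int reps = 10;
    cudaEventRecord(start, 0);
    for (int i = 0; i < reps; ++i)
        cudaMemcpy(dst, src, bytes, cudaMemcpyDeviceToDevice);
    cudaEventRecord(stop, 0);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    // factor of 2: every transferred byte is read and then written
    double mbps = 2.0 * bytes * reps / (ms / 1000.0) / (1 << 20);
    printf("Device to device bandwidth: %.1f MB/s\n", mbps);

    cudaFree(src);
    cudaFree(dst);
    return 0;
}
```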

GDDR3 models appear to be (no warranty given, check with the manufacturer):

Club 3D: CGNX-G222I (512MB)
MSI: N220GT‐MD1G/D3 (1GB)
Gigabyte: GV-N220OC-1GI (1GB)
Point of View: R-VGA150929-D3 (1GB)
ASUS: ENGT220/DI/1GD3 (1GB GDDR3, pretty sure)

Source: http://www.pcgameshardware.de/aid,697192/G…afikkarte/News/

However, some retailers do not seem to differentiate clearly between DDR3 and GDDR3, so it seems to be a hit-or-miss game to get the 32 GBytes/sec memory bandwidth.

Passively cooled models:

Elitegroup: NSGT220C-1GQS-H and NSGT220C-512QZ-H