Tesla 20-Series Features and Advantages

So with all this nice new tech rolled in, it's all the more disappointing to have 3/4 of the double-precision performance capped on the consumer cards.

Martin

Bumping this message to the top, since this question keeps coming up in the forums.

If you are going to bump the topic, would it be too much to ask that you also address the questions that were raised here?

Regarding the DMA engines: I ran a test recently and found that the GTX 480 can overlap communication in both directions, but the C2050 cannot. Do you have any insights on this?

http://forums.nvidia.com/index.php?s=&…t&p=1073297

To copy the results here:

$ ./concur_bandwidth  0

device 0: GeForce GTX 480

Device 0 took 3000.489502 ms

Test 1: Aggregate HtoD bandwidth in MB/s: 5995.058594

Device 0 took 3006.603027 ms

Test 2: Aggregate DtoH bandwidth in MB/s: 6621.408203

Device 0 took 2995.593994 ms

Test 3: Aggregate bidirectional per GPU bandwidth in MB/s: 11184.810547

$ ./concur_bandwidth  1

device 1: GeForce GTX 280

Device 1 took 2999.640137 ms

Test 1: Aggregate HtoD bandwidth in MB/s: 5995.058594

Device 1 took 3000.135498 ms

Test 2: Aggregate DtoH bandwidth in MB/s: 5860.841309

Device 1 took 2978.960693 ms

Test 3: Aggregate bidirectional per GPU bandwidth in MB/s: 5905.580078

$ ./concur_bandwidth 0

device 0: Tesla C2050

Device 0 took 3006.502441 ms

Test 1: Aggregate HtoD bandwidth in MB/s: 6129.276855

Device 0 took 2990.946533 ms

Test 2: Aggregate DtoH bandwidth in MB/s: 5681.883789

Device 0 took 2988.590332 ms

Test 3: Aggregate bidirectional per GPU bandwidth in MB/s: 6889.844238
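As a side note, the aggregate figures above already tell the overlap story: if the two copy directions truly run concurrently, the bidirectional number should approach the sum of the two one-way numbers; if they serialize, it stays near a single one-way figure. A small sanity-check sketch (plain Python, numbers taken directly from the output above):

```python
# Classify whether bidirectional host/device copies overlapped,
# based only on the aggregate bandwidth figures the benchmark printed.
def overlap_ratio(htod, dtoh, bidir):
    """Bidirectional bandwidth as a fraction of the fully overlapped
    ideal, i.e. the sum of the two one-way figures."""
    return bidir / (htod + dtoh)

# Figures from the runs quoted above (MB/s): (HtoD, DtoH, bidirectional)
cards = {
    "GeForce GTX 480": (5995.06, 6621.41, 11184.81),
    "GeForce GTX 280": (5995.06, 5860.84, 5905.58),
    "Tesla C2050":     (6129.28, 5681.88, 6889.84),
}

for name, (htod, dtoh, bidir) in cards.items():
    r = overlap_ratio(htod, dtoh, bidir)
    verdict = "overlapping" if r > 0.75 else "mostly serialized"
    print(f"{name}: {r:.2f} of ideal -> {verdict}")
```

By this measure the GTX 480 reaches about 89% of the fully overlapped ideal, while the GTX 280 and the C2050 sit near 50-60%, i.e. their copies are effectively serialized in this test.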

That's what I also say: the Tesla 20-series cards are more capable than the current consumer cards, but they carry a high price tag of 2500 euros. They have compute capability 2.0 and abundant GDDR5 memory, so they are better than a GeForce Fermi card. This is why I believe they are taking their time to create a Tesla Fermi card. I do not know any details; I have no connection with NVIDIA to know any kind of specs.

Well, to begin with, the clocks are lowered to decrease thermal failure rates, so bandwidth and TFLOPS take a hit.

How is that "more capable" or "more tested"? More tested just to select chips for lowering their frequency?

I have no inside information and this is only my suspicion, but I seriously doubt that any additional testing or treatment beyond lowering the clocks takes place on Teslas. How do you convince me otherwise? Where are the test results on the reliability of Tesla versus GeForce cards?

NVIDIA should explain how it is that the GTX 280 (especially a factory-overclocked one, such as I have) is almost as fast in single precision as the GTX 480 (which I also have). NVIDIA's official data sheets say 0.933 TFLOP/s single precision on the old cards and 1.03 TFLOP/s on the new Tesla compute cards. So where's the progress?

Per CUDA core, there is now less bandwidth! Most codes are bandwidth-limited; it's hard to do dozens of arithmetic operations on one float before returning it to global memory. Improvement? Where?
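To put a number on the bandwidth-limited point: divide peak single-precision throughput by the rate at which floats can be fetched from global memory, and you get the break-even arithmetic intensity, i.e. how many operations a kernel must perform per loaded float before it stops being bandwidth-bound. A rough sketch, using commonly cited peak specs (the GFLOPS and GB/s figures are my assumptions, not taken from this thread):

```python
# Break-even arithmetic intensity: single-precision operations needed
# per float fetched from global memory before a kernel becomes
# compute-bound rather than bandwidth-bound.
def flops_per_float(peak_gflops, mem_bw_gbs, bytes_per_elem=4):
    floats_per_sec = mem_bw_gbs * 1e9 / bytes_per_elem
    return peak_gflops * 1e9 / floats_per_sec

gtx280 = flops_per_float(622.0, 141.7)   # GTX 280: MAD-only peak, 141.7 GB/s
gtx480 = flops_per_float(1345.0, 177.4)  # GTX 480: FMA peak, 177.4 GB/s

print(f"GTX 280 break-even: {gtx280:.1f} flops per float")
print(f"GTX 480 break-even: {gtx480:.1f} flops per float")
```

So the break-even point rose from roughly 18 to roughly 30 operations per float: a kernel has to do nearly twice as much arithmetic per loaded value on the GTX 480 before the extra compute pays off, which is the complaint above in quantitative form.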

The GTX 295 had dual GPUs and better price-performance than the Fermis. Again, where is the progress?

Disabling 3/4 of the DP units for marketing reasons on the GTX 480, and slapping a five-times-higher list price on the C2050 ($2500 vs. $500) for a card with slower clocks than the GTX 480: what sense does that make? I'll buy 5x more cards, and even if there is some truth to the lower reliability of the GeForce Fermis, I'll still be much better off (in single precision).

I'm mad because I'd like to have both better single-precision price/performance and better DP capability, but now I have a dilemma.

In my personal view, having both the old and new architectures in my computers, the new one is overall roughly 30% better and not really cheaper. This needs to be compared with the boasts of NVIDIA's leaders a few years ago, when they were predicting huge financial problems for the competitor (AMD) and said they would aim at doubling the computational power every year! No such thing ever happened. Fermi cards are not significantly better than GT200 cards for numerical applications in single precision (and most science can be done in single precision, including CFD).

C++ on the device is nice, but not an absolute must for scientific programmers.

Personally, right now I'd love to be able to buy a bunch of GTX 295s at, say, $350-400, since the Tesla price tag forces me out of the double-precision area anyway. However, the 295s are totally gone, and you can't even find them second-hand on eBay.

All in all, the NVIDIA PR machine should be congratulated on a pretty good job. For a while I was overwhelmed by how much better the GTX 480 would be than the GTX 280, until the DP issue emerged and the Tesla cards were revealed to be essentially GTX 470s with enabled DP units and one more DMA engine (which, in practice, people don't always immediately see working, as this thread documents).

I think the answer to the question "why should I buy Tesla and not GeForce" is: maybe you shouldn't!

The explanation is that the GTX 280 wasn't really 933 GFLOP/s single precision; it was 622 GFLOP/s, with the ability to dual-issue under a very specific and limited set of conditions to yield a very theoretical 933 GFLOP/s. Those conditions for dual issue rarely, if ever, occurred in real code. So there is actually a lot of progress, but it doesn't look like it, only because NVIDIA was a bit "optimistic" in how it described the performance of the previous-generation card. In all my testing so far, Fermi is a very big improvement over the GT200.
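The peak figures being argued about all follow from cores × shader clock × flops per cycle. A quick sketch, using the commonly cited core counts and shader clocks (these specs are my assumptions, not stated in the thread; the ~0.62 vs ~0.93 TFLOP/s split for the GTX 280 is exactly the MAD-only vs MAD-plus-dual-issued-MUL distinction):

```python
# Theoretical single-precision peak = cores * shader clock (GHz) * flops/cycle.
# Core counts and shader clocks below are the commonly cited figures.
def peak_gflops(cores, shader_ghz, flops_per_cycle):
    return cores * shader_ghz * flops_per_cycle

gtx280_mad  = peak_gflops(240, 1.296, 2)  # MAD only: what real code saw
gtx280_dual = peak_gflops(240, 1.296, 3)  # MAD + dual-issued MUL: marketing peak
c2050       = peak_gflops(448, 1.150, 2)  # Fermi FMA counts as 2 flops/cycle
gtx480      = peak_gflops(480, 1.401, 2)

print(f"GTX 280 (MAD only):   {gtx280_mad:7.1f} GFLOP/s")
print(f"GTX 280 (dual issue): {gtx280_dual:7.1f} GFLOP/s")
print(f"Tesla C2050:          {c2050:7.1f} GFLOP/s")
print(f"GTX 480:              {gtx480:7.1f} GFLOP/s")
```

This reconciles the numbers above: the ~0.93 TFLOP/s marketing figure for the GT200 required the rarely-achieved dual-issued MUL, while the C2050's 1.03 TFLOP/s is all FMA and therefore much closer to what real code can reach.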

Those test results would indeed be very nice. In the past there was some mention of MTBF numbers becoming available, but I have never seen them.

As for your question "where's the progress": you are comparing marketing FLOPS from the GT200 with real FLOPS from the GF100. 622 GFLOP/s → 1.03 TFLOP/s is more accurate for real-life software.

Oh, and the answer to the question "why should I buy Quadro or Tesla" is simple: support. I was at a symposium about GPUs on Wednesday, and while companies that were putting GeForce cards into their products were complaining about support, companies that put Quadro or Tesla in their products were very happy with the support they receive from NVIDIA.

I don't completely agree with your last statement. It's true that there's no real support for consumer cards the way companies would wish, but NVIDIA could offer it too, maybe for a monthly fee or something like that. I even think they could find many new customers this way. Perhaps the whole story of only supporting Teslas is about selling more of them and creating a new branch of GPUs that wouldn't be necessary otherwise. I'm not speaking about the advantages of Teslas, like increased global memory. Some companies just don't need 4 GB of gmem and DP, and would be very happy with a regular consumer card if the support were just a little bit better.

PNY sells Quadro and Tesla cards, as far as I understand (all reference design, standard clocks).

Consumer cards are sold by numerous companies, so the whole idea of supporting consumer cards is a bit difficult with lots of different cards, some overclocked, some non-reference design.

I understand that, but they could still offer the same support for consumer cards from at least one manufacturer. There is no reason not to offer the same support as for the Fermis.

I am not sure how the other manufacturers would think of that…

I joined the community a few days ago, but I had been following the forums for quite a long time before that. I got the impression that the NVIDIA representatives on this forum deliberately ignore questions.

It makes me sick not being able to run CUDA apps via Remote Desktop on plain GTX 4xx cards. I don't care about DP performance at the moment. My company is doing some proof-of-concept development, and we do not want to invest thousands of euros in a Tesla at this stage. However, we really miss the possibility of testing applications via RDP.

It is completely unbelievable that such a driver feature cannot be enabled for GTX cards. What is the reason for disabling the RDP driver feature on GTX cards? Can I get a precise answer, please?

thanks

Mirko