Teslas have 240 shaders/cores, run at 1.3 GHz, and have 4 GiB of memory, correct?
The corresponding numbers for a typical GeForce could be 192, 600 MHz, and 896 MiB. Correct?
I was also told that Teslas have some kind of “double precision”. When I hear double, I think of the float/double kind of double, but this person claimed that Teslas somehow really perform every operation twice, to double-check that each op was performed correctly. It sounded to me like a misunderstanding of what the double datatype is supposed to do, but I ask to be sure. True/False/In between?
Are there any other important differences between the two?
Teslas are worth their high price if really used, but in my case, where I don’t expect to use the full amount of memory, I’m wondering if I’m better off with a couple of GT200-based GeForce cards instead?
As far as performance goes, the GeForce GTX 285 is just as fast as, if not a little faster than, the Tesla C1060. Both have 240 stream processors, and the 285 is clocked at ~1.5 GHz. (I’m told that you can find 1.5 GHz Teslas as well, now.)
The only capability difference between the GTX 285 and the Tesla is the memory size, as was pointed out. (GTX 285 has 1 or 2 GB and Tesla has 4 GB.) The Tesla has no video connector, but is tested much more thoroughly for 24/7 usage than the GeForce models. That’s not to say you can’t use the GeForce for 24/7 calculations, but NVIDIA makes no guarantees.
Other than testing, the GPU used on the Tesla is basically the same as that used on the GTX 285.
The person you were talking to was misinformed and clearly has no idea what double precision is. Floating point numbers have a finite precision, and using more bits to store a number can help you get a more precise answer (though the actual order of calculations also plays a role). Single precision numbers are 32 bits in size and have a fractional uncertainty of around 10^-7, whereas double precision numbers are 64 bits and have a fractional uncertainty of around 10^-16. The “double/single” distinction is simply a statement of size: double precision numbers take up twice as much space.
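To see the single/double difference concretely, here is a minimal host-side sketch in plain Python (not CUDA). It measures the spacing between 1.0 and the next representable number in each format. The helper names `to_float32` and `machine_epsilon` are made up for this illustration; Python floats are 64-bit, and `struct` is used to round a value to 32 bits.

```python
import struct


def to_float32(x):
    """Round a 64-bit Python float to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]


def machine_epsilon(round_fn):
    """Return the gap between 1.0 and the next larger number that
    survives rounding with round_fn (the 'fractional uncertainty')."""
    eps = 1.0
    # Keep halving while 1 + eps/2 is still distinguishable from 1.
    while round_fn(1.0 + eps / 2) != 1.0:
        eps /= 2
    return eps


eps32 = machine_epsilon(to_float32)  # single precision: 2**-23
eps64 = machine_epsilon(float)       # double precision: 2**-52

print("single precision epsilon:", eps32)
print("double precision epsilon:", eps64)
```

The two results are on the order of 10^-7 and 10^-16, which is the whole meaning of "double" here: more bits per number, not a repeated calculation.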
The GTX 260, 275, 280, 285, and 295 cards all have double precision capabilities like the Tesla. In all cases (including Tesla) the double precision performance is much less than the single precision performance. (There are only 1/8 as many double precision units as single precision units.)
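As a back-of-the-envelope illustration of that 1/8 ratio, the sketch below estimates peak throughput from unit count and clock. It assumes each unit retires one multiply-add (2 floating point operations) per clock, which is a simplification, and it uses the ~1.5 GHz figure mentioned above; the function name `peak_gflops` is made up for this example. Real kernels will land well below these peaks.

```python
def peak_gflops(num_units, clock_ghz, flops_per_unit_per_clock=2):
    """Rough peak throughput in GFLOP/s.

    Assumption: each unit completes one multiply-add
    (2 floating point operations) per clock cycle.
    """
    return num_units * clock_ghz * flops_per_unit_per_clock


SP_UNITS = 240            # single precision units (from the thread)
DP_UNITS = SP_UNITS // 8  # 1/8 as many double precision units
CLOCK_GHZ = 1.5           # approximate shader clock (from the thread)

sp_peak = peak_gflops(SP_UNITS, CLOCK_GHZ)
dp_peak = peak_gflops(DP_UNITS, CLOCK_GHZ)

print("single precision peak (GFLOP/s):", sp_peak)
print("double precision peak (GFLOP/s):", dp_peak)
print("ratio:", sp_peak / dp_peak)
```

Under these assumptions the ratio is exactly 8, so code that is bound by double precision arithmetic should not expect anything near the single precision numbers on either card.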
If you are experimenting with CUDA, you are almost certainly better off with a GTX 285. If you want 4 GB of RAM, or assurances that 24/7 operation will be reliable, then consider Tesla.
Tesla is made for enterprise use.
GeForce is made for gamers.
Basically, Teslas are more robust, and they are made to be so for a good reason. They also have more memory.
Note that a Tesla does not even have a graphics output (it’s meant only for HPC).
So, if you intend to use the card in production/mission-critical systems, always go for a Tesla.
And whatever that person told you about double precision is nonsense (sorry, I could not find a milder word to describe it).
After reading this, and looking again at NVIDIA’s homepage, I realise that my numbers must be wrong.
Even the 260 seems to have a similar clock frequency to the Teslas.
I was comparing the Tesla’s processor clock to the GeForce’s graphics clock.
I still don’t quite understand what makes the difference between different cards, and whether the difference will matter to CUDA apps or only to other applications (games), but we’ll buy a 260 and a 295 for the office and do some experiments on them to see how much speed we can get.
I would be very interested in others’ experience with this.
If you are looking within the realm of GTX 200 and Tesla cards, the only differences that matter to CUDA are:
Shader (“stream processor”) clock rate
Number of stream processors
Memory bandwidth (width of bus * memory clock * 2)
Size of device memory
Which of these factors are important depends on your problem: floating point performance bound, memory bandwidth bound, memory size limited, etc.
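The bandwidth formula in the list above can be worked through as follows. The factor of 2 accounts for double data rate memory, which transfers twice per clock. The function name is made up for this example, and the 512-bit bus and 1242 MHz memory clock are illustrative inputs assumed for a GTX 285; check the spec sheet of the card you are actually considering.

```python
def memory_bandwidth_gb_s(bus_width_bits, memory_clock_mhz,
                          transfers_per_clock=2):
    """Theoretical peak memory bandwidth in GB/s (1 GB = 1e9 bytes).

    bandwidth = bus width in bytes * memory clock * transfers per clock
    """
    bytes_per_transfer = bus_width_bits / 8
    bytes_per_second = (bytes_per_transfer * memory_clock_mhz * 1e6
                        * transfers_per_clock)
    return bytes_per_second / 1e9


# Illustrative inputs (assumed, not taken from this thread):
# 512-bit bus, 1242 MHz memory clock.
bandwidth = memory_bandwidth_gb_s(512, 1242)
print("theoretical bandwidth (GB/s):", round(bandwidth, 1))
```

This is a theoretical peak; a memory bandwidth bound kernel will typically achieve somewhat less, but the number is useful for comparing cards against each other.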
(There are other, more architectural differences between generations of cards, such as GTX 200 vs. GeForce 8 and 9.)
Also, you should keep in mind that the GTX 295 is two CUDA devices in one package. Your code does not use both devices unless you explicitly submit kernels to each device from separate host threads. If you want to try out multi-GPU programming, the GTX 295 is a nice card to play with. Otherwise you might want to compare a GTX 260 to a GTX 285, which is the fastest single CUDA device available.
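The "one host thread per device" structure can be sketched in plain Python without any GPU. This is only the host-side pattern: `run_on_device` is a hypothetical stand-in that squares numbers on the CPU, where real CUDA code would select the device (for example with `cudaSetDevice`) and launch kernels from that thread.

```python
import threading

NUM_DEVICES = 2  # a GTX 295 shows up as two CUDA devices


def run_on_device(device_id, chunk, results):
    """Hypothetical stand-in for per-device work.

    Real code would bind this thread to device_id and launch
    kernels; here we just square the inputs on the CPU.
    """
    results[device_id] = [x * x for x in chunk]


def run_on_all_devices(data, num_devices=NUM_DEVICES):
    """Split data into one chunk per device and process each chunk
    in its own host thread, then merge results in device order."""
    chunk_size = (len(data) + num_devices - 1) // num_devices
    results = [None] * num_devices
    threads = []
    for dev in range(num_devices):
        chunk = data[dev * chunk_size:(dev + 1) * chunk_size]
        t = threading.Thread(target=run_on_device,
                             args=(dev, chunk, results))
        t.start()
        threads.append(t)
    for t in threads:
        t.join()  # wait for every "device" to finish
    return [y for part in results for y in part]


print(run_on_all_devices([1, 2, 3, 4, 5]))
```

The point is that nothing is split automatically: the program has to divide the work and hand a piece to each device itself.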