double precision on mobile GPU

E.D_Riedijk · September 29, 2011, 10:07am

Hi All,

I was not able to find this by searching, so I’ll ask here:

Does the Quadro 2000/3000/4000M support double precision (the slow GeForce way)? Or is the 5010M the only one with double precision (the fast Tesla/Quadro way)?

If anybody has more information, that would be great, since for demos I would want to have double precision support.

Thanks in advance,
Denis

pasoleatis · October 2, 2011, 6:32pm

Hello,

Go to this page http://developer.nvidia.com/cuda-gpus and check the Compute Capability. If it is below 1.3, there is no double precision support. For the specific cards it is:

Quadro 5000M 2.0

Quadro 4000M 2.0

Quadro 3000M 2.0

Quadro 2000M 2.0

So they should all have double precision. When you compile GPU code, pass the flag -arch=sm_20 to target compute capability 2.0; otherwise nvcc compiles for 1.0 by default, without double precision.
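
As a rough illustration of the rule above, the threshold check can be written in a few lines of Python. This is only a sketch: the helper name `supports_double` and the dictionary are made up for this example, and the capability values are simply the ones quoted in this post.

```python
# Hypothetical helper; the name and layout are illustrative, not an NVIDIA API.
# Rule applied: double precision requires compute capability 1.3 or higher.

def supports_double(compute_capability: str) -> bool:
    """Return True if a 'major.minor' compute capability is at least 1.3."""
    major, minor = (int(part) for part in compute_capability.split("."))
    return (major, minor) >= (1, 3)

# Compute capabilities as quoted in the post above.
quadro_mobile = {
    "Quadro 5000M": "2.0",
    "Quadro 4000M": "2.0",
    "Quadro 3000M": "2.0",
    "Quadro 2000M": "2.0",
}

for name, cc in quadro_mobile.items():
    print(name, cc, "double precision:", supports_double(cc))
```

The check only says whether double precision is available at all. It says nothing about how fast it runs.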

seibert · October 2, 2011, 8:46pm

I think his question is whether the double precision throughput is capped to 1/8 the single precision throughput in the same way that the GeForce cards are. I seem to recall a statement that the desktop Quadros run at the Tesla level of double precision at 1/2 the rate of single precision, but I have no idea what the mobile GPUs are set to.

pasoleatis · October 2, 2011, 9:56pm

I see. My bad. The 2000/3000/4000/5000 cards use GF104 and GF106, the same chips as the 400 series, which have half performance in double precision compared to single precision. The Quadro 5010M uses the same chip as the GTX 580, which has equal performance in double and single precision.

Cristian

Here is the Wikipedia page with all the cards: List of Nvidia graphics processing units - Wikipedia

E.D_Riedijk · October 5, 2011, 2:29pm

Actually I do not really require full-blown double precision speed. Just the support would be nice.

I got confused by these pages:

and

where for the 5010M it is stated: “The Quadro 5010M is the successor of the Quadro 5000M and also offers ECC RAM and double-precision floating point cores”
and for the 4000M: “Compared to the 5010M, the 4000M does not support ECC memory and DP floating point calculations”

So I have some doubts (given that notebookcheck usually has quite good information). Does anybody have experience with double precision on notebook GPUs?

seibert · October 5, 2011, 3:41pm

As far as I understand it, if the GPU is compute capability 2.0 or 2.1, then it has to support double precision. There is no equivalent in Fermi to the capability 1.2 mobile chips of the previous generation that were just like capability 1.3, but without the double precision.

Unfortunately, it sounds like the laptop vendor is confused. If they are correct about GF104, then you should have double precision at 1/12 of the maximum single precision throughput.

pasoleatis · October 5, 2011, 6:48pm

Quadro 5000M 2.0
Quadro 4000M 2.0
Quadro 3000M 2.0
Quadro 2000M 2.0
I am starting to get confused. These cards are based on the same chips as the 400m series which support double precision at half speed compared to single precision. Same for the 5010M.
Quadro 1000M GF108
Quadro 2000M GF106
Quadro 3000M GF104
Quadro 4000M GF104
Quadro 5000M GF100
Quadro 5010M GF110GLM

GeForce GT 435M GF108
GeForce GTX 460M GF106
GeForce GTX 470M GF104
GeForce GTX 480M GF100
GeForce GTX 485M GF104
M2090 GF110

It is strange. According to this NVIDIA page (Quadro & RTX Professional Design & Visualization Solutions | NVIDIA), only the 5010M card supports double precision, but according to this page (Page Not Found | NVIDIA), the 5000M card supports double precision and ECC as well.

E.D_Riedijk · October 6, 2011, 7:41am

Quadro 5010M GF110GLM
M2090 GF110
It is strange. According to this NVIDIA page (NVIDIA Quadro Legacy Products), only the 5010M card supports double precision, but according to this page (Page Not Found | NVIDIA), the 5000M card supports double precision and ECC as well.

Well, that can be because the 5010M is the successor of the 5000M, so the 5010M can be the only mobile card they currently manufacture to support DP.

Interesting to see that it is based on the same CHIP-code as M2090.

Anyhow, the Quadro 4000M has been ordered, as the 5010M is quite a lot more expensive. I’ll have to give it a try when it gets here :)

pasoleatis · October 6, 2011, 8:12am

Yes, so I guess the 2000/3000/4000 cards are 2.0 but do not support double precision, while the 5000 and 5010 do. It is not so clear on their website.

laughingrice · October 9, 2011, 3:03pm

All Fermi cards support double precision at least at 1/8 of single. I have the 2000M and it definitely supports double precision (at 1/8 the performance of single). I don’t know about the mobile GPUs, but the Quadro 4000 and up and the Teslas run double precision at 1/2 single precision speed.

pasoleatis · October 9, 2011, 5:22pm

It is stated explicitly on the NVIDIA webpage that only the 5000M and 5010M have double precision, while the other three cards mentioned do not.

laughingrice · October 9, 2011, 5:58pm

They are talking about double precision at 1/2 the single precision rate, not about double precision support as such. I know for a fact, from first-hand experience, that the 2000M, 1000M, GTX 430, GTX 480 and GTX 570 support it, and from second-hand experience that every Fermi GPU does, possibly excluding the weird laptop Fermis with 16 cores, which I don’t know about. I did read somewhere that the performance is 1/12 rather than 1/8 of single precision, but it’s definitely there, regardless of how NVIDIA words things on the website.

By the way, using CUDA-z at the moment on the Quadro 2000m which is a Compute 2.1 GPU and which you claim has no double precision support, I see:

Single precision float - 279470 Mflops

Double precision float - 35087.3 Mflops (1/7.97)
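
For reference, the ratio shown in parentheses can be reproduced from the two figures above. This is a minimal sketch with illustrative variable names; because the inputs are the rounded values printed here, the last digit may differ slightly from the 7.97 that CUDA-Z displays.

```python
# Figures as reported by CUDA-Z in this post, in Mflops.
sp_mflops = 279470.0
dp_mflops = 35087.3

# Single-to-double throughput ratio; a value near 8 means DP runs at about 1/8 of SP.
ratio = sp_mflops / dp_mflops

print(f"SP: {sp_mflops / 1000:.1f} GFLOPS")
print(f"DP: {dp_mflops / 1000:.1f} GFLOPS")
print(f"SP/DP ratio: about {ratio:.2f}")
```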

pasoleatis · October 10, 2011, 7:01am

I guess I misunderstood this statement from NVIDIA. I thought it meant something else. Good to know.

Fast 64-Bit Floating Point Precision

Industry’s fastest double precision floating point performance enabling accurate results on mission-critical applications… Available only on Quadro 5010M.

laughingrice · October 12, 2011, 9:35am

Knowing NVIDIA and the hype around the HPC market, the person responsible for the page put the emphasis on “Fast 64-Bit” rather than “64-Bit”. The difference is with the speed, not the support.

E.D_Riedijk · October 13, 2011, 12:37pm

Well, I will certainly run CUDA-Z when I get the laptop (somewhere in November…)
I was able to get the 5010M for a small extra price, so then we should see only a factor of 2 difference :)

Good to hear that double is supported on all Fermis though, that makes it much more worthwhile to change some code to double.

E.D_Riedijk · October 28, 2011, 4:46am

Well, GPU-z did not really work very well, it shows the first page with info and then hangs.
The nbody example however showed 270 SP GFLOPS and 150 DP GFLOPS, so that suggests that it is indeed a factor of 2 difference.
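
To put that measurement next to the rates discussed earlier in the thread (1/2, 1/8 and 1/12 of single precision), here is a small sketch that picks the closest nominal ratio. The function name `nearest_tier` and the labels are hypothetical, chosen only for this example, and the comparison is a plain nearest-value match.

```python
# Nominal SP:DP ratios mentioned in this thread; the labels are illustrative only.
NOMINAL_RATIOS = {"1/2 rate": 2.0, "1/8 rate": 8.0, "1/12 rate": 12.0}

def nearest_tier(sp_gflops: float, dp_gflops: float):
    """Return the measured SP/DP ratio and the closest nominal label."""
    ratio = sp_gflops / dp_gflops
    label = min(NOMINAL_RATIOS, key=lambda name: abs(NOMINAL_RATIOS[name] - ratio))
    return ratio, label

# nbody figures from this post (Quadro 5010M).
ratio, label = nearest_tier(270.0, 150.0)
print(f"measured SP/DP ratio {ratio:.2f}, closest to {label}")
```

Feeding in the CUDA-Z figures posted earlier for the Quadro 2000M (279.47 and 35.0873 GFLOPS) lands on the 1/8 label instead.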

If anybody wants to have some (windows only for the time being) benchmarks run on it, let me know.

pasoleatis · October 28, 2011, 7:38am

Hello,

I do not have a benchmark, but I have some code that you can test. It does a bunch of real-to-complex FFTs and some additions and multiplications. I have single and double precision versions. The attached codes should run in less than 10 minutes and report a time at the end.

SPRTinplaceTwoDPFCtwodblock.cu is the single precision code
DPRTinplaceTwoDPFCtwodblock.cu is the double precision code.

For the double precision version, the flag -arch=sm_20 is needed in order to keep double precision. I am not running Windows, so I am not sure how to compile there, but if you have a command line, the compile line would be:

nvcc -O2 -lcufft -arch=sm_20 DPRTinplaceTwoDPFCtwodblock.cu
or
nvcc -O2 -lcufft SPRTinplaceTwoDPFCtwodblock.cu
DPRTinplaceTwoDPFCtwodblock.cu (8.57 KB)
SPRTinplaceTwoDPFCtwodblock.cu (8.39 KB)

E.D_Riedijk · October 30, 2011, 7:43pm

It might be after Wednesday, as I have the laptop in a demo setup now, but I’ll see if I can try these out this week. I’ll try with CMake.
