Particle Accelerator Beam Dynamics using CUDA

First of all, please allow me to introduce myself.

I am Michael David Salt of the University of Manchester. For my MPhys project, I looked at beam dynamics modelling of particle accelerators using GPGPU techniques. Over the summer, we produced a conference paper for the CHEP '07 meeting in Canada.

This project was based on BrookGPU, using an NVIDIA 7900 GT. This proved satisfactory, although it was limited to single precision. We are looking at double precision in the future.

We were hoping that CUDA with an 8800 Ultra could provide this for us. Does CUDA support double-precision? I know it does not in the biblical sense (IEEE 754), but is it close enough?



Currently available GPUs do not natively support double precision.

Thank you very much.

I’ve just got to check before we get the hardware. Will CUDA work with an nForce 680i motherboard, an Intel Core 2 Duo 2.33GHz processor and an NVIDIA 8800 Ultra? I am almost certain it will, but I have to get a second opinion for the funding council.


That configuration should work just fine.

That configuration works fine. But if you want double precision you should wait for the G92 (due very soon). It is said to support true double precision floating point arithmetic.

Is your CHEP paper online or published in some form yet? I would be extremely interested in reading it.

Looks like CHEP have not updated their webpage yet. The paper is available to view on the webpage of my Ph.D supervisor.

I realise that we have made a mistake with the terminology. We have referred to fp16 precision as ‘single’ and fp32 as ‘double’. Of course, fp16 is only ‘half’ precision. Apologies for that. Thank you for your interest.

Wumpus, any idea how soon ‘very soon’ will be? Ideally, we would like to get this project started within a month or so.


No official date yet AFAIK, but it is expected in mid-November.

All details on the G92 are rumors at best right now. The most recent article that claims the most “confirmed” rumors is here:

I still wouldn’t believe any of it until there is an official press release from NVIDIA, expected between late October and mid-November. Most of the rumors agree that the mass-produced G92s will not have double (64-bit) precision, which will be reserved for a future Tesla.

The OP indicated his interest in 32-bit floats as “double”, which all current CUDA cards support.

But if he needs 64-bit float support, there is no choice but to wait until the next generation arrives. Or can you emulate it in some way on current hardware (retaining at least some performance)?

You can emulate it, using techniques like those shown in the dsfun90 library. The simple representation used in that library breaks a double precision number into the sum of two singles, a “most” and “least” significant float. This has less precision than a true double (48 bits vs 53 bits), but it will get you about 14 significant digits. Since the G80 does an intermediate round-off in the multiply-add instruction, you have to use a less efficient code path to do things like addition. IIRC, it was somewhere between 11 and 17 instructions to do a pseudo-double precision addition. Unfortunately, it looks like some of the implementations of transcendental functions use looping, which would kill performance on a GPU.

Not so good if you are FPU-bound, but if you are memory bound, then the extra overhead might not be so serious. Techniques like this could be used for calculations that need extended precision (quad or even higher) as well. (For example, GIMPS might be hitting the limits of double-precision FFTs in their Mersenne prime search.)
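To make the "sum of two singles" idea concrete, here is a minimal host-side C++ sketch of a pseudo-double addition in the same spirit. It is not the dsfun90 code itself. The names (`dsfloat`, `two_sum`, `quick_two_sum`, `ds_add`, `ds_from_double`, `ds_to_double`, `ds_example`) are made up for illustration. It assumes round-to-nearest single-precision addition, strict float evaluation (no extended-precision intermediates) and a compiler that does not reorder the arithmetic (no fast-math). On a GPU the same functions would need device qualifiers, and the rounding behaviour of the hardware would have to be checked.

```cpp
#include <cassert>

// A pseudo-double value stored as hi + lo, with |lo| much smaller than |hi|.
struct dsfloat {
    float hi;  // "most significant" float
    float lo;  // "least significant" float
};

// Error-free addition of two floats: returns the rounded sum in hi and the
// rounding error in lo, so that hi + lo equals a + b exactly.
static dsfloat two_sum(float a, float b) {
    float s = a + b;
    float bb = s - a;
    float err = (a - (s - bb)) + (b - bb);
    return {s, err};
}

// Cheaper renormalisation, intended for the case |a| >= |b|.
static dsfloat quick_two_sum(float a, float b) {
    float s = a + b;
    float err = b - (s - a);
    return {s, err};
}

// Split a double into a float pair (used here only to set up test values).
dsfloat ds_from_double(double x) {
    float hi = static_cast<float>(x);
    float lo = static_cast<float>(x - static_cast<double>(hi));
    return {hi, lo};
}

double ds_to_double(dsfloat a) {
    return static_cast<double>(a.hi) + static_cast<double>(a.lo);
}

// Pseudo-double addition: add the high words exactly, fold the low words
// into the error term, then renormalise. Eleven float operations in total.
dsfloat ds_add(dsfloat a, dsfloat b) {
    dsfloat s = two_sum(a.hi, b.hi);
    float lo = s.lo + (a.lo + b.lo);
    return quick_two_sum(s.hi, lo);
}

// Usage sketch: 1 + 1e-8 is lost in a single float but survives in the pair.
inline void ds_example() {
    dsfloat r = ds_add(ds_from_double(1.0), ds_from_double(1e-8));
    assert(r.hi == 1.0f);  // the high word alone cannot hold the small term
    assert(r.lo > 0.0f);   // the low word carries it
}
```

This variant trades a little accuracy in the low word for a short instruction sequence; a more careful version would also run `two_sum` on the low words at the cost of extra operations.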

Thank you all for the advice. Evidently there is much to learn even before we get the hardware ordered.

From what I understand now, the G92 will become the 8800GT, albeit produced using the 45nm fabrication process. I assume then that this will not natively support double precision. Would that be correct?

That sounds very interesting. Thanks, Seibert.


TSMC is not anywhere near mass production of 45nm.

Yes, sorry. 65nm fabrication. I confused it with the 45nm Intel Penryn. I wonder why they have given the new processor a Welsh name?

Anyway, has anyone got any idea when we will see GPU hardware with native double precision? It appears that the G92 is not going to be the 64-bit wonder-chip we were waiting for.

In the meantime, we have gone for the BFG 8800 Ultra. We can always slide a double-precision Tesla board in later. There is talk of PCI Express v2.0. Will this be a hardware upgrade, or just firmware? It would be a shame to have to replace the motherboard just to use the v2.0 hardware.


The Intel X38 chipset already supports PCI-Express 2.0, but I don’t know if there are any cards out yet that have a PCI-Express 2.0 interface. Maybe the 8800 GT supports it.