Latency Studies of CUDA for Particle Physics Applications: Two talks at the TIPP 2011 Conference

I’m at a particle physics conference focusing on instrumentation and detector hardware, and I saw two interesting talks from groups starting to look at using CUDA in the trigger systems of particle accelerator experiments. So far, every use of CUDA in particle physics that I’m aware of has been in “offline” situations, where you want a lot of throughput but there isn’t a hard deadline for the calculation to finish. In a detector trigger, you need to decide in milliseconds (or less) whether to keep an event or throw it away, so you care a lot about latency and about how predictable that latency is. (There is buffering, so you can still afford to do the trigger decision calculation on many events at once.)

Both of these talks detail some early studies of latency in CUDA by these physics groups:

“Performance Study of a GPU in Real-Time Applications for HEP Experiments”:
“GPUs for fast triggering in NA62 experiment”:

(Disclaimer: I have no affiliation at all with either group.)

Thanks for sharing. The latency reported in the talks is typically 20 microseconds, and as we know, the overhead of a kernel launch alone is >10 microseconds.

If the time constraint of a real-time system is 10+ microseconds rather than 10 ms, what should we expect from a GPU?
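
One rough way to get a feel for this on your own hardware is to time an empty kernel from the host. The sketch below is not taken from either talk; the kernel name, the warm-up count, and the iteration count are all arbitrary choices of mine. It measures the full round trip (launch plus waiting for completion) and reports the worst case as well as the mean, since for a trigger the predictability matters as much as the average. The numbers will depend heavily on the GPU, driver, and operating system.

```cuda
#include <chrono>
#include <cstdio>
#include <cuda_runtime.h>

// Empty kernel (hypothetical name): anything we measure is launch and
// synchronization overhead, not computation.
__global__ void empty_kernel() {}

int main() {
    const int warmup = 100;    // arbitrary: discard first launches (context setup)
    const int iters  = 10000;  // arbitrary sample size

    for (int i = 0; i < warmup; ++i) {
        empty_kernel<<<1, 1>>>();
    }
    if (cudaDeviceSynchronize() != cudaSuccess) {
        std::fprintf(stderr, "CUDA error during warm-up\n");
        return 1;
    }

    double total_us = 0.0;
    double worst_us = 0.0;
    for (int i = 0; i < iters; ++i) {
        auto t0 = std::chrono::steady_clock::now();
        empty_kernel<<<1, 1>>>();
        cudaDeviceSynchronize();  // block until the kernel has finished
        auto t1 = std::chrono::steady_clock::now();

        double us = std::chrono::duration<double, std::micro>(t1 - t0).count();
        total_us += us;
        if (us > worst_us) worst_us = us;
    }

    std::printf("round trip: mean %.2f us, worst %.2f us over %d launches\n",
                total_us / iters, worst_us, iters);
    return 0;
}
```

Compile with something like `nvcc -O2 launch_latency.cu -o launch_latency`. This only gives a floor: a real trigger kernel also pays for copying event data to the device and results back, so the end-to-end latency will be higher than what this prints.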