Rendering data

Most CUDA applications I see are rendered on the fly. I am working on combining the nbody and particles simulations from the SDK, and I'd like to save the position data for some analysis and perhaps some really cool renders. Is dumping position data for 100k particles even feasible? I'm sure there will be a speedup from nixing the OpenGL window.

Does anyone have any experience as to the best way to go about this?

It's certainly feasible: just copy the particle data back to the CPU each frame using cudaMemcpy (which you can do at about 4 GB/sec these days), and then write it to disk.
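A minimal sketch of that copy-and-write step, assuming a float4-per-particle layout like the SDK particles sample uses (the names d_pos, numParticles, and dumpPositions are made up for illustration):

```cpp
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Copy the current positions (one float4 per particle) back to the host
// and append them as raw binary to an already-open file.
void dumpPositions(const float4* d_pos, int numParticles, FILE* out)
{
    std::vector<float4> h_pos(numParticles);

    // Device-to-host copy over PCIe.
    cudaMemcpy(h_pos.data(), d_pos, numParticles * sizeof(float4),
               cudaMemcpyDeviceToHost);

    // Append one frame of raw positions; convert/post-process offline.
    fwrite(h_pos.data(), sizeof(float4), numParticles, out);
}
```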

Dumping every frame is a good way to slow things down considerably, not to mention eat up your disk space fast: 100k particles * 16 bytes/particle * 100 timesteps/sec ≈ 152 MiB/s. That's a gigabyte of data every ~7 seconds. What we do in the science business is dump only once every N steps, where N might be 100, 1000, or even 100000 (or anything in between); see the loop sketch below.
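In the main loop that might look something like the following; stepSimulation is a placeholder for whatever advances your integrator, and dumpPositions is the writer sketched above:

```cpp
const int numSteps     = 1000000;
const int dumpInterval = 1000;   // write one frame every N steps

FILE* out = fopen("positions.bin", "wb");

for (int step = 0; step < numSteps; ++step)
{
    stepSimulation(d_pos, d_vel, numParticles, dt);   // advance on the GPU

    if (step % dumpInterval == 0)
        dumpPositions(d_pos, numParticles, out);      // copy back + write
}

fclose(out);
```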

Sorry if I'm stating the obvious, but only save as much data as you really need to do your offline analysis. Depending on the type of analysis, the required data rate will differ. For example, calculating a pair correlation function g(r) takes quite a bit of statistics, so dumping every 1000 steps might be needed (alternatively, increase the number of particles). On the other hand, watching a movie of a structure that takes 10 million steps to form really only needs dumps every 10000 steps (giving you 1000 frames of movie). The worst are quantities like the mean-squared displacement: in order to get a nice line fit for a diffusion coefficient, you really need to calculate it every 10 steps (or even every step). As a consequence, the mean-squared displacement is usually best calculated during the simulation: then you only need to write out one float every N steps.
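As a sketch of that last point, here is one way the mean-squared displacement could be accumulated on the host and written out as a single float per sample. It assumes unwrapped positions and that h_pos0 / h_pos hold the initial and current positions already copied back with cudaMemcpy; all names here are made up for illustration:

```cpp
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Mean-squared displacement relative to the initial configuration.
// Positions must be unwrapped (not folded back into the periodic box).
float meanSquaredDisplacement(const std::vector<float4>& h_pos0,
                              const std::vector<float4>& h_pos)
{
    double sum = 0.0;
    for (size_t i = 0; i < h_pos.size(); ++i)
    {
        double dx = h_pos[i].x - h_pos0[i].x;
        double dy = h_pos[i].y - h_pos0[i].y;
        double dz = h_pos[i].z - h_pos0[i].z;
        sum += dx*dx + dy*dy + dz*dz;
    }
    return static_cast<float>(sum / h_pos.size());
}

// In the main loop: every msdInterval steps, copy positions back and
// append one float to the MSD file instead of the full frame.
// if (step % msdInterval == 0) {
//     cudaMemcpy(h_pos.data(), d_pos, numParticles * sizeof(float4),
//                cudaMemcpyDeviceToHost);
//     float msd = meanSquaredDisplacement(h_pos0, h_pos);
//     fwrite(&msd, sizeof(float), 1, msdFile);
// }
```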