Particle simulation in unsteady flows: which hardware?

Hi, we are planning to build a CUDA simulation for interactive, real-time analysis of the advection of many particles in unsteady wind fields. The likely bottleneck will be moving a lot of data (3D wind fields over many time steps) from disk to GPU memory as quickly as possible. Other possible bottlenecks are the interpolation of winds to particle positions, some turbulence parameterization, and the high number of particles. I am now wondering which hardware setup would be good to have. In particular, is there a way to transfer data from disk to GPU without having to go via the CPU?