NVIDIA Tesla K80 GPUs

Dear Everyone,

I was looking at the specifications for the recently launched Tesla K80 and saw that it has two GPUs on a single board. When running a program on the K80, does CUDA take care of data transfer between the two GPUs, or should that be done via other techniques (say MPI or POSIX)?

I am doing Computational Fluid Dynamics (CFD) simulations for my graduate work and am exploring the possibility of running them on GPUs, since my research requires large supercomputing resources. Any help regarding this would be greatly appreciated. Thanks in advance for the reply!

The two GPU devices on a Tesla K80 board must be managed independently by the programmer, with a few exceptions (cublasXt and cufftXt).

If you had an MPI program that did a certain amount of GPU work per rank, you could arrange it so that each rank used a particular GPU device of the 2 that are on a K80.
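As a minimal sketch of that rank-to-device arrangement, the C++ below maps a node-local rank onto one of the visible GPUs. The helper names device_for_rank and local_rank_from_env are hypothetical, and the environment variables checked (OMPI_COMM_WORLD_LOCAL_RANK from Open MPI, SLURM_LOCALID from Slurm) are assumptions about the launcher. The CUDA runtime calls are left as comments so the sketch compiles without the toolkit.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical helper: map a node-local MPI rank onto one of the visible GPUs.
// With a single K80 board in the node, device_count would be 2.
int device_for_rank(int local_rank, int device_count) {
    assert(local_rank >= 0 && device_count > 0);
    return local_rank % device_count;
}

// Hypothetical helper: read the node-local rank from the environment.
// Open MPI sets OMPI_COMM_WORLD_LOCAL_RANK and Slurm sets SLURM_LOCALID;
// other launchers use different names, so this list is an assumption.
int local_rank_from_env() {
    const char* names[] = {"OMPI_COMM_WORLD_LOCAL_RANK", "SLURM_LOCALID"};
    for (const char* name : names) {
        if (const char* value = std::getenv(name)) {
            return std::atoi(value);
        }
    }
    return 0;  // fall back to rank 0 when no launcher variable is present
}

// In a real CUDA program, each rank would then do roughly:
//   int count = 0;
//   cudaGetDeviceCount(&count);
//   cudaSetDevice(device_for_rank(local_rank_from_env(), count));
```

With two ranks per K80 board, local ranks 0 and 1 end up on devices 0 and 1, and each rank only ever talks to its own GPU.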

Alternatively, CUDA provides lower-level primitives for copying data directly from one GPU to another (cudaMemcpyPeer, for example), if your needs dictate that.

Has anyone actually written CFD code in CUDA? As far as I know, most code of that type still runs on the CPU.

I could be wrong though (and I hope that I am).

Ansys Fluent offers GPU acceleration. And there are other CFD codes that are GPU accelerated.

http://www.nvidia.com/object/computational_fluid_dynamics.html

Not sure what you mean. CFD was one of the first areas targeted with CUDA, for example Lattice-Boltzmann and Navier-Stokes solvers. Google Scholar shows hundreds of relevant papers, and there are commercial applications, for example:

http://www.ansys.com/About+ANSYS/ANSYS+Advantage+Magazine/GPUs+Speed+the+Solution+of+Complex+Electromagnetic+Simulation
Accelerating ANSYS Fluent 15.0 using Nvidia GPUs

Thank you very much for your reply!

As far as I know, most Navier-Stokes solvers still run on CPUs, using multi-core processors to speed up the computations. However, techniques like the Lattice-Boltzmann Method (LBM) and Smoothed Particle Hydrodynamics (SPH) are well suited to GPUs since most computations are local and explicit. In fact, GPU use for these methods is increasing, since GPUs offer a low-cost supercomputing alternative. I myself have seen very good speedups with my LBM code on a GTX 580 GPU. Hope that helps!
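To illustrate what "local and explicit" means here, the following is a minimal C++ sketch of a BGK collision step on a one-dimensional, three-velocity (D1Q3) lattice. It is not the code referred to above; the names Cell, collide, and collide_all are hypothetical, and the D1Q3 model is chosen only to keep the example short. Each cell is updated from its own populations alone, which is why a GPU version could assign one thread per cell.

```cpp
#include <array>
#include <cassert>
#include <vector>

// D1Q3 populations for lattice velocities 0, +1, -1 (illustrative choice).
using Cell = std::array<double, 3>;

// BGK collision for a single cell: relax each population toward its
// equilibrium value. Only this cell's own data is read or written.
Cell collide(const Cell& f, double tau) {
    const double w[3] = {2.0 / 3.0, 1.0 / 6.0, 1.0 / 6.0};  // lattice weights
    const double c[3] = {0.0, 1.0, -1.0};                   // lattice velocities
    const double rho = f[0] + f[1] + f[2];                  // local density
    assert(rho > 0.0);
    const double u = (f[1] - f[2]) / rho;                   // local velocity
    Cell out{};
    for (int i = 0; i < 3; ++i) {
        const double cu = c[i] * u;
        const double feq =
            w[i] * rho * (1.0 + 3.0 * cu + 4.5 * cu * cu - 1.5 * u * u);
        out[i] = f[i] - (f[i] - feq) / tau;  // explicit update, no linear solve
    }
    return out;
}

// Collide every cell. The iterations are independent of one another, so in
// CUDA this loop could become a kernel with one thread per cell.
void collide_all(std::vector<Cell>& lattice, double tau) {
    for (Cell& cell : lattice) {
        cell = collide(cell, tau);
    }
}
```

The collision conserves each cell's density and momentum. The streaming step that follows in a full LBM code only exchanges populations between neighbouring cells, so it is also local.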

My only hands-on interaction in the CFD field was with a Lattice-Boltzmann code. I did not write the code. The performance on the GPU was very good relative to CPUs. That was probably five years ago.