I’m looking for applications that have a high CPU/GPU communication ratio.
These should be either applications in which data transfer between the CPU and the GPU takes more than 30% of the total execution time or, even better, hybrid applications in which the CPU does part of the computation and the GPU does the rest.
So far I’ve found that there are a lot of applications with high data transfer overhead but very few (if any) hybrid applications.
Do you know of any application that I could use for benchmarking purposes in either of the two categories mentioned above?
I’ve already tested the Parboil benchmark suite and the Rodinia benchmarks.
Thanks in advance
Have a look at some of the dense matrix factorization functions in the UTK MAGMA library. They hybridize different parts of the LAPACK block factorization algorithms, with a lot of data exchange between host and device over the lifespan of a single operation.
MAGMA is a great example. By the way, the CPU-GPU communication overhead can be reduced by overlapping copies with kernel execution. I'm not sure how many apps actually take advantage of it… (It could really reduce the overhead if properly used and tuned.)
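To get a rough feel for how much overlapping copies with kernels could buy you, here is a back-of-the-envelope model in Python. It is only a sketch under some assumptions that are not from this thread: the work splits into equal, independent chunks, the device can run a host-to-device copy, a kernel, and a device-to-host copy at the same time (in CUDA that usually means asynchronous copies in separate streams from pinned host memory), and the per-chunk timings are made-up numbers. The function names are hypothetical, and real hardware will rarely match this ideal.

```python
def transfer_fraction(h2d, kernel, d2h):
    """Share of one chunk's non-overlapped time spent on copies."""
    return (h2d + d2h) / (h2d + kernel + d2h)


def serial_time(n_chunks, h2d, kernel, d2h):
    """Total time when every copy and kernel runs back to back."""
    return n_chunks * (h2d + kernel + d2h)


def overlapped_time(n_chunks, h2d, kernel, d2h):
    """Idealized three-stage pipeline (copy in, compute, copy out).

    Assumes identical chunks and one dedicated engine per stage, so
    after the first chunk fills the pipeline, each further chunk only
    costs the duration of the slowest stage.
    """
    fill = h2d + kernel + d2h
    bottleneck = max(h2d, kernel, d2h)
    return fill + (n_chunks - 1) * bottleneck


if __name__ == "__main__":
    # Hypothetical per-chunk timings in milliseconds.
    n, h2d, kernel, d2h = 8, 2.0, 5.0, 1.0
    print(f"transfer fraction: {transfer_fraction(h2d, kernel, d2h):.1%}")
    print(f"serial:     {serial_time(n, h2d, kernel, d2h)} ms")
    print(f"overlapped: {overlapped_time(n, h2d, kernel, d2h)} ms")
```

With those made-up numbers the copies account for 37.5% of the non-overlapped time, and the model gives 64 ms serial versus 43 ms overlapped. Once the kernel is the slowest stage, the transfers are almost completely hidden, so an app tuned this way may no longer show the high transfer overhead you are trying to benchmark.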
Forgot to say thank you.
I will look into MAGMA.
All of the CUDA accelerations in VMD work in hybrid mode.
The app I develop, HOOMD-blue, is the exact opposite of what you are looking for, because the GPU does 99.999…% of the work and GPU->CPU communication is only performed when needed for disk I/O :)
Great, I will also look into it. Thanks a lot.