How can I run the GPU and OpenMP together for finite difference codes?

I am trying to use the GPU and OMP together to improve my computation time on large-scale finite difference codes. The thing is, my GPU card only has 1.6 GB of memory (Quadro FX 4800), and my CPU has 8 cores with 16 GB of memory in total.

I use the PGI compiler to avoid rewriting code: I add one line before the loop starts and one line after the loop ends. This cuts the computation time considerably by using the GPU. But the model size is limited because I have to stay within the 1.6 GB of memory on the GPU card. In that case, my 8 cores with 16 GB of memory cannot be put to use by adding an OMP line to spread the work across the multicore processor.

How can I write code so that the PGI compiler runs it on the GPU and the multicore CPU simultaneously?


I have worked on a few algorithms similar to finite difference, but I coded everything by hand and did not use PGI. (I guess you'll find out for yourself that you need to do the same :) One reason is multi-GPU support and memory limitations.)

Anyway, why would you want to run on both the GPU and the CPU at the same time? If you get a good speedup from the GPU, the CPU will either hold you back or be able to calculate only a very small portion of the overall job. Just give the GPU that small portion of work the CPU would have handled and wait a few more seconds. If you got a good speedup from the GPU, the few milliseconds or seconds you would save are meaningless.
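
To make the memory point a bit more concrete, here is a rough sketch in plain C of one way to work around a small card: cut the grid into slabs that fit the memory budget and process them one after another. This is only an illustration under my own assumptions, not PGI output or anyone's actual code. The names `planes_per_slab`, `stencil_range` and `stencil_in_slabs` are made up, the stencil is a simple 1D 3-point second difference, and the inner loop (marked with an OpenMP pragma here) merely stands in for whatever region you would hand to the accelerator.

```c
#include <stddef.h>

/* Hypothetical helper: number of interior z-planes of an nx*ny*nz grid
 * that fit in budget_bytes, with `arrays` full-size arrays resident per
 * slab and `halo` extra planes needed on each side of the slab. */
size_t planes_per_slab(size_t nx, size_t ny, size_t bytes_per_value,
                       size_t arrays, size_t budget_bytes, size_t halo)
{
    size_t plane_bytes = nx * ny * bytes_per_value * arrays;
    if (plane_bytes == 0)
        return 0;
    size_t planes = budget_bytes / plane_bytes;  /* planes that fit in total */
    if (planes <= 2 * halo)
        return 0;                                /* no room left for interior */
    return planes - 2 * halo;
}

/* 3-point second difference on the points [lo, hi) of u.
 * The caller must guarantee lo >= 1 and hi <= n - 1. */
void stencil_range(const double *u, double *out, size_t lo, size_t hi)
{
    long i;
#ifdef _OPENMP
#pragma omp parallel for
#endif
    for (i = (long)lo; i < (long)hi; ++i)
        out[i] = u[i - 1] - 2.0 * u[i] + u[i + 1];
}

/* Walk the interior points 1 .. n-2 in slabs of at most `chunk` points.
 * Each slab reads one halo point on either side from u. */
void stencil_in_slabs(const double *u, double *out, size_t n, size_t chunk)
{
    size_t lo;
    if (chunk == 0)
        return;                       /* nothing fits, nothing to do */
    for (lo = 1; lo + 1 < n; lo += chunk) {
        size_t hi = lo + chunk;
        if (hi > n - 1)
            hi = n - 1;               /* clamp the last slab */
        stencil_range(u, out, lo, hi);
    }
}
```

For example, with a 1000 x 1000 plane of 4-byte values, two arrays per slab and a budget of 1.6e9 bytes, `planes_per_slab` says 200 planes fit, of which 198 are interior once one halo plane per side is set aside. In a real code each slab would also have to be copied to the card and back, and that transfer cost is exactly what you would need to measure before deciding whether the slab approach pays off.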

My 1 cent,