Are there any applications in which CPU-GPU transfers take a significant amount of time (on the same order of magnitude as the GPU computing time)?
In such cases the GPU will probably not be an attractive solution.
In my case I have to move a lot of data to the GPU (GBs for one portion of the calculation), but since the calculation itself takes a lot of time, it's still a good solution. However, a few things to note:
Use a better PCIe link if you can: an x16 Gen2 slot.
Move your data to the GPU in chunks, and make sure that once a chunk has been processed there is no need for it again.
Move as much of your algorithm as possible to the GPU. I moved some crappy serial code to the GPU (and managed to make it a bit multithreaded there) so that the GPU returns a much smaller subset to the CPU as the result of the calculation, thus avoiding a lot of PCIe overhead.
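To make the chunking and "return a smaller result" advice concrete, here is a minimal sketch, not the poster's actual code. The kernel `countAbove` and the helper `countAboveChunked` are hypothetical names, and counting values above a threshold is only a stand-in for whatever the real calculation is. The point is that each chunk crosses the bus once, and only a single counter comes back.

```cuda
#include <algorithm>
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

#define CUDA_CHECK(call)                                               \
    do {                                                               \
        cudaError_t err_ = (call);                                     \
        if (err_ != cudaSuccess) {                                     \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,         \
                    cudaGetErrorString(err_));                         \
            exit(EXIT_FAILURE);                                        \
        }                                                              \
    } while (0)

// Hypothetical stand-in for the real work: count the values above a threshold.
__global__ void countAbove(const float* in, size_t n, float threshold,
                           unsigned long long* count)
{
    unsigned long long local = 0;
    size_t stride = (size_t)gridDim.x * blockDim.x;
    for (size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x; i < n; i += stride)
        if (in[i] > threshold) ++local;
    if (local) atomicAdd(count, local);  // one atomic per thread, not per element
}

// Sends `host` through one reusable device buffer, chunk by chunk.
// Each chunk is uploaded once and never needed again; only the final
// 8-byte counter travels back to the CPU.
unsigned long long countAboveChunked(const std::vector<float>& host,
                                     float threshold, size_t chunkElems)
{
    assert(chunkElems > 0);
    float* dChunk = nullptr;
    unsigned long long* dCount = nullptr;
    CUDA_CHECK(cudaMalloc(&dChunk, chunkElems * sizeof(float)));
    CUDA_CHECK(cudaMalloc(&dCount, sizeof(unsigned long long)));
    CUDA_CHECK(cudaMemset(dCount, 0, sizeof(unsigned long long)));

    for (size_t off = 0; off < host.size(); off += chunkElems) {
        size_t n = std::min(chunkElems, host.size() - off);
        CUDA_CHECK(cudaMemcpy(dChunk, host.data() + off, n * sizeof(float),
                              cudaMemcpyHostToDevice));
        countAbove<<<64, 256>>>(dChunk, n, threshold, dCount);
        CUDA_CHECK(cudaGetLastError());
    }

    unsigned long long result = 0;
    // The blocking copy on the default stream waits for the last kernel.
    CUDA_CHECK(cudaMemcpy(&result, dCount, sizeof(result), cudaMemcpyDeviceToHost));
    CUDA_CHECK(cudaFree(dChunk));
    CUDA_CHECK(cudaFree(dCount));
    return result;
}
```

A call such as `countAboveChunked(data, 4.5f, 1 << 20)` would process the vector in chunks of about a million floats. The chunk size is an arbitrary choice here; in practice it is bounded by free device memory.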
Anything where you don’t use the data much! If you have an application where each data value is only used once, it may be worth trying zero-copy (mapped pinned host memory that the kernel reads directly over PCIe, with no explicit copy).
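For anyone who has not used zero-copy, a rough sketch of the idea follows. The names `scale` and `zeroCopyScaleMismatches` are made up for illustration, and the scaling kernel is just a placeholder for a computation that touches each value once. It assumes a device that reports `canMapHostMemory`. With a recent runtime, host mapping is enabled implicitly; very old toolkits needed `cudaSetDeviceFlags(cudaDeviceMapHost)` before the context was created.

```cuda
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

#define CUDA_CHECK(call)                                               \
    do {                                                               \
        cudaError_t err_ = (call);                                     \
        if (err_ != cudaSuccess) {                                     \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,         \
                    cudaGetErrorString(err_));                         \
            exit(EXIT_FAILURE);                                        \
        }                                                              \
    } while (0)

// Each input value is read exactly once, straight from host memory.
__global__ void scale(const float* in, float* out, size_t n, float factor)
{
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    if (i < n) out[i] = factor * in[i];
}

// Scales n floats with no cudaMemcpy at all. Returns the number of wrong
// outputs (0 on success), or -1 if the device cannot map host memory.
long zeroCopyScaleMismatches(size_t n, float factor)
{
    assert(n > 0);
    cudaDeviceProp prop;
    CUDA_CHECK(cudaGetDeviceProperties(&prop, 0));
    if (!prop.canMapHostMemory) return -1;

    float *hIn = nullptr, *hOut = nullptr;
    CUDA_CHECK(cudaHostAlloc((void**)&hIn, n * sizeof(float), cudaHostAllocMapped));
    CUDA_CHECK(cudaHostAlloc((void**)&hOut, n * sizeof(float), cudaHostAllocMapped));
    for (size_t i = 0; i < n; ++i) hIn[i] = (float)(i % 100);

    // Device-side aliases of the same host buffers.
    float *dIn = nullptr, *dOut = nullptr;
    CUDA_CHECK(cudaHostGetDevicePointer((void**)&dIn, hIn, 0));
    CUDA_CHECK(cudaHostGetDevicePointer((void**)&dOut, hOut, 0));

    unsigned blocks = (unsigned)((n + 255) / 256);
    scale<<<blocks, 256>>>(dIn, dOut, n, factor);
    CUDA_CHECK(cudaGetLastError());
    CUDA_CHECK(cudaDeviceSynchronize());  // results are visible on the host after this

    long bad = 0;
    for (size_t i = 0; i < n; ++i)
        if (hOut[i] != factor * hIn[i]) ++bad;

    CUDA_CHECK(cudaFreeHost(hIn));
    CUDA_CHECK(cudaFreeHost(hOut));
    return bad;
}
```

The trade-off is that every kernel access goes over the bus, so this tends to pay off only when the data really is touched once and the accesses are coalesced.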
Thinking about my current application, it actually spends close to 100% of the time doing memory transfers. It also spends close to 100% of the time doing computations. Concurrent copy and execute is fun.
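A sketch of how concurrent copy and execute is usually set up, under the assumption that the host buffer is pinned (pageable memory would make the async copies fall back to staged, mostly serialized transfers). The kernel `addOne` and the helper `addOneOverlapped` are invented for this example. Two streams are used so that the copies of one chunk can overlap the kernel of the other on hardware that supports it.

```cuda
#include <algorithm>
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

#define CUDA_CHECK(call)                                               \
    do {                                                               \
        cudaError_t err_ = (call);                                     \
        if (err_ != cudaSuccess) {                                     \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,         \
                    cudaGetErrorString(err_));                         \
            exit(EXIT_FAILURE);                                        \
        }                                                              \
    } while (0)

// Placeholder computation: add 1 to every element.
__global__ void addOne(float* data, size_t n)
{
    size_t stride = (size_t)gridDim.x * blockDim.x;
    for (size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x; i < n; i += stride)
        data[i] += 1.0f;
}

// Updates `hPinned` in place. `hPinned` is assumed to come from
// cudaMallocHost/cudaHostAlloc so the async copies can really be asynchronous.
void addOneOverlapped(float* hPinned, size_t n, size_t chunkElems)
{
    assert(chunkElems > 0);
    const int kStreams = 2;
    cudaStream_t streams[kStreams];
    float* dBuf[kStreams];
    for (int s = 0; s < kStreams; ++s) {
        CUDA_CHECK(cudaStreamCreate(&streams[s]));
        CUDA_CHECK(cudaMalloc(&dBuf[s], chunkElems * sizeof(float)));
    }

    size_t chunk = 0;
    for (size_t off = 0; off < n; off += chunkElems, ++chunk) {
        int s = (int)(chunk % kStreams);
        size_t m = std::min(chunkElems, n - off);
        // Work within one stream runs in order, so reusing dBuf[s] is safe;
        // work in the other stream is free to overlap with it.
        CUDA_CHECK(cudaMemcpyAsync(dBuf[s], hPinned + off, m * sizeof(float),
                                   cudaMemcpyHostToDevice, streams[s]));
        addOne<<<64, 256, 0, streams[s]>>>(dBuf[s], m);
        CUDA_CHECK(cudaGetLastError());
        CUDA_CHECK(cudaMemcpyAsync(hPinned + off, dBuf[s], m * sizeof(float),
                                   cudaMemcpyDeviceToHost, streams[s]));
    }
    CUDA_CHECK(cudaDeviceSynchronize());

    for (int s = 0; s < kStreams; ++s) {
        CUDA_CHECK(cudaFree(dBuf[s]));
        CUDA_CHECK(cudaStreamDestroy(streams[s]));
    }
}
```

Whether transfers and compute each approach 100% utilization depends on the copy engines of the card and on the two taking similar time per chunk; a profiler timeline is the easiest way to check the overlap.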
Pinned (page-locked) host memory can help…
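One way to see how much pinned memory helps on a given machine is to time the same host-to-device copy from a pageable and from a pinned buffer. The helper `h2dBandwidthGBs` below is a hypothetical name for a small measurement sketch; it times a single copy after one warm-up, so the numbers are only indicative.

```cuda
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <cuda_runtime.h>

#define CUDA_CHECK(call)                                               \
    do {                                                               \
        cudaError_t err_ = (call);                                     \
        if (err_ != cudaSuccess) {                                     \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,         \
                    cudaGetErrorString(err_));                         \
            exit(EXIT_FAILURE);                                        \
        }                                                              \
    } while (0)

// Measures host-to-device bandwidth in GB/s for one copy of `bytes` bytes,
// from either a pinned (cudaMallocHost) or a pageable (malloc) buffer.
float h2dBandwidthGBs(size_t bytes, bool pinned)
{
    assert(bytes > 0);
    void* h = nullptr;
    void* d = nullptr;
    if (pinned) {
        CUDA_CHECK(cudaMallocHost(&h, bytes));
    } else {
        h = malloc(bytes);
        assert(h != nullptr);
    }
    memset(h, 0, bytes);  // touch every page before timing
    CUDA_CHECK(cudaMalloc(&d, bytes));

    cudaEvent_t start, stop;
    CUDA_CHECK(cudaEventCreate(&start));
    CUDA_CHECK(cudaEventCreate(&stop));

    CUDA_CHECK(cudaMemcpy(d, h, bytes, cudaMemcpyHostToDevice));  // warm-up
    CUDA_CHECK(cudaEventRecord(start));
    CUDA_CHECK(cudaMemcpy(d, h, bytes, cudaMemcpyHostToDevice));  // timed copy
    CUDA_CHECK(cudaEventRecord(stop));
    CUDA_CHECK(cudaEventSynchronize(stop));

    float ms = 0.0f;
    CUDA_CHECK(cudaEventElapsedTime(&ms, start, stop));

    CUDA_CHECK(cudaEventDestroy(start));
    CUDA_CHECK(cudaEventDestroy(stop));
    CUDA_CHECK(cudaFree(d));
    if (pinned) {
        CUDA_CHECK(cudaFreeHost(h));
    } else {
        free(h);
    }
    return (float)(bytes / 1.0e9) / (ms / 1.0e3f);
}
```

Comparing `h2dBandwidthGBs(256u << 20, false)` with `h2dBandwidthGBs(256u << 20, true)` shows the gap for a 256 MB transfer. Pinning too much memory can hurt the rest of the system, so it is usually reserved for the staging buffers that are actually transferred.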