I'm trying to use the full computing power of a GTX 460 graphics card and all 8 logical cores of an Intel Core i7 920 processor.
All the CUDA stream tutorials I have found cover situations where two or more GPU-side actions (kernels and memory copies) are executed simultaneously.
What I need is a simple example that uses CUDA streams to run a GPU kernel and a CPU function in parallel.
Let's assume I have GPU kernels gpu1(), gpu2() and gpu3(), and a CPU function cpu1().
First, gpu1() must be executed.
After gpu1() is complete, gpu2() and cpu1() can run in parallel.
When both gpu2() and cpu1() are done, gpu3() can be executed.
         gpu1()
        |      |
    -----      -----
    |              |
    v              v
  gpu2()         cpu1()
    |              |
    -----      -----
        |      |
        v      v
         gpu3()
Could someone please post example CUDA code that does this using streams?
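For reference, here is a minimal sketch of the ordering I am after. It assumes a single stream is enough, because a kernel launch returns control to the host immediately, so the CPU function can run while the kernel is still executing. The kernel bodies, the body of cpu1(), the CHECK macro and the run_pipeline() wrapper are all hypothetical placeholders, since the real functions are not shown here.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                                        \
    do {                                                                   \
        cudaError_t err_ = (call);                                         \
        if (err_ != cudaSuccess) {                                         \
            fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err_)); \
            exit(EXIT_FAILURE);                                            \
        }                                                                  \
    } while (0)

// Placeholder kernels standing in for the real gpu1/gpu2/gpu3.
__global__ void gpu1(float *d, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) d[i] = 1.0f;
}

__global__ void gpu2(float *d, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) d[i] += 1.0f;
}

__global__ void gpu3(float *d, int n, float cpuResult) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) d[i] += cpuResult;
}

// Placeholder CPU work standing in for the real cpu1(): sums n ones.
float cpu1(int n) {
    float sum = 0.0f;
    for (int i = 0; i < n; ++i) sum += 1.0f;
    return sum;
}

// Runs gpu1 -> (gpu2 || cpu1) -> gpu3 and returns element 0 of the result.
float run_pipeline(int n) {
    float *d = nullptr;
    CHECK(cudaMalloc(&d, n * sizeof(float)));

    cudaStream_t stream;
    CHECK(cudaStreamCreate(&stream));

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    // Step 1: gpu1 runs alone; block the host until it has finished.
    gpu1<<<blocks, threads, 0, stream>>>(d, n);
    CHECK(cudaGetLastError());
    CHECK(cudaStreamSynchronize(stream));

    // Step 2: the launch returns immediately, so cpu1 overlaps with gpu2.
    gpu2<<<blocks, threads, 0, stream>>>(d, n);
    CHECK(cudaGetLastError());
    float cpuResult = cpu1(n);

    // Step 3: cpu1 has returned on the host, and stream order guarantees
    // that gpu3 starts only after gpu2 has finished on the device.
    gpu3<<<blocks, threads, 0, stream>>>(d, n, cpuResult);
    CHECK(cudaGetLastError());
    CHECK(cudaStreamSynchronize(stream));

    float h = 0.0f;
    CHECK(cudaMemcpy(&h, d, sizeof(float), cudaMemcpyDeviceToHost));

    CHECK(cudaStreamDestroy(stream));
    CHECK(cudaFree(d));
    return h;
}
```

Calling run_pipeline(1024) from main() and compiling with nvcc should be enough to try it. With these placeholder bodies every element ends up as 1 + 1 + n. This sketch keeps the CPU part on one host thread; spreading cpu1() over several cores would be a separate step (for example with OpenMP or std::thread) and is not shown.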