Do CPU and GPU functions execute in parallel?

Hi All,

I have three functions. One of them is executed on the GPU (a GPU kernel) and the other two are host functions, called as follows:

foo_1<<<gridSize, blockSize>>>(param_1, param_2);

foo_2(param_3);
foo_3(param_3);

Here foo_1 is a GPU kernel, and foo_2 and foo_3 are CPU functions. The parameters of the GPU and CPU functions are independent of each other.
My question: if I call them in the order shown above, do the GPU and CPU execute in parallel?

In other words, if my foo_1 kernel takes 12 ms and foo_2 and foo_3 together take 15 ms, does this sequence of calls take about 15 ms in total for all three functions?

By default, yes.

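Yes, they will run concurrently. The kernel launch returns control to the CPU immediately, so foo_2 and foo_3 start while foo_1 is still running on the GPU.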

However, if you call ‘cudaThreadSynchronize’ (deprecated in later CUDA releases in favor of ‘cudaDeviceSynchronize’) just after your kernel launch, the CPU will wait until the GPU completes the kernel before moving on to foo_2 and foo_3.
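
To make the ordering concrete, here is a minimal sketch of the overlapped pattern. The bodies of foo_1 and foo_2 are hypothetical stand-ins (the original functions were not posted), run_overlapped is a name made up for this example, and it uses cudaDeviceSynchronize, the current replacement for cudaThreadSynchronize. Treat it as an illustration under those assumptions rather than a drop-in solution.

```cuda
#include <vector>
#include <cuda_runtime.h>

// Hypothetical stand-in for foo_1: doubles every element on the GPU.
__global__ void foo_1(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

// Hypothetical stand-in for the CPU functions: independent host-side work.
static long long foo_2(int n)
{
    long long sum = 0;
    for (int i = 0; i < n; ++i) sum += i;
    return sum;
}

// Illustrative helper (not from the original post).
// Returns true if both the GPU result and the CPU result are as expected.
bool run_overlapped(int n)
{
    std::vector<float> host(n, 1.0f);
    float *dev = nullptr;
    if (cudaMalloc(&dev, n * sizeof(float)) != cudaSuccess) return false;
    cudaMemcpy(dev, host.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    int blockSize = 256;
    int gridSize = (n + blockSize - 1) / blockSize;

    // The launch only queues the kernel; control returns to the CPU right away.
    foo_1<<<gridSize, blockSize>>>(dev, n);

    // Independent CPU work can run while the kernel is still executing.
    long long cpuSum = foo_2(n);

    // Wait for the GPU here, after the CPU work is done.
    // Moving this call directly after the launch would serialize the two.
    cudaError_t err = cudaDeviceSynchronize();

    // A blocking cudaMemcpy on the default stream would also wait for the
    // kernel; the explicit synchronize above just makes the wait visible.
    cudaMemcpy(host.data(), dev, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dev);

    if (err != cudaSuccess) return false;
    for (int i = 0; i < n; ++i)
        if (host[i] != 2.0f) return false;
    return cpuSum == (long long)n * (n - 1) / 2;
}
```

With this ordering the elapsed time should be close to the longer of the two sides (roughly 15 ms in the example from the question) plus launch and copy overhead, instead of the 27 ms sum you would see with a synchronize placed right after the launch.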

Thanks. This has helped me improve performance.