Simultaneous run of CPU and GPU

Can I run an extern "C" function on the CPU and a CUDA kernel at the same time from main? An extremely simple example:

int main()
{
    an_extern_c(A, B);                         // CPU function
    cuda_kernel<<<dimGrid, dimBlock>>>(C, D);  // GPU kernel
    return 0;
}

In the above, how do I arrange those two calls (an_extern_c and cuda_kernel) so that they run simultaneously?

Can I do that?

Any comments are welcome, and thanks in advance.

Just reverse the order of the calls so the GPU kernel is launched first (the launch is asynchronous and returns to the host right away) and the CPU function is called afterwards (it blocks until it returns).

You'll probably want a synchronization afterwards, so that you only continue once you know both the CPU and the GPU have finished.

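cuda_kernel<<<dimGrid,dimBlock>>>(C,D);  // asynchronous: returns before the kernel finishes

an_extern_c(A,B);                        // runs on the CPU while the kernel executes

cudaThreadSynchronize();                 // wait for the GPU (deprecated name; newer toolkits use cudaDeviceSynchronize())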

Do not forget that GPU calls can be batched by the driver, so for the two to really run simultaneously you need to make sure the batch is released to the GPU before the CPU function starts.
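
For reference, here is a minimal self-contained sketch of the ordering suggested above. The kernel body, the body of an_extern_c, the array size, and the extra length parameter n are all made up for illustration, since the original post does not show them. The cudaStreamQuery(0) call is one commonly used way to nudge the driver into submitting queued work; whether it is needed depends on the platform and driver, so treat it as an assumption rather than a requirement. cudaDeviceSynchronize() is used in place of the deprecated cudaThreadSynchronize().

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for the poster's kernel: c[i] = 2 * d[i].
__global__ void cuda_kernel(float *c, const float *d, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = 2.0f * d[i];
}

// Hypothetical stand-in for the poster's CPU function: a[i] = b[i] + 1.
extern "C" void an_extern_c(float *a, const float *b, int n)
{
    for (int i = 0; i < n; ++i) a[i] = b[i] + 1.0f;
}

int main()
{
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);

    // Host buffers: A, B for the CPU function; hC, hD mirror the device arrays.
    float *A = new float[n], *B = new float[n];
    float *hC = new float[n], *hD = new float[n];
    for (int i = 0; i < n; ++i) { B[i] = float(i); hD[i] = float(i); }

    // Device buffers C (output) and D (input).
    float *C = nullptr, *D = nullptr;
    cudaMalloc((void **)&C, bytes);
    cudaMalloc((void **)&D, bytes);
    cudaMemcpy(D, hD, bytes, cudaMemcpyHostToDevice);

    dim3 dimBlock(256);
    dim3 dimGrid((n + dimBlock.x - 1) / dimBlock.x);

    // 1. Launch the kernel first: the launch returns without waiting for it.
    cuda_kernel<<<dimGrid, dimBlock>>>(C, D, n);
    cudaError_t err = cudaGetLastError();   // catches launch-configuration errors
    if (err != cudaSuccess) {
        printf("launch failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Optional (assumption): ask the driver to submit any queued work now.
    cudaStreamQuery(0);

    // 2. Do the CPU work while the GPU is busy.
    an_extern_c(A, B, n);

    // 3. Wait for the GPU before touching its results.
    err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        printf("kernel failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    cudaMemcpy(hC, C, bytes, cudaMemcpyDeviceToHost);

    // Verify both results.
    bool ok = true;
    for (int i = 0; i < n; ++i) {
        if (A[i] != float(i) + 1.0f || hC[i] != 2.0f * float(i)) { ok = false; break; }
    }
    printf(ok ? "ok\n" : "mismatch\n");

    cudaFree(C); cudaFree(D);
    delete[] A; delete[] B; delete[] hC; delete[] hD;
    return ok ? 0 : 1;
}
```

Note that the cudaMemcpy calls here are blocking, so only the kernel itself overlaps with an_extern_c. The CPU function must also not touch C or D until after the synchronization.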