I know that host memory loads and read-backs are synchronous, so there is no need for any explicit synchronization mechanism around them. After the kernel launch returns control to the CPU, the CPU keeps executing instructions until it encounters a GPU-to-CPU memory transfer. At that point it waits until the kernel finishes.
My question: 1. Suppose multiple kernels are launched. Can the CPU read back the result of a previous kernel while the next kernel is running, assuming the next kernel does not depend on the previous kernel's result?
2. Is it not possible to fetch intermediate results while a kernel is still running?
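To make the first question concrete, here is a minimal sketch of the scenario, assuming the CUDA runtime API with two streams and pinned host memory. The names kernelA, kernelB, and run_overlap_demo are hypothetical and only for illustration. Whether the copy really overlaps with the second kernel depends on the hardware, so this shows how the overlap would be requested, not a guarantee that it happens.

```cuda
#include <cuda_runtime.h>
#include <cstddef>

// Hypothetical kernels: each fills its own buffer, so kernelB does not
// depend on the result of kernelA.
__global__ void kernelA(int *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = 2 * i;
}

__global__ void kernelB(int *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = i + 1;
}

static bool ok(cudaError_t e) { return e == cudaSuccess; }

// Returns true if both results arrive on the host with the expected values.
bool run_overlap_demo() {
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(int);
    int *dA = nullptr, *dB = nullptr, *hA = nullptr, *hB = nullptr;
    cudaStream_t s1 = nullptr, s2 = nullptr;

    // Pinned (page-locked) host buffers are assumed here so that the
    // asynchronous copies can actually proceed in the background.
    bool good = ok(cudaMalloc(&dA, bytes)) && ok(cudaMalloc(&dB, bytes))
             && ok(cudaMallocHost(&hA, bytes)) && ok(cudaMallocHost(&hB, bytes))
             && ok(cudaStreamCreate(&s1)) && ok(cudaStreamCreate(&s2));

    if (good) {
        const int threads = 256;
        const int blocks = (n + threads - 1) / threads;

        kernelA<<<blocks, threads, 0, s1>>>(dA, n);
        // Queued behind kernelA in stream s1; the call returns to the CPU
        // without waiting for the copy to finish.
        cudaMemcpyAsync(hA, dA, bytes, cudaMemcpyDeviceToHost, s1);

        // Stream s2 is independent of s1, so kernelB is allowed to run
        // while the read-back of kernelA's result is still in flight.
        kernelB<<<blocks, threads, 0, s2>>>(dB, n);
        cudaMemcpyAsync(hB, dB, bytes, cudaMemcpyDeviceToHost, s2);

        // The CPU blocks only here, once per stream.
        good = ok(cudaStreamSynchronize(s1)) && ok(cudaStreamSynchronize(s2));
        good = good && hA[n - 1] == 2 * (n - 1) && hB[n - 1] == n;
    }

    // Clean up whatever was successfully created.
    if (s1) cudaStreamDestroy(s1);
    if (s2) cudaStreamDestroy(s2);
    cudaFreeHost(hA);
    cudaFreeHost(hB);
    cudaFree(dA);
    cudaFree(dB);
    return good;
}
```

In this sketch the CPU would only wait at the cudaStreamSynchronize calls, so it could consume hA after synchronizing s1 while kernelB is still running in s2. It does not cover the second question, since each copy still waits for its own kernel to finish.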