Hi
I’m trying to process on the CPU the data generated by a kernel: after copying the data from the device to the host, I launch the kernel again to generate more data.
This process runs in a loop.
This is the algorithm, but it doesn’t seem to work:
while (true)
{
    cudaMallocs()                        // asking for memory
    cudaDeviceSynchronize()              // making sure everything is synchronized
    cudaEventRecord(start)               // for timing purposes
    call_async_Kernel(b, t, 0, 0)
    cudaEventRecord(stop)
    cudaMemcpyAsync(cudaMemcpyDeviceToHost)
    while (cudaEventQuery(stop) == cudaErrorNotReady)
        if (data_Available)
            Work_on_CPU()
    cudaFree()
}
Maybe this is not the proper way to do it, any ideas?
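One variant I could imagine, as a rough sketch and not a confirmed fix: in the loop above the stop event is recorded before cudaMemcpyAsync, so polling it only says the kernel finished, not that the data has reached the host. The sketch below records the event after the copy, allocates once outside the loop, and uses two pinned host buffers so the CPU can work on the previous batch while the GPU produces the next one. The kernel generate, the function work_on_cpu, the CHECK macro, the buffer names, the sizes, and the fixed iteration count are all placeholders I made up for illustration; timing events are left out for brevity.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: abort on any CUDA runtime error.
#define CHECK(call)                                                       \
    do {                                                                  \
        cudaError_t err_ = (call);                                        \
        if (err_ != cudaSuccess) {                                        \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,            \
                    cudaGetErrorString(err_));                            \
            exit(EXIT_FAILURE);                                           \
        }                                                                 \
    } while (0)

// Hypothetical kernel standing in for the real one: writes i + iter.
__global__ void generate(float *out, int n, int iter)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = static_cast<float>(i + iter);
}

// Hypothetical stand-in for Work_on_CPU(): sums one batch.
static double work_on_cpu(const float *data, int n)
{
    double sum = 0.0;
    for (int i = 0; i < n; ++i) sum += data[i];
    return sum;
}

int main()
{
    const int n = 1 << 20;
    const int iterations = 4;                 // assumed bound instead of while (true)
    const size_t bytes = n * sizeof(float);
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    float *d_buf = nullptr;
    float *h_buf[2] = {nullptr, nullptr};
    cudaStream_t stream;
    cudaEvent_t copied[2];

    // Allocate once: cudaMalloc/cudaFree inside the loop can stall the device.
    CHECK(cudaMalloc(&d_buf, bytes));
    CHECK(cudaStreamCreate(&stream));
    for (int k = 0; k < 2; ++k) {
        CHECK(cudaMallocHost(&h_buf[k], bytes));   // pinned memory for async copies
        CHECK(cudaEventCreate(&copied[k]));
    }

    for (int iter = 0; iter < iterations; ++iter) {
        const int cur = iter % 2;

        // Enqueue kernel and copy on the same stream, so they run in order.
        generate<<<blocks, threads, 0, stream>>>(d_buf, n, iter);
        CHECK(cudaGetLastError());
        CHECK(cudaMemcpyAsync(h_buf[cur], d_buf, bytes,
                              cudaMemcpyDeviceToHost, stream));
        // Recorded AFTER the copy: completion means the host data is ready.
        CHECK(cudaEventRecord(copied[cur], stream));

        // While the GPU works on this batch, process the previous one.
        if (iter > 0) {
            const int prev = 1 - cur;
            CHECK(cudaEventSynchronize(copied[prev]));
            printf("batch %d: sum = %.1f\n", iter - 1, work_on_cpu(h_buf[prev], n));
        }
    }

    // Process the last batch once its copy has completed.
    const int last = (iterations - 1) % 2;
    CHECK(cudaEventSynchronize(copied[last]));
    printf("batch %d: sum = %.1f\n", iterations - 1, work_on_cpu(h_buf[last], n));

    for (int k = 0; k < 2; ++k) {
        CHECK(cudaEventDestroy(copied[k]));
        CHECK(cudaFreeHost(h_buf[k]));
    }
    CHECK(cudaStreamDestroy(stream));
    CHECK(cudaFree(d_buf));
    return 0;
}
```

If the CPU needs to do other things while waiting, cudaEventQuery(copied[prev]) could presumably be polled instead of blocking in cudaEventSynchronize, as long as the event is the one recorded after the copy.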
Thanks in advance!