Device Handling

Hi,

I’m new to CUDA and would like some help.

I know that kernel calls are asynchronous; that is, after a kernel launch, control immediately returns to the CPU.
But how can I know when the device has finished the job? Is there some event that informs me that the kernel has completed?

Thanks in advance.

The CUDA 4.0 runtime API includes cudaDeviceSynchronize, a blocking call that waits until the current device has gone idle. In older versions of the CUDA runtime API, the equivalent functionality was provided by cudaThreadSynchronize.
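As a rough illustration of the blocking approach, here is a minimal sketch. The kernel name fill, the buffer size, and the launch configuration are made up for the example; only the runtime calls themselves come from the CUDA API.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel for illustration: writes `value` into every element.
__global__ void fill(int *out, int n, int value) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = value;
}

int main() {
    const int n = 1 << 20;  // assumed problem size
    int *d_out = nullptr;
    if (cudaMalloc(&d_out, n * sizeof(int)) != cudaSuccess) return 1;

    // The launch is asynchronous: control returns to the CPU right away.
    fill<<<(n + 255) / 256, 256>>>(d_out, n, 7);

    // Block the host thread until all previously issued device work is done.
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
        cudaFree(d_out);
        return 1;
    }

    // The kernel has finished, so the results can be copied back safely.
    int first = 0;
    cudaMemcpy(&first, d_out, sizeof(int), cudaMemcpyDeviceToHost);
    printf("first element = %d\n", first);

    cudaFree(d_out);
    return 0;
}
```

The host thread does nothing useful while it waits inside cudaDeviceSynchronize, so this only fits cases where the CPU has no other work to do in the meantime.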

The problem is that I don’t want to stay in a blocking state.

The fact that control returns to the CPU is exactly what I need.

But I also need to be informed that the kernel has finished while control is on the CPU.

It would mean a lot to me if anyone could help.

It’s all in the CUDA programming guide, really. Just read the section on asynchronous execution. There you’ll find a discussion of events and streams, and of the calls cudaEventQuery, cudaStreamQuery, and cudaStreamWaitEvent. The two query calls return immediately and report whether the work has finished, which is exactly what you ask.
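To make the polling idea concrete, here is a minimal sketch using an event recorded after the kernel. The kernel busy, its arguments, and the poll counter are invented for the example; cudaEventQuery returns cudaErrorNotReady while the work before the event is still running and cudaSuccess once it has completed.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel for illustration: does some arithmetic per element.
__global__ void busy(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float x = data[i];
        for (int k = 0; k < 1000; ++k) x = x * 0.999f + 0.001f;
        data[i] = x;
    }
}

int main() {
    const int n = 1 << 20;  // assumed problem size
    float *d_data = nullptr;
    if (cudaMalloc(&d_data, n * sizeof(float)) != cudaSuccess) return 1;
    cudaMemset(d_data, 0, n * sizeof(float));

    cudaEvent_t done;
    cudaEventCreate(&done);

    // Asynchronous launch, followed by an event in the same (default) stream.
    busy<<<(n + 255) / 256, 256>>>(d_data, n);
    cudaEventRecord(done, 0);

    // Poll without blocking: the CPU is free to do its own work in this loop.
    unsigned long polls = 0;
    cudaError_t status;
    while ((status = cudaEventQuery(done)) == cudaErrorNotReady) {
        ++polls;  // placeholder for useful CPU-side work
    }

    if (status != cudaSuccess) {
        fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(status));
    } else {
        printf("kernel finished after %lu polls\n", polls);
    }

    cudaEventDestroy(done);
    cudaFree(d_data);
    return status == cudaSuccess ? 0 : 1;
}
```

cudaStreamQuery(stream) can be polled the same way if you only care whether everything issued to a given stream has finished. cudaStreamWaitEvent is a different tool: it makes one stream wait for an event on the device side rather than notifying the host.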