When does an event change its state from NotReady to recorded? (cudaEventRecord, streams)


I have a question about CUDA events.
I'm studying the asyncAPI example in the SDK.
It creates two events, records them at different times, and reports the number of CPU iterations and the elapsed time on the GPU.

Below is the code for the event creation and the rest of the computation in asyncAPI.cu.

// create cuda event handles
cudaEvent_t start, stop;
CUDA_SAFE_CALL( cudaEventCreate(&start) );
CUDA_SAFE_CALL( cudaEventCreate(&stop)  );

unsigned int timer;
CUT_SAFE_CALL(  cutCreateTimer(&timer)  );
CUT_SAFE_CALL(  cutResetTimer(timer)    );
CUDA_SAFE_CALL( cudaThreadSynchronize() );
float gpu_time = 0.0f;

// asynchronously issue work to the GPU (all to stream 0)
CUT_SAFE_CALL( cutStartTimer(timer) );
    cudaEventRecord(start, 0);
    cudaMemcpyAsync(d_a, a, nbytes, cudaMemcpyHostToDevice, 0);
    increment_kernel<<<blocks, threads, 0, 0>>>(d_a, value);
    cudaMemcpyAsync(a, d_a, nbytes, cudaMemcpyDeviceToHost, 0);
    cudaEventRecord(stop, 0);
CUT_SAFE_CALL( cutStopTimer(timer) );

In the block just above, the start event is recorded by calling cudaEventRecord(). If I query the event right after cudaEventRecord(), its state at that point is cudaErrorNotReady.
My question is: when exactly does the event state change to actually recorded?

In the example above, it looks like the start event is recorded when a thread starts the memory copy to the device, and the stop event is recorded once all threads have finished the copy back to the host. Is this right?
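
To make the question concrete, here is a minimal sketch of how the event state can be observed from the host. It is not the SDK code: the buffer size, the launch configuration, and the body of increment_kernel are assumptions chosen only to give stream 0 some work, and error checking is left out for brevity. It queries both events with cudaEventQuery() right after the work is issued, then polls the stop event until it no longer reports cudaErrorNotReady.

```cuda
#include <cstdio>
#include <cstring>
#include <cuda_runtime.h>

// Assumed kernel body: adds a constant to every element.
__global__ void increment_kernel(int *data, int value)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    data[idx] += value;
}

int main()
{
    const int n = 1 << 20;                 // assumed size, a multiple of 256
    const size_t nbytes = n * sizeof(int);

    int *a = 0, *d_a = 0;
    cudaMallocHost((void **)&a, nbytes);   // pinned host memory, needed for truly asynchronous copies
    cudaMalloc((void **)&d_a, nbytes);
    memset(a, 0, nbytes);

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    // Issue everything to stream 0; these calls return to the host
    // without waiting for the GPU to finish the work.
    cudaEventRecord(start, 0);
    cudaMemcpyAsync(d_a, a, nbytes, cudaMemcpyHostToDevice, 0);
    increment_kernel<<<n / 256, 256, 0, 0>>>(d_a, 1);
    cudaMemcpyAsync(a, d_a, nbytes, cudaMemcpyDeviceToHost, 0);
    cudaEventRecord(stop, 0);

    // Query both events immediately after issuing the work.
    printf("start right after issue: %s\n", cudaGetErrorString(cudaEventQuery(start)));
    printf("stop  right after issue: %s\n", cudaGetErrorString(cudaEventQuery(stop)));

    // Spin on the CPU until the stop event stops reporting NotReady.
    unsigned long counter = 0;
    while (cudaEventQuery(stop) == cudaErrorNotReady)
        ++counter;

    // Both events are complete here, so the elapsed time can be read.
    float gpu_time = 0.0f;
    cudaEventElapsedTime(&gpu_time, start, stop);
    printf("CPU iterations while waiting: %lu\n", counter);
    printf("GPU time between events: %.3f ms\n", gpu_time);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d_a);
    cudaFreeHost(a);
    return 0;
}
```

What the two immediate queries print depends on timing, so I have not listed an expected output; the point is only to see at which moment each event stops returning cudaErrorNotReady.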

Basically, I am not clear on what an event means in CUDA. It looks like streaming (memory copying) is an example of one.

Thank you.