I’m writing my first CUDA program to compare the performance of the device against the host. Here is the code I used to record the host computation time:
//Create and Record cuda events to time cpu execution
cudaEvent_t start1, stop1;
cudaEventCreate(&start1);
cudaEventRecord(start1,0);
float cputime;
doCompOnHost(a_h, b_h, c_h, result_h, N );
//End Time events for host
cudaEventCreate(&stop1);
cudaEventRecord(stop1, 0);
cudaEventSynchronize(stop1);
cudaEventElapsedTime(&cputime, start1, stop1);
cudaEventDestroy(start1);
cudaEventDestroy(stop1);
Would this return an accurate time for the CPU computation?
Thanks. I just read that kernel calls are asynchronous, so it makes sense that this wouldn’t work for CPU times. Do you know of a comprehensible CPU timing tutorial in C? I just want to time a CPU function.
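For anyone with the same question, here is a minimal sketch of timing a host function with a host-side clock instead of CUDA events. It assumes a C11 compiler, since it uses the standard `timespec_get` call. The names `do_work`, `elapsed_ms`, and `time_work_ms` are hypothetical and only for illustration; `do_work` stands in for `doCompOnHost`, whose body is not shown above.

```c
#include <time.h>

/* Hypothetical stand-in for doCompOnHost: sums the squares 0^2 .. (n-1)^2. */
static double do_work(int n) {
    double sum = 0.0;
    for (int i = 0; i < n; ++i) {
        sum += (double)i * (double)i;
    }
    return sum;
}

/* Milliseconds between two timestamps taken with timespec_get. */
static double elapsed_ms(struct timespec start, struct timespec end) {
    double sec  = (double)(end.tv_sec - start.tv_sec);
    double nsec = (double)(end.tv_nsec - start.tv_nsec);
    return sec * 1000.0 + nsec / 1.0e6;
}

/* Runs do_work(n), stores its result in *result, and returns the
   wall-clock time the call took, in milliseconds. */
static double time_work_ms(int n, double *result) {
    struct timespec start, end;
    timespec_get(&start, TIME_UTC);   /* C11: wall-clock timestamp before */
    *result = do_work(n);             /* the host code being timed */
    timespec_get(&end, TIME_UTC);     /* wall-clock timestamp after */
    return elapsed_ms(start, end);
}
```

To use it, call something like `double ms = time_work_ms(N, &result);` and print `ms` with `printf("%f ms\n", ms);`. Note that `TIME_UTC` follows the system clock, so it can jump if the clock is adjusted during the run; on POSIX systems `clock_gettime` with `CLOCK_MONOTONIC` avoids that. For very short functions it probably helps to run the call in a loop and divide the total by the iteration count.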