malloc & cudaMalloc confusion over initialization of the two

No sir, it's the execution time of the kernel.

cudaThreadSynchronize measures the kernel time, I suppose.

No, cudaThreadSynchronize() does not measure anything. It just stalls the host thread in a spinlock until the GPU is finished. If you are using host-side timers, calling it before you stop the timer is the preferred way to get a reasonable indication of execution times.

This measures only kernel launch time:

    timerstartcode();
    kernel<<< >>>();
    timerstopcode();

whereas this measures both kernel launch and execution time:

    timerstartcode();
    kernel<<< >>>();
    cudaThreadSynchronize();
    timerstopcode();
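To make the difference between the two measurements concrete, here is a minimal sketch rather than a definitive benchmark. It assumes a made-up kernel (busyKernel), a std::chrono host timer standing in for timerstartcode()/timerstopcode(), and cudaDeviceSynchronize(), which newer CUDA toolkits provide as the replacement for the deprecated cudaThreadSynchronize(). None of these names come from the original post.

```cuda
#include <chrono>
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: enough arithmetic per element to take measurable time.
__global__ void busyKernel(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float x = data[i];
        for (int k = 0; k < 10000; ++k)
            x = x * 1.0001f + 0.5f;
        data[i] = x;
    }
}

// Milliseconds elapsed on the host clock since t0.
static double msSince(std::chrono::steady_clock::time_point t0)
{
    return std::chrono::duration<double, std::milli>(
               std::chrono::steady_clock::now() - t0).count();
}

int main()
{
    const int n = 1 << 20;
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    float *d_data = nullptr;
    cudaMalloc(&d_data, n * sizeof(float));
    cudaMemset(d_data, 0, n * sizeof(float));

    // Warm-up launch so one-time setup cost is not part of either timing.
    busyKernel<<<blocks, threads>>>(d_data, n);
    cudaDeviceSynchronize();

    // Case 1: no synchronization, so the timer stops as soon as the
    // launch call returns, which is normally before the kernel finishes.
    auto t0 = std::chrono::steady_clock::now();
    busyKernel<<<blocks, threads>>>(d_data, n);
    double launchMs = msSince(t0);
    cudaDeviceSynchronize();  // drain the GPU before the next measurement

    // Case 2: synchronize before stopping the timer, so the interval
    // covers both the launch and the kernel's execution.
    t0 = std::chrono::steady_clock::now();
    busyKernel<<<blocks, threads>>>(d_data, n);
    cudaDeviceSynchronize();
    double totalMs = msSince(t0);

    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    printf("launch only:        %f ms\n", launchMs);
    printf("launch + execution: %f ms\n", totalMs);

    cudaFree(d_data);
    return 0;
}
```

On most setups the first number should come out much smaller than the second, because the launch is asynchronous with respect to the host. The exact values depend on the GPU and driver.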

Sir, I am using this code to measure the timing of the kernel. Will this not work?

[b]
cudaEvent_t start, stop;
float time;

cudaEventCreate(&start);
cudaEventCreate(&stop);

cudaEventRecord(start, 0);
kernel<<< >>>();  // launch configuration and arguments omitted
cudaEventRecord(stop, 0);
cudaEventSynchronize(stop);  // wait until the stop event has been recorded

cudaEventElapsedTime(&time, start, stop);
printf("\n\nProcessing time is:\t%f (ms)\n\n", time);

cudaEventDestroy(start);
cudaEventDestroy(stop);
[/b]
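For what it is worth, the event-based pattern above is a standard way to time a kernel: both events are recorded in the same stream as the kernel, so the elapsed time is measured on the GPU between the two records, and cudaEventSynchronize(stop) makes the host wait until the kernel and the stop event have completed. A self-contained sketch follows. The kernel (scaleKernel), the buffer size, and the launch configuration are assumptions for illustration only and are not taken from the original post.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: multiplies every element by a constant factor.
__global__ void scaleKernel(float *data, int n, float factor)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        data[i] *= factor;
}

int main()
{
    const int n = 1 << 20;
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    float *d_data = nullptr;
    cudaMalloc(&d_data, n * sizeof(float));
    cudaMemset(d_data, 0, n * sizeof(float));

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    // Both events go into stream 0, the same stream the kernel runs in,
    // so they bracket the kernel's execution on the device.
    cudaEventRecord(start, 0);
    scaleKernel<<<blocks, threads>>>(d_data, n, 2.0f);
    cudaEventRecord(stop, 0);

    // Block the host until the stop event (and the kernel before it) is done.
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);

    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    printf("Processing time is:\t%f (ms)\n", ms);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d_data);
    return 0;
}
```

Unlike a host-side timer, this does not need an extra cudaThreadSynchronize() call, since the wait on the stop event already covers the kernel.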
