Timing a CUDA program?

Hi guys,
I have written a ray tracer in CUDA and want to time it: I want to find out how much time the GPU spends executing, and how much time the CPU spends executing versus waiting for the GPU to return.

I based my ray tracer on the postprocessGL example.

I tried to follow the asyncAPI example for the timing, but somehow the cudaEventCreate call fails and gives a memory error.
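One way to narrow down which call actually reports the error is to check the return code of every runtime call and print the error string with the file and line. Below is a minimal sketch of that idea. The names CHECK_CUDA, checkCuda and createAndDestroyEvents are made up for illustration and are not part of CUDA or cutil; only the cuda* calls are real runtime API functions.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper (not part of CUDA or cutil): reports a failed runtime
// call with its source location and returns false instead of aborting.
inline bool checkCuda(cudaError_t err, const char *what, const char *file, int line)
{
    if (err != cudaSuccess) {
        fprintf(stderr, "%s:%d: %s failed: %s\n",
                file, line, what, cudaGetErrorString(err));
        return false;
    }
    return true;
}

#define CHECK_CUDA(call) checkCuda((call), #call, __FILE__, __LINE__)

// Creates two events, records them on the default stream, waits for the
// second one, and destroys both. Returns true only if every call succeeded.
bool createAndDestroyEvents()
{
    cudaEvent_t start, stop;
    if (!CHECK_CUDA(cudaEventCreate(&start)))
        return false;
    if (!CHECK_CUDA(cudaEventCreate(&stop))) {
        cudaEventDestroy(start);
        return false;
    }

    bool ok = CHECK_CUDA(cudaEventRecord(start, 0));
    ok = CHECK_CUDA(cudaEventRecord(stop, 0)) && ok;
    ok = CHECK_CUDA(cudaEventSynchronize(stop)) && ok;

    // Events are resources: destroy them when done, especially if the
    // surrounding function runs once per frame.
    ok = CHECK_CUDA(cudaEventDestroy(start)) && ok;
    ok = CHECK_CUDA(cudaEventDestroy(stop)) && ok;
    return ok;
}
```

If createAndDestroyEvents() succeeds on its own but the same calls fail inside the real program, the error was probably raised by an earlier call and only surfaced at cudaEventCreate.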

Basically, I put the block that creates the timer and the events in the process method in the postprocessgl.cu file.
I will post snippets showing exactly what I have done.

Thanks again.


Use the Visual Profiler. It will tell you all sorts of details, runtime being one of them.

Thanks for the reply.


// Run the CUDA part of the computation
void process(int pbo_in, int pbo_out, int width, int height, dim3 center)
{
    // for timing CUDA calls: create CUDA event handles
    cudaEvent_t start, stop;
    CUDA_SAFE_CALL( cudaEventCreate(&start) );
    CUDA_SAFE_CALL( cudaEventCreate(&stop)  );

    // CPU-side timer from cutil
    unsigned int timer;
    CUT_SAFE_CALL(  cutCreateTimer(&timer)  );
    CUT_SAFE_CALL(  cutResetTimer(timer)    );
    CUDA_SAFE_CALL( cudaThreadSynchronize() );
    float gpu_time = 0.0f;

    // for the CUDA process: map the OpenGL pixel buffer objects
    int *in_data;
    int *out_data;
    CUDA_SAFE_CALL( cudaGLMapBufferObject((void**)&in_data, pbo_in) );
    CUDA_SAFE_CALL( cudaGLMapBufferObject((void**)&out_data, pbo_out) );

    dim3 block(BLOCK_SIZE, BLOCK_SIZE, 1);
    dim3 grid(width / block.x, height / block.y, 1);

    CUT_SAFE_CALL( cutStartTimer(timer) );
    cudaEventRecord(start, 0);

    cudaRayTracer<<< grid, block, 0, 0 >>>(out_data, width, height, center);

    cudaEventRecord(stop, 0);
    CUT_SAFE_CALL( cutStopTimer(timer) );
    // (the posted snippet ends here; the rest of process() was not posted)
}

That is basically my code for timing.

It still doesn't work.

It gives some memory errors that I don't get otherwise.
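For comparison, here is a minimal self-contained sketch of event-based timing that separates the three numbers asked about: GPU execution time, CPU time spent issuing the launch, and CPU time spent waiting for the GPU. It is only a sketch under assumptions: dummyKernel stands in for the real ray tracing kernel, and Timings and timeKernel are hypothetical names. Two things it does that the posted snippet does not show are waiting on the stop event before reading the elapsed time and destroying the events afterwards.

```cuda
#include <chrono>
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for the real kernel (cudaRayTracer in the post).
__global__ void dummyKernel(int *out, int width, int height)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < width && y < height)   // guard: the grid may overshoot the image
        out[y * width + x] = x + y;
}

struct Timings {
    float gpuMs;     // time between the two events, measured on the GPU
    float launchMs;  // CPU time spent issuing the asynchronous launch
    float waitMs;    // CPU time spent blocked until the GPU finished
};

// Returns true on success; on failure the timings are not meaningful.
bool timeKernel(int width, int height, Timings *t)
{
    using clock = std::chrono::steady_clock;
    using ms = std::chrono::duration<float, std::milli>;

    int *d_out = nullptr;
    if (cudaMalloc((void **)&d_out, sizeof(int) * width * height) != cudaSuccess)
        return false;

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    dim3 block(16, 16, 1);
    // Round up so that sizes not divisible by the block size are covered.
    dim3 grid((width + block.x - 1) / block.x,
              (height + block.y - 1) / block.y, 1);

    clock::time_point t0 = clock::now();
    cudaEventRecord(start, 0);
    dummyKernel<<<grid, block>>>(d_out, width, height);
    cudaEventRecord(stop, 0);
    clock::time_point t1 = clock::now();  // launch returned; GPU may still be busy

    cudaEventSynchronize(stop);           // block the CPU until 'stop' has occurred
    clock::time_point t2 = clock::now();

    t->launchMs = ms(t1 - t0).count();
    t->waitMs = ms(t2 - t1).count();
    cudaError_t err = cudaEventElapsedTime(&t->gpuMs, start, stop);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d_out);
    return err == cudaSuccess && cudaGetLastError() == cudaSuccess;
}
```

Calling timeKernel(512, 512, &t) and printing t.gpuMs, t.launchMs and t.waitMs gives the three figures. Stopping a CPU timer right after the kernel launch, as in the posted code, only measures launchMs, because kernel launches return before the GPU has finished.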

The Visual Profiler… eh, that should solve all my problems :D


– PsycloXPS