Hi,
I am new to CUDA and have been trying out matrix multiplication, but I am having a problem when I time the GPU runs. I am posting the timing code below.
[codebox]cudaEvent_t start, stop;
float gpu_time;
cudaEventCreate(&start);
cudaEventCreate(&stop);
cudaEventRecord(start, 0);
// Launch code on device
mat_mul_on_device<<<dimGrid, dimBlock>>>(a_d, b_d, p_d, WIDTH);
cudaEventRecord(stop, 0);
cudaEventSynchronize(stop);
cudaEventElapsedTime(&gpu_time, start, stop);
cudaEventDestroy(start);
cudaEventDestroy(stop);[/codebox]
After this I print the time. I get these times for various widths of (square) matrices being multiplied:
width | cpu time (s) | gpu time (ms)
50 | 0.0000000e+00 | 6.0927998e-02
100 | 0.0000000e+00 | 4.7520000e-02
150 | 0.0000000e+00 | 6.4640000e-02
200 | 0.0000000e+00 | 6.5664001e-02
250 | 0.0000000e+00 | 6.5087996e-02
300 | 0.0000000e+00 | 6.7359999e-02
350 | 0.0000000e+00 | 6.6271998e-02
400 | 0.0000000e+00 | 6.6720001e-02
450 | 0.0000000e+00 | 6.7103997e-02
500 | 1.0000000e+00 | 6.6656001e-02
550 | 1.0000000e+00 | 6.8223998e-02
600 | 2.0000000e+00 | 7.0496000e-02
650 | 3.0000000e+00 | 7.5680003e-02
700 | 3.0000000e+00 | 7.6831996e-02
750 | 4.0000000e+00 | 7.8720003e-02
800 | 5.0000000e+00 | 8.0480002e-02
850 | 6.0000000e+00 | 8.0544002e-02
900 | 8.0000000e+00 | 8.1248000e-02
950 | 9.0000000e+00 | 8.2528003e-02
1000 | 1.1000000e+01 | 8.1184000e-02
2000 | 1.1900000e+02 | 7.9839997e-02
My problem is that the GPU times seem to go up and down as the width increases, which I don't understand. Is there a header file that needs to be included when using the “cudaEvent…” functions?
Please help. Thanks!
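For reference, here is a self-contained sketch of the same event-timing pattern with error checks added. One thing worth ruling out with it is a kernel launch that fails silently (for example, because of an invalid block size). A failed launch does no work, so the events would measure almost nothing regardless of width. Several parts are assumptions and not my actual code: the kernel body is a hypothetical naive version, the 16x16 block size is arbitrary, the inputs are zero-filled, and `CUDA_CHECK` and `time_mat_mul` are illustrative names. The event functions are declared in `cuda_runtime.h`, which nvcc normally pulls in automatically for `.cu` files.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>  // declares the cudaEvent* functions

// Illustrative helper (not part of CUDA): abort with a message on any error.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,        \
                    cudaGetErrorString(err_));                        \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Hypothetical naive kernel: one thread per element of P = A * B.
__global__ void mat_mul_on_device(const float* a, const float* b,
                                  float* p, int width)
{
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (row >= width || col >= width) return;  // guard the partial tiles

    float sum = 0.0f;
    for (int k = 0; k < width; ++k)
        sum += a[row * width + k] * b[k * width + col];
    p[row * width + col] = sum;
}

// Times a single kernel launch and returns the elapsed time in milliseconds.
float time_mat_mul(int width)
{
    size_t bytes = (size_t)width * width * sizeof(float);
    float *a_d = NULL, *b_d = NULL, *p_d = NULL;
    CUDA_CHECK(cudaMalloc((void**)&a_d, bytes));
    CUDA_CHECK(cudaMalloc((void**)&b_d, bytes));
    CUDA_CHECK(cudaMalloc((void**)&p_d, bytes));
    CUDA_CHECK(cudaMemset(a_d, 0, bytes));  // zero inputs: timing only
    CUDA_CHECK(cudaMemset(b_d, 0, bytes));

    // Assumed configuration: 16x16 threads, grid rounded up to cover width.
    dim3 dimBlock(16, 16);
    dim3 dimGrid((width + 15) / 16, (width + 15) / 16);

    cudaEvent_t start, stop;
    CUDA_CHECK(cudaEventCreate(&start));
    CUDA_CHECK(cudaEventCreate(&stop));

    CUDA_CHECK(cudaEventRecord(start, 0));
    mat_mul_on_device<<<dimGrid, dimBlock>>>(a_d, b_d, p_d, width);
    CUDA_CHECK(cudaGetLastError());          // reports launch failures
    CUDA_CHECK(cudaEventRecord(stop, 0));
    CUDA_CHECK(cudaEventSynchronize(stop));  // wait for the kernel to finish

    float gpu_time = 0.0f;
    CUDA_CHECK(cudaEventElapsedTime(&gpu_time, start, stop));

    CUDA_CHECK(cudaEventDestroy(start));
    CUDA_CHECK(cudaEventDestroy(stop));
    CUDA_CHECK(cudaFree(a_d));
    CUDA_CHECK(cudaFree(b_d));
    CUDA_CHECK(cudaFree(p_d));
    return gpu_time;
}

int main()
{
    const int widths[] = {100, 500, 1000};
    for (int i = 0; i < 3; ++i)
        printf("%d | %e ms\n", widths[i], time_mat_mul(widths[i]));
    return 0;
}
```

If the `cudaGetLastError()` line reports an error, the launch configuration is the first thing to check, since in that case the measured time would not reflect any actual multiplication.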