Evaluate cycle execution time (newbie question)

I want to measure the execution time of multiplying two matrices with cublasSgemm.

I tried doing this in a C file:

    start = GetTickCount();

    cublasSgemm('n', 'n', M, N, N, 1, devPtrA, M, devPtrB, M, 0, devPtrC, M);

    stop = GetTickCount();

But I get the same value for start and stop.

Could you please explain what is wrong here and how I can do this?

P.S. Sorry, my English isn't the best yet.

I didn't find a solution, so I did it another way: I measured the total GPU time while calling sgemm 10000 times (to neutralize the I/O time) with the same input and output arrays and a matrix dimension of 4096*4096, and I got 2 ms per sgemm run. That is a very nice result, but it doesn't look realistic. Could you explain where the problem is?