Then I implemented the sort serially on the CPU. What is the best timer code in C to measure the elapsed time so that it is fairly comparable with the previous parallel timer?
If you look at the functions in helper_timer.h in the SDK, you will find that on Linux they
use the gettimeofday() function. In other words, sdkStartTimer(&hTimer) and
sdkStopTimer(&hTimer) are themselves serial, host-side timers, so you can wrap the CPU sort
in the same calls (or call gettimeofday() directly) and the two measurements will be
comparable. One thing to point out: do not forget to synchronize after the GPU sort. That is,
insert cudaDeviceSynchronize() after the sort and before you stop the timer. Otherwise, the
timer only measures the kernel launch time.
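
For the CPU side, here is a minimal sketch of timing a serial sort with gettimeofday() directly. The names wall_ms, cmp_int and time_serial_sort are made up for illustration, and qsort() is only a stand-in for whatever serial sort you actually wrote. It assumes a POSIX system where <sys/time.h> is available.

```c
#include <stdlib.h>
#include <sys/time.h>   /* gettimeofday() (POSIX) */

/* Current wall-clock time in milliseconds (hypothetical helper). */
double wall_ms(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return tv.tv_sec * 1000.0 + tv.tv_usec / 1000.0;
}

/* Ascending comparison for qsort(); avoids overflow from plain subtraction. */
int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Sorts data in place on the CPU and returns the elapsed time in ms.
 * qsort() is a placeholder here; swap in your own serial sort. */
double time_serial_sort(int *data, size_t n)
{
    double t0 = wall_ms();
    qsort(data, n, sizeof *data, cmp_int);
    double t1 = wall_ms();
    return t1 - t0;
}
```

Call time_serial_sort(array, n) from your main() and print the result next to the GPU timing. For very short runs, repeat the sort several times on fresh copies of the input and average, since gettimeofday() has only microsecond resolution.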