Timing algorithms on host and device


I’ve been developing some algorithms in CUDA, and now it’s time to compare the performance of the sequential and parallel versions. To measure the elapsed time of the parallel algorithm I’m using CUDA events, but what can I use to measure the time of the sequential algorithms on the host? Thanks in advance!
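
For the host side, one common option is the C++11 `std::chrono` library. The following is only a minimal sketch, assuming a C++11 compiler: `sum_of_squares` is a hypothetical stand-in for the sequential algorithm, and `time_ms` is a hypothetical helper name, not part of any library. It reports milliseconds so the number is in the same unit as the value returned by `cudaEventElapsedTime`.

```cpp
#include <chrono>
#include <vector>

// Hypothetical sequential workload, standing in for the real host algorithm.
double sum_of_squares(const std::vector<double>& v) {
    double acc = 0.0;
    for (double x : v) {
        acc += x * x;
    }
    return acc;
}

// Hypothetical helper: runs `fn` once and returns the elapsed wall-clock
// time in milliseconds. steady_clock is monotonic, so the measurement is
// not affected by system clock adjustments.
template <typename Fn>
double time_ms(Fn&& fn) {
    const auto start = std::chrono::steady_clock::now();
    fn();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

It could be called as `double ms = time_ms([&] { result = sum_of_squares(data); });`. Averaging over several runs is probably worthwhile, since a single short run can be dominated by noise. If a host timer like this is ever wrapped around a kernel launch instead, the launch is asynchronous, so a `cudaDeviceSynchronize()` would be needed before stopping the timer.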