clFinish problem on Mac OS X Lion

I’m building an app that uses both CUDA and OpenCL, and I measure execution times with a CPU timer.
To make sure the timer covers the GPU work, I call cudaThreadSynchronize() or clFinish(…) both before starting and before stopping the CPU timer.
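Everything works fine on Ubuntu Linux 10.04 (the CUDA and OpenCL times are nearly identical), but on Mac OS X Lion the measured OpenCL times are implausibly low, which makes me suspect that clFinish is returning before the queued work has actually finished.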
Any help would be great.
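
A minimal sketch of the timing pattern described above, in case it helps to pin down the problem. This is not the actual application code. It is a host-only illustration, and every name in it (wall_seconds, time_region, the sync_fn and launch_fn callback types, and the fake_queue stand-ins) is hypothetical. It assumes gettimeofday as the CPU timer, which is available on both Linux and Mac OS X.

```c
#include <stddef.h>
#include <sys/time.h>

/* Hypothetical callback: blocks until all previously queued work is done.
   In a real program it would wrap clFinish(queue) or cudaThreadSynchronize(). */
typedef void (*sync_fn)(void *ctx);

/* Hypothetical callback: enqueues the work to be timed and may return early. */
typedef void (*launch_fn)(void *ctx);

/* CPU wall-clock time in seconds (assumed timer: gettimeofday). */
static double wall_seconds(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return (double)tv.tv_sec + (double)tv.tv_usec * 1e-6;
}

/* Synchronize, start the timer, launch, synchronize again, stop the timer. */
static double time_region(sync_fn sync, launch_fn launch, void *ctx)
{
    double start, stop;

    sync(ctx);               /* drain earlier work so it is not counted */
    start = wall_seconds();
    launch(ctx);             /* asynchronous: may return before completion */
    sync(ctx);               /* wait until the timed work has completed */
    stop = wall_seconds();

    return stop - start;
}

/* Host-only stand-ins so the sketch runs without a GPU (hypothetical). */
struct fake_queue {
    int pending;   /* work items enqueued but not yet completed */
    int finished;  /* work items completed so far */
};

static void fake_launch(void *ctx)
{
    struct fake_queue *q = (struct fake_queue *)ctx;
    q->pending += 1;
}

static void fake_sync(void *ctx)
{
    struct fake_queue *q = (struct fake_queue *)ctx;
    q->finished += q->pending;
    q->pending = 0;
}
```

With the real APIs, the sync callback would wrap clFinish(queue) for OpenCL or cudaThreadSynchronize() for CUDA, and the launch callback would enqueue the kernel (for example with clEnqueueNDRangeKernel). If the second synchronization call returned before the work had completed, the measured time would come out too low, which matches the symptom on Lion.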

Thank you.