Hey guys. I have a piece of code something like this:
for (n = 0; n < N; n++) {
    gpuMultKernel<<<blocksPerGrid, g_ThreadsPerBlock>>>(dg_uiptrInB, dg_fvG, dg_plG);
    gpuAccKernel<<<blocksPerGrid, g_ThreadsPerBlock>>>(dg_uiptrOutB, dg_cpNoiseB);
}
I measure the elapsed time per iteration using startTime and endTime. On average one loop iteration takes about 40 us, but every once in a while (4-5 times in 10,000 iterations) it takes over 1 ms. Does anybody have any idea why this is happening? When it happens I fall way behind real time…
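For reference, here is a minimal sketch of one way the per-iteration time could be measured with CUDA events. It is not the actual code: dummyKernel is a hypothetical stand-in for gpuMultKernel and gpuAccKernel, and the buffer size, launch configuration, and the countOutliers helper are assumptions made only for illustration.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical placeholder standing in for the real kernels in the post.
__global__ void dummyKernel(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = data[i] * 1.0001f + 1.0f;
}

// Count how many measured times (in ms) exceed a threshold (in ms).
int countOutliers(const std::vector<float> &timesMs, float thresholdMs)
{
    int count = 0;
    for (float t : timesMs)
        if (t > thresholdMs) ++count;
    return count;
}

int main()
{
    const int N = 10000;           // loop count mentioned in the post
    const int numElems = 1 << 16;  // assumed problem size
    const int threads = 256;       // assumed threads per block
    const int blocks = (numElems + threads - 1) / threads;

    float *d_data = nullptr;
    if (cudaMalloc(&d_data, numElems * sizeof(float)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }
    cudaMemset(d_data, 0, numElems * sizeof(float));

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    std::vector<float> timesMs(N);
    for (int n = 0; n < N; n++) {
        cudaEventRecord(start, 0);
        dummyKernel<<<blocks, threads>>>(d_data, numElems);
        dummyKernel<<<blocks, threads>>>(d_data, numElems);
        cudaEventRecord(stop, 0);
        cudaEventSynchronize(stop);  // block until both kernels have finished
        cudaEventElapsedTime(&timesMs[n], start, stop);  // elapsed time in ms
    }

    printf("iterations over 1 ms: %d of %d\n", countOutliers(timesMs, 1.0f), N);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d_data);
    return 0;
}
```

Kernel launches are asynchronous, so a host timer read right after the launches, without a synchronization point, covers only the launch calls. The event pair in this sketch covers the GPU execution of both kernels in each iteration.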
This happens on both a GTX 285 and a Tesla C2050, neither of which is used for the display. The development platform is Linux.
I would really appreciate any insights and suggestions.