11sec kernel then 600us


I have a large kernel that takes 11 s the first time it runs on a GPU that is also driving a display. Every execution of that kernel after that takes 600 us. Has CUDA crashed? How could I tell?

Does this have anything to do with the five-second run-time limit on kernels?

Insert cudaThreadSynchronize() right after the kernel invocation and check its return value for an error code.
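A rough sketch of what that check could look like. The helper cudaOk, the kernel fillKernel, and the wrapper runChecked are made-up names for illustration, not part of CUDA or the original poster's code. It uses cudaDeviceSynchronize(), which replaced the deprecated cudaThreadSynchronize() in later toolkits; on an old toolkit, swap the call back. A kernel killed by the display watchdog should come back from the synchronize call as cudaErrorLaunchTimeout.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: prints the CUDA error string and returns false
// when err is anything other than cudaSuccess.
static bool cudaOk(cudaError_t err, const char* what) {
    if (err != cudaSuccess) {
        std::fprintf(stderr, "%s: %s\n", what, cudaGetErrorString(err));
        return false;
    }
    return true;
}

// Placeholder kernel standing in for the real (large) kernel.
__global__ void fillKernel(float* data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = 1.0f;
}

// Launches the kernel and checks both the launch and the execution.
bool runChecked(int n) {
    float* d_data = nullptr;
    if (!cudaOk(cudaMalloc(&d_data, n * sizeof(float)), "cudaMalloc")) {
        return false;
    }

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    fillKernel<<<blocks, threads>>>(d_data, n);

    // Launch-time problems (e.g. a bad launch configuration) show up here.
    bool ok = cudaOk(cudaGetLastError(), "kernel launch");

    // Kernel launches are asynchronous, so execution errors such as a
    // watchdog timeout are only reported once the host waits for the GPU.
    ok = cudaOk(cudaDeviceSynchronize(), "kernel execution") && ok;

    cudaFree(d_data);
    return ok;
}
```

Calling runChecked(1 << 20) from main and testing the returned bool works the same in Debug and Release builds, since nothing here depends on a debug-only macro.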

Running CUT_CHECK_ERROR("Kernel execution failed") does nothing in Release mode, so I didn’t catch it. In Debug mode it reports that the kernel has timed out. Apparently my memory isn’t being initialized properly. I have posted that problem in a separate post.