cudaThreadSynchronize() stalls?

The CUDA 1.1 release notes say:

In order to reduce unwanted CPU utilization, the following APIs have
been modified to yield the CPU when the device is busy.

  • cuCtxSynchronize
  • cuEventSynchronize
  • cuStreamSynchronize
  • cudaThreadSynchronize
  • cudaEventSynchronize
  • cudaStreamSynchronize

Does this mean that cudaThreadSynchronize() (and the other synchronize calls) could return before the kernel has finished executing?


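No. The modification to the synchronization calls was to add a thread yield inside the busy-wait loop. The calls still block until the device has finished; they just give up the CPU between polls instead of spinning flat out. This is mentioned in a post by NVIDIA somewhere on this forum, but I forget where.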
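
To make the idea concrete, here is a minimal host-side sketch of a busy-wait loop with a yield in it. This is not NVIDIA's actual implementation, which I have not seen. The names launch_fake_kernel and yielding_synchronize are made up for illustration, and the "device" is just a second CPU thread that sets a flag after a delay. The point is only that the wait loop cannot exit until the flag is set, whether or not it yields.

```cpp
#include <atomic>
#include <chrono>
#include <thread>

// Hypothetical stand-in for the GPU: sets `done` after `work_ms` milliseconds.
// Returns the worker thread so the caller can join it.
std::thread launch_fake_kernel(std::atomic<bool>& done, int work_ms) {
    return std::thread([&done, work_ms] {
        std::this_thread::sleep_for(std::chrono::milliseconds(work_ms));
        done.store(true, std::memory_order_release);
    });
}

// Hypothetical synchronize call: polls until the work is done, yielding the
// CPU on every iteration. Returns the number of polls it performed.
long yielding_synchronize(const std::atomic<bool>& done) {
    long polls = 0;
    while (!done.load(std::memory_order_acquire)) {
        std::this_thread::yield();  // give up the time slice, then poll again
        ++polls;
    }
    return polls;  // only reached once `done` is true
}
```

Typical use would be to create a std::atomic<bool> done{false}, call launch_fake_kernel(done, 20), then call yielding_synchronize(done) and join the returned thread. Removing the yield() line changes how much CPU the waiting thread burns, not when the function returns.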