When calling cuEventSynchronize() on a CUevent that was created WITHOUT the CU_EVENT_BLOCKING_SYNC flag, which waiting method does the CPU thread use: SPIN or YIELD?
If anyone from the community or NVIDIA team has any insights, I would appreciate hearing from you. Thank you.
You could file a bug report requesting more documentation about the default behaviour.
You should be able to create a simple example which you can profile via Nsight Systems. If it shows continuous CPU usage on the calling thread while cuEventSynchronize() is executing, this would indicate a spinning loop.
Note that if the default behaviour is not specified in the documentation, the insights from profiling might not apply to other systems.
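As a rough complement to the profiler, the CPU usage check can also be sketched in code by comparing the calling thread's CPU time against wall-clock time across a wait. The helper names below (thread_cpu_seconds, cpu_to_wall_ratio, spin_wait_for, blocking_wait_for) are hypothetical and not part of CUDA, and the sketch assumes a POSIX system that provides clock_gettime with CLOCK_THREAD_CPUTIME_ID. The two reference waits only calibrate what a spinning and a blocking wait look like on the machine at hand.

```cpp
#include <chrono>
#include <ctime>
#include <thread>

// CPU time consumed so far by the calling thread, in seconds.
// Assumption: POSIX clock_gettime with CLOCK_THREAD_CPUTIME_ID is available.
inline double thread_cpu_seconds() {
    timespec ts{};
    clock_gettime(CLOCK_THREAD_CPUTIME_ID, &ts);
    return static_cast<double>(ts.tv_sec) + static_cast<double>(ts.tv_nsec) * 1e-9;
}

// Runs `wait` once on the calling thread and returns
// (thread CPU time used) / (wall-clock time elapsed).
// A value near 1 means the thread was busy for the whole wait;
// a value near 0 means it was descheduled (blocked) for most of it.
template <typename F>
double cpu_to_wall_ratio(F&& wait) {
    const auto wall_start = std::chrono::steady_clock::now();
    const double cpu_start = thread_cpu_seconds();
    wait();
    const double cpu_used = thread_cpu_seconds() - cpu_start;
    const std::chrono::duration<double> wall =
        std::chrono::steady_clock::now() - wall_start;
    return wall.count() > 0.0 ? cpu_used / wall.count() : 0.0;
}

// Reference wait that busy-polls the clock until the deadline (spinning).
inline void spin_wait_for(std::chrono::milliseconds d) {
    const auto deadline = std::chrono::steady_clock::now() + d;
    while (std::chrono::steady_clock::now() < deadline) {
        // intentionally empty: burn CPU until the deadline passes
    }
}

// Reference wait that hands the thread back to the OS (blocking).
inline void blocking_wait_for(std::chrono::milliseconds d) {
    std::this_thread::sleep_for(d);
}
```

To apply this to the question, one would record an event after a sufficiently long-running kernel and pass a lambda that calls cuEventSynchronize() on that event to cpu_to_wall_ratio. A ratio close to the blocking reference would suggest the thread sleeps. A ratio close to the spinning reference rules out blocking, but it does not by itself separate SPIN from YIELD, because a yield loop on an otherwise idle core also keeps the thread busy. Telling those two apart would likely need an OS-level trace of yield-style system calls, and which calls the driver makes is an assumption here.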
Thank you for the feedback. The default behavior is exactly what I was hoping to have clarified, but it seems to be unspecified.