Circular reference in CUDA Driver API design


There is a circular reference in the design of the CUDA Driver API:

// stream section
cuStreamWaitEvent( stream, event, flags );

// event section
cuEventRecord( event, stream );

In other words:

“stream object” uses “event object”.

“event object” uses “stream object”.

This circular dependency could probably be broken by moving cuEventRecord to the stream section, renaming it, and perhaps swapping its parameters so that the stream comes first, like so:

// stream section
cuStreamRecordEvent( stream, event );

cuStreamWaitEvent( stream, event, flags );

I think this would be a better design: the stream section would depend on events, but not the other way around.