Hi,
I’m working with CUDA Graphs and I’d like to confirm the correct usage pattern of cudaGraphExecDestroy.
According to my understanding, if I launch a graph via cudaGraphLaunch(graphExec, stream) and then immediately destroy it via cudaGraphExecDestroy(graphExec), there could be undefined behavior unless I first synchronize the stream.
However, the latest CUDA documentation for cudaGraphExecDestroy no longer includes the note stating that “it is the user’s responsibility to ensure the graph execution is complete before destroying the graph execution object.”
So my question is:
Is it still required (or strongly recommended) to call cudaStreamSynchronize(stream) before calling cudaGraphExecDestroy, in order to avoid undefined behavior or crashes?
If not, does the destroy API now handle pending executions safely? Or is the synchronization still the user’s responsibility, even though the documentation no longer states it explicitly?
Thanks for any clarification from the NVIDIA team or engineers.
Best regards,
Yuanjun Ren