Do We Need to Synchronize Before Calling cudaGraphExecDestroy?

renyuanjun310 · June 18, 2025, 2:57am

Hi,

I’m working with CUDA Graphs and I’d like to confirm the correct usage pattern of cudaGraphExecDestroy.

As I understand it, if I launch a graph with cudaGraphLaunch(graphExec, stream) and then immediately destroy the executable graph with cudaGraphExecDestroy(graphExec), the behavior could be undefined unless I first synchronize the stream.

However, the latest CUDA documentation for cudaGraphExecDestroy no longer includes the note stating that “it is the user’s responsibility to ensure the graph execution is complete before destroying the graph execution object.”

So my question is:

Is it still required (or strongly recommended) to call cudaStreamSynchronize(stream) before calling cudaGraphExecDestroy, in order to avoid undefined behavior or crashes?

If not, does the destroy API now handle pending executions safely? Or is synchronization still the user's responsibility, even though the documentation no longer states it explicitly?
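
For concreteness, here is a minimal sketch of the conservative pattern I am asking about. The kernel addOne, the buffer size, and the CHECK macro are placeholders made up for illustration; only the launch / synchronize / destroy ordering matters. I use cudaGraphInstantiateWithFlags because, as far as I know, its three-argument signature is the same in recent CUDA 11.x and 12.x toolkits.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: print the error and bail out of main on failure.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "%s failed: %s\n", #call,                 \
                    cudaGetErrorString(err_));                        \
            return 1;                                                 \
        }                                                             \
    } while (0)

// Hypothetical kernel, present only so the graph has some work to do.
__global__ void addOne(int *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1;
}

int main() {
    const int n = 1 << 20;
    int *d = nullptr;
    cudaStream_t stream;
    cudaGraph_t graph;
    cudaGraphExec_t graphExec;

    CHECK(cudaMalloc(&d, n * sizeof(int)));
    CHECK(cudaMemset(d, 0, n * sizeof(int)));
    CHECK(cudaStreamCreate(&stream));

    // Build the graph by capturing a single kernel launch on the stream.
    CHECK(cudaStreamBeginCapture(stream, cudaStreamCaptureModeGlobal));
    addOne<<<(n + 255) / 256, 256, 0, stream>>>(d, n);
    CHECK(cudaStreamEndCapture(stream, &graph));

    CHECK(cudaGraphInstantiateWithFlags(&graphExec, graph, 0));

    // Launch the executable graph; this returns before the work finishes.
    CHECK(cudaGraphLaunch(graphExec, stream));

    // The step in question: wait for the launch to complete
    // before destroying the executable graph.
    CHECK(cudaStreamSynchronize(stream));
    CHECK(cudaGraphExecDestroy(graphExec));

    CHECK(cudaGraphDestroy(graph));
    CHECK(cudaStreamDestroy(stream));
    CHECK(cudaFree(d));
    return 0;
}
```

My question is whether the cudaStreamSynchronize(stream) call in this sketch can be dropped safely, or whether it has to stay.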

Thanks for any clarification from the NVIDIA team or engineers.

Best regards,
Yuanjun Ren
