Why doesn't cudaGraphLaunch(graph_exec_, stream1) run the graph on stream1?

I captured a graph on stream0 and instantiated it. Then I wanted to replay the graph on another stream, for example stream1, so I called cudaGraphLaunch(graph_exec_, stream1). But the profiler shows the graph running on a new stream, neither stream0 nor stream1. Why?

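Capturing a graph from a stream only records the graph topology (the nodes and the dependencies between them) so that it can be optimized; the capture stream itself is not part of the graph. The captured workflow can then be executed repeatedly.
When the graph runs, the driver is free to use internal streams to fulfill the node dependencies, which is why the profiler shows the work on a stream that is neither stream0 nor stream1. The stream you pass to cudaGraphLaunch determines how the launch is ordered relative to other work in that stream, not which stream the nodes execute on.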
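To illustrate, here is a minimal sketch of the capture-on-stream0, launch-on-stream1 pattern from the question. The kernel add_one, the buffer size, and the CHECK macro are made up for this example, and it assumes CUDA 11.4 or newer for cudaGraphInstantiateWithFlags. The point is that synchronizing stream1 is enough to wait for the graph, even if the profiler attributes the kernel to a different, driver-chosen stream.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical error-checking helper for this sketch; only valid inside main().
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,        \
                    cudaGetErrorString(err_));                        \
            return 1;                                                 \
        }                                                             \
    } while (0)

// Hypothetical kernel: adds 1 to every element.
__global__ void add_one(int* data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1;
}

int main() {
    const int n = 1024;
    int* d_data = nullptr;
    CHECK(cudaMalloc(&d_data, n * sizeof(int)));
    CHECK(cudaMemset(d_data, 0, n * sizeof(int)));

    cudaStream_t stream0, stream1;
    CHECK(cudaStreamCreate(&stream0));
    CHECK(cudaStreamCreate(&stream1));

    // Capture on stream0: the kernel is recorded as a graph node, not executed.
    cudaGraph_t graph;
    CHECK(cudaStreamBeginCapture(stream0, cudaStreamCaptureModeGlobal));
    add_one<<<(n + 255) / 256, 256, 0, stream0>>>(d_data, n);
    CHECK(cudaStreamEndCapture(stream0, &graph));

    // Instantiate once (assumes CUDA 11.4+ for this entry point).
    cudaGraphExec_t graph_exec;
    CHECK(cudaGraphInstantiateWithFlags(&graph_exec, graph, 0));

    // Replay three times, ordered in stream1. The nodes themselves may be
    // shown on an internal stream in the profiler.
    for (int iter = 0; iter < 3; ++iter) {
        CHECK(cudaGraphLaunch(graph_exec, stream1));
    }
    // Waiting on the launch stream waits for all three graph launches.
    CHECK(cudaStreamSynchronize(stream1));

    int h_first = -1;
    CHECK(cudaMemcpy(&h_first, d_data, sizeof(int), cudaMemcpyDeviceToHost));
    printf("data[0] = %d\n", h_first);  // prints "data[0] = 3"

    CHECK(cudaGraphExecDestroy(graph_exec));
    CHECK(cudaGraphDestroy(graph));
    CHECK(cudaStreamDestroy(stream0));
    CHECK(cudaStreamDestroy(stream1));
    CHECK(cudaFree(d_data));
    return 0;
}
```

If you run this under a profiler, the kernel may appear on a stream ID that matches neither stream0 nor stream1, yet the ordering with respect to stream1 (and the final result) is still what you would expect.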
