Providing new data to a CUDA graph

When we record a graph, the input buffer we pass to the root node is recorded along with it.

Every time we run the graph, it reads its input from that same buffer.

But we don’t want to keep processing the same data with the graph! How can we advance the graph so that it runs on the next buffer? You might suppose that we can pass the address of the buffer to the root node by reference, and then keep changing the address, but if we’re recording a sequence of cuBLAS or cuFFT calls, they don’t accept a float **.

For example:

        // Start graph capture
        cudaStreamBeginCapture(cudaStreamPerThread, cudaStreamCaptureModeGlobal);

        cufftExecR2C(plan_, (cufftReal*) input_data, complex_result);

        // Stop graph capture
        cudaStreamEndCapture(cudaStreamPerThread, &graph);

When we run the graph, we have made things work by copying the new data into the input buffer that the graph was recorded with, but that seems really inefficient and unnecessary.
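
To make the workaround concrete, here is a rough sketch of the copy-then-launch pattern, not our exact code. The names n, num_batches, host_batch and graph_exec are made up for illustration, the sizes are arbitrary, and error checking is left out to keep it short. It also assumes the plan is bound to cudaStreamPerThread with cufftSetStream so that the cuFFT call lands in the capture.

```cuda
#include <cuda_runtime.h>
#include <cufft.h>
#include <vector>

int main() {
    const int n = 1024;          // hypothetical FFT length
    const int num_batches = 4;   // hypothetical number of input buffers

    // Fixed device buffers: these addresses get recorded into the graph.
    cufftReal* input_data = nullptr;
    cufftComplex* complex_result = nullptr;
    cudaMalloc(&input_data, n * sizeof(cufftReal));
    cudaMalloc(&complex_result, (n / 2 + 1) * sizeof(cufftComplex));

    // Assumption: the plan runs on the same stream that is being captured.
    cufftHandle plan_;
    cufftPlan1d(&plan_, n, CUFFT_R2C, 1);
    cufftSetStream(plan_, cudaStreamPerThread);

    // Record the graph once, exactly as in the snippet above.
    cudaGraph_t graph;
    cudaStreamBeginCapture(cudaStreamPerThread, cudaStreamCaptureModeGlobal);
    cufftExecR2C(plan_, input_data, complex_result);
    cudaStreamEndCapture(cudaStreamPerThread, &graph);

    // Instantiate an executable graph (flags = 0).
    cudaGraphExec_t graph_exec;
    cudaGraphInstantiateWithFlags(&graph_exec, graph, 0);

    std::vector<cufftReal> host_batch(n);
    for (int b = 0; b < num_batches; ++b) {
        // Stand-in for fetching the next buffer of real data.
        for (int i = 0; i < n; ++i) host_batch[i] = static_cast<cufftReal>(b + i);

        // The workaround: overwrite the recorded input buffer with the new data.
        // Same stream as the launch, so the copy is ordered before the FFT.
        cudaMemcpyAsync(input_data, host_batch.data(), n * sizeof(cufftReal),
                        cudaMemcpyHostToDevice, cudaStreamPerThread);

        // Replay the graph; it still reads from input_data.
        cudaGraphLaunch(graph_exec, cudaStreamPerThread);

        // Wait before reusing host_batch and before consuming complex_result.
        cudaStreamSynchronize(cudaStreamPerThread);
    }

    cudaGraphExecDestroy(graph_exec);
    cudaGraphDestroy(graph);
    cufftDestroy(plan_);
    cudaFree(input_data);
    cudaFree(complex_result);
    return 0;
}
```

The extra cost is the cudaMemcpyAsync on every iteration: the graph itself never changes, only the contents of the buffer it points at.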