Graph scheduling and __constant__ data

What’s the recommended convention for updating __constant__ data when using CUDA graphs?

If I have a kernel node in a CUDA graph and need to update some __constant__ data on each launch, I can add a memcpy node ahead of the kernel node to do the update, and everything works.
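For concreteness, here is a minimal sketch of that single-kernel setup built with the explicit graph API. The symbol kScale, the kernel scaleKernel, and the helper runMemcpyThenKernel are hypothetical names made up for illustration. It assumes CUDA 11.4 or later (for cudaGraphInstantiateWithFlags) and a pinned host staging buffer, and it leaves out error checking for brevity.

```cuda
#include <cuda_runtime.h>

// Hypothetical __constant__ value the kernel reads on every launch.
__constant__ float kScale;

// Hypothetical kernel: writes kScale * i into out[i].
__global__ void scaleKernel(float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = kScale * static_cast<float>(i);
}

// Builds "memcpy to symbol -> kernel", launches the graph twice with
// different host values, and returns out[1] from the second launch.
float runMemcpyThenKernel(float first, float second) {
    int n = 256;
    float* dOut = nullptr;
    float* hScale = nullptr;
    cudaMalloc(&dOut, n * sizeof(float));
    cudaMallocHost(&hScale, sizeof(float));  // pinned; read when the graph runs

    cudaGraph_t graph;
    cudaGraphCreate(&graph, 0);

    // Node 1: copy *hScale into the __constant__ symbol.
    cudaGraphNode_t copyNode;
    cudaGraphAddMemcpyNodeToSymbol(&copyNode, graph, nullptr, 0, kScale,
                                   hScale, sizeof(float), 0,
                                   cudaMemcpyHostToDevice);

    // Node 2: the kernel, with the copy as its only dependency.
    void* args[] = { &dOut, &n };
    cudaKernelNodeParams kp = {};
    kp.func = reinterpret_cast<void*>(scaleKernel);
    kp.gridDim = dim3(1);
    kp.blockDim = dim3(n);
    kp.kernelParams = args;
    cudaGraphNode_t kernelNode;
    cudaGraphAddKernelNode(&kernelNode, graph, &copyNode, 1, &kp);

    cudaGraphExec_t exec;
    cudaGraphInstantiateWithFlags(&exec, graph, 0);
    cudaStream_t stream;
    cudaStreamCreate(&stream);

    // Update the staging buffer between launches, never during one.
    *hScale = first;
    cudaGraphLaunch(exec, stream);
    cudaStreamSynchronize(stream);
    *hScale = second;
    cudaGraphLaunch(exec, stream);
    cudaStreamSynchronize(stream);

    float result[2];
    cudaMemcpy(result, dOut, sizeof(result), cudaMemcpyDeviceToHost);

    cudaStreamDestroy(stream);
    cudaGraphExecDestroy(exec);
    cudaGraphDestroy(graph);
    cudaFreeHost(hScale);
    cudaFree(dOut);
    return result[1];  // equals kScale as seen by the second launch
}
```

Because the memcpy node reads the host buffer when the graph executes, writing a new value into hScale between launches is enough. The graph itself does not need to be rebuilt.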

But what if I have two instances of the same kernel as independent nodes, with no dependency between them? The graph scheduler could decide to run memcpy(dataA), memcpy(dataB), then the kernel node expecting data A, then the kernel node expecting data B. Both kernels read the same __constant__ symbol, so the first kernel node would see data B instead of data A and be severely disappointed…
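To make the hazard concrete, the sketch below builds the two-instance case and shows one conservative way to rule out the bad ordering: make the second memcpy node depend on the first kernel node, so the chain becomes memcpy(dataA), kernel A, memcpy(dataB), kernel B. This is only an illustration under assumptions, not an official recommendation. All names (kScale, scaleKernel, runTwoInstancesSerialized) are hypothetical, error checking is omitted, and the extra edge gives up any concurrency between the two kernels.

```cuda
#include <cuda_runtime.h>

// Hypothetical __constant__ value shared by both kernel instances.
__constant__ float kScale;

// Hypothetical kernel: writes kScale * i into out[i].
__global__ void scaleKernel(float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = kScale * static_cast<float>(i);
}

// Builds copyA -> kernelA -> copyB -> kernelB and reports the kScale value
// each kernel instance observed (read back from out[1]).
void runTwoInstancesSerialized(float a, float b, float* seenA, float* seenB) {
    int n = 256;
    float *dOutA = nullptr, *dOutB = nullptr, *hVals = nullptr;
    cudaMalloc(&dOutA, n * sizeof(float));
    cudaMalloc(&dOutB, n * sizeof(float));
    cudaMallocHost(&hVals, 2 * sizeof(float));  // pinned staging: {a, b}
    hVals[0] = a;
    hVals[1] = b;

    cudaGraph_t graph;
    cudaGraphCreate(&graph, 0);

    cudaKernelNodeParams kp = {};
    kp.func = reinterpret_cast<void*>(scaleKernel);
    kp.gridDim = dim3(1);
    kp.blockDim = dim3(n);

    // Chain A: copy value a into the symbol, then run the first instance.
    cudaGraphNode_t copyA, kernelA;
    cudaGraphAddMemcpyNodeToSymbol(&copyA, graph, nullptr, 0, kScale,
                                   &hVals[0], sizeof(float), 0,
                                   cudaMemcpyHostToDevice);
    void* argsA[] = { &dOutA, &n };
    kp.kernelParams = argsA;
    cudaGraphAddKernelNode(&kernelA, graph, &copyA, 1, &kp);

    // Chain B: copyB depends on kernelA. Without this dependency copyB
    // would be a root node and could overwrite kScale before kernelA runs.
    cudaGraphNode_t copyB, kernelB;
    cudaGraphAddMemcpyNodeToSymbol(&copyB, graph, &kernelA, 1, kScale,
                                   &hVals[1], sizeof(float), 0,
                                   cudaMemcpyHostToDevice);
    void* argsB[] = { &dOutB, &n };
    kp.kernelParams = argsB;
    cudaGraphAddKernelNode(&kernelB, graph, &copyB, 1, &kp);

    cudaGraphExec_t exec;
    cudaGraphInstantiateWithFlags(&exec, graph, 0);
    cudaStream_t stream;
    cudaStreamCreate(&stream);
    cudaGraphLaunch(exec, stream);
    cudaStreamSynchronize(stream);

    // out[1] == kScale * 1, i.e. the value each instance actually read.
    cudaMemcpy(seenA, dOutA + 1, sizeof(float), cudaMemcpyDeviceToHost);
    cudaMemcpy(seenB, dOutB + 1, sizeof(float), cudaMemcpyDeviceToHost);

    cudaStreamDestroy(stream);
    cudaGraphExecDestroy(exec);
    cudaGraphDestroy(graph);
    cudaFreeHost(hVals);
    cudaFree(dOutA);
    cudaFree(dOutB);
}
```

Passing nullptr and 0 as the dependencies of copyB would reproduce the independent layout described above, in which nothing in the graph orders copyB after kernelA.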

So, then what?
