[CUDA Graph] Add Node from 3rd party library that contains CUDA kernel calls

Hi,

My team is currently working on organizing a complex algorithm using CUDA Graphs. One step of the algorithm involves a call to a third-party library, which includes a CUDA kernel launch along with some host-side pre- and post-processing.

Because of this combination of host and device operations (and lack of direct control over the kernel launch), it doesn’t seem possible to represent this step using either a HostNode or KernelNode in the CUDA Graph.

Is there a way to embed such a function into a CUDA Graph, or work around this limitation?
Thanks

It is generally not possible to add arbitrary library calls to a graph. The workaround is to split the graph in two and make the library call as an ordinary, non-graph step between the two graph launches:

graphEverythingBeforeLibraryCall

libraryCall

graphEverythingAfterLibraryCall
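
To make the sequence above concrete, here is a minimal sketch of the split. Everything named in it is an assumption for illustration: `libraryCall` and `libKernel` are hypothetical stand-ins for the third-party function, `addOne` and `scaleBy2` stand in for the work before and after it, and the sketch assumes CUDA 11.4 or newer (for `cudaGraphInstantiateWithFlags`) and a library that accepts the caller's stream.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

#define CUDA_CHECK(call)                                                  \
    do {                                                                  \
        cudaError_t err_ = (call);                                        \
        if (err_ != cudaSuccess) {                                        \
            fprintf(stderr, "CUDA error: %s at %s:%d\n",                  \
                    cudaGetErrorString(err_), __FILE__, __LINE__);        \
            exit(EXIT_FAILURE);                                           \
        }                                                                 \
    } while (0)

// Placeholder work for the two graphs.
__global__ void addOne(float* x, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] += 1.0f;
}
__global__ void scaleBy2(float* x, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] *= 2.0f;
}

// Hypothetical stand-in for the third-party library: host pre-processing,
// an internal kernel launch, and host post-processing that needs the result.
__global__ void libKernel(float* x, int n, float bias) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] += bias;
}
void libraryCall(float* d_x, int n, cudaStream_t stream) {
    float bias = 3.0f;                          // host pre-processing
    libKernel<<<(n + 255) / 256, 256, 0, stream>>>(d_x, n, bias);
    CUDA_CHECK(cudaGetLastError());
    CUDA_CHECK(cudaStreamSynchronize(stream));  // host post-processing would follow
}

// Capture whatever `enqueue` submits to `stream` into an executable graph.
template <typename F>
cudaGraphExec_t captureGraph(cudaStream_t stream, F enqueue) {
    cudaGraph_t graph;
    cudaGraphExec_t exec;
    CUDA_CHECK(cudaStreamBeginCapture(stream, cudaStreamCaptureModeGlobal));
    enqueue(stream);
    CUDA_CHECK(cudaStreamEndCapture(stream, &graph));
    CUDA_CHECK(cudaGraphInstantiateWithFlags(&exec, graph, 0));
    CUDA_CHECK(cudaGraphDestroy(graph));  // the executable graph keeps its own copy
    return exec;
}

// Runs graphBefore -> libraryCall -> graphAfter; returns the number of wrong elements.
int runSplitPipeline(int n) {
    float* d_x = nullptr;
    cudaStream_t stream;
    CUDA_CHECK(cudaStreamCreate(&stream));  // capture needs a non-default stream
    CUDA_CHECK(cudaMalloc(&d_x, n * sizeof(float)));
    CUDA_CHECK(cudaMemset(d_x, 0, n * sizeof(float)));
    int blocks = (n + 255) / 256;

    cudaGraphExec_t before = captureGraph(stream, [=](cudaStream_t s) {
        addOne<<<blocks, 256, 0, s>>>(d_x, n);
    });
    cudaGraphExec_t after = captureGraph(stream, [=](cudaStream_t s) {
        scaleBy2<<<blocks, 256, 0, s>>>(d_x, n);
    });

    // Same stream throughout, so stream order serializes the three steps.
    // If the library used its own stream, synchronize before calling it.
    CUDA_CHECK(cudaGraphLaunch(before, stream));
    libraryCall(d_x, n, stream);
    CUDA_CHECK(cudaGraphLaunch(after, stream));
    CUDA_CHECK(cudaStreamSynchronize(stream));

    std::vector<float> h_x(n);
    CUDA_CHECK(cudaMemcpy(h_x.data(), d_x, n * sizeof(float), cudaMemcpyDeviceToHost));
    int mismatches = 0;
    for (float v : h_x) if (v != 8.0f) ++mismatches;  // (0 + 1 + 3) * 2 = 8

    CUDA_CHECK(cudaGraphExecDestroy(before));
    CUDA_CHECK(cudaGraphExecDestroy(after));
    CUDA_CHECK(cudaFree(d_x));
    CUDA_CHECK(cudaStreamDestroy(stream));
    return mismatches;
}

int main() {
    printf("mismatches: %d\n", runSplitPipeline(1 << 20));
    return 0;
}
```

The two executable graphs can be instantiated once and relaunched on every iteration; only the library step in the middle runs as ordinary, non-graph work, so the launch-overhead savings apply to everything on either side of it.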
