cudaGraph kernel node copy question

I am studying the cudaGraph API and am very confused by cudaGraphAddKernelNode. In the official documentation at https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__GRAPH.html, the statement for cudaGraphAddKernelNode is:

  1. Kernel parameters can be specified via kernelParams. If the kernel has N parameters, then kernelParams needs to be an array of N pointers. Each pointer, from kernelParams[0] to kernelParams[N-1], points to the region of memory from which the actual parameter will be copied. The number of kernel parameters and their offsets and sizes do not need to be specified as that information is retrieved directly from the kernel’s image.

Does this mean that when I call cudaGraphAddKernelNode with a parameter of type cudaKernelNodeParams, the CUDA runtime will copy all the arguments pointed to by kernelParams[0] through kernelParams[N-1] immediately? I am asking because in my program many other functions set up these pointers to local values. If the CUDA runtime only copies the actual parameters at the time I execute the graph, those pointers may be dangling by then.
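For concreteness, here is a minimal sketch of the pattern I am worried about (the kernel scaleKernel, the helper addScaleNode, and their arguments are made up purely for illustration):

```cpp
#include <cuda_runtime.h>

// Hypothetical kernel used only to illustrate the question.
__global__ void scaleKernel(float* data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Hypothetical helper: builds kernelParams from pointers to local variables.
void addScaleNode(cudaGraph_t graph, float* d_data, int n, float factor) {
    // Each entry points to the memory holding one actual argument value --
    // here, this function's parameters on the stack.
    void* args[3] = { &d_data, &n, &factor };

    cudaKernelNodeParams params = {};
    params.func           = (void*)scaleKernel;
    params.gridDim        = dim3((n + 255) / 256);
    params.blockDim       = dim3(256);
    params.sharedMemBytes = 0;
    params.kernelParams   = args;     // pointers into this stack frame
    params.extra          = nullptr;

    cudaGraphNode_t node;
    cudaGraphAddKernelNode(&node, graph, nullptr, 0, &params);
    // When this function returns, args, d_data, n and factor are gone. Is that
    // safe, or are the values only read when the graph is launched?
}
```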


Yes. CUDA makes an internal copy of the task description when cudaGraphAddKernelNode is invoked; it won't refer to the pNodeParams pointer you passed in after the function returns.
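As a sketch (reusing the hypothetical scaleKernel from the post above), a pattern like the following is therefore safe: the argument values are captured before cudaGraphAddKernelNode returns, so the locals can be modified or go out of scope afterwards.

```cpp
__global__ void scaleKernel(float* data, int n, float factor);  // hypothetical kernel from above

cudaGraphNode_t addNodeWithLocals(cudaGraph_t graph, float* d_data) {
    int   n      = 1 << 20;
    float factor = 2.0f;
    void* args[3] = { &d_data, &n, &factor };

    cudaKernelNodeParams params = {};
    params.func         = (void*)scaleKernel;
    params.gridDim      = dim3((n + 255) / 256);
    params.blockDim     = dim3(256);
    params.kernelParams = args;

    cudaGraphNode_t node;
    cudaGraphAddKernelNode(&node, graph, nullptr, 0, &params);

    // Safe: the node has already captured copies of d_data, n and factor.
    n = 0; factor = 0.0f;
    return node;
}   // args, n, factor go out of scope; the node still holds the values copied above
```

The graph can then be instantiated and launched at any later point without keeping those locals alive.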


Thanks for answering this. Am I safe to assume the above also applies to the cuLaunchKernel and cudaLaunchKernel functions in the driver/runtime APIs?

BTW, this is the only clarification I could find on this, and only after searching for some time.

The official NVIDIA API docs (for cuLaunchKernel, cudaLaunchKernel, cudaGraphAddKernelNode, etc.) would benefit from a clarification, e.g. "…when this function returns, the values pointed to by kernelParams will have been copied and the memory may be freed or reused for other purposes."