Why does CUDAGraph itself have memory consumption?
I am new to CUDAGraph. I notice that every time I instantiate a CUDAGraph, a small amount of GPU memory is consumed by the creation. May I know what that memory reservation is for?

The graph definition and its execution state require storage on the GPU; the exact layout is an implementation detail, so it's not possible to go into much depth here. Keeping this state resident on the device is what allows the graph to be launched with the lowest possible latency, and low-latency execution is a key use case for CUDA graphs.
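If you want to observe this yourself, the following is a minimal sketch (not an official measurement technique) that captures a trivial graph and compares the device's free memory, as reported by `cudaMemGetInfo`, before and after `cudaGraphInstantiate`. Note that the reported difference can also include allocator and context overheads, so treat the number as indicative only:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Trivial kernel so the capture contains at least one node.
__global__ void noop() {}

int main() {
    cudaStream_t stream;
    cudaStreamCreate(&stream);

    // Capture a tiny graph on the stream.
    cudaGraph_t graph;
    cudaStreamBeginCapture(stream, cudaStreamCaptureModeGlobal);
    noop<<<1, 1, 0, stream>>>();
    cudaStreamEndCapture(stream, &graph);

    size_t freeBefore, freeAfter, total;
    cudaMemGetInfo(&freeBefore, &total);

    // Instantiation is where the executable graph's device-side
    // storage gets allocated.
    cudaGraphExec_t graphExec;
    cudaGraphInstantiate(&graphExec, graph, nullptr, nullptr, 0);

    cudaMemGetInfo(&freeAfter, &total);
    printf("free before: %zu bytes, free after: %zu bytes\n",
           freeBefore, freeAfter);

    cudaGraphExecDestroy(graphExec);
    cudaGraphDestroy(graph);
    cudaStreamDestroy(stream);
    return 0;
}
```

Destroying the executable graph with `cudaGraphExecDestroy` releases that storage again.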