CUDA Graph Memory Reservations

I have a simple program that uses CUDA Graphs:

        system("nvidia-smi");

        checkCudaErrors(cudaStreamBeginCapture(stream, cudaStreamCaptureModeGlobal));
        for (int ikrnl = 0; ikrnl < 1000; ikrnl++)
        {
                shortKernel<<<1, 1, 0, stream>>>(A);
        }
        checkCudaErrors(cudaStreamEndCapture(stream, &graph));
        checkCudaErrors(cudaGraphInstantiate(&instance, graph, NULL, NULL, 0));
        checkCudaErrors(cudaGraphLaunch(instance, stream));
        checkCudaErrors(cudaStreamSynchronize(stream));

        system("nvidia-smi");

        checkCudaErrors(cudaGraphExecDestroy(instance));
        checkCudaErrors(cudaGraphDestroy(graph));

        system("nvidia-smi");

From the numbers reported by nvidia-smi, I notice that creating the CUDA graph causes GPU memory consumption to go up a little, and furthermore, destroying the graph data structures (the cudaGraphExecDestroy / cudaGraphDestroy calls above) does not reclaim that memory. May I ask (1) why CUDA graphs need to make memory reservations, and (2) whether there is a proper way for us to reclaim the memory allocated?
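For what it's worth, the same before/after numbers can also be read in-process with cudaMemGetInfo instead of shelling out to nvidia-smi. This is just a sketch; the variable names below are illustrative and not from the program above:

        // Sketch only: read free/total device memory around the region of interest,
        // instead of calling system("nvidia-smi").
        size_t freeBefore, freeAfter, totalBytes;
        checkCudaErrors(cudaMemGetInfo(&freeBefore, &totalBytes));

        // ... build, launch, and destroy the graph here ...

        checkCudaErrors(cudaMemGetInfo(&freeAfter, &totalBytes));
        printf("device memory delta: %lld bytes\n",
               (long long)freeBefore - (long long)freeAfter);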

Thank you.

I see the issue as well. I'm not aware of any way to reclaim that memory other than cudaDeviceReset(), which I'm sure is not what you had in mind.
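For completeness, a minimal sketch of what that would look like. Note that cudaDeviceReset() tears down the entire primary context for the process, so every existing allocation, stream, and graph becomes invalid and would have to be recreated afterwards:

        // Sketch only: cudaDeviceReset() releases every device resource held by the
        // calling process, including whatever the graph machinery reserved, but it
        // also invalidates all existing allocations, streams, and graphs.
        checkCudaErrors(cudaGraphExecDestroy(instance));
        checkCudaErrors(cudaGraphDestroy(graph));
        checkCudaErrors(cudaDeviceReset());   // after this, the nvidia-smi numbers drop back down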

I’ve filed an internal bug at NVIDIA to have this looked at. (3865932)
