I have a simple program that uses CUDAGraph:
system("nvidia-smi");
checkCudaErrors(cudaStreamBeginCapture(stream, cudaStreamCaptureModeGlobal));
for (int ikrnl = 0; ikrnl < 1000; ikrnl++)
{
    shortKernel<<<1, 1, 0, stream>>>(A);
}
checkCudaErrors(cudaStreamEndCapture(stream, &graph));
checkCudaErrors(cudaGraphInstantiate(&instance, graph, NULL, NULL, 0));
checkCudaErrors(cudaGraphLaunch(instance, stream));
checkCudaErrors(cudaStreamSynchronize(stream));
system("nvidia-smi");
checkCudaErrors(cudaGraphExecDestroy(instance));
checkCudaErrors(cudaGraphDestroy(graph));
system("nvidia-smi");
From the numbers reported by nvidia-smi, I notice that creating the CUDAGraph causes GPU memory consumption to go up slightly, and furthermore that destroying the CUDAGraph data structures (the cudaGraphExecDestroy and cudaGraphDestroy calls near the end) does not reclaim that memory. May I ask (1) why CUDAGraphs need to reserve memory and (2) whether there is a proper way to reclaim the memory that was allocated?
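To make the measurement reproducible without shelling out to nvidia-smi, the same sequence can be bracketed with cudaMemGetInfo. The following is only a minimal sketch: the CHECK macro, the reportUsed helper, and the body of shortKernel are hypothetical stand-ins (my real helper and kernel are not shown above). The warm-up launch is included on the assumption that part of the increase might come from one-time context or module initialization rather than from the graph itself. Note that cudaMemGetInfo reports device-wide numbers, so other processes on the same GPU will affect the readings.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical stand-in for the checkCudaErrors helper used in the post.
#define CHECK(call)                                                    \
    do {                                                               \
        cudaError_t err_ = (call);                                     \
        if (err_ != cudaSuccess) {                                     \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,         \
                    cudaGetErrorString(err_));                         \
            exit(EXIT_FAILURE);                                        \
        }                                                              \
    } while (0)

// Assumed kernel body; the original shortKernel is not shown.
__global__ void shortKernel(float *a) { a[0] += 1.0f; }

// Print device memory in use (device-wide: total minus free).
static void reportUsed(const char *label)
{
    size_t freeBytes = 0, totalBytes = 0;
    CHECK(cudaMemGetInfo(&freeBytes, &totalBytes));
    printf("%-24s used = %zu KiB\n", label, (totalBytes - freeBytes) / 1024);
}

int main()
{
    float *A = nullptr;
    cudaStream_t stream;
    cudaGraph_t graph;
    cudaGraphExec_t instance;

    CHECK(cudaMalloc(&A, sizeof(float)));
    CHECK(cudaMemset(A, 0, sizeof(float)));
    CHECK(cudaStreamCreate(&stream));

    // Warm-up launch so one-time initialization happens before the first reading.
    shortKernel<<<1, 1, 0, stream>>>(A);
    CHECK(cudaStreamSynchronize(stream));
    reportUsed("before capture");

    // Capture 1000 kernel launches into a graph, as in the post.
    CHECK(cudaStreamBeginCapture(stream, cudaStreamCaptureModeGlobal));
    for (int ikrnl = 0; ikrnl < 1000; ikrnl++) {
        shortKernel<<<1, 1, 0, stream>>>(A);
    }
    CHECK(cudaStreamEndCapture(stream, &graph));

#if CUDART_VERSION >= 12000
    // CUDA 12 signature: (exec, graph, flags).
    CHECK(cudaGraphInstantiate(&instance, graph, 0));
#else
    // CUDA 10/11 signature, matching the call in the post.
    CHECK(cudaGraphInstantiate(&instance, graph, NULL, NULL, 0));
#endif

    CHECK(cudaGraphLaunch(instance, stream));
    CHECK(cudaStreamSynchronize(stream));
    reportUsed("after graph launch");

    CHECK(cudaGraphExecDestroy(instance));
    CHECK(cudaGraphDestroy(graph));
    CHECK(cudaDeviceSynchronize());
    reportUsed("after graph destroy");

    CHECK(cudaStreamDestroy(stream));
    CHECK(cudaFree(A));
    return 0;
}
```

Comparing the three readings should show whether the extra memory appears at instantiate/launch time and whether it is still held after the two destroy calls.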
Thank you.