Manual release of GPU memory: how to fix memory leak problems when relaunching the job

arabarra · February 21, 2011, 10:45am

Hi all,

I’m getting tired of having to reboot my machine again and again… Apparently, when the program is interrupted abnormally, cudaFree never gets to release the GPU memory, and my system ends up running out of GPU memory when I relaunch the job.

Clearly something went wrong with the device I use:
[cuda]$ gpumeminfo
Detected 2 GPU
!!! cuMemGetInfo failed! (status = c9)
^^^^ Device: 0
^^^^ Free : 0 bytes (0 KB) (0 MB)
^^^^ Total: 0 bytes (0 KB) (0 MB)
^^^^ nan% free, nan% used
^^^^ Device: 1
^^^^ Free : 109314048 bytes (106752 KB) (104 MB)
^^^^ Total: 131399680 bytes (128320 KB) (125 MB)
^^^^ 83.192020% free, 16.807980% used

My question: is there a better way to release the “trapped” memory on the GPU without rebooting the machine each time?

Thanks in advance,
Daniel

chaohuang · March 2, 2012, 9:34pm

I have the same question: how can I manually release all of the GPU (device) memory? Thanks!
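
One way to reduce the problem described above is to make sure an interrupted job still reaches its cleanup code. The sketch below is only an illustration under stated assumptions: `run_job`, `request_stop`, `g_stop` and `g_live_allocs` are hypothetical names, the job is assumed to be a host loop that can poll a flag between steps, and the stubs in the `#else` branch exist only so the control flow can be tried without a GPU. It catches SIGINT and SIGTERM, leaves the loop, and then calls `cudaFree` and `cudaDeviceReset` (older toolkits used `cudaThreadExit`). It cannot help with SIGKILL or a crash inside the process, and it does not repair a device that is already in an error state.

```cpp
// Sketch: route Ctrl-C / SIGTERM through the normal cleanup path.
// Compiled with nvcc it uses the real CUDA runtime; compiled with a plain
// C++ compiler it falls back to host-only stubs (an assumption for dry runs).
#include <cassert>
#include <csignal>
#include <cstddef>
#include <cstdlib>

#ifdef __CUDACC__
#include <cuda_runtime.h>
#else
// Hypothetical stand-ins that mimic only the success path of the runtime calls.
typedef int cudaError_t;
static const cudaError_t cudaSuccess = 0;
static int g_live_allocs = 0;  // counts stub allocations that were not freed
static cudaError_t cudaMalloc(void** p, std::size_t n) {
    *p = std::malloc(n);
    if (*p == nullptr) return 1;
    ++g_live_allocs;
    return cudaSuccess;
}
static cudaError_t cudaFree(void* p) {
    std::free(p);
    --g_live_allocs;
    return cudaSuccess;
}
static cudaError_t cudaDeviceReset() { return cudaSuccess; }
#endif

// Set by the signal handler, polled by the job loop.
static volatile std::sig_atomic_t g_stop = 0;

// The handler only sets a flag; all CUDA calls stay in normal code.
extern "C" void request_stop(int) { g_stop = 1; }

// Hypothetical job: allocates one device buffer, runs up to max_steps steps,
// and always frees the buffer before returning the number of steps completed.
int run_job(int max_steps, std::size_t bytes) {
    assert(bytes > 0);
    std::signal(SIGINT, request_stop);
    std::signal(SIGTERM, request_stop);

    void* d_buf = nullptr;
    if (cudaMalloc(&d_buf, bytes) != cudaSuccess) return -1;

    int step = 0;
    while (step < max_steps && !g_stop) {
        // ... launch kernels / copy data for one step here ...
        ++step;
    }

    cudaFree(d_buf);     // release the buffer explicitly
    cudaDeviceReset();   // tear down this process's state on the current device
    return step;
}
```

A `main` function would simply call something like `run_job(1000, 64 << 20)`. Polling a flag was chosen over calling CUDA functions inside the handler because signal handlers should do as little as possible. If memory still appears to be held after the program has exited, it may be worth checking with `nvidia-smi` whether an earlier instance of the process is still alive, since device memory is normally tied to the lifetime of the process that allocated it.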
