CUDA and GPU memory: how to get CUDA to exit cleanly when a routine demands too much memory

Perhaps memory issues are described elsewhere on these fora or in the documentation, but I’ll post anyway…

Several times I’ve run into the problem where the code I run demands more memory than the CUDA device has. What happens is that the routine runs anyway (perhaps with some funky behavior on the screen) and seems to complete, but the answers are all wrong (or all zeros). If one is using the CUDA device for graphics at the same time, it is also unclear to me how graphics and compute memory are separated; I guess the whole thing crashes when they overlap.

What I would like to see, in the best of all possible worlds, is for the routine to stop immediately with an “Out of Memory” complaint. Is there some test for that? Can one design a short initial routine that would test for it? Are there ways to measure GPU memory usage so that it could be monitored while a routine runs? Should the compiler address this issue? Such a test would save a lot of time (rather than waiting for the routine to finish, only to discover that the answers are all wrong!) and could keep one’s computer or X11 desktop from crashing. Aesthetically, a clean exit would be far better than a nebulous “maybe it completed o.k.” or a crashed machine.

Could there be a flag for starting X11 that would define the memory allocated to graphics as opposed to compute? This would put a bit of a firewall between graphics and compute memory, and give a clearly defined memory limit for compute. I have in mind something like the motherboard BIOS settings for onboard graphics memory, except applied to the GPU. (Apologies for my ignorance here…but one never knows until one asks…)

I suppose if I were clever and methodical, I would calculate the numbers ahead of time…but who has time to do that…

You could check the return codes of your cudaMalloc calls?

Return codes! What a concept! :) (I don’t code in C too often…this is the obvious answer to my question…)

To be specific, the following solves most of the problem for me:

cudaError_t cudaError;

...

/* cudaMalloc reports failure (e.g. out of device memory) via its return code. */
cudaError = cudaMalloc((void **)&A, ND*NN * sizeof(A[0]));
if (cudaError != cudaSuccess) { mexErrMsgTxt("Out of Nvidia device memory."); }

or

cublasStatus status;

...

/* cublasAlloc likewise reports allocation failure through its status code. */
status = cublasAlloc(Ic*Ic, sizeof(float), (void **)&G);
if (status != CUBLAS_STATUS_SUCCESS) { mexErrMsgTxt("Out of Nvidia device memory."); }

The “mexErrMsgTxt” call is used because I am working with MEX files; it has the handy property that it halts MATLAB with an error. A plain “return;” does not stop MATLAB, which is bad in my case.
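Following up on the other part of my original question (measuring GPU memory usage): the runtime API has a cudaMemGetInfo() call that reports the free and total device memory, so a short initial routine can bail out before any allocation is attempted. Below is a minimal stand-alone sketch, assuming a runtime recent enough to provide cudaMemGetInfo(); the ND and NN dimensions are just hypothetical stand-ins for one’s actual problem size:

#include <stdio.h>
#include <cuda_runtime.h>

int main(void)
{
    const size_t ND = 4096, NN = 4096;        /* hypothetical problem dimensions */
    size_t needed = ND * NN * sizeof(float);  /* bytes we intend to cudaMalloc */
    size_t freeBytes = 0, totalBytes = 0;

    /* Query how much device memory is currently free. */
    if (cudaMemGetInfo(&freeBytes, &totalBytes) != cudaSuccess) {
        fprintf(stderr, "Could not query device memory.\n");
        return 1;
    }
    printf("Device memory: %lu bytes free of %lu total\n",
           (unsigned long)freeBytes, (unsigned long)totalBytes);

    /* Refuse to run rather than produce garbage answers later. */
    if (needed > freeBytes) {
        fprintf(stderr, "Out of Nvidia device memory: need %lu bytes.\n",
                (unsigned long)needed);
        return 1;
    }
    /* ...safe to proceed with the cudaMalloc calls shown above... */
    return 0;
}

A nice side effect: since the driver reports what is actually free, memory already taken by the X11 desktop is excluded from the free figure, which partly addresses the graphics-versus-compute question too.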

Perhaps obvious to everyone (including me, now), but it’s always nice to resolve nagging issues!
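P.S. One more check that seems relevant to the original “runs anyway but the answers are all zeros” symptom: a kernel launch itself returns no status, so a failed launch can go unnoticed unless one asks the runtime afterwards with cudaGetLastError(). A minimal sketch (the fillOnes kernel and its sizes are hypothetical, just to have something to launch):

#include <stdio.h>
#include <cuda_runtime.h>

/* Hypothetical trivial kernel, just to have something to launch. */
__global__ void fillOnes(float *a, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) a[i] = 1.0f;
}

int main(void)
{
    int n = 1 << 20;
    float *d_a = NULL;

    /* Check the allocation, as above. */
    if (cudaMalloc((void **)&d_a, n * sizeof(float)) != cudaSuccess) {
        fprintf(stderr, "Out of Nvidia device memory.\n");
        return 1;
    }

    fillOnes<<<(n + 255) / 256, 256>>>(d_a, n);

    /* The launch returns no status; ask the runtime whether it failed. */
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        fprintf(stderr, "Kernel launch failed: %s\n", cudaGetErrorString(err));
        cudaFree(d_a);
        return 1;
    }

    cudaFree(d_a);
    return 0;
}

In a MEX file, the fprintf/return pair would be replaced by mexErrMsgTxt(cudaGetErrorString(err)) so that MATLAB stops cleanly.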