How to find out GPU memory usage (from the kernel, not the host)

Hi,

I am running out of memory for cudaMalloc calls on the device side (i.e. I am allocating in the kernel, not on the host).

Is there a way to monitor how much free memory there is on the GPU as I step through the device function in debug mode?

I see that cudaMemGetInfo is a host-only function.
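
For reference, this is the kind of host-side check I can do between launches (a minimal sketch), which doesn't help me while stepping through device code:

```cpp
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    size_t freeBytes = 0, totalBytes = 0;
    // Host-only query: reports free/total global memory on the current device.
    if (cudaMemGetInfo(&freeBytes, &totalBytes) == cudaSuccess)
        std::printf("free: %zu MB, total: %zu MB\n",
                    freeBytes >> 20, totalBytes >> 20);
    return 0;
}
```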

I am on Windows, so I am looking at the Nsight window showing memory allocations, but bizarrely it doesn't seem to show any significant allocations at the point where cudaMalloc returns 2 (cudaErrorMemoryAllocation, i.e. out of memory). I don't think I am fragmenting the memory badly, so I would like to check the true GPU memory usage.

Any ideas appreciated.

You can adjust the amount of memory available for allocation on the device with cudaDeviceSetLimit and cudaLimitMallocHeapSize (see the sketch below).

I don’t know of a way to query how much is used.
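
For the adjusting part, something like this is what I mean. A minimal sketch: the 256 MB value and the `allocInKernel` name are just examples, and I'm using in-kernel malloc here (device-side cudaMalloc draws from the same heap):

```cpp
#include <cstdio>
#include <cuda_runtime.h>

__global__ void allocInKernel()
{
    // In-kernel allocation draws from the device heap, whose size is
    // controlled by cudaLimitMallocHeapSize (the default is only 8 MB).
    void* p = malloc(1 << 20);  // 1 MB per thread, as an example
    if (p == nullptr)
        printf("thread %d: device heap exhausted\n", threadIdx.x);
    else
        free(p);
}

int main()
{
    // Must be called before the first launch of any kernel that uses
    // the device heap; 256 MB is just an example value.
    cudaDeviceSetLimit(cudaLimitMallocHeapSize, 256u << 20);

    allocInKernel<<<1, 32>>>();
    cudaDeviceSynchronize();
    return 0;
}
```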

Thank you. I realized this shortly after posting; it was my error for not setting the heap limit high enough with cudaDeviceSetLimit before allocating.
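
In case it helps anyone else, you can confirm the current heap limit from the host like this (a minimal sketch; the default of 8 MB was far too small for my allocations):

```cpp
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    size_t heapSize = 0;
    // Query the current device-side malloc heap size.
    cudaDeviceGetLimit(&heapSize, cudaLimitMallocHeapSize);
    std::printf("device malloc heap: %zu MB\n", heapSize >> 20);
    return 0;
}
```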