Memory profiler

Hello,

I was just wondering if there is some way to trace how much global/constant/shared memory is used at any point in time while the kernel is executing. I would also like to see the maximum usage each of these reaches.

Thanks,
Neelam

Constant and shared memory are statically allocated, so their sizes are known at compile time. You just need to add the dynamic shared memory size passed as an argument to the kernel launch, if any. The used/available device memory can be queried with the driver API function cuMemGetInfo.
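
For illustration (a toy example of my own, not from the manual), here is where the per-block shared memory total comes from: the statically declared array plus the optional dynamic buffer, whose size is the third launch-configuration argument:

// Hypothetical kernel, just to show static vs. dynamic shared memory.
__global__ void scaleKernel(float *data, int n)
{
    __shared__ float staticBuf[128];      // static shared mem, size fixed at compile time
    extern __shared__ float dynamicBuf[]; // dynamic shared mem, sized at launch

    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        dynamicBuf[threadIdx.x] = data[i];
        staticBuf[threadIdx.x] = dynamicBuf[threadIdx.x];
        data[i] = staticBuf[threadIdx.x] * 2.0f;
    }
}

void launch(float *devData, int n)
{
    int threads = 128;
    int blocks  = (n + threads - 1) / threads;
    size_t dynamicBytes = threads * sizeof(float);   // the "dynamic size" argument
    // total shared mem per block = sizeof(staticBuf) + dynamicBytes
    scaleKernel<<<blocks, threads, dynamicBytes>>>(devData, n);
}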

Peter

Hi Peter,

Thanks for your reply, but we couldn’t find any reference to the driver API function cuMemGetInfo that you mentioned. Could you tell us a little more about how to use it?

Thanks,

Neelam

Yeah, just checked - cuMemGetInfo is indeed missing from the manual. Hello NVIDIA … :wave:

Other forum threads have also described errors in the manual, which is why I usually consult the headers directly: see cuda.h for the driver API and cuda_runtime_api.h for the runtime API. The driver and runtime APIs cannot be mixed (the runtime keeps internal state), except for the …GetInfo and MemAllocSystem/MemFreeSystem calls.

CUresult cuMemGetInfo(unsigned int *free, unsigned int *total);
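
Used roughly like this (a minimal sketch of my own, not from the manual; you need cuInit and a context before the query, and note that later toolkit releases declare the parameters as size_t instead of unsigned int):

#include <stdio.h>
#include <cuda.h>   // driver API header

int main(void)
{
    CUdevice dev;
    CUcontext ctx;
    unsigned int freeMem, totalMem;   // size_t in later toolkits

    cuInit(0);                        // the driver API must be initialized first
    cuDeviceGet(&dev, 0);
    cuCtxCreate(&ctx, 0, dev);        // a context is required for memory queries

    cuMemGetInfo(&freeMem, &totalMem);
    printf("Free: %u bytes of %u total\n", freeMem, totalMem);

    cuCtxDestroy(ctx);
    return 0;
}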

Hope that helps.

Peter