Kernel maxing out GPU memory when it definitely should not be

fherder · October 29, 2018, 4:16pm

Hi,

I have a program that launches several thousand blocks of variable thread size (configured by launch parameters) on our P100. What I’ve noticed is that my pen-and-paper calculations show that it should only be using ~200 MB of memory, however when I run nvidia-smi it shows ~600 MB of memory being used. What initially alerted me to this discrepancy was one of our performance monitoring tools reporting periods of 100% device memory utilization - the amount of memory I’m using should not even be approaching the 12GB available, and it doesn’t match up with the numbers nvidia-smi is reporting, so this was very alarming! The last strange part of this whole problem is that the performance tool only reports max memory usage (I have it email me whenever this happens) when I use a specific thread block configuration; it doesn’t appear as if using other configurations triggers this effect. I’m very certain that I’m the only one utilizing the device when this happens.

Any ideas? I’m kind of at a loss as to what’s going on here.

Robert_Crovella · October 29, 2018, 4:20pm

When a GPU device is idle, it may report nearly 0 memory utilization.

When you run a program on the GPU, two things will consume memory:

The memory needs of your program.
The memory overhead of CUDA.

CUDA can easily use 400MB of overhead on a GPU. However this overhead should be roughly constant, whether your program needs 200MB of memory or 10GB of memory. So if your program uses 200MB of memory and nvidia-smi reports 600MB used, that doesn’t surprise me. Some things may affect this overhead, such as whether the GPU is driving a display, but that should be evident in the idle state also.

Regarding the behavior of your performance monitoring tool which you haven’t specified, I have no idea about that. It may have a bug. On the other hand, I’m pretty confident that nvidia-smi gives a reasonably accurate representation of memory used on a device.

fherder · October 29, 2018, 4:29pm

That information on CUDA’s constant memory overhead is very useful, thank you! That’s one anomaly potentially resolved, at least; I’ll have to take things up with our sysadmin to see if perhaps there’s a bug in the performance monitoring tool we’re using.

Thanks!

Topic		Replies	Views
Why does the GPU memory used by process not add up to memory used according to nvidia-smi? Video Processing & Optical Flow	2	1302	October 12, 2021
Memory usage values in nvidia-smi command CUDA Programming and Performance	4	1964	November 21, 2023
Weird memory usage for idle GPU CUDA Programming and Performance	3	2355	December 19, 2013
Different CUDA memory usage between nvidia-smi and cudaMemGetInfo CUDA Programming and Performance	0	1279	September 19, 2019
cudaMemGetInfo returns similar result for 3 different GPUs CUDA Programming and Performance cuda , nvbugs	5	460	January 23, 2024
Strange memory consumption on the device CUDA Programming and Performance	9	3654	July 13, 2017
GPU Memory Less Than Promised CUDA Programming and Performance	19	3426	December 15, 2022
How to choose how many threads/blocks to have? CUDA Programming and Performance	43	53851	June 7, 2022
Unable to profile the memory utilization with V100 CUDA Programming and Performance	0	387	October 7, 2019
All GPU memory in use and no program running Monitoring/Assessment Tools	1	10054	September 16, 2020

Kernel maxing out GPU memory when it definitely should not be

Related topics