`nvidia-smi` Performance degradation

It doesn’t need 52GB of memory. It is mapping a virtual address (VA) space, i.e. carving out a virtual reservation; this is a typical activity involved in “starting up a GPU”. When a GPU is truly/completely idle (no process running on it, persistence mode not enabled) it has no bearing on the machine’s VA space. It is invisible (mostly, leaving aside things like the BARs and other things visible in I/O space).

When you start up a GPU to do “real work”, the CUDA runtime (the operating system for the GPU) needs to make it possible for the various CUDA allocators to request allocations within a VA space that has already been reserved for that purpose. What you are seeing is that VA space reservation being made.
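
If you want to convince yourself that the big number is virtual reservation rather than memory actually in use, on Linux you can compare VmSize against VmRSS for the process holding the CUDA context. Here is a rough Python sketch of that check; the PID is whatever process you happen to be inspecting, and the /proc field names are Linux-specific:

```python
# Rough sketch (Linux only): compare virtual vs. resident memory for a
# process that has a CUDA context. A large VA reservation shows up in
# VmSize (address space reserved) but not in VmRSS (pages actually resident).
import sys

def vm_usage(pid: int) -> dict:
    """Return VmSize and VmRSS (in kB) parsed from /proc/<pid>/status."""
    fields = {}
    with open(f"/proc/{pid}/status") as f:
        for line in f:
            if line.startswith(("VmSize:", "VmRSS:")):
                key, value, _unit = line.split()
                fields[key.rstrip(":")] = int(value)
    return fields

if __name__ == "__main__":
    if len(sys.argv) != 2:
        sys.exit("usage: vmcheck.py <pid of the process using the GPU>")
    usage = vm_usage(int(sys.argv[1]))
    print(f"VmSize (virtual, reserved): {usage['VmSize'] / 1e6:.1f} GB")
    print(f"VmRSS  (resident, in use):  {usage['VmRSS'] / 1e6:.1f} GB")
```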

Some requests that you might make of nvidia-smi require it to “start up the GPU”. Some don’t. Thus you may see variation in behavior, depending on what exactly you request on the command line.
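
One crude way to see which invocations pay the startup cost on your machine is simply to time a few of them. A rough sketch (the particular flags below are just examples, and which ones end up attaching to the GPU can vary with driver version and persistence mode):

```python
# Rough sketch: time a few nvidia-smi invocations to see which ones appear
# to "start up the GPU". Which flags are cheap vs. expensive can depend on
# driver version and persistence mode; the commands here are just examples.
import subprocess
import time

COMMANDS = [
    ["nvidia-smi", "-L"],                  # list GPUs
    ["nvidia-smi", "-q", "-d", "MEMORY"],  # query memory details
    ["nvidia-smi"],                        # default summary
]

for cmd in COMMANDS:
    start = time.perf_counter()
    subprocess.run(cmd, stdout=subprocess.DEVNULL, check=True)
    print(f"{' '.join(cmd):35s} {time.perf_counter() - start:6.2f} s")
```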

And I’m not suggesting this is intended to be a detailed description of all possible questions that could be asked in this vein, or even of all the questions you have asked, nor would I be able to provide such a description. For example, I’m not suggesting that having persistence mode enabled makes every issue go away, although it demonstrably helps in some cases. If you don’t have persistence mode enabled (which might happen after a driver upgrade), you should probably re-enable it.
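
As a quick check after a driver upgrade, you can ask nvidia-smi itself whether persistence mode is still enabled. A rough sketch along those lines (it assumes the persistence_mode query field and its Enabled/Disabled values; actually re-enabling it, via `nvidia-smi -pm 1` or the nvidia-persistenced daemon, needs root):

```python
# Rough sketch: report persistence mode per GPU so a driver upgrade that
# silently disabled it gets noticed. Re-enabling (e.g. "nvidia-smi -pm 1"
# or running nvidia-persistenced) requires root and is not done here.
import subprocess

result = subprocess.run(
    ["nvidia-smi",
     "--query-gpu=index,name,persistence_mode",
     "--format=csv,noheader"],
    capture_output=True, text=True, check=True,
)

for line in result.stdout.strip().splitlines():
    index, name, mode = [field.strip() for field in line.split(",")]
    note = "" if mode == "Enabled" else "  <- consider re-enabling"
    print(f"GPU {index} ({name}): persistence mode {mode}{note}")
```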

And if you observe that, in otherwise identical scenarios, a particular nvidia-smi invocation from one driver version runs very quickly whereas the same invocation from another driver version runs very slowly, that would probably be a candidate for a bug report. I’m running R565 drivers on a particular machine of mine, and nvidia-smi runs in less than a second on a single GPU with 8GB of system RAM.
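
If you do go down the bug-report route, it helps to quote timings that are reproducible rather than a single run. Something along these lines, run under each driver version you are comparing:

```python
# Rough sketch: run the same nvidia-smi command several times and report
# min and median wall-clock time, so the numbers quoted in a bug report
# (e.g. comparing two driver versions) are reproducible.
import statistics
import subprocess
import time

CMD = ["nvidia-smi"]
RUNS = 5

timings = []
for _ in range(RUNS):
    start = time.perf_counter()
    subprocess.run(CMD, stdout=subprocess.DEVNULL, check=True)
    timings.append(time.perf_counter() - start)

print(f"min: {min(timings):.2f} s   median: {statistics.median(timings):.2f} s")
```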

Is that a cloud machine?