I have a server with 4 Tesla GPUs and I would like to shut down 2 of them. Unfortunately I have other devices connected to the PCIe bus, so I can’t just shut down the bus. I was wondering if there is a way to shut down a GPU using the NVIDIA drivers or nvidia-smi. I found a way to set the power limit, but not a way to shut a GPU down.
AFAIK, that’s not possible.
I found a way to do it through nvidia-smi. I just didn’t know what drain mode meant.
# disable persistence mode on the selected GPU
sudo nvidia-smi -i 0000:xx:00.0 -pm 0
# put the GPU into drain mode (power idling)
sudo nvidia-smi drain -p 0000:xx:00.0 -m 1
# re-enable persistence mode
sudo nvidia-smi -pm 1
This hides the selected GPU from other applications, which then use only the GPUs that are not in drain mode.
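In case it helps anyone, here is a rough sketch that wraps the three commands above in a shell function so they can be applied to any GPU by its PCI bus ID. The function name drain_gpu and the DRY_RUN variable are my own inventions, not part of nvidia-smi; with DRY_RUN=1 the commands are only printed, which is a cheap way to check them before touching a real GPU. I am assuming the bus ID is taken from something like nvidia-smi --query-gpu=pci.bus_id --format=csv.

```shell
#!/usr/bin/env bash
# Sketch only: drain_gpu and DRY_RUN are hypothetical names, not nvidia-smi features.

drain_gpu() {
    local bus_id="$1"   # PCI bus ID of the GPU, in the form 0000:xx:00.0
    local run="sudo"

    # With DRY_RUN=1 each command is echoed instead of executed.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        run="echo sudo"
    fi

    # $run is intentionally unquoted so "echo sudo" splits into two words.
    $run nvidia-smi -i "$bus_id" -pm 0        # persistence mode off for this GPU
    $run nvidia-smi drain -p "$bus_id" -m 1   # put this GPU into drain mode
    $run nvidia-smi -pm 1                     # persistence mode back on
}
```

Usage would be something like DRY_RUN=1 drain_gpu 0000:xx:00.0 to preview, then drain_gpu 0000:xx:00.0 to apply. As far as I understand, running the drain command with -m 0 instead of -m 1 should take the GPU out of drain mode again, but I have not verified that on every driver version.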