When I dynamically add GPUs to a Linux host via a PCI fabric, the new GPUs show up in lspci, but no /dev/nvidia* nodes are created for them. I have to run nvidia-smi to create the /dev/nvidia* nodes.
When I dynamically remove the GPUs, they no longer appear in lspci, but the /dev/nvidia* nodes remain and running nvidia-smi gives an error. The only way I’ve found to remove the /dev/nvidia* nodes is to reboot, which I want to avoid.
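To make the mismatch concrete, here is a rough sketch of how I compare the two views. It counts NVIDIA display-class devices in sysfs and the numbered /dev/nvidiaN nodes. The function names and the SYSFS_PCI and DEV_DIR variables are my own, and the class filter assumes the GPUs report as PCI display controllers (class 0x03xxxx). It only reports the difference and does not remove anything.

```shell
#!/bin/sh
# Hypothetical helper: compare NVIDIA GPUs on the PCI bus with /dev/nvidiaN nodes.
# SYSFS_PCI and DEV_DIR are overridable so the sketch can be tried on a fake tree.
SYSFS_PCI="${SYSFS_PCI:-/sys/bus/pci/devices}"
DEV_DIR="${DEV_DIR:-/dev}"

# Count PCI devices with NVIDIA's vendor ID (0x10de) and a display-controller class.
count_pci_gpus() {
    n=0
    for dev in "$SYSFS_PCI"/*; do
        [ -r "$dev/vendor" ] && [ -r "$dev/class" ] || continue
        vendor=$(cat "$dev/vendor")
        class=$(cat "$dev/class")
        case "$vendor:$class" in
            0x10de:0x03*) n=$((n + 1)) ;;
        esac
    done
    echo "$n"
}

# Count numbered device nodes (nvidia0, nvidia1, ...); nvidiactl and nvidia-uvm
# do not match this glob.
count_dev_nodes() {
    n=0
    for node in "$DEV_DIR"/nvidia[0-9]*; do
        if [ -e "$node" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

echo "GPUs on PCI bus:   $(count_pci_gpus)"
echo "/dev/nvidiaN nodes: $(count_dev_nodes)"
```

After a hot-remove, the second count stays higher than the first until the host is rebooted.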
Is there a way to get the NVIDIA driver or software to remove the corresponding /dev/nvidia* nodes when a GPU is removed from the PCI fabric?
NVIDIA-SMI 440.33.01 Driver Version: 440.33.01 CUDA Version: 10.2