Hi, guys. I’m new to Linux and GPUs.
I want to confirm whether the latest kernel supports NVIDIA GPU hot (un)plug, meaning plugging or unplugging the GPU device while the server is running.
I’ve tried removing the GPU in Linux with “echo 1 > /sys/bus/pci/devices/0000:****/remove” and that works. After that, can I physically unplug the GPU?
Our server uses an A100, which is too expensive to risk, so I have not dared to actually unplug it.
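For reference, the software-side removal step mentioned above can be sketched as a pair of small shell functions. This is only a sketch under assumptions: the function names and the `SYSFS_PCI` variable are hypothetical (the variable exists so the functions can be tried against a scratch directory), the device address is passed in by the caller, and the rescan step uses the standard `/sys/bus/pci/rescan` attribute, which is not something tried in this post. It only covers logical removal and re-detection and says nothing about whether physically pulling the card is safe.

```shell
#!/bin/sh
# Sketch only: logical (software-side) removal and rescan of a PCI device.
# SYSFS_PCI is a hypothetical override; it defaults to the real sysfs path.
SYSFS_PCI="${SYSFS_PCI:-/sys/bus/pci}"

# Ask the kernel to detach the device at the given PCI address
# (domain:bus:device.function). Needs root on a real system.
pci_remove() {
    dev="$SYSFS_PCI/devices/$1"
    if [ ! -e "$dev/remove" ]; then
        echo "no such PCI device: $1" >&2
        return 1
    fi
    echo 1 > "$dev/remove"
}

# Ask the kernel to rescan all PCI buses so that removed or newly
# attached devices are detected again.
pci_rescan() {
    echo 1 > "$SYSFS_PCI/rescan"
}
```

Usage would be something like `pci_remove <address>` followed later by `pci_rescan`, both run as root, with the address taken from `lspci -D`.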
I also looked through open-gpu-kernel-modules. It contains some hotplug keywords, but I think those relate to the kernel module itself.
Is there any clear documentation related to this?
Thanks.