Restart single device on multi-GPU machine


Sometimes my code ‘corrupts’ the device due to programming errors.
I have a machine with a Quadro FX 5800, a GTX 295 and two Teslas, so 5 devices (the GTX 295 is a dual-GPU card).
Can I restart/reinitialise a single device (a Tesla) that is not connected to an X server,
and if so, how can I do it?

Because all the cards are enumerated from a single instance of the kernel driver, I don’t think you can. Your only likely alternative to rebooting would be to stop all processes using any of the cards (including X11), then rmmod the driver and insmod it again. Even then, I am not sure that would be enough: it depends on whether the card or driver relies on the VGA BIOS to leave the card in a certain state for the driver. In such cases, a hosed card might still be hosed after the driver reloads.
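
Roughly, the reload step could look like the sketch below. It assumes the kernel module is named nvidia, that you run it as root, and that X and every CUDA job have already been stopped (how you stop X depends on your distribution, so that part is left out). It uses modprobe instead of insmod only because modprobe finds the module file by name. The run helper and the DRY_RUN switch are made up for this example: by default the script only prints what it would do.

```shell
#!/bin/sh
# Sketch only: reload the NVIDIA kernel module after all GPU users are stopped.
# Assumptions (not from the thread): module name "nvidia", root privileges,
# X and all CUDA processes already stopped.
# DRY_RUN=1 (the default here) prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print or execute a command depending on DRY_RUN.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

reload_nvidia_driver() {
    # Unloading fails if any process still has the driver in use.
    run rmmod nvidia || return 1
    # Load the module again; this re-enumerates every card, not just one.
    run modprobe nvidia || return 1
}

reload_nvidia_driver
```

Set DRY_RUN=0 to actually run the commands. If rmmod refuses to unload the module, something is still using one of the cards.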

Thanks, it works this way at least. Nevertheless, I have to stop my simulations on the other cards.