Nvidia modprobe not found after power outage

There was a recent thunderstorm in the city where I live, and it caused a city-wide power outage. Luckily my computer and many others were connected to stabilizers, but that didn’t stop them from losing power.

Today I turned all of the computers back on. The small cluster computer that has an Nvidia Titan X rebooted without problems, but it can’t run any CUDA-related operations (it was running some when the power went out). I can still add the CUDA path and nvcc is detected, but when I run a MATLAB script that uses the graphics card for CUDA processing, I get a “FATAL error: nvidia modprobe” not found. I’m not sure what happened. When I look in the /dev/ folder, the nvidia device doesn’t show up.

However, when I type the following, the card is still listed:

$ lspci | grep VGA
VGA compatible controller: NVIDIA Corporation Device 17c2 (rev a1)
VGA compatible controller: Matrox Electronics Systems Ltd. MGA G200e

So I’m not really sure whether the system detects the graphics card or not. I also don’t know what the next step is: should I open the computer and check whether the graphics card was damaged by the power outage? Is there any way to check the current status (hardware and software) of my graphics card from the terminal?

Other details:
The ./deviceQuery command from the CUDA Samples outputs:
modprobe: FATAL : Module nvidia not found.
cudaGetDeviceCount returned 38
→ No CUDA-capable device is detected.
Result = FAIL

My second option is to re-install the graphics card, but at this point I don’t even know if it’s working or not…

If you have root privileges, try running ./deviceQuery as root.

If that doesn’t work, then try reinstalling the GPU driver (or all of CUDA).

You may also want to take a look at (as root):

dmesg | grep NVRM

to see if there is anything interesting there.
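
For what it's worth, these checks can be bundled into one small script. This is only a sketch: the helper name `count_nvrm` and the `--run` switch are made up for illustration, and `dmesg` may need root on some systems.

```shell
#!/bin/sh
# Sketch only: collect the basic driver checks in one place.

# count_nvrm (hypothetical helper): reads kernel log text on stdin and
# prints how many lines mention NVRM, the tag used by the NVIDIA kernel module.
count_nvrm() {
    grep -c 'NVRM' || true
}

# Run the live checks only when asked to, so the helper can be reused alone.
if [ "${1:-}" = "--run" ]; then
    echo "PCI devices:"
    lspci | grep -i nvidia || echo "  (no NVIDIA device listed)"
    echo "Loaded kernel modules:"
    lsmod | grep '^nvidia' || echo "  (nvidia module not loaded)"
    echo "Device nodes:"
    ls /dev/nvidia* 2>/dev/null || echo "  (no /dev/nvidia* nodes)"
    echo "NVRM lines in kernel log: $(dmesg | count_nvrm)"
fi
```

Saved under a name of your choosing and run as root with `--run`, it prints one short section per check. A count of 0 NVRM lines means the kernel log has no record of the driver.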

Running ./deviceQuery as root doesn’t work; I get the same error.

I ran the dmesg command and nothing shows up. Is this OK?

No, it’s not OK. Best case would be to see something like this:

[   15.227592] NVRM: loading NVIDIA UNIX x86_64 Kernel Module  352.39  Fri Aug 14 18:09:10 PDT 2015

If you don’t even see that, it means the driver is not even attempting to load.

At the very least, your system configuration is corrupted. I would try reinstalling the driver.
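
Before reinstalling, one way to confirm that the module file really is missing for the running kernel is sketched below. It assumes the driver installs a file matching `nvidia*.ko*` somewhere under `/lib/modules/$(uname -r)`; the exact name and location vary with how the driver was packaged. `find_nvidia_modules` and the `--run` switch are made-up names for illustration.

```shell
#!/bin/sh
# Sketch only: look for NVIDIA kernel module files under a modules directory.

# find_nvidia_modules (hypothetical helper): prints every file under the
# given directory whose name looks like an NVIDIA module (nvidia*.ko*).
find_nvidia_modules() {
    find "$1" -type f -name 'nvidia*.ko*' 2>/dev/null
}

# Run the live check only when asked to.
if [ "${1:-}" = "--run" ]; then
    moddir="/lib/modules/$(uname -r)"
    found=$(find_nvidia_modules "$moddir")
    if [ -n "$found" ]; then
        echo "NVIDIA module files for the running kernel:"
        echo "$found"
    else
        echo "No nvidia*.ko* files under $moddir; a driver reinstall is likely needed."
    fi
fi
```

If nothing is found, that would be consistent with the "Module nvidia not found" message from modprobe, which looks for modules under the running kernel's directory.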

OK, I will try reinstalling the driver now. Will keep you posted.

Driver re-installed from here:
https://devtalk.nvidia.com/default/topic/878117/cuda-setup-and-installation/-solved-titan-x-for-cuda-7-5-login-loop-error-ubuntu-14-04-/

Everything is working fine now. Thanks!