Error when running nvidia-smi: 'NVIDIA-SMI couldn't find libnvidia-ml.so library...'

I got the error below when I ran nvidia-smi to check whether the NVIDIA driver was running:

~$ nvidia-smi
NVIDIA-SMI couldn't find libnvidia-ml.so library in your system. Please make sure that the NVIDIA Display Driver is properly installed and present in your system.
Please also try adding directory that contains libnvidia-ml.so to your system PATH.

I don't understand this, since it had worked fine for the past few months. So I tried:

~$ sudo find /usr/lib/ -name libnvidia-ml.so
[sudo] password for keven: 
/usr/lib/nvidia-384/libnvidia-ml.so

It seems my Ubuntu 16.04 system had automatically upgraded the driver from 375 (which I had installed earlier with CUDA 8.0) to 384, so I checked which nvidia folders exist under /usr/lib:

ls /usr/lib/nvidia
nvidia/           nvidia-375/       nvidia-384/       nvidia-384-prime/
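
For anyone hitting the same message, here is a minimal sketch of how to list every directory that actually contains the library. find_nvml_dirs is a hypothetical helper name, not part of the driver, and the default root /usr/lib is simply the one used in the find command above.

```shell
# List each directory under a root (default: /usr/lib) that contains
# libnvidia-ml.so or one of its versioned variants (e.g. libnvidia-ml.so.1).
# find_nvml_dirs is a hypothetical helper name.
find_nvml_dirs() {
    find "${1:-/usr/lib}" -name 'libnvidia-ml.so*' -exec dirname {} \; 2>/dev/null | sort -u
}

find_nvml_dirs
```

If this prints more than one versioned directory, several driver versions are probably installed side by side, which matches the listing above.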

and

nvidia-settings 

ERROR: Error querying enabled displays on GPU 0 (Missing Extension).


ERROR: Error querying connected displays on GPU 0 (Missing Extension).

** Message: PRIME: Requires offloading
** Message: PRIME: is it supported? yes

ERROR: nvidia-settings could not find the registry key file. This file should
       have been installed along with this driver at
       /usr/share/nvidia/nvidia-application-profiles-key-documentation. The
       application profiles will continue to work, but values cannot be
       prepopulated or validated, and will not be listed in the help text.
       Please see the README for possible values and descriptions.

nvidia-settings does display the configuration panel, but it prints the messages above when I quit.

Then I tried:

$ sudo nvidia-bug-report.sh 
[sudo] password for keven: 

nvidia-bug-report.sh will now collect information about your
system and create the file 'nvidia-bug-report.log.gz' in the current
directory.  It may take several seconds to run.  In some
cases, it may hang trying to capture data generated dynamically
by the Linux kernel and/or the NVIDIA kernel module.  While
the bug report log file will be incomplete if this happens, it
may still contain enough data to diagnose your problem.

Please include the 'nvidia-bug-report.log.gz' log file when reporting
your bug via the NVIDIA Linux forum (see devtalk.nvidia.com)
or by sending email to 'linux-bugs@nvidia.com'.

Running nvidia-bug-report.sh...ls: cannot access '/proc/driver/nvidia/./gpus/': No such file or directory


If the bug report script hangs after this point consider running with
--safe-mode and --extra-system-data command line arguments.

 complete.
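
The "cannot access '/proc/driver/nvidia/./gpus/'" line above suggests that the NVIDIA kernel module is not loaded at all. A quick way to check that, as a sketch: nvidia_module_loaded is a hypothetical helper, and its optional argument exists only so the check can be pointed at a different directory.

```shell
# Succeeds when the given directory exists (default: /proc/driver/nvidia,
# which is present only while the NVIDIA kernel module is loaded).
# nvidia_module_loaded is a hypothetical helper name.
nvidia_module_loaded() {
    [ -d "${1:-/proc/driver/nvidia}" ]
}

if nvidia_module_loaded; then
    # Shows the version of the kernel module that is currently loaded.
    cat /proc/driver/nvidia/version
else
    echo "NVIDIA kernel module does not appear to be loaded"
fi
```

If the module is loaded, its version should match the nvidia-XXX directory that holds libnvidia-ml.so.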

It is probably an incompatibility caused by the automatic upgrade, but I don’t know how to solve it.

nvidia-bug-report.log.gz (70.3 KB)

Well, I solved it by updating and upgrading the system and modifying my .bashrc file. Since the NVIDIA driver had been automatically upgraded to 384, I changed the paths to point to the right version of the NVIDIA libraries and CUDA.

Hey Keven,

I have the exact same problem when trying to use CUDA 8. I have an additional issue where it doesn’t even work without using sudo first.

What changes did you make to .bashrc? I tried adding /usr/lib/nvidia-384 to PATH and LD_LIBRARY_PATH but it’s still not working :(

Can you print the last few lines of your .bashrc file, and also post the error message? Normally this issue is caused by a system update that leaves the paths and libraries mismatched.

Thanks for answering, this is the end of my .bashrc:

# added by Anaconda3 4.4.0 installer
export PATH="/home/avi/anaconda3/bin:$PATH"

export PATH="$PATH:/usr/local/cuda-8.0/bin:/usr/lib/nvidia-384"

export LD_LIBRARY_PATH="/usr/local/cuda-8.0/lib64:/usr/lib/nvidia-384:/usr/local/lib:$LD_LIBRARY_PATH"

I haven’t updated the system; it didn’t work from the beginning. Just to clarify: I’m on Ubuntu 16.04, and I couldn’t resolve the conflict between the Nouveau driver and the NVIDIA driver when installing from the command line, so I ended up installing the NVIDIA drivers from the GNOME GUI (from “Software and Updates” -> “additional drivers”).

Your .bashrc looks OK to me, but here is mine just in case (I think several of its lines are redundant):

export LD_LIBRARY_PATH="/usr/local/cuda-8.0/lib64:/usr/lib/nvidia-384:/usr/local/cuda-8.0/extras/CUPTI/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"
export CUDA_HOME=/usr/local/cuda-8.0
export PATH=/usr/local/cuda-8.0/bin${PATH:+:${PATH}}
export PATH="/usr/local/cuda-8.0/bin:$HOME/anaconda3/bin:$PATH"
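
Since the lines above add some entries twice, here is a sketch of a .bashrc fragment that adds each directory only once. path_prepend, NVIDIA_LIB_DIR and CUDA_DIR are hypothetical names; the two paths are the ones used in this thread and need adjusting to whatever driver and CUDA versions are actually installed.

```shell
# Prepend directory $1 to the colon-separated list $2 unless it is already
# in the list, and print the result. path_prepend is a hypothetical helper.
path_prepend() {
    case ":$2:" in
        *":$1:"*) printf '%s\n' "$2" ;;          # already present: unchanged
        *)        printf '%s\n' "$1${2:+:$2}" ;; # prepend, no trailing colon
    esac
}

NVIDIA_LIB_DIR=/usr/lib/nvidia-384   # assumption: matches the installed driver
CUDA_DIR=/usr/local/cuda-8.0         # assumption: matches the installed CUDA

LD_LIBRARY_PATH=$(path_prepend "$CUDA_DIR/lib64" "${LD_LIBRARY_PATH:-}")
LD_LIBRARY_PATH=$(path_prepend "$NVIDIA_LIB_DIR" "$LD_LIBRARY_PATH")
PATH=$(path_prepend "$CUDA_DIR/bin" "$PATH")
export LD_LIBRARY_PATH PATH
```

Sourcing the file repeatedly then leaves both variables unchanged after the first time, so stale entries from an older driver version are easier to spot.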

It seems your NVIDIA driver is not properly installed, so I think you would not be able to compile the CUDA sample code (NVIDIA_CUDA-8.0_Samples/) either. Working around this kind of problem really takes time, so I suggest completely uninstalling the current driver and installing a fresh one. You can do it from the command line or with the GNOME GUI. In my experience this is the most efficient way. Important: first remove the existing, broken NVIDIA drivers completely! Then install CUDA 8.0 step by step (don’t use CUDA 9.0, it will cause trouble).
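
A sketch of what the clean reinstall could look like from the command line. It only prints the commands so they can be reviewed and then run by hand. The package name nvidia-384 is an assumption based on the /usr/lib/nvidia-384 directory in this thread; check the real name with "apt-cache search nvidia" first. reinstall_plan is a hypothetical helper.

```shell
# Print (not run) the commands for a clean driver reinstall on Ubuntu.
# $1 is the driver branch; the nvidia-<branch> package name is an assumption.
# reinstall_plan is a hypothetical helper name.
reinstall_plan() {
    printf '%s\n' \
        "sudo apt-get purge 'nvidia-*'" \
        "sudo apt-get autoremove" \
        "sudo apt-get update" \
        "sudo apt-get install nvidia-$1" \
        "sudo reboot"
}

reinstall_plan 384
```

After the reboot, nvidia-smi should be the first thing to try before touching CUDA again.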

Problems that occurred when compiling cuda-9.0 or cuda-8.0 with NVIDIA 384.90:

  1. cannot find -lnvcuvid -> run "ld -lnvcuvid --verbose" (substituting whichever library is reported missing) to see where the linker looks for the file, then find the required file and link it, e.g. "sudo ln -s /usr/lib/nvidia-384/libnvcuvid.so /usr/lib/libnvcuvid.so"
  2. Error -> eglstrm_common.c: In function ‘EGLStreamInit’:…
    replace the files in /usr/include/EGL with the newer, officially supported ones. ref: https://devtalk.nvidia.com/default/topic/1025071/compiling-cuda-9-0-samples-on-ubuntu-16-04-has-error/
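
The symlink fix in item 1 can be wrapped so that it refuses to create a dangling link and does not overwrite anything that is already there. This is only a sketch: link_lib is a hypothetical helper, and the file name libnvcuvid.so is what the linker looks for when given -lnvcuvid.

```shell
# link_lib SRC_DIR LIB DEST_DIR
# Create DEST_DIR/LIB as a symlink to SRC_DIR/LIB. Fails if the source is
# missing; does nothing if the destination already exists.
# link_lib is a hypothetical helper name.
link_lib() {
    src="$1/$2"
    dest="$3/$2"
    if [ ! -e "$src" ]; then
        echo "missing: $src" >&2
        return 1
    fi
    if [ -e "$dest" ] || [ -L "$dest" ]; then
        return 0
    fi
    ln -s "$src" "$dest"
}

# Example, to be run as root because /usr/lib is not user-writable:
#   link_lib /usr/lib/nvidia-384 libnvcuvid.so /usr/lib
```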