CUDA - TensorFlow compatibility issue

Hello,
I am using Ubuntu 17.10. I installed cuda-9.2 and cuDNN for deep learning purposes. However, when I installed tensorflow-gpu, I ran into a problem: I found out that tensorflow-gpu is not compatible with cuda-9.2.
Instead it asks for cuda-9.0.

I tried to install cuda-9.0 alongside the installed cuda-9.2.
Now there is a compiler version issue.
Ubuntu 17.10 comes with gcc version 7.
cuda-9.0 requires gcc-6.
I installed gcc-6, but that didn’t solve the problem.

My questions:

  1. Am I heading in the right direction?
  2. Will the nvidia-396 display drivers work hand in hand with cuda-9.0?
  3. If the answer to 1 is yes, how do I solve the compiler version issue?

  1. Yes
  2. Yes
  3. You haven’t provided enough information. Installing an earlier version of gcc is a possible solution. But since you haven’t said what the problem is, it’s difficult to say. Maybe all you need to do is make sure your version of gcc-6 is compatible with CUDA 9.0, and make sure it is selected.

It may not even matter if all you want to do is run tensorflow-gpu

Thank you sir for your reply.
Here are the details of the problems I am facing:

When I tried to run tensorflow-gpu on cuda-9.2, I get the following error:

“ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory
Failed to load the native TensorFlow runtime.”

I searched the net a bit to find out what it means. As I understand it, the tensorflow module was looking for cuda-9.0, which is not installed. I know that multiple cuda versions can co-exist.
So I tried to install cuda-9.0 too. Then I got this compiler issue:

"
You are attempting to install on an unsupported configuration. Do you wish to continue?
(y)es/(n)o [ default is no ]: yes

Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 384.81?
(y)es/(n)o/(q)uit: no

Install the CUDA 9.0 Toolkit?
(y)es/(n)o/(q)uit: y

Enter Toolkit Location
[ default is /usr/local/cuda-9.0 ]:

/usr/local/cuda-9.0 is not writable.
Do you wish to run the installation with ‘sudo’?
(y)es/(n)o: y

Please enter your password:
Do you want to install a symbolic link at /usr/local/cuda?
(y)es/(n)o/(q)uit: y

Install the CUDA 9.0 Samples?
(y)es/(n)o/(q)uit: y

Enter CUDA Samples Location
[ default is /home/sahil ]:

Error: unsupported compiler: 7.2.0. Use --override to override this check.
Installing the CUDA Samples in /home/sahil …
Copying samples to /home/sahil/NVIDIA_CUDA-9.0_Samples now…
Finished copying samples.

===========
= Summary =

Driver: Not Selected
Toolkit: Installation Failed. Using unsupported Compiler.
Samples: Installation Failed

Logfile is /tmp/cuda_install_3400.log
"

I ran the installation again with “--override”.
And faced the same error with tensorflow-gpu.

I installed the gcc-6 compiler from the repo using the command:
sudo apt-get install gcc-6

Is there anything else I need to do to make the gcc-6 compatible with cuda-9.0?
Or is there any other solution to this problem?

Once you’ve installed gcc-6 you need to make sure it is the default compiler. Then redo the installation. Deselect the driver install. The driver you installed for CUDA 9.2 will work fine for CUDA 9.0
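Before redoing the installation, it may help to confirm which compiler the shell actually picks up by default. The snippet below is only a sketch: `major_version` is a hypothetical helper name, and it assumes the compilers accept `-dumpversion` (gcc and g++ do). It reports the major version of whatever `gcc` and `g++` resolve to on the PATH, which is the number the installer complained about (7.2.0 in the log above).

```shell
# Hypothetical helper: print the major version from a gcc-style version string.
major_version() {
  echo "${1%%.*}"
}

# Report what `gcc` and `g++` currently resolve to, if they are installed.
for tool in gcc g++; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool -> major version $(major_version "$("$tool" -dumpversion)")"
  else
    echo "$tool -> not found on PATH"
  fi
done
```

If either tool still reports 7, the default has not been switched yet, regardless of whether gcc-6 is installed.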

Thank you sir, that worked.
cuda-9.0 is now installed.

But I am still getting the same error when I try to run tensorflow-gpu.

1. How do I make sure that tensorflow-gpu uses the newly installed cuda-9.0?
2. Do I have to downgrade my cuDNN version too? Can multiple cuDNN versions coexist?

You need to set the appropriate environment variables as outlined in the CUDA linux install guide:

Installation Guide Linux :: CUDA Toolkit Documentation

(section 7.1.1)

to point to your CUDA 9.0 installation (the above instructions are for CUDA 9.2, change those to point at your CUDA 9.0 install)
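For reference, here is a minimal sketch of what those two settings might look like for CUDA 9.0. It assumes the toolkit is in the default location shown in the installer output above (/usr/local/cuda-9.0). `CUDA_HOME` is just an illustrative variable name used here for readability. Each export is kept on a single line so that no stray characters end up inside the path.

```shell
# Assumed install prefix (the installer default); adjust if yours differs.
CUDA_HOME=/usr/local/cuda-9.0

# Prepend the CUDA 9.0 directories, keeping any existing entries after them.
export PATH="$CUDA_HOME/bin${PATH:+:${PATH}}"
export LD_LIBRARY_PATH="$CUDA_HOME/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"

# Show the result so it can be checked by eye.
echo "PATH=$PATH"
echo "LD_LIBRARY_PATH=$LD_LIBRARY_PATH"
```

These lines would typically go in ~/.bashrc and take effect in newly started shells.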

Regarding CUDNN, you need to make sure you have the version of CUDNN installed that your TF binary is expecting, just like CUDA libs.

Yes, you can have multiple CUDNN versions installed. Point to the one you want to use with the environment variables, or by putting the libs/versions you want to use in the path that TF is looking for them.

“putting the libs/versions you want to use in the path that TF is looking for them.”
Can you please specify how to do this?

I installed cudnn-7.0.5 using the .deb files.
When I tried to verify the installation, I got the following error:

"
rm -rf *o
rm -rf mnistCUDNN
/usr/local/cuda/bin/nvcc -ccbin g++ -I/usr/local/cuda/include -IFreeImage/include -m64 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_53,code=sm_53 -gencode arch=compute_53,code=compute_53 -o fp16_dev.o -c fp16_dev.cu
In file included from /usr/local/cuda/include/host_config.h:50:0,
from /usr/local/cuda/include/cuda_runtime.h:78,
from :0:
/usr/local/cuda/include/crt/host_config.h:119:2: error: #error -- unsupported GNU version! gcc versions later than 6 are not supported!
#error -- unsupported GNU version! gcc versions later than 6 are not supported!
^~~~~
Makefile:203: recipe for target ‘fp16_dev.o’ failed
make: *** [fp16_dev.o] Error 1
"

This happens although my current gcc compiler version is 6.4.0.

For CUDA 9.0 you need gcc 6.3 (or earlier).

Regarding the libs, you can put your CUDNN libs in the same CUDA directory where TF is expecting to find CUDA libs; that should work (e.g. /usr/local/cuda/lib64). Alternatively, add another path to your LD_LIBRARY_PATH that points to wherever the CUDNN libs are.

I installed g++-5 and gcc-5, and that seemed to work.
But now I have run into a new error (they never seem to end).

I tried to verify the cudnn installation by running the ./mnistCUDNN file.
I received the following error:

"
./mnistCUDNN: error while loading shared libraries: libcudart.so.9.0: cannot open shared object file: No such file or directory
"

I thought this might be due to the cuda version.
But I have already updated my environment paths to point to cuda 9.0.
Here is the output of nvcc --version:

"
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Sep__1_21:08:03_CDT_2017
Cuda compilation tools, release 9.0, V9.0.176
"

You have to set your environment variables, as I already mentioned.

But I already have set my variables.

"
export PATH=/usr/local/cuda-9.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib64\ ${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
"

I have already added the above two lines in my .bashrc file.
Still I am getting the error

Am I missing something?

After you add them to the .bashrc file, you need to log out and log in again for them to take effect.

What is the result of running:

echo $LD_LIBRARY_PATH

(and)

ls /usr/local/cuda-9.0/lib64

?
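The same two checks can also be combined into one small script. This is a sketch only: `find_in_ld_path` is a hypothetical helper, not part of CUDA or any standard tool. It walks each colon-separated entry of LD_LIBRARY_PATH and tests whether the named library file exists there. Entries are printed inside square brackets so that anything unexpected in an entry, such as a stray space, becomes visible.

```shell
# Hypothetical helper: look for a library file in each LD_LIBRARY_PATH entry.
# Returns 0 if the file is found in some entry, 1 otherwise.
find_in_ld_path() {
  lib="$1"
  old_ifs="$IFS"
  IFS=:
  for dir in $LD_LIBRARY_PATH; do
    IFS="$old_ifs"
    if [ -e "$dir/$lib" ]; then
      # Brackets show the entry exactly as the shell sees it.
      echo "found $lib in [$dir]"
      return 0
    fi
    echo "no $lib in [$dir]"
  done
  IFS="$old_ifs"
  return 1
}

# Example: check for the library named in the error message.
find_in_ld_path libcudart.so.9.0 || echo "libcudart.so.9.0 not found via LD_LIBRARY_PATH"
```

If the file is listed by `ls` but the helper still reports it as missing, the entry in LD_LIBRARY_PATH probably does not match the directory name exactly.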

I rebooted my computer, still the same error.

the output of echo $LD_LIBRARY_PATH:

/usr/local/cuda-9.0/lib64

the output of ls /usr/local/cuda-9.0/lib64

libaccinj64.so libculibos.a libnppicom_static.a libnppitc_static.a
libaccinj64.so.9.0 libcurand.so libnppidei.so libnpps.so
libaccinj64.so.9.0.176 libcurand.so.9.0 libnppidei.so.9.0 libnpps.so.9.0
libcublas_device.a libcurand.so.9.0.176 libnppidei.so.9.0.176 libnpps.so.9.0.176
libcublas.so libcurand_static.a libnppidei_static.a libnpps_static.a
libcublas.so.9.0 libcusolver.so libnppif.so libnvblas.so
libcublas.so.9.0.176 libcusolver.so.9.0 libnppif.so.9.0 libnvblas.so.9.0
libcublas_static.a libcusolver.so.9.0.176 libnppif.so.9.0.176 libnvblas.so.9.0.176
libcudadevrt.a libcusolver_static.a libnppif_static.a libnvgraph.so
libcudart.so libcusparse.so libnppig.so libnvgraph.so.9.0
libcudart.so.9.0 libcusparse.so.9.0 libnppig.so.9.0 libnvgraph.so.9.0.176
libcudart.so.9.0.176 libcusparse.so.9.0.176 libnppig.so.9.0.176 libnvgraph_static.a
libcudart_static.a libcusparse_static.a libnppig_static.a libnvrtc-builtins.so
libcudnn.so libnppc.so libnppim.so libnvrtc-builtins.so.9.0
libcudnn.so.7 libnppc.so.9.0 libnppim.so.9.0 libnvrtc-builtins.so.9.0.176
libcudnn.so.7.0.5 libnppc.so.9.0.176 libnppim.so.9.0.176 libnvrtc.so
libcudnn_static.a libnppc_static.a libnppim_static.a libnvrtc.so.9.0
libcufft.so libnppial.so libnppist.so libnvrtc.so.9.0.176
libcufft.so.9.0 libnppial.so.9.0 libnppist.so.9.0 libnvToolsExt.so
libcufft.so.9.0.176 libnppial.so.9.0.176 libnppist.so.9.0.176 libnvToolsExt.so.1
libcufft_static.a libnppial_static.a libnppist_static.a libnvToolsExt.so.1.0.0
libcufftw.so libnppicc.so libnppisu.so libOpenCL.so
libcufftw.so.9.0 libnppicc.so.9.0 libnppisu.so.9.0 libOpenCL.so.1
libcufftw.so.9.0.176 libnppicc.so.9.0.176 libnppisu.so.9.0.176 libOpenCL.so.1.0
libcufftw_static.a libnppicc_static.a libnppisu_static.a libOpenCL.so.1.0.0
libcuinj64.so libnppicom.so libnppitc.so stubs
libcuinj64.so.9.0 libnppicom.so.9.0 libnppitc.so.9.0
libcuinj64.so.9.0.176 libnppicom.so.9.0.176 libnppitc.so.9.0.176