NVIDIA Driver, CUDA, cuDNN and TensorRT compatibility issue

Specification:

  • NVIDIA RTX 3070.
  • Ubuntu 18.04

I installed CUDA Toolkit 11.0, 11.1, and 11.1 Update 1, but every one of them left me with a black screen after rebooting. The Linux installation guide tells us to avoid conflicts by removing any previously installed driver.

It turns out that all of those CUDA Toolkit installers pull in the wrong driver, which is what caused the black screen on my PC. So instead I installed nvidia-driver-460 from Ubuntu apt-get, and that works fine for now…
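For reference, the apt route is roughly the following (exact package names depend on your repositories; the purge step is the driver removal the installation guide asks for):

```
# Remove whatever driver the CUDA installer pulled in, then install the apt driver only
sudo apt-get purge 'nvidia-*'
sudo apt-get update
sudo apt-get install nvidia-driver-460
sudo reboot

# After reboot, confirm the GPU and driver are visible
nvidia-smi
```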

But when I look at the cuDNN support matrix page, the only combination that supports the 460 driver is cuDNN 8.1.0 with CUDA 11.2 (first row of the table).

And when I look at the TensorRT support matrix, it shows that CUDA 11.2 is only supported together with cuDNN 8.0.5. This is what confuses me.

The cuDNN page says CUDA 11.2 is only supported by cuDNN 8.1.0, so why does the TensorRT page say we should use cuDNN 8.0.5 instead?

So I am thinking of another way: downgrading my driver to 450 or 455, both of which support my RTX 3070, so that I can use CUDA 11.0 or CUDA 11.1 and install cuDNN 8.0.5 to satisfy the requirements for TensorRT.

But when I look at the NVIDIA driver page, my only option is 455, which is not NVIDIA-certified and is a short-lived Linux branch, so I don't think it is a good driver to install.

It is all pretty confusing to me:

  1. the Linux installation guide does not install the correct NVIDIA driver for my RTX 3070
  2. the support matrix pages of the different products seem to deadlock each other…

Can someone please point me to a set of versions that are compatible across all of these products?


I can't be of much help, as I am in similar criss-cross matrix hell. If you use MATLAB for deep learning, you have to use a particular gcc/g++ version, a particular CUDA version, and a particular cuDNN version, and you need TensorRT. TensorRT wants a different CUDA and cuDNN version and a completely different compiler again. As soon as one bit works, another bit fails.

You can have multiple versions of gcc, CUDA, and so on installed side by side; check out update-alternatives in Linux.
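For example, a minimal sketch of managing two gcc versions with update-alternatives (the paths, versions, and priorities here are illustrative; use whatever compilers are actually installed on your system):

```
# Register two compilers as alternatives; the trailing number is the priority,
# and the --slave entries keep g++ in step with gcc
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-7 70 \
    --slave /usr/bin/g++ g++ /usr/bin/g++-7
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-8 80 \
    --slave /usr/bin/g++ g++ /usr/bin/g++-8

# Interactively choose which version /usr/bin/gcc points to
sudo update-alternatives --config gcc
```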

But be warned: if you switch gcc versions, other installations, such as the drivers from NVIDIA's installer or even the apt-installed NVIDIA drivers, can become a mess when an unexpected or unsuitable compiler is called.

The one thing that is not a problem, as far as I can tell, is the NVIDIA driver. You are usually advised to install the latest one. If you run nvidia-smi, the maximum possible CUDA version for that driver is shown at the top right of the output. That figure is not a recommendation, nor is it the version you are actually using; I only mention it because this was not obvious to me at first.
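A quick way to see both numbers (nvcc is only present if a CUDA toolkit is actually installed and on your PATH):

```
# Top-right "CUDA Version" = the maximum CUDA release this driver supports,
# not the toolkit that is installed
nvidia-smi

# The CUDA toolkit actually on your PATH, if any
nvcc --version
```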

The documentation for this stuff is all over the place, and without the community posts in these forums it would be impossible. Shame on NVIDIA and MathWorks (the MATLAB people) for not creating scripts that set up these environments in working form, and for relying so heavily on the community to do the troubleshooting. You can go down the Docker/Singularity route, for example, and get pre-built environments, though that has its own can of worms to negotiate. A big nod to Anaconda packages as the way to get TensorFlow and many deep-learning environments ready-built and insulated from your OS, avoiding big system installations with complicated dependencies.
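As an illustration of the Anaconda route (a sketch only; the environment name is arbitrary, and conda will resolve whichever cudatoolkit/cuDNN builds it packages, not necessarily the exact versions discussed above):

```
# Create an isolated environment with a GPU build of TensorFlow;
# conda pulls in compatible cudatoolkit and cudnn packages alongside it
conda create -n tf-gpu tensorflow-gpu
conda activate tf-gpu

# Verify that TensorFlow can see the GPU
python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
```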

I am setting this stuff up for researchers who do not have a clue how to put it together and would have no chance on their own; meanwhile, with all my tech-support experience, I am wasting hours and feel like an idiot. Good luck with your problem.