I might not fully understand the terminology used, so forgive me if my question is stupid.
I believe some of the applications I want to use are not supported by CUDA 9 yet.
Do you mean I need to install CUDA 9? Or is there a latest available driver for CUDA 8 as well? If the latter, how can I find out which driver I currently have installed? Also, note that I strictly followed the guide found on this site.
What is the difference between the CUDA driver version and the CUDA runtime version? How do I verify that they match?
Software is typically layered, arranged as a “stack” (imagine a stack of pancakes). Code higher up in the stack makes calls to code lower in the stack, which in turn calls code even lower in the stack, … you get the picture.
In this case, the CUDA runtime sits on top of the CUDA driver in the stack. Your CUDA-accelerated applications sit, in all likelihood, on top of the CUDA runtime.
A particular version of the CUDA runtime will require a certain minimum version of the CUDA driver, and complain if that is not in place (see error message above). It will also operate with versions of the driver newer than the minimum required one. So generally the recommendation is to simply use the latest available driver. You can download drivers here:
You recommend updating the driver. However, I’m using a specific driver recommended by NVIDIA itself, together with the CUDA runtime version recommended in the same place. I had the login loop issue before and it was pretty annoying to solve, so I would like to avoid risky driver versions.
The document you point to seems to have been last updated for CUDA 8. So obviously it talks about drivers available around the time CUDA 8 became available, and it seems that one particular version, 378, caused problems at that time, thus the comment says to avoid it. Since then driver versions have progressed to 385 or thereabouts.
The interaction between the CUDA runtime and the CUDA driver is such that each version of the runtime requires a certain minimum driver version, and it complains if that is not in place, as you have seen. But it is generally fine to use newer drivers than the minimum required, with (rare) exceptions where newer drivers introduce a serious bug not present in older ones.
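To make the minimum-version rule above concrete, here is a minimal Python sketch, not a definitive diagnostic. It assumes a Linux system where the driver library can be loaded as `libcuda.so.1` and the runtime library as `libcudart.so`; both names are assumptions and differ between installs, and `query_version` is a hypothetical helper. `cuDriverGetVersion` and `cudaRuntimeGetVersion` each report an integer encoded as 1000*major + 10*minor. The number reported for the driver is the newest CUDA version that driver supports, not the 3xx.xx driver release number.

```python
import ctypes


def decode_cuda_version(version):
    """Split a CUDA version integer (1000*major + 10*minor) into (major, minor)."""
    return version // 1000, (version % 1000) // 10


def driver_supports_runtime(driver_version, runtime_version):
    """The driver must support at least the CUDA version the runtime was built for."""
    return driver_version >= runtime_version


def query_version(lib_name, func_name):
    """Hypothetical helper: call an int-pointer version query in a shared library.

    Returns the version integer, or None if the library or function
    cannot be loaded or the call reports an error.
    """
    try:
        lib = ctypes.CDLL(lib_name)
        func = getattr(lib, func_name)
    except (OSError, AttributeError):
        return None
    value = ctypes.c_int(0)
    status = func(ctypes.byref(value))  # both calls return 0 on success
    return value.value if status == 0 else None


if __name__ == "__main__":
    # Library names below are assumptions; adjust them for your install.
    driver = query_version("libcuda.so.1", "cuDriverGetVersion")
    runtime = query_version("libcudart.so", "cudaRuntimeGetVersion")
    if driver is None or runtime is None:
        print("Could not query the driver and/or runtime version here.")
    else:
        print("Driver supports up to CUDA %d.%d" % decode_cuda_version(driver))
        print("Runtime is CUDA %d.%d" % decode_cuda_version(runtime))
        print("Compatible:", driver_supports_runtime(driver, runtime))
```

A driver reporting 9000 with a runtime reporting 8000 satisfies the check; the reverse is the situation that produces the "driver version is insufficient" error.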
I have no idea how you installed your system. If you installed the complete CUDA 8.0 package, it should have come with a matching CUDA driver and you shouldn’t be seeing the error message you reported encountering in your original post. Your current installation may be incomplete or corrupted; I do not know of a way to diagnose that remotely.
Maybe txbob will come along and have more specific advice for you; he is more of a Linux installation expert than I am. There is certainly no issue with the device: GeForce GTX 960M is a Maxwell-family GPU with compute capability 5.0, and I would expect it to be a number of years before CUDA drops support for that.
You have a broken driver install. The “driver version is insufficient for CUDA runtime version” message is a very solid, reliable indicator of that. Not sure what else there is to say. The fact that nvidia-smi gives typical output is, unfortunately, not a guarantee that the driver is installed correctly for CUDA. It is normally a good indicator, but it is not conclusive. The NVIDIA driver structure that supports GPU computing involves multiple Linux kernel modules, and it is possible that enough of these modules are “harmonized” so that nvidia-smi will work, but some other aspect is not. This is occasionally the outcome when people have struggled with the driver install. Trying multiple different things in sequence from an otherwise clean config is a recipe for disaster.
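For anyone curious which NVIDIA kernel modules are actually loaded, the following is a small Python sketch under stated assumptions. It reads `/proc/modules` on Linux and lists entries whose names start with `nvidia`. Which modules appear (for example `nvidia` and `nvidia_uvm` on many installs) depends on the driver version, so treat those names as assumptions. A complete-looking list does not prove the install is healthy, for the same reason that working nvidia-smi output does not.

```python
def loaded_nvidia_modules(proc_modules_text):
    """Return the sorted names of loaded kernel modules starting with 'nvidia'.

    Each line of /proc/modules begins with the module name, followed by
    size, reference count, dependents, state, and load address.
    """
    names = []
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields and fields[0].startswith("nvidia"):
            names.append(fields[0])
    return sorted(names)


if __name__ == "__main__":
    try:
        with open("/proc/modules") as handle:
            text = handle.read()
    except OSError:
        text = ""  # not Linux, or /proc is unavailable
    print(loaded_nvidia_modules(text))
```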
Start over with a clean install of Linux. Then follow the instructions in the Linux install guide (for CUDA 8, if you wish). It will work. You have evidently already found instructions for avoiding the login loop. Use them again when you start from the pristine install.
If you prefer, follow the instructions for cleanup of previous installs contained in the aforementioned install guide. It may work. But the complexity of the driver, combined with the myriad things people may have done previously as part of the “history” of their machines, makes it impossible IMO to provide a concise set of recovery/cleanup instructions that are guaranteed to work in every case, with the exception of “Start over with a clean install,… follow the instructions in the Linux install guide precisely…, etc.”