Installing K40 for compute only.

I have a system with a fresh CentOS 7.0 install, a Tesla K40, and on-board graphics. When I try to install the CUDA 7.5 toolkit, I run into the “X server running” problem (along with dozens of suggestions on how to get around it) and complaints that the installer can’t disable nouveau. Both of these would make sense if I wanted to use the K40 for display, but obviously I can’t … it’s compute only. Why does it care what the display is doing?

TxBob’s directions at this link appear to be related, but I don’t understand why he suggests disabling nouveau and then rebooting. What would my graphics driver be when I reboot?

https://devtalk.nvidia.com/default/topic/846241/does-tesla-k40-and-ati-radeon-work-together-/

Also, it seems like this step

  1. Get the latest driver runfile installer for the K40c (NOT the CUDA toolkit). The driver. Like 346.59: http://www.nvidia.com/download/driverResults.aspx/84194/en-us

… and the instructions following it should be noted alongside the CUDA Toolkit, since they suggest that the K40 has special requirements.

nouveau is a Linux driver for NVIDIA GPUs. Since you mention “on-board graphics”, that is presumably an Intel graphics device, and it has no connection to the nouveau driver. Removing the nouveau driver should have no effect whatsoever on your Intel graphics display.

OTOH, the nouveau driver, if present in your system, will try to take ownership of any NVIDIA GPU it sees in the system, whether or not that GPU is specifically configured for X display. This is troublesome if you actually want to use the NVIDIA driver for the GPU, for example so that you can use its compute functions.
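
A quick way to see whether nouveau has claimed the GPU is to check whether the module is currently loaded. The following is only a minimal sketch, assuming a Linux system where loaded modules are listed one per line in /proc/modules with the module name as the first field; the function names are made up for illustration.

```python
def loaded_modules(proc_modules_text):
    """Return the set of module names found in /proc/modules-style text.

    Each non-empty line is assumed to start with the module name,
    followed by whitespace-separated details.
    """
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            names.add(fields[0])
    return names


def nouveau_is_loaded(proc_modules_text):
    """True if a module named exactly 'nouveau' appears in the listing."""
    return "nouveau" in loaded_modules(proc_modules_text)


def read_proc_modules(path="/proc/modules"):
    """Read the kernel's module list (Linux only)."""
    with open(path) as handle:
        return handle.read()
```

On the machine itself, `nouveau_is_loaded(read_proc_modules())` should come back False after nouveau has been disabled and the system rebooted. The Intel display driver is a separate module and is not affected by this check.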

The K40 doesn’t have special requirements. What is required is that you:

  1. Use a driver that supports your GPU
  2. Use a driver that supports the CUDA version you are going to use

Item 1 above can dictate that you use a newer driver so that your GPU is supported, if it is a “newer” GPU with an “older” toolkit. This is not unique to the K40 or even to Linux.

So why isn’t the driver bundled with the 7.5 toolkit (352.39) sufficient? Why must I use 346.59 instead?

http://www.nvidia.com/download/driverResults.aspx/84194/en-us

When I say “special” … I mean your directions to install the 346.59 driver (“NOT the CUDA toolkit”) and then turn around and run the “CUDA 7 runfile installer” (which I assume means the toolkit?), installing not the driver but just the toolkit and samples.

BTW, when I do that, I get a WARNING about an incomplete installation: a driver of at least version 352.00 is required.
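
That warning boils down to a minimum-version comparison between the installed driver and what the toolkit expects. Here is a rough sketch of the comparison, assuming driver versions are dot-separated integers such as 346.59 or 352.39; the helper names are hypothetical, and the 352.00 threshold is simply the figure quoted in the warning above.

```python
def parse_driver_version(text):
    """Turn a version string like '352.39' into a tuple of ints: (352, 39)."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_meets_minimum(installed, minimum):
    """True if the installed driver version is at least the minimum.

    Tuples compare field by field, so (352, 39) >= (352, 0) holds
    while (346, 59) >= (352, 0) does not.
    """
    return parse_driver_version(installed) >= parse_driver_version(minimum)


# Threshold taken from the installer warning quoted in this thread.
CUDA_7_5_MINIMUM = "352.00"

for version in ("346.59", "352.39"):
    ok = driver_meets_minimum(version, CUDA_7_5_MINIMUM)
    print(version, "is sufficient" if ok else "is too old")
```

Under these assumptions 346.59 falls short of the threshold and 352.39 clears it, which matches the warning seen when the CUDA 7.5 toolkit is installed on top of the 346.59 driver.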

First of all, the discussion there was about CUDA 7, not CUDA 7.5. The two bundle different drivers. Second, I was breaking out the driver install separately so that specific driver install command-line switches could be applied. This is quite difficult to do when the driver is installed/launched from the toolkit installer.

I did not say you must use 346.59, nor did I say 352.39 cannot work. In fact, the driver bundled with the CUDA 7 or CUDA 7.5 installer can work just fine with the K40. But the specific sequence outlined there, or something like it, is needed to avoid the driver install issue described there. And for that, it is convenient to have the driver installer as a separate tool, as opposed to bundled with the toolkit installer.

Following that sequence also allows you to specify --no-opengl-files at driver install time, which is a good suggestion. It’s difficult or impossible to do this if you just use the ordinary CUDA toolkit installer to install the driver. For whatever reason, I forgot to mention this driver install switch when I was detailing the list of steps, but it is useful and important, and a key reason to install the driver “separately”. You can do the same thing with 352.39 for CUDA 7.5 or any newer driver (just on general principle, I would suggest using a later 352.xx driver).
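
As an illustration of what installing the driver “separately” might look like, the sketch below builds the command line for a standalone driver runfile with the --no-opengl-files switch. The runfile name is an assumption (substitute whatever file you actually downloaded), the helper name is made up, and the command is only printed here rather than executed; the real install would be run as root with X not running.

```python
import shlex


def driver_install_command(runfile_path):
    """Build the argv for running a standalone NVIDIA driver runfile.

    --no-opengl-files asks the driver installer to skip its OpenGL files,
    which leaves the display stack of the on-board graphics untouched.
    """
    return ["sh", runfile_path, "--no-opengl-files"]


# Hypothetical file name, used purely for illustration.
command = driver_install_command("./NVIDIA-Linux-x86_64-352.39.run")

# Print the command in a copy-and-paste friendly form instead of running it.
print(" ".join(shlex.quote(arg) for arg in command))
```

After the driver is in place, the toolkit installer can be run with only the toolkit and samples selected, declining its bundled driver.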