New person

Hello everyone,

I am very new to this, and this is my first time doing GPU computing.

I have a really quick question. Is there a utility for Ubuntu that I can download to check whether I installed my drivers correctly?

Maybe everyone here knows. I am using some older hardware and I installed driver 346.59. In my setup, I have three Tesla C1060 cards and one GeForce GTX 480. Does driver 346.59 cover both the Tesla and the GeForce cards? Is there any way that I can test my Tesla cards to make sure they are being recognized?

OK, so I am running Ubuntu 14.04 and attempting to install the CUDA 6.5 Toolkit. I have downloaded the .run file, but my main issue is that gedit is constantly freezing.

Is there a better method?

I’m not sure where gedit enters the process. You may simply have an unstable system; it’s not clear that this is a CUDA or GPU issue.

However, 346.xx drivers are not compatible with the C1060. CUDA 6.5, with its driver family (340.xx), is the last CUDA version compatible with the C1060.

There is a test utility called nvidia-smi; it installs with the driver.

If you run:

nvidia-smi

and get sane output, your drivers are very likely installed correctly.
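If you want to script that check, here is a minimal Python sketch. It assumes `nvidia-smi -L` prints one line per detected GPU in the form `GPU 0: <name> (UUID: ...)`. The helper names `parse_gpu_list` and `list_gpus` are made up for illustration and are not part of any NVIDIA tool.

```python
import re
import shutil
import subprocess


def parse_gpu_list(text):
    """Extract GPU names from `nvidia-smi -L` style lines.

    Expected line shape (an assumption): 'GPU 0: Tesla C1060 (UUID: GPU-...)'.
    Lines that do not match are ignored.
    """
    names = []
    for line in text.splitlines():
        match = re.match(r"\s*GPU\s+(\d+):\s*(.+?)(?:\s*\(UUID:.*\))?\s*$", line)
        if match:
            names.append(match.group(2))
    return names


def list_gpus():
    """Run `nvidia-smi -L`; return None if the tool is missing or fails."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return None
    result = subprocess.run(
        [exe, "-L"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    if result.returncode != 0:
        return None
    return parse_gpu_list(result.stdout)


if __name__ == "__main__":
    gpus = list_gpus()
    if gpus is None:
        print("nvidia-smi not found or failed; the driver may not be installed correctly")
    else:
        for index, name in enumerate(gpus):
            print(index, name)
```

If the script lists every card you physically installed, the driver is seeing all of them. A missing card or a failure to run nvidia-smi at all points to a driver problem rather than a CUDA toolkit problem.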

There is a Linux getting started guide that may help at http://docs.nvidia.com/cuda/

Ok, thanks,

Yeah, I was looking at the getting started guide for the NVIDIA CUDA 6.5 toolkit. I have the .run package for version 340.87. Is running the .run package a good option for me, or is there a better method?

To install the toolkit, I am using the document: http://developer.download.nvidia.com/compute/cuda/6_5/rel/docs/CUDA_Getting_Started_Linux.pdf

However, I am stuck at installing the repo. I get the error: gpg: no valid OpenPGP data found

Looking at some similar issues, one solution says that I need to get authentication keys.

Is this what I need to do for CUDA?

Interesting. I am not saying that you are wrong or anything, but looking at the compatibility list, it says that the 346 drivers support the C1060 card. Why is there misinformation? Did someone forget to update the compatibility list?

Also, a third question.

I actually have 4 GPU cards: 3 Tesla C1060 and 1 GeForce GTX 480. I found a driver version that has the same number, 340.65, for both of them:

http://www.nvidia.com/download/driverResults.aspx/80874/en-us

http://www.nvidia.com/download/driverResults.aspx/80647/en-us

My question is this: does 340.65 support both cards so that I only install one driver, or do I still have to install two different drivers?

If you use the .run method, you don’t need to do anything with repos.

If you point out what you mean by “the compatibility list” where it says r346 drivers support the C1060, I’d like to see it. You can use a single install of the 340.65 driver and it will (should) recognize and enable all of the GPUs you mention. There is no process or concept of installing “two different drivers” when it comes to the NVIDIA GPU driver (display driver). Only one driver (version) can be active at a time.

In my last post, I have two links: one for the GeForce drivers and the other for the Tesla drivers. Both are the same version.

I do not have a master list of compatibility, so I am making sure that the GeForce drivers and the Tesla drivers are the same version, because then I know that both cards are supported.

It is weird logic, but I am unsure whether the 340.76 drivers for the GeForce support the Tesla card. I do know from looking at the two compatibility lists for Tesla and GeForce that 340.65 supports both. I am assuming that the drivers are unified.

Now, do I need to install separate drivers specifically for the Tesla?

On Windows, it’s possible for drivers that are numerically the same version to technically have different lists of supported devices. This is due to the Windows driver INF system. On Linux, it doesn’t work that way.

So:

A(ny) Linux 340.xx driver will work with all the GPUs you list.

You do not install “separate” drivers.
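To make the rule concrete, here is a small Python sketch that checks a driver version against the one limit stated in this thread: the C1060 is supported up to the 340.xx branch. The table `LAST_SUPPORTED_BRANCH` and the function names are hypothetical and only encode that single claim; this is not an official compatibility list, and GPUs missing from the table are simply treated as having no known limit.

```python
# Assumption taken from this thread: the Tesla C1060 is supported up to the
# 340.xx driver branch. This is an illustrative table, not an official list.
LAST_SUPPORTED_BRANCH = {
    "Tesla C1060": 340,
}


def driver_branch(version):
    """Return the branch number of a driver version, e.g. '340.76' -> 340."""
    return int(version.split(".")[0])


def unsupported_gpus(driver_version, gpu_names):
    """List the GPUs whose last supported branch is older than the driver."""
    branch = driver_branch(driver_version)
    return [
        name
        for name in gpu_names
        if name in LAST_SUPPORTED_BRANCH and branch > LAST_SUPPORTED_BRANCH[name]
    ]


if __name__ == "__main__":
    gpus = ["Tesla C1060", "GeForce GTX 480", "Tesla C1060", "Tesla C1060"]
    for version in ("346.59", "340.76"):
        print(version, "->", unsupported_gpus(version, gpus) or "all GPUs OK")
```

With the cards from this thread, 346.59 flags the three C1060 boards and 340.76 flags none, which matches the advice to use a single 340.xx driver for everything.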

Oh ok, perfect.

Again, I am very new to Linux. I am still a Windows user but had to install Linux on a few machines for my research.

Now I have to get rid of the 346 drivers and install the 340 drivers.

For the driver itself, there should be an uninstaller deposited in /usr/bin:

/usr/bin/nvidia-uninstall

However, between r340 and r346, it should be OK to just run the r340 installer. It will “uninstall” the previous r346 driver.

Otherwise, it’s possible to have both CUDA 7 and CUDA 6.5 installed, and you can select between the two with appropriate choice of PATH and LD_LIBRARY_PATH environment variables. You can only use CUDA 7 with a proper GPU driver, however, i.e. r346 or later. The point being that you should be able to switch back to CUDA 6.5 by installing the r340 driver and then modifying your environment variables, assuming you had CUDA 6.5 loaded at some point (seems to be the case, based on above dialog.)
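As a rough illustration of selecting a toolkit through PATH and LD_LIBRARY_PATH, the Python sketch below builds an environment with one CUDA install placed first. The install root `/usr/local/cuda-6.5` and the `lib64` subdirectory are assumptions about a typical 64-bit .run install, so adjust them to wherever your toolkit actually lives. `cuda_env` is a made-up helper name.

```python
import os


def cuda_env(cuda_root, base_env=None):
    """Return a copy of the environment with one CUDA toolkit placed first.

    cuda_root is the toolkit install directory (assumed layout: bin/ and lib64/).
    The original environment mapping is left unmodified.
    """
    env = dict(os.environ if base_env is None else base_env)

    def prepend(var, directory):
        # Put the toolkit directory ahead of whatever was already there.
        current = env.get(var, "")
        env[var] = directory + os.pathsep + current if current else directory

    prepend("PATH", os.path.join(cuda_root, "bin"))
    prepend("LD_LIBRARY_PATH", os.path.join(cuda_root, "lib64"))
    return env


if __name__ == "__main__":
    env = cuda_env("/usr/local/cuda-6.5")  # assumed install location
    print("PATH=" + env["PATH"])
    print("LD_LIBRARY_PATH=" + env["LD_LIBRARY_PATH"])
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` to launch a build or a CUDA program against the chosen toolkit. The same effect in a shell comes from exporting the two variables with the toolkit directories at the front.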

Ok perfect.

Yeah, I went with the drivers option, where I pressed the Ubuntu logo, typed in “drivers”, and selected Additional Drivers.

It displayed that I had the 346 drivers installed for the GeForce and the X.Org X server driver installed for the Tesla.

For the Tesla, I told it to install the 340.76 driver. After rebooting, Additional Drivers now says that I have the 340.76 driver for the Tesla but the X.Org X server driver for the GeForce???

Kinda weird. But the good news is that I can use my Tesla.

I ran nvidia-smi. Here is what I am getting. Is this sane output?

+------------------------------------------------------+
| NVIDIA-SMI 340.76     Driver Version: 340.76         |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla C1060         Off  | 0000:08:00.0     N/A |                  N/A |
| 35%   68C    P8    N/A /  N/A |      3MiB /  4095MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 480     Off  | 0000:0A:00.0     N/A |                  N/A |
| 66%   88C    P3    N/A /  N/A |    483MiB /  1535MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
|   2  Tesla C1060         Off  | 0000:81:00.0     N/A |                  N/A |
| 35%   50C    P8    N/A /  N/A |      3MiB /  4095MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
|   3  Tesla C1060         Off  | 0000:82:00.0     N/A |                  N/A |
| 35%   61C    P8    N/A /  N/A |      3MiB /  4095MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Compute processes:                                               GPU Memory |
|  GPU       PID  Process name                                     Usage      |
|=============================================================================|
|    0            Not Supported                                               |
|    1            Not Supported                                               |
|    2            Not Supported                                               |
|    3            Not Supported                                               |
+-----------------------------------------------------------------------------+

My only immediate concern is the Compute processes section saying “Not Supported” for every GPU.

I also have NVIDIA X Server Settings open, and it is able to recognize all GPUs.