Building SDK examples in Ubuntu 11.04 with toolkit 4.1: cannot find -lXi

Hello,

I have done some CUDA programming in Windows. Now that I have to run my code on a Linux cluster, I have started setting up a CUDA environment on my Linux PC. I have a GTX 480.

/usr/bin/nvidia-settings says I have driver version 270.41.06

The OS I have

x86_64

DISTRIB_ID=Ubuntu

DISTRIB_RELEASE=11.04

The gcc version I have is gcc (Ubuntu/Linaro 4.5.2-8ubuntu4) 4.5.2

I installed both the toolkit and the SDK, the latest versions (maybe I should have stuck with v4.0)

cudatoolkit_4.1.28_linux_64_ubuntu11.04.run

gpucomputingsdk_4.1.28_linux.run

I changed the environment variables, as below

LD_LIBRARY_PATH=/usr/local/cuda/lib64:/usr/local/cuda/lib:

PATH=/home/elan/.bashish/launcher:/usr/local/share/bashish/lib:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games
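
For other readers, here is a minimal sketch of how these two variables could be set from ~/.bashrc. The helper name prepend_path is made up for illustration, and /usr/local/cuda is assumed to be the install prefix, as in the values above. The helper avoids a trailing colon, because an empty entry in a colon-separated search path is usually treated as the current directory.

```shell
# Sketch: prepend a directory to a colon-separated path value,
# but only if it is not already present. prepend_path is a
# hypothetical helper name, not part of the CUDA toolkit.
prepend_path() {
    # $1 = current value (may be empty), $2 = directory to add
    case ":$1:" in
        *":$2:"*) printf '%s\n' "$1" ;;          # already there, keep as is
        *)
            if [ -n "$1" ]; then
                printf '%s\n' "$2:$1"
            else
                printf '%s\n' "$2"                # no trailing colon
            fi
            ;;
    esac
}

# Assumption: the toolkit lives under /usr/local/cuda.
PATH=$(prepend_path "$PATH" /usr/local/cuda/bin)
LD_LIBRARY_PATH=$(prepend_path "${LD_LIBRARY_PATH:-}" /usr/local/cuda/lib64)
export PATH LD_LIBRARY_PATH
```

On a 64-bit system the lib64 directory is the one that matters; /usr/local/cuda/lib can be added the same way if 32-bit libraries are needed.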

But when I tried to run make in SDK/C, I got the error

Cannot find -lcuda

The best help I found while browsing the forum threads was the one below, which describes exactly this problem

So I edited common.mk in SDK/C/common to add the nvidia-current path to every if-else branch of the # Libs section.
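
For anyone repeating this edit, a small sketch for finding out which directory actually holds libcuda.so, so that the right -L path goes into common.mk. The function name find_libcuda_dirs is hypothetical, and /usr/lib as the search root is an assumption about where Ubuntu's driver package puts the library.

```shell
# Sketch: list the directories under a search root that contain libcuda.so*.
# find_libcuda_dirs is a hypothetical helper name.
find_libcuda_dirs() {
    # $1 = directory to search, e.g. /usr/lib (assumed location)
    find "$1" -name 'libcuda.so*' 2>/dev/null |
        while IFS= read -r f; do
            dirname "$f"
        done | sort -u
}

# Example use (commented out): print candidate linker flags for common.mk.
# for d in $(find_libcuda_dirs /usr/lib); do echo "-L$d"; done
```

Each directory printed is a candidate for a -L entry next to the existing library paths in the # Libs section.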

Then, I could get at least one project built, namely deviceQuery

  1. No other project builds. I get the error

/usr/bin/ld: cannot find -lXi

collect2: ld returned 1 exit status

make[1]: *** […/…/bin/linux/release/recursiveGaussian] Error 1

Unfortunately, searching for this error does not give any hits.

  2. Running deviceQuery doesn’t work either

[deviceQuery] starting…

./deviceQuery Starting…

CUDA Device Query (Runtime API) version (CUDART static linking)

cudaGetDeviceCount returned 35

→ CUDA driver version is insufficient for CUDA runtime version

[deviceQuery] test results…

FAILED

exiting in 3 seconds: 3…2…1…done!

Can anybody please infer where the error could be? I have spent two days on this, trying installing new nVidia drivers in vain (in spite of the advice in the quote), editing the frightening xorg.conf etc. Good, I have learnt some Linux.

Thank you,

Elan.

Hi Elan,
Looks like you’ve got the new toolkit (CUDA 4.1.28) but are using the older driver (270.41.06 is for CUDA 4.0). That’s why it can find ‘libcuda.so’ but just not the right version. If you upgrade your driver to the 290 or higher series, you should be good to go. Try this driver.
James

Hi James, thanks for the reply. I have tried that quite a few times, and it invariably only messes up xorg.conf. In fact, when I try to install, it says, “Found driver 290.10. It has to be uninstalled to install driver 290.10. Press Ok to proceed.”

The procedure I followed was as in
http://askubuntu.com/questions/66328/how-do-i-install-latest-nvidia-drivers

There are quite some warnings there!

But,
/usr/bin/nvidia-settings
says that I still have driver 270.41.06

System → Administration → Additional drivers says
NVIDIA accelerated graphics driver
This driver is activated, but not currently in use

But if I were to trust this guy

the driver is used, and the error message is just a text bug in the Linux distro. (Or quite possibly, the 270 driver is capable of all the tests he (and I) performed.)

My current situation is exactly as described in

, deviceQuery does not work, but deviceQueryDrv works, and the solution is updating drivers.

So it all boils down to: how do I install the new drivers?

  1. System → Administration → Additional drivers does not provide a means.
  2. sudo apt-get install nvidia-current says the driver is current
  3. The command line mode seems to install the driver, but the effect is not seen.

Thank you,
Elan.

Hi Elan,

Sometimes the distributions stop updating the NVIDIA drivers, or update them really slowly, especially for some of the older releases.
Unfortunately for us, this means we can't rely on the package managers to get the latest NVIDIA driver.

The best method is to do the following

  1. Download the latest driver from here: Official Drivers | NVIDIA
  2. Stop X. You can do this with sudo service gdm stop.
  3. Uninstall the existing driver: sudo apt-get remove nvidia-current
  4. Install the new driver by running the installer from the console: sudo sh path/to/latest/driver.sh
  5. Restart X: sudo service gdm start

P.S. This is Ubuntu-specific; readers on other distributions will need the equivalent commands for their distribution.
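
The steps above can be sketched as one shell function. This is only an illustration: the names run_or_print and reinstall_nvidia_driver and the DRY_RUN variable are made up here, and the installer path is whatever file you downloaded. By default it only prints the commands, so you can read them before anything is changed. Since stopping gdm ends the graphical session, the real run has to happen from a text console.

```shell
# Sketch of the five steps as a function. With DRY_RUN=1 (the default in
# this sketch) each command is printed instead of executed.
run_or_print() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# reinstall_nvidia_driver is a hypothetical name; $1 is the path to the
# driver installer downloaded in step 1.
reinstall_nvidia_driver() {
    installer=$1
    run_or_print sudo service gdm stop &&                # step 2: stop X
    run_or_print sudo apt-get remove nvidia-current &&   # step 3: remove packaged driver
    run_or_print sudo sh "$installer" &&                 # step 4: run the installer
    run_or_print sudo service gdm start                  # step 5: restart X
}
```

Calling reinstall_nvidia_driver path/to/latest/driver.sh prints the four commands; setting DRY_RUN=0 first makes it execute them.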

One manual way to check which NVIDIA driver is actually running:

bash$ cat /proc/driver/nvidia/version 

NVRM version: NVIDIA UNIX x86 Kernel Module  290.10  Wed Nov 16 19:27:25 PST 2011

GCC version:  gcc version 4.4.3 (Ubuntu 4.4.3-4ubuntu5)
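
If you want to script that check, here is a rough sketch that pulls the version number out of the NVRM line and compares its major part against a minimum. The function names are invented for this example, and the threshold of 290 is simply the series suggested earlier in this thread for toolkit 4.1.

```shell
# Print the driver version from an NVRM version file.
# Defaults to the /proc entry shown above; a different file can be passed
# as $1. nvidia_driver_version is a hypothetical helper name.
nvidia_driver_version() {
    sed -n 's/^NVRM version:.*Kernel Module  *\([0-9][0-9.]*\).*/\1/p' \
        "${1:-/proc/driver/nvidia/version}"
}

# Succeed if the major part of version $1 is at least $2.
# driver_at_least is a hypothetical helper name.
driver_at_least() {
    major=${1%%.*}
    [ -n "$major" ] && [ "$major" -ge "$2" ]
}

# Example use (commented out):
# v=$(nvidia_driver_version)
# if driver_at_least "$v" 290; then echo "driver $v looks new enough"; fi
```

If the /proc file does not exist at all, the NVIDIA kernel module is not loaded, which is itself a useful hint.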

Hello int, struct,

Thank you very much for the suggestions. I did try installing the new driver many times, and ended up with "Fatal server error: no screens found". Soon I mastered the art of backing up the xorg.conf file. The installation was OK, but the driver was never active. I tried various combinations, and once I removed xorg.conf and installed the driver again. This time, it said it had to disable the incompatible nouveau driver. I said yes, and after this installation I see that the new driver has been deployed, and all the SDK examples I tested passed.

Thank you.

Just a list for people who might stumble upon the same problem.

  1. Ubuntu 11 comes with gcc 4.5, which is not supported by CUDA toolkit 4.0. Apparently it will be a lot of work for NVIDIA engineers to support gcc 4.5.
    The Official NVIDIA Forums | NVIDIA
    Since the primary concern is nvcc and not gcc, let us install the older version of gcc and proceed with that. Use sudo apt-get install gcc-4.4 g++-4.4.

  2. Let the rest of the system use gcc 4.5, but nvcc (the CUDA compiler) should see the older version. There is a complex solution that changes the compiler binding and a simpler one that creates symbolic links. Both are given here.
    CUDA incompatible with my gcc version - Stack Overflow

  3. Now you have the compilers ready. So you go to /C and try calling ‘make’. You will most probably get the error that -lcuda is not found. The solution is in the quote (from kynan) above, in the first post in this thread. You have to edit the file mentioned just below the quote.

  4. Try make again. If yours is a fresh installation, you will get the error that -lXi is not found. Some more packages are needed. You can find the dependencies in this thread
    The Official NVIDIA Forums | NVIDIA

  5. Now that all the link dependencies are resolved, you will see all the binaries. But they don’t run. The characteristic error is described here
    The Official NVIDIA Forums | NVIDIA
    Looks like the last part of kynan’s observation does not hold after these few months. Install newer drivers and check if they are running as mentioned in the posts above. But read the next point before attempting it.

  6. It seems to be a common problem that, when the drivers are installed manually rather than with sudo apt-get install, the file that describes where the monitor is can get messed up. The details are explained here.
    Linux No Screens Found Error Fix - YouTube
    You need not do all of this; it is just for a better understanding of the issue.
    What you have to do is, before attempting to update the driver, make a copy of the /etc/X11/xorg.conf file for later comparison. After the driver update, call nvidia-xconfig. If you have more than one GPU, you should call nvidia-xconfig --enable-all-gpus. This creates the data afresh, backing up the old file under the name xorg.conf.backup. If the new config does not work, you can compare the screen’s addresses in the working config and the new one.

  7. Though Ubuntu does not recommend it,
    BinaryDriverHowto/Nvidia - Community Help Wiki
    the persistent blank screen with " Fatal server error: no screens found " is enough motivation to disable the nouveau driver. It is disabled by creating a file as mentioned in
    The Official NVIDIA Forums | NVIDIA
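
Points 6 and 7 can be sketched roughly as follows. Both function names are made up for illustration. The file name blacklist-nouveau.conf and the two lines written into it are assumptions based on common practice, not taken from the linked thread, so compare them with the thread before relying on them.

```shell
# Keep a timestamped copy of an X config file before the driver installer
# touches it. backup_xorg_conf is a hypothetical helper name.
backup_xorg_conf() {
    conf=${1:-/etc/X11/xorg.conf}
    [ -f "$conf" ] || { echo "no $conf to back up" >&2; return 1; }
    cp -p "$conf" "$conf.$(date +%Y%m%d-%H%M%S)"
}

# Write a modprobe blacklist file for nouveau.
# Assumption: file name and contents follow the usual convention.
write_nouveau_blacklist() {
    target=${1:-/etc/modprobe.d/blacklist-nouveau.conf}
    printf '%s\n' 'blacklist nouveau' 'options nouveau modeset=0' > "$target"
}
```

Both default paths are root-owned, so the functions need to run from a root shell. On Ubuntu the blacklist normally takes effect only after sudo update-initramfs -u and a reboot. After the driver update, diff the saved copy against the new /etc/X11/xorg.conf to see what nvidia-xconfig changed.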

With this, the new driver should be up and running.
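
For the compiler part (points 1, 2 and 4), here is a minimal sketch of the symbolic-link approach. The function name link_compilers_for_nvcc is hypothetical, and the default directories assume the toolkit is in /usr/local/cuda and that gcc-4.4 and g++-4.4 were installed into /usr/bin by apt-get. The package names in the comment are my assumption of what provides -lXi and its siblings; check them against the dependency thread linked in point 4.

```shell
# Place gcc/g++ symlinks in the toolkit's bin directory so that nvcc picks
# up the 4.4 compilers while the rest of the system keeps gcc 4.5.
# link_compilers_for_nvcc is a hypothetical helper name.
link_compilers_for_nvcc() {
    cuda_bin=${1:-/usr/local/cuda/bin}   # assumed toolkit bin directory
    gcc_dir=${2:-/usr/bin}               # assumed location of gcc-4.4 / g++-4.4
    [ -e "$gcc_dir/gcc-4.4" ] && [ -e "$gcc_dir/g++-4.4" ] || {
        echo "gcc-4.4 or g++-4.4 not found in $gcc_dir" >&2
        return 1
    }
    ln -sf "$gcc_dir/gcc-4.4" "$cuda_bin/gcc" &&
    ln -sf "$gcc_dir/g++-4.4" "$cuda_bin/g++"
}

# Assumed development packages for the SDK's X11/OpenGL link flags
# (-lXi, -lXmu, -lglut); run manually:
# sudo apt-get install libxi-dev libxmu-dev freeglut3-dev
```

Writing into /usr/local/cuda/bin usually needs root, so either call the function from a root shell or run the two ln commands with sudo. The alternative from the Stack Overflow answer is to leave the toolkit directory alone and pass the compiler location to nvcc with its --compiler-bindir option.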