Cuda 8.0 driver .run file won't run because "X server is running"

The file tells me to see the readme file on the driver page for how to fix this. I can’t find a readme file on the driver page. HELP, PLEASE.

Thanks,

Also, the replacing of my X server is a feature I DON’T WANT and CAN’T DO, because it apparently breaks QT Creator v5.8, which I NEED. Can you please tell me how to get around this? I don’t need or want your X server. The one I have works with remote desktops and yours, of course, doesn’t. I say “of course” not in a derogatory way at all; I assume it wants to use the local GPU memory as display memory. But this is a server (I’m using a server-class GPU, the M60), and I am not using the local display. I think that should not be the default for server-class computational acceleration GPUs. The thought of using this thing as a display adapter frankly never even occurred to me. I’m using it for computation. If you are going to add an X server (for demonstration purposes, I guess, to impress the CEO by showing him pretty pictures of the computation in progress), you have to make it hand off to the other X server when I’m not using a local display, or something.

Thanks,
Walt

Here’s a driver page:

[url]http://www.nvidia.com/Download/driverResults.aspx/114708/en-us[/url]

Click on that link above.

Then click on “Additional Information”

Then click on the README link.

The driver installer will not affect your existing X server if (do both of these):

  1. You use the command line option to not install the OpenGL files. (Run the driver installer with the -A command line switch to get a list of these options, or study the README.) You won’t need these files for ordinary compute tasks.
  2. You select “No” when prompted to modify your xorg.conf file.
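A minimal sketch of step 1, assuming your driver runfile accepts the --no-opengl-files option (confirm this in the -A output for your driver version). The function name driver_install_cmd and the installer filename are placeholders, not something taken from the README. The helper only prints the command, so you can review it before running it yourself as root with X stopped.

```shell
#!/bin/sh
# Dry-run helper: prints the driver-installer command instead of executing it.
# The argument is the path to the .run file you downloaded (placeholder below).
driver_install_cmd() {
    installer="$1"
    # --no-opengl-files: assumed to skip NVIDIA's OpenGL/GLX files,
    # leaving the existing X stack's libraries alone.
    printf 'sh %s --no-opengl-files\n' "$installer"
}

# Hypothetical filename; substitute the real one.
driver_install_cmd ./NVIDIA-Linux-x86_64-XXX.YY.run
```

Step 2 is still answered interactively: when the installer asks about modifying the X configuration, choose “No”.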

By the way, NVIDIA sells the M60 primarily as a remote/virtualized graphics adapter, not as a compute device. This use case doesn’t require a local display, and in fact the M60 does not support local displays. You can, of course, use it for compute, like all NVIDIA GPUs.

[url]http://www.nvidia.com/object/tesla-m60.html[/url]

I found the readme file where you told me to look, and I followed it closely to install the driver. Then I reinstalled the toolkit. Unfortunately, I must have done something wrong along the way. Probably I had not yet read the part about the OPENGL libraries, or I answered wrong about the xorg.conf file (I do not actually have such a file, and when it asked me something about the X server, I definitely said “No”).

So now my cuda software builds and runs against 8.0. Unfortunately, I am getting this:
/home/wkailey/dev/imagesciences/nimp/aut/scaleRotateTrans
wkailey@fourier$ qtcreator &
[1] 9563

/home/wkailey/dev/imagesciences/nimp/aut/scaleRotateTrans
wkailey@fourier$ [xcb] Too much data requested from _XRead
[xcb] This is most likely caused by a broken X extension library
[xcb] Aborting, sorry about that.
qtcreator: xcb_io.c:736: _XRead: Assertion `!xcb_xlib_too_much_data_requested' failed.

Can you tell me how to fix this? Do I need to uninstall and reinstall the Cuda 8.0 toolkit, or is there an easier way than that?

Thanks,
Walt

I tried
$ zypper se nv
which gave (in part)
i | cuda-nvgraph-8-0 | NVGRAPH native runtime libraries | package
i | cuda-nvgraph-dev-8-0 | NVGRAPH native dev links, headers | package
i | cuda-nvml-dev-8-0 | NVML native dev links, headers. | package
i | cuda-nvrtc-7-5 | NVRTC native runtime libraries | package
i | cuda-nvrtc-8-0 | NVRTC native runtime libraries | package
i | cuda-nvrtc-dev-7-5 | NVRTC native dev links, headers | package
i | cuda-nvrtc-dev-8-0 | NVRTC native dev links, headers | package

. . .

| nvidia-glG03 | NVIDIA GL libraries for OpenGL acceleration | package
| nvidia-glG04 | NVIDIA GL libraries for OpenGL acceleration | package

-----> Note the lack of an ‘i’ to the left of the NVIDIA GL libraries, which means, I think, that they are not installed <-----

So maybe I did that part right(?)
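Reading the status column by eye is easy to get wrong, so here is a small sketch that does it mechanically. It assumes the row layout shown above (status flag, name, summary, type, separated by “|”); the function name zypper_status is made up for illustration.

```shell
#!/bin/sh
# Classify rows of `zypper se` output read from stdin.
# Prints "installed NAME" or "not-installed NAME" for every table row.
zypper_status() {
    awk -F'|' '
        NF >= 4 {
            status = $1; name = $2
            gsub(/[ \t]/, "", status)              # strip padding around the status flag
            gsub(/^[ \t]+|[ \t]+$/, "", name)      # trim the package name
            if (name == "Name" || name == "") next # skip the table header
            label = (status ~ /^i/) ? "installed" : "not-installed"
            print label, name
        }'
}

# Example with two rows copied from the output above:
printf '%s\n' \
    'i | cuda-nvrtc-8-0 | NVRTC native runtime libraries | package' \
    '| nvidia-glG03 | NVIDIA GL libraries for OpenGL acceleration | package' |
    zypper_status
```

On a live system the same function could be fed directly, for example with zypper se nv piped into zypper_status.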

just to make sure I tried
$ zypper remove nvidia-glG03

This gave the following output:
fourier:/var/log # zypper remove nvidia-glG03
Loading repository data…
Reading installed packages…
Package ‘nvidia-glG03’ is not installed.

This seems to confirm that the OPENGL libraries are not installed. This is good, I take it? So can you help me understand why QT does not work in that case? I know I answered “NO” to the question about whether I wanted it to mess with my X server.

By the way, my X server has a “config directory”, not a “config file”, it looks like, based on the logs in /var/log/Xorg.0.log:

$ grep config Xorg.0.log
gives, in part
[ 16.566] (==) Using config directory: "/etc/X11/xorg.conf.d"
[ 16.566] (==) Using system config directory "/usr/share/X11/xorg.conf.d"
Using a default monitor configuration.
If no devices become available, reconfigure udev or disable AutoAddDevices.
[ 16.838] (==) Matched mga as autoconfigured driver 0
[ 16.838] (==) Matched nvidia as autoconfigured driver 1
[ 16.838] (==) Matched nvidia as autoconfigured driver 2
[ 16.838] (==) Matched nouveau as autoconfigured driver 3
[ 16.838] (==) Matched nv as autoconfigured driver 4
[ 16.838] (==) Matched nvidia as autoconfigured driver 5
[ 16.838] (==) Matched nvidia as autoconfigured driver 6
[ 16.838] (==) Matched nouveau as autoconfigured driver 7
[ 16.838] (==) Matched nv as autoconfigured driver 8
[ 16.838] (==) Matched mga as autoconfigured driver 9
[ 16.838] (==) Matched modesetting as autoconfigured driver 10
[ 16.838] (==) Matched fbdev as autoconfigured driver 11
[ 16.838] (==) Matched vesa as autoconfigured driver 12
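The list above repeats several drivers. A short sketch that reduces such a log to the distinct driver names X tried, assuming the “Matched NAME as autoconfigured driver N” wording shown above; the function name autoconfigured_drivers is hypothetical.

```shell
#!/bin/sh
# List each driver that Xorg autoconfigured, once, in first-seen order.
# stdin: contents of an Xorg log such as /var/log/Xorg.0.log
autoconfigured_drivers() {
    awk '/ as autoconfigured driver / {
        for (i = 1; i < NF; i++) {
            if ($i == "Matched") {
                d = $(i + 1)          # the word after "Matched" is the driver name
                if (!(d in seen)) {
                    seen[d] = 1
                    print d
                }
            }
        }
    }'
}

# Example on a live system:
# autoconfigured_drivers < /var/log/Xorg.0.log
```

If nvidia or nouveau shows up in that list on a compute-only machine, X is probing the GPU.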

It’s quite possible that your desired X stack was corrupted during your previous GPU driver installs. It’s necessary to follow the instructions I mentioned (or similar); if you don’t, your X stack will be broken, and removing CUDA and/or the driver doesn’t (in my experience) suddenly revert the X stack to its unbroken state.

You can certainly try removing the NVIDIA GPU driver (which is what can break your X stack, not the CUDA toolkit per se), you may get lucky. Otherwise, the correct method to do this would depend on which GPU your X stack is actually hosted/running on.

In my experience the most straightforward bulletproof fix is to reinstall the OS. If that seems unpalatable, then it would be necessary for you to research a “reinstall” of your non-NVIDIA-based X stack. I wouldn’t be of much use there. The thing I would probably try would be to find the xorg driver that is associated with your non-NVIDIA GPU, then use the package manager to uninstall and reinstall that driver. The package manager will depend on your OS, so for Ubuntu it would be apt (apt-get), for RHEL it would be yum, etc.

As an aside, the nouveau driver has not been removed from your system. That is a problem. Its removal is covered in the readme as well as the CUDA linux install guide:

Installation Guide Linux :: CUDA Toolkit Documentation

Your X system seems to have autodetected the NVIDIA GPU and is loading various drivers for it. For compute purposes only, you really do not want that. Configuring X on your particular distro is covered in various places.

txbob, please note that when I uninstalled the 8.0 installation that I had put over the 7.5 installation, as well as certain components of the original 7.5 installation itself, then QT started working again.

I then proceeded to download and install the 8.0 driver followed by a re-install of the 8.0 toolkit. The toolkit and driver combo failed because it left the 7.5 driver in place, even though the toolkit is incompatible with this. I could build against 8.0 but could not run at that point. I did not test QT at that point.

So I manually uninstalled the 7.5 driver and the 8.0 toolkit, then manually installed the 8.0 driver and 8.0 toolkit, following your (quite lengthy) readme as well as I was able. Note: I read and attempted to follow chapters 2, 3, 4, and 6 (chapter numbers from memory), as Chapter 1 directs. Apparently OpenGL was NOT installed (as shown in my post above), and I know I said no to the question about the X server. This more careful install of the 8.0 driver and toolkit made the build and run of my cuda app work, which is a real improvement!

However, the bad news is that I am still getting these QT errors at this point. Is there more information I can post here, or can you advise me how to proceed?

By the way, I did not understand that the M60 was designed as a graphics display driver, because, even though I did read the product description page that was linked to, I did not understand all the jargon. Not being in the business of consumer hardware or gaming (I am a high speed image processing specialist, but it isn’t consumer stuff, and my images don’t go to video displays), I did not understand the jargon on that page to mean that the thing was for projecting remote displays. I took “remote workstation” to mean a station for doing heavy computational tasks that is accessed remotely, and that is exactly what my server is, in which I have installed this card. So, while the misunderstanding was entirely on my part, not yours, I thought I would explain to you how it occurs. I’m just a physicist, not a gamer, nor a game developer.

Cordially,
Walt

More info: I’m using OpenSuse 13.2 on x86_64 Linux, and that is the precise target for which I downloaded and installed the 8.0 Cuda toolkit and driver. Please let me know what other information, if any, you need.

Thanks,
Walt

further response to:
“You can certainly try removing the NVIDIA GPU driver (which is what can break your X stack, not the CUDA toolkit per se), you may get lucky. Otherwise, the correct method to do this would depend on which GPU your X stack is actually hosted/running on.”

Please explain: how could I run the Cuda application I have developed without the Cuda driver installed? I’m afraid I don’t understand.

With regard to: “this would depend on which GPU your X stack is actually hosted/running on”, I would be surprised if my X stack is running on any GPU, unless it can both run on a local GPU and display on a remote machine. I use this machine remotely from realVNC that is running on a Windows laptop. So, while I do not know much of anything about X, I can tell you that the X server is being asked to project its windows remotely over VNC that is running on a Windows laptop. Not sure if that helps, though. If you can tell me how to get the information you are looking for, I will do whatever is necessary.

Thanks,
Walt

Question about:
“As an aside, the nouveau driver has not been removed from your system. That is a problem. Its removal is covered in the readme as well as the CUDA linux install guide”

What is a nouveau driver? I did read the readme, and I’m afraid I must have overlooked the part that told me to uninstall the nouveau driver, whatever that is, but I’ll give it a try and report back here.

Sorry to be a bit segmented in my responses to your posts above, but as you might guess I’m doing a number of things at once here at my end.

Cordially,

Walt

The solution that has the highest probability of success would be to reload a clean copy of your OS and start from there.

The instructions in the linux install guide will allow you to get CUDA up and running.

If you don’t want any effect on your existing X setup, then follow the instructions I previously gave. There are similar instructions that can be used directly with the CUDA 8 installer, to prevent modification of your X system.

Since you broke your X stack before you even started this thread, it’s possible that nothing I tell you will fix it.

Regarding this:

“Please explain: how could I run the Cuda application I have developed without the Cuda driver installed? I’m afraid I don’t understand.”

The initial objective is to get your X (remote) display working correctly (right? Isn’t that what you started asking at the beginning of this thread?). It’s possible that removing the NVIDIA software you installed may get it working again. I doubt it, but you could try it if you wish. I was not suggesting that removal of the GPU driver was a long term solution. It was a possible first step in the process to get your X display working again. If it were successful, then at that point I think you could follow the instructions I previously gave to get the GPU driver loaded (again) without corrupting your existing X stack.

Thank you for that. I will try to do as you suggest.

By the way, as I reported above (but it may have gotten buried in too much information), when I did uninstall the 7.5 and 8.0 tool sets (with the exception of some piece of the 7.5 driver that remained behind and caused the reinstall of 8.0 to refuse to work), it DID fix QT. That’s how I first knew that it was the Nvidia Cuda tools, version 7.5 that originally introduced the conflict with my QT creator that was formerly working fine alongside Nvidia Cuda tools 7.2.

Thanks for the help. I’ll let you know how it goes.

-Walt

If you can get back to that point where QT was fixed, then do it, as your first step.

After that, follow the instructions in the linux install guide to clean up (remove) old installs:

Installation Guide Linux :: CUDA Toolkit Documentation

Then run the GPU driver install using the 2 steps I gave above (no opengl files, no mod to X files). Use the runfile installer method for driver install, but also follow the instructions in the linux install guide to remove nouveau:

[url]http://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#runfile-nouveau[/url]

Don’t bother with the CUDA install at that point, just verify that your QT still works. If it does, you should be good to go. Install CUDA 8 at that point (using runfile installer), and decline the step to install the NVIDIA GPU driver, since you already have that installed.

Before starting this process, you may want to read the entire linux install guide. It’s considerably more concise than the driver README. I only pointed out the driver README because you asked where it was.

Especially be aware that there are 2 different install methods:

  • package manager install (I think it would use zypper in your case, I know very little about your OS)
  • runfile install

These two methods are different, and they cannot be intermixed. Use the runfile install method, and stick with it for the life of your machine. These two methods exist for both driver install and CUDA install. If you have mixed the two, you will likely need to remove any NVIDIA software (see above) and start over.
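To see whether both methods have left traces on a machine, something like the following sketch may help. It assumes an RPM-based system and that the runfile installers leave their usual uninstallers behind; the two paths checked and the function name detect_install_methods are assumptions, so adjust them to what you actually find on disk.

```shell
#!/bin/sh
# Report which NVIDIA install methods have left traces on a system.
# stdin: output of `rpm -qa`; $1: optional filesystem prefix (empty = live system).
detect_install_methods() {
    root="${1:-}"
    pkg=no
    run=no
    # Any installed RPM whose name mentions cuda or nvidia counts as package-manager.
    if grep -i -E -q 'cuda|nvidia'; then pkg=yes; fi
    # Uninstallers normally left behind by the runfile installers (assumed paths).
    if [ -e "$root/usr/bin/nvidia-uninstall" ] ||
       [ -e "$root/usr/local/cuda-8.0/bin/uninstall_cuda_8.0.pl" ]; then
        run=yes
    fi
    echo "package-manager=$pkg runfile=$run"
    if [ "$pkg" = yes ] && [ "$run" = yes ]; then
        echo "mixed: remove everything and start over with one method"
    fi
}

# Example on the live system:
# rpm -qa | detect_install_methods
```

A “mixed” result is only a hint that the cleanup steps in the install guide are worth doing before the next attempt.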

Hi,

I went to one of my Linux guys and asked for help to uninstall the nouveau drivers, and he showed me this:

fourier:~ # lsmod | grep nou
fourier:~ #

According to my Linux guy, that means the nouveau driver is not installed.
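One caveat on that check: lsmod lists kernel modules that are currently loaded, so an empty result shows that nouveau is not loaded right now, which is not quite the same as not installed. A small sketch that makes the loaded check exact (grep nou would also match any other module with “nou” in its name); the function name module_loaded is made up for illustration.

```shell
#!/bin/sh
# Succeed (exit 0) if the named module appears in lsmod output.
# $1: module name; stdin: output of `lsmod`
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }'
}

# Example on a live system:
# if lsmod | module_loaded nouveau; then
#     echo "nouveau is loaded"
# else
#     echo "nouveau is not loaded"
# fi
```

Whether the module exists on disk at all can be checked separately, for instance with modinfo nouveau.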

Next, I used the nvidia-uninstall script provided with the stuff I installed, and it took away the Nvidia drivers successfully.

QT Creator immediately started working again.

But now my Cuda application does not work :-( --of course you warned me about that.

Please advise how to proceed next, thanks.

Disregard that last question; I see that your previous post does contain information about how to proceed next. I will attempt to do that. Table 3 of Section 2.7 presents a lot of research opportunities, because at this point I don’t really remember how either of the two installed toolkits was installed. I’ve installed and uninstalled so many that I have no idea, honestly.

My issue was resolved by uninstalling everything that was at all related to cuda or nvidia, including the nvidia driver, which I found using zypper se and rpm -qa queries. Then I rebooted and installed JUST the cuda-8.0 toolkit, letting it install its own driver. When I had tried to install the driver from its own driver install page (where I had gone to look at the referenced README file), that caused the problem EVERY TIME. So that was the key: do NOT use an already-installed driver with the cuda toolkit installer. Rather, start from a position of driver NOT installed, stop the X server using the command
$ rcxdm stop
and then run the cuda_8.0 toolkit installer. Say yes to let it install the cuda driver. Say no to OpenGL. Say no to the X server question. Then, and only then, did everything work together, including QT, the Cuda tools, and the Nvidia driver.
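For anyone repeating this, the sequence can be sketched as a dry run. The rcxdm stop command is the one given above. The --no-opengl-libs option is my assumption for how the CUDA 8 runfile expresses “no to OpenGL” non-interactively, and the runfile name and the function name cuda8_install_plan are placeholders. The helper prints the commands rather than running them.

```shell
#!/bin/sh
# Dry-run helper: prints the install sequence described above, one command per line.
# $1: path to the CUDA 8.0 runfile (placeholder name in the example below).
cuda8_install_plan() {
    runfile="$1"
    # Stop the display manager so the installer does not find X running.
    echo "rcxdm stop"
    # Assumed flag: keep NVIDIA's GL libraries off the system.
    echo "sh $runfile --no-opengl-libs"
}

# Hypothetical filename; substitute the real one.
cuda8_install_plan ./cuda_8.0.XX_linux.run
```

The remaining choices are made at the installer’s prompts: yes to the bundled driver, no to the X configuration question.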

Thanks a lot for putting up with all my ignorance! My knowledge of Linux systems is limited, as you can tell. Thanks for your patience and help!