2 Tesla C1060s with a legacy GeForce FX 5200 card: Need help editing the xorg.conf file for multiple

The latest drivers for the Tesla C1060 have dropped support for the GeForce FX series and older. My current config uses two Tesla C1060s plugged into PCIe x16 slots and a GeForce FX 5200 in a PCI slot (for display). I managed to install the 177.73 drivers (on Ubuntu 8.10) for the C1060s and edited the xorg.conf file so that it uses the open source “nv” driver for the 5200. X starts fine with basic 2D acceleration, but when I start the NVIDIA X server tool, it reports that “You do not appear to be using the nvidia X driver…”. I could use some help editing the xorg.conf file to make both cards work together. Or better yet, is it possible to use both the legacy proprietary driver and the Tesla proprietary driver at the same time?

Below are the relevant sections of my xorg.conf file -

[codebox]Section "Module"
    Load           "glx"
EndSection

Section "Device"
    Identifier     "Videocard0"
    Driver         "nv"
    VendorName     "NVIDIA Corporation"
    BoardName      "GeForce FX 5200"
EndSection

Section "Device"
    Identifier     "Videocard1"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
    BoardName      "Tesla C1060"
    BusID          "PCI:02:00:0"
EndSection

Section "Device"
    Identifier     "Videocard2"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
    BoardName      "Tesla C1060"
    BusID          "PCI:03:00:0"
EndSection

Section "Screen"
    Identifier     "Default Screen"
    Device         "Videocard0"
    Monitor        "Configured Monitor"
    SubSection     "Display"
        Modes      "nvidia-auto-select"
    EndSubSection
EndSection[/codebox]

This is expected behavior if your desktop is being driven by anything other than the nvidia X driver. You cannot load & use two different versions of the nvidia X driver simultaneously.

Thanks netllama. But when I first read your reply 3 days ago, I disregarded it and logged more than 10 hours trying to find a solution for the 5200 and the C1060s to coexist. And nothing… I made absolutely no progress. The Tesla C1060 drivers refuse to load as long as the 5200 is plugged in. I guess I’ll have to shell out some money for a cuda capable GPU that plugs into a PCI slot. But hey… I learned a lot during those 10 hours! ^_^

You can also use a non-NVIDIA card; that might be easier to find than a PCI card that is CUDA capable.

Could you install the nvidia driver, use nouveau in X to drive the 5200, and use the devID script from the 2.1 beta release notes to set up the device entries for /dev/nvidia*?

(oh no now you’re going to waste another ten hours)
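A minimal sketch of the kind of device-node setup suggested above, for the case where no X server running the nvidia driver is around to create `/dev/nvidia*`. It assumes the NVIDIA character devices use major number 195 with minor 255 for the control node, and that GPUs show up in `lspci` as "VGA compatible controller" or "3D controller". The function names are hypothetical and this is not the script from the release notes; it only prints the `mknod` commands instead of running them.

```shell
#!/bin/sh
# Count NVIDIA GPUs in lspci-style text read from stdin.
# Note: this counts every NVIDIA card, including ones the installed
# driver may not support (such as a legacy FX 5200).
count_nvidia_gpus() {
    grep -i 'nvidia' | grep -c -E 'VGA compatible controller|3D controller'
}

# Print (do not execute) the mknod commands for N GPUs.
# Assumption: major 195 is the NVIDIA device major, minor 255 is nvidiactl.
print_mknod_commands() {
    n="$1"
    i=0
    while [ "$i" -lt "$n" ]; do
        echo "mknod -m 666 /dev/nvidia$i c 195 $i"
        i=$((i + 1))
    done
    echo "mknod -m 666 /dev/nvidiactl c 195 255"
}
```

A possible use would be `print_mknod_commands "$(lspci | count_nvidia_gpus)"`, reviewing the output, and then running the printed commands as root after `modprobe nvidia` has succeeded.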

You could just buy a new NV card for display, even if it’s not a CUDA capable one. The problem is the FX series is 5 years old.
Would there be a problem with a series 6 (or even series 5) card?

A PCI 6200 card is about $45. http://www.newegg.com/Product/Product.aspx…N82E16814130289
A waste in many ways, but think of it as being cheaper than the 10 hours of effort already spent.

I am assuming you require a PCI card, so you’re stuck with a limited selection.

If you have a spare PCIE slot (even an 8x slot), you can find GF7 cards for as low as $25.
http://www.newegg.com/Product/Product.aspx…N82E16814130098

I apologize for not replying earlier. I didn’t realize that my forum “email notifications” setting was disabled. I took E.D. Reidijk’s advice and bought an ATI card for the first time in my life, hoping that the ATI and NVIDIA drivers could work simultaneously on Ubuntu. The following is the relevant part of my xorg.conf -

[codebox]Section "Device"
    Identifier      "Configured Video Device"
    Driver          "radeon"
    BusID           "PCI:10:00:0"
EndSection

Section "Device"
    Identifier      "tesla0"
    Driver          "nvidia"
    BusID           "PCI:02:00:0"
EndSection

Section "Device"
    Identifier      "tesla1"
    Driver          "nvidia"
    BusID           "PCI:03:00:0"
EndSection

Section "Monitor"
    Identifier      "Configured Monitor"
EndSection

Section "Screen"
    Identifier      "Default Screen"
    Monitor         "Configured Monitor"
    Device          "Configured Video Device"
EndSection
[/codebox]

Now, when I compile and run deviceQuery from the NVIDIA SDK, it only shows me the CPU (emulation mode) and disregards the two Teslas. This is exactly the same situation I had with the legacy 5200. Is there any way to force CUDA applications to see the Teslas? Isn’t there a way to specify which GPU you want a CUDA application to run on? I am running out of cash, and a solution that does not involve buying a non-legacy PCI GeForce card would be really helpful.

P.S. Newegg is now selling PCI GeForce 9400GT cards for $80. This would be ideal, but I can’t afford to buy 4 of these for my 4 nodes.

To clarify, the ATI card works fine and displays X Windows using the open source “radeon” drivers. But the Teslas should load the non-free “nvidia” drivers that I installed. I’m not very familiar with how drivers work in Ubuntu, so I’d be grateful for almost any form of help. Thanks.
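One possible cause of deviceQuery seeing only the emulation device is that the CUDA runtime cannot open the NVIDIA device nodes, which are normally created when X starts with the nvidia driver. A hedged sketch of a check for that, assuming the nodes are named `/dev/nvidia0` … `/dev/nvidiaN` plus `/dev/nvidiactl`; the function name and the optional directory argument are illustrative only.

```shell
#!/bin/sh
# Print the device nodes that are missing for N GPUs.
# Usage: missing_nvidia_nodes N [DEVDIR]   (DEVDIR defaults to /dev)
missing_nvidia_nodes() {
    n="$1"
    devdir="${2:-/dev}"
    i=0
    while [ "$i" -lt "$n" ]; do
        # One node per GPU: nvidia0, nvidia1, ...
        [ -e "$devdir/nvidia$i" ] || echo "$devdir/nvidia$i"
        i=$((i + 1))
    done
    # The control node is shared by all GPUs.
    [ -e "$devdir/nvidiactl" ] || echo "$devdir/nvidiactl"
}
```

For the two Teslas here, `missing_nvidia_nodes 2` printing nothing would suggest the nodes exist and the problem lies elsewhere (for example, the kernel module not being loaded; `lsmod | grep nvidia` shows that).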

Please generate and attach an nvidia-bug-report.log.

Thanks Lonni. It was news to me that the nvidia Linux drivers even had log files and error reports. So here’s the scenario in which I collected the logs and reports -

  • New installation of Ubuntu 8.04 with latest updates.

  • All the libraries required by CUDA SDK (and gcc for compiling the nvidia drivers) were installed using synaptic.

  • ATI’s fglrx drivers were then installed using Ubuntu’s proprietary hardware drivers app. And graphics acceleration started working fine after a reboot.

  • Rebooted Ubuntu in recovery mode and entered shell prompt to install nvidia’s drivers (180.06)

  • Did not run nvidia-xconfig and left xorg.conf unchanged.

  • Ubuntu rebooted to a black screen. Ctrl+Alt+F1 followed by Alt+F9 revealed that X Windows has started in “low graphics mode” and is unable to load any drivers.

Here is the full xorg.conf that worked fine before I installed the nvidia drivers -

[codebox]
Section "InputDevice"
    Identifier      "Generic Keyboard"
    Driver          "kbd"
    Option          "XkbRules"      "xorg"
    Option          "XkbModel"      "pc105"
    Option          "XkbLayout"     "us"
EndSection

Section "InputDevice"
    Identifier      "Configured Mouse"
    Driver          "mouse"
    Option          "CorePointer"
EndSection

Section "Device"
    Identifier      "Configured Video Device"
    Driver          "fglrx"
EndSection

Section "Monitor"
    Identifier      "Configured Monitor"
EndSection

Section "Screen"
    Identifier      "Default Screen"
    Monitor         "Configured Monitor"
    Device          "Configured Video Device"
    Defaultdepth    24
EndSection

Section "ServerLayout"
    Identifier      "Default Layout"
    screen "Default Screen"
EndSection

Section "Module"
    Load            "glx"
EndSection
[/codebox]

Attached are the Xorg log files, and the bug report. My guess is that installing the nvidia drivers caused the nvidia glx to be loaded for the radeon card. If this is the problem (or something else), how can I get around it?

Thanks,

Cyriac

According to the bug report, you’re using the VESA X driver, and never started X with the nvidia X driver. Therefore, the problem that you’re reporting is expected behavior. You need to configure X to use the ‘nvidia’ X driver for at least one of the C1060 cards, and then start X (successfully).
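To see which driver X actually ended up with, the X log can be filtered for module load/unload lines and errors. A small sketch, assuming the log is at `/var/log/Xorg.0.log` and that the drivers of interest are the ones named in this thread; the function name is hypothetical.

```shell
#!/bin/sh
# Show which video driver modules X loaded or unloaded, plus any (EE) lines.
# Usage: summarize_xorg_log [LOGFILE]   (defaults to /var/log/Xorg.0.log)
summarize_xorg_log() {
    log="${1:-/var/log/Xorg.0.log}"
    grep -E '(Load|Unload)Module: "(nvidia|nv|fglrx|radeon|ati|vesa)"|\(EE\)' "$log"
}
```

If the output shows `nvidia` being loaded and then unloaded, followed by `vesa` being loaded, X fell back to VESA and the `(EE)` lines in between are the place to look.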

Ok. So how do I do that? I was under the impression that it was defaulting to the failsafe:vesa because the ati fglrx drivers could not be loaded alongside nvidia’s glx.

I had also tried the following custom xorg.conf which causes the same symptoms (Ubuntu resets to Low graphics mode aka vesa). In the attached Xorg log files, notice that it ends up loading vesa even though I have explicitly asked for the drivers “nvidia” and “fglrx”. What is it that I’m doing wrong here?

[codebox]# nvidia-xconfig: X configuration file generated by nvidia-xconfig
# nvidia-xconfig: version 1.0 (buildmeister@builder58) Sat Nov 8 18:20:50 PST 2008

Section "ServerLayout"
    Identifier   "SingleHeadConfiguration"
    Screen    0  "Screen0" 0 0
    Screen    1  "Screen1" RightOf "Screen0"
    Screen    2  "Screen2" LeftOf "Screen0"
    InputDevice  "Mouse0" "CorePointer"
    InputDevice  "Keyboard0" "CoreKeyboard"
EndSection

Section "Files"
EndSection

Section "Module"
    Load         "dbe"
    Load         "extmod"
    Load         "fbdevhw"
    Load         "glx"
    Load         "record"
    Load         "freetype"
EndSection

Section "InputDevice"
    Identifier   "Mouse0"
    Driver       "mouse"
    Option       "Protocol" "auto"
    Option       "Device" "/dev/psaux"
    Option       "Emulate3Buttons" "no"
    Option       "ZAxisMapping" "4 5"
EndSection

Section "InputDevice"
    Identifier   "Keyboard0"
    Driver       "kbd"
EndSection

Section "Monitor"
    Identifier   "Monitor0"
    VendorName   "Dell"
    ModelName    "E173FP"
    HorizSync     31.0 - 80.0
    VertRefresh   56.0 - 75.0
    Option       "dpms"
EndSection

Section "Device"
    Identifier   "Tesla0"
    Driver       "nvidia"
    VendorName   "NVIDIA Corporation"
    BoardName    "Tesla C1060"
    BusID        "PCI:02:0:0"
EndSection

Section "Device"
    Identifier   "Tesla1"
    Driver       "nvidia"
    VendorName   "NVIDIA Corporation"
    BoardName    "Tesla C1060"
    BusID        "PCI:03:0:0"
EndSection

Section "Device"
    Identifier   "Radeon0"
    Driver       "fglrx"
    VendorName   "ATI Technologies Inc"
    BoardName    "Radeon HD 2400"
    BusID        "PCI:10:0:0"
EndSection

Section "Screen"
    Identifier   "Screen0"
    Device       "Radeon0"
    Monitor      "Monitor0"
    DefaultDepth  24
    SubSection "Display"
        Viewport  0 0
        Depth     24
        Modes    "1280x1024" "1024x768" "800x600" "640x480"
    EndSubSection
EndSection

Section "Screen"
    Identifier   "Screen1"
    Device       "Tesla0"
    Monitor      "Monitor0"
    DefaultDepth  24
    Option       "UseDisplayDevice" "none"
    SubSection "Display"
        Virtual   800 600
        Depth     24
        Modes    "800x600"
    EndSubSection
EndSection

Section "Screen"
    Identifier   "Screen2"
    Device       "Tesla1"
    Monitor      "Monitor0"
    DefaultDepth  24
    Option       "UseDisplayDevice" "none"
    SubSection "Display"
        Virtual   800 600
        Depth     24
        Modes    "800x600"
    EndSubSection
EndSection

Section "Extensions"
    Option       "Composite" "Disable"
EndSection[/codebox]

Here is an excerpt from a post in 2006 - https://lists.ubuntu.com/archives/ubuntu-de…ber/022308.html

“I noted that xorg-driver-fglrx conflicts with nvidia-glx. They both divert libGL.so, so they can’t really work together in any meaningful way. I had tossed around the idea at one point of writing a libGL wrapper that could at least allow us to avoid diverting it, but it still wouldn’t let you use both fglrx and nvidia together, as applications would have no way of specifying which libGL they wanted to write to. (Okay, I suppose if the wrapper was Xinerama-aware and other fancy things, and could throw calls for one card to one libGL and calls for another to the other, it could perhaps work, but this all just sounds very sick, twisted, and utterly wrong… If you really want them to be able to work together, encourage NVIDIA and ATI to work with the Xorg/mesa folk on unifying libGL instead of providing their own)”

Can anyone confirm that this is still the current state of affairs and that I was very stupid to buy four PCI HD 2400 cards and expect them to work as display devices alongside Tesla C1060s…?

libGL is used for OpenGL support. It is not needed for CUDA unless you need OpenGL interop support.

GL would be nice to have for CUDA visualization apps. But my priority is just this -

How do I get CUDA apps to run on the Teslas when the primary graphics adapter is a Radeon HD 2400?

And if it is just a matter of editing xorg.conf, how do I correct the files I showed above?

If it is more than just the xorg.conf, what is it that I’m missing?

Thanks,

Cyriac

You can’t get simultaneous OpenGL support in X for both NVIDIA’s GPUs and another vendor’s GPUs. Whichever OpenGL support you install last is the one that will be used for whichever X driver it supports.

Thanks. But what about CUDA? Can I run CUDA apps on the Teslas while the Radeon is used only for display?

Please, anyone? I really need to get CUDA apps running on the Teslas. The Radeon is just there to get the system to POST. I don’t care if it uses vesa.
Even if you think this is an impossible configuration, please let me know so that I can stop wasting my time.

In theory it should work, however I’ve never tried to run anything CUDA related with an ATI GPU. What is the current problem?

Ok… to make it easier, I’ve switched to a GeForce 6200 and used the default xorg.conf. But, the driver craps out as follows:

[codebox](II) Setting vga for screen 0.
(**) NVIDIA(0): Depth 24, (--) framebuffer bpp 32
(==) NVIDIA(0): RGB weight 888
(==) NVIDIA(0): Default visual is TrueColor
(==) NVIDIA(0): Using gamma correction (1.0, 1.0, 1.0)
(**) NVIDIA(0): Enabling RENDER acceleration
(II) NVIDIA(0): Support for GLX with the Damage and Composite X extensions is
(II) NVIDIA(0): enabled.
(EE) NVIDIA(0): Failed to initialize the NVIDIA graphics device!
(II) UnloadModule: "nvidia"
(II) UnloadModule: "wfb"
(II) UnloadModule: "fb"
(EE) Screen(s) found, but none have a usable configuration.

Fatal server error:
no screens found[/codebox]

There are also these kernel errors:

[codebox] /var/log/messages:
Jan 26 18:15:58 gpu2 kernel: [ 333.639010] NVRM: request_mem_region failed for 16M @ 0xfb000000. This can
Jan 26 18:15:58 gpu2 kernel: [ 333.639011] NVRM: occur when a driver such as rivatv is loaded and claims
Jan 26 18:15:58 gpu2 kernel: [ 333.639012] NVRM: ownership of the device's registers.
Jan 26 18:15:58 gpu2 kernel: [ 333.639026] NVRM: The NVIDIA probe routine failed for 1 device(s).
Jan 26 18:15:58 gpu2 kernel: [ 333.639028] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 180.06 Sat Nov 8 17:50:38 PST 2008
Jan 26 18:16:43 gpu2 kernel: [ 378.827720] NVRM: request_mem_region failed for 16M @ 0xfb000000. This can
Jan 26 18:16:43 gpu2 kernel: [ 378.827721] NVRM: occur when a driver such as rivatv is loaded and claims
Jan 26 18:16:43 gpu2 kernel: [ 378.827722] NVRM: ownership of the device's registers.
Jan 26 18:16:43 gpu2 kernel: [ 378.827737] NVRM: The NVIDIA probe routine failed for 1 device(s).
Jan 26 18:16:43 gpu2 kernel: [ 378.827739] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 180.06 Sat Nov 8 17:50:38 PST 2008
Jan 26 18:18:34 gpu2 kernel: [ 44.812830] NVRM: request_mem_region failed for 16M @ 0xfb000000. This can
Jan 26 18:18:34 gpu2 kernel: [ 44.812830] NVRM: occur when a driver such as rivatv is loaded and claims
Jan 26 18:18:34 gpu2 kernel: [ 44.812831] NVRM: ownership of the device's registers.
Jan 26 18:18:34 gpu2 kernel: [ 44.812844] NVRM: The NVIDIA probe routine failed for 1 device(s).
Jan 26 18:18:34 gpu2 kernel: [ 44.812845] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 180.06 Sat Nov 8 17:50:38 PST 2008
[/codebox]

I’ve been googling these errors for a while now, with no luck. Any kind of insight would be appreciated. If you can provide a workable xorg.conf, that would be awesome! Attached is the full bug report.
nvidia_bug_report.log.txt (155 KB)
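The NVRM message says another driver may have claimed the card's register region. A hedged sketch of two checks for that: one scans `lsmod` output for modules that could plausibly bind to an NVIDIA card, the other looks up the failing address in `/proc/iomem` to see who owns it. The module list (rivafb, nvidiafb, rivatv, nouveau) is an assumption, not an exhaustive or authoritative list, and the function names are illustrative.

```shell
#!/bin/sh
# Read lsmod-style text on stdin and print loaded modules that might
# claim an NVIDIA card's registers. The list below is an assumption.
find_conflicting_modules() {
    awk '$1 ~ /^(rivafb|nvidiafb|rivatv|nouveau)$/ { print $1 }'
}

# Print the /proc/iomem entries that mention a given address.
# Usage: show_region_owner ADDRESS [IOMEM_FILE]
show_region_owner() {
    grep -i -- "$1" "${2:-/proc/iomem}"
}
```

For the log above this would be `lsmod | find_conflicting_modules` and `show_region_owner fb000000`. Built-in framebuffer drivers do not appear in `lsmod`, so the `/proc/iomem` lookup is the more telling of the two.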