Tesla C870 and Linux RHEL 4.5

Hi,

I have a Tesla C870 board installed in a 64-bit Intel system running Linux RHEL 4.5.
I’ve installed the latest CUDA 1.1 driver, Toolkit and SDK.

When I run non-graphics SDK samples like deviceQuery, matrixMul, bandwidthTest etc., they work fine. However, when I run OpenGL samples like fluidsGL or particles, I get the error “GLX extension is not supported by the display :0.0”.

I have an on-board Intel 945 chipset and the monitor is connected to this onboard videoout.

Also, after installing the driver, when I configure X using nvidia-xconfig and try to run startx, X does not start and I get a fatal error: ‘XIO Fatal IO error 104 (connection reset by peer) on X Server:0.0’

X runs fine, however, with the default Intel 945 chipset.

I have attached a couple of logs, in case it helps.

Let me know if someone has come across a similar problem and has found a solution to it.

Thanks,
Shailesh
startX_error.txt (1.36 KB)
xorg.conf_with_nvidia.txt (3.16 KB)
glxinfo.txt (809 Bytes)

The GLX error is expected behavior unless you are running an X screen on the Tesla GPU and have exported the DISPLAY variable to point at that Tesla X screen.

You are loading the wrong module for your chipset.

Section “Device”
Identifier “Videocard0”
Driver “nvidia” <<<---------------
VendorName “Videocard vendor”
BoardName “Intel 945”
EndSection

I would try this:

  1. Reinstall X for your chipset.
  2. Install the NVIDIA driver for Tesla, but do not run nvidia-xconfig.
  3. Look in the release notes for the script that loads the nvidia driver and creates the proper /dev/nvidia* entries.
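
For step 3, here is a rough sketch of what such a script does, written as a dry run that only prints the commands. The helper name nvidia_mknod_cmds is made up for illustration. The NVIDIA driver uses character-device major number 195, with minor 255 for the control node. On a real system you would run `modprobe nvidia` as root first and then execute the printed mknod commands.

```shell
#!/bin/sh
# Dry-run sketch: print the mknod commands for N NVIDIA GPUs.
# nvidia_mknod_cmds is a hypothetical helper name, not part of the driver.
nvidia_mknod_cmds() {
    count="$1"
    i=0
    while [ "$i" -lt "$count" ]; do
        # One node per GPU: major 195, minor = GPU index.
        echo "mknod -m 666 /dev/nvidia$i c 195 $i"
        i=$((i + 1))
    done
    # The control node always uses minor 255.
    echo "mknod -m 666 /dev/nvidiactl c 195 255"
}

# One Tesla C870 in the system, so one GPU node plus the control node.
nvidia_mknod_cmds 1
```

Printing the commands instead of running them lets you check the node names before touching /dev.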

Thanks very much for your reply.

The xorg.conf does indeed have a correct entry for the Intel chipset. I am enclosing the file. The xorg.conf_with_nvidia file (that I attached yesterday) is what you get after running nvidia-xconfig.

I tried the other two steps that you suggested.

I see /dev/nvidia0 and /dev/nvidiactl created in /dev.

Unfortunately, I still cannot run fluidsGL or similar apps.

I still see the “OpenGL GLX extensions not support by DISPLAY” error.

I have attached the output of glxinfo, xorg.conf and /var/log/Xorg.0.log

Please help in further debugging.

Thanks,

Shailesh
Xorg.0.log.txt (55.7 KB)
glxinfo_log.txt (788 Bytes)
xorg.conf_intel.txt (2.71 KB)

You are loading the GLX module from NVIDIA.

(II) LoadModule: “glx”
(II) Loading /usr/X11R6/lib64/modules/extensions/libglx.so
(II) Module glx: vendor=“NVIDIA Corporation” <------------------------------
compiled for 4.0.2, module version = 1.0.0
Module class: X.Org Server Extension
ABI class: X.Org Server Extension, version 0.1
(II) NVIDIA GLX Module 169.09 Fri Jan 11 14:46:52 PST 2008
(II) Loading extension GLX

and it results in an error later on:
(EE) Failed to initialize GLX extension (Compatible NVIDIA X driver not found)

Try to get the integrated graphics adapter to work with glxgears. Once you have it working, install the NVIDIA driver but skip nvidia-xconfig, then create the entries in /dev and load the nvidia kernel module.
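
To see which GLX module X actually loaded, you can grep the X log for the two lines quoted above. A minimal sketch follows. The helper name check_glx_log is hypothetical, and the log path is the /var/log/Xorg.0.log mentioned earlier in the thread.

```shell
#!/bin/sh
# Sketch: report the GLX module vendor and any GLX init failure in an X log.
# check_glx_log is a hypothetical helper name.
check_glx_log() {
    log="$1"
    # Vendor line that X prints when it loads the glx module.
    grep 'Module glx: vendor=' "$log" || echo "glx vendor line not found"
    # (EE) lines mentioning GLX mean the extension failed to initialize.
    if grep -q '(EE).*GLX' "$log"; then
        echo "GLX failed to initialize"
        return 1
    fi
    echo "no GLX errors found"
}

# Only inspect the system log if it exists and is readable.
if [ -r /var/log/Xorg.0.log ]; then
    check_glx_log /var/log/Xorg.0.log || true
fi
```

If the vendor line says NVIDIA while the active X driver is the Intel one, you are in the situation described above.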

@shailesh
If you want to run the SDK samples on the Tesla C870 GPUs then you’ll need to configure X to run on the C870 in addition to configuring X to use the integrated Intel GPU. Once you have X working on the Intel integrated GPU, for the C870, you should run the following as root:
nvidia-xconfig --virtual=800x600 --use-display-device=none

and then restart X.

At this point you will still need to set the DISPLAY environment variable to point to the X screen that is running (headless) on the C870 if you wish to run fluidsGL (or any other sample which requires X). You can do so as follows:
export DISPLAY=:0.1

At this point you should be able to run fluidsGL on the C870. However, if you wish to see the rendering, you will need to take a screenshot of the X screen running on the C870. This can be done with the ‘import’ utility from the ImageMagick package (which ships with RHEL 4.5):
import -window root screenshot.jpg

Please note that you will not be able to run any of the samples on the Intel X screen as the NVIDIA display driver overwrites the GLX (Mesa) X module that the Intel X driver requires with a new one that the nvidia X driver requires.
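
One way to check which OpenGL implementation a given X screen exposes is to look at the vendor string reported by glxinfo. The sketch below assumes the headless C870 screen is :0.1, as in the export command above. The helper name gl_vendor is made up for illustration.

```shell
#!/bin/sh
# Sketch: extract the OpenGL vendor from glxinfo output read on stdin.
# gl_vendor is a hypothetical helper name.
gl_vendor() {
    sed -n 's/^OpenGL vendor string: *//p'
}

# Query the headless Tesla screen only if glxinfo is installed.
# :0.1 is assumed to be the X screen running on the C870.
if command -v glxinfo >/dev/null 2>&1; then
    DISPLAY=:0.1 glxinfo 2>/dev/null | gl_vendor
fi
```

If nothing is printed, glxinfo could not open that screen or the screen has no working GLX.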

If you have further questions please generate and attach an nvidia-bug-report.log (as root by running nvidia-bug-report.sh).

thanks,
Lonni

Thanks Lonni for the detailed explanation.

My X works well with the integrated Intel 945 chipset, and I tried “nvidia-xconfig --virtual=800x600 --use-display-device=none” as you suggested.

However, startx fails after this command and returns the same fatal error.

I also tried the export DISPLAY step.

I’ve attached the nvidia-bug-report with this thread.

Thanks very much.

Shailesh
nvidia_bug_report.log.txt (98.3 KB)

Hi, Thanks for the info.

Since there was no complete uninstall for the Nvidia driver, I did a complete RHEL reinstallation and am able to run glxgears now.

I will install the Nvidia driver and try again now.

Is there a separate Nvidia kernel module?

Really appreciate your help,

Thanks very much,

Shailesh

OK, so I followed exactly the steps that you recommended.

I re-installed RHEL as mentioned yesterday.

I checked that glxgears was running fine.

I installed the Nvidia driver without running nvidia-xconfig, and also installed the Toolkit and samples.

However, after all this I am back to the same problem: failed to initialize GLX extension. glxgears has stopped working.

Since this is only a test system, I am able to make changes to it at will.

So I am trying the Fedora 8 option tonight, just to check.

In case you have any comments on the RHEL setup, do let me know.

Thanks,

Shailesh

There should not be any need to reinstall the OS.

Please try using the attached xorg.conf. You may need to replace “i810” with whichever X driver is required for the onboard Intel GPU.

If this doesn’t work, please generate a new bug report.
xorg.conf.txt (3.4 KB)

OK, I was able to run glxgears with the Nvidia driver and other software in place. Engineers from Nvidia (Pune) helped ensure that the correct lib64 files are used when running glxgears.

However, fluidsGL and similar applications were still not running, and I was getting the following error: “ERROR: Support for necessary OpenGL extensions missing.”

The engineers asked me to get the latest Mesa, which I did, and I installed the latest Mesa 7.0.2 Demos, GLUT and Lib. The latest Mesa demos like gears, gloss and teapot work fine. However, when I try to run fluidsGL and the other samples, I get the same error as above.

One observation: while running an application called fslight from Mesa, I got the following message

./fslight

This program requires OpenGL 2.x, found 1.3 Mesa 6.2.1

Also, I had copied the files from the lib64 directory of Mesa to /usr/X11R6/lib64.

(lib64.txt has the list of files I copied)

I also checked (ldd fluidsGL, ldd simpleTexture etc.) whether the same files were being picked up. I’ve attached the logs FYI.

What additional OpenGL files are required to run these samples?

Thanks,

Shailesh
marchingcubes_err.txt (1.15 KB)
lib64.txt (344 Bytes)
simpleGL_err.txt (1.16 KB)
fslight.txt (66 Bytes)
fluidsGL_err.txt (670 Bytes)

The OpenGL headers that ship with Mesa are mutually exclusive of the OpenGL headers that ship with the NVIDIA X driver. You cannot use them both, and installing/updating Mesa after installing the NVIDIA X driver will result in all OpenGL support from the nvidia X driver being removed.

Thus, it's entirely expected that glxgears would run (falling back to the software Mesa OpenGL) and that any apps which require hardware acceleration via the nvidia driver would fail.
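
A quick way to see which libGL is currently in place is to resolve the libGL.so.1 symlinks. This is only a sketch: the helper name show_libgl is hypothetical, and the two directories are assumptions based on the paths mentioned in this thread. The NVIDIA library is normally named after the driver version (for example libGL.so.169.09), so a target with a different name suggests Mesa has replaced it.

```shell
#!/bin/sh
# Sketch: show which real file each libGL.so.1 resolves to.
# show_libgl is a hypothetical helper; the directory list is an assumption.
show_libgl() {
    for dir in "$@"; do
        lib="$dir/libGL.so.1"
        if [ -e "$lib" ]; then
            # readlink -f follows the symlink chain to the final file.
            echo "$lib -> $(readlink -f "$lib")"
        else
            echo "$lib not found"
        fi
    done
}

show_libgl /usr/lib64 /usr/X11R6/lib64
```

Comparing this with the libGL line in the ldd output for a sample such as fluidsGL shows whether the app is picking up the library you expect.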

OK, we are finally able to run fluidsGL and similar demos.

  • I installed a good 800W SMPS in the system (which wasn’t present earlier)

  • Installed an additional Nvidia Quadro NVS 280 board. (this installation forced me to install the higher wattage SMPS). Connected the monitor to this board.

  • We installed a completely new RH (workstation) version

  • Installed the 169.09 Tesla driver.

(did not install the 32-bit compatibility Nvidia OpenGL libraries, but did run nvidia-xconfig)

  • Executed nvidia-xconfig -a again from the console (this took a long time and we almost thought the system had hung, but it actually configured X)

  • Installed the SDK and CUDA

  • Added the required PATHs and links, and compiled the samples.
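
For anyone repeating the PATH step, it looked roughly like the sketch below. The /usr/local/cuda prefix and its lib subdirectory are assumptions about where the toolkit was installed, so adjust them to your install location.

```shell
#!/bin/sh
# Sketch: environment setup for the CUDA toolkit.
# CUDA_HOME is an assumed install prefix; change it if the toolkit lives elsewhere.
CUDA_HOME="${CUDA_HOME:-/usr/local/cuda}"
export PATH="$CUDA_HOME/bin:$PATH"
# Append the existing LD_LIBRARY_PATH only if it is non-empty, to avoid a trailing colon.
export LD_LIBRARY_PATH="$CUDA_HOME/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
echo "PATH begins with: ${PATH%%:*}"
```

Putting these lines in your shell startup file saves retyping them in every new terminal.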

The demos are running fine now.

One of the engineers from Nvidia Pune helped in installing and running that second nvidia-xconfig -a step. That was key, I guess.

Thanks to all who helped.

Shailesh