Driver and SDK installed successfully, but example programs give Segmentation Fault?

omdown · June 19, 2009, 8:45pm

So, I’m not sure what to make of this. I’m brand new to CUDA, so I apologize if this post is in the wrong section; if it is, just point me in the right direction and I’ll take it there :)

Before my problem, here are the relevant details:

Fedora 10
GeForce 9300 GE
NVIDIA Driver: 180.51
CUDA Version: 2.1

Now the problem. I installed the CUDA Toolkit and SDK, then built the example code (for whatever reason, the build couldn’t find -lcuda while the library was in the /usr/lib/nvidia folder, but once I copied it to /usr/lib, it built fine). I ran deviceQuery and bandwidthTest, both of which passed; I can copy/paste the results here if it might help diagnose the problem. But now whenever I try to run most of the other example programs, I get a segmentation fault. A few of them do run successfully, namely clock, scan, bitonic, BlackScholes, and matrixMul, but others, such as marchingCubes, oceanFFT, particles, smokeParticles, and volumeRender, all die with a segmentation fault.

I can provide more information if necessary. Thanks in advance for any advice.

netllama · June 19, 2009, 8:50pm

The fact that you needed to copy the CUDA driver to a different location on your system suggests that something isn’t set up correctly. I’m guessing that you’re using a 3rd party repackaging of the driver, as it doesn’t normally get installed in /usr/lib/nvidia.

Assuming that these problems reproduce with the official NVIDIA driver package AND the CUDA-2.2 release, please provide an nvidia-bug-report.log.gz along with the output from each app that is failing.

e.ping · June 19, 2009, 9:02pm

There seems to be a pattern emerging: all the examples that don’t open GL windows work fine, and every example that does use graphics fails. If this is the case, then that’s another point supporting netllama’s diagnosis.

omdown · June 19, 2009, 9:04pm

Hm. Will CUDA not work properly with a 3rd party release? I’ll be the first to admit, I’m not especially knowledgeable about Linux, and I was told by my supervisor explicitly where to get the drivers. I don’t know the exact details of why; as I said, I’m not especially great with Linux. Where should the libraries be located? Is it possible I might be able to copy them to the proper directories to get this running?

avidday · June 20, 2009, 10:31am

This sounds like a problem that you most certainly won’t be able to solve just by shuffling files around. My guess is that you have inadvertently linked against some incompatible OpenGL libraries somewhere. You should study the ldd output for the SDK examples that don’t work and see what libraries are being used. If things are in non-standard places, it might be necessary to either modify the SDK makefile so that the correct libraries are found at compile time, or craft an appropriate runtime library hierarchy for CUDA apps via the LD_LIBRARY_PATH mechanism, or both.
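
A minimal sketch of that ldd check, assuming Python is available on the machine. The helper names (parse_ldd, group_by_directory) and the sample text are illustrative only and not part of the SDK; in practice the input would be the saved output of ldd for one of the failing examples.

```python
import posixpath
from collections import defaultdict

# Illustrative sample; replace with the real ldd output of a failing example.
SAMPLE = """\
linux-gate.so.1 =>  (0x00de3000)
libcudart.so.2 => /usr/local/cuda/lib/libcudart.so.2 (0x007a1000)
libGL.so.1 => /usr/lib/nvidia/libGL.so.1 (0x054e7000)
libGLU.so.1 => /usr/lib/libGLU.so.1 (0x055af000)
/lib/ld-linux.so.2 (0x0023b000)
"""


def parse_ldd(text):
    """Map each library name to what ldd printed after '=>'.

    The value is a path, the string 'not found', or '' when ldd
    printed only a load address.
    """
    entries = {}
    for line in text.splitlines():
        if "=>" not in line:
            continue  # e.g. the dynamic loader line, which has no '=>'
        name, _, rest = line.partition("=>")
        target = rest.strip()
        # Drop the trailing load address such as "(0x054e7000)".
        if target.endswith(")") and "(" in target:
            target = target[: target.rindex("(")].strip()
        entries[name.strip()] = target
    return entries


def group_by_directory(entries):
    """Group library names by the directory they were resolved from."""
    groups = defaultdict(list)
    for name, target in entries.items():
        if target.startswith("/"):
            groups[posixpath.dirname(target)].append(name)
        else:
            # Either 'not found' or an entry with no path at all.
            groups[target or "(no path shown)"].append(name)
    return dict(groups)


if __name__ == "__main__":
    for directory, names in sorted(group_by_directory(parse_ldd(SAMPLE)).items()):
        print(directory, "->", ", ".join(names))
```

Grouping by directory makes it easier to spot a GL library that comes from an unexpected location. An entry with no path, such as linux-gate.so.1, is a kernel-provided object on 32-bit x86 Linux with no file on disk, so it is expected rather than a missing library; a genuinely missing library shows up as "not found".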

netllama · June 21, 2009, 4:32pm

CUDA might work just fine with a 3rd party repackaging of the driver; however, in your case it does not work properly.

Shuffling files around is going to result in an installation that isn’t supported by anyone (neither the 3rd party that built the package, nor NVIDIA). The correct solution here is to file a bug with the 3rd party who provided the driver package and/or install the official driver package from NVIDIA.

omdown · June 22, 2009, 12:35pm

Here is my ldd for marchingCubes… it seems to be finding everything except for linux-gate… is that the most likely problem?

ldd marchingCubes
linux-gate.so.1 => (0x00de3000)
libcudart.so.2 => /usr/local/cuda/lib/libcudart.so.2 (0x007a1000)
libGL.so.1 => /usr/lib/nvidia/libGL.so.1 (0x054e7000)
libGLU.so.1 => /usr/lib/libGLU.so.1 (0x055af000)
libX11.so.6 => /usr/lib/libX11.so.6 (0x005b3000)
libXi.so.6 => /usr/lib/libXi.so.6 (0x007fb000)
libXmu.so.6 => /usr/lib/libXmu.so.6 (0x00885000)
libglut.so.3 => /usr/lib/libglut.so.3 (0x00272000)
libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x00916000)
libm.so.6 => /lib/libm.so.6 (0x003d6000)
libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x00877000)
libc.so.6 => /lib/libc.so.6 (0x003ff000)
libpthread.so.0 => /lib/libpthread.so.0 (0x00112000)
libdl.so.2 => /lib/libdl.so.2 (0x0012c000)
librt.so.1 => /lib/librt.so.1 (0x00131000)
libGLcore.so.1 => /usr/lib/nvidia/libGLcore.so.1 (0x0575d000)
libnvidia-tls.so.1 => /usr/lib/nvidia/tls/libnvidia-tls.so.1 (0x0013b000)
libXext.so.6 => /usr/lib/libXext.so.6 (0x006ef000)
libxcb-xlib.so.0 => /usr/lib/libxcb-xlib.so.0 (0x006bb000)
libxcb.so.1 => /usr/lib/libxcb.so.1 (0x00595000)
libXt.so.6 => /usr/lib/libXt.so.6 (0x079dc000)
libXxf86vm.so.1 => /usr/lib/libXxf86vm.so.1 (0x07794000)
/lib/ld-linux.so.2 (0x0023b000)
libXau.so.6 => /usr/lib/libXau.so.6 (0x006b6000)
libXdmcp.so.6 => /usr/lib/libXdmcp.so.6 (0x0058d000)
libSM.so.6 => /usr/lib/libSM.so.6 (0x00c64000)
libICE.so.6 => /usr/lib/libICE.so.6 (0x00c6e000)
libuuid.so.1 => /lib/libuuid.so.1 (0x00af1000)

Edit: Actually, looking at the ldd output for a lot of the other examples, linux-gate shows up without a path on all of them, including the ones that work. All of them seem to be finding the libraries just fine, and in locations searched by my library path.

avidday · June 22, 2009, 1:47pm

That looks OK. The key NVIDIA libraries like libGL and libGLcore look like they are being picked up at runtime correctly. That narrows things down to two possibilities - the SDK compilation isn’t using the same libraries as ldd is finding at runtime, or the driver bundle you are using is genuinely broken somehow.
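
One way to probe the first possibility is to look for more than one copy of the same library in the directories the linker and the runtime loader might search. The sketch below is hypothetical: find_copies is an illustrative helper, and the directory list is only a guess taken from the paths in the ldd output above, so adjust both for the actual system.

```python
import os


def find_copies(lib_name, search_dirs):
    """Return the resolved path of every file named lib_name in search_dirs.

    Symlinks are followed, so the result shows which real file each
    directory's copy ultimately points to.
    """
    hits = []
    for directory in search_dirs:
        candidate = os.path.join(directory, lib_name)
        if os.path.exists(candidate):
            hits.append(os.path.realpath(candidate))
    return hits


if __name__ == "__main__":
    # Assumed search locations, taken from the paths seen in this thread.
    dirs = ["/usr/lib/nvidia", "/usr/lib", "/usr/local/cuda/lib"]
    # Unversioned names are what -lGL and -lcuda look for at link time;
    # the versioned name is what the loader resolves at runtime.
    for name in ["libGL.so", "libGL.so.1", "libcuda.so"]:
        print(name, "->", find_copies(name, dirs) or "no copies found")
```

The linker resolves -lGL through the unversioned name libGL.so, while the runtime loader uses the versioned name libGL.so.1, so it is worth checking both. If the two names turn out to resolve to different vendors’ libraries, that would fit the compile-time versus runtime mismatch described here.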

omdown · June 22, 2009, 3:32pm

Hm. The driver bundle has worked just fine for everything else… I do OpenGL and OpenSceneGraph programming on this and they all compile and run fine… does that mean / imply anything? The bundle I’m using is from RPMfusion, downloaded via autoten… I’m not great with Linux, I’ve only been using it for a few weeks now. I can copy / paste the details it outputs from the autoten download if it will help ID the bundle…

netllama · June 22, 2009, 3:35pm

As I stated earlier, you can either file a bug with the RPMfusion folks or install the official driver package from NVIDIA.