Failed Tests in SDK on Ubuntu 7.10 (Gutsy)

Hello,

I’m having some difficulty running some of the examples in the CUDA 1.1 SDK. The following tests either fail (“TEST FAILED”) or crash my machine entirely: eigenvalues (crashes), fastWalshTransform (fails), histogram256 (fails), Mandelbrot (crashes), MersenneTwister (fails), and oceanFFT (reports errors from the CUFFT library).

I’m running a fully patched fresh install of Ubuntu 7.10 x86 with the 169.09 drivers on a system with an Intel Core 2 Duo and an 8800GTS (640 MB version). I downloaded the “freeglut3-dev” package and was able to build all of the examples in the SDK without any errors. However, the tests listed above do not seem to work properly.

Can someone with a similar configuration confirm that the tests listed above work correctly? Should I be concerned that I can’t get these tests to work? Thanks!

I got X11 crashes with 169.09, and 169.09 seems to have been pulled from the Linux drivers page:

http://www.nvidia.com/object/unix.html

Try going back to 169.07, which works fine here.

Hmmm, the 169.09 drivers are still available if you go in through the “Get Drivers by Product” option: http://www.nvidia.com/object/linux_display_ia32_169.09.html

I can try reverting to the old drivers tonight and see if that makes a difference. Can anyone else confirm problems with the 169.09 drivers in Ubuntu 7.10?

Just as an update, I tried the 169.07 and 169.04 drivers and the same tests still fail. Should I be concerned that my setup does not run all of the samples? Any suggestions?

I have three similar new systems (Kubuntu 7.10, 8800GT, E8400, Gigabyte dual PCIe 2.0 mobo) and have run into only one issue with the demos so far (a hang when running a dual-GPU demo plus a graphical demo at the same time, which may have been going overboard). I’m using 169.09 and didn’t really update anything on the system apart from what was needed to install CUDA.

I’m running an Nvidia 780i motherboard, an Intel quad-core Q6600, 8 GB of RAM, and Ubuntu 7.10 64-bit server edition with the 8800GTS/512.

The 169.07 driver works fine (except for the fan running at 100%, but that does not bother me), but 169.09 runs at only half the speed: half the device bandwidth, half the speed in eigenvalues, half the fan speed … wait a minute here … :)
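(By “device bandwidth” I mean the host-to-device figure the SDK bandwidthTest sample reports; as far as I understand it essentially just times repeated cudaMemcpy calls, roughly like the sketch below. The transfer size, iteration count, and pageable host memory are illustrative, not the SDK’s exact code, and error checking is omitted.)

    /* Rough sketch of a host->device bandwidth measurement, in the
       spirit of the SDK bandwidthTest sample (not its actual code). */
    #include <stdio.h>
    #include <stdlib.h>
    #include <cuda_runtime.h>

    int main(void)
    {
        const size_t bytes = 32 << 20;   /* 32 MB per transfer (illustrative) */
        const int    iters = 10;
        void *h = malloc(bytes), *d = NULL;
        cudaMalloc(&d, bytes);

        cudaEvent_t start, stop;
        cudaEventCreate(&start);
        cudaEventCreate(&stop);

        cudaEventRecord(start, 0);
        for (int i = 0; i < iters; ++i)
            cudaMemcpy(d, h, bytes, cudaMemcpyHostToDevice);
        cudaEventRecord(stop, 0);
        cudaEventSynchronize(stop);

        float ms = 0.0f;
        cudaEventElapsedTime(&ms, start, stop);   /* elapsed time in ms */
        printf("Host->device: %.1f MB/s\n",
               (bytes / (1024.0 * 1024.0)) * iters / (ms / 1000.0));

        cudaFree(d);
        free(h);
        return 0;
    }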

— Kuisma

Just a quick note - the latest “nvclock” CVS can enable automatic fan control with 169.07, if you like.

Could you check the clock reported by deviceQuery after one of these slow runs?
Also, in the Nvidia X control panel, could you check if the card has different power states listed?
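
For reference, the clock figure deviceQuery prints comes straight from the runtime API, so you can read the same field from a tiny standalone program if that is easier than rerunning the sample. A minimal sketch (device 0 assumed, error checking omitted):

    /* Print the clock rate the runtime reports for device 0.
       cudaGetDeviceProperties() returns clockRate in kilohertz,
       matching the "Clock rate" line in deviceQuery. */
    #include <stdio.h>
    #include <cuda_runtime.h>

    int main(void)
    {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, 0);
        printf("Clock rate: %d kilohertz\n", prop.clockRate);
        return 0;
    }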

Right on!

Below is the output with 169.09:

 Clock rate:                                    810000 kilohertz

With 169.07, the reported clock rate is twice that.

Since the machine has no X11 installed, I’m not able to run nvidia-settings. Would you like me to install X, or can I gather the same information some other way?

– Kuisma

It is possible that your card has different power states and the driver does not kick the card back to the performance mode when running CUDA. We have seen this behavior on the new Quadro FX3700 and we are working on a solution.

If you could install X to double-check whether nvidia-settings reports multiple power states, that would be great.

If I start X11, the card runs at full clock speed in CUDA even after I exit X11 and unload the nvidia kernel module (modprobe -r nvidia), with the 169.09 version…!

I’m not sure exactly what to look for in nvidia-settings regarding “power states”, but under the “PowerMizer” tab only one performance level is listed. It’s strange: on that tab it shows “Performance level: 0 Desktop”, but if I switch to some other tab and go back to PowerMizer, it changes (after about 0.5 s) to “Performance level: 0 Maximum performance”. This is the same with 169.07 and 169.09. Only the text description changes; the number remains zero, and zero is the only level that exists …

Is this helpful? Is there anything else you need?

– Kuisma

As an additional update, I rebuilt my system using Ubuntu 7.04 (Feisty) instead of 7.10. I still have the same problems with the tests mentioned previously.

For some of the tests it looks like the GPU-calculated result does not match the CPU-calculated result, so the test fails. Thinking this might be hardware-related, I underclocked my CPU from 3.2 GHz down to 1.2 GHz, and the problems still remain.
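
For what it’s worth, the pass/fail verdict in most of these samples is just an element-wise comparison of the GPU output against a CPU reference within a small tolerance, roughly along the lines of the sketch below (the names and the epsilon are illustrative, not the SDK’s exact code):

    /* Rough sketch of the kind of check the SDK samples perform:
       compare the GPU result against a CPU reference element-wise
       within a tolerance. */
    #include <math.h>

    int results_match(const float *gpu, const float *cpu, int n, float eps)
    {
        for (int i = 0; i < n; ++i)
            if (fabsf(gpu[i] - cpu[i]) > eps)
                return 0;   /* any element out of tolerance -> "TEST FAILED" */
        return 1;           /* all elements within tolerance -> "TEST PASSED" */
    }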

If no one has any suggestions, I think I’ll just move on with using CUDA and hope that none of my programs run into these problems. Are Ubuntu and CUDA not quite compatible, or could there just be problems with the SDK examples?

I installed the CUDA SDK on Ubuntu 7.10 on an AMD64 configuration with an 8800 GTX and the 169.09 x64 driver. All of the SDK examples run without problems.

I agree. Have you tried basic hardware diagnostics? Run memtest86+ for a day, try another 8800 card, try your GPU in another machine, verify the PSU can supply enough power, etc.