GTX760 CUDA issue on Arch Linux: "Error: only 0 Devices available, 1 requested. Exiting."

Hi, this is a re-post of a thread I originally wrote on the GeForce forums, where I was told to post here instead: https://forums.geforce.com/default/topic/820047/geforce-700-600-series/gtx760-cuda-issue-quot-error-only-0-devices-available-1-requested-quot-/

I’ve run ‘nvidia-bug-report.sh’ and the log begins:

Start of NVIDIA bug report log file. Please include this file, along
with a detailed description of your problem, when reporting a graphics
driver bug via the NVIDIA Linux forum (see devtalk.nvidia.com)
or by sending email to ‘linux-bugs@nvidia.com’.

nvidia-bug-report.sh Version: 18811463

Date: Sat Mar 28 21:34:10 CET 2015
uname: Linux filouarch 3.19.2-1-ARCH #1 SMP PREEMPT Wed Mar 18 16:21:02 CET 2015 x86_64 GNU/Linux
command line flags:

The following was posted on 03/20/2015 03:53 PM

Hi,

I’m trying to run code similar to NVIDIA’s nbody example. I get the error “Error: only 0 Devices available, 1 requested. Exiting.”, followed by a segmentation fault.

The output of running

./nbody Clouda_config.conf

is:
Start time :
Fri Mar 20 16:38:30 CET 2015

myReal = double
dt = 1.00e-03 us | p = 1.00 pa | T_BG = 297.0 K | harmo_lmax = 12
recordTimeAfterThermalization = 600000000.00 s | thermTime = 0.00e+00 us
nbody.cpp::showHelp() empty for now.
Error: only 0 Devices available, 1 requested. Exiting.

*** Break *** segmentation violation

===========================================================
There was a crash.
This is the entire stack trace of all threads:

#0 0x00007f7093025aaa in waitpid () from /usr/lib/libc.so.6
#1 0x00007f7092fad40b in do_system () from /usr/lib/libc.so.6
#2 0x00007f709b9be654 in TUnixSystem::StackTrace() () from /usr/lib/root/libCore.so.5.34
#3 0x00007f709b9c074c in TUnixSystem::DispatchSignals(ESignals) () from /usr/lib/root/libCore.so.5.34
#4
#5 0x00007f7092ffe61b in __memcpy_sse2_unaligned () from /usr/lib/libc.so.6
#6 0x00007f709b97cbed in ROOT::TGenericClassInfo::CreateRuleSet(std::vector<ROOT::TSchemaHelper, std::allocator<ROOT::TSchemaHelper> >&, bool) () from /usr/lib/root/libCore.so.5.34
#7 0x00007f709b97d055 in ROOT::TGenericClassInfo::GetClass() () from /usr/lib/root/libCore.so.5.34
#8 0x00007f70990b4f5a in TTree::Class() () from /usr/lib/root/libTree.so.5.34
#9 0x00007f709b9394bd in TObject::InheritsFrom(TClass const*) const () from /usr/lib/root/libCore.so.5.34
#10 0x00007f709a9fde0a in TDirectoryFile::Save() () from /usr/lib/root/libRIO.so.5.34
#11 0x00007f709a9fc068 in TDirectoryFile::Close(char const*) () from /usr/lib/root/libRIO.so.5.34
#12 0x00007f709a9f2bf4 in TFile::Close(char const*) () from /usr/lib/root/libRIO.so.5.34
#13 0x00007f709b903108 in ?? () from /usr/lib/root/libCore.so.5.34
#14 0x00007f709b90360a in TROOT::CloseFiles() () from /usr/lib/root/libCore.so.5.34
#15 0x00007f7092fa414f in __cxa_finalize () from /usr/lib/libc.so.6
#16 0x00007f709b8deaa3 in ?? () from /usr/lib/root/libCore.so.5.34
#17 0x00007fff7312c2f0 in ?? ()
#18 0x00007f709c0a57f7 in _dl_fini () from /lib64/ld-linux-x86-64.so.2

The lines below might hint at the cause of the crash.
If they do not help you then please submit a bug report at

http://root.cern.ch/bugs. Please post the ENTIRE stack trace
from above as an attachment in addition to anything else
that might help us fixing this issue.

#5 0x00007f7092ffe61b in __memcpy_sse2_unaligned () from /usr/lib/libc.so.6
#6 0x00007f709b97cbed in ROOT::TGenericClassInfo::CreateRuleSet(std::vector<ROOT::TSchemaHelper, std::allocator<ROOT::TSchemaHelper> >&, bool) () from /usr/lib/root/libCore.so.5.34
#7 0x00007f709b97d055 in ROOT::TGenericClassInfo::GetClass() () from /usr/lib/root/libCore.so.5.34
#8 0x00007f70990b4f5a in TTree::Class() () from /usr/lib/root/libTree.so.5.34
#9 0x00007f709b9394bd in TObject::InheritsFrom(TClass const*) const () from /usr/lib/root/libCore.so.5.34
#10 0x00007f709a9fde0a in TDirectoryFile::Save() () from /usr/lib/root/libRIO.so.5.34
#11 0x00007f709a9fc068 in TDirectoryFile::Close(char const*) () from /usr/lib/root/libRIO.so.5.34
#12 0x00007f709a9f2bf4 in TFile::Close(char const*) () from /usr/lib/root/libRIO.so.5.34
#13 0x00007f709b903108 in ?? () from /usr/lib/root/libCore.so.5.34
#14 0x00007f709b90360a in TROOT::CloseFiles() () from /usr/lib/root/libCore.so.5.34
#15 0x00007f7092fa414f in __cxa_finalize () from /usr/lib/libc.so.6
#16 0x00007f709b8deaa3 in ?? () from /usr/lib/root/libCore.so.5.34
#17 0x00007fff7312c2f0 in ?? ()
#18 0x00007f709c0a57f7 in _dl_fini () from /lib64/ld-linux-x86-64.so.2

Segmentation fault (core dumped)

Hardware:
CPU is i7-2600K
Video card is MSI GTX 760, driver 304.125

OS is Arch Linux: 3.18.6-1-ARCH

GCC version is 4.9.2 20150304 (prerelease)
ROOT version 5.34/24
nvcc is version release 6.5, V6.5.16
I’m using KDE Version 4.14.6

I’ve already posted on https://bbs.archlinux.org/viewtopic.php?id=192701

What I don’t get is that the deviceID is explicitly set to 0 in the code. Maybe I’m missing a file or a link?

Any help would be appreciated!

Thanks

Madsub

Please attach the log generated by running the nvidia-bug-report.sh script as the root user. I don’t think Arch Linux is a supported OS for the CUDA SDK. Does this issue occur on supported distros?

Also, please provide sample code along with the compilation and execution steps, so that we can reproduce and verify this issue internally.

OK, I’ve resolved the issue: I was using the “nvidia-304xx” legacy branch drivers from the Extra repository, version 304.125.
I now have the “nvidia” 346.47 drivers (also from the Extra repository) and the problem is gone. So all I needed was an up-to-date driver.
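
For anyone who hits the same message, here is a rough sketch of how the installed driver version could be checked against what the toolkit needs. It is only an illustration, not something from my actual code. The assumptions are: the driver version is read from /proc/driver/nvidia/version (the file the NVIDIA kernel module normally exposes), the helper names parse_driver_version and driver_is_new_enough are made up for this example, and the minimum of 340.29 for CUDA 6.5 is what I believe the release notes list, so please check it there before relying on it.

```python
import re

# Assumption: minimum Linux driver for the CUDA 6.5 toolkit.
# Verify this value against the CUDA release notes for your toolkit.
MIN_DRIVER_FOR_CUDA_6_5 = (340, 29)


def parse_driver_version(text):
    """Return the driver version as a tuple of ints, or None if not found.

    Expects the contents of /proc/driver/nvidia/version, whose first line
    usually looks like:
    NVRM version: NVIDIA UNIX x86_64 Kernel Module  304.125  Mon Dec ...
    """
    match = re.search(r"Kernel Module\s+(\d+(?:\.\d+)+)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def driver_is_new_enough(text, minimum=MIN_DRIVER_FOR_CUDA_6_5):
    """True only if a driver version was found and it is >= minimum."""
    version = parse_driver_version(text)
    return version is not None and version >= minimum


if __name__ == "__main__":
    try:
        with open("/proc/driver/nvidia/version") as handle:
            contents = handle.read()
    except OSError:
        # File is absent when the NVIDIA kernel module is not loaded.
        contents = ""
    print(parse_driver_version(contents), driver_is_new_enough(contents))
```

With the 304.125 legacy driver the check comes out false, and with 346.47 it comes out true, which matches what I saw: the CUDA 6.5 runtime found no usable device until the driver was updated.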

Still, thanks for the help, sandipt.