invalid device ordinal

deviceQuery and other CUDA apps return:

cudaGetDeviceCount returned 10
→ invalid device ordinal

The problem began when I inserted a second graphics card, a GeForce GTX 960, in addition to the existing GTX 660.
Everything ran fine on the GTX 660 alone.

CentOS 7, NVIDIA driver 361.28

Evidently, the system sees both cards:
[rl@linus release]$ lspci | grep VGA
01:00.0 VGA compatible controller: NVIDIA Corporation GK106 [GeForce GTX 660] (rev a1)
02:00.0 VGA compatible controller: NVIDIA Corporation GM206 [GeForce GTX 960] (rev a1)
[rl@linus release]$
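
A quick way to cross-check this is to count the NVIDIA controllers that lspci reports and compare that with what cudaGetDeviceCount sees. Below is a minimal Python sketch, assuming the lspci line format shown above. The function name `nvidia_gpus` is made up for illustration, and the pattern only looks at "VGA compatible controller" lines, as in the listing above.

```python
import re

# Sample text copied from the lspci output in this post.
LSPCI_OUTPUT = """\
01:00.0 VGA compatible controller: NVIDIA Corporation GK106 [GeForce GTX 660] (rev a1)
02:00.0 VGA compatible controller: NVIDIA Corporation GM206 [GeForce GTX 960] (rev a1)
"""

# Matches "<pci address> VGA compatible controller: NVIDIA Corporation <chip> [<name>]".
_LINE = re.compile(
    r"^(?P<addr>[0-9a-f:.]+) VGA compatible controller: "
    r"NVIDIA Corporation (?P<chip>\S+) \[(?P<name>[^\]]+)\]",
    re.IGNORECASE,
)


def nvidia_gpus(lspci_text):
    """Return (pci_address, chip, marketing_name) for each NVIDIA VGA device."""
    gpus = []
    for line in lspci_text.splitlines():
        match = _LINE.match(line.strip())
        if match:
            gpus.append((match.group("addr"), match.group("chip"), match.group("name")))
    return gpus


if __name__ == "__main__":
    for addr, chip, name in nvidia_gpus(LSPCI_OUTPUT):
        print(addr, chip, name)
```

If the number of entries returned here is larger than the device count CUDA reports, the cards are visible on the PCI bus but the driver is not initializing all of them.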

Any suggestions appreciated

ronalddl

Also works fine with GTX 960 alone.

Therefore the topic should be:
Can’t use two GPUs on Linux

ronalddl

@ ronalddl

Is it still an issue after recompiling the deviceQuery sample? For example:
$ cd ~/NVIDIA_CUDA-7.5_Samples/1_Utilities/deviceQuery
$ make clean
$ make

Also, do you see any errors in the output of “nvidia-smi” or “dmesg | grep NVRM”?

Hello, I have the same problem and here’s what I get from dmesg:
[ 48.038961] nvidia 0000:01:00.0: irq 48 for MSI/MSI-X
[ 48.970693] NVRM: RmInitAdapter failed! (0x25:0x1c:1373)
[ 48.970699] NVRM: rm_init_adapter failed for device bearing minor number 0
[ 48.970713] NVRM: nvidia_frontend_open: minor 0, module->open() failed, error -5
[ 48.970847] nvidia 0000:01:00.0: irq 48 for MSI/MSI-X
[ 49.282433] NVRM: RmInitAdapter failed! (0x25:0x1c:1373)
[ 49.282439] NVRM: rm_init_adapter failed for device bearing minor number 0
[ 49.282452] NVRM: nvidia_frontend_open: minor 0, module->open() failed, error -5
[ 61.820863] SFW2-INext-DROP-DEFLT IN=eno1 OUT= MAC= SRC=fe80:0000:0000:0000:0a62:66ff:fe4c:bd8d DST=ff02:0000:0000:0000:0000:0000:0000:00fb LEN=84 TC=0 HOPLIMIT=255 FLOWLBL=0 PROTO=UDP SPT=5353 DPT=5353 LEN=44
[ 62.096609] nvidia 0000:01:00.0: irq 48 for MSI/MSI-X
[ 62.408852] NVRM: RmInitAdapter failed! (0x25:0x1c:1373)
[ 62.408858] NVRM: rm_init_adapter failed for device bearing minor number 0
[ 62.408871] NVRM: nvidia_frontend_open: minor 0, module->open() failed, error -5
[ 62.409003] nvidia 0000:01:00.0: irq 48 for MSI/MSI-X
[ 62.717052] NVRM: RmInitAdapter failed! (0x25:0x1c:1373)
[ 62.717058] NVRM: rm_init_adapter failed for device bearing minor number 0
[ 62.717071] NVRM: nvidia_frontend_open: minor 0, module->open() failed, error -5
[ 78.045124] nvidia 0000:01:00.0: irq 48 for MSI/MSI-X
[ 78.362758] NVRM: RmInitAdapter failed! (0x25:0x1c:1373)
[ 78.362764] NVRM: rm_init_adapter failed for device bearing minor number 0
[ 78.362777] NVRM: nvidia_frontend_open: minor 0, module->open() failed, error -5
[ 78.362911] nvidia 0000:01:00.0: irq 48 for MSI/MSI-X
[ 78.671541] NVRM: RmInitAdapter failed! (0x25:0x1c:1373)
[ 78.671546] NVRM: rm_init_adapter failed for device bearing minor number 0
[ 78.671560] NVRM: nvidia_frontend_open: minor 0, module->open() failed, error -5
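
Since the same three NVRM lines repeat many times, it can help to condense the log before posting it. Here is a minimal Python sketch, assuming the message wording seen in the dmesg output above. The function name `summarize_nvrm` is hypothetical, and the sample is a shortened copy of the log lines from this post.

```python
import re
from collections import Counter

# Shortened sample copied from the dmesg output in this post.
DMESG_SAMPLE = """\
[ 48.038961] nvidia 0000:01:00.0: irq 48 for MSI/MSI-X
[ 48.970693] NVRM: RmInitAdapter failed! (0x25:0x1c:1373)
[ 48.970699] NVRM: rm_init_adapter failed for device bearing minor number 0
[ 48.970713] NVRM: nvidia_frontend_open: minor 0, module->open() failed, error -5
[ 49.282433] NVRM: RmInitAdapter failed! (0x25:0x1c:1373)
[ 49.282439] NVRM: rm_init_adapter failed for device bearing minor number 0
"""

_CODE = re.compile(r"NVRM: RmInitAdapter failed! \((?P<code>[^)]+)\)")
_MINOR = re.compile(
    r"NVRM: rm_init_adapter failed for device bearing minor number (?P<minor>\d+)"
)


def summarize_nvrm(dmesg_text):
    """Count RmInitAdapter failure codes and the device minor numbers affected."""
    codes = Counter()
    minors = Counter()
    for line in dmesg_text.splitlines():
        code_match = _CODE.search(line)
        if code_match:
            codes[code_match.group("code")] += 1
        minor_match = _MINOR.search(line)
        if minor_match:
            minors[int(minor_match.group("minor"))] += 1
    return codes, minors


if __name__ == "__main__":
    codes, minors = summarize_nvrm(DMESG_SAMPLE)
    print("failure codes:", dict(codes))
    print("affected minors:", dict(minors))
```

A single code repeated for a single minor number, as here, suggests one device failing to initialize in the same way on every attempt rather than several unrelated faults.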

nvidia-smi gives me the following message:
$ nvidia-smi
Unable to determine the device handle for GPU 0000:01:00.0: Unable to communicate with GPU because it is insufficiently powered.
This may be because not all required external power cables are
attached, or the attached cables are not seated properly.

Now, I have seated the NVIDIA card properly and have attached the 6-pin PCI Express Molex connector. I’m sorry, I don’t have the card’s installation documentation with me. The card details are:
Device 0: “GeForce 8600 GTS”

Please let me know:

  • Whether I should be attaching any other connectors.
  • Where I can find an online installation guide.

Thanks in advance.

8600 GTS isn’t supported by any current drivers. It is too old. What driver are you trying to use?

Ah!!!
That would DEFINITELY be one of the problems. I downloaded the latest NVIDIA driver for Linux: NVIDIA-Linux-x86_64-367.35.run

(I have a 64-bit machine and have flashed the BIOS to the latest version.)

BTW, is there a hardware installation/user manual available on the web for this card?
Thanks for your inputs - I have been struggling for quite a while trying to install CUDA and get it to run with this card.

The last CUDA toolkit version that supports that card is CUDA 6.5.
The driver that comes with the CUDA toolkit 6.5 installer should work for that card.

You can get older CUDA toolkits at the CUDA toolkit archive page:

https://developer.nvidia.com/cuda-toolkit-archive

There are installation guides available with each toolkit.

Thanks, I’ll install this and will update here.

I also need a soft copy of the hardware installation manual for the card. Can you please point me to a link where I can download it? I googled it but couldn’t find it.

http://developer.download.nvidia.com/compute/cuda/6_5/rel/docs/CUDA_Getting_Started_Linux.pdf

I apologize, I didn’t make myself clear. I have the pdf for installing CUDA on Linux.

I need the installation manual for the physical card - the GeForce 8600 GTS as I’m not sure if I have installed the card properly.

The reason I ask is because I got the following when I ran nvidia-smi:

$ sudo nvidia-smi
Unable to determine the device handle for GPU 0000:01:00.0: Unable to communicate with GPU because it is insufficiently powered.
This may be because not all required external power cables are attached, or the attached cables are not seated properly.

Sorry, I don’t know where to find it. That card is almost 10 years old now.

OK!!
Let’s gamble on the fact that the physical interfaces don’t change very much. Please can you point me at the hardware manual of a similar (more contemporary) card.

That would be a bad gamble. Ten years is like eternity in a fast-moving field like GPU design. For what it is worth, the error message indicates a possible lack of power to the card.

Looking at a picture of the 8600 GTS, it doesn’t seem to have an extra PCIe power connector. That means it draws power through the PCIe slot only, up to 75 Watts according to PCIe specifications. Make sure the card is properly seated in a PCIe x16 slot, and secure the bracket on the card to the enclosure (maybe via a screw, or a latching mechanism on the case). [Later:] Looking at a different picture, there does seem to be a 6-pin PCIe power connector at the edge opposite the bracket. Make sure that is plugged in securely (usually there is a little tab that snaps into place).

Note that this GPU is based on G84, and is not supported by recent versions of CUDA (roughly, whatever shipped in the past two years). So a different hypothesis could be that the driver software you installed doesn’t recognize the GPU because it is old, unsupported hardware, in which case the specific message would be misleading.

In practical terms, I would suggest acquiring much more recent hardware, preferably Maxwell-based. Used lower-end GPUs of that nature, such as a GTX 750 Ti, should be available at very low cost.

Thanks, I fixed it!!!

The 8600 GTS does have a power interface on top (BTW, this card was a loaner from NVIDIA). I plugged in the power connectors and it worked.

So my learnings are:

  1. Yes, it is CUDA 6.5. I used the .run file to install it. The installer complained a couple of times that I was installing on an unsupported OS.
  2. The log file generated is extraordinarily helpful. It is a good idea to read it after each pass.
  3. openSUSE 13.2 is better: it gives more error messages than the Ubuntu versions, and that helps.
  4. The Nouveau drivers need to be uninstalled and cleaned out; simply blacklisting them and rebuilding didn't seem to work for me.
  5. Ensure the power connectors on top of the card are connected.
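
For point 4, one way to confirm that Nouveau is really gone is to look at the list of loaded kernel modules. The following is a minimal Python sketch, assuming the usual /proc/modules layout where the module name is the first field on each line. The function names and the sample text are made up for illustration and are not taken from my machine.

```python
def loaded_modules(proc_modules_text):
    """Return the set of module names from /proc/modules-style text."""
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            # The first whitespace-separated field is the module name.
            names.add(fields[0])
    return names


def nouveau_loaded(proc_modules_text):
    """True if the nouveau kernel module appears in the module list."""
    return "nouveau" in loaded_modules(proc_modules_text)


# Hypothetical sample for illustration only.
SAMPLE_WITH_NOUVEAU = """\
nouveau 1527808 1 - Live 0x0000000000000000
ttm 106496 1 nouveau, Live 0x0000000000000000
"""

SAMPLE_WITH_NVIDIA = """\
nvidia 8540160 0 - Live 0x0000000000000000
"""

if __name__ == "__main__":
    print(nouveau_loaded(SAMPLE_WITH_NOUVEAU))
    print(nouveau_loaded(SAMPLE_WITH_NVIDIA))
```

On a real system the text would come from reading /proc/modules (or from the output of lsmod, which has the name in the first column as well, after a header line).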

Hope this helps with any future questions you get about this specific card. Thanks for the support.