quad GTX 780 problems

Hello -

After successfully running 6 Titans for CUDA work, I bought 8 GTX 780s for two machines with 4 cards each, but I can’t get them to work. Is this related to Nvidia only allowing 3-way SLI in their drivers? There seems to be a fix for Windows (http://videocardz.com/45253/nvidia-geforce-gtx-780-4-way-sli-possible-by-simple-modification), but not for Linux. After all, the cards were sold without restrictions for CUDA. Here’s the output from running the code. I installed the latest drivers and CUDA 5.5.

me:~$ lspci | grep -i nvidia
01:00.0 VGA compatible controller: NVIDIA Corporation Device 1004 (rev a1)
01:00.1 Audio device: NVIDIA Corporation Device 0e1a (rev a1)
02:00.0 VGA compatible controller: NVIDIA Corporation Device 1004 (rev a1)
02:00.1 Audio device: NVIDIA Corporation Device 0e1a (rev a1)
03:00.0 VGA compatible controller: NVIDIA Corporation Device 1004 (rev a1)
03:00.1 Audio device: NVIDIA Corporation Device 0e1a (rev a1)
04:00.0 VGA compatible controller: NVIDIA Corporation Device 1004 (rev a1)
04:00.1 Audio device: NVIDIA Corporation Device 0e1a (rev a1)

me:~$ uname -m && cat /etc/*release
x86_64
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=12.04
DISTRIB_CODENAME=precise
DISTRIB_DESCRIPTION="Ubuntu 12.04.2 LTS"
NAME="Ubuntu"
VERSION="12.04.2 LTS, Precise Pangolin"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu precise (12.04.2 LTS)"
VERSION_ID="12.04"

me:~$ gcc --version
gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3
Copyright © 2011 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

me:~$ nvidia-smi
NVIDIA: could not open the device file /dev/nvidia3 (Input/output error).
Unable to determine the device handle for GPU 0000:04:00.0: The NVIDIA kernel module detected an issue with GPU interrupts. Consult the "Common Problems" Chapter of the NVIDIA Driver README for details and steps that can be taken to resolve this issue.

me:~/NVIDIA_CUDA-5.5_Samples/bin/x86_64/linux/release$ ./deviceQueryDrv
./deviceQueryDrv Starting…

CUDA Device Query (Driver API) statically linked version
cuInit(0) returned 101
-> CUDA_ERROR_INVALID_DEVICE (device specified is not a valid CUDA device)
Result = FAIL

me:~/NVIDIA_CUDA-5.5_Samples/bin/x86_64/linux/release$ ./simpleMultiGPU
Starting simpleMultiGPU
CUDA error at simpleMultiGPU.cu:84 code=10(cudaErrorInvalidDevice) "cudaGetDeviceCount(&GPU_N)"

So I can install everything, but the driver then fails and CUDA cannot use the cards. Would older driver versions work?

Desperate…
Thanks,
D.
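
For anyone comparing notes on a similar setup, here is a minimal Python sketch that cross-checks the two outputs above: how many NVIDIA display controllers `lspci` lists, and which device nodes `nvidia-smi` complained about. The function names are made up for illustration, and the regular expressions only assume the message wording shown in this post, so they may need adjusting for other driver versions.

```python
import re

# Sample text copied from the outputs in this post.
LSPCI = """\
01:00.0 VGA compatible controller: NVIDIA Corporation Device 1004 (rev a1)
01:00.1 Audio device: NVIDIA Corporation Device 0e1a (rev a1)
02:00.0 VGA compatible controller: NVIDIA Corporation Device 1004 (rev a1)
02:00.1 Audio device: NVIDIA Corporation Device 0e1a (rev a1)
03:00.0 VGA compatible controller: NVIDIA Corporation Device 1004 (rev a1)
03:00.1 Audio device: NVIDIA Corporation Device 0e1a (rev a1)
04:00.0 VGA compatible controller: NVIDIA Corporation Device 1004 (rev a1)
04:00.1 Audio device: NVIDIA Corporation Device 0e1a (rev a1)
"""
SMI = "NVIDIA: could not open the device file /dev/nvidia3 (Input/output error)."


def count_nvidia_gpus(lspci_text):
    """Count NVIDIA display controllers (not audio functions) in lspci output."""
    pattern = re.compile(r"(VGA compatible controller|3D controller): NVIDIA", re.I)
    return sum(1 for line in lspci_text.splitlines() if pattern.search(line))


def failing_device_nodes(smi_text):
    """Return the /dev/nvidiaN paths that nvidia-smi reported it could not open."""
    return re.findall(r"could not open the device file (/dev/nvidia\d+)", smi_text)


if __name__ == "__main__":
    print("GPUs on the PCI bus:", count_nvidia_gpus(LSPCI))
    print("Device nodes that failed to open:", failing_device_nodes(SMI))
```

With the output pasted above, this shows that all four cards are visible on the PCI bus and that only the fourth device node fails to open, which points at the driver rather than at missing hardware.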

I think I solved it. I just had to use driver version 325.08 (release date: Mon Jul 01, 2013) and ignore the following warning during installation:

***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 319.00 is required for CUDA 5.5 functionality to work.
To install the driver using this installer, run the following command, replacing with the name of this run file:
sudo .run -silent -driver

What I did before was to react to that warning by installing the 319 driver instead. Staying on 325.08 seems to work.
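
The warning only asks for a driver of version at least 319.00, so a separately installed 325.08 driver already satisfies it. A small sketch of that comparison follows; it assumes driver versions are plain dotted numbers like the ones in this thread, and the helper names are hypothetical. On Linux the installed version can be read from /proc/driver/nvidia/version.

```python
def parse_version(version):
    """Turn a dotted driver version such as '325.08' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def meets_minimum(installed, minimum="319.00"):
    """Return True if the installed driver is at least the required version."""
    return parse_version(installed) >= parse_version(minimum)


if __name__ == "__main__":
    for installed in ("325.08", "319.00", "310.44"):
        status = "ok" if meets_minimum(installed) else "too old"
        print(installed, status)
```

Comparing the parts as integers avoids the pitfalls of comparing the version strings directly.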