Trying to get a Tesla K10 online; cuda_5.5.22_linux_64.run fails

I was able to download and install the video driver NVIDIA-Linux-x86_64-331.38.run (http://bpaste.net/show/kZjiYIpyqFwp0cjICx0i/), but cudaminer still fails with “Unable to query CUDA driver version! Is an nVidia driver installed?”

If I try to install the CUDA driver, it fails with: http://bpaste.net/show/gFCA8s2oIhX8iVuwfX2O/

root@coined:/home# cat /etc/*-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=12.04
DISTRIB_CODENAME=precise
DISTRIB_DESCRIPTION="Ubuntu 12.04.4 LTS"
NAME="Ubuntu"
VERSION="12.04.4 LTS, Precise Pangolin"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu precise (12.04.4 LTS)"
VERSION_ID="12.04"

coined:/home/coined/NVIDIA_CUDA-5.5_Samples/bin/x86_64/linux/release# lspci -k
00:00.0 Host bridge: Intel Corporation 4th Gen Core Processor DRAM Controller (rev 06)
Subsystem: ASUSTeK Computer Inc. Device 8534
00:01.0 PCI bridge: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor PCI Express x16 Controller (rev 06)
Kernel driver in use: pcieport
Kernel modules: shpchp
00:01.1 PCI bridge: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor PCI Express x8 Controller (rev 06)
Kernel driver in use: pcieport
Kernel modules: shpchp
00:02.0 VGA compatible controller: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor Integrated Graphics Controller (rev 06)
Subsystem: ASUSTeK Computer Inc. Device 8534
Kernel driver in use: i915
Kernel modules: i915
00:03.0 Audio device: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor HD Audio Controller (rev 06)
Subsystem: ASUSTeK Computer Inc. Device 8534
Kernel driver in use: snd_hda_intel
Kernel modules: snd-hda-intel
00:14.0 USB controller: Intel Corporation 8 Series/C220 Series Chipset Family USB xHCI (rev 04)
Subsystem: ASUSTeK Computer Inc. Device 8534
Kernel driver in use: xhci_hcd
00:16.0 Communication controller: Intel Corporation 8 Series/C220 Series Chipset Family MEI Controller #1 (rev 04)
Subsystem: ASUSTeK Computer Inc. Device 8534
Kernel driver in use: mei_me
Kernel modules: mei-me
00:19.0 Ethernet controller: Intel Corporation Ethernet Connection I217-V (rev 04)
Subsystem: ASUSTeK Computer Inc. Device 859f
Kernel driver in use: e1000e
Kernel modules: e1000e
00:1a.0 USB controller: Intel Corporation 8 Series/C220 Series Chipset Family USB EHCI #2 (rev 04)
Subsystem: ASUSTeK Computer Inc. Device 8534
Kernel driver in use: ehci-pci
00:1b.0 Audio device: Intel Corporation 8 Series/C220 Series Chipset High Definition Audio Controller (rev 04)
Subsystem: ASUSTeK Computer Inc. Device 8573
Kernel driver in use: snd_hda_intel
Kernel modules: snd-hda-intel
00:1c.0 PCI bridge: Intel Corporation 8 Series/C220 Series Chipset Family PCI Express Root Port #1 (rev d4)
Kernel driver in use: pcieport
Kernel modules: shpchp
00:1c.1 PCI bridge: Intel Corporation 8 Series/C220 Series Chipset Family PCI Express Root Port #2 (rev d4)
Kernel driver in use: pcieport
Kernel modules: shpchp
00:1c.4 PCI bridge: Intel Corporation 8 Series/C220 Series Chipset Family PCI Express Root Port #5 (rev d4)
Kernel driver in use: pcieport
Kernel modules: shpchp
00:1c.6 PCI bridge: Intel Corporation 82801 PCI Bridge (rev d4)
00:1d.0 USB controller: Intel Corporation 8 Series/C220 Series Chipset Family USB EHCI #1 (rev 04)
Subsystem: ASUSTeK Computer Inc. Device 8534
Kernel driver in use: ehci-pci
00:1f.0 ISA bridge: Intel Corporation Z87 Express LPC Controller (rev 04)
Subsystem: ASUSTeK Computer Inc. Device 8534
Kernel driver in use: lpc_ich
Kernel modules: lpc_ich
00:1f.2 SATA controller: Intel Corporation 8 Series/C220 Series Chipset Family 6-port SATA Controller 1 [AHCI mode] (rev 04)
Subsystem: ASUSTeK Computer Inc. Device 8534
Kernel driver in use: ahci
Kernel modules: ahci
00:1f.3 SMBus: Intel Corporation 8 Series/C220 Series Chipset Family SMBus Controller (rev 04)
Subsystem: ASUSTeK Computer Inc. Device 8534
Kernel modules: i2c-i801
01:00.0 PCI bridge: PLX Technology, Inc. Device 8747 (rev ba)
Kernel driver in use: pcieport
Kernel modules: shpchp
02:08.0 PCI bridge: PLX Technology, Inc. Device 8747 (rev ba)
Kernel driver in use: pcieport
Kernel modules: shpchp
02:10.0 PCI bridge: PLX Technology, Inc. Device 8747 (rev ba)
Kernel driver in use: pcieport
Kernel modules: shpchp
03:00.0 3D controller: NVIDIA Corporation GK104GL [Tesla K10] (rev a1)
Kernel driver in use: nvidia
Kernel modules: nvidia, nouveau, nvidiafb
04:00.0 3D controller: NVIDIA Corporation GK104GL [Tesla K10] (rev a1)
Subsystem: NVIDIA Corporation Device 0970
Kernel driver in use: nvidia
Kernel modules: nvidia, nouveau, nvidiafb
05:00.0 PCI bridge: PLX Technology, Inc. Device 8747 (rev ba)
Kernel driver in use: pcieport
Kernel modules: shpchp
06:08.0 PCI bridge: PLX Technology, Inc. Device 8747 (rev ba)
Kernel driver in use: pcieport
Kernel modules: shpchp
06:10.0 PCI bridge: PLX Technology, Inc. Device 8747 (rev ba)
Kernel driver in use: pcieport
Kernel modules: shpchp
07:00.0 3D controller: NVIDIA Corporation GK104GL [Tesla K10] (rev a1)
Subsystem: NVIDIA Corporation Device 0970
Kernel driver in use: nvidia
Kernel modules: nvidia, nouveau, nvidiafb
08:00.0 3D controller: NVIDIA Corporation GK104GL [Tesla K10] (rev a1)
Subsystem: NVIDIA Corporation Device 0970
Kernel driver in use: nvidia
Kernel modules: nvidia, nouveau, nvidiafb
0a:00.0 SATA controller: ASMedia Technology Inc. ASM1062 Serial ATA Controller (rev 01)
Subsystem: ASUSTeK Computer Inc. Device 858d
Kernel driver in use: ahci
Kernel modules: ahci
0b:00.0 PCI bridge: PLX Technology, Inc. Device 8747 (rev ba)
Kernel driver in use: pcieport
Kernel modules: shpchp
0c:08.0 PCI bridge: PLX Technology, Inc. Device 8747 (rev ba)
Kernel driver in use: pcieport
Kernel modules: shpchp
0c:10.0 PCI bridge: PLX Technology, Inc. Device 8747 (rev ba)
Kernel driver in use: pcieport
Kernel modules: shpchp
0d:00.0 3D controller: NVIDIA Corporation GK104GL [Tesla K10] (rev a1)
Subsystem: NVIDIA Corporation Device 0970
Kernel driver in use: nvidia
Kernel modules: nvidia, nouveau, nvidiafb
0e:00.0 3D controller: NVIDIA Corporation GK104GL [Tesla K10] (rev a1)
Subsystem: NVIDIA Corporation Device 0970
Kernel driver in use: nvidia
Kernel modules: nvidia, nouveau, nvidiafb
0f:00.0 PCI bridge: ASMedia Technology Inc. ASM1083/1085 PCIe to PCI Bridge (rev 03)

What sort of machine is that K10 plugged into? (i.e. what is the manufacturer and model number?)

z87-PLUS ASUS board with an 8core LGA chipset and 1500w power, corsair memory 8gig.
Ubuntu 12.04

Unless the K10 is in an officially qualified server chassis, it will probably overheat. It seems like you have it plugged into a desktop system, and you’re not going to have good results with that. It’s designed to be used in a system like a Dell R720 or an HP SL250. Not only are these systems capable of providing proper forced-air cooling, but the cooling system requires closed loop monitoring of the K10 thermals by the server BMC. You might be interested in reading this: http://stackoverflow.com/questions/13831694/kernel-time-increases-for-same-number-of-particles

No, I am not a hardware n00b. I have a full-size fan on them, keeping all 3 video cards very cool.
No chassis, only a PE ground.
ASUS z87-plus with LGA 8 cores with a 1500w power supply, corsair 8gigs.
It is a desktop, but not in the sense you are thinking.

My issue seems much more to be a failed cuda installation than a tech 101 issue. lspci shows the gpus and the loaded kernel driver.

You’ll be surprised how hard it is to keep the K10 cool when you put it under load. A “full size fan” will be nowhere near enough airflow.

What does nvidia-smi -a show at the moment?

They are ice cold.
Not a single error I have pertains to heat.

Sorry, I wasn’t clear. Please capture the output of

nvidia-smi -a

and post it in a reply to this question.

linux-a5nu:/home # cat /proc/driver/nvidia/version
cat: /proc/driver/nvidia/version: No such file or directory
linux-a5nu:/home # nvidia-smi -a
Unable to determine the device handle for GPU 0000:03:00.0: Unknown Error

I switched to SUSE.
Completely gave up on Ubuntu, seeing similar issues with the nvidia driver, just utterly failing.
This time the ./NVIDIA-Linux-x86_64-331.38.run driver installs, but I cannot see any output of the version.
I personally removed the nouveau drivers from the kernel and rebooted, so they are not getting in the way.

I don’t see how it could be complaining about nouveau; I removed it completely from the kernel: http://bpaste.net/show/DtFNS324Gr6NZVHjQee3/ I work with Gentoo a lot, so I know how to tweak and compile a kernel properly.

Basically I am at the exact same place with SUSE. I can compile the toolkit and I can install the NVIDIA-Linux-x86_64-331.38.run file, though unlike on Ubuntu, the driver does not show up with “cat /proc/driver/nvidia/version”. When I run cuda_5.5.22_linux_64.run and select only the toolkit and samples, not the driver, it finishes by asking me to rerun cuda_5.5.22_linux_64.run -silent -driver. That completes error free, but there is zero sign of a driver installed.
It seems as though everything is installed but cuda.

linux-a5nu:/home/NVIDIA_CUDA-5.5_Samples/NVIDIA_CUDA-5.5_Samples # cudaminer -H 1 -i 0 -l auto -C 1 -o stratum+tcp://fast-lemon.hashrapid.com:3334 -O tunage.tunage:1234
*** CudaMiner for nVidia GPUs by Christian Buchner ***
This is version 2014-02-09 (beta)
based on pooler-cpuminer 2.3.2 © 2010 Jeff Garzik, 2012 pooler
Cuda additions Copyright 2013,2014 Christian Buchner
LTC donation address: LKS1WDKGED647msBQfLBHV3Ls8sveGncnm
BTC donation address: 16hJF5mceSojnTD3ZTUDqdRhDyPJzoRakM
YAC donation address: Y87sptDEcpLkLeAuex6qZioDbvy1qXZEj4
[2014-02-16 01:35:57] Unable to query CUDA driver version! Is an nVidia driver installed?

http://bpaste.net/show/8uxniO7oQ1J6vCqxXY6Q/

If nvidia-smi cannot see the GPUs, it means the driver is unable to communicate with the GPUs, and there is no point in running other applications, including your miner app. I can try to help, but I’m most familiar with RHEL/CentOS and somewhat with Ubuntu; SUSE/SLES not so much.

As root, can you capture the output of:

nvidia-smi -a
dmesg |grep NVRM

and paste it into this question as a reply.
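
For anyone who wants to script this triage, here is a minimal Python sketch that applies the same filter as dmesg |grep NVRM and keeps only the lines that report a failure. The function name nvrm_failures and the sample text are assumptions for illustration; the NVRM lines follow the format the NVIDIA kernel module logs, and the last sample line is a made-up non-NVRM entry.

```python
def nvrm_failures(dmesg_text):
    """Return the NVRM lines in dmesg output that mention a failure."""
    failures = []
    for line in dmesg_text.splitlines():
        # Same filter as `dmesg | grep NVRM`, narrowed to lines containing "failed".
        if "NVRM" in line and "failed" in line.lower():
            failures.append(line.strip())
    return failures


# Illustrative input only; the final line is a hypothetical unrelated entry.
sample = """\
[ 8.627040] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 331.38 Wed Jan 8 19:32:30 PST 2014
[ 239.070386] NVRM: failed to copy vbios to system memory.
[ 239.072937] NVRM: RmInitAdapter failed! (0x30:0xffffffff:720)
[ 240.000000] example: unrelated kernel message
"""

for line in nvrm_failures(sample):
    print(line)
```

An empty result together with a sane nvidia-smi -a listing suggests the driver initialized the adapters; any line returned here is worth posting in full.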

root@coined:/usr/local/cuda-5.5/samples/1_Utilities/deviceQuery# ls
deviceQuery deviceQuery.o Makefile readme.txt
deviceQuery.cpp findcudalib.mk NsightEclipse.xml
root@coined:/usr/local/cuda-5.5/samples/1_Utilities/deviceQuery# ./deviceQuery
./deviceQuery Starting…

CUDA Device Query (Runtime API) version (CUDART static linking)

cudaGetDeviceCount returned 10
-> invalid device ordinal
Result = FAIL
root@coined:/usr/local/cuda-5.5/samples/1_Utilities/deviceQuery# ls /lib/modules/$(uname -r)/kernel/drivers/video
arcfb.ko fb_sys_fops.ko n411.ko sis uvesafb.ko
arkfb.ko goldfishfb.ko neofb.ko sm501fb.ko vermilion
aty hecubafb.ko nvidia smscufx.ko vesafb.ko
auo_k1900fb.ko hgafb.ko nvidia.ko sstfb.ko vga16fb.ko
auo_k1901fb.ko hyperv_fb.ko nvidia-uvm.ko svgalib.ko vgastate.ko
auo_k190x.ko i740fb.ko output.ko syscopyarea.ko via
backlight intelfb pm2fb.ko sysfillrect.ko vt8623fb.ko
broadsheetfb.ko kyro pm3fb.ko sysimgblt.ko xen-fbfront.ko
carminefb.ko macmodes.ko riva tdfxfb.ko
cirrusfb.ko matrox s1d13xxxfb.ko tmiofb.ko
cyber2000fb.ko mb862xx s3fb.ko tridentfb.ko
fb_ddc.ko metronomefb.ko savage udlfb.ko
root@coined:/usr/local/cuda-5.5/samples/1_Utilities/deviceQuery# cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module 331.38 Wed Jan 8 19:32:30 PST 2014
GCC version: gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5)

root@coined:/usr/local/cuda-5.5/samples/1_Utilities/deviceQuery# nvidia-smi -a
Unable to determine the device handle for GPU 0000:03:00.0: Unknown Error
root@coined:/usr/local/cuda-5.5/samples/1_Utilities/deviceQuery# dmesg |grep NVRM
[ 8.627040] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 331.38 Wed Jan 8 19:32:30 PST 2014
[ 239.070386] NVRM: failed to copy vbios to system memory.
[ 239.072937] NVRM: RmInitAdapter failed! (0x30:0xffffffff:720)
[ 239.072943] NVRM: rm_init_adapter failed for device bearing minor number 0
[ 239.072958] NVRM: nvidia_frontend_open: minor 0, module->open() failed, error -5
[ 315.690680] NVRM: failed to copy vbios to system memory.
[ 315.693264] NVRM: RmInitAdapter failed! (0x30:0xffffffff:720)
[ 315.693270] NVRM: rm_init_adapter failed for device bearing minor number 0
[ 315.693285] NVRM: nvidia_frontend_open: minor 0, module->open() failed, error -5

I think we are on to something. ^^
Googling now

oops

How about trying some older versions of the driver?

It’s possible that your motherboard is having some trouble allocating the PCI resources for three K10 devices (=6 GPUs). Just for test purposes, can you remove two of your 3 K10 devices and see what kind of results you get? I’m mainly just interested in whether:

nvidia-smi -a

gives sane results or not.
If it does not give sane results, then let’s see the output of:

dmesg |grep NVRM

If nvidia-smi does give sane results, then try adding another (so, use 2 of your 3) K10 devices.

If that also works, then I would check for any SBIOS updates for your ASUS motherboard.
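
A rough way to check each step of that test is to compare how many K10 GPUs the PCI bus reports with how many the driver actually attached. The Python sketch below assumes you have saved the text of lspci and nvidia-smi -a; the helper names are hypothetical, and the parsing relies only on the "Tesla K10" device string and the "Attached GPUs" line that nvidia-smi prints when it works.

```python
import re


def count_k10_gpus(lspci_text):
    """Count the PCI functions that lspci labels as Tesla K10 GPUs."""
    return sum(1 for line in lspci_text.splitlines() if "Tesla K10" in line)


def attached_gpus(nvidia_smi_text):
    """Return the 'Attached GPUs' count from nvidia-smi -a, or None if absent."""
    match = re.search(r"Attached GPUs\s*:\s*(\d+)", nvidia_smi_text)
    return int(match.group(1)) if match else None


def driver_sees_all(lspci_text, nvidia_smi_text):
    """True only when nvidia-smi attached every K10 GPU that lspci lists."""
    seen = attached_gpus(nvidia_smi_text)
    return seen is not None and seen == count_k10_gpus(lspci_text)


# Illustrative input: two K10 boards, i.e. four GPUs on the bus.
lspci_sample = """\
03:00.0 3D controller: NVIDIA Corporation GK104GL [Tesla K10] (rev a1)
04:00.0 3D controller: NVIDIA Corporation GK104GL [Tesla K10] (rev a1)
07:00.0 3D controller: NVIDIA Corporation GK104GL [Tesla K10] (rev a1)
08:00.0 3D controller: NVIDIA Corporation GK104GL [Tesla K10] (rev a1)
"""
smi_ok = "Attached GPUs : 4\n"
smi_broken = "Unable to determine the device handle for GPU 0000:03:00.0: Unknown Error\n"

print(driver_sees_all(lspci_sample, smi_ok))
print(driver_sees_all(lspci_sample, smi_broken))
```

Each K10 board carries two GPUs, so one board should give a count of 2 on both sides, two boards 4, and three boards 6.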

Trying that now. I got it to boot correctly once, where dmesg |grep NVRM came up with the intended results, but after getting cudaminer back in check, it borked again.
I just removed a card. BTW, I am back to Ubuntu again. SUSE was way worse.

Getting really close.
I initialized some settings in the bios and switched cables to add more juice to the cards.

root@coined:~# dmesg |grep NVRM
[ 6.293208] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 331.38 Wed Jan 8 19:32:30 PST 2014
root@coined:~# cudaminer -H 1 -i 0 -l auto -C 1 -o stratum+tcp://mine.hashrapid.com:3335 -O tunage.tunage:1234
*** CudaMiner for nVidia GPUs by Christian Buchner ***
This is version 2014-02-09 (beta)
based on pooler-cpuminer 2.3.2 © 2010 Jeff Garzik, 2012 pooler
Cuda additions Copyright 2013,2014 Christian Buchner
LTC donation address: LKS1WDKGED647msBQfLBHV3Ls8sveGncnm
BTC donation address: 16hJF5mceSojnTD3ZTUDqdRhDyPJzoRakM
YAC donation address: Y87sptDEcpLkLeAuex6qZioDbvy1qXZEj4
[2014-02-16 11:41:55] Unable to query CUDA driver version! Is an nVidia driver installed?
root@coined:~# cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module 331.38 Wed Jan 8 19:32:30 PST 2014
GCC version: gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5)
root@coined:~# nvidia-smi -a
Unable to determine the device handle for GPU 0000:04:00.0: Unable to communicate with GPU because it is insufficiently powered.
This may be because not all required external power cables are
attached, or the attached cables are not seated properly.

I think I am good to go now!!

Thank you!

root@coined:~# cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module 331.38 Wed Jan 8 19:32:30 PST 2014
GCC version: gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5)
root@coined:~# nvidia-smi -a

==============NVSMI LOG==============

Timestamp : Sun Feb 16 11:49:41 2014
Driver Version : 331.38

Attached GPUs : 4
GPU 0000:04:00.0
Product Name : Tesla K10.G1.8GB
Display Mode : Disabled
Display Active : Disabled
Persistence Mode : Disabled
Accounting Mode : Disabled
Accounting Mode Buffer Size : 128
Driver Model
Current : N/A
Pending : N/A
Serial Number : 0321613047527
GPU UUID : GPU-10eaa126-fafa-3c05-13f9-f371c9a0c232
Minor Number : 0
VBIOS Version : 80.04.59.00.2B
Inforom Version
Image Version : 2055.0200.01.06
OEM Object : 1.1
ECC Object : 2.0
Power Management Object : N/A
GPU Operation Mode
Current : N/A
Pending : N/A
PCI
Bus : 0x04
Device : 0x00
Domain : 0x0000
Device Id : 0x118F10DE
Bus Id : 0000:04:00.0
Sub System Id : 0x097010DE
GPU Link Info
PCIe Generation
Max : 3
Current : 3
Link Width
Max : 16x
Current : 16x
Bridge Chip
Type : PLX
Firmware : 0
Fan Speed : N/A
Performance State : P0
Clocks Throttle Reasons
Idle : Not Active
Applications Clocks Setting : Not Active
SW Power Cap : Not Active
HW Slowdown : Not Active
Unknown : Not Active
FB Memory Usage
Total : 3583 MiB
Used : 9 MiB
Free : 3574 MiB
BAR1 Memory Usage
Total : 256 MiB
Used : 2 MiB
Free : 254 MiB
Compute Mode : Default
Utilization
Gpu : 0 %
Memory : 0 %
Ecc Mode
Current : Enabled
Pending : Enabled
ECC Errors
Volatile
Single Bit
Device Memory : 0
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Total : 0
Double Bit
Device Memory : 0
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Total : 0
Aggregate
Single Bit
Device Memory : 0
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Total : 0
Double Bit
Device Memory : 0
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Total : 0
Retired Pages
Single Bit ECC : N/A
Double Bit ECC : N/A
Pending : N/A
Temperature
Gpu : 40 C
Power Readings
Power Management : Supported
Power Draw : 42.32 W
Power Limit : 117.50 W
Default Power Limit : 117.50 W
Enforced Power Limit : 117.50 W
Min Power Limit : 85.00 W
Max Power Limit : 125.00 W
Clocks
Graphics : 745 MHz
SM : 745 MHz
Memory : 2500 MHz
Applications Clocks
Graphics : 745 MHz
Memory : 2500 MHz
Default Applications Clocks
Graphics : 745 MHz
Memory : 2500 MHz
Max Clocks
Graphics : 745 MHz
SM : 745 MHz
Memory : 2500 MHz
Compute Processes : None

GPU 0000:05:00.0
Product Name : Tesla K10.G1.8GB
Display Mode : Disabled
Display Active : Disabled
Persistence Mode : Disabled
Accounting Mode : Disabled
Accounting Mode Buffer Size : 128
Driver Model
Current : N/A
Pending : N/A
Serial Number : 0321613047527
GPU UUID : GPU-6db4f28c-7ddf-baf6-3d29-0d9c338dac7c
Minor Number : 1
VBIOS Version : 80.04.59.00.2C
Inforom Version
Image Version : 2055.0200.01.06
OEM Object : 1.1
ECC Object : 2.0
Power Management Object : N/A
GPU Operation Mode
Current : N/A
Pending : N/A
PCI
Bus : 0x05
Device : 0x00
Domain : 0x0000
Device Id : 0x118F10DE
Bus Id : 0000:05:00.0
Sub System Id : 0x097010DE
GPU Link Info
PCIe Generation
Max : 3
Current : 3
Link Width
Max : 16x
Current : 16x
Bridge Chip
Type : PLX
Firmware : 0
Fan Speed : N/A
Performance State : P0
Clocks Throttle Reasons
Idle : Not Active
Applications Clocks Setting : Not Active
SW Power Cap : Not Active
HW Slowdown : Not Active
Unknown : Not Active
FB Memory Usage
Total : 3583 MiB
Used : 9 MiB
Free : 3574 MiB
BAR1 Memory Usage
Total : 256 MiB
Used : 2 MiB
Free : 254 MiB
Compute Mode : Default
Utilization
Gpu : 0 %
Memory : 0 %
Ecc Mode
Current : Enabled
Pending : Enabled
ECC Errors
Volatile
Single Bit
Device Memory : 0
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Total : 0
Double Bit
Device Memory : 0
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Total : 0
Aggregate
Single Bit
Device Memory : 0
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Total : 0
Double Bit
Device Memory : 0
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Total : 0
Retired Pages
Single Bit ECC : N/A
Double Bit ECC : N/A
Pending : N/A
Temperature
Gpu : 32 C
Power Readings
Power Management : Supported
Power Draw : 41.74 W
Power Limit : 117.50 W
Default Power Limit : 117.50 W
Enforced Power Limit : 117.50 W
Min Power Limit : 85.00 W
Max Power Limit : 125.00 W
Clocks
Graphics : 745 MHz
SM : 745 MHz
Memory : 2500 MHz
Applications Clocks
Graphics : 745 MHz
Memory : 2500 MHz
Default Applications Clocks
Graphics : 745 MHz
Memory : 2500 MHz
Max Clocks
Graphics : 745 MHz
SM : 745 MHz
Memory : 2500 MHz
Compute Processes : None

GPU 0000:0A:00.0
Product Name : Tesla K10.G1.8GB
Display Mode : Disabled
Display Active : Disabled
Persistence Mode : Disabled
Accounting Mode : Disabled
Accounting Mode Buffer Size : 128
Driver Model
Current : N/A
Pending : N/A
Serial Number : 0322812084574
GPU UUID : GPU-aa626009-02ae-e5fc-9e4f-9bf9eea27bbf
Minor Number : 2
VBIOS Version : 80.04.45.00.03
Inforom Version
Image Version : 2055.0200.01.04
OEM Object : 1.1
ECC Object : 2.0
Power Management Object : N/A
GPU Operation Mode
Current : N/A
Pending : N/A
PCI
Bus : 0x0A
Device : 0x00
Domain : 0x0000
Device Id : 0x118F10DE
Bus Id : 0000:0A:00.0
Sub System Id : 0x097010DE
GPU Link Info
PCIe Generation
Max : 3
Current : 3
Link Width
Max : 16x
Current : 16x
Bridge Chip
Type : PLX
Firmware : 4031187200
Fan Speed : N/A
Performance State : P0
Clocks Throttle Reasons
Idle : Not Active
Applications Clocks Setting : Not Active
SW Power Cap : Not Active
HW Slowdown : Not Active
Unknown : Not Active
FB Memory Usage
Total : 4095 MiB
Used : 9 MiB
Free : 4086 MiB
BAR1 Memory Usage
Total : 256 MiB
Used : 2 MiB
Free : 254 MiB
Compute Mode : Default
Utilization
Gpu : 0 %
Memory : 0 %
Ecc Mode
Current : Disabled
Pending : Disabled
ECC Errors
Volatile
Single Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Total : N/A
Double Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Total : N/A
Aggregate
Single Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Total : N/A
Double Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Total : N/A
Retired Pages
Single Bit ECC : N/A
Double Bit ECC : N/A
Pending : N/A
Temperature
Gpu : 35 C
Power Readings
Power Management : Supported
Power Draw : 41.62 W
Power Limit : 117.50 W
Default Power Limit : 117.50 W
Enforced Power Limit : 117.50 W
Min Power Limit : 85.00 W
Max Power Limit : 125.00 W
Clocks
Graphics : 745 MHz
SM : 745 MHz
Memory : 2500 MHz
Applications Clocks
Graphics : 745 MHz
Memory : 2500 MHz
Default Applications Clocks
Graphics : 745 MHz
Memory : 2500 MHz
Max Clocks
Graphics : 745 MHz
SM : 745 MHz
Memory : 2500 MHz
Compute Processes : None

GPU 0000:0B:00.0
Product Name : Tesla K10.G1.8GB
Display Mode : Disabled
Display Active : Disabled
Persistence Mode : Disabled
Accounting Mode : Disabled
Accounting Mode Buffer Size : 128
Driver Model
Current : N/A
Pending : N/A
Serial Number : 0322812084574
GPU UUID : GPU-246f1850-11ce-2d97-23e4-1f0afd90071f
Minor Number : 3
VBIOS Version : 80.04.45.00.04
Inforom Version
Image Version : 2055.0200.01.04
OEM Object : 1.1
ECC Object : 2.0
Power Management Object : N/A
GPU Operation Mode
Current : N/A
Pending : N/A
PCI
Bus : 0x0B
Device : 0x00
Domain : 0x0000
Device Id : 0x118F10DE
Bus Id : 0000:0B:00.0
Sub System Id : 0x097010DE
GPU Link Info
PCIe Generation
Max : 3
Current : 3
Link Width
Max : 16x
Current : 16x
Bridge Chip
Type : PLX
Firmware : 4031187200
Fan Speed : N/A
Performance State : P0
Clocks Throttle Reasons
Idle : Not Active
Applications Clocks Setting : Not Active
SW Power Cap : Not Active
HW Slowdown : Not Active
Unknown : Not Active
FB Memory Usage
Total : 4095 MiB
Used : 9 MiB
Free : 4086 MiB
BAR1 Memory Usage
Total : 256 MiB
Used : 2 MiB
Free : 254 MiB
Compute Mode : Default
Utilization
Gpu : 0 %
Memory : 0 %
Ecc Mode
Current : Disabled
Pending : Disabled
ECC Errors
Volatile
Single Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Total : N/A
Double Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Total : N/A
Aggregate
Single Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Total : N/A
Double Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Total : N/A
Retired Pages
Single Bit ECC : N/A
Double Bit ECC : N/A
Pending : N/A
Temperature
Gpu : 28 C
Power Readings
Power Management : Supported
Power Draw : 41.78 W
Power Limit : 117.50 W
Default Power Limit : 117.50 W
Enforced Power Limit : 117.50 W
Min Power Limit : 85.00 W
Max Power Limit : 125.00 W
Clocks
Graphics : 745 MHz
SM : 745 MHz
Memory : 2500 MHz
Applications Clocks
Graphics : 745 MHz
Memory : 2500 MHz
Default Applications Clocks
Graphics : 745 MHz
Memory : 2500 MHz
Max Clocks
Graphics : 745 MHz
SM : 745 MHz
Memory : 2500 MHz
Compute Processes : None

root@coined:~# cudaminer -H 1 -i 0 -l auto -C 1 -o stratum+tcp://mine.hashrapid.com:3335 -O tunage.tunage:1234
*** CudaMiner for nVidia GPUs by Christian Buchner ***
This is version 2014-02-09 (beta)
based on pooler-cpuminer 2.3.2 © 2010 Jeff Garzik, 2012 pooler
Cuda additions Copyright 2013,2014 Christian Buchner
LTC donation address: LKS1WDKGED647msBQfLBHV3Ls8sveGncnm
BTC donation address: 16hJF5mceSojnTD3ZTUDqdRhDyPJzoRakM
YAC donation address: Y87sptDEcpLkLeAuex6qZioDbvy1qXZEj4
[2014-02-16 11:49:52] Starting Stratum on stratum+tcp://mine.hashrapid.com:3335
[2014-02-16 11:49:52] 4 miner threads started, using ‘scrypt’ algorithm.

So it looks like you’re running with two K10 devices, not 3. I think that is probably why things are working now. The system BIOS may be struggling to assign PCI resources with all 3 K10 devices.

You’ll note that nvidia-smi can be used to display GPU temperatures.

I’d keep an eye on GPU temps using that tool (you can run it while your miner app is running, in a separate command console). Something like this:

nvidia-smi -a -l |grep -A 1 Temperature

If any of the temps get above 75-80C I’d keep a close eye on things.
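
If you would rather have a script flag hot GPUs than watch the grep output by eye, something like the following Python sketch could work. It assumes the nvidia-smi -a layout shown earlier in this thread (a "Temperature" line followed by "Gpu : NN C"); the function names and the 75 C default are my own choices, taken from the lower end of the range above, not anything nvidia-smi defines.

```python
import re

# Matches a "Temperature" header followed by its "Gpu : NN C" line,
# tolerating whatever indentation nvidia-smi uses.
TEMP_RE = re.compile(r"Temperature\s+Gpu\s*:\s*(\d+)\s*C")


def gpu_temps(nvidia_smi_text):
    """Return GPU temperatures in C, in the order nvidia-smi lists the GPUs."""
    return [int(t) for t in TEMP_RE.findall(nvidia_smi_text)]


def hot_gpus(temps, limit_c=75):
    """Return (position, temperature) pairs for GPUs above limit_c."""
    return [(i, t) for i, t in enumerate(temps) if t > limit_c]


# Illustrative input: the second reading is invented to show a hot GPU.
sample = """\
Temperature
    Gpu : 40 C
Power Readings
    Power Draw : 42.32 W
Temperature
    Gpu : 82 C
"""

temps = gpu_temps(sample)
print(temps)
print(hot_gpus(temps))
```

The position in the result is just the order of appearance in the nvidia-smi output, so match it against the Bus Id lines to find the physical card.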