I have two GTX 750 Ti cards installed on Ubuntu 14.04 with CUDA 7.5. The machine is custom-built on a Gigabyte GA-Z77P-D3 motherboard with the latest BIOS version, F8e, installed. Peer-to-peer memory access is reported as unavailable, according to the output of the simpleP2P program shown below.
[./simpleP2P] - Starting...
Checking for multiple GPUs...
CUDA-capable device count: 2
> GPU0 = "GeForce GTX 750 Ti" IS capable of Peer-to-Peer (P2P)
> GPU1 = "GeForce GTX 750 Ti" IS capable of Peer-to-Peer (P2P)
Checking GPU(s) for support of peer to peer memory access...
> Peer access from GeForce GTX 750 Ti (GPU0) -> GeForce GTX 750 Ti (GPU1) : No
> Peer access from GeForce GTX 750 Ti (GPU1) -> GeForce GTX 750 Ti (GPU0) : No
Two or more GPUs with SM 2.0 or higher capability are required for ./simpleP2P.
Peer to Peer access is not available amongst GPUs in the system, waiving test.
It seems to me that both GPUs are already on the same root complex, judging by the output of:
nvidia-smi topo -m
GPU0 GPU1 CPU Affinity
GPU0 X PHB 0-7
GPU1 PHB X 0-7
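For anyone who wants to read this matrix programmatically, here is a minimal Python sketch. It is not part of any NVIDIA tool: `parse_topo` and `SAMPLE` are hypothetical names, and the whitespace-separated layout is assumed from the output pasted above. In the nvidia-smi legend, PHB means the path traverses a PCIe host bridge.

```python
# Sketch only: parses a pasted `nvidia-smi topo -m` matrix into
# {(row_gpu, column_gpu): link_type}. The layout is assumed from the
# output shown in this post; other driver versions may add columns.

SAMPLE = """\
GPU0 GPU1 CPU Affinity
GPU0 X PHB 0-7
GPU1 PHB X 0-7
"""


def parse_topo(text):
    rows = [ln.split() for ln in text.strip().splitlines() if ln.strip()]
    # Column names are the header tokens that look like GPU identifiers.
    gpus = [tok for tok in rows[0] if tok.startswith("GPU")]
    links = {}
    for row in rows[1:]:
        name = row[0]
        if not name.startswith("GPU"):
            continue  # skip legend lines or non-GPU rows
        cells = row[1:1 + len(gpus)]
        for peer, cell in zip(gpus, cells):
            if peer != name:  # ignore the "X" (self) entry
                links[(name, peer)] = cell
    return links


if __name__ == "__main__":
    print(parse_topo(SAMPLE))
    # {('GPU0', 'GPU1'): 'PHB', ('GPU1', 'GPU0'): 'PHB'}
```

Note that this only reports how nvidia-smi classifies the path between the two cards; it does not by itself say whether the driver will enable peer access.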
Following the instructions given in https://devtalk.nvidia.com/default/topic/883054/multi-gpu-peer-to-peer-access-failing-on-tesla-k80-/, I am also providing the following outputs.
dmesg |grep NVRM
[ 14.796257] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 352.39 Fri Aug 14 18:09:10 PDT 2015
nvidia-smi -a
==============NVSMI LOG==============
Timestamp : Fri Nov 4 18:27:22 2016
Driver Version : 352.39
Attached GPUs : 2
GPU 0000:01:00.0
Product Name : GeForce GTX 750 Ti
Product Brand : GeForce
Display Mode : Enabled
Display Active : Enabled
Persistence Mode : Disabled
Accounting Mode : Disabled
Accounting Mode Buffer Size : 1920
Driver Model
Current : N/A
Pending : N/A
Serial Number : N/A
GPU UUID : GPU-6f650f14-7f8a-2b8a-8e84-1d6664ae433e
Minor Number : 0
VBIOS Version : 82.07.55.00.34
MultiGPU Board : No
Board ID : 0x100
Inforom Version
Image Version : G001.0000.00.01
OEM Object : N/A
ECC Object : N/A
Power Management Object : N/A
GPU Operation Mode
Current : N/A
Pending : N/A
PCI
Bus : 0x01
Device : 0x00
Domain : 0x0000
Device Id : 0x138010DE
Bus Id : 0000:01:00.0
Sub System Id : 0x84BB1043
GPU Link Info
PCIe Generation
Max : 3
Current : 1
Link Width
Max : 16x
Current : 16x
Bridge Chip
Type : N/A
Firmware : N/A
Replays since reset : 0
Tx Throughput : 0 KB/s
Rx Throughput : 0 KB/s
Fan Speed : 29 %
Performance State : P8
Clocks Throttle Reasons : Unknown Error
FB Memory Usage
Total : 2047 MiB
Used : 157 MiB
Free : 1890 MiB
BAR1 Memory Usage
Total : 256 MiB
Used : 3 MiB
Free : 253 MiB
Compute Mode : Default
Utilization
Gpu : 2 %
Memory : 3 %
Encoder : 0 %
Decoder : 0 %
Ecc Mode
Current : N/A
Pending : N/A
ECC Errors
Volatile
Single Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Total : N/A
Double Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Total : N/A
Aggregate
Single Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Total : N/A
Double Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Total : N/A
Retired Pages
Single Bit ECC : N/A
Double Bit ECC : N/A
Pending : N/A
Temperature
GPU Current Temp : 31 C
GPU Shutdown Temp : 101 C
GPU Slowdown Temp : 96 C
Power Readings
Power Management : Supported
Power Draw : 0.60 W
Power Limit : 38.50 W
Default Power Limit : 38.50 W
Enforced Power Limit : 38.50 W
Min Power Limit : 30.00 W
Max Power Limit : 38.50 W
Clocks
Graphics : 135 MHz
SM : 135 MHz
Memory : 405 MHz
Applications Clocks
Graphics : 1071 MHz
Memory : 2700 MHz
Default Applications Clocks
Graphics : 1071 MHz
Memory : 2700 MHz
Max Clocks
Graphics : 1346 MHz
SM : 1346 MHz
Memory : 2700 MHz
Clock Policy
Auto Boost : N/A
Auto Boost Default : N/A
Processes
Process ID : 1217
Type : G
Name : /usr/bin/X
Used GPU Memory : 116 MiB
Process ID : 2077
Type : G
Name : compiz
Used GPU Memory : 32 MiB
GPU 0000:06:00.0
Product Name : GeForce GTX 750 Ti
Product Brand : GeForce
Display Mode : Disabled
Display Active : Disabled
Persistence Mode : Disabled
Accounting Mode : Disabled
Accounting Mode Buffer Size : 1920
Driver Model
Current : N/A
Pending : N/A
Serial Number : N/A
GPU UUID : GPU-2b61dac6-1c2b-8107-d8ad-00b2f7475326
Minor Number : 1
VBIOS Version : 82.07.55.00.34
MultiGPU Board : No
Board ID : 0x600
Inforom Version
Image Version : G001.0000.00.01
OEM Object : N/A
ECC Object : N/A
Power Management Object : N/A
GPU Operation Mode
Current : N/A
Pending : N/A
PCI
Bus : 0x06
Device : 0x00
Domain : 0x0000
Device Id : 0x138010DE
Bus Id : 0000:06:00.0
Sub System Id : 0x84BB1043
GPU Link Info
PCIe Generation
Max : 2
Current : 1
Link Width
Max : 16x
Current : 4x
Bridge Chip
Type : N/A
Firmware : N/A
Replays since reset : 0
Tx Throughput : 0 KB/s
Rx Throughput : 0 KB/s
Fan Speed : 29 %
Performance State : P8
Clocks Throttle Reasons : Unknown Error
FB Memory Usage
Total : 2047 MiB
Used : 7 MiB
Free : 2040 MiB
BAR1 Memory Usage
Total : 256 MiB
Used : 1 MiB
Free : 255 MiB
Compute Mode : Default
Utilization
Gpu : 0 %
Memory : 0 %
Encoder : 0 %
Decoder : 0 %
Ecc Mode
Current : N/A
Pending : N/A
ECC Errors
Volatile
Single Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Total : N/A
Double Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Total : N/A
Aggregate
Single Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Total : N/A
Double Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Total : N/A
Retired Pages
Single Bit ECC : N/A
Double Bit ECC : N/A
Pending : N/A
Temperature
GPU Current Temp : 28 C
GPU Shutdown Temp : 101 C
GPU Slowdown Temp : 96 C
Power Readings
Power Management : Supported
Power Draw : 0.62 W
Power Limit : 38.50 W
Default Power Limit : 38.50 W
Enforced Power Limit : 38.50 W
Min Power Limit : 30.00 W
Max Power Limit : 38.50 W
Clocks
Graphics : 135 MHz
SM : 135 MHz
Memory : 405 MHz
Applications Clocks
Graphics : 1071 MHz
Memory : 2700 MHz
Default Applications Clocks
Graphics : 1071 MHz
Memory : 2700 MHz
Max Clocks
Graphics : 1346 MHz
SM : 1346 MHz
Memory : 2700 MHz
Clock Policy
Auto Boost : N/A
Auto Boost Default : N/A
Processes : None
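The two cards do not report identical PCIe link fields in the log above: GPU 0000:01:00.0 shows a maximum generation of 3 and a current width of 16x, while GPU 0000:06:00.0 shows a maximum generation of 2 and a current width of 4x. Whether this is related to the peer access result is not something the log itself establishes. To pull just these fields out of a long `nvidia-smi -a` dump, a small Python sketch like the following could be used. `pcie_links` and `SAMPLE` are hypothetical names, and the parser assumes the field labels appear exactly as in the log pasted here.

```python
import re

# Sketch only: extracts "PCIe Generation" and "Link Width" (Max/Current)
# per GPU from text shaped like the `nvidia-smi -a` output in this post.
# SAMPLE is an excerpt of that output, trimmed to the relevant fields.

SAMPLE = """\
GPU 0000:01:00.0
GPU Link Info
PCIe Generation
Max : 3
Current : 1
Link Width
Max : 16x
Current : 16x
GPU 0000:06:00.0
GPU Link Info
PCIe Generation
Max : 2
Current : 1
Link Width
Max : 16x
Current : 4x
"""

# Matches a GPU header line such as "GPU 0000:01:00.0".
GPU_HEADER = re.compile(
    r"GPU ([0-9A-Fa-f]{4,8}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}\.[0-9A-Fa-f])$"
)
SECTIONS = ("PCIe Generation", "Link Width")


def pcie_links(log):
    result = {}
    gpu = None      # bus ID of the GPU block currently being read
    section = None  # "PCIe Generation" or "Link Width" while inside one
    for raw in log.splitlines():
        line = raw.strip()
        header = GPU_HEADER.match(line)
        if header:
            gpu = header.group(1)
            result[gpu] = {}
            section = None
            continue
        if gpu is None:
            continue
        if line in SECTIONS:
            section = line
            result[gpu][section] = {}
            continue
        if section is not None:
            key, sep, value = line.partition(":")
            key = key.strip()
            if sep and key in ("Max", "Current"):
                result[gpu][section][key] = value.strip()
            else:
                section = None  # any other line ends the section
    return result


if __name__ == "__main__":
    for bus_id, info in pcie_links(SAMPLE).items():
        print(bus_id, info)
```

The same function should also work on the full, indented output, since each line is stripped before matching.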
lspci |grep VGA
01:00.0 VGA compatible controller: NVIDIA Corporation GM107 [GeForce GTX 750 Ti] (rev a2)
06:00.0 VGA compatible controller: NVIDIA Corporation GM107 [GeForce GTX 750 Ti] (rev a2)
lspci -s 01:00.0 -vvv
01:00.0 VGA compatible controller: NVIDIA Corporation GM107 [GeForce GTX 750 Ti] (rev a2) (prog-if 00 [VGA controller])
Subsystem: ASUSTeK Computer Inc. Device 84bb
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 30
Region 0: Memory at f6000000 (32-bit, non-prefetchable)
Region 1: Memory at e0000000 (64-bit, prefetchable)
Region 3: Memory at f0000000 (64-bit, prefetchable)
Region 5: I/O ports at e000
[virtual] Expansion ROM at f7000000 [disabled]
Capabilities: <access denied>
Kernel driver in use: nvidia
lspci -s 06:00.0 -vvv
06:00.0 VGA compatible controller: NVIDIA Corporation GM107 [GeForce GTX 750 Ti] (rev a2) (prog-if 00 [VGA controller])
Subsystem: ASUSTeK Computer Inc. Device 84bb
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 31
Region 0: Memory at f4000000 (32-bit, non-prefetchable)
Region 1: Memory at c0000000 (64-bit, prefetchable)
Region 3: Memory at d0000000 (64-bit, prefetchable)
Region 5: I/O ports at c000
[virtual] Expansion ROM at f5000000 [disabled]
Capabilities: <access denied>
Kernel driver in use: nvidia
Could it be that the motherboard is too old?
I also tried setting the BIOS to use the on-board display, but simpleP2P still failed.
Is there anything I can do to get peer-to-peer memory access to work?