Issues with P2P transfer on Tesla K80

I’m testing Amber14 (a molecular dynamics application) on a workstation with four Tesla K80s (logically 8 GPUs).

I have two issues:

  1. GPUDirect 2.0 P2P transfer is not available between any two physical K80 cards.
    It is only available between the two GK210 GPUs within each K80 card.
    (I suspect that the PCIe implementation of the E5-2600 is not sufficient to handle P2P transfers across multi-layer PCIe switches.)

  2. When I run Amber14 on the two GK210 GPUs of a single K80 card with P2P transfer enabled,
    it runs, but performance is extremely poor (about 38 times slower).
    Example: Amber14, “DHFR NPT PME 4fs” benchmark:
    without P2P (MPI transfer): 259.12 ns/day
    with P2P: 6.82 ns/day

I don’t think Amber14 is the root cause of these issues,
because I’ve seen the same behavior with another P2P-enabled CUDA application.

Has anyone else run into similar issues?

Here is the specification of my workstation:
Machine: Supermicro 7048GR-TR
CPU: Intel Xeon E5-2698v3 *2
Memory: 128GB DDR4 2133MHz
GPU: Tesla K80 *4

Best regards,

Quoting txbob from an older forum thread: “A requirement for P2P (GPU to GPU transfers in the same server node) is that both GPUs in question must be on the same PCIE root complex. This effectively means they must both be plugged into slots that are serviced by the same CPU socket.”

What is the output of lspci -t?

Thank you for the reply.

Unfortunately, two of the four Tesla K80s are currently rented out,
so I am pasting the output of lspci -t with the remaining two Tesla K80s installed:

-+-[0000:ff]-+-08.0
 |           +-08.2
 |           +-08.3
 |           +-09.0
 |           +-09.2
 |           +-09.3
 |           +-0b.0
 |           +-0b.1
 |           +-0b.2
 |           +-0c.0
 |           +-0c.1
 |           +-0c.2
 |           +-0c.3
 |           +-0c.4
 |           +-0c.5
 |           +-0c.6
 |           +-0c.7
 |           +-0d.0
 |           +-0d.1
 |           +-0d.2
 |           +-0d.3
 |           +-0d.4
 |           +-0d.5
 |           +-0d.6
 |           +-0d.7
 |           +-0f.0
 |           +-0f.1
 |           +-0f.2
 |           +-0f.3
 |           +-0f.4
 |           +-0f.5
 |           +-0f.6
 |           +-10.0
 |           +-10.1
 |           +-10.5
 |           +-10.6
 |           +-10.7
 |           +-12.0
 |           +-12.1
 |           +-12.4
 |           +-12.5
 |           +-13.0
 |           +-13.1
 |           +-13.2
 |           +-13.3
 |           +-13.6
 |           +-13.7
 |           +-14.0
 |           +-14.1
 |           +-14.2
 |           +-14.3
 |           +-14.4
 |           +-14.5
 |           +-14.6
 |           +-14.7
 |           +-16.0
 |           +-16.1
 |           +-16.2
 |           +-16.3
 |           +-16.6
 |           +-16.7
 |           +-17.0
 |           +-17.1
 |           +-17.2
 |           +-17.3
 |           +-17.4
 |           +-17.5
 |           +-17.6
 |           +-17.7
 |           +-1e.0
 |           +-1e.1
 |           +-1e.2
 |           +-1e.3
 |           +-1e.4
 |           +-1f.0
 |           \-1f.2
 +-[0000:80]-+-00.0-[81]--+-00.0
 |           |            \-00.1
 |           +-04.0
 |           +-04.1
 |           +-04.2
 |           +-04.3
 |           +-04.4
 |           +-04.5
 |           +-04.6
 |           +-04.7
 |           +-05.0
 |           +-05.1
 |           +-05.2
 |           \-05.4
 +-[0000:7f]-+-08.0
 |           +-08.2
 |           +-08.3
 |           +-09.0
 |           +-09.2
 |           +-09.3
 |           +-0b.0
 |           +-0b.1
 |           +-0b.2
 |           +-0c.0
 |           +-0c.1
 |           +-0c.2
 |           +-0c.3
 |           +-0c.4
 |           +-0c.5
 |           +-0c.6
 |           +-0c.7
 |           +-0d.0
 |           +-0d.1
 |           +-0d.2
 |           +-0d.3
 |           +-0d.4
 |           +-0d.5
 |           +-0d.6
 |           +-0d.7
 |           +-0f.0
 |           +-0f.1
 |           +-0f.2
 |           +-0f.3
 |           +-0f.4
 |           +-0f.5
 |           +-0f.6
 |           +-10.0
 |           +-10.1
 |           +-10.5
 |           +-10.6
 |           +-10.7
 |           +-12.0
 |           +-12.1
 |           +-12.4
 |           +-12.5
 |           +-13.0
 |           +-13.1
 |           +-13.2
 |           +-13.3
 |           +-13.6
 |           +-13.7
 |           +-14.0
 |           +-14.1
 |           +-14.2
 |           +-14.3
 |           +-14.4
 |           +-14.5
 |           +-14.6
 |           +-14.7
 |           +-16.0
 |           +-16.1
 |           +-16.2
 |           +-16.3
 |           +-16.6
 |           +-16.7
 |           +-17.0
 |           +-17.1
 |           +-17.2
 |           +-17.3
 |           +-17.4
 |           +-17.5
 |           +-17.6
 |           +-17.7
 |           +-1e.0
 |           +-1e.1
 |           +-1e.2
 |           +-1e.3
 |           +-1e.4
 |           +-1f.0
 |           \-1f.2
 \-[0000:00]-+-00.0
             +-01.0-[01]--
             +-02.0-[02-05]----00.0-[03-05]--+-08.0-[04]----00.0
             |                               \-10.0-[05]----00.0
             +-03.0-[06-09]----00.0-[07-09]--+-08.0-[08]----00.0
             |                               \-10.0-[09]----00.0
             +-04.0
             +-04.1
             +-04.2
             +-04.3
             +-04.4
             +-04.5
             +-04.6
             +-04.7
             +-05.0
             +-05.1
             +-05.2
             +-05.4
             +-11.0
             +-11.4
             +-14.0
             +-16.0
             +-16.1
             +-1a.0
             +-1b.0
             +-1c.0-[0a]--
             +-1c.3-[0b-0c]----00.0-[0c]----00.0
             +-1c.4-[0d-45]--
             +-1d.0
             +-1f.0
             +-1f.2
             +-1f.3
             \-1f.6

Here is the output of lspci -tv:

-+-[0000:ff]-+-08.0  Intel Corporation Haswell-E QPI Link 0
 |           +-08.2  Intel Corporation Haswell-E QPI Link 0
 |           +-08.3  Intel Corporation Haswell-E QPI Link 0
 |           +-09.0  Intel Corporation Haswell-E QPI Link 1
 |           +-09.2  Intel Corporation Haswell-E QPI Link 1
 |           +-09.3  Intel Corporation Haswell-E QPI Link 1
 |           +-0b.0  Intel Corporation Haswell-E R3 QPI Link 0 & 1 Monitoring
 |           +-0b.1  Intel Corporation Haswell-E R3 QPI Link 0 & 1 Monitoring
 |           +-0b.2  Intel Corporation Haswell-E R3 QPI Link 0 & 1 Monitoring
 |           +-0c.0  Intel Corporation Haswell-E Unicast Registers
 |           +-0c.1  Intel Corporation Haswell-E Unicast Registers
 |           +-0c.2  Intel Corporation Haswell-E Unicast Registers
 |           +-0c.3  Intel Corporation Haswell-E Unicast Registers
 |           +-0c.4  Intel Corporation Haswell-E Unicast Registers
 |           +-0c.5  Intel Corporation Haswell-E Unicast Registers
 |           +-0c.6  Intel Corporation Haswell-E Unicast Registers
 |           +-0c.7  Intel Corporation Haswell-E Unicast Registers
 |           +-0d.0  Intel Corporation Haswell-E Unicast Registers
 |           +-0d.1  Intel Corporation Haswell-E Unicast Registers
 |           +-0d.2  Intel Corporation Haswell-E Unicast Registers
 |           +-0d.3  Intel Corporation Haswell-E Unicast Registers
 |           +-0d.4  Intel Corporation Haswell-E Unicast Registers
 |           +-0d.5  Intel Corporation Haswell-E Unicast Registers
 |           +-0d.6  Intel Corporation Haswell-E Unicast Registers
 |           +-0d.7  Intel Corporation Haswell-E Unicast Registers
 |           +-0f.0  Intel Corporation Haswell-E Buffered Ring Agent
 |           +-0f.1  Intel Corporation Haswell-E Buffered Ring Agent
 |           +-0f.2  Intel Corporation Haswell-E Buffered Ring Agent
 |           +-0f.3  Intel Corporation Haswell-E Buffered Ring Agent
 |           +-0f.4  Intel Corporation Haswell-E System Address Decoder & Broadcast Registers
 |           +-0f.5  Intel Corporation Haswell-E System Address Decoder & Broadcast Registers
 |           +-0f.6  Intel Corporation Haswell-E System Address Decoder & Broadcast Registers
 |           +-10.0  Intel Corporation Haswell-E PCIe Ring Interface
 |           +-10.1  Intel Corporation Haswell-E PCIe Ring Interface
 |           +-10.5  Intel Corporation Haswell-E Scratchpad & Semaphore Registers
 |           +-10.6  Intel Corporation Haswell-E Scratchpad & Semaphore Registers
 |           +-10.7  Intel Corporation Haswell-E Scratchpad & Semaphore Registers
 |           +-12.0  Intel Corporation Haswell-E Home Agent 0
 |           +-12.1  Intel Corporation Haswell-E Home Agent 0
 |           +-12.4  Intel Corporation Haswell-E Home Agent 1
 |           +-12.5  Intel Corporation Haswell-E Home Agent 1
 |           +-13.0  Intel Corporation Haswell-E Integrated Memory Controller 0 Target Address, Thermal & RAS Registers
 |           +-13.1  Intel Corporation Haswell-E Integrated Memory Controller 0 Target Address, Thermal & RAS Registers
 |           +-13.2  Intel Corporation Haswell-E Integrated Memory Controller 0 Channel Target Address Decoder
 |           +-13.3  Intel Corporation Haswell-E Integrated Memory Controller 0 Channel Target Address Decoder
 |           +-13.6  Intel Corporation Haswell-E DDRIO Channel 0/1 Broadcast
 |           +-13.7  Intel Corporation Haswell-E DDRIO Global Broadcast
 |           +-14.0  Intel Corporation Haswell-E Integrated Memory Controller 0 Channel 0 Thermal Control
 |           +-14.1  Intel Corporation Haswell-E Integrated Memory Controller 0 Channel 1 Thermal Control
 |           +-14.2  Intel Corporation Haswell-E Integrated Memory Controller 0 Channel 0 ERROR Registers
 |           +-14.3  Intel Corporation Haswell-E Integrated Memory Controller 0 Channel 1 ERROR Registers
 |           +-14.4  Intel Corporation Haswell-E DDRIO (VMSE) 0 & 1
 |           +-14.5  Intel Corporation Haswell-E DDRIO (VMSE) 0 & 1
 |           +-14.6  Intel Corporation Haswell-E DDRIO (VMSE) 0 & 1
 |           +-14.7  Intel Corporation Haswell-E DDRIO (VMSE) 0 & 1
 |           +-16.0  Intel Corporation Haswell-E Integrated Memory Controller 1 Target Address, Thermal & RAS Registers
 |           +-16.1  Intel Corporation Haswell-E Integrated Memory Controller 1 Target Address, Thermal & RAS Registers
 |           +-16.2  Intel Corporation Haswell-E Integrated Memory Controller 1 Channel Target Address Decoder
 |           +-16.3  Intel Corporation Haswell-E Integrated Memory Controller 1 Channel Target Address Decoder
 |           +-16.6  Intel Corporation Haswell-E DDRIO Channel 2/3 Broadcast
 |           +-16.7  Intel Corporation Haswell-E DDRIO Global Broadcast
 |           +-17.0  Intel Corporation Haswell-E Integrated Memory Controller 1 Channel 0 Thermal Control
 |           +-17.1  Intel Corporation Haswell-E Integrated Memory Controller 1 Channel 1 Thermal Control
 |           +-17.2  Intel Corporation Haswell-E Integrated Memory Controller 1 Channel 0 ERROR Registers
 |           +-17.3  Intel Corporation Haswell-E Integrated Memory Controller 1 Channel 1 ERROR Registers
 |           +-17.4  Intel Corporation Haswell-E DDRIO (VMSE) 2 & 3
 |           +-17.5  Intel Corporation Haswell-E DDRIO (VMSE) 2 & 3
 |           +-17.6  Intel Corporation Haswell-E DDRIO (VMSE) 2 & 3
 |           +-17.7  Intel Corporation Haswell-E DDRIO (VMSE) 2 & 3
 |           +-1e.0  Intel Corporation Haswell-E Power Control Unit
 |           +-1e.1  Intel Corporation Haswell-E Power Control Unit
 |           +-1e.2  Intel Corporation Haswell-E Power Control Unit
 |           +-1e.3  Intel Corporation Haswell-E Power Control Unit
 |           +-1e.4  Intel Corporation Haswell-E Power Control Unit
 |           +-1f.0  Intel Corporation Haswell-E VCU
 |           \-1f.2  Intel Corporation Haswell-E VCU
 +-[0000:80]-+-00.0-[81]--+-00.0  Intel Corporation I350 Gigabit Network Connection
 |           |            \-00.1  Intel Corporation I350 Gigabit Network Connection
 |           +-04.0  Intel Corporation Haswell-E DMA Channel 0
 |           +-04.1  Intel Corporation Haswell-E DMA Channel 1
 |           +-04.2  Intel Corporation Haswell-E DMA Channel 2
 |           +-04.3  Intel Corporation Haswell-E DMA Channel 3
 |           +-04.4  Intel Corporation Haswell-E DMA Channel 4
 |           +-04.5  Intel Corporation Haswell-E DMA Channel 5
 |           +-04.6  Intel Corporation Haswell-E DMA Channel 6
 |           +-04.7  Intel Corporation Haswell-E DMA Channel 7
 |           +-05.0  Intel Corporation Haswell-E Address Map, VTd_Misc, System Management
 |           +-05.1  Intel Corporation Haswell-E Hot Plug
 |           +-05.2  Intel Corporation Haswell-E RAS, Control Status and Global Errors
 |           \-05.4  Intel Corporation Haswell-E I/O Apic
 +-[0000:7f]-+-08.0  Intel Corporation Haswell-E QPI Link 0
 |           +-08.2  Intel Corporation Haswell-E QPI Link 0
 |           +-08.3  Intel Corporation Haswell-E QPI Link 0
 |           +-09.0  Intel Corporation Haswell-E QPI Link 1
 |           +-09.2  Intel Corporation Haswell-E QPI Link 1
 |           +-09.3  Intel Corporation Haswell-E QPI Link 1
 |           +-0b.0  Intel Corporation Haswell-E R3 QPI Link 0 & 1 Monitoring
 |           +-0b.1  Intel Corporation Haswell-E R3 QPI Link 0 & 1 Monitoring
 |           +-0b.2  Intel Corporation Haswell-E R3 QPI Link 0 & 1 Monitoring
 |           +-0c.0  Intel Corporation Haswell-E Unicast Registers
 |           +-0c.1  Intel Corporation Haswell-E Unicast Registers
 |           +-0c.2  Intel Corporation Haswell-E Unicast Registers
 |           +-0c.3  Intel Corporation Haswell-E Unicast Registers
 |           +-0c.4  Intel Corporation Haswell-E Unicast Registers
 |           +-0c.5  Intel Corporation Haswell-E Unicast Registers
 |           +-0c.6  Intel Corporation Haswell-E Unicast Registers
 |           +-0c.7  Intel Corporation Haswell-E Unicast Registers
 |           +-0d.0  Intel Corporation Haswell-E Unicast Registers
 |           +-0d.1  Intel Corporation Haswell-E Unicast Registers
 |           +-0d.2  Intel Corporation Haswell-E Unicast Registers
 |           +-0d.3  Intel Corporation Haswell-E Unicast Registers
 |           +-0d.4  Intel Corporation Haswell-E Unicast Registers
 |           +-0d.5  Intel Corporation Haswell-E Unicast Registers
 |           +-0d.6  Intel Corporation Haswell-E Unicast Registers
 |           +-0d.7  Intel Corporation Haswell-E Unicast Registers
 |           +-0f.0  Intel Corporation Haswell-E Buffered Ring Agent
 |           +-0f.1  Intel Corporation Haswell-E Buffered Ring Agent
 |           +-0f.2  Intel Corporation Haswell-E Buffered Ring Agent
 |           +-0f.3  Intel Corporation Haswell-E Buffered Ring Agent
 |           +-0f.4  Intel Corporation Haswell-E System Address Decoder & Broadcast Registers
 |           +-0f.5  Intel Corporation Haswell-E System Address Decoder & Broadcast Registers
 |           +-0f.6  Intel Corporation Haswell-E System Address Decoder & Broadcast Registers
 |           +-10.0  Intel Corporation Haswell-E PCIe Ring Interface
 |           +-10.1  Intel Corporation Haswell-E PCIe Ring Interface
 |           +-10.5  Intel Corporation Haswell-E Scratchpad & Semaphore Registers
 |           +-10.6  Intel Corporation Haswell-E Scratchpad & Semaphore Registers
 |           +-10.7  Intel Corporation Haswell-E Scratchpad & Semaphore Registers
 |           +-12.0  Intel Corporation Haswell-E Home Agent 0
 |           +-12.1  Intel Corporation Haswell-E Home Agent 0
 |           +-12.4  Intel Corporation Haswell-E Home Agent 1
 |           +-12.5  Intel Corporation Haswell-E Home Agent 1
 |           +-13.0  Intel Corporation Haswell-E Integrated Memory Controller 0 Target Address, Thermal & RAS Registers
 |           +-13.1  Intel Corporation Haswell-E Integrated Memory Controller 0 Target Address, Thermal & RAS Registers
 |           +-13.2  Intel Corporation Haswell-E Integrated Memory Controller 0 Channel Target Address Decoder
 |           +-13.3  Intel Corporation Haswell-E Integrated Memory Controller 0 Channel Target Address Decoder
 |           +-13.6  Intel Corporation Haswell-E DDRIO Channel 0/1 Broadcast
 |           +-13.7  Intel Corporation Haswell-E DDRIO Global Broadcast
 |           +-14.0  Intel Corporation Haswell-E Integrated Memory Controller 0 Channel 0 Thermal Control
 |           +-14.1  Intel Corporation Haswell-E Integrated Memory Controller 0 Channel 1 Thermal Control
 |           +-14.2  Intel Corporation Haswell-E Integrated Memory Controller 0 Channel 0 ERROR Registers
 |           +-14.3  Intel Corporation Haswell-E Integrated Memory Controller 0 Channel 1 ERROR Registers
 |           +-14.4  Intel Corporation Haswell-E DDRIO (VMSE) 0 & 1
 |           +-14.5  Intel Corporation Haswell-E DDRIO (VMSE) 0 & 1
 |           +-14.6  Intel Corporation Haswell-E DDRIO (VMSE) 0 & 1
 |           +-14.7  Intel Corporation Haswell-E DDRIO (VMSE) 0 & 1
 |           +-16.0  Intel Corporation Haswell-E Integrated Memory Controller 1 Target Address, Thermal & RAS Registers
 |           +-16.1  Intel Corporation Haswell-E Integrated Memory Controller 1 Target Address, Thermal & RAS Registers
 |           +-16.2  Intel Corporation Haswell-E Integrated Memory Controller 1 Channel Target Address Decoder
 |           +-16.3  Intel Corporation Haswell-E Integrated Memory Controller 1 Channel Target Address Decoder
 |           +-16.6  Intel Corporation Haswell-E DDRIO Channel 2/3 Broadcast
 |           +-16.7  Intel Corporation Haswell-E DDRIO Global Broadcast
 |           +-17.0  Intel Corporation Haswell-E Integrated Memory Controller 1 Channel 0 Thermal Control
 |           +-17.1  Intel Corporation Haswell-E Integrated Memory Controller 1 Channel 1 Thermal Control
 |           +-17.2  Intel Corporation Haswell-E Integrated Memory Controller 1 Channel 0 ERROR Registers
 |           +-17.3  Intel Corporation Haswell-E Integrated Memory Controller 1 Channel 1 ERROR Registers
 |           +-17.4  Intel Corporation Haswell-E DDRIO (VMSE) 2 & 3
 |           +-17.5  Intel Corporation Haswell-E DDRIO (VMSE) 2 & 3
 |           +-17.6  Intel Corporation Haswell-E DDRIO (VMSE) 2 & 3
 |           +-17.7  Intel Corporation Haswell-E DDRIO (VMSE) 2 & 3
 |           +-1e.0  Intel Corporation Haswell-E Power Control Unit
 |           +-1e.1  Intel Corporation Haswell-E Power Control Unit
 |           +-1e.2  Intel Corporation Haswell-E Power Control Unit
 |           +-1e.3  Intel Corporation Haswell-E Power Control Unit
 |           +-1e.4  Intel Corporation Haswell-E Power Control Unit
 |           +-1f.0  Intel Corporation Haswell-E VCU
 |           \-1f.2  Intel Corporation Haswell-E VCU
 \-[0000:00]-+-00.0  Intel Corporation Haswell-E DMI2
             +-01.0-[01]--
             +-02.0-[02-05]----00.0-[03-05]--+-08.0-[04]----00.0  NVIDIA Corporation Device 102d
             |                               \-10.0-[05]----00.0  NVIDIA Corporation Device 102d
             +-03.0-[06-09]----00.0-[07-09]--+-08.0-[08]----00.0  NVIDIA Corporation Device 102d
             |                               \-10.0-[09]----00.0  NVIDIA Corporation Device 102d
             +-04.0  Intel Corporation Haswell-E DMA Channel 0
             +-04.1  Intel Corporation Haswell-E DMA Channel 1
             +-04.2  Intel Corporation Haswell-E DMA Channel 2
             +-04.3  Intel Corporation Haswell-E DMA Channel 3
             +-04.4  Intel Corporation Haswell-E DMA Channel 4
             +-04.5  Intel Corporation Haswell-E DMA Channel 5
             +-04.6  Intel Corporation Haswell-E DMA Channel 6
             +-04.7  Intel Corporation Haswell-E DMA Channel 7
             +-05.0  Intel Corporation Haswell-E Address Map, VTd_Misc, System Management
             +-05.1  Intel Corporation Haswell-E Hot Plug
             +-05.2  Intel Corporation Haswell-E RAS, Control Status and Global Errors
             +-05.4  Intel Corporation Haswell-E I/O Apic
             +-11.0  Intel Corporation Wellsburg SPSR
             +-11.4  Intel Corporation Wellsburg sSATA Controller [AHCI mode]
             +-14.0  Intel Corporation Wellsburg USB xHCI Host Controller
             +-16.0  Intel Corporation Wellsburg MEI Controller #1
             +-16.1  Intel Corporation Wellsburg MEI Controller #2
             +-1a.0  Intel Corporation Wellsburg USB Enhanced Host Controller #2
             +-1b.0  Intel Corporation Wellsburg HD Audio Controller
             +-1c.0-[0a]--
             +-1c.3-[0b-0c]----00.0-[0c]----00.0  ASPEED Technology, Inc. ASPEED Graphics Family
             +-1c.4-[0d-45]--
             +-1d.0  Intel Corporation Wellsburg USB Enhanced Host Controller #1
             +-1f.0  Intel Corporation Wellsburg LPC Controller
             +-1f.2  Intel Corporation Wellsburg 6-Port SATA Controller [AHCI mode]
             +-1f.3  Intel Corporation Wellsburg SMBus Controller
             \-1f.6  Intel Corporation Wellsburg Thermal Subsystem
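
For anyone who would rather not read the tree by eye, here is a minimal Python sketch that groups the NVIDIA devices in an lspci -tv dump by the root bus they hang off. It assumes the exact text layout shown above (a `[0000:xx]` header starting each root bus, and GPU lines ending in `[bus]----00.0  NVIDIA ...`). The function name `gpus_by_root_bus` and both regular expressions are my own for illustration, not part of lspci or any NVIDIA tool. Sharing a root bus is only the precondition mentioned in the quote above; it does not by itself show that P2P is enabled.

```python
import re
from collections import defaultdict

# Header of a root bus in `lspci -t` output, e.g. "[0000:00]" (domain:bus).
ROOT_RE = re.compile(r"\[([0-9a-f]{4}):([0-9a-f]{2})\]")
# Leaf line for a GPU as it appears in this thread: "[04]----00.0  NVIDIA ..."
GPU_RE = re.compile(r"\[([0-9a-f]{2})\]----00\.0\s+NVIDIA")


def gpus_by_root_bus(tree_text):
    """Map each root bus (e.g. '0000:00') to the PCI bus numbers of its GPUs."""
    groups = defaultdict(list)
    current_root = None
    for line in tree_text.splitlines():
        root = ROOT_RE.search(line)
        if root:
            # Every following line belongs to this root bus until the next header.
            current_root = f"{root.group(1)}:{root.group(2)}"
        gpu = GPU_RE.search(line)
        if gpu and current_root is not None:
            groups[current_root].append(gpu.group(1))
    return dict(groups)


# A few lines copied from the dump above (the rest is omitted for brevity).
SAMPLE = r"""
 +-[0000:80]-+-00.0-[81]--+-00.0  Intel Corporation I350 Gigabit Network Connection
 \-[0000:00]-+-00.0  Intel Corporation Haswell-E DMI2
             +-02.0-[02-05]----00.0-[03-05]--+-08.0-[04]----00.0  NVIDIA Corporation Device 102d
             |                               \-10.0-[05]----00.0  NVIDIA Corporation Device 102d
             +-03.0-[06-09]----00.0-[07-09]--+-08.0-[08]----00.0  NVIDIA Corporation Device 102d
             |                               \-10.0-[09]----00.0  NVIDIA Corporation Device 102d
"""

result = gpus_by_root_bus(SAMPLE)
print(result)
```

On the excerpt above, all four GK210 devices (buses 04, 05, 08, 09) end up under root bus 0000:00, and none under 0000:80.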

Try updating your Supermicro motherboard:

Supermicro 7048GR-TR

with the latest SBIOS version from Supermicro. Go to this page:

http://www.supermicro.com/products/system/4u/7048/sys-7048gr-tr.cfm

Click on the BIOS link.

Then install the X10DRG5_107.zip BIOS that is listed there (Revision R 1.0b).

Then repeat your P2P tests.

Hello,
First, please forgive the slightly commercial tone of this reply, but as our hardware is being used by Amber folks, I thought it relevant.

The type of architecture that you’re looking for is one that our company has been working on for several years. We are now on the recommended hardware list for 8 GPU cards (with K80s that would be 16 GPU devices), all on a single root complex, as txbob mentioned.

Here is a cut/paste of our lspci -tvv for our 8 GPU card box. We have a 4 GPU card server as well. Forgive the spacing; I’m not sure how to paste the text nicely.

Each set of 4 GPU cards is connected to our SR3514 (a 5x16 switch riser: 1 connection to the host, 4 to the cards).
We put either 1 or 2 groups of 4 cards in a single server, and as you can see, they are all connected to the same local root (02.0).
Now all 16 devices (384 GB of graphics RAM and 49k CUDA cores) can be GPUDirect peers with each other (up to the limit of 8 per peer group).

We’re at SC15 in Austin this week at booth 1627 (Cirrascale).

mark skinner
760-212-9five9five

Local Root
+-02.0-[01-12]----00.0-[02-12]--+-00.0-[03-06]----00.0-[04-06]--+-08.0-[05]----00.0  NVIDIA Corporation Device [10de:102d]
|                               |                               \-10.0-[06]----00.0  NVIDIA Corporation Device [10de:102d]
|                               +-04.0-[07-0a]----00.0-[08-0a]--+-08.0-[09]----00.0  NVIDIA Corporation Device [10de:102d]
|                               |                               \-10.0-[0a]----00.0  NVIDIA Corporation Device [10de:102d]
|                               +-08.0-[0b-0e]----00.0-[0c-0e]--+-08.0-[0d]----00.0  NVIDIA Corporation Device [10de:102d]
|                               |                               \-10.0-[0e]----00.0  NVIDIA Corporation Device [10de:102d]
|                               \-0c.0-[0f-12]----00.0-[10-12]--+-08.0-[11]----00.0  NVIDIA Corporation Device [10de:102d]
|                                                               \-10.0-[12]----00.0  NVIDIA Corporation Device [10de:102d]
|
+-03.0-[13-24]----00.0-[14-24]--+-00.0-[15-18]----00.0-[16-18]--+-08.0-[17]----00.0  NVIDIA Corporation Device [10de:102d]
|                               |                               \-10.0-[18]----00.0  NVIDIA Corporation Device [10de:102d]
|                               +-04.0-[19-1c]----00.0-[1a-1c]--+-08.0-[1b]----00.0  NVIDIA Corporation Device [10de:102d]
|                               |                               \-10.0-[1c]----00.0  NVIDIA Corporation Device [10de:102d]
|                               +-08.0-[1d-20]----00.0-[1e-20]--+-08.0-[1f]----00.0  NVIDIA Corporation Device [10de:102d]
|                               |                               \-10.0-[20]----00.0  NVIDIA Corporation Device [10de:102d]
|                               \-0c.0-[21-24]----00.0-[22-24]--+-08.0-[23]----00.0  NVIDIA Corporation Device [10de:102d]
|                                                               \-10.0-[24]----00.0  NVIDIA Corporation Device [10de:102d]
I’ll attach a JPG that is a bit clearer.