Basic question about 2-in-1 GPUs (e.g. GTX Titan Z or K80)

It seems to the naive eye that the only advantage to these 2-in-1 GPUs is the fact that you get the computational abilities of 2 GPUs in one PCI-E 3.0 x16 slot.

I have never used such a GPU, so my question is: what is the memory transfer bandwidth between the two distinct GPUs on the same card?

Since most commercial PCs offer at best two PCI-E 3.0 slots at full x16 speed (assuming a CPU with 40 lanes), is a 2-in-1 GPU the only way for such a machine to have more than two GPUs which can get that full pinned x16 data rate across the bus?

Also, does anyone out there have experience with either the Titan-Z or the K80?

With modern versions of these dual-GPU devices, the maximum memory transfer speed between the two GPUs is limited by the PCIE switch that connects them. In current devices this is a PCIE Gen3 switch, so peer-to-peer transfers can run at about 10-12GB/s per direction.
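For context on where a figure like 10-12GB/s sits, here is a rough back-of-the-envelope sketch of the theoretical ceiling of a Gen3 x16 link. It assumes the standard PCIE Gen3 parameters (8 GT/s per lane, 128b/130b encoding) and that the link is x16 wide; the function name is just something I made up for illustration. Packet and protocol overhead are not modeled, which is one reason measured numbers come in below this ceiling.

```python
# Theoretical per-direction bandwidth of a PCIe Gen3 link.
# Assumptions: 8 GT/s per lane and 128b/130b line encoding (PCIe Gen3),
# and a x16-wide link. Protocol/packet overhead is NOT modeled.
GEN3_TRANSFERS_PER_SEC = 8e9   # 8 GT/s per lane
GEN3_ENCODING = 128 / 130      # 128 payload bits per 130 bits on the wire


def pcie_gen3_bandwidth_gbs(lanes):
    """Bandwidth in GB/s (1 GB = 1e9 bytes) for one direction."""
    bits_per_sec = GEN3_TRANSFERS_PER_SEC * lanes * GEN3_ENCODING
    return bits_per_sec / 8 / 1e9


x16 = pcie_gen3_bandwidth_gbs(16)
print(f"Gen3 x16 theoretical: {x16:.2f} GB/s per direction")

# Compare the 10-12 GB/s range quoted above against that ceiling.
for observed in (10.0, 12.0):
    print(f"{observed:.0f} GB/s is {observed / x16:.0%} of the ceiling")
```

The ceiling works out to roughly 15.75 GB/s per direction, so 10-12GB/s is in the range you would expect once overhead is taken into account.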

Your second question is not clear to me.

I have experience with K80, and I have used Titan-Z briefly.

Here is a portion of the output of the p2pBandwidthLatencyTest sample for a single K80:

Unidirectional P2P=Enabled Bandwidth Matrix (GB/s)
   D\D      0      1
     0  86.62  12.39
     1  12.39  86.71

About 86GB/s (the diagonal entries) is the speed when the source and destination are the same device.

12.39GB/s is the transfer speed when going from one device to the other, in the same K80 module.
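If you want to pull those two kinds of numbers out of the matrix programmatically, something like the following could work. This is only a sketch: `parse_matrix` is a hypothetical helper, and it assumes the matrix is laid out as in the fragment quoted above (a header row of device indices, then one row per device). It just separates diagonal entries (same device) from off-diagonal entries (the other GPU).

```python
# Matrix fragment copied from the K80 output quoted above.
report = """\
D\\D 0 1
0 86.62 12.39
1 12.39 86.71
"""


def parse_matrix(text):
    """Return {(row, col): GB/s} from a bandwidth matrix laid out as above."""
    lines = text.strip().splitlines()
    # Header row: first token is the "D\D" label, the rest are device indices.
    col_ids = [int(tok) for tok in lines[0].split()[1:]]
    matrix = {}
    for line in lines[1:]:
        tokens = line.split()
        row_id = int(tokens[0])
        for col_id, value in zip(col_ids, tokens[1:]):
            matrix[(row_id, col_id)] = float(value)
    return matrix


matrix = parse_matrix(report)
# Diagonal: source and destination are the same device.
same_device = [v for (r, c), v in matrix.items() if r == c]
# Off-diagonal: transfer between the two GPUs of the K80.
peer = [v for (r, c), v in matrix.items() if r != c]
print("same-device copies (GB/s):", same_device)
print("peer-to-peer copies (GB/s):", peer)
```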

p2pBandwidthLatencyTest:
http://docs.nvidia.com/cuda/cuda-samples/index.html#peer-to-peer-bandwidth-latency-test-with-multi-gpus

That is essentially what I was asking, sorry if it was unclear.

Thanks