CUDA and Titan Z

MingVonMongo · May 31, 2014, 8:54pm

Hi everyone! :)

I have a question concerning the recently released NVIDIA Titan Z GPU. The Titan Z is technically a multi-GPU card that has two GK110 chips on one PCB.

My questions are (I am not familiar with multi-GPU cards at all):

1.) How are those two silicon chips interconnected? Is it an SLI interface, and if not, what kind of interface do they use? Or is this detail a company secret?

2.) At GTC 2014 it was stated that there is a feature called “power balancing”. This means that the capability is extended compared to a “2-way SLI” system, right? Is there any kind of whitepaper that explains this feature in detail, and how does it affect CUDA programming?

Thanks!

Robert_Crovella · May 31, 2014, 11:14pm

They are connected via a Gen3 PCIE switch. The switch has 2 downstream ports, each connected to a GPU. The upstream port is connected to the PCIE edge connector on the card. Most recent dual GPU cards since the emergence of PCIE use this general configuration. For most purposes, the switch is “transparent”.

http://www.geforce.com/whats-new/articles/introducing-nvidia-geforce-gtx-titan-z

"Beneath sits the GTX TITAN Z’s dual GK110 GTX TITAN Black GPUs, and our revolutionary 12-phase dynamic power balancing technology that ensures peak performance at all times.

In the past, exactly half of the power was routed to each GPU. However, depending on variations in cooling, workload and manufacturing process, one GPU may have required more power than the other. Now, we dynamically route power to each GPU, ensuring each is fully optimized"

I don’t know of a whitepaper. It means that the on-board power regulator is no longer limited to half of the power for each GPU, but in some cases can exceed this limit for one GPU, depending on what is going on with the other GPU.

seibert · May 31, 2014, 11:37pm

If the Titan Z is like all the previous dual GPU cards that NVIDIA has made, the card will contain two completely separate GPU systems connected via an on-card PCI-Express switch to each other and the external PCI-Express interface. The two GPUs will each have their own distinct memory, and any CUDA-related data transfer will have to go through the PCI-Express switch. You will see two CUDA devices, and you will need to use standard multi-GPU programming techniques to use both GPUs at the same time. Kernels can be launched on each device individually, and the GPUs can access each other's memory (thanks to unified virtual addressing on 64-bit platforms), but only at PCI-Express speeds.
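As a rough illustration of what "standard multi-GPU programming techniques" means in practice, here is a minimal sketch using the CUDA runtime API. It assumes nothing specific to the Titan Z: it simply enumerates whatever CUDA devices are present, prints each one's PCI bus ID, and launches the same kernel on each. The kernel `fill`, the `CHECK` macro, and the constant `kMaxDevices` are made up for this example.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: report a failed runtime call and leave main().
#define CHECK(call)                                                  \
    do {                                                             \
        cudaError_t err_ = (call);                                   \
        if (err_ != cudaSuccess) {                                   \
            std::fprintf(stderr, "%s failed: %s\n", #call,           \
                         cudaGetErrorString(err_));                  \
            return 1;                                                \
        }                                                            \
    } while (0)

// Illustrative kernel: each thread writes scale * (its global index).
__global__ void fill(float *out, int n, float scale)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = scale * i;
}

int main()
{
    const int kMaxDevices = 16;  // arbitrary cap for this sketch
    const int n = 1 << 20;

    int count = 0;
    CHECK(cudaGetDeviceCount(&count));
    if (count > kMaxDevices) count = kMaxDevices;

    float *buf[kMaxDevices] = {nullptr};

    // Each device has its own memory, so allocate and launch per device.
    // Kernel launches are asynchronous, so the devices work concurrently.
    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        char busId[32];
        CHECK(cudaGetDeviceProperties(&prop, dev));
        CHECK(cudaDeviceGetPCIBusId(busId, sizeof(busId), dev));
        std::printf("device %d: %s at PCI %s\n", dev, prop.name, busId);

        CHECK(cudaSetDevice(dev));
        CHECK(cudaMalloc(&buf[dev], n * sizeof(float)));
        fill<<<(n + 255) / 256, 256>>>(buf[dev], n, dev + 1.0f);
        CHECK(cudaGetLastError());
    }

    // Wait for every device to finish, then release its buffer.
    for (int dev = 0; dev < count; ++dev) {
        CHECK(cudaSetDevice(dev));
        CHECK(cudaDeviceSynchronize());
        CHECK(cudaFree(buf[dev]));
    }
    return 0;
}
```

On a dual GPU card this loop would be expected to report two devices with different PCI bus IDs, the same as it would for two separate cards.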

Basically, as far as CUDA is concerned, I expect the Titan Z to behave as if you had two separate underclocked GTX Titan Black cards with a full x16 PCI-Express connection between them. Honestly, for $3k, there isn’t much of a compelling reason to buy a Titan Z, since a pair of Titan Black cards would only cost $2k and be higher performance. Only when you are genuinely constrained by the number of PCI-Express slots in your computer does such a card become useful. (This has generally been true of all “Gemini” cards, as NVIDIA calls them internally.)

Regarding the power balancing, the minimal information available suggests that if this affects CUDA at all, it will mostly impact the “Boost Clock” behavior. If the Titan Z can dynamically route more power to one GPU or the other, it is possible that you will see the dynamically adjusted clock rate on one GPU go higher than the other one when the power budget allows. This sounds like a great idea for a dual GPU card, but it doesn’t actually give it any capability over two discrete cards, which have more power available to each GPU directly from the PSU.
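If you want to see whether the two GPUs actually settle at different clocks, one option is to poll NVML while your kernels run. The following is only a sketch: it assumes the NVML header and library that ship with the driver and toolkit are available (link with `-lnvidia-ml`), and power readings in particular may not be supported on every GeForce board, so each return code is checked. Note that NVML's device order is not guaranteed to match the CUDA device order.

```cuda
#include <cstdio>
#include <nvml.h>

int main()
{
    if (nvmlInit() != NVML_SUCCESS) {
        std::fprintf(stderr, "NVML initialization failed\n");
        return 1;
    }

    unsigned int count = 0;
    if (nvmlDeviceGetCount(&count) != NVML_SUCCESS) count = 0;

    for (unsigned int i = 0; i < count; ++i) {
        nvmlDevice_t dev;
        if (nvmlDeviceGetHandleByIndex(i, &dev) != NVML_SUCCESS) continue;

        char name[NVML_DEVICE_NAME_BUFFER_SIZE] = "unknown";
        nvmlDeviceGetName(dev, name, sizeof(name));
        std::printf("GPU %u (%s):", i, name);

        // Current SM clock in MHz; this is the value boost would change.
        unsigned int smMHz = 0;
        if (nvmlDeviceGetClockInfo(dev, NVML_CLOCK_SM, &smMHz) == NVML_SUCCESS)
            std::printf(" SM clock %u MHz", smMHz);

        // Current board power draw in milliwatts, where supported.
        unsigned int milliwatts = 0;
        if (nvmlDeviceGetPowerUsage(dev, &milliwatts) == NVML_SUCCESS)
            std::printf(", power %.1f W", milliwatts / 1000.0);

        std::printf("\n");
    }

    nvmlShutdown();
    return 0;
}
```

Running this in a loop while one GPU is busy and the other is idle would show whether the loaded GPU's clock climbs above what it reaches when both are loaded.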

robosmith · June 1, 2014, 6:40am

They are connected by an SLI bus, but memory access only communicates over PCIE?

Is that true for all SLI connected cards?

MingVonMongo · June 1, 2014, 7:15am

txbob, seibert, robosmith, thanks for your replies.

But which is true now? Two of you are stating it's an on-board PCIe switch. One is stating that it is an SLI bus. I am confused…

Robert_Crovella · June 1, 2014, 2:50pm

Why do you assume it has to be one or the other? SLI and PCIE are orthogonal. Both interconnects can exist, and may serve different purposes. From a connection standpoint, I reiterate, Titan Z is very much like other dual GPU PCIE cards that NVIDIA has produced, meaning the principal interconnect is a PCIE switch. SLI is not really relevant to CUDA anyway. All CUDA data traffic between host and GPU, or GPU-to-GPU via P2P, flows over PCIE.
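For reference, a GPU-to-GPU transfer in CUDA looks roughly like the sketch below. It assumes at least two CUDA devices (indices 0 and 1) and uses only runtime API calls; the `CHECK` macro and the 64 MiB buffer size are arbitrary choices for illustration. Nothing in it refers to SLI.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: report a failed runtime call and leave main().
#define CHECK(call)                                                  \
    do {                                                             \
        cudaError_t err_ = (call);                                   \
        if (err_ != cudaSuccess) {                                   \
            std::fprintf(stderr, "%s failed: %s\n", #call,           \
                         cudaGetErrorString(err_));                  \
            return 1;                                                \
        }                                                            \
    } while (0)

int main()
{
    int count = 0;
    CHECK(cudaGetDeviceCount(&count));
    if (count < 2) {
        std::printf("this sketch needs at least two CUDA devices\n");
        return 0;
    }

    // Ask the runtime whether each device can map the other's memory.
    int can01 = 0, can10 = 0;
    CHECK(cudaDeviceCanAccessPeer(&can01, 0, 1));
    CHECK(cudaDeviceCanAccessPeer(&can10, 1, 0));
    std::printf("P2P 0->1: %s, 1->0: %s\n",
                can01 ? "yes" : "no", can10 ? "yes" : "no");

    const size_t bytes = 64u << 20;  // 64 MiB, arbitrary
    void *src = nullptr, *dst = nullptr;
    CHECK(cudaSetDevice(0));
    CHECK(cudaMalloc(&src, bytes));
    CHECK(cudaSetDevice(1));
    CHECK(cudaMalloc(&dst, bytes));

    // Enable direct access in both directions when it is supported.
    if (can01 && can10) {
        CHECK(cudaSetDevice(0));
        CHECK(cudaDeviceEnablePeerAccess(1, 0));
        CHECK(cudaSetDevice(1));
        CHECK(cudaDeviceEnablePeerAccess(0, 0));
    }

    // Copy from device 0 to device 1. With peer access enabled the data
    // moves directly over PCIE; otherwise the runtime stages the copy
    // through host memory.
    CHECK(cudaMemcpyPeer(dst, 1, src, 0, bytes));
    CHECK(cudaDeviceSynchronize());

    CHECK(cudaFree(dst));
    CHECK(cudaSetDevice(0));
    CHECK(cudaFree(src));
    return 0;
}
```

Whether `cudaDeviceCanAccessPeer` reports yes depends on the GPUs, the driver, and the PCIE topology, so check the result rather than assuming P2P is available.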

seibert · June 1, 2014, 3:13pm

Also, you should note that robosmith’s statement is in the form of a question. :)

Many people have asked what the SLI connector actually does, and not gotten a particularly clear answer from NVIDIA. If you have an SLI cable connecting two GPUs, they still appear as two separate GPUs to CUDA, and there are no SLI-related CUDA function calls. If there is a benefit to having an SLI connection between two cards for CUDA programs, NVIDIA has kept it secret from everyone. :)

For 3D rendering tasks, I’ve assumed that the point of SLI is to offer a low-latency link for coordination (clock sync, pushing parts of frame buffers around) that other devices in the system, like the motherboard chipset, can’t interfere with.

MingVonMongo · June 1, 2014, 4:44pm

txbob, seibert and robosmith: Thanks for all your comments and insight! My question is now answered! Thank you!
