CUDA and Titan Z

Hi everyone! :)

I have a question concerning the recently released NVIDIA Titan Z GPU. The Titan Z is technically a multi-GPU card, with two GK110 chips on one PCB.

My questions are now (I am not familiar with multi-GPUs at all):

1.) How are those two silicon chips interconnected? Do they use an SLI interface, and if not, what kind of interface do they use? Or is this detail a company secret?

2.) At GTC 2014 it was stated that there is a feature called “power balancing”. Does this mean the capability is extended compared to a 2-way SLI system? Is there any kind of whitepaper that explains this feature in detail, and how does it affect CUDA programming?


  1. They are connected via a Gen3 PCIE switch. The switch has 2 downstream ports, each connected to a GPU. The upstream port is connected to the PCIE edge connector on the card. Most recent dual GPU cards since the emergence of PCIE use this general configuration. For most purposes, the switch is “transparent”.
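Because the switch is transparent, CUDA simply enumerates the two GPUs behind it as two ordinary devices. A minimal sketch (assuming a standard CUDA runtime setup; error checking omitted for brevity) that would show this:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    cudaGetDeviceCount(&count);  // a Titan Z should report 2 devices
    for (int i = 0; i < count; ++i) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, i);
        // The PCI IDs reveal the two downstream ports of the on-card switch.
        printf("Device %d: %s (PCI %04x:%02x:%02x)\n",
               i, prop.name, prop.pciDomainID, prop.pciBusID, prop.pciDeviceID);
    }
    return 0;
}
```

Nothing in the output distinguishes the on-card switch from a motherboard chipset switch, which is what “transparent” means here.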

"Beneath sits the GTX TITAN Z’s dual GK110 GTX TITAN Black GPUs, and our revolutionary 12-phase dynamic power balancing technology that ensures peak performance at all times.

In the past, exactly half of the power was routed to each GPU. However, depending on variations in cooling, workload and manufacturing process, one GPU may have required more power than the other. Now, we dynamically route power to each GPU, ensuring each is fully optimized"

I don’t know of a whitepaper. It means that the on-board power regulator is no longer limited to half of the power for each GPU, but in some cases can exceed this limit for one GPU, depending on what is going on with the other GPU.

If the Titan Z is like all the previous dual GPU cards that NVIDIA has made, the card will contain two completely separate GPU systems connected via an on-card PCI-Express switch to each other and to the external PCI-Express interface. The two GPUs will each have their own distinct memory, and any CUDA-related data transfer will have to go through the PCI-Express switch. You will see two CUDA devices, and you will need to use standard multi-GPU programming techniques to use both GPUs at the same time. Kernels can be launched on each device individually, and the GPUs can access each other's memory (thanks to unified virtual addressing on 64-bit platforms), but only at PCI-Express speeds.
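The standard multi-GPU technique for this kind of transfer is peer-to-peer access. A minimal sketch, assuming devices 0 and 1 are the two halves of the card and omitting error checks:

```cuda
#include <cuda_runtime.h>

int main() {
    int can01 = 0, can10 = 0;
    cudaDeviceCanAccessPeer(&can01, 0, 1);  // can device 0 map device 1's memory?
    cudaDeviceCanAccessPeer(&can10, 1, 0);

    if (can01 && can10) {
        cudaSetDevice(0);
        cudaDeviceEnablePeerAccess(1, 0);
        cudaSetDevice(1);
        cudaDeviceEnablePeerAccess(0, 0);

        // With UVA, a plain cudaMemcpy between pointers that live on
        // different devices is routed GPU-to-GPU through the PCIe switch.
        float *src, *dst;
        size_t bytes = 1 << 20;
        cudaSetDevice(0); cudaMalloc(&src, bytes);
        cudaSetDevice(1); cudaMalloc(&dst, bytes);
        cudaMemcpy(dst, src, bytes, cudaMemcpyDefault);  // P2P, at PCIe speed

        cudaFree(src);
        cudaFree(dst);
    }
    return 0;
}
```

The same code works for two discrete cards; the on-card switch changes the topology, not the programming model.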

Basically, as far as CUDA is concerned, I expect the Titan Z to behave as if you had two separate underclocked GTX Titan Black cards with a full x16 PCI-Express connection between them. Honestly, for $3k, there isn’t much of a compelling reason to buy a Titan Z, since a pair of Titan Black cards would only cost $2k and be higher performance. Only when you are genuinely constrained by the number of PCI-Express slots in your computer does such a card become useful. (This has generally been true of all “Gemini” cards, as NVIDIA calls them internally.)

Regarding the power balancing, the minimal information available suggests that if this affects CUDA at all, it will mostly impact the “Boost Clock” behavior. If the Titan Z can dynamically route more power to one GPU or the other, it is possible that you will see the dynamically adjusted clock rate on one GPU go higher than the other one when the power budget allows. This sounds like a great idea for a dual GPU card, but it doesn’t actually give it any capability over two discrete cards, which have more power available to each GPU directly from the PSU.
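If you want to watch for this effect yourself, NVML (the library behind nvidia-smi) can report each GPU's current SM clock and power draw. A hedged sketch, assuming an NVML-capable driver and a two-GPU card; error checking omitted:

```cuda
#include <stdio.h>
#include <nvml.h>

int main(void) {
    nvmlInit();
    for (unsigned i = 0; i < 2; ++i) {   // the two GPUs of a Titan Z
        nvmlDevice_t dev;
        unsigned clockMHz = 0, powerMilliwatts = 0;
        nvmlDeviceGetHandleByIndex(i, &dev);
        nvmlDeviceGetClockInfo(dev, NVML_CLOCK_SM, &clockMHz);
        nvmlDeviceGetPowerUsage(dev, &powerMilliwatts);
        printf("GPU %u: SM clock %u MHz, power %u.%03u W\n",
               i, clockMHz, powerMilliwatts / 1000, powerMilliwatts % 1000);
    }
    nvmlShutdown();
    return 0;
}
```

Under an asymmetric workload, you might see the busier GPU sustain a higher boost clock than its sibling if the power balancing works as described.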

They are connected by an SLI bus, but memory access only communicates over PCIE?

Is that true for all SLI connected cards?

txbob, seibert, robosmith, thanks for your replies.

But which is true now? Two of you are stating it's an on-board PCIe switch. One is stating that it is an SLI bus. I am confused…

Why do you assume it has to be one or the other? SLI and PCIE are orthogonal. Both interconnects can exist, and may serve different purposes. From a connection standpoint, I reiterate, Titan Z is very much like other dual PCIE GPU cards that NVIDIA has produced, meaning the principal interconnect is a PCIE switch. SLI is not really relevant to CUDA anyway. All CUDA data traffic between host and GPU, or GPU-GPU via P2P flows over PCIE.

Also, you should note that robosmith’s statement is in the form of a question. :)

Many people have asked what the SLI connector actually does, and not gotten a particularly clear answer from NVIDIA. If you have an SLI cable connecting two GPUs, they still appear as two separate GPUs to CUDA, and there are no SLI-related CUDA function calls. If there is a benefit to having an SLI connection between two cards for CUDA programs, NVIDIA has kept it secret from everyone. :)

For 3D rendering tasks, I’ve assumed that the point of SLI is to offer a low-latency link for coordination (clock sync, pushing parts of frame buffers around) that other devices in the system, like the motherboard chipset, can’t interfere with.

txbob, seibert and robosmith: Thanks for all your comments and insight! My question is now answered! Thank you!