On Titan RTX, and also on Turing-family GeForce GPUs such as the 2080/2080 Ti, P2P is only supported when an NVLink bridge is in place (i.e. only over NVLink). For Turing-family GeForce GPUs without an NVLink bridge option, P2P is not supported.
What NVIDIA supports is reported by the tool already mentioned. The behavior may vary by GPU, and is not strictly a function of the motherboard. Since I am not going to provide a matrix that covers every question for every case, the best way to determine features like this is to check what the tool reports.
NVIDIA may choose to design a product that supports P2P over NVLink but not PCIE. That is what happened with Titan RTX. I won’t be able to give further rationalization or explanation.
In particular, questions like “why is it this way?” I am not able to answer.
For example, “Anyone knows what makes this difference ?” is a question I wouldn’t be able to answer, other than by saying that is the way the product is designed.
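For anyone who wants to check this on their own system rather than rely on a support matrix, the CUDA runtime API exposes the same answer the sample tools report. A minimal sketch, assuming a two-GPU system with device indices 0 and 1:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Ask the driver whether each device can directly access the
// other's memory. This reports exactly what is supported on this
// system (over NVLink or PCIe, whichever path the product enables).
int main() {
    int canAccess01 = 0, canAccess10 = 0;
    cudaDeviceCanAccessPeer(&canAccess01, 0, 1);
    cudaDeviceCanAccessPeer(&canAccess10, 1, 0);
    printf("P2P 0->1: %s, 1->0: %s\n",
           canAccess01 ? "supported" : "not supported",
           canAccess10 ? "supported" : "not supported");
    return 0;
}
```

This is essentially what the simpleP2P and p2pBandwidthLatencyTest samples do before running their transfer tests.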
I have two Titan RTXs with an NVLink bridge between them. From what I can see, each card can accommodate just one NVLink bridge, so you end up with them connected in pairs. As it turns out, I doubt I will need any connection between them for my purposes, but it's nice to have in case we do.
Yeah, maybe it's just worth restating what the impact of P2P memory copy is? I'm interested in TensorFlow and PyTorch deep learning over multiple GPUs. At the moment we are on 1080 Tis using PCIe, but soon upgrading to Turing/RTX cards.
Any info much appreciated, as at the moment it doesn't look like it's worth upgrading, but that could just be a poor appreciation of the impact of P2P.
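For context on what P2P actually changes: a device-to-device copy with peer access enabled moves data directly GPU-to-GPU (over NVLink or PCIe); without it, the driver stages the transfer through system memory, which costs bandwidth and latency. A minimal CUDA sketch of the direct path, assuming two devices and an arbitrary 64 MB buffer:

```cuda
#include <cuda_runtime.h>

int main() {
    const size_t bytes = 64 << 20;  // 64 MB test buffer (arbitrary size)
    void *src, *dst;

    cudaSetDevice(0);
    cudaMalloc(&src, bytes);
    cudaSetDevice(1);
    cudaMalloc(&dst, bytes);

    // Enable direct access from device 1 (current) to device 0's
    // memory. This call fails on products where P2P is unsupported.
    cudaDeviceEnablePeerAccess(0, 0);

    // With peer access enabled, this copy moves data directly
    // GPU-to-GPU; without it, the driver falls back to staging
    // the transfer through host memory.
    cudaMemcpyPeer(dst, 1, src, 0, bytes);

    cudaFree(src);
    cudaFree(dst);
    return 0;
}
```

Multi-GPU training frameworks exercise this path during gradient exchange (e.g. via NCCL), which is where the presence or absence of P2P shows up in practice.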
At what level is this behavior defined?
If it's part of the driver, we (external developers) may be able to change it; if it's part of the internal architecture, we probably won't.
Not everyone needs NVLink; it's a product to increase communication bandwidth, but bandwidth is not always the bottleneck.
When we don't want to use InfiniBand, or when bandwidth is not the bottleneck, NVLink sometimes improves performance by only 1%, and therefore may not be worth the price.
With this design, NVIDIA coupled Turing-family GeForce GPUs to another (sometimes unnecessary) product.
I wonder which hardware we should buy if we want to use P2P with "commodity" communication.
This appears to be an intentional "gimp" to push professional users to Quadro/Tesla professional cards for memory pooling. NVIDIA, probably correctly, reasons that without such a "gimp," commercial users would be tempted to use the Titan RTXs in a server-type environment despite the warnings not to.