Partial fail of peer access in 8 Volta GPU instance (p3.16xlarge) on AWS -> huge slowdown

robosmith · February 8, 2018, 6:26pm

According to the NVIDIA Technical Blog post "CUDA Pro Tip: Control GPU Visibility with CUDA_VISIBLE_DEVICES", you are correct that CUDA_VISIBLE_DEVICES will let me run at full speed on 4 of the 8 GPUs. Thanks for the suggestion, but I have already verified that my code runs fast on 4 GPUs.
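For anyone who wants to apply that workaround, here is a minimal sketch of how a fully peer-connected subset of GPUs could be picked and exposed through CUDA_VISIBLE_DEVICES. The peer-access matrix below is made up for illustration (two groups of four); it is not the measured topology of the p3.16xlarge. On a real machine the matrix would have to come from something like `cudaDeviceCanAccessPeer`. The helper names `largest_peer_group` and `visible_devices_value` are hypothetical.

```python
import os
from itertools import combinations


def largest_peer_group(can_access):
    """Return the largest list of device indices in which every pair
    reports peer access in both directions (brute force, fine for 8 GPUs)."""
    n = len(can_access)
    for size in range(n, 0, -1):
        for group in combinations(range(n), size):
            if all(can_access[a][b] and can_access[b][a]
                   for a, b in combinations(group, 2)):
                return list(group)
    return []


def visible_devices_value(group):
    """Format device indices as a CUDA_VISIBLE_DEVICES value, e.g. '0,1,2,3'."""
    return ",".join(str(d) for d in group)


# ASSUMPTION: hypothetical matrix where GPUs 0-3 and GPUs 4-7 form two
# separate peer groups. Replace with the matrix measured on your system.
example = [[(i // 4) == (j // 4) for j in range(8)] for i in range(8)]

group = largest_peer_group(example)

# Must be set before the process initializes CUDA, otherwise it has no effect.
os.environ["CUDA_VISIBLE_DEVICES"] = visible_devices_value(group)
print(os.environ["CUDA_VISIBLE_DEVICES"])  # prints 0,1,2,3
```

This only restricts the process to one peer group; it does nothing for the 8-GPU case.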

What I need is for NVIDIA/AWS to provide a solution that allows me to use UVM and peer-to-peer access at full speed on an 8-GPU system.

Any suggestions on how to get this fixed?
