Confused about CUDA p2pbandwidthlatency sample

SpaceMan · April 19, 2021, 8:49am

Hi everyone,

I am a little confused about the CUDA “p2pBandwidthLatencyTest” sample.

I installed 3 NVIDIA A40 GPUs without NVLink and ran this example.

The reported bandwidth from device 0 to device 0 is about 640 GB/s. Does this figure correspond to “GPU Memory Bandwidth” in the official specifications?

SPECIFICATIONS

GPU Memory: 48 GB GDDR6 with error-correcting code (ECC)
GPU Memory Bandwidth: 696 GB/s
Interconnect: NVIDIA NVLink 112.5 GB/s (bidirectional); PCIe Gen4 x16 31.5 GB/s (bidirectional)
NVLink: 2-way low profile (2-slot)
Display Ports: 3x DisplayPort 1.4*
Max Power Consumption: 300 W
Form Factor: 4.4" (H) x 10.5" (L) Dual Slot
Thermal: Passive
vGPU Software Support: NVIDIA vPC/vApps, NVIDIA RTX Virtual Workstation, NVIDIA Virtual Compute Server
vGPU Profiles Supported: See the Virtual GPU Licensing Guide
NVENC | NVDEC: 1x | 2x (includes AV1 decode)
Secure and Measured Boot with Hardware Root of Trust: Yes
NEBS Ready: Level 3
Power Connector: 8-pin CPU

In addition, which specification should the “0 to 1” and “0 to 2” Bidirectional P2P Bandwidth values be compared against?

Is it the “Interconnect” figure for PCIe Gen4 x16?

Robert_Crovella · April 19, 2021, 5:53pm

Yes, the device 0 to device 0 figure is a measurement of device memory bandwidth, and it should be roughly similar to the device-to-device result reported by bandwidthTest (the third value it reports).
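To put the 640 GB/s reading next to the 696 GB/s specification, here is a minimal sketch of the arithmetic in Python. The two bandwidth figures come from the question above. The function names and the sample timing are hypothetical. The factor of 2 for a same-device copy (counting both the read and the write) is an assumption about how such a test accounts for on-device traffic, so it is worth checking against the sample source.

```python
def copy_bandwidth_gbs(bytes_copied, seconds, same_device=False):
    """Bandwidth in GB/s (1 GB = 1e9 bytes) for a timed copy.

    Assumption: a copy within one device both reads and writes device
    memory, so its memory traffic is counted twice.
    """
    traffic = bytes_copied * (2 if same_device else 1)
    return traffic / seconds / 1e9


def fraction_of_spec(measured_gbs, spec_gbs):
    """Measured bandwidth as a fraction of the datasheet figure."""
    return measured_gbs / spec_gbs


# Hypothetical timing: 64 MB copied on-device in 0.2 ms.
measured = copy_bandwidth_gbs(64_000_000, 0.0002, same_device=True)
print(f"measured: {measured:.1f} GB/s")

# Figures from the thread: about 640 GB/s measured, 696 GB/s specified.
print(f"fraction of spec: {fraction_of_spec(640, 696):.0%}")
```

With the figures from the thread, the measurement comes to roughly 92% of the specified memory bandwidth, which is consistent with reading the diagonal entry as a device memory bandwidth measurement.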

Yes, the “0 to 1” and “0 to 2” bidirectional figures measure the interconnect bandwidth (PCIe here, since there is no NVLink), with P2P enabled and transfers running simultaneously in both directions.
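For a rough reference point, the theoretical PCIe Gen4 x16 figure can be worked out as below. This is a sketch under stated assumptions: the 16 GT/s per-lane rate and the 128b/130b encoding are PCIe 4.0 values not given in this thread, and the function name is hypothetical. The spec line quoted above labels 31.5 GB/s as bidirectional, while this arithmetic gives that number for a single direction, so treat the interpretation with care.

```python
def pcie_bandwidth_gbs(gt_per_s, lanes, encoding=128 / 130):
    """Theoretical one-direction PCIe bandwidth in GB/s.

    gt_per_s: per-lane transfer rate in GT/s (assumed 16 for Gen4)
    lanes:    link width (16 for an x16 slot)
    encoding: usable fraction of the line rate (128b/130b for Gen3 and later)
    """
    return gt_per_s * lanes * encoding / 8  # 8 bits per byte


one_way = pcie_bandwidth_gbs(16, 16)  # PCIe Gen4 x16
both_ways = 2 * one_way               # simultaneous transfers in both directions

print(f"{one_way:.1f} GB/s per direction, {both_ways:.1f} GB/s both directions")
# prints "31.5 GB/s per direction, 63.0 GB/s both directions"
```

Measured bidirectional P2P numbers will sit below the both-directions value because of protocol overhead and, depending on the system, the path the traffic takes between the two slots.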

BTW, you have the source code for this sample, so you can confirm these yourself.
