Hi,
I’m trying to check p2p access (GPU ↔ GPU) via PCIe.
But, NVIDIA’s sample code don’t work on my enviroment.
Item | Content |
---|---|
MB | ASUS PROART Z790-CREATOR WIFI (DDR5) |
OS | Ubuntu 22.04.4 LTS |
CPU | 13th Gen Intel(R) Core™ i9-13900 |
GPU0 | NVIDIA RTX 6000 Ada Generation |
GPU1 | NVIDIA RTX 4000 SFF Ada Generation |
NVIDIA Driver | Driver Version: 555.42.02 (Open) |
NVCC | V12.5.40 |
$ make run
./simpleP2P
[./simpleP2P] - Starting...
Checking for multiple GPUs...
CUDA-capable device count: 2
Checking GPU(s) for support of peer to peer memory access...
> Peer access from NVIDIA RTX 6000 Ada Generation (GPU0) -> NVIDIA RTX 4000 SFF Ada Generation (GPU1) : No
> Peer access from NVIDIA RTX 4000 SFF Ada Generation (GPU1) -> NVIDIA RTX 6000 Ada Generation (GPU0) : No
Two or more GPUs with Peer-to-Peer access capability are required for ./simpleP2P.
Peer to Peer access is not available amongst GPUs in the system, waiving test.
make: *** [Makefile:373: run] Error 2
So I checked topology of GPUs and find “NS”.
There are many topics related to CNS, but I couldn’t find any topics related to NS.
Does anyone know the cause of this?
$ nvidia-smi topo -m
GPU0 GPU1 CPU Affinity NUMA Affinity GPU NUMA ID
GPU0 X PHB 0-15 0 N/A
GPU1 PHB X 0-15 0 N/A
Legend:
X = Self
SYS = Connection traversing PCIe as well as the SMP interconnect between NUMA nodes (e.g., QPI/UPI)
NODE = Connection traversing PCIe as well as the interconnect between PCIe Host Bridges within a NUMA node
PHB = Connection traversing PCIe as well as a PCIe Host Bridge (typically the CPU)
PXB = Connection traversing multiple PCIe bridges (without traversing the PCIe Host Bridge)
PIX = Connection traversing at most a single PCIe bridge
NV# = Connection traversing a bonded set of # NVLinks
$ nvidia-smi topo -p2p r
GPU0 GPU1
GPU0 X NS
GPU1 NS X
Legend:
X = Self
OK = Status Ok
CNS = Chipset not supported
GNS = GPU not supported
TNS = Topology not supported
NS = Not supported
U = Unknown
More infos
nvidia-smi
$ nvidia-smi -a
==============NVSMI LOG==============
Timestamp : Mon Aug 19 20:41:03 2024
Driver Version : 555.42.02
CUDA Version : 12.5
Attached GPUs : 2
GPU 00000000:01:00.0
Product Name : NVIDIA RTX 6000 Ada Generation
Product Brand : NVIDIA RTX
Product Architecture : Ada Lovelace
Display Mode : Enabled
Display Active : Disabled
Persistence Mode : Disabled
Addressing Mode : None
MIG Mode
Current : N/A
Pending : N/A
Accounting Mode : Disabled
Accounting Mode Buffer Size : 4000
Driver Model
Current : N/A
Pending : N/A
Serial Number : 1322123003492
GPU UUID : GPU-6a8f4713-9e95-c4df-e5a6-ce2ea034111e
Minor Number : 0
VBIOS Version : 95.02.59.00.09
MultiGPU Board : No
Board ID : 0x100
Board Part Number : 900-5G133-2250-000
GPU Part Number : 26B1-870-A1
FRU Part Number : N/A
Module ID : 1
Inforom Version
Image Version : G133.0510.00.02
OEM Object : 2.1
ECC Object : 6.16
Power Management Object : N/A
Inforom BBX Object Flush
Latest Timestamp : N/A
Latest Duration : N/A
GPU Operation Mode
Current : N/A
Pending : N/A
GPU C2C Mode : N/A
GPU Virtualization Mode
Virtualization Mode : None
Host VGPU Mode : N/A
vGPU Heterogeneous Mode : N/A
GPU Reset Status
Reset Required : No
Drain and Reset Recommended : N/A
GSP Firmware Version : 555.42.02
IBMNPU
Relaxed Ordering Mode : N/A
PCI
Bus : 0x01
Device : 0x00
Domain : 0x0000
Base Classcode : 0x3
Sub Classcode : 0x0
Device Id : 0x26B110DE
Bus Id : 00000000:01:00.0
Sub System Id : 0x16A110DE
GPU Link Info
PCIe Generation
Max : 4
Current : 1
Device Current : 1
Device Max : 4
Host Max : 4
Link Width
Max : 16x
Current : 8x
Bridge Chip
Type : N/A
Firmware : N/A
Replays Since Reset : 0
Replay Number Rollovers : 0
Tx Throughput : 0 KB/s
Rx Throughput : 0 KB/s
Atomic Caps Inbound : N/A
Atomic Caps Outbound : N/A
Fan Speed : 30 %
Performance State : P8
Clocks Event Reasons
Idle : Active
Applications Clocks Setting : Not Active
SW Power Cap : Not Active
HW Slowdown : Not Active
HW Thermal Slowdown : Not Active
HW Power Brake Slowdown : Not Active
Sync Boost : Not Active
SW Thermal Slowdown : Not Active
Display Clock Setting : Not Active
Sparse Operation Mode : N/A
FB Memory Usage
Total : 49140 MiB
Reserved : 603 MiB
Used : 2 MiB
Free : 48536 MiB
BAR1 Memory Usage
Total : 65536 MiB
Used : 11 MiB
Free : 65525 MiB
Conf Compute Protected Memory Usage
Total : 0 MiB
Used : 0 MiB
Free : 0 MiB
Compute Mode : Default
Utilization
Gpu : 0 %
Memory : 0 %
Encoder : 0 %
Decoder : 0 %
JPEG : 0 %
OFA : 0 %
Encoder Stats
Active Sessions : 0
Average FPS : 0
Average Latency : 0
FBC Stats
Active Sessions : 0
Average FPS : 0
Average Latency : 0
ECC Mode
Current : Disabled
Pending : Disabled
ECC Errors
Volatile
SRAM Correctable : N/A
SRAM Uncorrectable Parity : N/A
SRAM Uncorrectable SEC-DED : N/A
DRAM Correctable : N/A
DRAM Uncorrectable : N/A
Aggregate
SRAM Correctable : N/A
SRAM Uncorrectable Parity : N/A
SRAM Uncorrectable SEC-DED : N/A
DRAM Correctable : N/A
DRAM Uncorrectable : N/A
SRAM Threshold Exceeded : N/A
Aggregate Uncorrectable SRAM Sources
SRAM L2 : N/A
SRAM SM : N/A
SRAM Microcontroller : N/A
SRAM PCIE : N/A
SRAM Other : N/A
Retired Pages
Single Bit ECC : N/A
Double Bit ECC : N/A
Pending Page Blacklist : N/A
Remapped Rows
Correctable Error : 0
Uncorrectable Error : 0
Pending : No
Remapping Failure Occurred : No
Bank Remap Availability Histogram
Max : 192 bank(s)
High : 0 bank(s)
Partial : 0 bank(s)
Low : 0 bank(s)
None : 0 bank(s)
Temperature
GPU Current Temp : 45 C
GPU T.Limit Temp : 45 C
GPU Shutdown T.Limit Temp : -7 C
GPU Slowdown T.Limit Temp : -2 C
GPU Max Operating T.Limit Temp : 0 C
GPU Target Temperature : 85 C
Memory Current Temp : N/A
Memory Max Operating T.Limit Temp : N/A
GPU Power Readings
Power Draw : 24.64 W
Current Power Limit : 300.00 W
Requested Power Limit : 300.00 W
Default Power Limit : 300.00 W
Min Power Limit : 100.00 W
Max Power Limit : 300.00 W
GPU Memory Power Readings
Power Draw : N/A
Module Power Readings
Power Draw : N/A
Current Power Limit : N/A
Requested Power Limit : N/A
Default Power Limit : N/A
Min Power Limit : N/A
Max Power Limit : N/A
Clocks
Graphics : 210 MHz
SM : 210 MHz
Memory : 405 MHz
Video : 1185 MHz
Applications Clocks
Graphics : 2505 MHz
Memory : 10001 MHz
Default Applications Clocks
Graphics : 2505 MHz
Memory : 10001 MHz
Deferred Clocks
Memory : N/A
Max Clocks
Graphics : 3105 MHz
SM : 3105 MHz
Memory : 10001 MHz
Video : 2415 MHz
Max Customer Boost Clocks
Graphics : N/A
Clock Policy
Auto Boost : N/A
Auto Boost Default : N/A
Voltage
Graphics : 910.000 mV
Fabric
State : N/A
Status : N/A
CliqueId : N/A
ClusterUUID : N/A
Health
Bandwidth : N/A
Processes : None
Capabilities
EGM : disabled
GPU 00000000:02:00.0
Product Name : NVIDIA RTX 4000 SFF Ada Generation
Product Brand : NVIDIA RTX
Product Architecture : Ada Lovelace
Display Mode : Disabled
Display Active : Disabled
Persistence Mode : Disabled
Addressing Mode : None
MIG Mode
Current : N/A
Pending : N/A
Accounting Mode : Disabled
Accounting Mode Buffer Size : 4000
Driver Model
Current : N/A
Pending : N/A
Serial Number : 1321423017491
GPU UUID : GPU-128f0adb-018e-b8a9-29a2-cdbad6f96dec
Minor Number : 1
VBIOS Version : 95.04.46.00.08
MultiGPU Board : No
Board ID : 0x200
Board Part Number : 900-5G192-2571-000
GPU Part Number : 27B0-850-A1
FRU Part Number : N/A
Module ID : 1
Inforom Version
Image Version : G192.0550.00.02
OEM Object : 2.1
ECC Object : 6.16
Power Management Object : N/A
Inforom BBX Object Flush
Latest Timestamp : N/A
Latest Duration : N/A
GPU Operation Mode
Current : N/A
Pending : N/A
GPU C2C Mode : N/A
GPU Virtualization Mode
Virtualization Mode : None
Host VGPU Mode : N/A
vGPU Heterogeneous Mode : N/A
GPU Reset Status
Reset Required : No
Drain and Reset Recommended : N/A
GSP Firmware Version : 555.42.02
IBMNPU
Relaxed Ordering Mode : N/A
PCI
Bus : 0x02
Device : 0x00
Domain : 0x0000
Base Classcode : 0x3
Sub Classcode : 0x0
Device Id : 0x27B010DE
Bus Id : 00000000:02:00.0
Sub System Id : 0x16FA10DE
GPU Link Info
PCIe Generation
Max : 4
Current : 1
Device Current : 1
Device Max : 4
Host Max : 4
Link Width
Max : 16x
Current : 8x
Bridge Chip
Type : N/A
Firmware : N/A
Replays Since Reset : 0
Replay Number Rollovers : 0
Tx Throughput : 1 KB/s
Rx Throughput : 0 KB/s
Atomic Caps Inbound : N/A
Atomic Caps Outbound : N/A
Fan Speed : 30 %
Performance State : P8
Clocks Event Reasons
Idle : Active
Applications Clocks Setting : Not Active
SW Power Cap : Not Active
HW Slowdown : Not Active
HW Thermal Slowdown : Not Active
HW Power Brake Slowdown : Not Active
Sync Boost : Not Active
SW Thermal Slowdown : Not Active
Display Clock Setting : Not Active
Sparse Operation Mode : N/A
FB Memory Usage
Total : 20475 MiB
Reserved : 420 MiB
Used : 2 MiB
Free : 20055 MiB
BAR1 Memory Usage
Total : 32768 MiB
Used : 2 MiB
Free : 32766 MiB
Conf Compute Protected Memory Usage
Total : 0 MiB
Used : 0 MiB
Free : 0 MiB
Compute Mode : Default
Utilization
Gpu : 0 %
Memory : 0 %
Encoder : 0 %
Decoder : 0 %
JPEG : 0 %
OFA : 0 %
Encoder Stats
Active Sessions : 0
Average FPS : 0
Average Latency : 0
FBC Stats
Active Sessions : 0
Average FPS : 0
Average Latency : 0
ECC Mode
Current : Disabled
Pending : Disabled
ECC Errors
Volatile
SRAM Correctable : N/A
SRAM Uncorrectable Parity : N/A
SRAM Uncorrectable SEC-DED : N/A
DRAM Correctable : N/A
DRAM Uncorrectable : N/A
Aggregate
SRAM Correctable : N/A
SRAM Uncorrectable Parity : N/A
SRAM Uncorrectable SEC-DED : N/A
DRAM Correctable : N/A
DRAM Uncorrectable : N/A
SRAM Threshold Exceeded : N/A
Aggregate Uncorrectable SRAM Sources
SRAM L2 : N/A
SRAM SM : N/A
SRAM Microcontroller : N/A
SRAM PCIE : N/A
SRAM Other : N/A
Retired Pages
Single Bit ECC : N/A
Double Bit ECC : N/A
Pending Page Blacklist : N/A
Remapped Rows
Correctable Error : 0
Uncorrectable Error : 0
Pending : No
Remapping Failure Occurred : No
Bank Remap Availability Histogram
Max : 96 bank(s)
High : 0 bank(s)
Partial : 0 bank(s)
Low : 0 bank(s)
None : 0 bank(s)
Temperature
GPU Current Temp : 37 C
GPU T.Limit Temp : 47 C
GPU Shutdown T.Limit Temp : -7 C
GPU Slowdown T.Limit Temp : -2 C
GPU Max Operating T.Limit Temp : 0 C
GPU Target Temperature : 88 C
Memory Current Temp : N/A
Memory Max Operating T.Limit Temp : N/A
GPU Power Readings
Power Draw : 6.57 W
Current Power Limit : 70.00 W
Requested Power Limit : 70.00 W
Default Power Limit : 70.00 W
Min Power Limit : 30.00 W
Max Power Limit : 70.00 W
GPU Memory Power Readings
Power Draw : N/A
Module Power Readings
Power Draw : N/A
Current Power Limit : N/A
Requested Power Limit : N/A
Default Power Limit : N/A
Min Power Limit : N/A
Max Power Limit : N/A
Clocks
Graphics : 210 MHz
SM : 210 MHz
Memory : 405 MHz
Video : 765 MHz
Applications Clocks
Graphics : 1560 MHz
Memory : 7001 MHz
Default Applications Clocks
Graphics : 1560 MHz
Memory : 7001 MHz
Deferred Clocks
Memory : N/A
Max Clocks
Graphics : 3105 MHz
SM : 3105 MHz
Memory : 7001 MHz
Video : 2415 MHz
Max Customer Boost Clocks
Graphics : N/A
Clock Policy
Auto Boost : N/A
Auto Boost Default : N/A
Voltage
Graphics : 650.000 mV
Fabric
State : N/A
Status : N/A
CliqueId : N/A
ClusterUUID : N/A
Health
Bandwidth : N/A
Processes : None
Capabilities
EGM : disabled