# The SYS legend given by "nvidia-smi topo -m" conflicts with the "NUMA Affinity" field

I have a machine with four V100-PCIe GPUs, and I want to understand the specific PCIe connectivity topology between them. The results obtained using `nvidia-smi topo -m` are as follows:

``````
        GPU0    GPU1    GPU2    GPU3    NIC0    NIC1    CPU Affinity    NUMA Affinity   GPU NUMA ID
GPU0     X      PIX     SYS     SYS     PIX     PIX     0-9,20-29       0               N/A
GPU1    PIX      X      SYS     SYS     PIX     PIX     0-9,20-29       0               N/A
GPU2    SYS     SYS      X      PIX     SYS     SYS     0-9,20-29       0               N/A
GPU3    SYS     SYS     PIX      X      SYS     SYS     0-9,20-29       0               N/A
NIC0    PIX     PIX     SYS     SYS      X      PIX
NIC1    PIX     PIX     SYS     SYS     PIX      X

Legend:

X    = Self
SYS  = Connection traversing PCIe as well as the SMP interconnect between NUMA nodes (e.g., QPI/UPI)
NODE = Connection traversing PCIe as well as the interconnect between PCIe Host Bridges within a NUMA node
PHB  = Connection traversing PCIe as well as a PCIe Host Bridge (typically the CPU)
PXB  = Connection traversing multiple PCIe bridges (without traversing the PCIe Host Bridge)
PIX  = Connection traversing at most a single PCIe bridge
NV#  = Connection traversing a bonded set of # NVLinks

NIC Legend:

NIC0: mlx5_0
NIC1: mlx5_1
``````

The `SYS` entries in the output above indicate that GPU0 and GPU1 are in one NUMA node, while GPU2 and GPU3 are in another. But the `NUMA Affinity` field indicates that all GPUs are in NUMA node 0.
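To make the mismatch explicit, here is a minimal Python sketch that parses the matrix and lists every GPU pair reported as `SYS` even though both GPUs show the same `NUMA Affinity`. The function name `find_sys_conflicts` and the `SAMPLE` constant are my own, and the parser assumes the column layout shown above (device columns first, then `CPU Affinity`, then `NUMA Affinity`); other driver versions may print a different layout.

```python
import re
from itertools import combinations

# Matrix rows copied from the `nvidia-smi topo -m` output above.
SAMPLE = """\
        GPU0    GPU1    GPU2    GPU3    NIC0    NIC1    CPU Affinity    NUMA Affinity   GPU NUMA ID
GPU0     X      PIX     SYS     SYS     PIX     PIX     0-9,20-29       0               N/A
GPU1    PIX      X      SYS     SYS     PIX     PIX     0-9,20-29       0               N/A
GPU2    SYS     SYS      X      PIX     SYS     SYS     0-9,20-29       0               N/A
GPU3    SYS     SYS     PIX      X      SYS     SYS     0-9,20-29       0               N/A
NIC0    PIX     PIX     SYS     SYS      X      PIX
NIC1    PIX     PIX     SYS     SYS     PIX      X
"""


def find_sys_conflicts(topo_text):
    """Return GPU pairs marked SYS although both report the same NUMA affinity."""
    lines = [ln for ln in topo_text.splitlines() if ln.strip()]
    # Device names (GPU0, NIC0, ...) are taken from the header row.
    devices = re.findall(r"\b(?:GPU|NIC)\d+\b", lines[0])
    links, numa = {}, {}
    for ln in lines[1:]:
        cells = ln.split()
        if cells[0] not in devices:
            break  # reached the legend, stop parsing
        links[cells[0]] = dict(zip(devices, cells[1:1 + len(devices)]))
        # Assumed layout: name, one cell per device, CPU affinity, NUMA affinity.
        if len(cells) > len(devices) + 2:
            numa[cells[0]] = cells[len(devices) + 2]
    gpus = [d for d in devices if d.startswith("GPU")]
    return [
        (a, b)
        for a, b in combinations(gpus, 2)
        if links[a][b] == "SYS" and a in numa and numa.get(a) == numa.get(b)
    ]


if __name__ == "__main__":
    for a, b in find_sys_conflicts(SAMPLE):
        print(f"{a} <-> {b}: SYS, but both report the same NUMA affinity")
```

For the output above, this flags the four cross pairs between {GPU0, GPU1} and {GPU2, GPU3}.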

The results obtained through `lspci -tv` are as follows (I only kept the part related to the GPUs):

``````
+-[0000:3a]-+-00.0-[3b-41]----00.0-[3c-41]--+-04.0-[3d]----00.0  NVIDIA Corporation GV100GL [Tesla V100 PCIe 32GB]
|           |                               +-08.0-[3e]--
|           |                               +-0c.0-[3f]--
|           |                               +-10.0-[40]----00.0  Xilinx Corporation Device d004
|           |                               \-14.0-[41]----00.0  NVIDIA Corporation GV100GL [Tesla V100 PCIe 32GB]
....
+-[0000:17]-+-00.0-[18-1e]----00.0-[19-1e]--+-04.0-[1a]--+-00.0  Mellanox Technologies MT27800 Family [ConnectX-5]
|           |                               |            \-00.1  Mellanox Technologies MT27800 Family [ConnectX-5]
|           |                               +-08.0-[1b]----00.0  NVIDIA Corporation GV100GL [Tesla V100 PCIe 32GB]
|           |                               +-0c.0-[1c]--
|           |                               +-10.0-[1d]--
|           |                               \-14.0-[1e]----00.0  NVIDIA Corporation GV100GL [Tesla V100 PCIe 32GB]
``````

This indicates that two GPUs are under root bus 3a, while the other two are under root bus 17.
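The same grouping can be cross-checked without reading the tree by eye. The sketch below resolves each device's sysfs symlink and extracts the `pciDDDD:BB` path component, which names the root bus. The helper name `root_bus` is my own, and the four device addresses are read off the `lspci -tv` tree above assuming PCI domain `0000`.

```python
import os
import re


def root_bus(bdf, sysfs_root="/sys/bus/pci/devices"):
    """Return the root bus (e.g. '0000:3a') a PCI device sits under, or None."""
    # Each entry is a symlink into /sys/devices/pciDDDD:BB/...; resolve it.
    resolved = os.path.realpath(os.path.join(sysfs_root, bdf))
    match = re.search(r"pci([0-9a-f]{4}:[0-9a-f]{2})", resolved)
    return match.group(1) if match else None


if __name__ == "__main__":
    # GPU addresses taken from the tree above (assumed domain 0000).
    for bdf in ("0000:3d:00.0", "0000:41:00.0", "0000:1b:00.0", "0000:1e:00.0"):
        print(bdf, "->", root_bus(bdf))
```

If the tree above is accurate, the first two addresses should resolve to `0000:3a` and the last two to `0000:17`.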

Then I used `dmesg | grep NUMA` to view the information about NUMA nodes, and the result is as follows:

``````
[    2.496394] pci_bus 0000:00: on NUMA node 0
[    2.527717] pci_bus 0000:17: on NUMA node 0
[    2.547933] pci_bus 0000:3a: on NUMA node 0
[    2.554105] pci_bus 0000:5d: on NUMA node 0
[    2.558604] pci_bus 0000:80: on NUMA node 1
[    2.567157] pci_bus 0000:85: on NUMA node 1
[    2.574463] pci_bus 0000:ae: on NUMA node 1
[    2.579994] pci_bus 0000:d7: on NUMA node 1
``````

This result indicates that bus 3a and bus 17 are in the same NUMA node, meaning that all four GPUs are in the same node. This contradicts the `SYS` entries from `nvidia-smi topo -m` but agrees with the `NUMA Affinity` field.
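As a per-device check that does not depend on boot-time `dmesg` lines, the kernel also exposes the NUMA node of each PCI device through the `numa_node` attribute in sysfs. The following is a small sketch; `pci_numa_node` is a hypothetical helper name, and the addresses are again the GPU addresses read from the `lspci -tv` tree, assuming domain `0000`.

```python
from pathlib import Path


def pci_numa_node(bdf, sysfs_root="/sys/bus/pci/devices"):
    """Return the NUMA node the kernel reports for a PCI device.

    Returns -1 when the kernel has no NUMA information for the device,
    and None when the device is not present in sysfs.
    """
    attr = Path(sysfs_root) / bdf / "numa_node"
    if not attr.exists():
        return None
    return int(attr.read_text().strip())


if __name__ == "__main__":
    # GPU addresses taken from the lspci tree (assumed domain 0000).
    for bdf in ("0000:3d:00.0", "0000:41:00.0", "0000:1b:00.0", "0000:1e:00.0"):
        print(bdf, "-> NUMA node", pci_numa_node(bdf))
```

If all four GPUs report node 0 here as well, that matches the `dmesg` output and the `NUMA Affinity` column, leaving the `SYS` entries as the only outlier.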

I think the `SYS` entries in the `nvidia-smi` result should be `NODE`. Is my understanding of the `SYS` legend incorrect?

Hi Xiangzhouliiu,