'nvidia-smi topo -m' revisited

RikaK · June 6, 2022, 3:24am

Following https://forums.developer.nvidia.com/t/how-to-change-cpu-affinity-in-nvidia-smi-topo/190990 and https://stackoverflow.com/questions/55364149/understanding-nvidia-smi-topo-m-output (especially the awesome figure) I am trying to make sense of my output from ‘nvidia-smi topo -m’.

        GPU0    GPU1    GPU2    GPU3    mlx5_0  CPU Affinity    NUMA Affinity
GPU0     X      NV2     NV2     NV2     SYS     0-3,7-9,13-15   0
GPU1    NV2      X      NV2     NV2     SYS     0-3,7-9,13-15   0
GPU2    NV2     NV2      X      NV2     NODE    24-27,31-33     2
GPU3    NV2     NV2     NV2      X      NODE    24-27,31-33     2
mlx5_0  SYS     SYS     NODE    NODE     X

This is the output from one of our Volta nodes.
I understand that this is 4 GPUs connected by NVLink across 2 NUMA nodes.
It is the CPU Affinity column I am trying to get to grips with.
In previous years I had a script parsing this output to assign a CPU “controller” to each GPU (which I guess I can still do), but in those days it seemed more intuitive which CPU was closest to each GPU, because the numbering was consecutive or numerically strided. The CPU Affinity column above feels unintuitive, especially as the node has 48 CPUs.
Can you explain that smi output and advise on the best choice of matching the CPU to GPU where the CPU is only acting as controller and the remaining CPUs are doing other tasks in the background?
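
For reference, a script of the kind described above might look something like the following. This is only a minimal sketch in Python, and it assumes the column layout shown in this post (the last two fields of each GPU row are the CPU Affinity and NUMA Affinity values); other driver versions may print different columns. The name `parse_gpu_affinity` is made up for illustration, and the sample text is the output pasted above rather than a live call to nvidia-smi.

```python
# Sample text copied from the 'nvidia-smi topo -m' output in this thread.
SAMPLE = """\
        GPU0    GPU1    GPU2    GPU3    mlx5_0  CPU Affinity    NUMA Affinity
GPU0     X      NV2     NV2     NV2     SYS     0-3,7-9,13-15   0
GPU1    NV2      X      NV2     NV2     SYS     0-3,7-9,13-15   0
GPU2    NV2     NV2      X      NV2     NODE    24-27,31-33     2
GPU3    NV2     NV2     NV2      X      NODE    24-27,31-33     2
mlx5_0  SYS     SYS     NODE    NODE     X
"""


def parse_gpu_affinity(topo_text):
    """Map each GPU name to (CPU affinity string, NUMA node).

    Assumption: every GPU row ends with the CPU Affinity and NUMA Affinity
    fields, as in the sample above.
    """
    result = {}
    for line in topo_text.splitlines():
        fields = line.split()
        # GPU rows start with "GPU<n>" and contain the diagonal marker "X";
        # the header row also starts with "GPU0" once split, but has no "X".
        if fields and fields[0].startswith("GPU") and "X" in fields:
            cpu_affinity, numa_node = fields[-2], fields[-1]
            result[fields[0]] = (cpu_affinity, int(numa_node))
    return result


if __name__ == "__main__":
    for gpu, (cpus, numa) in parse_gpu_affinity(SAMPLE).items():
        print(gpu, cpus, numa)
```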

Robert_Crovella · June 6, 2022, 3:50am

If you have a process that is going to access GPU0 or GPU1, then use something like:

taskset -c x  ./my_executable

where x is one of 0, 1, 2, 3, 7, 8, 9, 13, 14, or 15, to place the execution of my_executable on a CPU core that has an “affinity relationship” to GPUs 0 and 1. That’s pretty much all you need to know for basic process placement.
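
If the placement is done from a script rather than with taskset, the same idea can be sketched in Python. This is a rough illustration, not a recommended implementation: `expand_cpu_list` and `pin_to_cores` are hypothetical helper names, and `os.sched_setaffinity` is only available on Linux.

```python
import os


def expand_cpu_list(spec):
    """Expand a list such as "0-3,7-9,13-15" into individual core IDs."""
    cores = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            cores.extend(range(int(low), int(high) + 1))
        else:
            cores.append(int(part))
    return cores


def pin_to_cores(spec, pid=0):
    """Restrict a process to the listed cores (Linux only; pid 0 is the caller)."""
    os.sched_setaffinity(pid, set(expand_cpu_list(spec)))


if __name__ == "__main__":
    # CPU Affinity value reported for GPU0 and GPU1 in this thread.
    print(expand_cpu_list("0-3,7-9,13-15"))
    # prints [0, 1, 2, 3, 7, 8, 9, 13, 14, 15]
    # On the GPU node itself one could then call, for example:
    #   pin_to_cores("0-3,7-9,13-15")
```

Note that `pin_to_cores` allows the whole set of cores, whereas `taskset -c x` with a single value of x restricts the process to one core.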

For additional observations:

This system probably has 2 NUMA nodes per socket (if it is a 2-socket system) or 4 NUMA nodes per socket (if it is a 1-socket system). AMD CPUs are often configured like this. They will have PCIe lanes (connected to the GPUs) that are “closer” to the CPU cores associated with particular NUMA nodes. This means that some CPU cores don’t have an affinity relationship to any GPU, so they don’t appear in the list, and your list may not include all 48 cores.

I can’t explain the core numbering exactly. That’s all I would bother to say or guess at without the CPU information (OEM system type, number of sockets, actual part numbers, etc.). If you want the best insight into what’s going on, it’s important to have this info as well. The nvidia-smi topo -m output doesn’t contain all the information that might be interesting.

RikaK · June 6, 2022, 4:05am

Wow, thanks for the speedy reply.
FYI(?)

gpuvolta
Normal priority queue, nodes equipped with NVIDIA Volta GPUs, 160 nodes total
2 x 24-core Intel Xeon Platinum 8268 (Cascade Lake) 2.9 GHz CPUs per node
384 GB RAM per node
2 CPU sockets per node, each with 2 NUMA nodes
12 CPU cores per NUMA node
96 GB local RAM per NUMA node
4 x Nvidia Tesla Volta V100-SXM2-32GB per node
480 GB local SSD disk per node 
Max request of 960 CPU cores (80 GPUs)

but I think you have answered all I was puzzled about. To give you context, I’m working with Gaussian (Kyle J can vouch for me) and nvidia-smi is still part of the instructions https://gaussian.com/gpu/. I probably still have the information I need. I just understand the other core numbering stuff better now so thank you.

ben.menadue · June 6, 2022, 4:51am

Just for reference, the NUMA topology you’re seeing here is because sub-NUMA clustering is enabled. At least on this hardware, I’m pretty sure that makes the platform expose the internal topology of the cores within the physical CPU package to the OS, hence the multiple discontinuous ranges. Each of the CPU packages presents its own, potentially-unique arrangement.

However, the output of nvidia-smi can be a bit misleading: here’s the topology of an example node:

[bjm900@gadi-gpu-v100-0001 ~]$ lscpu | grep NUMA
NUMA node(s):        4
NUMA node0 CPU(s):   0-3,7-9,13-15,19,20,48-51,55-57,61-63,67,68
NUMA node1 CPU(s):   4-6,10-12,16-18,21-23,52-54,58-60,64-66,69-71
NUMA node2 CPU(s):   24-27,31-33,37-39,43,44,72-75,79-81,85-87,91,92
NUMA node3 CPU(s):   28-30,34-36,40-42,45-47,76-78,82-84,88-90,93-95

Hyper-Threading is also enabled here, so ignore “cores” 48-95. Here’s the output of nvidia-smi:

[bjm900@gadi-gpu-v100-0001 ~]$ nvidia-smi topo -m | head -n6
        GPU0    GPU1    GPU2    GPU3    mlx5_0  CPU Affinity    NUMA Affinity
GPU0     X      NV2     NV2     NV2     SYS     0-3,7-9,13-15   0
GPU1    NV2      X      NV2     NV2     SYS     0-3,7-9,13-15   0
GPU2    NV2     NV2      X      NV2     NODE    24-27,31-33     2
GPU3    NV2     NV2     NV2      X      NODE    24-27,31-33     2
mlx5_0  SYS     SYS     NODE    NODE     X

The CPU affinity column looks to have been silently truncated. But I’m guessing that’s just a limitation on the output?
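
To make the difference concrete, here is a small Python sketch that compares the two outputs above. It uses only the strings pasted in this post; the helper names are made up for illustration, and the cutoff of 48 physical core IDs is an assumption based on IDs 48-95 being the Hyper-Threading siblings on this node.

```python
def cpu_set(spec):
    """Turn a list such as "0-3,7-9" into a set of core IDs."""
    cores = set()
    for part in spec.split(","):
        low, _, high = part.strip().partition("-")
        cores.update(range(int(low), int(high or low) + 1))
    return cores


def missing_cores(lscpu_spec, smi_spec, physical_cores=48):
    """Physical cores that lscpu lists for a NUMA node but nvidia-smi omits.

    Assumption: core IDs >= physical_cores are Hyper-Threading siblings
    and are ignored.
    """
    node = {c for c in cpu_set(lscpu_spec) if c < physical_cores}
    return sorted(node - cpu_set(smi_spec))


if __name__ == "__main__":
    # NUMA node0 from lscpu vs. the CPU Affinity shown for GPU0/GPU1.
    print(missing_cores("0-3,7-9,13-15,19,20,48-51,55-57,61-63,67,68",
                        "0-3,7-9,13-15"))
    # prints [19, 20]
    # NUMA node2 from lscpu vs. the CPU Affinity shown for GPU2/GPU3.
    print(missing_cores("24-27,31-33,37-39,43,44,72-75,79-81,85-87,91,92",
                        "24-27,31-33"))
    # prints [37, 38, 39, 43, 44]
```

So, going by lscpu, each of these NUMA nodes has 12 physical cores, but the nvidia-smi column lists only 10 of them for node 0 and 7 for node 2.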

RikaK · June 6, 2022, 6:05am

Lol @ben.menadue you kind of said that on our Slack but I didn’t clock what you meant by sub-NUMA clustering and didn’t realise how it explained the CPU numbering. Not bothered about the hyperthreading. It was not being sure where the other 32 cpus had got to and what they were doing amongst other things. lscpu is a cool command (I was catting cpuinfo). Anyway, as Robert reinforced what you said I’m happy. Thank you both.

system · June 20, 2022, 6:06am

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
