Understanding which QPs are mapped to which IRQ

Hey,

I am trying to figure out if there is a way to check which QPs are mapped to which IRQ, as seen in the /proc/interrupts file:

[root@nvme101 eliott]# cat /proc/interrupts | grep ens1

149: 0 0 0 0 0 0 319141058 340480 0 13057634 94143 0 0 0 0 0 0 0 0 217826 0 232449 0 0 IR-PCI-MSI-edge ens1-12

150: 0 0 0 0 0 0 0 219867439 0 1647483 0 0 0 0 0 0 0 0 0 19729035 0 0 0 0 IR-PCI-MSI-edge ens1-13

151: 0 0 0 0 0 0 0 0 136004829 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 IR-PCI-MSI-edge ens1-14

152: 0 0 0 0 0 0 0 0 0 1573337 0 0 0 0 0 0 0 0 37493613 0 0 0 0 0 IR-PCI-MSI-edge ens1-15

155: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1975255376 0 0 0 0 0 IR-PCI-MSI-edge ens1-0

156: 0 0 0 0 0 0 202624 0 0 0 0 0 0 0 0 0 0 0 0 199828517 0 13153183 0 0 IR-PCI-MSI-edge ens1-1

157: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 116234175 0 0 0 IR-PCI-MSI-edge ens1-2

158: 0 0 0 0 0 0 0 0 0 0 0 112023074 0 0 0 0 0 0 0 0 0 127688879 5 0 IR-PCI-MSI-edge ens1-3

159: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1435783855 0 IR-PCI-MSI-edge ens1-4

160: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 240198063 IR-PCI-MSI-edge ens1-5

167: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 169419823 0 0 0 0 0 IR-PCI-MSI-edge ens1-6

168: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 81519183 1 3742930 0 0 IR-PCI-MSI-edge ens1-7

169: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 91266031 0 0 0 IR-PCI-MSI-edge ens1-8

170: 0 0 0 0 0 0 0 0 0 0 5448319 0 0 0 0 0 0 0 0 118 0 154722364 0 0 IR-PCI-MSI-edge ens1-9

171: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 268194063 0 IR-PCI-MSI-edge ens1-10

172: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 190925030 IR-PCI-MSI-edge ens1-11

So in my case I can see that device ens1 has a total of 16 IRQs. I would like to understand, for example, how many QPs are mapped under ‘ens1-0’ and what their QP numbers are.

Also, I would like to understand if there is a way to change this mapping to something more customizable.

Below is the lspci output of my NIC:

82:00.0 Network controller: Mellanox Technologies MT27520 Family [ConnectX-3 Pro]

Subsystem: Mellanox Technologies Device 0008

Physical Slot: 1

Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+

Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- SERR- <PERR- INTx-

Latency: 0, Cache Line Size: 32 bytes

Interrupt: pin A routed to IRQ 39

NUMA node: 1

Region 0: Memory at fbe00000 (64-bit, non-prefetchable) [size=1M]

Region 2: Memory at fb000000 (64-bit, prefetchable) [size=8M]

Expansion ROM at fbd00000 [disabled] [size=1M]

Capabilities: [9c] MSI-X: Enable+ Count=128 Masked-

Vector table: BAR=0 offset=0007c000

PBA: BAR=0 offset=0007d000


Hi,

The OS maps the interrupts: it allocates an interrupt request (IRQ) to each RX queue.

  • You can observe them this way:

[root@localhost tuning_scripts]# cat /proc/interrupts | grep ens5f0

219: 0 0 0 0 0 0 0 0 0 21678 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1024 0 0 0 0 0 0 PCI-MSI-edge ens5f0-0

220: 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1025 0 0 0 0 18411 PCI-MSI-edge ens5f0-1

[root@localhost tuning_scripts]# cat /proc/interrupts | grep ens5f1

258: 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1025 0 0 6628312 0 PCI-MSI-edge ens5f1-0

259: 0 0 0 0 0 0 0 0 0 1 41850479 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1024 0 0 0 PCI-MSI-edge ens5f1-1

The above shows that interrupt 219 is allocated to ens5f0-0, which is queue 0, and interrupt 220 to ens5f0-1, which is queue 1. The situation is similar for ens5f1, which also has two queues and two interrupts.

As you can see, each queue has its own interrupt, and the user cannot force one interrupt to serve several queues, or one queue to have multiple interrupts; it is a rigid one-interrupt-to-one-queue mapping.

However, interrupts can be spread across different cores, meaning that each time a packet arrives at a queue it can be serviced by a different CPU core.

- Interrupt mapping to CPU cores is controlled through IRQ affinity settings:

[root@localhost tuning_scripts]# cat /proc/irq/219/smp_affinity

0,00000200

[root@localhost tuning_scripts]# cat /proc/irq/220/smp_affinity

8,00000000

[root@localhost tuning_scripts]# cat /proc/irq/258/smp_affinity

4,00000000

[root@localhost tuning_scripts]# cat /proc/irq/259/smp_affinity

0,00000400

[root@localhost tuning_scripts]#

As observed above, each interrupt is mapped to a single core. The values are bitmasks, so translating them from hex to binary gives the exact index of the core that currently processes the interrupt and thereby serves the queue.
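For example, 0,00000200 is 0x200, which has only bit 9 set, so IRQ 219 is pinned to CPU core 9. Below is a minimal C sketch of the same decoding, assuming the usual /proc layout of comma-separated 32-bit hex words with the most significant word first; the mask string is copied from the output above:

/* Sketch: decode a /proc/irq/<N>/smp_affinity mask into CPU indices. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    char mask[] = "0,00000200";              /* IRQ 219 from the example above */
    unsigned long words[16];
    int nwords = 0;

    /* Split on commas; words are listed most significant first. */
    for (char *tok = strtok(mask, ","); tok && nwords < 16; tok = strtok(NULL, ","))
        words[nwords++] = strtoul(tok, NULL, 16);

    for (int w = 0; w < nwords; w++) {
        int base = 32 * (nwords - 1 - w);    /* bit offset of this 32-bit word */
        for (int bit = 0; bit < 32; bit++)
            if (words[w] & (1UL << bit))
                printf("CPU %d\n", base + bit);   /* prints: CPU 9 */
    }
    return 0;
}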

- In order to change the affinity, one can use standard Linux commands:

[root@localhost tuning_scripts]# echo 0xff > /proc/irq/219/smp_affinity

[root@localhost tuning_scripts]# cat /proc/irq/219/smp_affinity

0,000000ff

The above maps interrupt 219, which is associated with queue 0 of ens5f0, to cores 0 to 7, which means that 8 different cores will be involved in processing interrupts caused by traffic that is hashed into queue 0.

The Mellanox recommendation is to always map one interrupt to a single core, meaning that every interrupt caused by traffic that goes to this queue will be processed by one and only one core.

Thanks,

Samer

Is there any relationship between the NIC QPs and the interrupt itself?

I would like to control the number of QPs mapped to a specific interrupt. Is there a way to do it?

@samerka

Any help please?

I already read it; it does not provide information about the different NIC Queue Pairs. The purpose of my messages is to understand the mapping of QPs to IRQs (and not the mapping of IRQs to cores).

So, for example, mapping all of my NIC's QPs to only one IRQ.

@samerka Thank you a lot for this explanation.

Hi,

I suggest reviewing this community link:

What is IRQ Affinity? https://community.mellanox.com/s/article/what-is-irq-affinity-x

Thanks,

Samer

Hi Eliott,

There could be a way to do it, depending on your implementation:

Are you using verbs and OFED?

Noam

Hi Eliott,

The way it can be done is by tying the interrupt to the completion queue used by the QP: this is done by setting the completion vector (the comp_vector argument) in ibv_create_cq().
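To illustrate, here is a minimal sketch of that approach, assuming libibverbs; the device index, vector value and CQ/QP sizes are placeholders, error handling is abbreviated, and how a given completion vector maps onto the individual lines shown in /proc/interrupts is driver-specific:

/* Sketch: steer a QP's completion interrupts by choosing comp_vector
 * for the CQ it uses (placeholder sizes and vector index). */
#include <stdio.h>
#include <infiniband/verbs.h>

int main(void)
{
    int num_devices;
    struct ibv_device **dev_list = ibv_get_device_list(&num_devices);
    if (!dev_list || num_devices == 0)
        return 1;

    struct ibv_context *ctx = ibv_open_device(dev_list[0]);
    if (!ctx)
        return 1;

    /* Each completion vector corresponds to one of the device's MSI-X interrupts. */
    printf("completion vectors available: %d\n", ctx->num_comp_vectors);

    int comp_vector = 0;                       /* placeholder: vector to use */

    struct ibv_comp_channel *ch = ibv_create_comp_channel(ctx);
    /* All CQs created with the same comp_vector share the same interrupt. */
    struct ibv_cq *cq = ibv_create_cq(ctx, 256 /* cqe */, NULL, ch, comp_vector);
    struct ibv_pd *pd = ibv_alloc_pd(ctx);
    if (!ch || !cq || !pd)
        return 1;

    struct ibv_qp_init_attr attr = {
        .send_cq = cq,                         /* QP completions land on 'cq',  */
        .recv_cq = cq,                         /* hence on comp_vector's IRQ    */
        .cap = { .max_send_wr = 16, .max_recv_wr = 16,
                 .max_send_sge = 1, .max_recv_sge = 1 },
        .qp_type = IBV_QPT_RC,
    };
    struct ibv_qp *qp = ibv_create_qp(pd, &attr);
    if (qp)
        printf("QP 0x%x uses completion vector %d\n", (unsigned)qp->qp_num, comp_vector);

    /* Teardown omitted for brevity. */
    ibv_free_device_list(dev_list);
    return 0;
}

Since every QP attached to CQs created with the same comp_vector raises its completion interrupts on that same vector, this is effectively the "many QPs to one IRQ" mapping asked about above.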

If you need more details, please open a case in our support portal.

Thanks

Noam

I am using both MOFED and the inbox driver, so if there are multiple ways to do it, I will need both.

Thank you!