Questions regarding SRQ performance


I am trying to benchmark SRQ (shared receive queue) performance in terms of operations per second. The setup is as follows:

Two senders, 1 QP per sender and 1 thread per sender.

One receiver, 2 QPs per receiver.

Scenario #1: Private receive queue per QP, 2 threads, each thread working on 1 QP.

The aggregate throughput is 18 Mops/s.

Scenario #2: 1 SRQ, 1 thread for 2 QPs.

The aggregate throughput is 12 Mops/s.

Scenario #3: 2 SRQs, 2 threads, 1 thread per QP.

The aggregate throughput is 12 Mops/s, and the throughput of each SRQ is around 6 Mops/s.

Increasing the number of SRQs does not increase the aggregate throughput. What might be the reason for this?
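
To make the scaling question concrete, here is a small sketch that derives the per-thread rate from the aggregate figures and thread counts listed in the three scenarios. The names `scenarios` and `per_thread` are only illustrative and not part of any benchmark tool; the numbers are the ones reported above.

```python
# Aggregate throughput (Mops/s) and receiver thread count for each scenario,
# copied from the measurements reported in this post.
scenarios = {
    "#1 private RQ per QP, 2 threads": (18.0, 2),
    "#2 1 SRQ, 1 thread": (12.0, 1),
    "#3 2 SRQs, 2 threads": (12.0, 2),
}


def per_thread(aggregate_mops, threads):
    """Average throughput handled by each receiver thread, in Mops/s."""
    return aggregate_mops / threads


for name, (aggregate, threads) in scenarios.items():
    rate = per_thread(aggregate, threads)
    print(f"Scenario {name}: {aggregate:.0f} Mops/s aggregate, "
          f"{rate:.0f} Mops/s per thread")
```

The per-thread rate drops from 12 Mops/s in scenario #2 to 6 Mops/s in scenario #3, while scenario #1 reaches 9 Mops/s per thread. That pattern would be consistent with a limit shared across the SRQs rather than a per-thread limit, although these numbers alone do not identify the cause.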

I have also tried a multi-process setup, and the results are similar.

Thank you,

Hi Sam,

For the above question please open a ticket with Mellanox support by sending the info to .

This needs to be investigated in order to provide you with the most accurate reply.

Thank you,