Performance Degradation when using ConnectX-6 Dx to transfer UDP Multicast Jumbo Frames

I am sending UDP multicast data between 2 PowerEdge servers that have ConnectX-6 Dx cards installed. The operating system is RHEL 8.7 (4.18.0-425.3.1.el8.x86_64). When running iperf3 tests at high data rates (~27 Gbit/s), packet loss increases from <1% with the default MTU of 1500 to >35% after switching to jumbo frames (MTU 9000, set with ifconfig). Are there any settings that need to be changed on the network card to support larger packet sizes sent at this rate?

I would first confirm that you have our latest MLNX_OFED driver and firmware in place.
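For reference, the installed MLNX_OFED release and the NIC driver/firmware versions can be checked with something like the following (using [devname] as a placeholder for your interface):

ofed_info -s
ethtool -i [devname]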

There are no settings required to enable a jumbo-frame MTU of 9000 other than ip link set or ifconfig.
The MTU should be chosen based on the message size being sent.
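For example, either of the following sets the MTU to 9000 ([devname] is again a placeholder for your interface):

ip link set dev [devname] mtu 9000
ifconfig [devname] mtu 9000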

I would recommend using iperf/iperf2 and not iperf3, as iperf3 lacks several features found in iperf2, for example multicast tests, bidirectional tests, multi-threading, and official Windows support.
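As a rough sketch only (the multicast group, bandwidth, TTL, and run time below are placeholders, and the 8900-byte payload is simply chosen to fit within a single 9000-byte MTU frame), an iperf2 UDP multicast test could look like:

On the receiver: iperf -s -u -B 239.1.1.1 -i 1
On the sender: iperf -c 239.1.1.1 -u -b 25G -l 8900 -T 4 -t 30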

Some tuning might be applicable here as well; we have various community articles and our driver user manual that you can consult.

(If the ethtool statistics report out-of-buffer drops, you can increase the RX/TX ring buffers up to 8192.)
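For example, something along these lines (the exact counter name can vary between driver versions, so treat the grep pattern as an assumption):

ethtool -S [devname] | grep -i out_of_buffer
ethtool -g [devname]
ethtool -G [devname] rx 8192 tx 8192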

Lastly, should you have a support contract with NVIDIA, we can investigate your issue with further analysis.
Enterprise Support EnterpriseSupport@nvidia.com

I was able to get much better performance by disabling adaptive interrupt coalescence using ethtool:

ethtool -C [devname] adaptive-rx off rx-usecs 0 rx-frames 0

Now, I can transfer data using jumbo frames at a high rate with minimal packet loss.
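In case it helps anyone else, the resulting coalescing settings can be confirmed with:

ethtool -c [devname]

Note that these settings do not persist across reboots unless they are reapplied by your network configuration.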

Thanks for your help.