Performance problem between CentOS 7.3 and Windows 10 via iperf, fio, etc. on 40GbE

Hello,

I have a strange network performance problem between Linux and Windows. I hope that somebody can help me to find a solution.

Setup 1:

Two machines with the following hardware:

Core i7, 8 logical cores in total with hyper-threading

16GB RAM

Mellanox ConnectX-3 40GbE in a PCIe Gen2 slot (about 25 Gbit/s maximum)

CentOS 7.3 as operating system

the latest OFED driver and firmware
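The ~25 Gbit figure for the Gen2 slot is plausible. As a quick sanity check (assuming the card runs at x8, which is an assumption here, not something stated above), PCIe Gen2 delivers 5 GT/s per lane with 8b/10b encoding, i.e. 4 Gbit/s of payload per lane:

```shell
# PCIe Gen2: 5 GT/s per lane, 8b/10b encoding -> 4 Gbit/s of payload per lane
lanes=8            # assumed x8 link width
gbit_per_lane=4
echo "$((lanes * gbit_per_lane)) Gbit/s raw"
```

That gives 32 Gbit/s before PCIe protocol overhead (TLP headers, flow control), so roughly 25 Gbit/s of usable throughput is in line with expectations.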

I configured the settings in sysctl.conf as described in the tuning guide. Furthermore, I deactivated all offload features, RSS, interrupt moderation, etc.

In this setup and configuration I get around 21 Gbit/s in total between these two machines via iperf3 with 4 parallel streams. That seems to be OK.
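For reference, a test of this shape would produce that number (the address below is made up, and the per-stream rate is just the total divided by four):

```shell
# On the server:   iperf3 -s
# On the client:   iperf3 -c 192.168.10.1 -P 4 -t 30   # 4 parallel streams, hypothetical address
# Sanity check: 4 streams averaging ~5.25 Gbit/s each give the ~21 Gbit/s sum
streams=4
per_stream_mbit=5250    # assumed average per stream
echo "$((streams * per_stream_mbit)) Mbit/s in total"
```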

Setup 2:

Third machine hardware:

2x Xeon E5-2660 @ 2.60 GHz

40 logical cores in total with hyper-threading

64GB RAM

Mellanox ConnectX-3 40GbE in a PCIe Gen3 slot

Windows 10 Enterprise 64-bit with all updates installed

VPI-WinOF 5.35 driver

I also deactivated all the features described above for Linux. I tried disabling autotuning on Windows and configuring the congestion provider, etc., but I got better performance with autotuning set to normal.

Now I tested via iperf3 between Windows 10 and CentOS 7.3 and got only around 9.xx Gbit/s, no more!

Could somebody explain to me what is going wrong here? I got the same results without a switch. I also tested the speed with the same hardware on Windows 7 64-bit; with autotuning, the results there were around 2-3 Gbit/s better than with Windows 10.

Many thanks in advance!

Matthias

• Align the whole fabric to a single maximum MTU value

Please try a fixed gap of 14 between the Linux hosts and the Windows hosts (either decrease the Linux MTU by 14 or increase the Windows MTU by 14).

for example: Linux == 2030, Windows == 2044.
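The gap of 14 corresponds to the Ethernet header: Linux reports the MTU as IP payload only, while the Windows driver's value includes the 14-byte Ethernet header (this reading also matches the 1500/1514 pair mentioned later in the thread). The arithmetic:

```shell
# Ethernet header: 6 B dst MAC + 6 B src MAC + 2 B EtherType = 14 B
eth_header=$((6 + 6 + 2))
linux_mtu=2030                     # Linux counts IP payload only
echo "Windows value: $((linux_mtu + eth_header))"   # matches the 2044 in the example
```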

Let me know your results.

I added the following parameter to sysctl.conf:

net.ipv4.tcp_mtu_probing = 1

After a few seconds I got the same result of only about 9 Gbit/s.

sysctl.conf before adding the parameter:

net.ipv4.tcp_timestamps = 1

net.ipv4.tcp_sack = 1

net.ipv4.tcp_low_latency = 0

net.core.netdev_max_backlog = 250000

net.ipv4.tcp_rmem = 4096 87380 134217728

net.ipv4.tcp_wmem = 4096 65536 134217728

net.ipv4.tcp_congestion_control = htcp

net.core.default_qdisc = fq

net.ipv4.tcp_adv_win_scale = 2

net.ipv4.tcp_mem = 16777216 16777216 16777216

net.ipv4.tcp_reordering = 3

net.ipv4.tcp_limit_output_bytes = 131072

net.core.netdev_budget = 300

net.core.somaxconn = 2048

#net.core.optmem_max = 268435456

#net.core.rmem_default = 268435456

#net.core.wmem_default = 268435456

net.core.rmem_max = 268435456

net.core.wmem_max = 268435456

net.ipv4.tcp_mtu_probing = 1
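The large tcp_rmem/tcp_wmem maxima above follow the usual bandwidth-delay-product sizing. For example, for a hypothetical 10 ms path at 40 Gbit/s (a LAN RTT would be far smaller; the value is only illustrative):

```shell
# BDP = bandwidth (bit/s) * RTT (s) / 8 -> bytes that must be in flight
bandwidth_gbit=40
rtt_ms=10                          # assumed RTT, not measured on this fabric
bdp_bytes=$((bandwidth_gbit * 1000000000 / 8 * rtt_ms / 1000))
echo "$bdp_bytes bytes"            # ~50 MB, well under the 128 MiB tcp_rmem maximum
```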

In my standard configuration the MTU values are set to 1500 (Linux) and 1514 (Windows), and I got the results described above. If I change the values to 9000/9014, or to your example values, the performance breaks down to almost nothing:

[  4]   7.00-7.36  sec  0.00 Bytes  0.00 bits/sec    0   3.87 KBytes
[  6]   7.00-7.36  sec  0.00 Bytes  0.00 bits/sec    0   3.87 KBytes
[  8]   7.00-7.36  sec  0.00 Bytes  0.00 bits/sec    0   3.87 KBytes
[ 10]   7.00-7.36  sec  0.00 Bytes  0.00 bits/sec    0   3.87 KBytes
[SUM]   7.00-7.36  sec  0.00 Bytes  0.00 bits/sec    0

[ ID] Interval        Transfer     Bandwidth       Retr
[  4]   0.00-7.36  sec  3.75 MBytes  4.28 Mbits/sec    5   sender
[  4]   0.00-7.36  sec  0.00 Bytes   0.00 bits/sec         receiver
[  6]   0.00-7.36  sec  3.75 MBytes  4.28 Mbits/sec    5   sender
[  6]   0.00-7.36  sec  0.00 Bytes   0.00 bits/sec         receiver
[  8]   0.00-7.36  sec  3.75 MBytes  4.28 Mbits/sec    5   sender
[  8]   0.00-7.36  sec  0.00 Bytes   0.00 bits/sec         receiver
[ 10]   0.00-7.36  sec  3.75 MBytes  4.28 Mbits/sec    5   sender
[ 10]   0.00-7.36  sec  0.00 Bytes   0.00 bits/sec         receiver
[SUM]   0.00-7.36  sec  15.0 MBytes  17.1 Mbits/sec   20   sender
[SUM]   0.00-7.36  sec  0.00 Bytes   0.00 bits/sec         receiver

iperf3: interrupt - the client has terminated

Any idea about these results?

Thanks!

Does nobody have an idea?

The Windows installation is not activated. Perhaps that could be a reason… but I do not think so!