ConnectX-5 bit rate expected with Winsock "send" and "recv" versus RoCE

Can a ConnectX-5 adapter improve non-RoCE throughput for a legacy application using simple “send” and “recv” Winsock calls? For testing, I’m using a simple application that repeatedly passes a buffer ranging from tens of MB to one GB to the Winsock APIs “send” and “recv”, where “recv” is called with the MSG_WAITALL flag. We are hoping for speeds of around 25 Gbps without changing the application to use RoCE, but I’m only seeing speeds on the order of 8.5 to 9.0 Gbps. The same application achieves about 9.5 Gbps on a setup with old Intel X520 10GbE adapters.
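To make the test pattern concrete, here is a minimal sketch of that kind of send/recv loop. It is written in Python rather than against the Winsock API directly, and it is not the actual test application: the function name, the loopback address, and the default buffer size and round count are all assumptions chosen for illustration. It runs both ends over loopback in one process, so it only shows the measurement method and says nothing about adapter performance.

```python
import socket
import threading
import time


def measure_loopback_gbps(buf_size=16 * 1024 * 1024, rounds=4):
    """Send `rounds` buffers of `buf_size` bytes over a single TCP
    connection on loopback. Returns (bytes_received, gigabits_per_second).

    Hypothetical helper for illustration only.
    """
    payload = bytes(buf_size)  # zero-filled buffer, like the test buffer

    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    def sender():
        with socket.create_connection(("127.0.0.1", port)) as s:
            for _ in range(rounds):
                s.sendall(payload)  # one large blocking send per buffer

    t = threading.Thread(target=sender)
    t.start()

    conn, _ = listener.accept()
    received = 0
    start = time.perf_counter()
    with conn:
        for _ in range(rounds):
            # MSG_WAITALL asks the stack to block until the full buffer
            # has arrived (or the connection closes).
            chunk = conn.recv(buf_size, socket.MSG_WAITALL)
            received += len(chunk)
    elapsed = time.perf_counter() - start

    t.join()
    listener.close()
    return received, received * 8 / elapsed / 1e9


if __name__ == "__main__":
    total, gbps = measure_loopback_gbps()
    print(f"received {total} bytes at {gbps:.2f} Gbps")
```

Note that this pattern uses a single TCP connection with one outstanding send and one outstanding receive at a time, which is the same shape as the test described above.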

On the same setup, your nd_read_bw.exe and nd_write_bw.exe performance tests show speeds of 96 Gbps.

Should “send” and “recv” calls on Windows be able to achieve more than 10 Gbps throughput with ConnectX-5 adapters? If so, can you send me some suggestions?