Assistance Required for Resolving CRC Errors and Packet Loss in RDMA and UDP Data Transfers

I am currently using an Ubuntu 18.04.6 system with Kernel 5.4.0-150-generic, equipped with a Mellanox ConnectX-4 LX NIC. My setup includes a ZCU102 FPGA that is programmed to send and receive 10G data.

Current Configuration:

  • Operating System: Ubuntu 18.04.6
  • Kernel Version: 5.4.0-150-generic
  • NIC: Mellanox ConnectX-4 LX
  • FPGA: ZCU102
  • Data Rate: 1 Gbps
  • Data Transfer Method: RDMA and UDP sockets

Issue Description:

I have successfully captured data from the FPGA to the PC using RDMA without any packet loss. However, I am facing issues when sending data from the PC to the FPGA using RDMA. Specifically, I am encountering packet loss and CRC errors on the FPGA side.

Additionally, I attempted to send data over a UDP socket bound to the Mellanox NIC, but I still observe the same packet loss and CRC errors.
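For reference, the UDP test can be reduced to something like the sketch below, which numbers every datagram so that gaps can be counted on the receiving side. This is only a minimal illustration, not my actual test program: the header layout, the helper names (`make_payload`, `parse_seq`, `send_probe`) and the pacing gap are all assumptions chosen for the example.

```python
import socket
import struct
import time

# Hypothetical header: 4-byte sequence number + 8-byte send timestamp (ns).
HEADER = struct.Struct("!IQ")


def make_payload(seq: int, size: int) -> bytes:
    """Build a datagram: header followed by a predictable fill pattern."""
    if size < HEADER.size:
        raise ValueError("size must be large enough to hold the header")
    body = bytes((seq + i) & 0xFF for i in range(size - HEADER.size))
    return HEADER.pack(seq & 0xFFFFFFFF, time.monotonic_ns()) + body


def parse_seq(datagram: bytes) -> int:
    """Recover the sequence number so the receiver can detect gaps."""
    return HEADER.unpack_from(datagram)[0]


def send_probe(local_ip, dest, count=1000, size=1024, gap_s=0.0001):
    """Send `count` numbered datagrams with `local_ip` as the source address."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # bind() fixes the source address; the route to `dest` still decides
        # which interface the packets actually leave from.
        sock.bind((local_ip, 0))
        for seq in range(count):
            sock.sendto(make_payload(seq, size), dest)
            if gap_s:
                time.sleep(gap_s)  # crude pacing to avoid back-to-back bursts
```

If the loss rate changes noticeably when `gap_s` or `size` is varied, that would suggest the receiver is being overrun rather than the link corrupting frames, but that is an inference to verify, not a conclusion.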

Steps Taken:

  1. Verified the integrity and configuration of the FPGA program.
  2. Checked the NIC settings and ensured proper configuration for RDMA.
  3. Used both RDMA and UDP sockets for data transfer.

Despite these efforts, the issues persist.
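To make the NIC-side check more concrete, one option is to snapshot the `ethtool -S` counters before and after a transfer and look at which error-like counters moved. The sketch below assumes `ethtool` is installed and that its output uses the usual `name: value` lines; the keyword filter and function names are illustrative, and the actual counter names depend on the driver.

```python
import subprocess


def parse_ethtool_stats(text: str) -> dict:
    """Parse `ethtool -S` style output ("  name: value" lines) into a dict."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.rpartition(":")
        if not sep:
            continue  # no colon on this line
        value = value.strip()
        if value.lstrip("-").isdigit():  # skips the "NIC statistics:" header
            stats[name.strip()] = int(value)
    return stats


def read_stats(iface: str) -> dict:
    """Run `ethtool -S <iface>` and return its counters."""
    result = subprocess.run(
        ["ethtool", "-S", iface], capture_output=True, text=True, check=True
    )
    return parse_ethtool_stats(result.stdout)


def changed_counters(before, after, keywords=("err", "discard", "drop", "crc")):
    """Return error-like counters that increased between two snapshots."""
    return {
        name: after[name] - before.get(name, 0)
        for name in after
        if any(k in name for k in keywords) and after[name] > before.get(name, 0)
    }
```

Typical use would be `before = read_stats(iface)`, run the transfer, then `changed_counters(before, read_stats(iface))`, where `iface` is whatever name the Mellanox port has on the system. Counters that stay flat on the PC while the FPGA reports CRC errors would point toward the link or the FPGA receive path rather than the host stack.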

Request for Assistance:

I seek your assistance in resolving the following:

  1. CRC Errors:
     • What could be causing the CRC errors on the FPGA side when receiving data from the PC?
     • Are there any specific NIC configurations or settings that need to be adjusted to mitigate these errors?
  2. Packet Loss:
     • What steps can be taken to reduce or eliminate packet loss during data transfer to the FPGA?
     • Are there any recommended practices for optimizing RDMA and UDP socket configurations to ensure reliable data transfer?
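As a way to cross-check the FPGA's CRC logic, the Ethernet frame check sequence (FCS) can be computed in software and compared with what the FPGA MAC calculates for the same bytes. The sketch below relies on the fact that Python's `zlib.crc32` implements the same CRC-32 as the Ethernet FCS; the helper names and the zero-padding to the 60-byte minimum are assumptions for illustration, and it is not known whether the FPGA design handles padding this way.

```python
import zlib

MIN_FRAME_NO_FCS = 60  # 64-byte minimum Ethernet frame minus the 4-byte FCS


def pad_frame(frame: bytes) -> bytes:
    """Zero-pad a frame (headers + payload, no FCS) to the minimum length."""
    return frame.ljust(MIN_FRAME_NO_FCS, b"\x00")


def append_fcs(frame: bytes) -> bytes:
    """Return the padded frame followed by its FCS in on-the-wire byte order."""
    padded = pad_frame(frame)
    # The FCS bytes appear in a capture as the little-endian CRC-32 value.
    return padded + zlib.crc32(padded).to_bytes(4, "little")


def fcs_ok(frame_with_fcs: bytes) -> bool:
    """Recompute the CRC over everything but the last 4 bytes and compare."""
    body, fcs = frame_with_fcs[:-4], frame_with_fcs[-4:]
    return zlib.crc32(body).to_bytes(4, "little") == fcs
```

Feeding a frame captured at the FPGA input (including its FCS) into `fcs_ok` would show whether the bytes were already corrupted on arrival or whether the FPGA's own CRC check disagrees with a known-good reference.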

Any insights or guidance you can provide would be greatly appreciated. If additional information or logs are required for further diagnosis, please let me know.

The cx4-lx went EOL a long time ago, and ubuntu 1804 is also out of support.

Thank you for your response. I would appreciate it if you could elaborate on a few points:

  1. “cx4-lx EOL”: Could you please clarify what you mean by “cx4-lx EOL”? Specifically, which models of the ConnectX-4 LX NIC are considered End-of-Life (EOL), and what are the implications for support and updates?
  2. “ubuntu 1804 out of support”: Could you provide details on the support status of Ubuntu 18.04? Which versions of Ubuntu currently support the ConnectX-4 LX NIC? Additionally, could you suggest current NICs and corresponding Ubuntu versions that can help achieve “zero packet loss” and resolve the CRC error issue we are facing?

Your guidance on these points will be highly valuable in ensuring we select the most appropriate hardware and software combinations for our requirements.

Thank you for your assistance.
