Hi, I am currently using a Mellanox ConnectX-5 for RDMA and have a question: is atomicity guaranteed between a CPU CAS and an RDMA CAS on the same memory location?
Thanks!
Hi,
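There is no guaranteed atomicity between CPU Compare-and-Swap (CAS) operations and RDMA CAS operations when using Mellanox ConnectX-5 or similar hardware. This stems from the hardware-specific implementation of RDMA atomic operations. The verbs API exposes the level a device supports in the atomic_cap field returned by ibv_query_device: IBV_ATOMIC_HCA means atomicity is guaranteed only among QPs on the same device, while IBV_ATOMIC_GLOB extends it to other components such as the CPU, so it is worth checking what your device reports.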
CPU CAS and RDMA CAS operate through different paths:
- CPU CAS acts directly on cache-coherent memory, using CPU instructions that respect the memory model and cache coherence protocol (such as MESIF on Intel x86).
- RDMA CAS bypasses the host CPU and OS: the RNIC (in your case, ConnectX-5) accesses host memory directly, typically via PCIe DMA.
This means:
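- The CPU and RNIC are not guaranteed a consistent view of the target word for the duration of the operation, especially if the memory region is cached by the CPU. The RNIC's CAS may be carried out as a separate DMA read and DMA write, so a CPU CAS on the same address can land in between.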
- There’s no hardware arbitration between the CPU and RNIC to enforce a global order or atomicity across their CAS operations.
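To make the two points above concrete, here is a minimal sketch in C of one interleaving in which both sides believe their CAS succeeded. It is a single-threaded simulation, not code that talks to a ConnectX-5: the names nic_cas_read, nic_cas_write, nic_cas_state and run_interleaving are hypothetical, and modelling the RNIC's CAS as a separate read followed by a conditional write is an assumption made for illustration, not a description of the actual hardware behavior.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* CPU-side CAS: one indivisible read-compare-write on the target word. */
static bool cpu_cas(_Atomic uint64_t *addr, uint64_t expected, uint64_t desired)
{
    return atomic_compare_exchange_strong(addr, &expected, desired);
}

/* Hypothetical model of a NIC-side CAS that is atomic only with respect to
 * other NIC operations: the read and the write are two separate steps. */
typedef struct {
    uint64_t observed; /* value seen by the NIC's read step */
} nic_cas_state;

static void nic_cas_read(_Atomic uint64_t *addr, nic_cas_state *st)
{
    st->observed = atomic_load(addr);
}

static bool nic_cas_write(_Atomic uint64_t *addr, const nic_cas_state *st,
                          uint64_t expected, uint64_t desired)
{
    if (st->observed != expected)
        return false;            /* compare uses the earlier, possibly stale read */
    atomic_store(addr, desired); /* unconditional write, no re-check */
    return true;
}

/* Replays one bad interleaving on a word used as a lock (0 = free).
 * Returns how many parties think they acquired it and stores the final
 * value of the word in *final_value. */
int run_interleaving(uint64_t *final_value)
{
    _Atomic uint64_t lock_word = 0;
    nic_cas_state st;

    nic_cas_read(&lock_word, &st);                       /* NIC reads 0            */
    bool cpu_won = cpu_cas(&lock_word, 0, 1);            /* CPU swaps 0 -> 1       */
    bool nic_won = nic_cas_write(&lock_word, &st, 0, 2); /* NIC writes 2 from stale read */

    *final_value = atomic_load(&lock_word);
    return (int)cpu_won + (int)nic_won;
}
```

In this replay run_interleaving returns 2 and leaves the word at 2: both the CPU and the simulated NIC report success, and the CPU's update is overwritten. If the CPU and remote peers must coordinate on the same word, one option is to route every update through the same path, for example by having the local side also issue RDMA atomics to its own memory through the RNIC.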
I hope this helps.
If there are still any questions or issues, please open a case with enterprisesupport@nvidia.com and it will be handled based on entitlement.
Thanks,
Jonathan.