Does CPU CAS work correctly with RDMA CAS?

Hi, I am currently using a Mellanox ConnectX-5 for RDMA and have a question: is atomicity guaranteed between a CPU CAS and an RDMA CAS that target the same memory location?
Thanks!

Hi,

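There is no guaranteed atomicity between CPU Compare-and-Swap (CAS) operations and RDMA CAS operations when using Mellanox ConnectX-5 or similar hardware. RDMA atomics are executed by the RNIC itself, and the verbs API exposes how far their guarantee reaches through the atomic_cap field returned by ibv_query_device: IBV_ATOMIC_HCA guarantees atomicity only among operations performed through that same device, and only IBV_ATOMIC_GLOB extends the guarantee to the CPU and other components.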

CPU CAS and RDMA CAS operate through different paths:

  • CPU CAS acts directly on cache-coherent memory, using an atomic CPU instruction (LOCK CMPXCHG on x86) that respects the CPU's memory model and cache coherence protocol (for example, MESIF on Intel processors).
  • RDMA CAS bypasses the host CPU and OS: the RNIC (in your case, ConnectX-5) accesses host memory itself using PCIe DMA.
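
For reference, the CPU-side operation looks roughly like this in C11. This is a minimal sketch: `shared_word`, `cpu_cas`, `cpu_load`, and `cpu_store` are hypothetical names used for illustration, and the word is 64 bits wide because RDMA CAS operates on 8-byte values. The RNIC side would target the same address with an `IBV_WR_ATOMIC_CMP_AND_SWP` work request, which is not shown here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* The word both sides would target (hypothetical name). */
static _Atomic uint64_t shared_word;

/* RDMA CAS works on 8-byte values, so the CPU side uses the same width. */
static_assert(sizeof shared_word == 8, "shared word must be 64 bits");

/* CPU-side CAS: store `desired` only if the word still equals `expected`.
 * Returns true on success. This is atomic with respect to other CPU cores;
 * nothing here makes it atomic with respect to an RNIC. */
bool cpu_cas(uint64_t expected, uint64_t desired)
{
    return atomic_compare_exchange_strong(&shared_word, &expected, desired);
}

/* Plain atomic load and store helpers for the same word. */
uint64_t cpu_load(void)
{
    return atomic_load(&shared_word);
}

void cpu_store(uint64_t value)
{
    atomic_store(&shared_word, value);
}
```

With the word set to 5, `cpu_cas(5, 7)` succeeds and leaves 7 in place; a following `cpu_cas(5, 9)` fails because the word no longer holds 5.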

This means:

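  • The RNIC's CAS is not guaranteed to look like a single indivisible step from the CPU's point of view. The device may carry out the read and the conditional write as separate transactions, so a CPU CAS on the same address can take effect in between, and the two sides can then act on different views of the same word.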
  • There’s no hardware arbitration between the CPU and RNIC to enforce a global order or atomicity across their CAS operations.
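
The lack of arbitration described above can be illustrated deterministically in plain C. This is a model under stated assumptions, not a description of how ConnectX-5 implements atomics: it assumes the device-side CAS is carried out as a read followed later by a conditional write, and `nic_cas`, `nic_cas_read`, `nic_cas_write`, and `lost_update_demo` are hypothetical names. The word is treated as a lock, where 0 means free, 1 means held by the CPU, and 2 means held by the remote side.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a device-side CAS that is split into two steps. */
struct nic_cas {
    uint64_t expected;
    uint64_t desired;
    uint64_t observed;   /* value seen by the read step */
};

/* Step 1: the device reads the current value of the word. */
static void nic_cas_read(struct nic_cas *op, _Atomic uint64_t *word)
{
    op->observed = atomic_load(word);
}

/* Step 2: the device compares against the value it read earlier and,
 * on a match, writes the new value. It does not re-check memory. */
static bool nic_cas_write(const struct nic_cas *op, _Atomic uint64_t *word)
{
    if (op->observed != op->expected)
        return false;
    atomic_store(word, op->desired);
    return true;
}

/* Replays one bad interleaving and returns the final value of the word.
 * Both sides try to take the lock while it is free (0). */
uint64_t lost_update_demo(bool *cpu_ok, bool *nic_ok)
{
    _Atomic uint64_t lock_word;
    atomic_init(&lock_word, 0);

    struct nic_cas op = { .expected = 0, .desired = 2, .observed = 0 };

    nic_cas_read(&op, &lock_word);            /* device sees 0 (free) */
    assert(op.observed == 0);

    uint64_t expected = 0;                    /* CPU CAS lands in the gap */
    *cpu_ok = atomic_compare_exchange_strong(&lock_word, &expected, 1);

    *nic_ok = nic_cas_write(&op, &lock_word); /* device still believes 0 */

    return atomic_load(&lock_word);
}
```

In this interleaving both sides report success and the word ends up as 2, so the CPU believes it holds a lock that the remote side has since overwritten.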

I hope this helps.
If there are still any questions or issues, please open a case with enterprisesupport@nvidia.com and it will be handled based on entitlement.

Thanks,
Jonathan.