Can the interleaved memory regions (UMR) provided by the MLNX_OFED mlx5dv library be used for raw packet QPs?

I have an application that uses MLNX_OFED and the ibverbs API to receive packets into a queue pair of type IBV_QPT_RAW_PACKET. I would like to investigate the User-Mode Memory Registration (UMR) feature in MLNX_OFED to see whether it can help with some of my packet processing needs (e.g. header/data split and potentially some reordering of the packet payload bytes). I can already get header/data split by using multiple SGEs for each packet, but I’m hoping that an interleaved region might give me better performance.
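To make the multi-SGE baseline concrete, here is a small plain-C model of how I understand a received packet to be scattered across a receive WR's SGE list: each SGE is filled in order up to its length. It is only a sketch. `struct sge_model` is a local stand-in for the addr/length fields of `struct ibv_sge` (lkey omitted), `model_sge_scatter()` is a made-up helper, and the 42-byte header length assumes Ethernet + IPv4 without options + UDP. No verbs calls are made.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed fixed header: Ethernet (14) + IPv4 without options (20) + UDP (8). */
#define HDR_LEN 42

/* Local stand-in for the addr/length fields of struct ibv_sge. */
struct sge_model {
    uint8_t *addr;   /* destination buffer */
    uint32_t length; /* capacity of this scatter element */
};

/* Copy the packet into the SGEs in order, filling each one up to its
 * length before moving on. Returns the number of bytes placed. */
static size_t model_sge_scatter(const uint8_t *pkt, size_t pkt_len,
                                const struct sge_model *sge, size_t num_sge)
{
    size_t off = 0;

    for (size_t i = 0; i < num_sge && off < pkt_len; i++) {
        size_t n = sge[i].length;

        if (n > pkt_len - off)
            n = pkt_len - off;
        memcpy(sge[i].addr, pkt + off, n);
        off += n;
    }
    return off;
}
```

With a first SGE of HDR_LEN bytes and a second SGE covering the payload buffer, the headers land in one buffer and the payload in the other, which is the split I get today.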

The documentation on how to use this interface is kind of sparse, but I think I’ve pieced it together from various sources. It looks like I have to:

  • Register the two memory regions that I want to interleave data between using ibv_reg_mr() as I do already.
  • Create a QP using mlx5dv_create_qp(), setting MLX5DV_QP_EX_WITH_MR_INTERLEAVED in send_ops_flags and MLX5DV_QP_CREATE_DISABLE_SCATTER_TO_CQE in create_flags. It looks like the type of this QP must be RC in order to have this functionality available.
  • Create an indirect mkey using mlx5dv_create_mkey().
  • Create an array of mlx5dv_mr_interleaved objects that describe how the data should be interleaved between the two underlying memory regions.
  • Use the ibv_wr_*() API to post a work request that will configure the interleaved memory region using ibv_wr_start(), mlx5dv_wr_mr_interleaved(), and ibv_wr_complete().
  • Poll the completion queue for this work request (assuming I specified IBV_SEND_SIGNALED when I posted it); it should contain a completion for the MR configuration.
  • I can then use the indirect MR with other QPs.
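For reference, this is the layout I expect the interleaved region to produce, written as a plain-C software model rather than real verbs code. `struct interleaved_entry` is a local stand-in that mirrors the addr, bytes_count, and bytes_skip fields of `struct mlx5dv_mr_interleaved` (lkey omitted), and `model_interleaved_scatter()` is a made-up helper. The semantics are my assumption from the man pages: in each repetition, every entry takes bytes_count bytes of the linear stream and then skips bytes_skip bytes in its own buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local stand-in mirroring fields of struct mlx5dv_mr_interleaved. */
struct interleaved_entry {
    uint8_t *addr;        /* start of the underlying buffer */
    uint32_t bytes_count; /* bytes this entry takes per repetition */
    uint32_t bytes_skip;  /* bytes skipped in the buffer after each chunk */
};

/* Software model of the assumed layout: spread `len` bytes of `src`
 * across the entries, repeating the pattern `repeat_count` times.
 * Returns the number of bytes placed. */
static size_t model_interleaved_scatter(const uint8_t *src, size_t len,
                                        const struct interleaved_entry *e,
                                        size_t num_entries,
                                        uint32_t repeat_count)
{
    size_t off = 0;

    for (uint32_t rep = 0; rep < repeat_count; rep++) {
        for (size_t i = 0; i < num_entries; i++) {
            /* Distance between consecutive chunks in this entry's buffer. */
            size_t stride = (size_t)e[i].bytes_count + e[i].bytes_skip;
            size_t n = e[i].bytes_count;

            if (off == len)
                return off;
            if (n > len - off)
                n = len - off;
            memcpy(e[i].addr + rep * stride, src + off, n);
            off += n;
        }
    }
    return off;
}
```

For example, with entries {hdr, 2, 2} and {payload, 4, 0} and a repeat count of 2, a 12-byte stream ends up with bytes 0-1 and 6-7 in hdr (at offsets 0 and 4) and bytes 2-5 and 8-11 packed contiguously in payload. If the hardware semantics differ from this model, that would be useful to know too.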

I can provide a more detailed code example if needed, but I first wanted to make sure that what I’m doing is even on the right track.

I’m finding that I never get a completion posted to the CQ after calling ibv_wr_complete() for the indirect mkey setup. Do I need to modify the RC QP’s state using ibv_modify_qp() in order to configure the indirect mkey? My application does not use RC communication, only raw Ethernet, so I don’t have all of the parameters required to put the RC QP into the RTR or RTS states.

If I am able to successfully configure the interleaved indirect mkey, can I then use that mkey with QPs of RAW_PACKET type to receive packets?

Are there any full code examples of how to configure interleaved memory regions? All of the code snippets in the man pages are abbreviated and incomplete. Likewise, the only reference to this feature I can find in the rdma-core source code is in the pyverbs unit tests, which are difficult to follow as an example (they also use RC data transfers in the test, which I can’t use). It would be useful to have a full example somewhere.