NVSHMEM on libfabric: optimal configuration

Hi!

I am trying to build an application using NVSHMEM on a cluster that has Slingshot 11, with libfabric as the transport backend. The same application scales really well on an IB cluster (without IBGDA), but it scales poorly on this libfabric machine.

I would like to understand whether there are specific flags that need to be set to optimise the performance of NVSHMEM over libfabric.