How can we enable tag matching offload while using NCCL/OpenMPI?
Btw, does NCCL support the tag matching offload of NVIDIA NICs?
Hi,
For OpenMPI with UCX, you can enable hardware tag matching offload with the following environment variables:
For the RC_X (Reliable Connection) transport:
• UCX_RC_MLX5_TM_ENABLE=y
For the DC_X (Dynamically Connected) transport:
• UCX_DC_MLX5_TM_ENABLE=y
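As a minimal sketch, a launch with tag matching offload enabled for the RC transport might look like this. The rank count and application name (`./my_mpi_app`) are placeholders; `--mca pml ucx` assumes your OpenMPI build includes the UCX PML:

```shell
# Enable UCX hardware tag matching for the RC transport
# and export it to all ranks (placeholder 2-rank job).
export UCX_RC_MLX5_TM_ENABLE=y
mpirun -np 2 \
  --mca pml ucx \
  -x UCX_RC_MLX5_TM_ENABLE \
  ./my_mpi_app
```

For the DC transport, export UCX_DC_MLX5_TM_ENABLE=y instead (or in addition), depending on which transport UCX selects on your fabric.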
Regarding your second question: NCCL itself does not use MPI tag matching semantics or the hardware tag matching offload provided by NVIDIA NICs. NCCL uses a different communication paradigm, built around GPU-to-GPU collectives and communication channels, and does not provide MPI-style tag-based message matching.
It does, however, support advanced offloads such as GPUDirect RDMA and SHARP.
You can set up NCCL alongside OpenMPI (with hardware tag matching), but the NCCL collectives themselves will not benefit from this offload.
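If you want to see which offloads NCCL actually uses at runtime, you can enable its debug output. NCCL_DEBUG and NCCL_DEBUG_SUBSYS are standard NCCL environment variables; the application name below is a placeholder for your own job:

```shell
# Ask NCCL to log its transport/offload selection at init time.
export NCCL_DEBUG=INFO
export NCCL_DEBUG_SUBSYS=INIT,NET
# Placeholder launch; inspect the log for the chosen NET plugin
# (e.g. lines mentioning NET/IB and GDRDMA when GPUDirect RDMA is active).
./my_nccl_app
```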
If there are still any questions, please open a case with enterprisesupport@nvidia.com, and it will be handled based on entitlement.
Thanks,
Jonathan.
@jtal Thanks for the info. I want to use libfabric instead of UCX.
Does libfabric have this support? Is it via