Can you post NCCL benchmark results with this?
But in any case, it won’t improve non-NCCL performance over the TCP/IP stack, right?