# nccl-tests version 2.17.9 nccl-headers=22809 nccl-library=22809
# Collective test starting: all_gather_perf
# nThread 1 nGpus 1 minBytes 17179869184 maxBytes 17179869184 step: 2(factor) warmup iters: 1 iters: 20 agg iters: 1 validation: 1 graph: 0
#
# Using devices
# Rank 0 Group 0 Pid 31907 on node1 device 0 [000f:01:00] NVIDIA GB10
# Rank 1 Group 0 Pid 30790 on node2 device 0 [000f:01:00] NVIDIA GB10
#
#                                                             out-of-place                      in-place
#       size         count      type   redop    root     time   algbw   busbw #wrong     time   algbw   busbw #wrong
#        (B)    (elements)                               (us)  (GB/s)  (GB/s)            (us)  (GB/s)  (GB/s)
 17179869184    2147483648     float    none      -1   543291   31.62   15.81      0   564775   30.42   15.21      0
# Out of bounds values : 0 OK
# Avg bus bandwidth : 15.5102
#
# Collective test concluded: all_gather_perf
#
roceP2p1s0f0 port 1 ==> enP2p1s0f0np0 (Down)
roceP2p1s0f1 port 1 ==> enP2p1s0f1np1 (Up)
rocep1s0f0 port 1 ==> enp1s0f0np0 (Down)
rocep1s0f1 port 1 ==> enp1s0f1np1 (Up)
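A small sketch for pulling the active ports out of a listing like the one above. It assumes every line follows the `<rdma device> port <n> ==> <netdev> (<state>)` layout shown here; the names `parse_port_map` and `up_ports` are made up for illustration and are not part of any tool.

```python
import re

# Assumed line layout: "<rdma_dev> port <n> ==> <netdev> (<state>)"
PORT_LINE = re.compile(r"^(\S+) port (\d+) ==> (\S+) \((\w+)\)$")


def parse_port_map(text):
    """Turn the listing into a list of dicts, skipping lines that do not match."""
    rows = []
    for line in text.strip().splitlines():
        match = PORT_LINE.match(line.strip())
        if match:
            rdma_dev, port, netdev, state = match.groups()
            rows.append(
                {"rdma_dev": rdma_dev, "port": int(port), "netdev": netdev, "state": state}
            )
    return rows


def up_ports(rows):
    """Return the RDMA device names whose link state is reported as Up."""
    return [row["rdma_dev"] for row in rows if row["state"] == "Up"]


listing = """
roceP2p1s0f0 port 1 ==> enP2p1s0f0np0 (Down)
roceP2p1s0f1 port 1 ==> enP2p1s0f1np1 (Up)
rocep1s0f0 port 1 ==> enp1s0f0np0 (Down)
rocep1s0f1 port 1 ==> enp1s0f1np1 (Up)
"""

print(up_ports(parse_port_map(listing)))  # ['roceP2p1s0f1', 'rocep1s0f1']
```

Running the same check on both nodes before and after a slow run would show whether the set of Up ports changed between measurements.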
As far as I know, ConnectX-7 supports 200GbE, which is about 25 GB/s of line rate, so I would expect a bus bandwidth above 20 GB/s. However, all_gather_perf reports only about 15.5 GB/s on average between the two nodes. What could be causing this?
Previously I measured around 20 GB/s, but when I reran the test some time later, I got only about 15 GB/s.
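For reference, this is how I am reading the numbers, as a rough sketch. It assumes the nccl-tests convention that all-gather bus bandwidth is algbw * (n - 1) / n, and that a single 200 Gb/s link is the right baseline to compare against (protocol overhead is ignored). The helper names are only illustrative.

```python
def algbw_gb_per_s(size_bytes, time_us):
    """Algorithm bandwidth: bytes moved divided by elapsed time, in GB/s (1e9 bytes)."""
    return size_bytes / (time_us * 1e-6) / 1e9


def allgather_busbw_gb_per_s(algbw, n_ranks):
    """Bus bandwidth for all-gather, assuming the nccl-tests factor (n - 1) / n."""
    return algbw * (n_ranks - 1) / n_ranks


def line_rate_gb_per_s(link_gbit_per_s):
    """Convert a link speed in Gb/s to GB/s, ignoring protocol overhead."""
    return link_gbit_per_s / 8


size_bytes = 17179869184  # size column from the log
n_ranks = 2               # Rank 0 on node1, Rank 1 on node2

out_of_place = allgather_busbw_gb_per_s(algbw_gb_per_s(size_bytes, 543291), n_ranks)
in_place = allgather_busbw_gb_per_s(algbw_gb_per_s(size_bytes, 564775), n_ranks)
avg_busbw = (out_of_place + in_place) / 2

line_rate = line_rate_gb_per_s(200)  # assumed single 200GbE link

print(f"out-of-place busbw: {out_of_place:.2f} GB/s")
print(f"in-place busbw:     {in_place:.2f} GB/s")
print(f"average busbw:      {avg_busbw:.2f} GB/s")
print(f"fraction of {line_rate:.0f} GB/s line rate: {avg_busbw / line_rate:.0%}")
```

With the times from the log, the average works out to about 15.51 GB/s, roughly 62% of the assumed 25 GB/s line rate, whereas the earlier 20 GB/s result would have been about 80%.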