DGX Spark NCCL test: only 15 GB/s, slower than expected

# nccl-tests version 2.17.9 nccl-headers=22809 nccl-library=22809
# Collective test starting: all_gather_perf
# nThread 1 nGpus 1 minBytes 17179869184 maxBytes 17179869184 step: 2(factor) warmup iters: 1 iters: 20 agg iters: 1 validation: 1 graph: 0
#
# Using devices
#  Rank  0 Group  0 Pid  31907 on node1 device  0 [000f:01:00] NVIDIA GB10
#  Rank  1 Group  0 Pid  30790 on node2 device  0 [000f:01:00] NVIDIA GB10
#
#                                                              out-of-place                       in-place          
#       size         count      type   redop    root     time   algbw   busbw  #wrong     time   algbw   busbw  #wrong 
#        (B)    (elements)                               (us)  (GB/s)  (GB/s)             (us)  (GB/s)  (GB/s)         
 17179869184    2147483648     float    none      -1   543291   31.62   15.81       0   564775   30.42   15.21       0
# Out of bounds values : 0 OK
# Avg bus bandwidth    : 15.5102 
#
# Collective test concluded: all_gather_perf
#


roceP2p1s0f0 port 1 ==> enP2p1s0f0np0 (Down)        
roceP2p1s0f1 port 1 ==> enP2p1s0f1np1 (Up)                                                                              
rocep1s0f0 port 1 ==> enp1s0f0np0 (Down)                                                                               
rocep1s0f1 port 1 ==> enp1s0f1np1 (Up)

As far as I know, ConnectX-7 supports 200GbE, which is about 25 GB/s, so I would expect to see more than 20 GB/s of bus bandwidth. However, I am only getting about 15 GB/s. What could be causing this?
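For reference, here is a rough sketch of how the numbers in the log relate to each other and to the link speed. It uses the all-gather bus bandwidth formula from the nccl-tests performance notes (busbw = algbw * (n - 1) / n) and assumes, for comparison only, that a single 200 Gb/s link is the limit. The function and variable names are made up for this example.

```python
# Values copied from the out-of-place columns of the log above.
size_bytes = 17179869184   # "size (B)"
time_us = 543291           # "time (us)"
n_ranks = 2                # two ranks, one GPU per node


def alg_bandwidth_gbs(size_bytes, time_us):
    """Algorithm bandwidth in GB/s: bytes moved divided by elapsed time."""
    return size_bytes / (time_us * 1e-6) / 1e9


def allgather_bus_bandwidth_gbs(algbw_gbs, n_ranks):
    """Bus bandwidth for all-gather: algbw scaled by (n - 1) / n."""
    return algbw_gbs * (n_ranks - 1) / n_ranks


algbw = alg_bandwidth_gbs(size_bytes, time_us)
busbw = allgather_bus_bandwidth_gbs(algbw, n_ranks)

# Assumption: one 200 Gb/s link, 8 bits per byte, no protocol overhead.
line_rate_gbs = 200 / 8

print(f"algbw     = {algbw:.2f} GB/s")
print(f"busbw     = {busbw:.2f} GB/s")
print(f"line rate = {line_rate_gbs:.2f} GB/s")
print(f"busbw / line rate = {busbw / line_rate_gbs:.0%}")
```

With two ranks the factor is 1/2, which is why the reported busbw is half of algbw. Under the single-link assumption, 15.81 GB/s is roughly 63% of the raw line rate.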

Previously, I was able to achieve around 20 GB/s, but when I ran the same test again some time later, I only got 15 GB/s.

This is a known firmware issue. A fix is in the works.