What is the busBW in nccl-tests?

Hi, Experts

What does busBw in nccl-tests stand for? The algorithm bandwidth (algBw) is clear: it equals the bytes transferred divided by the time spent. But I see there is an extra factor in the calculation of busBw:


void ScatterGetBw(size_t count, int typesize, double sec, double* algBw, double* busBw, int nranks) {
  double baseBw = (double)(count * nranks * typesize) / 1.0E9 / sec;

  *algBw = baseBw;
  double factor = ((double)(nranks-1))/((double)(nranks));
  *busBw = baseBw * factor;
}

For alltoall/gather/all_gather/reduce_scatter/scatter the factor is
double factor = ((double)(nranks - 1))/((double)nranks);
while all_reduce uses
double factor = ((double)(2*(nranks - 1)))/((double)nranks);
and broadcast/reduce use
double factor = 1;

Please help me understand the difference between algBw and busBw!


The published info on NCCL performance is here. The algbw is measured for the operation; the busbw number is simply a calculation derived from it. For example, for all-reduce with 8 ranks it is just (2*7/8) * algbw. The purpose of the bus bandwidth number is to compare against the “expected/achievable” bus bandwidth as measured by some other direct method (e.g. p2pBandwidthLatencyTest).
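
To make the calculation concrete, here is a minimal sketch of how busbw is derived from algbw for a few collectives. The enum and the function names (coll_t, bus_bw_factor, alg_bw_gbs, bus_bw_gbs) are made up for this illustration; nccl-tests itself keeps a separate GetBw function per test rather than one switch. The factors are the ones quoted in the question.

```c
#include <stddef.h>

/* Hypothetical enum for this sketch only; not an nccl-tests type. */
typedef enum {
  COLL_ALL_REDUCE,
  COLL_ALL_GATHER,
  COLL_REDUCE_SCATTER,
  COLL_BROADCAST,
  COLL_REDUCE
} coll_t;

/* Correction factor that turns algorithm bandwidth into bus bandwidth. */
double bus_bw_factor(coll_t coll, int nranks) {
  double n = (double)nranks;
  switch (coll) {
    case COLL_ALL_REDUCE:     return 2.0 * (n - 1.0) / n;
    case COLL_ALL_GATHER:
    case COLL_REDUCE_SCATTER: return (n - 1.0) / n;
    case COLL_BROADCAST:
    case COLL_REDUCE:         return 1.0;
  }
  return 1.0; /* unknown collective: leave algbw unchanged */
}

/* Algorithm bandwidth in GB/s: bytes moved by the operation over elapsed time. */
double alg_bw_gbs(size_t bytes, double sec) {
  return (double)bytes / 1.0E9 / sec;
}

/* Bus bandwidth in GB/s: algbw scaled by the per-collective factor. */
double bus_bw_gbs(coll_t coll, size_t bytes, double sec, int nranks) {
  return alg_bw_gbs(bytes, sec) * bus_bw_factor(coll, nranks);
}
```

For instance, bus_bw_gbs(COLL_ALL_REDUCE, 1000000000, 1.0, 8) takes an algbw of 1 GB/s and scales it by 2*7/8, giving 1.75 GB/s. Because the factor removes the dependence on the number of ranks, the result can be compared against the hardware link bandwidth regardless of how many ranks took part.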