NCCL bug discussion

I was given some technical information regarding NCCL on NVIDIA bug 2079668; however, the link no longer works. Any suggestions on how I can find this bug?

Original link: https://developer.nvidia.com/nvidia_bug/2079668

There was only one public communication made on that bug. If you want, I can post it here.

Otherwise, if you intend to continue the discussion, note that the bug has already been closed internally. You can file a new bug referencing the old one if you wish.

There was a solution posted on that bug describing how to use multiple rings. Could you repost it here?

A solution with NCCL 2.1 is to use the environment variable NCCL_RINGS.

NCCL_RINGS can contain a string of ranks, with values ranging from 0 to n-1, where n is the number of GPUs in your communicator. The ranks can be separated by any non-digit character (such as " " or "-"), except "|".

Multiple rings can be specified, separated by the pipe character "|".

For example, in a communicator of 2 nodes, each node with 4 GPUs, you can form 3 rings by setting:
NCCL_RINGS="0 1 2 3 4 5 6 7 | 0 1 2 3 4 5 6 7 | 0 1 2 3 4 5 6 7"

Please note that NCCL_RINGS is direction-sensitive. For example, "0 1 2 3 4 5 6 7 | 7 6 5 4 3 2 1 0" would form two rings, one in each direction.
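
For reference, here is a minimal, hypothetical C sketch of how the variable might be set programmatically on a single node with 4 GPUs, forming two rings, one per direction. The ring string and device list are purely illustrative; NCCL only reads NCCL_RINGS when the communicator is created, so it must be set beforehand.

#include <stdio.h>
#include <stdlib.h>
#include <nccl.h>

int main(void) {
  /* Illustrative only: two rings over 4 local GPUs, one per direction.
     NCCL_RINGS must be set before the communicator is created. */
  setenv("NCCL_RINGS", "0 1 2 3 | 3 2 1 0", 1);

  int devs[4] = {0, 1, 2, 3};
  ncclComm_t comms[4];

  /* ncclCommInitAll creates one communicator per listed device
     within a single process. */
  ncclResult_t res = ncclCommInitAll(comms, 4, devs);
  if (res != ncclSuccess) {
    fprintf(stderr, "NCCL init failed: %s\n", ncclGetErrorString(res));
    return 1;
  }

  for (int i = 0; i < 4; i++)
    ncclCommDestroy(comms[i]);
  return 0;
}

Compile and link against NCCL (e.g., gcc nccl_rings.c -lnccl, assuming the NCCL headers and library are on your search paths). In most cases you would simply export the variable in your job script instead of calling setenv.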

https://docs.nvidia.com/deeplearning/sdk/nccl-developer-guide/index.html#ncclknobs