Ok, now it’s fine.
Meanwhile I started having a look at the NCCL2 documentation.
With NCCL1 (the GitHub version) I'm actually hitting the concurrency problem mentioned in that documentation (I've pasted the whole paragraph “Concurrency between NCCL and CUDA calls” below).
Do you have some sample code, or a more detailed post, where I can have a look at the proposed workaround?
The problem is really quite annoying and I need to find THE solution to it.
At the moment I'm using a CPU barrier (an MPI concept) to protect entry into the NCCL (“Nickel”) AllReduce.
But this workaround wastes time, since the CPU threads end up tightly synchronized as well.
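Concretely, my current workaround looks roughly like this (a minimal sketch; `comm`, `stream` and the buffer names stand in for my real ones):

```cpp
// Minimal sketch of the CPU-barrier workaround: every MPI rank blocks on
// the barrier, so all ranks enqueue the NCCL kernel together and no other
// CUDA call can slip in between the launches on different devices.
#include <mpi.h>
#include <nccl.h>
#include <cuda_runtime.h>

void all_reduce_with_barrier(const float* sendbuf, float* recvbuf,
                             size_t count, ncclComm_t comm,
                             cudaStream_t stream) {
  MPI_Barrier(MPI_COMM_WORLD);  // CPU-side synchronization point
  ncclAllReduce(sendbuf, recvbuf, count, ncclFloat, ncclSum, comm, stream);
}
```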
I'll wait for your comments on that.
Concurrency between NCCL and CUDA calls
NCCL uses CUDA kernels to perform inter-GPU communication. The NCCL kernels synchronize with each other; therefore, each kernel requires the kernels on the other GPUs to also be executed in order to complete. The application should therefore make sure that nothing prevents the NCCL kernels from being executed concurrently on the different devices of a NCCL communicator.
For example, let’s say you have a process that manages multiple CUDA devices and also features a thread which calls CUDA functions asynchronously. In this case, CUDA calls could be executed between the enqueuing of two NCCL kernels. The CUDA call may wait for the first NCCL kernel to complete and prevent the second one from being launched, causing a deadlock, since the first kernel will not complete until the second one is executed. To avoid this issue, one solution is to have a lock around the NCCL launch on multiple devices (around ncclGroupStart and ncclGroupEnd when using a single thread, or around the NCCL launch when using multiple threads, using thread synchronization if necessary) and to take this lock when calling CUDA from the asynchronous thread.
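If I read that correctly, the proposed lock would look something like the sketch below. This is only my interpretation, assuming the NCCL2 group API; `nccl_mutex`, `comms`, `streams` and the buffer names are hypothetical:

```cpp
// Sketch of the documented workaround as I understand it: one mutex
// serializes the grouped NCCL launch against any CUDA call coming from
// the asynchronous thread, so nothing can run between two NCCL launches.
#include <mutex>
#include <nccl.h>
#include <cuda_runtime.h>

std::mutex nccl_mutex;  // shared with the asynchronous thread

void launch_all_reduce(int nDev, float** sendbuf, float** recvbuf,
                       size_t count, ncclComm_t* comms,
                       cudaStream_t* streams) {
  std::lock_guard<std::mutex> guard(nccl_mutex);
  // All per-device NCCL kernels are enqueued back-to-back inside the group.
  ncclGroupStart();
  for (int i = 0; i < nDev; ++i)
    ncclAllReduce(sendbuf[i], recvbuf[i], count, ncclFloat, ncclSum,
                  comms[i], streams[i]);
  ncclGroupEnd();
}

void async_thread_work(float* devPtr, size_t bytes) {
  // The asynchronous thread takes the same lock before any CUDA call,
  // so it can never be scheduled between two NCCL kernel launches.
  std::lock_guard<std::mutex> guard(nccl_mutex);
  cudaMemsetAsync(devPtr, 0, bytes, 0);
}
```

Is this the intended pattern, or is there a reference implementation somewhere I could compare against?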