I recently installed JetPack 3.3 and I’m trying to install PyTorch. NVIDIA has been nice enough to provide wheels for Python 2.7 and Python 3.6, but I’m stuck with Python 3.5 because that’s the version this project requires. I’m trying to build PyTorch from source, but I keep running into trouble with NCCL.
I’ve tried disabling NCCL wherever I could (in the CMakeLists.txt and setup.py files), as many others have suggested. This is the output of my most recent failed build:
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/utils/tensor_apply.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/utils/tensor_conversion_dispatch.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/utils/tensor_dtypes.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/utils/tensor_flatten.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/utils/tensor_layouts.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/utils/tensor_list.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/utils/tensor_new.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/utils/tensor_numpy.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/utils/tensor_types.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/utils/tuple_parser.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/cuda/Module.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/cuda/Storage.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/cuda/Stream.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/cuda/utils.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/cuda/comm.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/cuda/python_comm.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/cuda/serialization.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/nn/THCUNN.cpp.o
[ 99%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/distributed/Module.cpp.o
[100%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/distributed/c10d/init.cpp.o
[100%] Building CXX object caffe2/torch/CMakeFiles/torch_python.dir/csrc/distributed/c10d/ddp.cpp.o
In file included from /home/nvidia/pytorch/torch/csrc/distributed/c10d/ddp.cpp:6:0:
/home/nvidia/pytorch/torch/csrc/cuda/nccl.h:8:18: fatal error: nccl.h: No such file or directory
compilation terminated.
caffe2/torch/CMakeFiles/torch_python.dir/build.make:1867: recipe for target 'caffe2/torch/CMakeFiles/torch_python.dir/csrc/distributed/c10d/ddp.cpp.o' failed
make[2]: *** [caffe2/torch/CMakeFiles/torch_python.dir/csrc/distributed/c10d/ddp.cpp.o] Error 1
make[2]: *** Waiting for unfinished jobs....
CMakeFiles/Makefile2:8019: recipe for target 'caffe2/torch/CMakeFiles/torch_python.dir/all' failed
make[1]: *** [caffe2/torch/CMakeFiles/torch_python.dir/all] Error 2
Makefile:138: recipe for target 'all' failed
make: *** [all] Error 2
Failed to run 'bash ../tools/build_pytorch_libs.sh --use-cuda --use-nnpack --use-qnnpack caffe2'
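The line that matters in the log above is the compiler's `fatal error`; everything after it is make unwinding. As a rough illustration only, here is a minimal Python sketch that pulls the first GCC-style fatal error out of a build log like this one. The helper name `first_fatal_error` and the regular expression are assumptions made for this example and are not part of PyTorch or its build scripts.

```python
import re

# Matches GCC-style diagnostics of the form
#   <file>:<line>:<col>: fatal error: <message>
# (illustrative pattern; other compilers format errors differently)
FATAL_RE = re.compile(
    r"^(?P<file>[^:\s]+):(?P<line>\d+):(?P<col>\d+): fatal error: (?P<msg>.+)$"
)


def first_fatal_error(log_text):
    """Return a dict for the first 'fatal error' line in the log, or None."""
    for raw in log_text.splitlines():
        match = FATAL_RE.match(raw.strip())
        if match:
            return {
                "file": match.group("file"),
                "line": int(match.group("line")),
                "message": match.group("msg"),
            }
    return None


if __name__ == "__main__":
    # Three lines copied from the build output above.
    sample = (
        "In file included from /home/nvidia/pytorch/torch/csrc/distributed/c10d/ddp.cpp:6:0:\n"
        "/home/nvidia/pytorch/torch/csrc/cuda/nccl.h:8:18: fatal error: nccl.h: No such file or directory\n"
        "compilation terminated.\n"
    )
    print(first_fatal_error(sample))
```

Run against the output above, this should point at torch/csrc/cuda/nccl.h line 8, which includes the system nccl.h while compiling distributed/c10d/ddp.cpp. So the failing target is the distributed (c10d) code, which still expects NCCL headers even with NCCL disabled elsewhere.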
I’ve installed all the necessary requirements using pip3, and my basic approach follows this script from Dustin, with the obvious substitutions of python3 and pip3 wherever necessary: "Install procedure for pyTorch on NVIDIA Jetson TX1/TX2 with JetPack <= 3.2.1. For JetPack 4.2 and Xavier/Nano/TX2, see https://devtalk.nvidia.com/default/topic/1049071/jetson-nano/pytorch-for-jetson-nano/" (GitHub).
In short: is there an easy way to install PyTorch with JetPack 3.3 and Python 3.5? I had much more success in the past with JetPack 3.1.
Best regards,
Shreyas