PyTorch Install problem (Solved)

Hello.
I am installing PyTorch on Xavier.
I am building from the source code by referring to


but the build failed.
It seems to be a problem with CUDA 10.
Is there a way to build it?

Hi,

Could you share the error log with us?
Thanks.

Hi,

We can build PyTorch from source successfully.

Here are the installation steps:

1. Install tools

$ sudo apt-get install python-pip cmake
$ pip install -U pip

2. Hack pip for Ubuntu 18.04
Edit the file '/usr/bin/pip':

diff --git a/pip b/pip
index 56bbb2b..62f26b9 100755
--- a/pip
+++ b/pip
@@ -6,6 +6,6 @@ import sys
 # Run the main entry point, similarly to how setuptools does it, but because
 # we didn't install the actual entry point from setup.py, don't use the
 # pkg_resources API.
-from pip import main
+from pip import __main__
 if __name__ == '__main__':
-    sys.exit(main())
+    sys.exit(__main__._main())
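
As an alternative to editing /usr/bin/pip, pip can usually be run through the interpreter with `python -m pip`, which bypasses the wrapper script entirely. This is not part of the steps above; it is a minimal sketch, and the helper names `pip_command` and `run_pip` are made up for illustration.

```python
import subprocess
import sys


def pip_command(*args):
    """Build an argv list that runs pip via the current interpreter."""
    # Equivalent to typing: python -m pip <args>
    return [sys.executable, "-m", "pip"] + list(args)


def run_pip(*args):
    """Run pip and return its exit code (0 means success)."""
    return subprocess.call(pip_command(*args))
```

For example, `run_pip("--version")` should print the pip version if pip is importable by that interpreter.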

3. Prepare PyTorch Source

$ git clone http://github.com/pytorch/pytorch
$ cd pytorch

Apply the CUDA 10.0 patch:

diff --git a/caffe2/utils/GpuDefs.cuh b/caffe2/utils/GpuDefs.cuh
index cf54f9e85..c69fcd38c 100644
--- a/caffe2/utils/GpuDefs.cuh
+++ b/caffe2/utils/GpuDefs.cuh
@@ -8,7 +8,7 @@ namespace caffe2 {
 // Static definition of GPU warp size for unrolling and code generation
 
 #ifdef __CUDA_ARCH__
-#if __CUDA_ARCH__ <= 700
+#if __CUDA_ARCH__ <= 730
 constexpr int kWarpSize = 32;
 #else
 #error Unknown __CUDA_ARCH__; please define parameters for compute capability
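
The patch matters because Xavier's integrated GPU has compute capability 7.2, so device code is compiled with __CUDA_ARCH__ equal to 720. That fails the original `<= 700` check and falls through to the #error. A small sketch of the arithmetic follows; the function names are illustrative only and are not part of PyTorch.

```python
def cuda_arch_macro(major, minor):
    """Value of __CUDA_ARCH__ for compute capability major.minor (7.2 -> 720)."""
    return major * 100 + minor * 10


def warp_size_defined(arch, limit):
    """Mirror the `#if __CUDA_ARCH__ <= limit` guard in GpuDefs.cuh."""
    return arch <= limit


xavier = cuda_arch_macro(7, 2)          # Xavier: compute capability 7.2
print(warp_size_defined(xavier, 700))   # False: the unpatched guard hits #error
print(warp_size_defined(xavier, 730))   # True: the patched guard defines kWarpSize
```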

4. Install dependencies

$ pip install scikit-build --user
$ pip install ninja --user

5. Build

$ git submodule update --init
$ sudo pip install -U setuptools
$ sudo pip install -r requirements.txt
$ python setup.py build_deps
$ sudo python setup.py develop

Thanks.

I am sorry for the late reply.

Thank you very much for explaining the installation method in detail.

Thank you !!

Hi,

We have also built pip wheels:

Python 2.7
Download the wheel file from here:

sudo apt-get install python-pip
pip install torch-1.0.0a0+8601b33-cp27-cp27mu-linux_aarch64.whl
pip install numpy

Python 3.6
Download the wheel file from here:

sudo apt-get install python3-pip
pip3 install torch-1.0.0a0+8601b33-cp36-cp36m-linux_aarch64.whl
pip3 install numpy
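
The wheel filenames encode what each file was built for: cp27 or cp36 is the CPython version, and linux_aarch64 is the platform, so a wheel only installs on the matching interpreter on a 64-bit ARM board. A minimal sketch of reading those tags is below, assuming the standard wheel naming scheme; `parse_wheel_tags` and `matches_interpreter` are hypothetical helpers, and the check deliberately ignores the ABI tag.

```python
import platform
import sys


def parse_wheel_tags(filename):
    """Split a wheel filename into (python_tag, abi_tag, platform_tag)."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: %s" % filename)
    stem = filename[:-len(".whl")]
    # The last three dash-separated fields are the python, ABI and platform tags.
    _, python_tag, abi_tag, platform_tag = stem.rsplit("-", 3)
    return python_tag, abi_tag, platform_tag


def matches_interpreter(filename):
    """Rough check: does the wheel target this Python version and CPU type?"""
    python_tag, _abi_tag, platform_tag = parse_wheel_tags(filename)
    wanted = "cp%d%d" % (sys.version_info[0], sys.version_info[1])
    machine = platform.machine()  # e.g. "aarch64" on a Jetson board
    return python_tag == wanted and bool(machine) and platform_tag.endswith(machine)
```

Calling `matches_interpreter("torch-1.0.0a0+8601b33-cp36-cp36m-linux_aarch64.whl")` under python3 on the board is a quick way to see whether pip3 is likely to accept the file.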

Thanks.

I built it successfully by your method.
Thank you for offering the wheel file.
Thanks!

Hi, could you please send me a PyTorch pip wheel to install on a TX2 with JetPack 3.3?
Thanks.

Hi, tinydao

You can use this script to build PyTorch from source automatically:
https://gist.github.com/dusty-nv/ef2b372301c00c0a9d3203e42fd83426

Thanks.

Hi, AastaLLL

I tried to use this source, but it showed an error:
nvidia@tegra-ubuntu:~/Downloads/pytorch-master$ python setup.py build_deps
fatal: Not a git repository (or any of the parent directories): .git
Building wheel torch-1.0.0a0
running build_deps
setup.py::build_deps::run()

+ SYNC_COMMAND=cp
++ command -v rsync
+ '[' -x /usr/bin/rsync ']'
+ SYNC_COMMAND='rsync -lptgoD'
+ USE_CUDA=1
+ USE_ROCM=0
+ USE_NNPACK=0
+ USE_MKLDNN=0
+ USE_GLOO_IBVERBS=0
+ CAFFE2_STATIC_LINK_CUDA=0
+ RERUN_CMAKE=0
+ [[ 8 -gt 0 ]]
+ case "$1" in
+ USE_CUDA=1
+ shift
+ [[ 7 -gt 0 ]]
+ case "$1" in
+ USE_NNPACK=1
+ shift
+ [[ 6 -gt 0 ]]
+ case "$1" in
+ break
+ CMAKE_INSTALL='make install'
+ BUILD_SHARED_LIBS=ON
+ USER_CFLAGS=
+ USER_LDFLAGS=
+ [[ -n '' ]]
+ [[ -n '' ]]
+ [[ -n '' ]]
++ uname
+ '[' Linux == Darwin ']'
+++ dirname …/tools/build_pytorch_libs.sh
++ cd …/tools/…
+++ pwd
++ printf '%q\n' /home/nvidia/Downloads/pytorch-master
+ BASE_DIR=/home/nvidia/Downloads/pytorch-master
+ TORCH_LIB_DIR=/home/nvidia/Downloads/pytorch-master/torch/lib
+ INSTALL_DIR=/home/nvidia/Downloads/pytorch-master/torch/lib/tmp_install
+ THIRD_PARTY_DIR=/home/nvidia/Downloads/pytorch-master/third_party
+ CMAKE_VERSION=cmake
+ C_FLAGS=' -I"/home/nvidia/Downloads/pytorch-master/torch/lib/tmp_install/include" -I"/home/nvidia/Downloads/pytorch-master/torch/lib/tmp_install/include/TH" -I"/home/nvidia/Downloads/pytorch-master/torch/lib/tmp_install/include/THC" -I"/home/nvidia/Downloads/pytorch-master/torch/lib/tmp_install/include/THS" -I"/home/nvidia/Downloads/pytorch-master/torch/lib/tmp_install/include/THCS" -I"/home/nvidia/Downloads/pytorch-master/torch/lib/tmp_install/include/THNN" -I"/home/nvidia/Downloads/pytorch-master/torch/lib/tmp_install/include/THCUNN"'
+ C_FLAGS=' -I"/home/nvidia/Downloads/pytorch-master/torch/lib/tmp_install/include" -I"/home/nvidia/Downloads/pytorch-master/torch/lib/tmp_install/include/TH" -I"/home/nvidia/Downloads/pytorch-master/torch/lib/tmp_install/include/THC" -I"/home/nvidia/Downloads/pytorch-master/torch/lib/tmp_install/include/THS" -I"/home/nvidia/Downloads/pytorch-master/torch/lib/tmp_install/include/THCS" -I"/home/nvidia/Downloads/pytorch-master/torch/lib/tmp_install/include/THNN" -I"/home/nvidia/Downloads/pytorch-master/torch/lib/tmp_install/include/THCUNN" -DOMPI_SKIP_MPICXX=1'
+ LDFLAGS='-L"/home/nvidia/Downloads/pytorch-master/torch/lib/tmp_install/lib" '
+ LD_POSTFIX=.so
++ uname
+ [[ Linux == \D\a\r\w\i\n ]]
+ [[ 0 -eq 1 ]]
+ LDFLAGS='-L"/home/nvidia/Downloads/pytorch-master/torch/lib/tmp_install/lib" -Wl,-rpath,$ORIGIN'
+ CPP_FLAGS=' -std=c++11 '
+ GLOO_FLAGS='-DBUILD_TEST=OFF '
+ THD_FLAGS=
+ NCCL_ROOT_DIR=/home/nvidia/Downloads/pytorch-master/torch/lib/tmp_install
+ [[ 1 -eq 1 ]]
+ GLOO_FLAGS+='-DUSE_CUDA=1 -DNCCL_ROOT_DIR=/home/nvidia/Downloads/pytorch-master/torch/lib/tmp_install'
+ [[ 0 -eq 1 ]]
+ CWRAP_FILES='/home/nvidia/Downloads/pytorch-master/torch/lib/ATen/Declarations.cwrap;/home/nvidia/Downloads/pytorch-master/torch/lib/THNN/generic/THNN.h;/home/nvidia/Downloads/pytorch-master/torch/lib/THCUNN/generic/THCUNN.h;/home/nvidia/Downloads/pytorch-master/torch/lib/ATen/nn.yaml'
+ CUDA_NVCC_FLAGS=' -I"/home/nvidia/Downloads/pytorch-master/torch/lib/tmp_install/include" -I"/home/nvidia/Downloads/pytorch-master/torch/lib/tmp_install/include/TH" -I"/home/nvidia/Downloads/pytorch-master/torch/lib/tmp_install/include/THC" -I"/home/nvidia/Downloads/pytorch-master/torch/lib/tmp_install/include/THS" -I"/home/nvidia/Downloads/pytorch-master/torch/lib/tmp_install/include/THCS" -I"/home/nvidia/Downloads/pytorch-master/torch/lib/tmp_install/include/THNN" -I"/home/nvidia/Downloads/pytorch-master/torch/lib/tmp_install/include/THCUNN" -DOMPI_SKIP_MPICXX=1'
+ [[ -z '' ]]
+ CUDA_DEVICE_DEBUG=0
+ '[' -z '' ']'
++ getconf _NPROCESSORS_ONLN
+ MAX_JOBS=6
+ BUILD_TYPE=Release
+ [[ -n '' ]]
+ [[ -n '' ]]
+ echo 'Building in Release mode'
Building in Release mode
+ mkdir -p /home/nvidia/Downloads/pytorch-master/torch/lib/tmp_install
+ for arg in '"$@"'
+ [[ nccl == \n\c\c\l ]]
+ pushd /home/nvidia/Downloads/pytorch-master/third_party
~/Downloads/pytorch-master/third_party ~/Downloads/pytorch-master/build
+ build_nccl
+ mkdir -p build/nccl
+ pushd build/nccl
~/Downloads/pytorch-master/third_party/build/nccl ~/Downloads/pytorch-master/third_party ~/Downloads/pytorch-master/build
+ [[ 0 -eq 1 ]]
+ '[' '!' -f CMakeCache.txt ']'
+ make install -j6
[100%] Built target nccl
Install the project…
-- Install configuration: "Release"
+ mkdir -p /home/nvidia/Downloads/pytorch-master/torch/lib/tmp_install/lib
+ find lib -name 'libnccl.so*'
+ xargs -I '{}' rsync -lptgoD '{}' /home/nvidia/Downloads/pytorch-master/torch/lib/tmp_install/lib/
+ '[' '!' -f /home/nvidia/Downloads/pytorch-master/torch/lib/tmp_install/lib/libnccl.so ']'
+ popd
~/Downloads/pytorch-master/third_party ~/Downloads/pytorch-master/build
+ popd
~/Downloads/pytorch-master/build
+ for arg in '"$@"'
+ [[ caffe2 == \n\c\c\l ]]
+ [[ caffe2 == \g\l\o\o ]]
+ [[ caffe2 == \c\a\f\f\e\2 ]]
+ build_caffe2
+ [[ -z '' ]]
+ EXTRA_CAFFE2_CMAKE_FLAGS=()
+ [[ -n '' ]]
+ [[ -n /usr/lib/python2.7/dist-packages ]]
+ EXTRA_CAFFE2_CMAKE_FLAGS+=("-DCMAKE_PREFIX_PATH=$CMAKE_PREFIX_PATH")
+ [[ 0 -eq 1 ]]
+ '[' '!' -f CMakeCache.txt ']'
+ '[' -f /home/nvidia/Downloads/pytorch-master/torch/lib/tmp_install/lib/libnccl.so ']'
+ '[' '!' -f lib/libnccl.so.1 ']'
+ make install -j6
make: *** No rule to make target 'install'. Stop.
Failed to run 'bash …/tools/build_pytorch_libs.sh --use-cuda --use-nnpack nccl caffe2 libshm gloo c10d THD'

Hi,

Based on your log, something went wrong when you cloned the repository:

nvidia@tegra-ubuntu:~/Downloads/pytorch-master$ python setup.py build_deps
fatal: Not a git repository (or any of the parent directories): .git

Please try cloning the repository again.
Alternatively, you can install PyTorch directly with the package in comment #5.
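
The "Not a git repository" message, together with the pytorch-master directory name, suggests the source may have come from a downloaded archive rather than from git clone. That is an inference from the log, not something confirmed in this thread. In that case the submodule directories under third_party would be empty. A minimal pre-build check is sketched below; `is_git_checkout` and `empty_dirs` are hypothetical helper names.

```python
import os


def is_git_checkout(path):
    """True if `path` has a .git entry, i.e. it looks like a git working tree."""
    return os.path.exists(os.path.join(path, ".git"))


def empty_dirs(path):
    """Names of immediate subdirectories of `path` that contain nothing."""
    if not os.path.isdir(path):
        return []
    return sorted(
        name for name in os.listdir(path)
        if os.path.isdir(os.path.join(path, name))
        and not os.listdir(os.path.join(path, name))
    )
```

Run from the source root, `is_git_checkout(".")` should be True and `empty_dirs("third_party")` should be an empty list after `git submodule update --init` has finished.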

Thanks.

In addition to installing the provided .whl file, I had to run

pip3 install torchvision

Only then could I verify that torch was installed, using import torch.
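
A slightly fuller version of that check, as a minimal sketch: it reports whether torch imports at all, and if so its version and whether CUDA is visible. The function name `check_torch` is made up for illustration.

```python
import importlib


def check_torch():
    """Return a small report on whether torch imports and sees CUDA."""
    try:
        torch = importlib.import_module("torch")
    except ImportError as exc:
        # Covers both a missing package and a missing shared library.
        return {"installed": False, "error": str(exc)}
    return {
        "installed": True,
        "version": torch.__version__,
        "cuda": torch.cuda.is_available(),
    }


print(check_torch())
```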

Hi @AastaLLL, I am using CUDA 10, but the https://pytorch.org/get-started/locally/ website has no version available for CUDA 10. What should I do? Please guide me. Thanks.

Hi, AastaLLL,

Thanks for your reply. Now it shows a new error.
Do you know why?

Thanks!

nvidia@tegra-ubuntu:~/pytorch$ python setup.py build_deps
Building wheel torch-1.0.0a0+0e44db8
running build_deps
setup.py::build_deps::run()

+ SYNC_COMMAND=cp
++ command -v rsync
+ '[' -x /usr/bin/rsync ']'
+ SYNC_COMMAND='rsync -lptgoD'

[ 80%] Building CXX object caffe2/CMakeFiles/caffe2_gpu.dir/operators/tensor_protos_db_input_gpu.cc.o
[ 80%] Building CXX object caffe2/CMakeFiles/caffe2_gpu.dir/operators/while_op_gpu.cc.o
[ 80%] Building CXX object caffe2/CMakeFiles/caffe2_gpu.dir/operators/zero_gradient_op_gpu.cc.o
[ 80%] Building CXX object caffe2/CMakeFiles/caffe2_gpu.dir/operators/rnn/recurrent_op_cudnn.cc.o
[ 80%] Building CXX object caffe2/CMakeFiles/caffe2_gpu.dir/operators/rnn/recurrent_network_blob_fetcher_op_gpu.cc.o
[ 80%] Building CXX object caffe2/CMakeFiles/caffe2_gpu.dir/operators/rnn/recurrent_network_executor_gpu.cc.o
[ 80%] Building CXX object caffe2/CMakeFiles/caffe2_gpu.dir/queue/queue_ops_gpu.cc.o
[ 80%] Building CXX object caffe2/CMakeFiles/caffe2_gpu.dir/sgd/iter_op_gpu.cc.o
[ 80%] Building CXX object caffe2/CMakeFiles/caffe2_gpu.dir/sgd/learning_rate_op_gpu.cc.o
[ 80%] Linking CXX shared library …/lib/libcaffe2_gpu.so
/usr/local/cuda/lib64/libcudnn.so.7: error adding symbols: File in wrong format
collect2: error: ld returned 1 exit status
caffe2/CMakeFiles/caffe2_gpu.dir/build.make:5942: recipe for target 'lib/libcaffe2_gpu.so' failed
make[2]: *** [lib/libcaffe2_gpu.so] Error 1
CMakeFiles/Makefile2:3824: recipe for target 'caffe2/CMakeFiles/caffe2_gpu.dir/all' failed
make[1]: *** [caffe2/CMakeFiles/caffe2_gpu.dir/all] Error 2
Makefile:138: recipe for target 'all' failed
make: *** [all] Error 2
Failed to run 'bash …/tools/build_pytorch_libs.sh --use-cuda --use-nnpack nccl caffe2 libshm gloo c10d THD'

Hi,

Are you using the Jetson AGX platform?
If so, the package shared in comment #5 is built with CUDA 10.0 for the Jetson platform.

Thanks.

Hi,

You may have hit an incompatibility issue when compiling.
Could you clean the 'build' folder and try again?

Thanks.

I cleaned the build folder and retried the install, but it still does not work. Is there anything wrong with cuDNN?
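
The linker's "File in wrong format" message often indicates that the library was built for a different CPU architecture than the one being linked, for example an x86_64 libcudnn on an aarch64 board. Whether that is the cause here is not confirmed in the thread. A minimal sketch for checking it reads the machine field from the file's ELF header; `elf_machine` is a hypothetical helper and only recognizes the two architectures listed.

```python
import struct

# e_machine values from the ELF specification.
EM_NAMES = {62: "x86_64", 183: "aarch64"}


def elf_machine(path):
    """Return the target architecture recorded in an ELF file's header."""
    with open(path, "rb") as f:
        header = f.read(20)
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("%s is not an ELF file" % path)
    # EI_DATA (byte 5): 1 = little-endian, 2 = big-endian.
    endian = "<" if header[5:6] == b"\x01" else ">"
    # e_machine is the 16-bit field at offset 18.
    (machine,) = struct.unpack(endian + "H", header[18:20])
    return EM_NAMES.get(machine, "unknown (%d)" % machine)
```

On the board, `elf_machine("/usr/local/cuda/lib64/libcudnn.so.7")` would be expected to return "aarch64"; any other result would point to a mismatched cuDNN install.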

Thank you, downloading and installing the whl worked for me!

Hi,

Here are the packages for PyTorch on JetPack 4.1.1:

python2.7
https://drive.google.com/open?id=1F-P1w6s2s8teFxcy25rRFfie_9CAyLcX

python3.6
https://drive.google.com/open?id=1yepdDCjqcoGARir9GqXAkpXkAyd6HASF

Thanks.

Thank you so much AastaLLL!

I tried to use the .whl to install PyTorch, but I’m still running into issues. I just flashed my Jetson Xavier with JetPack-L4T-4.1.1-linux-x64_b57.run. I installed pip for Python 3.6 with

sudo apt-get install python3-pip

I then downloaded the Python 3.6 whl from comment #18 and ran

pip3 install torch-1.0.0a0+7a65461-cp36-cp36m-linux_aarch64.whl

which ran without any issues. Now when I start an instance of python and import torch, I get the following error:

python3
Python 3.6.7 (default, Oct 22 2018, 11:32:17) 
[GCC 8.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/nvidia/.local/lib/python3.6/site-packages/torch/__init__.py", line 84, in <module>
    from torch._C import *
ImportError: libtorch.so.1: cannot open shared object file: No such file or directory

Any suggestions? Thank you!!
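
One way to narrow down this ImportError is to check whether any libtorch.so file was actually installed alongside the package. The sketch below is diagnostic only and does not fix anything; `find_libs` is a hypothetical helper name.

```python
import os


def find_libs(root, prefix="libtorch.so"):
    """Return sorted paths under `root` whose file name starts with `prefix`."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.startswith(prefix):
                hits.append(os.path.join(dirpath, name))
    return sorted(hits)
```

Using the path from the traceback, `find_libs("/home/nvidia/.local/lib/python3.6/site-packages/torch")` lists what is present. An empty list, or a libtorch.so without the .1 suffix, would at least show what the loader is failing to find.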