Hi,
When I try to install PyTorch from source, following the instructions in “PyTorch for Jetson”, I get the following error:
running build_ext
-- Building with NumPy bindings
-- Not using cuDNN
-- Not using MIOpen
-- Detected CUDA at /usr/local/cuda
-- Not using MKLDNN
-- Not using NCCL
-- Building without distributed package
Copying extension caffe2.python.caffe2_pybind11_state
Copying caffe2.python.caffe2_pybind11_state from torch/lib/python3/dist-packages/caffe2/python/caffe2_pybind11_state.cpython-36m-aarch64-linux-gnu.so to /home/nvidia/dl/PyTorch/pytorch/build/lib.linux-aarch64-3.6/caffe2/python/caffe2_pybind11_state.cpython-36m-aarch64-linux-gnu.so
Copying extension caffe2.python.caffe2_pybind11_state_gpu
torch/lib/python3/dist-packages/caffe2/python/caffe2_pybind11_state_gpu.cpython-36m-aarch64-linux-gnu.so does not exist
building 'torch._C' extension
aarch64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fdebug-prefix-map=/build/python3.6-M4RXmS/python3.6-3.6.5=. -specs=/usr/share/dpkg/no-pie-compile.specs -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/usr/include/python3.6m -c torch/csrc/stub.cpp -o build/temp.linux-aarch64-3.6/torch/csrc/stub.o -std=c++11 -Wall -Wextra -Wno-strict-overflow -Wno-unused-parameter -Wno-missing-field-initializers -Wno-write-strings -Wno-unknown-pragmas -Wno-deprecated-declarations -fno-strict-aliasing -Wno-missing-braces
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
aarch64-linux-gnu-g++ -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -specs=/usr/share/dpkg/no-pie-link.specs -Wl,-z,relro -Wl,-Bsymbolic-functions -specs=/usr/share/dpkg/no-pie-link.specs -Wl,-z,relro -g -fdebug-prefix-map=/build/python3.6-M4RXmS/python3.6-3.6.5=. -specs=/usr/share/dpkg/no-pie-compile.specs -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 build/temp.linux-aarch64-3.6/torch/csrc/stub.o -L/home/nvidia/dl/PyTorch/pytorch/torch/lib -L/usr/local/cuda/lib64 -lshm -ltorch_python -o build/lib.linux-aarch64-3.6/torch/_C.cpython-36m-aarch64-linux-gnu.so -Wl,--no-as-needed /home/nvidia/dl/PyTorch/pytorch/torch/lib/libcaffe2_gpu.so -Wl,--as-needed -Wl,-rpath,$ORIGIN/lib
aarch64-linux-gnu-g++: error: /home/nvidia/dl/PyTorch/pytorch/torch/lib/libcaffe2_gpu.so: No such file or directory
error: command 'aarch64-linux-gnu-g++' failed with exit status 1
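The link step fails because libcaffe2_gpu.so is not in torch/lib, which may mean the CUDA part of the build was skipped or failed earlier. Here is a minimal Python sketch for checking which libraries are present before the link step. The library names are taken from the linker command in the log above; the helper name `missing_libs` is hypothetical, and the path should be adjusted to your own checkout.

```python
from pathlib import Path

# Library names taken from the linker command in the log above
# (-lshm, -ltorch_python, and the explicit libcaffe2_gpu.so path).
EXPECTED_LIBS = ("libshm.so", "libtorch_python.so", "libcaffe2_gpu.so")


def missing_libs(lib_dir, expected=EXPECTED_LIBS):
    """Return the expected library file names that are absent from lib_dir."""
    lib_dir = Path(lib_dir)
    return [name for name in expected if not (lib_dir / name).is_file()]


if __name__ == "__main__":
    # Path copied from the log; replace it with your own build directory.
    print(missing_libs("/home/nvidia/dl/PyTorch/pytorch/torch/lib"))
```

If libcaffe2_gpu.so shows up in the result, the earlier CMake/compile output is the place to look for the real failure, since the final link command only reports that the file is absent.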
The post you are referring to is for the Jetson family. The Drive products target different use cases and ship a different software stack.
It seems that the PyTorch 1.5.0 whl file depends on cuDNN 8.
That version is not installed on the Drive with the current latest release, Drive Software 10. Note: it is not recommended to manually overwrite what is delivered as part of the package flashed on the Drive AGX.
So I assume you didn’t use a whl file.
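To see which cuDNN version a system actually has, one option is to read the version macros from the cuDNN header. Here is a minimal sketch, assuming the header defines `CUDNN_MAJOR`, `CUDNN_MINOR` and `CUDNN_PATCHLEVEL` (these live in cudnn.h for cuDNN 7 and in cudnn_version.h for cuDNN 8). The helper name `parse_cudnn_version` is hypothetical, and the sample text uses illustrative values, not values read from a Drive system.

```python
import re


def parse_cudnn_version(header_text):
    """Extract (major, minor, patch) from cuDNN header text, or None if absent."""
    values = []
    for macro in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(
            r"^\s*#\s*define\s+" + macro + r"\s+(\d+)", header_text, re.MULTILINE
        )
        if match is None:
            return None
        values.append(int(match.group(1)))
    return tuple(values)


# Illustrative header text only; on a real system, read the header file instead.
sample = "#define CUDNN_MAJOR 7\n#define CUDNN_MINOR 6\n#define CUDNN_PATCHLEVEL 5\n"
print(parse_cudnn_version(sample))  # (7, 6, 5)
```

The header location differs between installations, so the file path is left to the reader; pass the file contents to the function as a string.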
Please provide the following information:
What is your hardware platform?
What is the software version you are using?
What are you trying to achieve?
Did you apply a patch to the version once you cloned it? Which version did you clone?
Can you please describe the whole process you have executed?
What is your hardware platform?
The device is Xavier A in the DRIVE AGX Xavier™ Developer Kit.
What is the software version you are using?
DRIVE Software 10
What are you trying to achieve?
I am trying to install PyTorch 1.1.0, which needs to be compatible with the CUDA 10.2 in DRIVE Software 10.0.
Did you apply a patch to the version once you cloned it? Which version did you clone?
I didn’t apply any patches, and the branch is v1.1.0.
Can you please describe the whole process you have executed?
I just followed the instructions for Jetson (PyTorch for Jetson) but skipped the “apply patches” step.
Before following the instructions, execute the following commands:
sudo dpkg -i /var/cuda-repo-10-2-local-10.2.19/cuda-toolkit-10-2_10.2.19-1_arm64.deb
(this will output an error)
sudo apt --fix-broken -y install
(this will install all the necessary include files; the Drive platform is meant for remote rather than local development, so the include files are not normally needed there)
Then apply a patch to the version as follows:
in file aten/src/ATen/native/quantized/cpu/qnnpack/src/q8gemm/8x8-dq-aarch64-neon.S (line 662)
With these changes, the build completed and the wheel was created on my side. The “Verification” phase also passed.
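The exact checks in the “Verification” step are not quoted in this thread. Below is a sketch of the kind of check that is typically meant, assuming the wheel has been installed into the current Python environment. The helper name `verify_torch` is hypothetical; it only uses `torch.__version__`, `torch.cuda.is_available()` and `torch.backends.cudnn.version()`.

```python
def verify_torch():
    """Return basic facts about the installed torch build, or None if torch is missing."""
    try:
        import torch
    except ImportError:
        return None
    return {
        "version": torch.__version__,
        # False means the build cannot see a usable CUDA device.
        "cuda_available": torch.cuda.is_available(),
        # None when the build was made without cuDNN.
        "cudnn_version": torch.backends.cudnn.version(),
    }


print(verify_torch())
```

Since the build log in the first post reports “Not using cuDNN”, a `cudnn_version` of None would be consistent with that build configuration and not necessarily a sign of a broken install.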
NOTE: Please take into account that PyTorch is not optimized for the Drive platform. Using TensorRT is the right way to best utilize the hardware accelerators of the Drive AGX and thereby get the best DNN performance.
Hi,
Thanks for your detailed reply.
Yesterday, when I used aptitude to resolve the dependencies, I didn’t realize that it could remove many packages and corrupt the system. So I had to re-flash DRIVE Software 10.0 to the AGX device. After the flash, I found that pip3 was not installed, and “sudo apt-get install python3-pip” also runs into dependency problems:
The following packages have unmet dependencies:
 python3-pip : Depends: python-pip-whl (= 9.0.1-2) but 9.0.1-2.3~ubuntu1.18.04.1 is to be installed
               Recommends: python3-dev (>= 3.2) but it is not going to be installed
               Recommends: python3-wheel but it is not going to be installed
E: Unable to correct problems, you have held broken packages.
The dependency problem can be resolved with aptitude, but its proposed solutions involve removing or downgrading packages, and I’m not sure whether that will affect the integrity of the system.
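To make the version conflict easier to read before accepting any aptitude proposal, here is a small sketch that pulls the package, the required version constraint and apt's stated reason out of the error text above. The helper name `parse_unmet` is hypothetical, and the pattern is written only against the message format shown in this post.

```python
import re

# Matches lines such as:
#   "Depends: python-pip-whl (= 9.0.1-2) but 9.0.1-2.3~ubuntu1.18.04.1 is to be installed"
UNMET = re.compile(
    r"(?P<kind>Depends|Recommends): (?P<pkg>\S+)"
    r"(?: \((?P<constraint>[^)]+)\))?"
    r" but (?P<reason>.+)$"
)


def parse_unmet(output):
    """Return (kind, package, constraint, reason) tuples found in apt's error text."""
    rows = []
    for line in output.splitlines():
        match = UNMET.search(line)
        if match:
            rows.append(
                (
                    match.group("kind"),
                    match.group("pkg"),
                    match.group("constraint"),  # None when no version is required
                    match.group("reason").strip(),
                )
            )
    return rows


apt_output = """The following packages have unmet dependencies:
 python3-pip : Depends: python-pip-whl (= 9.0.1-2) but 9.0.1-2.3~ubuntu1.18.04.1 is to be installed
               Recommends: python3-dev (>= 3.2) but it is not going to be installed
               Recommends: python3-wheel but it is not going to be installed
"""
for row in parse_unmet(apt_output):
    print(row)
```

For the output in this post, the only hard conflict is the `Depends` row: python3-pip requires python-pip-whl at exactly 9.0.1-2, while apt wants to install 9.0.1-2.3~ubuntu1.18.04.1.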