PyTorch for Jetson

ddbabich · November 24, 2021, 12:37am

Actually I just noticed the build from scratch instructions. I will give that a try and if I have success I will post the wheel. If not I will try out the containers.

realimposter · November 25, 2021, 3:53am

Hi,

I am a user that is running a program that requires both torch and torchvision.

So first, I tried the installation as above. Torch installed fine, and I ran the tests, and it worked. However, torchvision, after running the commands above, can be imported, but has no version.

Precisely, my steps were to navigate to the project directory, then create a virtualenv, activate that virtualenv, then create a folder called req_folder, cd into req_folder, then install torchvision with exactly the instructions above. I think I tried importing like this, but it didn’t work, so I copied the torchvision folder after install into my virtualenv site-packages folder. I could then import torchvision, but there is no version attribute.

There is also a functional issue: my program cannot use torchvision.transforms either, so it appears the install is broken. And yes, I retried with the installation instructions above, making sure the versions are right.

Also, I tried installing torch and torchvision from PyPI. It appears that they have cp36, manylinux and aarch64 support. However, I could not get my CUDA devices to be available. Is there a way to get those packages to run, or are they incompatible with Jetson despite being aarch64?

realimposter · November 25, 2021, 3:56am

Oh the issue appears to be solved by downloading torchvision directly into the site-packages folders, and installing without --user. Can someone explain why this is the case?
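When an import works but the package looks incomplete, it can help to check which copy Python is actually loading. The sketch below is only an illustration: describe_package is a hypothetical helper name, not part of torchvision. It reports the file a package would be imported from and its __version__, if it has one. One possible explanation for a missing version (an assumption, not confirmed in this thread) is that a folder copied by hand lacks files that the install step generates.

```python
import importlib
import importlib.util


def describe_package(name):
    """Report where `name` is imported from and its __version__, if any.

    `describe_package` is a hypothetical helper for illustration only.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        # Nothing importable under this name on sys.path.
        return {"name": name, "found": False, "origin": None, "version": None}
    module = importlib.import_module(name)
    return {
        "name": name,
        "found": True,
        # Path of the file that was loaded (None for namespace packages).
        "origin": spec.origin,
        # Many packages expose __version__, but it is not guaranteed.
        "version": getattr(module, "__version__", None),
    }
```

Running describe_package("torchvision") inside the activated virtualenv shows whether the import resolves to the virtualenv's site-packages, a --user location, or a source folder.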

1114265484 · November 26, 2021, 7:15am

How did you solve this problem? Have you been waiting? Looking forward to your reply

harrison-matt · November 28, 2021, 6:49pm

On both Xavier and Nano Jetpack 4.6 I am trying to build PyTorch 1.6.0 from the source and am getting an error at the same spot starting here:

[1956/4009] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/qclamp.cpp.o
FAILED: caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/qclamp.cpp.o
/usr/bin/c++  -DCPUINFO_SUPPORTED_PLATFORM=1 -DFMT_HEADER_ONLY=1 -DFXDIV_USE_INLINE_ASSEMBLY=0 -DHAVE_MALLOC_USABLE_SIZE=1 -DHAVE_MMAP=1 -DHAVE_SHM_OPEN=1 -DHAVE_SHM_UNLINK=1 -DMINIZ_DISABLE_ZIP_READER_CRC32_CHECKS -DNNP_CONVOLUTION_ONLY=0 -DNNP_INFERENCE_ONLY=0 -DONNXIFI_ENABLE_EXT=1 -DONNX_ML=1 -DONNX_NAMESPACE=onnx_torch -DUSE_EXTERNAL_MZCRC -D_FILE_OFFSET_BITS=64 -Dtorch_cpu_EXPORTS -Iaten/src -I../aten/src -I. -I../ -isystem ../cmake/../third_party/googletest/googlemock/include -isystem ../cmake/../third_party/googletest/googletest/include -isystem ../third_party/protobuf/src -isystem ../third_party/XNNPACK/include -I../cmake/../third_party/benchmark/include -isystem ../cmake/../third_party/eigen -isystem /home/jetson/.pyenv/versions/3.7.4/include/python3.7m -isystem /home/jetson/.virtualenvs/hetseq/lib/python3.7/site-packages/numpy/core/include -isystem ../cmake/../third_party/pybind11/include -isystem ../cmake/../third_party/cub -Icaffe2/contrib/aten -I../third_party/onnx -Ithird_party/onnx -I../third_party/foxi -Ithird_party/foxi -I/usr/local/cuda/include -I../torch/csrc/api -I../torch/csrc/api/include -I../caffe2/aten/src/TH -Icaffe2/aten/src/TH -Icaffe2/aten/src -Icaffe2/../aten/src -Icaffe2/../aten/src/ATen -I../torch/csrc -I../third_party/miniz-2.0.8 -I../aten/src/TH -I../aten/../third_party/catch/single_include -I../aten/src/ATen/.. -Icaffe2/aten/src/ATen -I../caffe2/core/nomnigraph/include -isystem include -I../third_party/FXdiv/include -I../c10/.. 
-I../third_party/pthreadpool/include -I../third_party/cpuinfo/include -I../third_party/NNPACK/include -I../third_party/FP16/include -I../third_party/fmt/include -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_XNNPACK -DUSE_VULKAN_WRAPPER -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow -O3 -DNDEBUG -DNDEBUG -fPIC   -DCUDA_HAS_FP16=1 -D__NEON__ -DUSE_GCC_GET_CPUID -DTH_HAVE_THREAD -Wall -Wextra -Wno-unused-parameter -Wno-missing-field-initializers -Wno-write-strings -Wno-unknown-pragmas -Wno-missing-braces -Wno-maybe-uninitialized -fvisibility=hidden -O2 -fopenmp -DCAFFE2_BUILD_MAIN_LIB -pthread -std=gnu++14 -MD -MT caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/qclamp.cpp.o -MF caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/qclamp.cpp.o.d -o caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/qclamp.cpp.o -c ../aten/src/ATen/native/quantized/cpu/qclamp.cpp
In file included from ../aten/src/ATen/cpu/vec256/vec256.h:10:0,
                 from ../aten/src/ATen/native/cpu/Loops.h:35,
                 from ../aten/src/ATen/native/quantized/cpu/qclamp.cpp:5:
../aten/src/ATen/cpu/vec256/vec256_float_neon.h:262:3: warning: type qualifiers ignored on function return type [-Wignored-qualifiers]
   const float operator[](int idx) const {
   ^~~~~
../aten/src/ATen/cpu/vec256/vec256_float_neon.h:267:3: warning: type qualifiers ignored on function return type [-Wignored-qualifiers]
   const float operator[](int idx) {
   ^~~~~
../aten/src/ATen/cpu/vec256/vec256_float_neon.h: In static member function ‘static at::vec256::{anonymous}::Vec256<float> at::vec256::{anonymous}::Vec256<float>::loadu(const void*, int64_t)’:
../aten/src/ATen/cpu/vec256/vec256_float_neon.h:213:14: error: ‘vld1q_f32_x2’ was not declared in this scope
       return vld1q_f32_x2(reinterpret_cast<const float*>(ptr));
              ^~~~~~~~~~~~
../aten/src/ATen/cpu/vec256/vec256_float_neon.h:213:14: note: suggested alternative: ‘vld1q_f32’
       return vld1q_f32_x2(reinterpret_cast<const float*>(ptr));
              ^~~~~~~~~~~~
              vld1q_f32
../aten/src/ATen/cpu/vec256/vec256_float_neon.h:230:14: error: ‘vld1q_f32_x2’ was not declared in this scope
       return vld1q_f32_x2(reinterpret_cast<const float*>(tmp_values));
              ^~~~~~~~~~~~
../aten/src/ATen/cpu/vec256/vec256_float_neon.h:230:14: note: suggested alternative: ‘vld1q_f32’
       return vld1q_f32_x2(reinterpret_cast<const float*>(tmp_values));
              ^~~~~~~~~~~~
              vld1q_f32
../aten/src/ATen/cpu/vec256/vec256_float_neon.h: In member function ‘void at::vec256::{anonymous}::Vec256<float>::store(void*, int64_t) const’:
../aten/src/ATen/cpu/vec256/vec256_float_neon.h:235:7: error: ‘vst1q_f32_x2’ was not declared in this scope
       vst1q_f32_x2(reinterpret_cast<float*>(ptr), values);
       ^~~~~~~~~~~~
../aten/src/ATen/cpu/vec256/vec256_float_neon.h:235:7: note: suggested alternative: ‘vst1q_f32’
       vst1q_f32_x2(reinterpret_cast<float*>(ptr), values);
       ^~~~~~~~~~~~
       vst1q_f32
../aten/src/ATen/cpu/vec256/vec256_float_neon.h:242:7: error: ‘vst1q_f32_x2’ was not declared in this scope
       vst1q_f32_x2(reinterpret_cast<float*>(tmp_values), values);
       ^~~~~~~~~~~~
../aten/src/ATen/cpu/vec256/vec256_float_neon.h:242:7: note: suggested alternative: ‘vst1q_f32’
       vst1q_f32_x2(reinterpret_cast<float*>(tmp_values), values);
       ^~~~~~~~~~~~
       vst1q_f32
[1957/4009] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/qadd.cpp.o
FAILED: caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/qadd.cpp.o
/usr/bin/c++  -DCPUINFO_SUPPORTED_PLATFORM=1 -DFMT_HEADER_ONLY=1 -DFXDIV_USE_INLINE_ASSEMBLY=0 -DHAVE_MALLOC_USABLE_SIZE=1 -DHAVE_MMAP=1 -DHAVE_SHM_OPEN=1 -DHAVE_SHM_UNLINK=1 -DMINIZ_DISABLE_ZIP_READER_CRC32_CHECKS -DNNP_CONVOLUTION_ONLY=0 -DNNP_INFERENCE_ONLY=0 -DONNXIFI_ENABLE_EXT=1 -DONNX_ML=1 -DONNX_NAMESPACE=onnx_torch -DUSE_EXTERNAL_MZCRC -D_FILE_OFFSET_BITS=64 -Dtorch_cpu_EXPORTS -Iaten/src -I../aten/src -I. -I../ -isystem ../cmake/../third_party/googletest/googlemock/include -isystem ../cmake/../third_party/googletest/googletest/include -isystem ../third_party/protobuf/src -isystem ../third_party/XNNPACK/include -I../cmake/../third_party/benchmark/include -isystem ../cmake/../third_party/eigen -isystem /home/jetson/.pyenv/versions/3.7.4/include/python3.7m -isystem /home/jetson/.virtualenvs/hetseq/lib/python3.7/site-packages/numpy/core/include -isystem ../cmake/../third_party/pybind11/include -isystem ../cmake/../third_party/cub -Icaffe2/contrib/aten -I../third_party/onnx -Ithird_party/onnx -I../third_party/foxi -Ithird_party/foxi -I/usr/local/cuda/include -I../torch/csrc/api -I../torch/csrc/api/include -I../caffe2/aten/src/TH -Icaffe2/aten/src/TH -Icaffe2/aten/src -Icaffe2/../aten/src -Icaffe2/../aten/src/ATen -I../torch/csrc -I../third_party/miniz-2.0.8 -I../aten/src/TH -I../aten/../third_party/catch/single_include -I../aten/src/ATen/.. -Icaffe2/aten/src/ATen -I../caffe2/core/nomnigraph/include -isystem include -I../third_party/FXdiv/include -I../c10/.. 
-I../third_party/pthreadpool/include -I../third_party/cpuinfo/include -I../third_party/NNPACK/include -I../third_party/FP16/include -I../third_party/fmt/include -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_XNNPACK -DUSE_VULKAN_WRAPPER -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow -O3 -DNDEBUG -DNDEBUG -fPIC   -DCUDA_HAS_FP16=1 -D__NEON__ -DUSE_GCC_GET_CPUID -DTH_HAVE_THREAD -Wall -Wextra -Wno-unused-parameter -Wno-missing-field-initializers -Wno-write-strings -Wno-unknown-pragmas -Wno-missing-braces -Wno-maybe-uninitialized -fvisibility=hidden -O2 -fopenmp -DCAFFE2_BUILD_MAIN_LIB -pthread -std=gnu++14 -MD -MT caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/qadd.cpp.o -MF caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/qadd.cpp.o.d -o caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/quantized/cpu/qadd.cpp.o -c ../aten/src/ATen/native/quantized/cpu/qadd.cpp
In file included from ../aten/src/ATen/cpu/vec256/vec256.h:10:0,
                 from ../aten/src/ATen/native/quantized/cpu/qadd.cpp:3:
../aten/src/ATen/cpu/vec256/vec256_float_neon.h:262:3: warning: type qualifiers ignored on function return type [-Wignored-qualifiers]
   const float operator[](int idx) const {
   ^~~~~
../aten/src/ATen/cpu/vec256/vec256_float_neon.h:267:3: warning: type qualifiers ignored on function return type [-Wignored-qualifiers]
   const float operator[](int idx) {
   ^~~~~
../aten/src/ATen/cpu/vec256/vec256_float_neon.h: In static member function ‘static at::vec256::{anonymous}::Vec256<float> at::vec256::{anonymous}::Vec256<float>::loadu(const void*, int64_t)’:
../aten/src/ATen/cpu/vec256/vec256_float_neon.h:213:14: error: ‘vld1q_f32_x2’ was not declared in this scope
       return vld1q_f32_x2(reinterpret_cast<const float*>(ptr));
              ^~~~~~~~~~~~
../aten/src/ATen/cpu/vec256/vec256_float_neon.h:213:14: note: suggested alternative: ‘vld1q_f32’
       return vld1q_f32_x2(reinterpret_cast<const float*>(ptr));
              ^~~~~~~~~~~~
              vld1q_f32
../aten/src/ATen/cpu/vec256/vec256_float_neon.h:230:14: error: ‘vld1q_f32_x2’ was not declared in this scope
       return vld1q_f32_x2(reinterpret_cast<const float*>(tmp_values));
              ^~~~~~~~~~~~
../aten/src/ATen/cpu/vec256/vec256_float_neon.h:230:14: note: suggested alternative: ‘vld1q_f32’
       return vld1q_f32_x2(reinterpret_cast<const float*>(tmp_values));
              ^~~~~~~~~~~~
              vld1q_f32
../aten/src/ATen/cpu/vec256/vec256_float_neon.h: In member function ‘void at::vec256::{anonymous}::Vec256<float>::store(void*, int64_t) const’:
../aten/src/ATen/cpu/vec256/vec256_float_neon.h:235:7: error: ‘vst1q_f32_x2’ was not declared in this scope
       vst1q_f32_x2(reinterpret_cast<float*>(ptr), values);
       ^~~~~~~~~~~~
../aten/src/ATen/cpu/vec256/vec256_float_neon.h:235:7: note: suggested alternative: ‘vst1q_f32’
       vst1q_f32_x2(reinterpret_cast<float*>(ptr), values);
       ^~~~~~~~~~~~
       vst1q_f32
../aten/src/ATen/cpu/vec256/vec256_float_neon.h:242:7: error: ‘vst1q_f32_x2’ was not declared in this scope
       vst1q_f32_x2(reinterpret_cast<float*>(tmp_values), values);
       ^~~~~~~~~~~~
../aten/src/ATen/cpu/vec256/vec256_float_neon.h:242:7: note: suggested alternative: ‘vst1q_f32’
       vst1q_f32_x2(reinterpret_cast<float*>(tmp_values), values);
       ^~~~~~~~~~~~
       vst1q_f32
[1961/4009] Building CXX object caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/ParallelCommon.cpp.o
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
  File "setup.py", line 737, in <module>
    build_deps()
  File "setup.py", line 321, in build_deps
    cmake=cmake)
  File "/home/jetson/dev/source/pytorch/tools/build_pytorch_libs.py", line 62, in build_caffe2
    cmake.build(my_env)
  File "/home/jetson/dev/source/pytorch/tools/setup_helpers/cmake.py", line 345, in build
    self.run(build_args, my_env)
  File "/home/jetson/dev/source/pytorch/tools/setup_helpers/cmake.py", line 141, in run
    check_call(command, cwd=self.build_dir, env=env)
  File "/home/jetson/.pyenv/versions/3.7.4/lib/python3.7/subprocess.py", line 347, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['cmake', '--build', '.', '--target', 'install', '--config', 'Release', '--', '-j', '6']' returned non-zero exit status 1.

I have a virtualenv with Python 3.7.4 (built from source via pyenv) and am working on getting https://github.com/yifding/hetseq tested on multiple Jetsons. I followed the steps from above, shown here:

git clone http://github.com/pytorch/pytorch --recursive --branch 1.6
cd pytorch
export USE_NCCL=0
export USE_DISTRIBUTED=0
export USE_QNNPACK=0
export USE_PYTORCH_QNNPACK=0
export TORCH_CUDA_ARCH_LIST="5.3;6.2;7.2"
export PYTORCH_BUILD_VERSION=1.6.0
export PYTORCH_BUILD_NUMBER=1
sudo apt install cmake libopenblas-dev
pip install -r requirements.txt
pip install scikit-build
pip install ninja
python setup.py bdist_wheel

Any thoughts?

I do want to note that the git clone command in the Build from Source instructions is not written in the correct order; it should be git clone http://github.com/pytorch/pytorch --recursive --branch <version>

I ran the build again and logged it here:

python setup.py bdist_wheel > build.log 2>&1

build.log (13.1 KB)

dusty_nv · November 29, 2021, 9:15pm

Hi @realimposter, this is because the torch packages for Arm on pypi aren’t built with CUDA support, so they won’t detect your GPU.

As for why installing directly into site-packages without --user fixed torchvision, I’m not exactly sure why this would be, but then again I hadn’t tried installing it in a virtualenv. Either way, glad that you were able to get it working.
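To tell a CPU-only wheel from a CUDA-enabled one, a short check like the following can be run in the same environment. This is a minimal sketch: cuda_report is a hypothetical helper name, while torch.version.cuda and torch.cuda.is_available() are existing PyTorch attributes. It returns early if torch is not installed at all.

```python
def cuda_report():
    """Summarize whether the installed torch build can use CUDA.

    `cuda_report` is a hypothetical helper for illustration only.
    """
    try:
        import torch
    except ImportError:
        return {"installed": False}
    return {
        "installed": True,
        "version": torch.__version__,
        # torch.version.cuda is None when the wheel was built without CUDA.
        "built_with_cuda": torch.version.cuda is not None,
        # True only if the build has CUDA support and a GPU is usable now.
        "cuda_available": torch.cuda.is_available(),
    }
```

If built_with_cuda is False, no driver or environment change will make the GPU visible to that wheel; a CUDA-enabled build is needed instead.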

dusty_nv · November 29, 2021, 9:19pm

Hi @harrison-matt, I don’t think 1.6 is a release tag for PyTorch, it appears to be a dev branch. v1.6.0 would be the release to clone (check their tags on GitHub). There are also patches to apply that are linked to under the Build from Source section of the instructions.

I haven’t tried building this version of PyTorch myself for JetPack 4.6 and with Python 3.7, so if you continue to encounter issues you may want to try a newer version of PyTorch.
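To make the branch-versus-tag distinction concrete, here is a small sketch that turns a full version number into the clone command for the matching release tag. The helper names release_tag and clone_command are hypothetical, and the sketch assumes the release tags follow the v<major>.<minor>.<patch> pattern seen with v1.6.0; check the tags on GitHub before relying on it.

```python
import re


def release_tag(version):
    """Map a full version such as '1.6.0' to the tag name 'v1.6.0'.

    Assumes release tags are the version prefixed with 'v'.
    """
    if not re.fullmatch(r"\d+\.\d+\.\d+", version):
        # '1.6' alone would name a branch in this thread, not a release tag.
        raise ValueError("expected a full version like 1.6.0, got %r" % version)
    return "v" + version


def clone_command(version, url="http://github.com/pytorch/pytorch"):
    """Build the argument list for cloning a release tag with submodules."""
    return ["git", "clone", "--recursive", "--branch", release_tag(version), url]
```

The returned list can be passed to subprocess.check_call, or joined with spaces and pasted into a shell.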

ddbabich · November 30, 2021, 11:58pm

@dusty_nv, building from scratch failed for me, so I’ve decided to save that can of worms as a last resort. Instead I’d like to use the PyTorch docker container. I set up my TX2i build with nvidia-docker and installed the nvidia-container-cli, nvidia-container-runtime and nvidia-container-toolkit. After pulling the l4t-pytorch container, when I try to run it I get this error:

docker run -it --rm --runtime nvidia --network host nvcr.io/nvidia/l4t-pytorch:r32.6.1-pth1.9-py3
[ 384.762835] audit: type=1701 audit(1638316271.090:31): auid=4294967295 uid=0 gid=0 ses=4294967295 pid=5503 comm="nvc:[driver]" exe="/usr/bin/nvidia-container-cli" sig=11
docker: Error response from daemon: failed to create shim: OCI runtime create failed: container_linux.go:380: starting container process caused: process_linux.go:545: container init caused: Running hook #0:: error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: initialization error: driver error: failed to process request: unknown.

Any ideas? The messages are so ambiguous. Thanks.

dusty_nv · December 1, 2021, 10:14pm

See this post for a list of other related container packages that you may want to install with apt: https://forums.developer.nvidia.com/t/unable-to-run-nvidia-docker/195321/2?u=dusty_nv

If you flash your device with SDK Manager, it will install the NVIDIA Docker Runtime for you (I would recommend doing it this way since you have encountered errors here).

Also, since you mentioned Python 3.9 before I will point out that the l4t-pytorch containers use the same Python 3.6-based wheels that are from this topic.

ddbabich · December 2, 2021, 9:08pm

I was able to get the docker container working on my target TX2i. My build had an old version of a library used by the latest nvidia-docker implementation (libtirpc) so it was crashing on boot. I loaded an updated patch to my rootfs build system and it fixed the issue. Thanks, and thanks for the other links. These containers seem like a better solution than trying to install it directly into the rootfs.

user72106 · December 4, 2021, 2:55pm

Should export CUDA_HOME=/usr/local/cuda-X.X be added to the torchvision install instructions? It looks like setup.py checks for it, and my environment didn’t have it defined (no errors occur, but I think it builds without CUDA support).

In my case, running export CUDA_HOME=/usr/local/cuda-10.2 from the command line before the install worked fine.

user7452 · December 5, 2021, 5:50pm

I’m in China, using a Jetson NX.
I downloaded torch-1.8.0-cp36-cp36m-linux_aarch64.whl.
I have already downloaded torchvision v0.9.0, and I can verify in python3 that torchvision’s version is 0.9.0.
But when I run yolov5, an error occurs:
RuntimeError: No such operator torchvision::nms
I need help!

user74806 · December 6, 2021, 6:42am

Hi,
I’m getting a weird error while importing. I have installed pytorch 1.6 from source on python 3.8 as specified in the top post.

git clone --recursive --branch v1.6.0 http://github.com/pytorch/pytorch
cd pytorch
export USE_NCCL=0
export USE_DISTRIBUTED=0
export USE_QNNPACK=0
export USE_PYTORCH_QNNPACK=0
export TORCH_CUDA_ARCH_LIST="5.3;6.2;7.2"
export PYTORCH_BUILD_VERSION=1.6
export PYTORCH_BUILD_NUMBER=1
sudo apt-get install python3-pip cmake libopenblas-dev
pip3 install -r requirements.txt
pip3 install scikit-build
pip3 install ninja
python3 setup.py bdist_wheel

After compiling successfully, when I try to run import torch I get the error below:

import torch
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/pytorch/torch/__init__.py", line 335, in <module>
    from .random import set_rng_state, get_rng_state, manual_seed, initial_seed, seed
  File "/home/pytorch/torch/random.py", line 4, in <module>
    from torch._C import default_generator
ImportError: cannot import name 'default_generator' from 'torch._C' (unknown location)

markus.weiss · December 6, 2021, 8:34am

Hello @venketramana1

I am having a similar problem on my Xavier with Jetpack 4.6.
I am also getting this error:

"""
c++: internal compiler error: Segmentation fault (program cc1plus)
"""

I am not trying to build torch for python3.7.
I just installed the torch wheel for python3.6, everything seems fine.
Now I am trying to build my C++ program, but this “Segmentation fault” error keeps occurring.

How did you fix the error?

Thanks and Greetings.

user74806 · December 6, 2021, 1:27pm

After moving out of the PyTorch source root directory, import torch no longer causes any error. Running it from the source root is what causes this error.

markus.weiss · December 6, 2021, 2:16pm

Hello @venketramana1

I fixed the issue by using clang instead of gcc.

Greetings.

dusty_nv · December 6, 2021, 5:15pm

Hi @user7452, yes you should not try to import torch or import torchvision from the PyTorch / torchvision source directories. Please change the current working directory to a different directory before trying to import them.
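The reason is that an interactive Python session puts the current directory at the front of its module search path, so a torch/ folder in the source checkout gets picked up ahead of the installed package. A quick way to check for this before importing is sketched below; shadowing_paths is a hypothetical helper name used only for illustration.

```python
import os


def shadowing_paths(package, search_dir=None):
    """Return entries in `search_dir` that would shadow an installed `package`.

    `shadowing_paths` is a hypothetical helper for illustration only.
    By default it inspects the current working directory.
    """
    search_dir = os.getcwd() if search_dir is None else search_dir
    found = []
    # A same-named folder (e.g. the source tree's torch/) wins over site-packages.
    candidate_dir = os.path.join(search_dir, package)
    if os.path.isdir(candidate_dir):
        found.append(candidate_dir)
    # A same-named module file would shadow the installed package as well.
    candidate_file = os.path.join(search_dir, package + ".py")
    if os.path.isfile(candidate_file):
        found.append(candidate_file)
    return found
```

If shadowing_paths("torch") returns anything, change to another directory before running import torch.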

dusty_nv · December 6, 2021, 8:22pm

Hmm, I have not needed to set CUDA_HOME before on the various versions of JetPack that I have built torchvision on. I just built it again to double check, and it detected CUDA and compiled the .cu files while building torchvision. Will keep this setting in mind though.

It looks like the reason this normally works automatically is that PyTorch looks for CUDA in /usr/local/cuda if CUDA_HOME is not set:
https://github.com/pytorch/pytorch/blob/bede18b0617c7c3818a1a6cbd1a37e06b93c619d/torch/utils/cpp_extension.py#L79

On a normal install, /usr/local/cuda is symbolically linked to the actual CUDA version (e.g. /usr/local/cuda-10.2). Maybe that symlink was missing on your system?
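The fallback described here can be sketched roughly as follows. This is a simplified illustration, not the code from cpp_extension.py, which may consider more than these two sources; resolve_cuda_home is a hypothetical helper name. It prefers CUDA_HOME when set, and otherwise follows the /usr/local/cuda symlink if it exists.

```python
import os


def resolve_cuda_home(environ=None, default="/usr/local/cuda"):
    """Simplified sketch of a CUDA_HOME lookup with a default-path fallback.

    `resolve_cuda_home` is a hypothetical helper, not PyTorch's implementation.
    """
    environ = os.environ if environ is None else environ
    explicit = environ.get("CUDA_HOME")
    if explicit:
        # An explicitly exported CUDA_HOME always takes priority.
        return explicit
    if os.path.isdir(default):
        # Follow the symlink to show which versioned install it points at.
        return os.path.realpath(default)
    # Neither the variable nor the default path is usable.
    return None
```

A result of None on a Jetson would be consistent with a missing symlink, in which case exporting CUDA_HOME by hand, as described earlier in the thread, is one way around it.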

bearson · December 8, 2021, 12:24pm

Hi, I successfully compiled PyTorch 1.10.0 with USE_QNNPACK=1 and USE_PYTORCH_QNNPACK=1 on a Jetson Nano. I do, however, get runtime errors using QNNPACK, which are tracked here: DISABLED test_quantized_rnn (test_quanization.PostTrainingDynamicQuantTest) · Issue #32644 · pytorch/pytorch · GitHub

I was just wondering, what was the original issue with QNNPACK, and is there any reason not to enable it now?

dusty_nv · December 8, 2021, 4:08pm

Hi @bearson, IIRC there were originally compilation errors when QNNPACK was enabled, so I had to disable it. It is useful to know that it does build now with PyTorch 1.10, although there are runtime errors. If PyTorch fixes those errors, perhaps it could be re-enabled in my builds.