PyTorch for Jetson - version 1.10 now available

Are you familiar with h5py? When I try to install h5py, it keeps giving errors:
Using cached h5py-3.1.0.tar.gz (371 kB)
Installing build dependencies ... done
Getting requirements to build wheel ... done
Installing backend dependencies ... error
ERROR: Command errored out with exit status 1:
command: /usr/bin/python3 /usr/local/lib/python3.8/dist-packages/pip install --ignore-installed --no-user --prefix /tmp/pip-build-env-tiye2d9d/normal --no-warn-script-location --no-binary :none: --only-binary :none: -i https://pypi.org/simple -- 'numpy==1.19.3; python_version >= "3.9"' 'Cython>=0.29.14; python_version >= "3.8"' 'numpy==1.12; python_version == "3.6"' 'numpy==1.17.5; python_version == "3.8"' 'Cython>=0.29; python_version < "3.8"' pkgconfig 'numpy==1.14.5; python_version == "3.7"'
cwd: None
Complete output (719 lines):
Ignoring numpy: markers 'python_version >= "3.9"' don't match your environment
Ignoring numpy: markers 'python_version == "3.6"' don't match your environment
Ignoring Cython: markers 'python_version < "3.8"' don't match your environment
Ignoring numpy: markers 'python_version == "3.7"' don't match your environment
Collecting Cython>=0.29.14
Using cached Cython-0.29.22-py2.py3-none-any.whl (980 kB)
Collecting numpy==1.17.5
Using cached numpy-1.17.5.zip (6.4 MB)
Collecting pkgconfig
Using cached pkgconfig-1.5.2-py2.py3-none-any.whl (6.4 kB)
Building wheels for collected packages: numpy
Building wheel for numpy (setup.py): started
Building wheel for numpy (setup.py): finished with status 'error'
ERROR: Command errored out with exit status 1:
command: /usr/bin/python3 -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-5uzaorju/numpy_24ecee83328646b09eb3ad01f0d03151/setup.py'"'"'; __file__='"'"'/tmp/pip-install-5uzaorju/numpy_24ecee83328646b09eb3ad01f0d03151/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d /tmp/pip-wheel-v0_2az6f
cwd: /tmp/pip-install-5uzaorju/numpy_24ecee83328646b09eb3ad01f0d03151/
Complete output (341 lines):
Running from numpy source directory.
blas_opt_info:
blas_mkl_info:
customize UnixCCompiler
libraries mkl_rt not found in ['/usr/local/lib', '/usr/lib', '/usr/lib/aarch64-linux-gnu']
NOT AVAILABLE

Sorry, I am not familiar with h5py, although I think you can install it with sudo apt-get install python3-h5py.

You may also want to try this suggestion to install BLAS/LAPACK first:

sudo apt-get install libblas-dev liblapack-dev libatlas-base-dev gfortran
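
If the install then succeeds, a quick round-trip test like this sketch (the path and dataset name are just examples) should confirm h5py works:

# Minimal sketch: confirm the apt-installed h5py can create and read
# a dataset in a temporary file.
import h5py
import numpy as np

with h5py.File("/tmp/test.h5", "w") as f:
    f["x"] = np.arange(10)

with h5py.File("/tmp/test.h5", "r") as f:
    print(f["x"][:])  # expect [0 1 ... 9]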

If that still doesn’t fix it, you may want to post a new topic about it. Thanks.

Hi all, hoping you can help me. I am trying to get torch and torchvision installed on my new Jetson Nano with Python version 3.6.7. I successfully installed torch and can import torch with no errors, but I am stuck on installing torchvision. When I try python setup.py install --user, I receive the following error:

Building wheel torchvision-0.7.0a0+78ed10c
running install
running bdist_egg
running egg_info
writing torchvision.egg-info/PKG-INFO
writing dependency_links to torchvision.egg-info/dependency_links.txt
writing requirements to torchvision.egg-info/requires.txt
writing top-level names to torchvision.egg-info/top_level.txt
/home/bricklayer/archiconda3/envs/aislebrain/lib/python3.6/site-packages/torch/utils/cpp_extension.py:335: UserWarning: Attempted to use ninja as the BuildExtension backend but we could not find ninja.. Falling back to using the slow distutils backend.
  warnings.warn(msg.format('we could not find ninja.'))
reading manifest file 'torchvision.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
warning: no previously-included files matching '__pycache__' found under directory '*'
warning: no previously-included files matching '*.py[co]' found under directory '*'
writing manifest file 'torchvision.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-aarch64/egg
running install_lib
running build_py
copying torchvision/version.py -> build/lib.linux-aarch64-3.6/torchvision
running build_ext
building 'torchvision.video_reader' extension
/home/bricklayer/archiconda3/envs/aislebrain/bin/aarch64-conda_cos7-linux-gnu-cc -DNDEBUG -fwrapv -O3 -Wall -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O3 -pipe -DNDEBUG -D_FORTIFY_SOURCE=2 -O3 -fPIC -I/home/bricklayer/torchvision/torchvision/csrc/cpu/decoder -I/home/bricklayer/torchvision/torchvision/csrc/cpu/video_reader -I/usr/include -I/home/bricklayer/torchvision/torchvision/csrc -I/home/bricklayer/archiconda3/envs/aislebrain/lib/python3.6/site-packages/torch/include -I/home/bricklayer/archiconda3/envs/aislebrain/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -I/home/bricklayer/archiconda3/envs/aislebrain/lib/python3.6/site-packages/torch/include/TH -I/home/bricklayer/archiconda3/envs/aislebrain/lib/python3.6/site-packages/torch/include/THC -I/home/bricklayer/archiconda3/envs/aislebrain/include/python3.6m -c /home/bricklayer/torchvision/torchvision/csrc/cpu/video_reader/VideoReader.cpp -o build/temp.linux-aarch64-3.6/home/bricklayer/torchvision/torchvision/csrc/cpu/video_reader/VideoReader.o -std=c++14 -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=video_reader -D_GLIBCXX_USE_CXX11_ABI=1
In file included from /home/bricklayer/archiconda3/envs/aislebrain/aarch64-conda_cos7-linux-gnu/include/c++/7.3.0/cwchar:44:0,
                 from /home/bricklayer/archiconda3/envs/aislebrain/aarch64-conda_cos7-linux-gnu/include/c++/7.3.0/bits/postypes.h:40,
                 from /home/bricklayer/archiconda3/envs/aislebrain/aarch64-conda_cos7-linux-gnu/include/c++/7.3.0/iosfwd:40,
                 from /home/bricklayer/archiconda3/envs/aislebrain/aarch64-conda_cos7-linux-gnu/include/c++/7.3.0/memory:72,
                 from /home/bricklayer/archiconda3/envs/aislebrain/lib/python3.6/site-packages/torch/include/c10/core/Allocator.h:4,
                 from /home/bricklayer/archiconda3/envs/aislebrain/lib/python3.6/site-packages/torch/include/ATen/ATen.h:3,
                 from /home/bricklayer/archiconda3/envs/aislebrain/lib/python3.6/site-packages/torch/include/torch/csrc/api/include/torch/types.h:3,
                 from /home/bricklayer/archiconda3/envs/aislebrain/lib/python3.6/site-packages/torch/include/torch/script.h:3,
                 from /home/bricklayer/torchvision/torchvision/csrc/cpu/video_reader/VideoReader.h:3,
                 from /home/bricklayer/torchvision/torchvision/csrc/cpu/video_reader/VideoReader.cpp:1:
/usr/include/wchar.h:27:10: fatal error: bits/libc-header-start.h: No such file or directory
 #include <bits/libc-header-start.h>
          ^~~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
error: command '/home/bricklayer/archiconda3/envs/aislebrain/bin/aarch64-conda_cos7-linux-gnu-cc' failed with exit status 1

I did some searching and tried to fix it with sudo apt-get install gcc-multilib g++-multilib, but no luck. It responded with:

Package gcc-multilib is not available, but is referred to by another package.
This may mean that the package is missing, has been obsoleted, or
is only available from another source

E: Package 'gcc-multilib' has no installation candidate
E: Unable to locate package g++-multilib
E: Couldn't find any package by regex 'g++-multilib'

Any ideas how to fix this?

Hi @jetson_mason, are you inside an archiconda environment? I see some things about that in your error logs - I haven't tried it with conda before. If so, can you build it outside of the conda environment with python3?
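
A quick way to check which interpreter you're actually building with (just a sketch; the archiconda3 path comes from your logs):

# Minimal sketch: if sys.prefix points into archiconda3, you are still
# inside the conda environment; deactivate it and use the system python3.
import sys
print(sys.executable)  # expect /usr/bin/python3 outside conda
print(sys.prefix)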

Hi @dusty_nv, good catch. That did the trick. Installed python3.6 instead and tried again. It built successfully and I can run import torchvision without any errors now. Thanks!

I built a wheel for PyTorch 1.7 with Python 3.8 on a Jetson Nano.
@dusty_nv Please verify and confirm it for other people.

Download at: torch-1.7.0a0-cp38-cp38-linux_aarch64.whl - Google Drive

I'm trying to compile PyTorch 1.8 (rc4) or 1.9 (dev), and every time I try to compile either version, GCC crashes when building torch_cuda_generated_BinaryMulDivKernel.cu.o. I tried GCC 7 and GCC 8. Here is the full command that causes the crash (if I run it manually, the crash happens too, so it is reproducible):

cd /home/lissanro/Documents/pkgs/pytorch/build/caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda && /usr/bin/cmake -E make_directory /home/lissanro/Documents/pkgs/pytorch/build/caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/. && /usr/bin/cmake -D verbose:BOOL=OFF -D build_configuration:STRING=Release -D generated_file:STRING=/home/lissanro/Documents/pkgs/pytorch/build/caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/./torch_cuda_generated_BinaryMulDivKernel.cu.o -D generated_cubin_file:STRING=/home/lissanro/Documents/pkgs/pytorch/build/caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/./torch_cuda_generated_BinaryMulDivKernel.cu.o.cubin.txt -P /home/lissanro/Documents/pkgs/pytorch/build/caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/torch_cuda_generated_BinaryMulDivKernel.cu.o.Release.cmake /home/lissanro/Documents/pkgs/pytorch/build/caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/./torch_cuda_generated_BinaryMulDivKernel.cu.o

I have exported the variables recommended in the original post, and I have also tried some other environment variables; the outcome is always the same. Even with BUILD_CAFFE2_OPS=0 BUILD_CAFFE2=0 the crash still happens. I did not try to disable CUDA because I need it.

The error in the terminal is very long, so I quote here only the end:

31036: #pragma GCC diagnostic pop
31036: # 2 "tmpxft_00007761_00000000-5_BinaryMulDivKernel.cudafe1.stub.c" 2
31036: # 1 "tmpxft_00007761_00000000-5_BinaryMulDivKernel.cudafe1.stub.c"
=== END GCC DUMP ===
CMake Error at torch_cuda_generated_BinaryMulDivKernel.cu.o.Release.cmake:281 (message):
  Error generating file

The beginning of the crash log (full log):

ProblemType: Crash
Date: Sun Feb 28 10:45:00 2021
ExecutablePath: /usr/lib/gcc/aarch64-linux-gnu/7/cc1plus
PreprocessedSource:
 // Target: aarch64-linux-gnu
 // Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro 7.5.0-3ubuntu1~18.04' --with-bugurl=file:///usr/share/doc/gcc-7/README.Bugs --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++ --prefix=/usr --with-gcc-major-version-only --program-suffix=-7 --program-prefix=aarch64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-bootstrap --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-libquadmath --disable-libquadmath-support --enable-plugin --enable-default-pie --with-system-zlib --enable-multiarch --enable-fix-cortex-a53-843419 --disable-werror --enable-checking=release --build=aarch64-linux-gnu --host=aarch64-linux-gnu --target=aarch64-linux-gnu
 // Thread model: posix
 // gcc version 7.5.0 (Ubuntu/Linaro 7.5.0-3ubuntu1~18.04) 
 // 
 // /usr/include/c++/7/cmath: In static member function ‘static scalar_t at::native::div_floor_kernel_cuda(at::TensorIterator&)::<lambda()>::<lambda()>::<lambda(scalar_t, scalar_t)>::_FUN(scalar_t, scalar_t)’:

Any ideas how to solve this or what else to try?

Hi @Lissanro, I haven’t tried to build these yet - is it perhaps related to this PyTorch PR? https://github.com/pytorch/pytorch/pull/51834#discussion_r572391220

If not, can you file an issue about it on PyTorch GitHub and link to it here?

I will try the PR you linked tomorrow. Building PyTorch is very slow and takes a whole day, so it will take a while before I can confirm whether it helped.

In the meantime I have found the following workaround. First, I use git clean -fdx to get rid of any old build files. Then I start building the wheel:

MAX_JOBS=4 BUILD_TESTS=0 TORCH_CUDA_ARCH_LIST="5.3;6.2;7.2" USE_NCCL=0 USE_QNNPACK=0 USE_DISTRIBUTED=0 USE_PYTORCH_QNNPACK=0 USE_OPENCV=1 USE_FFMPEG=1 USE_LMDB=1 python3 setup.py bdist_wheel

As soon as CMake finishes the initial configuration and prints "Build files have been written", I apply the following patch:

--- build/caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/torch_cuda_generated_BinaryMulDivKernel.cu.o.Release.cmake.orig 2021-03-01 07:41:52.859595866 +0000
+++ build/caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/torch_cuda_generated_BinaryMulDivKernel.cu.o.Release.cmake      2021-03-01 07:44:29.482522544 +0000
@@ -114,10 +114,8 @@

 # Take the compiler flags and package them up to be sent to the compiler via -Xcompiler
 set(nvcc_host_compiler_flags "")
-# If we weren't given a build_configuration, use Debug.
-if(NOT build_configuration)
-  set(build_configuration Debug)
-endif()
+# Force Debug build_configuration to work around the bug in the GCC 7 and GCC 8 compilers (https://forums.developer.nvidia.com/t/pytorch-for-jetson-version-1-7-0-now-available/72048/712)
+set(build_configuration Debug)
 string(TOUPPER "${build_configuration}" build_configuration)
 #message("CUDA_NVCC_HOST_COMPILER_FLAGS = ${CUDA_NVCC_HOST_COMPILER_FLAGS}")
 foreach(flag ${CMAKE_HOST_FLAGS} ${CMAKE_HOST_FLAGS_${build_configuration}})
--- build/caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/torch_cuda_generated_CopysignKernel.cu.o.Release.cmake.orig     2021-03-01 07:41:52.880596399 +0000
+++ build/caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/torch_cuda_generated_CopysignKernel.cu.o.Release.cmake  2021-03-01 11:50:47.040069695 +0000
@@ -114,10 +114,8 @@
 
 # Take the compiler flags and package them up to be sent to the compiler via -Xcompiler
 set(nvcc_host_compiler_flags "")
-# If we weren't given a build_configuration, use Debug.
-if(NOT build_configuration)
-  set(build_configuration Debug)
-endif()
+# Force Debug build_configuration to work around the bug in the GCC 7 and GCC 8 compilers (https://forums.developer.nvidia.com/t/pytorch-for-jetson-version-1-7-0-now-available/72048/712)
+set(build_configuration Debug)
 string(TOUPPER "${build_configuration}" build_configuration)
 #message("CUDA_NVCC_HOST_COMPILER_FLAGS = ${CUDA_NVCC_HOST_COMPILER_FLAGS}")
 foreach(flag ${CMAKE_HOST_FLAGS} ${CMAKE_HOST_FLAGS_${build_configuration}})

It forces the Debug build configuration for torch_cuda_generated_BinaryMulDivKernel.cu.o and torch_cuda_generated_CopysignKernel.cu.o (each of them causes a compiler crash if the build configuration is set to Release). This way I was able to build PyTorch 1.9 (for 1.8 the workaround should work too).
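
If you'd rather not edit the generated files by hand, a small helper along these lines could apply the same override automatically once CMake has written them; this is a minimal sketch that assumes the build tree layout shown above:

#!/usr/bin/env python3
# Minimal sketch: force build_configuration to Debug in the two generated
# .Release.cmake files that trigger the GCC 7/8 crash. Paths assume the
# PyTorch build layout above; run after CMake configuration finishes.
from pathlib import Path

CUDA_DIR = Path("build/caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda")
KERNELS = ["BinaryMulDivKernel", "CopysignKernel"]

old = ("# If we weren't given a build_configuration, use Debug.\n"
       "if(NOT build_configuration)\n"
       "  set(build_configuration Debug)\n"
       "endif()")
new = "set(build_configuration Debug)  # forced: GCC 7/8 crash workaround"

for kernel in KERNELS:
    path = CUDA_DIR / f"torch_cuda_generated_{kernel}.cu.o.Release.cmake"
    text = path.read_text()
    if old in text:
        path.write_text(text.replace(old, new))
        print(f"patched {path}")
    else:
        print(f"pattern not found in {path} (already patched?)")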

I expect the patch from the PR you mentioned will at least solve the issue with CopysignKernel and allow compiling it in the Release build configuration. I am not sure yet whether it will help with the BinaryMulDivKernel issue. I will report back as soon as I know if the PR solved the problem fully or partially.

Hey! I'm trying to get PyTorch installed on my Jetson TX2 but failed. My Python version is 3.6.13 and my JetPack version is 4.4.1 [L4T 32.4.4]. I have successfully installed pip, libopenblas-base, libopenmpi-dev and Cython. When I try to pip install torch1.7.0xxx.whl, I get:

ERROR: torch has an invalid wheel, could not read 'torch-1.7.0.dist-info/WHEEL' file: BadZipFile('Bad magic number for file header',)

Then I tried to install torch 1.6.0 / 1.5.0 / 1.4.0:

ERROR: Exception:
Traceback (most recent call last):
File "/home/tx2/.local/lib/python3.6/site-packages/pip/_internal/cli/base_command.py", line 189, in _main
status = self.run(options, args)
File "/home/tx2/.local/lib/python3.6/site-packages/pip/_internal/cli/req_command.py", line 178, in wrapper
return func(self, options, args)
File "/home/tx2/.local/lib/python3.6/site-packages/pip/_internal/commands/install.py", line 317, in run
reqs, check_supported_wheels=not options.target_dir
File "/home/tx2/.local/lib/python3.6/site-packages/pip/_internal/resolution/resolvelib/resolver.py", line 101, in resolve
req, requested_extras=(),
File "/home/tx2/.local/lib/python3.6/site-packages/pip/_internal/resolution/resolvelib/factory.py", line 306, in make_requirement_from_install_req
version=None,
File "/home/tx2/.local/lib/python3.6/site-packages/pip/_internal/resolution/resolvelib/factory.py", line 169, in _make_candidate_from_link
name=name, version=version,
File "/home/tx2/.local/lib/python3.6/site-packages/pip/_internal/resolution/resolvelib/candidates.py", line 306, in __init__
version=version,
File "/home/tx2/.local/lib/python3.6/site-packages/pip/_internal/resolution/resolvelib/candidates.py", line 144, in __init__
self.dist = self._prepare()
File "/home/tx2/.local/lib/python3.6/site-packages/pip/_internal/resolution/resolvelib/candidates.py", line 226, in _prepare
dist = self._prepare_distribution()
File "/home/tx2/.local/lib/python3.6/site-packages/pip/_internal/resolution/resolvelib/candidates.py", line 312, in _prepare_distribution
self._ireq, parallel_builds=True,
File "/home/tx2/.local/lib/python3.6/site-packages/pip/_internal/operations/prepare.py", line 457, in prepare_linked_requirement
return self._prepare_linked_requirement(req, parallel_builds)
File "/home/tx2/.local/lib/python3.6/site-packages/pip/_internal/operations/prepare.py", line 501, in _prepare_linked_requirement
req, self.req_tracker, self.finder, self.build_isolation,
File "/home/tx2/.local/lib/python3.6/site-packages/pip/_internal/operations/prepare.py", line 67, in _get_prepared_distribution
return abstract_dist.get_pkg_resources_distribution()
File "/home/tx2/.local/lib/python3.6/site-packages/pip/_internal/distributions/wheel.py", line 30, in get_pkg_resources_distribution
with ZipFile(self.req.local_file_path, allowZip64=True) as z:
File "/home/tx2/archiconda3/envs/py2/lib/python3.6/zipfile.py", line 1131, in __init__
self._RealGetContents()
File "/home/tx2/archiconda3/envs/py2/lib/python3.6/zipfile.py", line 1226, in _RealGetContents
raise BadZipFile("Bad magic number for central directory")
zipfile.BadZipFile: Bad magic number for central directory

Can someone help me out? I'd really appreciate it! Many thanks!

I have tried the PR. It did not work at first. To make it work, I had to replace the following in four places:

(__GNUC__ > 8 || (__GNUC__ == 8 && __GNUC_MINOR__ > 3))

With this:

(__GNUC__ > 8)

I guess somebody thought the bug would be fixed in GCC versions above 8.3, but even with 8.4.0 it still crashes. Here is an updated patch that can be applied to current PyTorch: http://Dragon.Studio/2021/03/51834.diff. On top of this patch, the patch for issue #8103 is still necessary too.

I left a comment on PyTorch PR #51834 about this to let them know that with GCC 8.4 the workaround is still necessary, otherwise the compiler will crash.

zipfile.BadZipFile: Bad magic number for central directory

Hi @329992704, @Jackey_S, I just re-downloaded and re-installed the PyTorch 1.7 wheel, and did not get this file corruption error. Can you try downloading the wheel again? Perhaps it was a connection issue or a temporary problem with Box.com.
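
Since a wheel is just a zip archive, you can also check whether the download is intact before installing; a minimal sketch (the filename here is assumed):

# Minimal sketch: a .whl file is a zip archive, so a corrupt download
# fails this check. The filename below is an assumption; use your own.
import zipfile

wheel = "torch-1.7.0-cp36-cp36m-linux_aarch64.whl"
if zipfile.is_zipfile(wheel):
    with zipfile.ZipFile(wheel) as z:
        print("OK" if z.testzip() is None else "corrupt member found")
else:
    print("not a valid zip archive; re-download the wheel")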

When will 1.8 be available?

I will try to build it today and will report back.

OK, the PyTorch 1.8.0 wheel is posted here:

It needed this patch to build, which includes the fixes that @Lissanro mentioned.
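
After installing it, a quick sanity check along these lines should confirm the wheel imports and sees the GPU (a minimal sketch; the printed device name will vary):

# Minimal sketch of a post-install check: confirm the wheel imports
# and that CUDA is usable on the Jetson.
import torch

print(torch.__version__)                  # expect 1.8.0
print(torch.cuda.is_available())          # expect True
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # the Tegra GPU name
    x = torch.rand(2, 3).cuda()
    print(x * 2)                          # simple op on the GPU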

Hello @dusty_nv,

Using different pieces of code from here and there, I sometimes get this error message on my Jetson Xavier AGX (I use the same code on a Jetson Nano but never get this error):
RuntimeError: CUDA error: no kernel image is available for execution on the device
which leads me to think that there is a problem with the torch installation.

I tried PyTorch versions 1.7.0 and 1.8.0 with no success (meaning they are installed correctly according to the verification steps, but give me this error), so I thought I would try to build it from source.
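
One way to narrow this down is to compare the GPU's compute capability with the architectures the wheel was built for; this is a minimal sketch, assuming a torch build recent enough to expose torch.cuda.get_arch_list():

# Minimal sketch: a "no kernel image" error usually means the device's
# sm_XX architecture is missing from the build's arch list.
import torch

major, minor = torch.cuda.get_device_capability(0)
print(f"device compute capability: sm_{major}{minor}")  # Xavier AGX is sm_72
print("build includes:", torch.cuda.get_arch_list())     # e.g. ['sm_53', 'sm_62', 'sm_72']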

I have L4T 32.5.1, so I'm wondering: should I apply one of the patches you provide before attempting to build torch from source? (For compatibility with the code I'm trying to use, my goal is to build PyTorch 1.7.)

Thank you for your help

jetson-nano@jetsonnano-desktop:~$ cat /etc/nv_tegra_release
# R32 (release), REVISION: 4.4, GCID: 23942405, BOARD: t210ref, EABI: aarch64, DATE: Fri Oct 16 19:44:43 UTC 2020
(venv) jetson-nano@jetsonnano-desktop:~$ pip3 install torch-1.6.0-cp36-cp36m-linux_aarch64.whl 
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
ERROR: torch-1.6.0-cp36-cp36m-linux_aarch64.whl is not a supported wheel on this platform.

So, what should I do to solve this problem?
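
For what it's worth, a cp36-cp36m-linux_aarch64 wheel only installs into CPython 3.6 on an aarch64 system, so a mismatch there triggers exactly this message; a minimal sketch to check the interpreter:

# Minimal sketch: pip rejects a wheel whose cp36/aarch64 tags don't
# match the running interpreter, so check what this interpreter is.
import sys, platform

print(sys.version_info[:3])  # needs to be (3, 6, x) for a cp36 wheel
print(platform.machine())    # needs to be 'aarch64'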

Hi @dusty_nv,
On my Xavier AGX, I installed libTorch with the "xxx.whl" from Box.
But whether I CMake the project or compile it with Qt Creator 4.5.2, I run into problems:

  1. My device only has CUDA 10.2, but it links against CUDA 10.0. I guess the problem is in the libTorch CMake files.
  2. The Qt compile reports a header file problem.

Thank you in advance!

CMakeFiles/libtorch-yolov5.dir/link.txt:1:/usr/bin/c++ -Wall CMakeFiles/libtorch-yolov5.dir/src/detector.cpp.o CMakeFiles/libtorch-yolov5.dir/src/main.cpp.o -o libtorch-yolov5 -L/usr/local/cuda-10.0/lib64…
CMakeFiles/libtorch-yolov5.dir/build.make:157:libtorch-yolov5: /usr/local/cuda-10.0/lib64/libnvToolsExt.so
CMakeFiles/libtorch-yolov5.dir/build.make:158:libtorch-yolov5: /usr/local/cuda-10.0/lib64/libcudart.so
ai@ai-desktop:~/Documents/road-crack-detection-cpp_copy/buildTestCUDA$ cd /usr/local
ai@ai-desktop:/usr/local$ ls -alh
total 44K
drwxr-xr-x 11 root root 4.0K Aug 21 2020 .
drwxr-xr-x 12 root root 4.0K Aug 21 2020 ..
drwxr-xr-x 2 root root 4.0K Mar 4 11:38 bin
lrwxrwxrwx 1 root root 9 Aug 21 2020 cuda -> cuda-10.2
drwxr-xr-x 12 root root 4.0K Aug 21 2020 cuda-10.2
drwxr-xr-x 2 root root 4.0K Apr 27 2018 etc
drwxr-xr-x 2 root root 4.0K Apr 27 2018 games
drwxr-xr-x 4 root root 4.0K Sep 8 18:17 include
drwxr-xr-x 5 root root 4.0K Sep 8 18:17 lib
lrwxrwxrwx 1 root root 9 Apr 27 2018 man -> share/man
drwxr-xr-x 2 root root 4.0K Apr 27 2018 sbin
drwxr-xr-x 8 root root 4.0K Sep 8 18:17 share
drwxr-xr-x 2 root root 4.0K Apr 27 2018 src

 >>> import torch
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/nvidia/.local/lib/python3.6/site-packages/torch/__init__.py", line 135, in <module>
    _load_global_deps()
  File "/home/nvidia/.local/lib/python3.6/site-packages/torch/__init__.py", line 93, in _load_global_deps
    ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
  File "/usr/lib/python3.6/ctypes/__init__.py", line 348, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libcudart.so.10.2: cannot open shared object file: No such file or directory

This problem has troubled me for so long! What should I do? Thanks a lot!
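
For anyone else who hits this: the traceback means the dynamic loader cannot find the CUDA runtime. A quick diagnostic along these lines (a minimal sketch; /usr/local/cuda-10.2 is the usual JetPack location):

# Minimal sketch: check whether libcudart is visible to the system and
# whether it exists at the usual JetPack location. If the file exists
# but loading fails, adding its directory to LD_LIBRARY_PATH (or running
# sudo ldconfig after adding it to /etc/ld.so.conf.d/) usually helps.
import os
from ctypes.util import find_library

print(find_library("cudart"))  # None: not in the ldconfig cache
print(os.path.exists("/usr/local/cuda-10.2/lib64/libcudart.so.10.2"))
print(os.environ.get("LD_LIBRARY_PATH", "(not set)"))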