I am using a Jetson AGX Xavier with JetPack 4.2 and CUDA 10.0.166. After installing PyTorch as directed by https://forums.developer.nvidia.com/t/pytorch-for-jetson-version-1-7-0-now-available/72048, importing it fails with the following error:
Python 3.6.9 (default, Oct 8 2020, 12:12:24)
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/nvidia/.local/lib/python3.6/site-packages/torch/__init__.py", line 189, in <module>
_load_global_deps()
File "/home/nvidia/.local/lib/python3.6/site-packages/torch/__init__.py", line 142, in _load_global_deps
ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
File "/usr/lib/python3.6/ctypes/__init__.py", line 348, in __init__
self._handle = _dlopen(self._name, mode)
OSError: libcurand.so.10: cannot open shared object file: No such file or directory
But the libcurand.so.10.0 file already exists on the Xavier board.
What should I do to solve this problem?
Hi,
Which PyTorch package did you install?
Due to dependencies, you will need to use the package built with the same JetPack version.
For example:
PyTorch v1.4.0
…
Thanks.
I downloaded the Python 3.6 wheel for torch 1.4.0:
wget https://nvidia.box.com/shared/static/wa34qwrwtk9njtyarwt5nvo6imenfy26.whl -O torch-1.4.0-cp36-cp36m-linux_aarch64.whl
sudo apt-get install python3-pip libopenblas-base libopenmpi-dev
pip3 install Cython
pip3 install numpy torch-1.4.0-cp36-cp36m-linux_aarch64.whl
These steps completed and the installation succeeded, but the error above appeared when I checked whether torch could be imported.
Hi @ho126jin, you had downloaded the URL for the PyTorch 1.7 wheel for JetPack 4.4. Instead, please use the following:
https://nvidia.box.com/shared/static/ncgzus5o23uck9i5oth2n8n06k340l6k.whl
wget https://nvidia.box.com/shared/static/ncgzus5o23uck9i5oth2n8n06k340l6k.whl -O torch-1.4.0-cp36-cp36m-linux_aarch64.whl
sudo apt-get install python3-pip libopenblas-base libopenmpi-dev
pip3 install Cython
pip3 install numpy torch-1.4.0-cp36-cp36m-linux_aarch64.whl
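Since wget -O only sets the local filename, the name on disk does not prove which build was actually downloaded. Below is a minimal sketch for reading the version recorded inside a wheel before installing it. It assumes the standard wheel layout with a *.dist-info/METADATA file, and the function name wheel_metadata_version is made up for illustration:

```python
import zipfile


def wheel_metadata_version(wheel_path):
    """Return the Version field recorded inside a wheel, or None if absent."""
    with zipfile.ZipFile(wheel_path) as wheel:
        for member in wheel.namelist():
            # Core metadata is expected at <name>-<version>.dist-info/METADATA.
            parts = member.split("/")
            if (len(parts) == 2
                    and parts[0].endswith(".dist-info")
                    and parts[1] == "METADATA"):
                text = wheel.read(member).decode("utf-8", errors="replace")
                for line in text.splitlines():
                    if not line.strip():
                        break  # metadata headers end at the first blank line
                    if line.startswith("Version:"):
                        return line.split(":", 1)[1].strip()
    return None
```

Calling wheel_metadata_version("torch-1.4.0-cp36-cp36m-linux_aarch64.whl") on the downloaded file should show whether the contents match the name it was saved under.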
Thanks to you, I finished installing PyTorch.
I have one more problem.
I want to build libtorch on the Xavier board.
Is it possible?
I believe libtorch is already built and is installed along with the pip wheels. I see it on my system under here:
$ ls ~/.local/lib/python3.6/site-packages/torch/lib
libc10_cuda.so libcaffe2_nvrtc.so libtorchbind_test.so libtorch_python.so
libc10.so libcaffe2_observers.so libtorch_cpu.so libtorch.so
libcaffe2_detectron_ops_gpu.so libjitbackend_test.so libtorch_cuda.so
libcaffe2_module_test_dynamic.so libshm.so libtorch_global_deps.so
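If the wheel was installed somewhere else (a virtualenv, for example), the same directory can be located from Python. This is a rough sketch that uses importlib.util.find_spec, which locates a top-level package without running its __init__.py, so it should work even when import torch itself fails. The helper name package_lib_dir is hypothetical, and the lib subdirectory is assumed from the listing above:

```python
import importlib.util
import os


def package_lib_dir(package="torch"):
    """Return <package dir>/lib without importing the package, or None."""
    try:
        spec = importlib.util.find_spec(package)
    except (ImportError, ValueError):
        return None
    if spec is None or not spec.submodule_search_locations:
        return None  # not installed, or not a package with a directory
    return os.path.join(list(spec.submodule_search_locations)[0], "lib")


if __name__ == "__main__":
    lib_dir = package_lib_dir()
    if lib_dir is None or not os.path.isdir(lib_dir):
        print("torch/lib not found")
    else:
        # Print every shared library shipped inside the wheel.
        for name in sorted(os.listdir(lib_dir)):
            if name.endswith(".so"):
                print(os.path.join(lib_dir, name))
```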
I have the same problem on a Jetson Xavier (JetPack 4.4-b144) after installing torch-1.6.0-cp36-cp36m-linux_aarch64.whl
following this manual: PyTorch for Jetson
import torch
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/discort/test/.env/lib/python3.6/site-packages/torch/__init__.py", line 188, in <module>
_load_global_deps()
File "/home/discort/test/.env/lib/python3.6/site-packages/torch/__init__.py", line 141, in _load_global_deps
ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
File "/usr/lib/python3.6/ctypes/__init__.py", line 348, in __init__
self._handle = _dlopen(self._name, mode)
OSError: libcurand.so.10: cannot open shared object file: No such file or directory
Cuda version: 10.2
nvcc --version
Cuda compilation tools, release 10.2, V10.2.89
How did you manage to solve this issue?
@dusty_nv @ho126jin
Hi @discort, can you check that you have libcurand.so.10 under /usr/local/cuda/lib64?
Try running this in your terminal before you launch python3:
export LD_LIBRARY_PATH=/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
If it works, you can add it to your user’s ~/.bashrc
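To test the loader directly without going through torch, something like the following may help. It is only a sketch: it repeats the ctypes.CDLL call shown in the traceback for the two library names mentioned in this thread, and can_load is an illustrative helper name, not part of PyTorch:

```python
import ctypes


def can_load(soname):
    """Return True if the dynamic loader can find and open `soname`."""
    try:
        # Same call that torch's _load_global_deps() makes in the traceback.
        ctypes.CDLL(soname, mode=ctypes.RTLD_GLOBAL)
        return True
    except OSError:
        return False


if __name__ == "__main__":
    for name in ("libcurand.so.10", "libcurand.so.10.0"):
        print(name, "loadable" if can_load(name) else "NOT loadable")
```

Note that LD_LIBRARY_PATH has to be exported in the shell before python3 starts; changing os.environ inside an already running interpreter does not change where the loader searches.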
I am bumping up against the same issue with the PyTorch install and can’t get past this error. I tried the export LD_LIBRARY_PATH solution before starting python3 and I still get the OSError for libcurand.so.10.
What’s next to try?
If you run the following commands, what L4T and CUDA versions are reported?
$ cat /etc/nv_tegra_release
$ nvcc --version
The first command will output your L4T version, and the second will output the CUDA version.
Is the CUDA version reported as CUDA 10.0? If not, you installed a PyTorch wheel that was built for a different version of JetPack.
Sorry, typo - I meant cat /etc/nv_tegra_release
nvcc command not found
Here’s the info from the nv_tegra_release:
R32 (release), REVISION: 3.1, GCID: 18186506, BOARD: t210ref, EABI: aarch64, DATE: Tue Dec 10 06:58:34 UTC 2019
OK, you are running JetPack 4.3, which should have CUDA 10.0 if SDK Manager installed it when you flashed your AGX Xavier device.
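For reference, the release line can also be decoded in a script. A minimal sketch, assuming the file always contains the "R<major> (release), REVISION: <minor>" pattern seen in the output above; parse_l4t_version is a name invented for this example:

```python
import re

# Matches e.g. "R32 (release), REVISION: 3.1" and captures "32" and "3.1".
_RELEASE_RE = re.compile(r"R(\d+) \(release\), REVISION: (\d+(?:\.\d+)*)")


def parse_l4t_version(text):
    """Return the L4T version such as '32.3.1', or None if not recognised."""
    match = _RELEASE_RE.search(text)
    if match is None:
        return None
    return "{}.{}".format(match.group(1), match.group(2))


if __name__ == "__main__":
    try:
        with open("/etc/nv_tegra_release") as release_file:
            print("L4T", parse_l4t_version(release_file.read()))
    except OSError:
        print("/etc/nv_tegra_release not found (not a Jetson?)")
```

For the line posted above this gives 32.3.1, which is the L4T release that corresponds to JetPack 4.3.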
If you check your /usr/local/cuda directory, is the CUDA Toolkit there? Can you find nvcc under /usr/local/cuda/bin?
If so, what does /usr/local/cuda/bin/nvcc --version report?
Which PyTorch wheel did you install? Was it one of these for JetPack 4.3?
PyTorch v1.4.0
If you ls /usr/local/cuda/lib64, do you see libcurand.so.10 under there?
If you run echo $LD_LIBRARY_PATH, do you see /usr/local/cuda/lib64 in the path?
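The checks above can be run in one go with a short script. This is a sketch under the assumptions used in this thread (CUDA under /usr/local/cuda, libraries in lib64); cuda_checklist and its result keys are hypothetical names:

```python
import os


def cuda_checklist(cuda_root="/usr/local/cuda", ld_library_path=None):
    """Report whether nvcc, libcurand.so.10 and the library path are in place."""
    if ld_library_path is None:
        ld_library_path = os.environ.get("LD_LIBRARY_PATH", "")
    lib_dir = os.path.join(cuda_root, "lib64")
    # Resolve symlinks so /usr/local/cuda and a versioned directory it may
    # point to are treated as the same location.
    entries = [os.path.realpath(p) for p in ld_library_path.split(":") if p]
    return {
        "nvcc_present": os.path.isfile(os.path.join(cuda_root, "bin", "nvcc")),
        "libcurand_so_10_present": os.path.exists(
            os.path.join(lib_dir, "libcurand.so.10")),
        "lib64_in_ld_library_path": os.path.realpath(lib_dir) in entries,
    }


if __name__ == "__main__":
    for check, ok in cuda_checklist().items():
        print("{:28s} {}".format(check, "OK" if ok else "MISSING"))
```

If libcurand_so_10_present comes back MISSING while a differently numbered libcurand file exists, that would point to a wheel built for a different JetPack/CUDA release rather than a path problem.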
OK… going back and re-installing as I originally installed v1.7 and not torch 1.4.
That was a bit …tortuous if I may pun.
But I am still at the same point when I try to check with an import torch in python3:
OSError: libcurand.so.10: cannot open shared object file: No such file or directory
Is /usr/local/cuda/lib64 still in your LD_LIBRARY_PATH?
Do you see the CUDA libraries and libcurand.so.10 under /usr/local/cuda/lib64?
Yes, of course I checked that.
I also tried
export LD_LIBRARY_PATH=/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
before running the python3 test.
A little frustrating as I seem to have tried all the things prescribed to get through the twisted version issues and I am still stuck at the same point.