OSError: libcurand.so.10: cannot open shared object file: No such file or directory

Hi

I’m currently testing a new camera from Baumer (VAX32.C.I.NVN), which is built on the Jetson Nano SOM.

They provide an OS image with L4T installed. When I check the version, it reports:

cat /etc/nv_tegra_release
# R32 (release), REVISION: 4.4, GCID: 23942405, BOARD: t210ref, EABI: aarch64, DATE: Fri Oct 16 19:44:43 UTC 2020
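
For anyone comparing this against container tags: the release and revision fields combine into the usual L4T version string (R32 with revision 4.4 gives 32.4.4). Below is a minimal sketch that extracts it from the line above. The function name `parse_l4t_release` is made up for illustration, and the regex assumes the line format shown in this post.

```python
import re


def parse_l4t_release(line):
    """Return the L4T version (e.g. '32.4.4') from the first line of /etc/nv_tegra_release."""
    # Assumes the format "# R<major> (release), REVISION: <minor.patch>, ..."
    match = re.search(r"R(\d+)\s+\(release\),\s+REVISION:\s+([\d.]+)", line)
    if match is None:
        raise ValueError("unrecognized nv_tegra_release format")
    major, revision = match.groups()
    return f"{major}.{revision}"


line = ("# R32 (release), REVISION: 4.4, GCID: 23942405, BOARD: t210ref, "
        "EABI: aarch64, DATE: Fri Oct 16 19:44:43 UTC 2020")
print(parse_l4t_release(line))  # 32.4.4
```

The resulting string can then be compared with the `r32.x.y` suffix of a container tag.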

I would like to try classification_interactive from the DLI course (Getting Started with AI on Jetson Nano). I followed the tutorial and, as I understand it, my Docker image tag should be the following (to match the 4.4 revision reported above):

JetPack Release    Container Version Tag    Language
4.4                v2.0.0-r32.4.3           en-US

However, when I try to import:

import torchvision.transforms as transforms

I get the following error:

OSError: libcurand.so.10: cannot open shared object file: No such file or directory

My docker command is:

sudo docker run --runtime nvidia -it --privileged --rm --network host --volume ~/nvdli-data:/nvdli-nano/data --volume /tmp/argus_socket:/tmp/argus_socket nvcr.io/nvidia/dli/dli-nano-ai:v2.0.0-r32.4.3

I've added --privileged to allow communication with the Baumer camera module.

Can anyone help resolve this issue?
Thanks in advance!

Full error readout:

---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
<ipython-input-1-413b1040f905> in <module>
----> 1 import torchvision.transforms as transforms
      2 from dataset import ImageClassificationDataset
      3 
      4 TASK = 'thumbs'
      5 # TASK = 'emotions'

/usr/local/lib/python3.6/dist-packages/torchvision-0.7.0a0+6631b74-py3.6-linux-aarch64.egg/torchvision/__init__.py in <module>
      3 from .extension import _HAS_OPS
      4 
----> 5 from torchvision import models
      6 from torchvision import datasets
      7 from torchvision import ops

/usr/local/lib/python3.6/dist-packages/torchvision-0.7.0a0+6631b74-py3.6-linux-aarch64.egg/torchvision/models/__init__.py in <module>
----> 1 from .alexnet import *
      2 from .resnet import *
      3 from .vgg import *
      4 from .squeezenet import *
      5 from .inception import *

/usr/local/lib/python3.6/dist-packages/torchvision-0.7.0a0+6631b74-py3.6-linux-aarch64.egg/torchvision/models/alexnet.py in <module>
----> 1 import torch
      2 import torch.nn as nn
      3 from .utils import load_state_dict_from_url
      4 
      5 

/usr/local/lib/python3.6/dist-packages/torch/__init__.py in <module>
    186     # See Note [Global dependencies]
    187     if USE_GLOBAL_DEPS:
--> 188         _load_global_deps()
    189     from torch._C import *
    190 

/usr/local/lib/python3.6/dist-packages/torch/__init__.py in _load_global_deps()
    139     lib_path = os.path.join(os.path.dirname(here), 'lib', lib_name)
    140 
--> 141     ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
    142 
    143 

/usr/lib/python3.6/ctypes/__init__.py in __init__(self, name, mode, handle, use_errno, use_last_error)
    346 
    347         if handle is None:
--> 348             self._handle = _dlopen(self._name, mode)
    349         else:
    350             self._handle = handle

OSError: libcurand.so.10: cannot open shared object file: No such file or directory

From inside the container:

root@varo-desktop:~# ll /usr/local/cuda/lib64/
total 1444
drwxr-xr-x 1 root root   4096 Jun 27  2020 ./
drwxr-xr-x 1 root root   4096 Jun 27  2020 ../
-rw-r--r-- 1 root root 677036 Jun 27  2020 libcudadevrt.a
-rw-r--r-- 1 root root 786194 Jun 27  2020 libcudart_static.a
drwxr-xr-x 2 root root   4096 Jun 27  2020 stubs/

Outside the container:

varo@varo-desktop:~$ ll /usr/local/cuda/lib64/
ls: cannot access '/usr/local/cuda/lib64/': No such file or directory
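
The traceback shows the failure comes from `ctypes.CDLL`, so the same call can be used to check which CUDA libraries the dynamic loader can find, without importing torch. This is only a sketch: the helper name `check_shared_libs` is hypothetical, and the library names in the example are the ones I would expect for CUDA 10.2, so adjust them to your setup.

```python
import ctypes


def check_shared_libs(names):
    """Try to load each shared library; map name -> None on success, or the error text."""
    results = {}
    for name in names:
        try:
            ctypes.CDLL(name)
            results[name] = None
        except OSError as exc:
            # Same kind of error the torch import raised in the traceback above
            results[name] = str(exc)
    return results


if __name__ == "__main__":
    # Library names assumed for CUDA 10.2; change as needed
    for lib, err in check_shared_libs(["libcurand.so.10", "libcudart.so.10.2"]).items():
        print(lib, "OK" if err is None else "MISSING: " + err)
```

Running it inside the container before and after changing the host setup shows whether the libraries have become visible.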

As per my last reply, I realized that I needed CUDA installed properly. Following the steps from:

https://forums.developer.nvidia.com/t/oserror-libcurand-so-10-cannot-open-shared-object-file-no-such-file-or-directory-when-running-from-l4t-image/198578/5?u=bth

I ran the following:

sudo apt-get update && sudo apt-get install cuda-toolkit-10-2
export LD_LIBRARY_PATH=/usr/local/cuda-10.2/lib64:$LD_LIBRARY_PATH
sudo apt install nvidia-container-csv-cuda
sudo apt install nvidia-container-csv-cudnn

That resolved my issues.
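
A possible explanation, offered as an assumption rather than something confirmed in this thread: on JetPack 4.x the NVIDIA container runtime mounts CUDA libraries from the host into the container, based on path lists in the CSV files that the nvidia-container-csv-* packages install. That would fit the nearly empty /usr/local/cuda/lib64 inside the container. The sketch below lists CSV entries that do not exist on the host. The directory is the location I would expect by default, and the "type, path" line format is also assumed, so verify both on your image.

```python
import glob
import os

# Assumed default location of the mount lists on JetPack 4.x; adjust if your image differs.
CSV_DIR = "/etc/nvidia-container-runtime/host-files-for-container.d"


def missing_host_files(csv_dir=CSV_DIR):
    """Return paths listed in the CSV mount lists that do not exist on the host."""
    missing = []
    for csv_path in sorted(glob.glob(os.path.join(csv_dir, "*.csv"))):
        with open(csv_path) as handle:
            for raw in handle:
                line = raw.strip()
                # Assumed line format: "<type>, <path>", e.g. "lib, /usr/local/cuda-10.2/..."
                if not line or "," not in line:
                    continue
                _kind, path = (part.strip() for part in line.split(",", 1))
                if not os.path.lexists(path):
                    missing.append(path)
    return missing


if __name__ == "__main__":
    for path in missing_host_files():
        print("listed but missing on host:", path)
```

An empty result when run on the host (not in the container) would suggest every listed file is present and available for mounting.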

If anyone else is trying out the Baumer camera, I have written a camera file similar to usb_camera.py that works with jetcam, so that the training in the DLI course can be run with the built-in camera.

Glad to know the issue is fixed. Thanks for the update.
