OSError: libcurand.so.10: cannot open shared object file: No such file or directory

Hi,

Description

In my Docker shell (sudo docker exec -it app1 bash), I get this error in the Python interpreter when importing torchvision with import torchvision:

OSError: libcurand.so.10: cannot open shared object file: No such file or directory

Environment

TensorRT Version: 4.4.1
GPU Type: Nvidia Jetson Xavier NX
CUDA Version: 10.2.89
CUDNN Version: 8.0.0.180
Operating System + Version: Ubuntu 18.04
Python Version (if applicable): 3.6.9
TensorFlow Version (if applicable):
PyTorch Version (if applicable): 1.6

Installation

To install the packages needed to run this, I used these commands:

sudo apt-get install cuda-toolkit-10-2
sudo apt install nvidia-tensorrt
sudo apt-get install -y nvidia-docker2
sudo systemctl restart docker

1st try: I followed this link - OSError: libcurand.so.10: cannot open shared object file: No such file or directory · Issue #194 · dusty-nv/jetson-containers · GitHub, but it did not work.

2nd try: I uninstalled all the packages completely and installed them again, but still got the same error.

3rd try: I reflashed the board and installed everything again, but still got the same error.

How can I solve this? Thanks.

Hi,

We are moving this post to the Nvidia Jetson Xavier NX forum to get better help.

Thank you.

Hi,

Usually, this kind of error is caused by incompatible packages.
There are some underlying dependencies, so please use packages that are built for the same JetPack version.
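
One way to see which CUDA libraries fail to load, and with what message, is to try opening each one with ctypes. This is only a minimal sketch: the helper name check_libs and the list of library names are illustrative assumptions taken from the errors in this thread, not part of any NVIDIA tool.

```python
import ctypes


def check_libs(names):
    """Try to load each shared library by name.

    Returns a dict mapping each name to None if it loaded,
    or to the loader's error message if it did not.
    """
    results = {}
    for name in names:
        try:
            ctypes.CDLL(name)
            results[name] = None
        except OSError as exc:
            # The message is typically either "cannot open shared object file"
            # (library not found) or "file too short" (file present but truncated).
            results[name] = str(exc)
    return results


if __name__ == "__main__":
    # Library names are assumptions based on the errors reported in this thread.
    for lib, err in check_libs(["libcurand.so.10", "libcublas.so.10"]).items():
        print(lib, "OK" if err is None else "FAILED: " + err)
```

Running this inside the container and on the host, and comparing the output, may help narrow down where the mismatch is.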

Thanks.

@AastaLLL, thanks for writing. I have another Xavier as well, and I used JetPack 4.4.1 on both. On the other Xavier, I did not get this error.

Because of this error, I also get an error from ONNX Runtime:

/usr/local/lib/python3.6/dist-packages/onnxruntime/capi/_pybind_state.py:14: UserWarning: Cannot load onnxruntime.capi. Error: '/usr/lib/aarch64-linux-gnu/libcublas.so.10: file too short'.

warnings.warn("Cannot load onnxruntime.capi. Error: '{0}'.".format(str(e)))

from onnxruntime.capi._pybind_state import get_all_providers, get_available_providers, get_device, set_seed,
ImportError: cannot import name 'get_all_providers'

Could you please suggest how to debug this dependency issue? I can't re-flash the board right now.
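
The "file too short" message is what the dynamic loader reports when a file is too small to hold an ELF header, so a first debugging step could be to list library files that are empty or do not start with the ELF magic bytes. The following is a rough sketch under assumptions: the function name find_bad_libs and the glob pattern are hypothetical, and the directory is simply the one that appears in the error above.

```python
import glob
import os

ELF_MAGIC = b"\x7fELF"  # first four bytes of every valid ELF shared object


def find_bad_libs(directory, pattern="libcu*.so*"):
    """Return (path, reason) pairs for library files that look broken.

    A file is reported if it is a dangling symlink or if it does not
    start with the ELF magic bytes (which includes empty files).
    """
    bad = []
    for path in sorted(glob.glob(os.path.join(directory, pattern))):
        if not os.path.exists(path):
            # glob lists dangling symlinks, but exists() follows them.
            bad.append((path, "broken symlink"))
            continue
        if os.path.isdir(path):
            continue
        size = os.path.getsize(path)
        with open(path, "rb") as handle:
            head = handle.read(4)
        if head != ELF_MAGIC:
            bad.append((path, "not an ELF file, size {} bytes".format(size)))
    return bad


if __name__ == "__main__":
    # Directory taken from the error message in this thread.
    for lib_path, reason in find_bad_libs("/usr/lib/aarch64-linux-gnu"):
        print(lib_path, "->", reason)
```

Running it both on the host and inside the container may show on which side the file is truncated.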

thanks

Solution: Reinstalling cuda-toolkit-10-2 and nvidia-container solved this issue.

After reinstalling nvidia-docker2 and nvidia-tensorrt, I got an error saying the libcurand.so.10 file is too short; reinstalling nvidia-container then solved the issue.

But doesn't nvidia-docker2 come with nvidia-container by default?

I don't know why we need to explicitly reinstall nvidia-container to solve this issue. Please let me know the reason.

thanks
