I have an L4T Docker image onto which I built PyTorch and other dependencies during the build phase. Inside the container, with CUDA mounted, I installed pycuda (pip3 install pycuda). When I try to run my application, I get the following error:
import pycuda.driver as cuda
File "/usr/local/lib/python3.6/dist-packages/pycuda/driver.py", line 62, in <module>
from pycuda._driver import * # noqa
ImportError: libcurand.so.10: cannot open shared object file: No such file or directory
At first my LD_LIBRARY_PATH variable was set to "/usr/local/cuda-10.2/targets/aarch64-linux/lib". I'm assuming this is because the host system this image was built on used CUDA 10.2? I changed this variable to "/usr/local/cuda-10.0/targets/aarch64-linux/lib" in the ~/.bashrc file, then ran source ~/.bashrc.
In /usr/local/cuda-10.0/targets/aarch64-linux/lib (which is shared) I have a libcurand.so.10.0 file and libcurand.so.10.0.326 file, but not libcurand.so.10.
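To double-check the mismatch described above, a small script can list which libcurand files a directory actually contains and test whether the dynamic linker can resolve the name from the traceback. This is only a rough sketch: the helper names (find_library_files, soname_present, can_load) are made up for illustration, and the directory is the one quoted in this post.

```python
import ctypes
import glob
import os


def find_library_files(lib_dir, stem="libcurand.so"):
    """Return the sorted file names in lib_dir that start with stem."""
    pattern = os.path.join(lib_dir, stem + "*")
    return sorted(os.path.basename(p) for p in glob.glob(pattern))


def soname_present(lib_dir, soname):
    """True if a file (or symlink target) with exactly this name exists in lib_dir."""
    return os.path.exists(os.path.join(lib_dir, soname))


def can_load(soname):
    """Ask the dynamic linker to load soname; False if it cannot be found or loaded."""
    try:
        ctypes.CDLL(soname)
        return True
    except OSError:
        return False


if __name__ == "__main__":
    # Directory taken from the post; adjust for your own container.
    lib_dir = "/usr/local/cuda-10.0/targets/aarch64-linux/lib"
    print(find_library_files(lib_dir))
    print(soname_present(lib_dir, "libcurand.so.10"))
    print(can_load("libcurand.so.10"))
```

If the listing shows only libcurand.so.10.0 files while the import asks for libcurand.so.10, the installed pycuda is looking for a different CUDA release than the one mounted into the container.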
I’m extremely confused because this application I wrote works on the host system. It seems pycuda wants to use a file that does not exist in the shared CUDA directory inside the L4T container. Is there something I am missing here?
EDIT:
Looks like nvcc --version in the image returns Cuda compilation tools, release 10.2, V10.2.89, whereas I have CUDA 10.0 installed on the host system. I’ve now also found that this image only officially supports CUDA 10.2.
You can use JetPack 4.4 for CUDA 10.2 and JetPack 4.3 for CUDA 10.0.
There is a GPU dependency between the OS and the CUDA toolkit.
You will need to use the OS and CUDA toolkit from the same JetPack version.
JetPack 4.4 is our latest release and it can give you better performance.
However, some packages are not available for JetPack 4.4 yet, which might limit your usage.
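As a rough illustration of the pairing above, the following sketch pulls the CUDA release out of nvcc --version text and looks up the JetPack version mentioned in this thread. The function names and the lookup table are hypothetical, and the table only covers the two pairings stated here (CUDA 10.2 with JetPack 4.4, CUDA 10.0 with JetPack 4.3).

```python
import re

# Pairings stated in this thread only; other JetPack releases are not covered.
CUDA_TO_JETPACK = {"10.2": "4.4", "10.0": "4.3"}


def cuda_release(nvcc_output):
    """Extract the 'release X.Y' value from nvcc --version text, or None."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def matching_jetpack(nvcc_output):
    """Return the JetPack version paired with the detected CUDA release, or None."""
    return CUDA_TO_JETPACK.get(cuda_release(nvcc_output))


if __name__ == "__main__":
    # Sample line copied from the question; in practice, pass the real nvcc output.
    sample = "Cuda compilation tools, release 10.2, V10.2.89"
    print(cuda_release(sample), matching_jetpack(sample))
```

Running the same check on the host and inside the container should make it obvious when the two disagree, which is the situation described in the question.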
Here is the solution:
docker run -itd --runtime=nvidia --name pytorch1 -p8302:22 -p8306:6066 -p8308:8888 -v /home/jetbot/containers/pytorch1:/root/host_dir nvcr.io/nvidia/l4t-pytorch:r32.5.0-pth1.7-py3