Trouble trying to install torch in Docker container on JP6.0-dp

gavin28 · March 21, 2024, 4:02am

System info:
Device: Jetson Orin NX 16GB
Jetpack: 6.0-dp (freshly flashed, clean system)
CUDA: 12.2
torch: https://developer.download.nvidia.com/compute/redist/jp/v60dp/pytorch/torch-2.2.0a0+81ea7a4.nv24.01-cp310-cp310-linux_aarch64.whl (I tried using nv.24.02 as well)
MarkupSafe: MarkupSafe-2.1.5-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
l4t-base: nvcr.io/nvidia/l4t-base:r36.2.0

Additional info:

Using Hatch I am able to get "python3 -c “import torch; print(torch.cuda.is_available())” → True. (Hatch is basically just virtualenv in this case)
The regular cpu arm64 version of torch installs fine, but torch.cuda.is_available() is False… of course.
The output is from tmux with funky formatting. I tried to fix it a bit, but it may still be funky.

jet@ubuntu:~$ sudo docker run -it --runtime nvidia reg.companyname.com/nvidia/l4t-base:r36.2.0 bash → success
root@hostname:/# nvidia-smi → success
root@hostname:/# apt update && apt install -y python3-pip libopenblas-dev libopenmpi3 → success
root@hostname:/# pip3 install -i https://username:password@local.registry.with.torch.whl.com/simple torch → success
root@hostname:/# python3 -c “import torch; print(torch.cuda.is_available())” → failure
Traceback (most recent call last):
File “/usr/local/lib/python3.10/dist-packages/torch/init.py”, line 175, in _load_global_deps ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
File “/usr/lib/python3.10/ctypes/init.py”, line 374, in init
self._handle = _dlopen(self._name, mode)
OSError: libcudart.so.12: cannot open shared object file: No such file or directory
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File “”, line 1, in
File “/usr/local/lib/python3.10/dist-packages/torch/init.py”, line 235, in _load_global_deps()
File “/usr/local/lib/python3.10/dist-packages/torch/init.py”, line 196, in _load_global_deps_preload_cuda_deps(lib_folder, lib_name)
File “/usr/local/lib/python3.10/dist-packages/torch/init.py”, line 161, in _preload_cuda_deps
raise ValueError(f"{lib_name} not found in the system path {sys.path}")
ValueError: libcublas.so.*[0-9] not found in the system path [‘’, ‘/usr/lib/python310.zip’
, ‘/usr/lib/python3.10’, ‘/usr/lib/python3.10/lib-dynload’, ‘/usr/local/lib/python3.10/dis
t-packages’, ‘/usr/lib/python3/dist-packages’, ‘/usr/lib/python3.10/dist-packages’]

Questions (Really I’m just looking for general advice on how to proceed from here):

This libcudart.so dl error appears to be from a cuda version mismatch. Is this a mismatch between the version of cuda pytorch was compiled for vs what’s on my JP6.0-dp system? JP6.0-dp uses CUDA12.2.
Is installing pytorch manually from l4t-base even feasible or is l4t-pytorch for jp6.0 really the only option? Is that coming out soon?
Should we give up on Docker and port to a system level deployment? This is possible, but a lot of effort and technical debt :(

AastaLLL · March 21, 2024, 8:31am

Hi,

1. l4t-base doesn’t have CUDA preinstalled. It only contains the basic OS.
Please try l4t-cuda instead.

2. You can try l4t-ml.

3. You don’t need to. Please try the above container first.

Thanks.

system · April 10, 2024, 6:10am

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
Error importing torch inside a L4T container in Jetson Nano Jetson Nano cudnn	4	262	February 5, 2024
OSError: libcurand.so.10: cannot open shared object file: No such file or directory Jetson AGX Xavier docker , pytorch	2	548	June 21, 2022
How to get cuda working in a docker container for pytorch applications Jetson AGX Orin docker , pytorch	3	530	May 30, 2024
How to install Pytorch and torchvison on NVIDIA L4T Base from NGC catalog Jetson AGX Orin pytorch	4	1422	June 13, 2023
Docker container cannot find CUDA libraries (libcurand.so.10) Jetson Nano docker , jetson-nano	6	1291	April 26, 2023
PyTorch on L4T Docker image Jetson Nano	5	2801	October 14, 2021
Pytorch installed on l4t-jetpack:r35.4.1 container on Jetson Orin Nano (JetPack 6.0 Developer Kit) fails to recognize CUDA Jetson Orin Nano cuda , docker , pytorch , python , containers	2	67	October 22, 2024
Importing PyTorch fails in L4T R32.3.1 Docker image on Jetson Nano after successful install Jetson Nano docker , pytorch	2	2638	October 18, 2021
Error in Building Machine Learning Container for Jetson and Jetpack 4.4 Jetson AGX Xavier docker	3	615	October 18, 2021
how can i install the pytorch? Jetson TX2	10	8659	October 18, 2021

Trouble trying to install torch in Docker container on JP6.0-dp

Related topics