Hello
A Jetson newb here, trying to develop and train models in a Docker container running under WSL2 so I can leverage the larger GPU on my Windows 10 desktop.
Before I get into the technical issue, my first question is more practical/strategic:
Question 1
Is it the right approach to develop/train in the exact same environment as the Jetson Nano by using an aarch64 Docker image on my desktop? Or is this overkill, and I should instead use a bog-standard container (a normal x86 18.04 distro, or even my Win10 desktop directly) to train the model and then send it over to the Nano?
I understand the above is not strictly a Jetson nano question per se, but the answer will help me decide how much time to spend on the technical issue below…
Question 2
I am able to get several aarch64 containers (see below) up and running, with PyTorch either prepackaged in the container or installed afterwards, and can even import cv2 seamlessly. However, I cannot import torch without the following error:
OSError: libcurand.so.10: cannot open shared object file: No such file or directory
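In case it helps narrow things down, here is a rough sketch of how the same loader failure could be reproduced without importing torch at all, by asking the dynamic linker for each library directly. The list of library names other than libcurand.so.10 is my own guess at what a CUDA 10.x build might link against, not something I have confirmed for these images.

```python
import ctypes


def try_load(lib_names):
    """Map each library name to None on success, or to the loader's error text."""
    results = {}
    for name in lib_names:
        try:
            ctypes.CDLL(name)  # same dlopen() lookup the torch import relies on
            results[name] = None
        except OSError as exc:
            results[name] = str(exc)
    return results


if __name__ == "__main__":
    # Assumption: apart from libcurand.so.10 (taken from the error above),
    # these names are guesses for a CUDA 10.x based image.
    guesses = ["libcurand.so.10", "libcudart.so.10.2", "libcublas.so.10"]
    for name, err in try_load(guesses).items():
        print(name, "OK" if err is None else "FAILED: " + err)
```

If every CUDA library fails this way, that would suggest the CUDA runtime is not present in the container at all, rather than a single file being misplaced.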
With my limited knowledge, it looks like there is no JetPack installed in this image? Or maybe it's a disconnect between CUDA versions? I'm at a loss as to how to troubleshoot further.
Other posts discussing this issue suggest adding /usr/local/cuda/lib64 to the path; however, any Docker image I create has only 3 files in this directory, none of which is the one in question.
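To check whether the library lives somewhere else in the image, a minimal sketch like the one below could be run inside the container. The directory list is an assumption on my part: /usr/local/cuda/lib64 comes from the other posts, /usr/lib/aarch64-linux-gnu is just a common library location on aarch64 Ubuntu, and anything in LD_LIBRARY_PATH is added on top.

```python
import os


def find_library(prefix, search_dirs):
    """Return sorted paths of files under search_dirs whose name starts with prefix."""
    hits = []
    for top in search_dirs:
        if not os.path.isdir(top):
            continue  # skip directories that do not exist in this image
        for dirpath, _dirnames, filenames in os.walk(top):
            for fname in filenames:
                if fname.startswith(prefix):
                    hits.append(os.path.join(dirpath, fname))
    return sorted(hits)


def candidate_dirs():
    """Directories to search; the two hard-coded entries are assumptions."""
    dirs = ["/usr/local/cuda/lib64", "/usr/lib/aarch64-linux-gnu"]
    dirs += [p for p in os.environ.get("LD_LIBRARY_PATH", "").split(":") if p]
    return dirs


if __name__ == "__main__":
    found = find_library("libcurand.so", candidate_dirs())
    print("\n".join(found) if found else "libcurand not found in searched directories")
```

An empty result would point towards the CUDA libraries never being mounted or installed in the container, in which case changing the path alone would not help.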
I have been following this post that discusses setting up Jetson containers using docker and WSL2, and have leveraged Ian Davis’ repo and guides and some other tutorials - all lead me to this same issue.
My setup: Windows 10 Cuda-11.3, Build 21370
Main images tried: nvcr.io/nvidia/l4t-ml:r32.5.0-py3, nvcr.io/nvidia/l4t-pytorch:r32.5.0-pth1.7-py3, nvidia/l4t-base:r32.2.1
WSL2: nvcc --version returns Cuda compilation tools, release 9.1, V9.1.85
Many thanks for any guidance you can provide