I am a newbie to Docker and containers. I checked a few container images in the NVIDIA catalog, but the configuration I want is not available.
What I want is something similar to l4t-ml:r32.5.0-py3, but with TensorFlow 2.3.1 (as in l4t-tensorflow:r32.5.0-tf2.3-py3) instead of TensorFlow 1.15. I decided to build my own image, but I got confused while looking into it. Can anybody point me to a nice, easy-to-follow tutorial, ideally for the Jetson Nano, or explain how to do that?
There is @dusty_nv’s jetson-containers repository.
You can clone it and modify the Dockerfile.ml and scripts/docker_build_ml.sh files as needed:
git clone https://github.com/dusty-nv/jetson-containers.git
cd jetson-containers
sed -i "s/32.4.4/32.5.0/g" ./Dockerfile.ml
sed -i "s/TENSORFLOW_IMAGE=l4t-tensorflow:r$L4T_VERSION-tf1.15/TENSORFLOW_IMAGE=l4t-tensorflow:r$L4T_VERSION-tf2.3/g" ./scripts/docker_build_ml.sh
./scripts/docker_build_ml.sh all
I guess this should ideally work, thank you for the clarity! It failed for me with the error below. But supposing it had built without errors, what would the next step be? This creates an image to run, right? So I guess I should run it with
sudo docker run …
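Something roughly like this, I assume (a sketch only: the tag matches what the build script should produce, and the --runtime nvidia / --network host flags are my guess from other Jetson container examples):

```shell
# Sketch only: compose a typical 'docker run' line for the freshly built image.
# The tag and the flags (--runtime nvidia for GPU access, --network host) are
# assumptions based on common jetson-containers usage.
IMAGE="l4t-ml:r32.5.0-py3"
RUN_CMD="sudo docker run -it --rm --runtime nvidia --network host ${IMAGE}"
echo "${RUN_CMD}"   # paste this on the Nano itself
```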
The error I ran into:
Cloning into 'torchvision'...
Note: checking out '01dfa8ea81972bb74b52dc01e6a1b43b26b62020'.
You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.
If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:
git checkout -b <new-branch-name>
Traceback (most recent call last):
File "setup.py", line 12, in <module>
import torch
File "/usr/local/lib/python3.6/dist-packages/torch/__init__.py", line 195, in <module>
_load_global_deps()
File "/usr/local/lib/python3.6/dist-packages/torch/__init__.py", line 148, in _load_global_deps
ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
File "/usr/lib/python3.6/ctypes/__init__.py", line 348, in __init__
self._handle = _dlopen(self._name, mode)
OSError: libcurand.so.10: cannot open shared object file: No such file or directory
I was able to create my own image with the settings I asked for. After building, I ran the image successfully. One last newbie question: which line can I change to set the custom name I want?
I am guessing the following, is that right?
sh ./scripts/docker_build.sh l4t-ml:r$L4T_VERSION-py3 Dockerfile.ml \
In the build script, you can put your custom name in place of “l4t-ml:r$L4T_VERSION-py3”,
or
you can rename your existing image afterwards with the docker tag command.
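For example, a quick sketch of the docker tag route (the new name my-l4t-ml:tf2.3 is made up, substitute whatever you like):

```shell
# Sketch: 'docker tag' gives the same image ID a second name; the old tag
# keeps working. Both the old and new names here are example values.
OLD_TAG="l4t-ml:r32.5.0-py3"
NEW_TAG="my-l4t-ml:tf2.3"
TAG_CMD="sudo docker tag ${OLD_TAG} ${NEW_TAG}"
echo "${TAG_CMD}"
```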
If so, were you able to build the container before you made modifications to the script? If you clone a fresh copy of the jetson-containers repo, can you run this?
It seems to indicate that the compiler was killed by Linux, presumably due to a low-memory situation. Which Jetson are you building this on? If you check dmesg, do you see any messages about out-of-memory (OOM)?
I’m using the Jetson Nano 4 GB… here is the out-of-memory message:
[55512.297924] Out of memory: Kill process 14497 (cc1plus) score 161 or sacrifice child
[55512.352939] Killed process 14497 (cc1plus) total-vm:1030008kB, anon-rss:681844kB, file-rss:0kB, shmem-rss:0kB
[55512.736787] oom_reaper: reaped process 14497 (cc1plus), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
[55580.388286] docker0: port 1(veth2a20320) entered disabled state
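A common workaround on the Nano is to mount a swap file before building so cc1plus has headroom. This is a sketch under assumptions: the path /mnt/4GB.swap and the 4 GB size are arbitrary choices, and the printed commands need to be run as root.

```shell
# Sketch: print the commands to add a temporary 4 GB swap file before building.
# Path and size are assumptions; run the printed lines as root on the Nano.
SWAPFILE="/mnt/4GB.swap"
for cmd in "fallocate -l 4G ${SWAPFILE}" "mkswap ${SWAPFILE}" "swapon ${SWAPFILE}"; do
    echo "sudo ${cmd}"
done
```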
I am using the Jetson Nano 2GB Developer Kit with the JetPack 4.5.1 image. I executed ./scripts/docker_test_ml.sh all and received the following error:
L4T BSP Version: L4T R32.5.1
testing container l4t-pytorch:r32.5.1-pth1.8-py3 => PyTorch
xhost: unable to open display ""
Unable to find image 'l4t-pytorch:r32.5.1-pth1.8-py3' locally
docker: Error response from daemon: pull access denied for l4t-pytorch, repository does not exist or may require 'docker login': denied: requested access to the resource is denied.
See 'docker run --help'.
Can I install the jetson-containers on a Jetson Nano 2GB? Getting to this step took more than 12 hours.
If you want to run that test script without building the containers locally, edit the script so that nvcr.io/nvidia/ is included before the container tags in the script.
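For example, something along these lines (a sketch: the excerpt below is a mock-up, not the real contents of docker_test_ml.sh, so check how the tags actually appear in the script before running sed on it):

```shell
# Sketch: prefix local l4t-* tags with the NGC registry path so docker pulls
# them instead of looking for local builds. The file below is a mock excerpt.
cat > /tmp/docker_test_ml_excerpt.sh <<'EOF'
test_pytorch "l4t-pytorch:r32.5.1-pth1.8-py3"
test_tensorflow "l4t-tensorflow:r32.5.1-tf2.3-py3"
EOF
# Prepend the registry path to every quoted l4t-* tag:
sed -i 's|"l4t-|"nvcr.io/nvidia/l4t-|g' /tmp/docker_test_ml_excerpt.sh
cat /tmp/docker_test_ml_excerpt.sh
```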
That script just runs the tests of the containers. The PyTorch tests take a while because they run a bunch of models and verify their accuracy. If you just want to run the container, see the l4t-pytorch page on NGC for the docker run command.