How can I create custom containers for Jetson Nano?

Hi everybody!

I am a newbie with Docker and containers. I checked a few container images in the NVIDIA catalog, but the configuration I want is not available.

What I want is similar to l4t-ml:r32.5.0-py3, but rather than TensorFlow 1.15, I want TensorFlow 2.3.1 as in l4t-tensorflow:r32.5.0-tf2.3-py3. I decided to build my own image; however, I got confused while looking into it. Can anybody point me to a nice, easy-to-follow tutorial, ideally for the Jetson Nano, or explain to me how to do that?

l4t-tensorflow:r32.5.0-tf2.3-py3
    TensorFlow 2.3.1

l4t-ml:r32.5.0-py3
    TensorFlow 1.15
    PyTorch v1.7.0
    torchvision v0.8.0
    torchaudio v0.7.0
    onnx 1.8.0
    CuPy 8.0.0
    numpy 1.19.4
    numba 0.52.0
    OpenCV 4.4.1
    pandas 1.1.5
    scipy 1.5.4
    scikit-learn 0.23.2
    JupyterLab 2.2.9
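
Both of the stock images above are published on NGC and can be pulled directly for comparison (the nvcr.io/nvidia/ registry prefix is an assumption, matching the tags used later in this thread):

sudo docker pull nvcr.io/nvidia/l4t-tensorflow:r32.5.0-tf2.3-py3
sudo docker pull nvcr.io/nvidia/l4t-ml:r32.5.0-py3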

Cheers

Hi @mertnano

There is @dusty_nv's jetson-containers repository.

In my opinion, you can just clone it and change the Dockerfile.ml and scripts/docker_build_ml.sh files as you want:

git clone https://github.com/dusty-nv/jetson-containers.git
cd jetson-containers
sed -i "s/32.4.4/32.5.0/g" ./Dockerfile.ml
# single quotes so the shell does not expand $L4T_VERSION before sed sees it
sed -i 's/TENSORFLOW_IMAGE=l4t-tensorflow:r$L4T_VERSION-tf1.15/TENSORFLOW_IMAGE=l4t-tensorflow:r$L4T_VERSION-tf2.3/g' ./scripts/docker_build_ml.sh
./scripts/docker_build_ml.sh all
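
If the build succeeds, the freshly tagged images should show up in the local image list; a quick sanity check (standard Docker command, nothing repo-specific):

sudo docker images | grep l4t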

I guess this should ideally work. Thank you for the clarity! It failed for me with the error below. But supposing there were no errors, what would the next step be? This creates an image to run, right? So I guess I should run it with

sudo docker run …
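
(For reference, the jetson-containers README runs a built image roughly like this; the l4t-ml tag below is an assumption based on the build above:)

sudo docker run -it --rm --runtime nvidia --network host l4t-ml:r32.5.0-py3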

The error I ran into:

Cloning into 'torchvision'...
Note: checking out '01dfa8ea81972bb74b52dc01e6a1b43b26b62020'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

  git checkout -b <new-branch-name>

Traceback (most recent call last):
  File "setup.py", line 12, in <module>
    import torch
  File "/usr/local/lib/python3.6/dist-packages/torch/__init__.py", line 195, in <module>
    _load_global_deps()
  File "/usr/local/lib/python3.6/dist-packages/torch/__init__.py", line 148, in _load_global_deps
    ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
  File "/usr/lib/python3.6/ctypes/__init__.py", line 348, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libcurand.so.10: cannot open shared object file: No such file or directory

The container test scripts are also included in the repo.

You can test your image with:

./scripts/docker_test_ml.sh all

If there are any problems with the packages inside the container, this script should find them.

Hi @mertnano, if you get that error while building the container, make sure you have set your default Docker runtime to nvidia (and rebooted) - https://github.com/dusty-nv/jetson-containers#docker-default-runtime
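
In short, that page has /etc/docker/daemon.json declare the NVIDIA runtime as default; a sketch of the check-and-restart steps (file contents reproduced from the linked README):

# /etc/docker/daemon.json should contain:
#   {
#       "runtimes": {
#           "nvidia": {
#               "path": "nvidia-container-runtime",
#               "runtimeArgs": []
#           }
#       },
#       "default-runtime": "nvidia"
#   }
sudo systemctl restart docker                 # or just reboot
sudo docker info | grep 'Default Runtime'     # should report: nvidia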


Thank you both for the great help, @dusty_nv and @mehmetdeniz!

I was able to create my own image with the configuration I asked for. After building, I ran the image successfully. One last newbie question: which line can I change to set the custom name I want?
I am guessing the following; is that right?

sh ./scripts/docker_build.sh l4t-ml:r$L4T_VERSION-py3 Dockerfile.ml \

Best

I think that command in the script has some more arguments that follow.

You can type it like this (from docker_build_ml.sh):

sh ./scripts/docker_build.sh l4t-ml:r$L4T_VERSION-py3 Dockerfile.ml \
    --build-arg BASE_IMAGE=$BASE_IMAGE \
    --build-arg PYTORCH_IMAGE=l4t-pytorch:r$L4T_VERSION-pth1.7-py3 \
    --build-arg TENSORFLOW_IMAGE=l4t-tensorflow:r$L4T_VERSION-tf2.3-py3 \
    --build-arg L4T_APT_SOURCE="deb https://repo.download.nvidia.com/jetson/common r32 main"

In the build script, you can put your custom name in place of "l4t-ml:r$L4T_VERSION-py3",
or
you can rename your existing image with the docker tag command.
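
For example, retagging an already-built image would look like this (both names here are hypothetical):

sudo docker tag l4t-ml:r32.5.0-py3 my-custom-ml:r32.5.0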

Best wishes

Hi @mehmetdeniz,
I was trying to do the same, but after some warnings at stage 8/19 I got this error:

The command '/bin/sh -c git clone --recursive -b ${TORCHAUDIO_VERSION} https://github.com/pytorch/audio torchaudio && cd torchaudio && python3 setup.py install && cd ../ && rm -rf torchaudio' returned a non-zero code: 1

Can you suggest something, please?
Alessandro

Hi @acarbon
Did you get this message?

echo "done building PyTorch $pytorch_whl, torchvision $vision_version ($pillow_version), torchaudio $audio_version"

Hi @mehmetdeniz, no I didn't.

Hi @acarbon, did you first set your docker default-runtime to nvidia (and reboot)? https://github.com/dusty-nv/jetson-containers#docker-default-runtime

If so, were you able to build the container before you made modifications to the script? If you clone a fresh copy of the jetson-containers repo, can you run this?

$ git clone https://github.com/dusty-nv/jetson-containers jetson-containers-original
$ cd jetson-containers-original
$ ./scripts/docker_build_ml.sh pytorch 

Hi, yes, the default runtime is right, and in fact I can't even run the standard container.
I get lots of similar errors:
Skipping link https://files.pythonhosted.org/packages/36/06/1feea5c3fdcced8847f3a80c9a912cc065bcdafc1cb3e34d63f21391950d/numpy-1.16.3-cp27-cp27m-win32.whl#sha256=315fa1b1dfc16ae0f03f8fd1c55f23fd15368710f641d570236f3d78af55e340 (from https://pypi.org/simple/numpy/) (requires-python:>=2.7,!=3.0.*,!=3.1.*,!=3.2.*,!=3.3.*); it is not compatible with this Python

It seems something is wrong with the Python version.
Thanks,
Alessandro

That is a normal message when pip3 install --verbose is used. It will eventually find the right package.

What is the actual error that causes it to terminate compilation? Can you attach the whole log?

Hi @dusty_nv, attached is the whole log.
Thanks
putty.log (957.9 KB)

OK, so this is the error from your log:

FAILED: third_party/kaldi/CMakeFiles/kaldi.dir/submodule/src/feat/feature-functions.cc.o 
/usr/bin/c++   -I../../third_party/kaldi/src -I../../third_party/kaldi/submodule/src -isystem /usr/local/lib/python3.6/dist-packages/torch/include -isystem /usr/local/lib/python3.6/dist-packages/torch/include/torch/csrc/api/include -isystem /usr/local/cuda-10.2/include -Wall -D_GLIBCXX_USE_CXX11_ABI=1 -fvisibility=hidden -O3 -DNDEBUG -fPIC   -D_GLIBCXX_USE_CXX11_ABI=1 -std=gnu++14 -MD -MT third_party/kaldi/CMakeFiles/kaldi.dir/submodule/src/feat/feature-functions.cc.o -MF third_party/kaldi/CMakeFiles/kaldi.dir/submodule/src/feat/feature-functions.cc.o.d -o third_party/kaldi/CMakeFiles/kaldi.dir/submodule/src/feat/feature-functions.cc.o -c ../../third_party/kaldi/submodule/src/feat/feature-functions.cc
c++: internal compiler error: Killed (program cc1plus)

It seems to indicate that the compiler was killed by Linux, presumably due to a low-memory situation. Which Jetson are you building this on? If you check dmesg, do you see any messages about out of memory or OOM?
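
A quick way to check is grepping the kernel log; the pattern below is just a suggestion:

dmesg | grep -i -E 'out of memory|oom'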

Can you keep an eye on your memory usage (via tegrastats) while this is running? I suggest disabling ZRAM and mounting disk swap, as shown here: https://github.com/dusty-nv/jetson-inference/blob/master/docs/pytorch-transfer-learning.md#mounting-swap
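
Roughly, the steps on that page look like this (the 4GB size and /mnt path are the doc's example values):

sudo systemctl disable nvzramconfig    # disable ZRAM
sudo fallocate -l 4G /mnt/4GB.swap     # allocate a 4GB swap file
sudo mkswap /mnt/4GB.swap
sudo swapon /mnt/4GB.swap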

I'm using the Jetson Nano 4GB… here is the out-of-memory message:

[55512.297924] Out of memory: Kill process 14497 (cc1plus) score 161 or sacrifice child
[55512.352939] Killed process 14497 (cc1plus) total-vm:1030008kB, anon-rss:681844kB, file-rss:0kB, shmem-rss:0kB
[55512.736787] oom_reaper: reaped process 14497 (cc1plus), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
[55580.388286] docker0: port 1(veth2a20320) entered disabled state

I'll try to mount a disk swap.
Thanks

I am using the Jetson Nano 2GB Developer Kit with the JetPack 4.5.1 image. I executed ./scripts/docker_test_ml.sh all and received the following error:

L4T BSP Version: L4T R32.5.1
testing container l4t-pytorch:r32.5.1-pth1.8-py3 => PyTorch
xhost: unable to open display ""
Unable to find image 'l4t-pytorch:r32.5.1-pth1.8-py3' locally
docker: Error response from daemon: pull access denied for l4t-pytorch, repository does not exist or may require 'docker login': denied: requested access to the resource is denied.
See 'docker run --help'.

Can I install the Jetson containers on a Jetson Nano 2GB? Doing this step took more than 12 hours:

#build_pytorch "https://nvidia.box.com/shared/static/lufbgr3xu2uha40cs9ryq1zn4kxsnogl.whl" \
#              "torch-1.2.0-cp36-cp36m-linux_aarch64.whl" \
#              "l4t-pytorch:r$L4T_VERSION-pth1.2-py3" \
#              "v0.4.0" \
#              "pillow<7"

Hello @flogarcia999

You can run the xhost + command before starting Docker.
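
If the display error persists, DISPLAY may simply be unset in your shell; assuming a local desktop session on :0, something like this usually helps:

export DISPLAY=:0
xhost +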

You can install the images from the NGC catalog: https://ngc.nvidia.com/catalog/containers

If you want to run that test script without building the containers locally, edit the script so that nvcr.io/nvidia/ is included before the container tags here:

https://github.com/dusty-nv/jetson-containers/blob/1e10908a104494a883f6855d1e9947827f2a17bc/scripts/docker_test_ml.sh#L164

Like this:

test_pytorch_all "nvcr.io/nvidia/l4t-pytorch:r$L4T_VERSION-pth1.8-py3"
test_tensorflow_all "nvcr.io/nvidia/l4t-tensorflow:r$L4T_VERSION-tf1.15-py3"
test_tensorflow_all "nvcr.io/nvidia/l4t-tensorflow:r$L4T_VERSION-tf2.3-py3"
test_all "nvcr.io/nvidia/l4t-ml:r$L4T_VERSION-py3"

That script just runs the tests of the containers. The PyTorch tests take a while because they run a bunch of models and verify their accuracy. If you just want to run the container, see the l4t-pytorch page on NGC for the docker run command.
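
For reference, the run command on that page looks roughly like this (tag matched to the L4T R32.5.1 release from your log):

sudo docker run -it --rm --runtime nvidia --network host nvcr.io/nvidia/l4t-pytorch:r32.5.1-pth1.8-py3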