DIGITS Server on Nano, or Training Times and Modes

When I read through the “Two Days To A Demo” script, it seemed reasonable to attempt installing a DIGITS server on the Jetson Nano. I got stopped fairly early in the attempt: I did not want to try Docker, and I could not find a PCIe driver I could use to adapt the script to the Nano. So I would first like to know:

has anyone gotten the DIGITS server to (a) run on the Nano, and if so, (b) how was it installed, and (c) what were the training times when using the server software for the examples?

I ended up using PyTorch and doing the retraining for a client using the plantCLEF data. That took about 12 hours; I chose to run with a 4 GB swap file and power mode 1 (5W) to keep the system from shutting down during training. The script suggested that about 8 hours would do for this retraining example. Thus my next question is:

How can one do it that fast on a Nano? Was a fan and power mode 0 used? Or a larger swap file? Or perhaps a faster microSD card for the system install?

If this question is not answered, I would settle for some more examples I could use to show my client training and retraining on the Nano, together with expected runtimes and configurations. Besides the power mode and swap file size (my settings are sketched below), what other system configuration parameters can I tweak to improve retraining speed?
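For context, setting the power mode and swap file on the Nano looks roughly like this (a sketch; the 4 GB size matches what I used above, but the swap file path is arbitrary):

```bash
# Query the current power mode and list the available modes
sudo nvpmodel -q

# Power mode 1 = 5W, power mode 0 = 10W (MAXN) on the Nano
sudo nvpmodel -m 1

# Optionally pin the clocks to their maximum for the selected mode
sudo jetson_clocks

# Create and enable a 4 GB swap file (path is just an example)
sudo fallocate -l 4G /mnt/4GB.swap
sudo chmod 600 /mnt/4GB.swap
sudo mkswap /mnt/4GB.swap
sudo swapon /mnt/4GB.swap

# To keep the swap across reboots, add a line like this to /etc/fstab:
# /mnt/4GB.swap none swap sw 0 0
```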

Some community members have reported being able to install DIGITS on Jetson (I haven’t heard of it on the Nano in particular), but yes - technically DIGITS is only supported on PC/server platforms.

If your Nano is shutting down in 10W mode, that would indicate a power supply issue. Have you tried a DC barrel jack power adapter? That should improve the training performance. Disconnecting the display and running headless during training may help as well.
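For example, a minimal way to run headless is to boot to the console instead of the desktop for the duration of training (a sketch using standard systemd targets; run the training over SSH and switch back afterwards):

```bash
# Boot to a text console (no desktop) on the next reboot
sudo systemctl set-default multi-user.target
sudo reboot

# ...run the training over SSH...

# Restore the desktop afterwards
sudo systemctl set-default graphical.target
sudo reboot
```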

Another thing you can do is to store the dataset and your swap file on a USB3 SSD (or use a USB3 SATA adapter).
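A minimal sketch of that, assuming the SSD is already formatted and mounted at /mnt/ssd (the mount point and sizes are just examples):

```bash
# Turn off the old swap file on the SD card, if one is active
sudo swapoff /mnt/4GB.swap

# Create and enable a swap file on the SSD instead
sudo fallocate -l 8G /mnt/ssd/swapfile
sudo chmod 600 /mnt/ssd/swapfile
sudo mkswap /mnt/ssd/swapfile
sudo swapon /mnt/ssd/swapfile

# Keep the training dataset on the SSD as well, e.g. under /mnt/ssd/datasets/
```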

Yes, I used a 5V 4A power supply with the barrel jack. It took three or four retries and some experimentation with an ice pack before I tried using power mode 1 for the retraining. I passed along the suggestion of “headless retraining” to my client, thanks.

If you know of other successful training examples on the Nano, I would appreciate seeing them with timing and environment data.

Hi grpadmin, these projects also do training onboard Nano using PyTorch:

These projects typically use smaller datasets of a few hundred images, so the training completes in a shorter amount of time.

Hi,

I have been trying to install DIGITS on the Nano too, following this guide:
https://github.com/dusty-nv/jetson-inference/blob/master/docs/digits-setup.md

But after I start the Docker container, it exits with code 1:

nvidia@nvidia-desktop:~$ nvidia-docker run --name digits -d -p 8888:5000 -v /home/nvidia/data:/data:ro -v /home/nvidia/digits-jobs:/workspace/jobs nvcr.io/nvidia/digits:18.05
629d9d039966a4a0b14e19f28b62266302ce9683ca9112447d7619aac8190232
nvidia@nvidia-desktop:~$ sudo docker ps -a
[sudo] password for nvidia:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
629d9d039966 nvcr.io/nvidia/digits:18.05 "/usr/local/bin/nvid…" 21 seconds ago Exited (1) 15 seconds ago digits
nvidia@nvidia-desktop:~$

I have followed the procedure step by step up to this point. Can I install DIGITS on the Nano? Thanks.

Docker itself works, though; I have tried the hello world example:
nvidia@nvidia-desktop:~$ docker run ubuntu:18.04 /bin/echo "Hello world"
Unable to find image 'ubuntu:18.04' locally
18.04: Pulling from library/ubuntu
fbdcf4a939bd: Pull complete
d3463cc4abcf: Pull complete
4cf5b492942e: Pull complete
7799262edbd8: Pull complete
Digest: sha256:8d31dad0c58f552e890d68bbfb735588b6b820a46e459672d96e585871acc110
Status: Downloaded newer image for ubuntu:18.04
Hello world
nvidia@nvidia-desktop:~$

But if I follow this link, it stops too. May I assume the problem is with the DIGITS installation?
https://github.com/NVIDIA/nvidia-docker/wiki/DIGITS

nvidia@nvidia-desktop:~$ docker run --runtime=nvidia --name digits -d -p 5000:5000 -v /opt/mnist:/data/mnist nvidia/digits
ef26b34b5d0e7ce7317542505d6c2b4161b4a9df38f4151fc8bf12390f5e4fad
nvidia@nvidia-desktop:~$ sudo docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
ef26b34b5d0e nvidia/digits "python -m digits" 26 seconds ago Exited (1) 24 seconds ago digits
nvidia@nvidia-desktop:~$

I checked the error log and it shows: standard_init_linux.go:211: exec user process caused "exec format error". What does that mean?

nvidia@nvidia-desktop:~$ docker run --runtime=nvidia --name digits -d -p 5000:5000 nvidia/digits
a6986747b2a69cbe40dbf61c5b71020b73087eba48cfad826a2b41fb6238d28f
nvidia@nvidia-desktop:~$ docker logs a6986747b2a69cbe40dbf61c5b71020b73087eba48cfad826a2b41fb6238d28f
standard_init_linux.go:211: exec user process caused "exec format error"

My DIGITS image is up to date:

nvidia@nvidia-desktop:~$ docker pull nvidia/digits
Using default tag: latest
latest: Pulling from nvidia/digits
Digest: sha256:9b37921080efcedb93e1cd138b8981de14c65ca4cdb2dbcbb465d02a0fb6a513
Status: Image is up to date for nvidia/digits:latest
nvidia@nvidia-desktop:~$

When I checked the Docker log, it again showed: standard_init_linux.go:211: exec user process caused "exec format error"
Does that mean the Nano cannot run DIGITS? Thanks.

Hi AK51, DIGITS isn’t officially supported on Jetson, and the Docker image from NGC is built for x86_64 (so it won’t run on ARM aarch64). You could try installing DIGITS from source on Jetson, but it is recommended to run DIGITS on a PC/server system.
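That "exec format error" is the classic sign of an architecture mismatch. A quick way to confirm it is to compare the host and image architectures (a sketch using the image name from this thread; the manifest command may require Docker's experimental CLI on older releases):

```bash
# Host architecture - the Jetson Nano reports aarch64
uname -m

# Architecture the pulled image was built for - the DIGITS image reports linux/amd64
sudo docker image inspect --format '{{.Os}}/{{.Architecture}}' nvidia/digits

# For an image you have not pulled yet, inspect its manifest on the registry;
# if no arm64/aarch64 platform is listed, it will not run on the Nano
sudo docker manifest inspect --verbose nvidia/digits | grep -i architecture
```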

Thanks for your reply.
I just need a simple and fast object detector on the Nano that outputs the coordinates, i.e. just track one object (a robot) whenever it appears.

I have done the training in Azure and exported an ONNX model (built with TensorFlow via a Linux dockerfile). How can I plug it into your object detection Python code?

Or is there any other suggestion?
Note: I did try the on-board object tracking sample, but it loses the object easily.

By the way, is there a link for building DIGITS on the Nano? Thanks.
robot.zip (51.3 MB)

Then how can one know which images work on Jetson and which do not?
It took me quite a while to find this thread. Please help others avoid repeating the same mistake that AK51 and I made.

Hi @dkcog123, if you want to do DNN training onboard Jetson, it is recommended to use PyTorch like in this tutorial:

https://github.com/dusty-nv/jetson-inference/blob/master/docs/pytorch-transfer-learning.md
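As a rough idea of what that workflow looks like (a sketch based on that tutorial; the dataset name and exact flags here are assumptions, so follow the linked page for the authoritative steps):

```bash
# From the jetson-inference repo, the classification training example lives here
cd jetson-inference/python/training/classification

# Retrain a network on your own image folders (train/ val/ test/ under data/<dataset>)
python3 train.py --model-dir=models/robot data/robot

# Export the trained PyTorch checkpoint to ONNX for TensorRT inference on the Nano
python3 onnx_export.py --model-dir=models/robot
```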