Thumbs Project, please help

Hi Guys,

I have a problem with the Thumbs Project in the course "Getting Started with AI on Jetson Nano" (Image Classification, Thumbs Project).

I followed the instructions and everything works, but whenever I press the Train button my Nano hangs.

What I did:
checked the user permissions on the folder with the pictures I took → ok
checked the Internet connection → ok
tried power mode 0 and 1

Same problem every time.

I need support, please.

Thank you in advance
Regards from Germany

Hey Guys,

I found the suggested fix of checking the folder for empty files.
I don't have any empty files in the folder.

The folder structure looks like this:

Home/nvdli-data/classification/thumbs_A/thumbs_down
Home/nvdli-data/classification/thumbs_A/thumbs_up
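
For anyone who wants to repeat the empty-file check without clicking through each folder, here is a minimal sketch. It assumes the dataset sits under ~/nvdli-data/classification/thumbs_A as in the structure above; the function name find_empty_files is made up for this example and is not part of the course notebook.

```python
from pathlib import Path


def find_empty_files(dataset_dir):
    """Return a sorted list of zero-byte files anywhere under dataset_dir."""
    root = Path(dataset_dir)
    return sorted(
        p for p in root.rglob("*") if p.is_file() and p.stat().st_size == 0
    )


if __name__ == "__main__":
    # Assumed dataset location, taken from the folder structure in this post.
    dataset = Path.home() / "nvdli-data" / "classification" / "thumbs_A"
    if dataset.is_dir():
        empty = find_empty_files(dataset)
        print("empty files:", [str(p) for p in empty])
    else:
        print("dataset folder not found:", dataset)
```

An empty list means none of the files in thumbs_down or thumbs_up has a size of zero bytes.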


Hi @wokitronic, are you able to see that there are indeed image files in your thumbs_A/thumbs_down and thumbs_A/thumbs_up folders?

It is normal for the Nano to take a couple of minutes to initially start the training (PyTorch is loading). However, if it hangs indefinitely, can you try the suggestions below?

  • I notice in your screenshot that you are browsing JupyterLab from your Jetson desktop. Can you try browsing it from a PC? You should be able to type that URL into your PC’s browser and connect. This will free up more resources on your Jetson.

  • You might also want to shut down your Jetson’s desktop entirely by disconnecting the display, or shutting down the desktop

  • Do you have SWAP memory mounted? Can you keep an eye on the memory usage by running sudo tegrastats in the background?
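
As a quick way to answer the SWAP question, a small sketch like the one below reads the SwapTotal field that Linux reports in /proc/meminfo. The helper name swap_total_kb is hypothetical and only used for illustration; a result of 0 would suggest that no swap is mounted.

```python
import os


def swap_total_kb(meminfo_text):
    """Return the SwapTotal value in kB from /proc/meminfo-style text, or None."""
    for line in meminfo_text.splitlines():
        if line.startswith("SwapTotal:"):
            # Lines look like: "SwapTotal:  2029564 kB"
            return int(line.split()[1])
    return None


if __name__ == "__main__":
    # /proc/meminfo only exists on Linux, so guard the read.
    if os.path.exists("/proc/meminfo"):
        with open("/proc/meminfo") as f:
            print("SwapTotal (kB):", swap_total_kb(f.read()))
```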

Many thanks, Dusty.

I just disconnected the HDMI display from the Jetson and started Jupyter from my laptop, and it works.

Happy with that 😁

I have a similar issue in that each time I click to Train the model in the Thumbs Project, my Nano crashes and shuts down.

I have what I guess is a first-gen Jetson Nano 4GB running the latest JetBot image:

Jetson Nano (4GB) 4.4.1 0.4.2 jetbot-042_nano-4gb-jp441.zip

This is running headless so I am using a remote system to run through the course exercises.

Also, I am using the Wi-Fi connection (wlan0) instead of the hardwired Ethernet, so I am not sure if that makes a difference for the Docker container.

I ran the “sudo tegrastats” that Dusty recommended and these are the last lines before the crash:

RAM 2382/3964MB (lfb 107x4MB) SWAP 21/1982MB (cached 0MB) IRAM 0/252kB(lfb 252kB) CPU [29%@1479,27%@1479,30%@1479,29%@1479] EMC_FREQ 5%@1600 GR3D_FREQ 0%@76 APE 25 PLL@28C CPU@28.5C iwlwifi@40C PMIC@100C GPU@30C AO@36.5C thermal@29C POM_5V_IN 2874/1796 POM_5V_GPU 0/3 POM_5V_CPU 932/488
RAM 2424/3964MB (lfb 107x4MB) SWAP 26/1982MB (cached 0MB) IRAM 0/252kB(lfb 252kB) CPU [34%@1479,36%@1479,27%@1479,26%@1479] EMC_FREQ 5%@1600 GR3D_FREQ 0%@76 APE 25 PLL@28.5C CPU@28.5C iwlwifi@40C PMIC@100C GPU@26.5C AO@36.5C thermal@27.5C POM_5V_IN 3617/1804 POM_5V_GPU 38/3 POM_5V_CPU 1637/492
RAM 2477/3964MB (lfb 107x4MB) SWAP 49/1982MB (cached 0MB) IRAM 0/252kB(lfb 252kB) CPU [41%@1479,46%@1479,31%@1479,40%@1479] EMC_FREQ 5%@1600 GR3D_FREQ 7%@76 APE 25 PLL@28.5C CPU@28.5C iwlwifi@40C PMIC@100C GPU@27C AO@36.5C thermal@27.75C POM_5V_IN 3384/1810 POM_5V_GPU 76/4 POM_5V_CPU 1461/496
RAM 2554/3964MB (lfb 107x4MB) SWAP 51/1982MB (cached 0MB) IRAM 0/252kB(lfb 252kB) CPU [32%@1132,32%@1132,46%@1132,35%@1132] EMC_FREQ 5%@1600 GR3D_FREQ 99%@384 APE 25 PLL@28.5C CPU@28C iwlwifi@40C PMIC@100C GPU@27C AO@36.5C thermal@27.75C POM_5V_IN 3241/1816 POM_5V_GPU 346/5 POM_5V_CPU 963/498
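
To make lines like these easier to compare, here is a rough sketch that pulls the RAM, SWAP and POM_5V_IN pairs out of one tegrastats line. The function name parse_tegrastats is made up for this example, and the reading of POM_5V_IN as current/average input power in milliwatts is my understanding of the tegrastats format rather than something stated in this thread.

```python
import re


def parse_tegrastats(line):
    """Extract (first, second) integer pairs for RAM, SWAP and POM_5V_IN."""
    patterns = {
        "ram": r"RAM (\d+)/(\d+)MB",        # used/total in MB
        "swap": r"SWAP (\d+)/(\d+)MB",      # used/total in MB
        "power_in": r"POM_5V_IN (\d+)/(\d+)",  # assumed current/average in mW
    }
    result = {}
    for key, pattern in patterns.items():
        match = re.search(pattern, line)
        result[key] = (int(match.group(1)), int(match.group(2))) if match else None
    return result


# Shortened copy of the last line above (some fields left out for brevity).
sample = (
    "RAM 2554/3964MB (lfb 107x4MB) SWAP 51/1982MB (cached 0MB) "
    "GR3D_FREQ 99%@384 POM_5V_IN 3241/1816 POM_5V_GPU 346/5 POM_5V_CPU 963/498"
)

if __name__ == "__main__":
    print(parse_tegrastats(sample))
```

In the lines above, RAM and SWAP are nowhere near full right before the crash, which is why memory does not look like the obvious culprit here.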

I’ll try to shut down the desktop as was suggested previously to see if that helps.

Cheers,

Jon

I connected the Nano via the hardwired ethernet interface and disabled the wlan0 and tried again. This time it seems to be working.

However, this message does show in the log files when clicking Train :

/usr/local/lib/python3.6/dist-packages/ipykernel_launcher.py:59: UserWarning: This overload of nonzero is deprecated:
nonzero(Tensor input, *, Tensor out)
Consider using one of the following signatures instead:
nonzero(Tensor input, *, bool as_tuple) (Triggered internally at …/torch/csrc/utils/python_arg_parser.cpp:766.)

Also, in the code under the title “Training and Evaluation”, it uses a try/except with :
except e:

If this code is reached, I believe it will cause an error, since ‘e’ is not defined.

An option would be to use:
except Exception as e:
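
Here is a small standalone sketch of the difference. It is not the notebook's code; the function names broken and fixed are made up for illustration.

```python
def broken():
    try:
        raise ValueError("boom")
    except e:  # 'e' is looked up as a name here, and no such name exists
        return "handled"


def fixed():
    try:
        raise ValueError("boom")
    except Exception as e:  # binds the caught exception to 'e'
        return "handled: {}".format(e)


if __name__ == "__main__":
    try:
        broken()
    except NameError as err:
        # The original ValueError gets replaced by a NameError about 'e'.
        print("broken() raised:", err)
    print(fixed())
```

So with "except e:" the real error from training would be hidden behind a NameError, which makes the actual problem harder to see.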

Cheers,

Jon

Hi @jonnymovo, your board abruptly powering off is typically a sign of a power supply issue. Which power supply are you using?

If you put the board into 5W mode by running this command on your device, does it still shut down?

sudo nvpmodel -m 1

If it no longer shuts down in 5W mode, then you may want to upgrade your power supply. My guess regarding Wi-Fi vs. Ethernet is that the Wi-Fi adapter was using more power.

I can give that a try. I have a 5V 3A power supply connected to the barrel jack, so I suppose I could look for something with more amperage.

Thanks.