I am working through the DLI Intro course on Getting Started with AI on the Jetson Nano (classification_interactive.ipynb). I have successfully worked through all of the steps up to training the thumbs up / thumbs down version of ResNet. I have taken 30 images each for thumbs up and thumbs down and confirmed that the files exist. When I select the train button, the train and evaluate buttons “grey out”. The progress bar advances only after a very long time (hours) and then stops at about 20%. The browser is Chrome running on an iMac. The Nano has the Intel M.2 WiFi card installed but is connected by wired Ethernet and is running in headless mode. I am using the Adafruit 5V 4A power supply. There is continuous Ethernet activity.
I have tried the following with no success:
updating all of the packages for the DLI Nano image
turning on the fan at full speed (sudo sh -c 'echo 255 > /sys/devices/pwm-fan/target_pwm')
reducing the power profile (sudo nvpmodel -m 1)
At this point, I am stuck and would greatly appreciate any help.
I had this issue too. You probably have some zero-size images in your dataset, and the training gets stuck on them. Open a console and check the size of every image, e.g. with ls -la. Delete any empty files and capture new images to replace them, so that you have 30 images again.
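If there are a lot of files to go through, the same check can be scripted. This is only a rough sketch and not part of the course notebook: find_empty_files is a name I made up, and you have to pass it whatever folder your images were saved to.

```python
from pathlib import Path


def find_empty_files(root):
    """Return a sorted list of zero-byte regular files found under root.

    `root` is whatever folder holds your captured images; this helper is
    a hypothetical convenience and is not part of the DLI notebook.
    """
    root = Path(root)
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.stat().st_size == 0
    )
```

Calling find_empty_files on your image folder and printing the result lists the files you need to delete and recapture. It only reports them, it does not delete anything.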
I had checked the files through the Jupyter Lab interface and they were there, but I never thought to look at the file sizes. Six files in the thumbs down folder were zero bytes! I deleted and replaced those images, and the training started within 30 seconds and all 10 epochs completed in just a few minutes. Thumbs up and thumbs down detection is working very well, even with a cluttered background in the image.
Hi, I have a related question. While training the thumbs up / thumbs down example, I mistakenly added a number of thumbs up images to the thumbs down data set. Now everything is classified as thumbs up in live mode. I did not save the model, but even if I close out Jupyter, shut down the Nano, and bring it back up, the data is still there. I don’t see where the images are stored. What I would like to do is start completely over (and be more careful) and retrain with new data.
Hi rdlarson91. It looks like you should be able to delete all of the images, or even just the wrong ones, in the photos folder. For the thumbs classification exercise, they should be in either:
depending on which run you were doing. Shut down the kernel for the classification_interactive notebook (if the notebook is running), delete the photos, and then restart the kernel or reopen the notebook. You can then retrain the model on the new / corrected images. Hope that helps.
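If you would rather script the cleanup than delete the photos by hand, something like the following might work. It is only a sketch under my own assumptions: clear_images and the list of image suffixes are made up for illustration, and the folder you pass in is whichever dataset folder you used. It defaults to a dry run that lists what would be removed, so you can check the list before deleting anything. Shut down the notebook kernel first, as described above.

```python
from pathlib import Path

# Assumed image extensions; adjust if your captures use something else.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def clear_images(dataset_dir, dry_run=True):
    """List (and optionally delete) image files under dataset_dir.

    Hypothetical helper, not part of the DLI notebook. With dry_run=True
    nothing is deleted; the matching paths are only returned.
    """
    dataset_dir = Path(dataset_dir)
    targets = sorted(
        p for p in dataset_dir.rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    if not dry_run:
        for p in targets:
            p.unlink()  # remove the image file; folders are left in place
    return targets
```

Run it once with the default dry_run=True and look over the returned paths, then call it again with dry_run=False to actually delete them. The category folders themselves are kept, so the notebook should find them again when it is reopened.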