Jetson Nano - retraining detection problems with FDDB demo

I have spent several days trying to get the FDDB example working from

/jetson-inference/python/training/detection/train.py

I tried mobilenet_v2, but it isn’t implemented in “reshape.py” and I don’t know how to add it.
The vgg11 model crashes my Jetson Nano.
inception_v3 was also too big.
resnet18 trains, but it can’t locate the face (the bounding box width and height come out as negative values). But I think resnet18 is only meant for image classification, isn’t it?

Is there any way to get the example running? Maybe by changing reshape.py? Or could you tell me the correct commands?
That would be great :)

Meanwhile I am trying the “pytorch-ssd” example from dusty, but it freezes my Jetson Nano and I have to reboot. I will keep experimenting and try to get it working. Does the resulting model, after converting it to ONNX, work with detectnet_camera.py?

By the way: in the future I would like to use DIGITS. Can you recommend a good NVIDIA GPU to start with?

Thank you very much.
Klaus

Hi,

Have you tried our Transfer Learning Toolkit?
It can also help you retrain a model.

Thanks.

Hi AastaLLL,

thank you very much for the quick help. It sounds good. And yes, I would like to retrain my own model for object detection.

Just to confirm:

  • Does TLT run on the Jetson Nano alone (SD card image 32.3.1, or do I need 32.4.2 for the container to work)? At the moment I don’t have a workstation with an NVIDIA graphics card, and I have no experience with cloud training.

  • Does it allow me to re-train a detection model? At the moment my data is arranged like in the FDDB example, so I would have to convert it to the KITTI format (I have sketched below what I think a label line looks like).
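
As far as I understand the KITTI label format, it is one text file per image with one line per object, roughly like this (the values here are just made up by me as an example):

  face 0.00 0 0.00 101.00 45.00 180.00 152.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

where “face” is the class name, columns 5-8 are the bounding box (left, top, right, bottom in pixels), and the remaining 3D fields can stay 0 for 2D detection. Please correct me if I got that wrong.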

It looks like you need a workstation for it; the Jetson Nano alone is not enough.

I will try it. But anyway:

Is there a chance to do the transfer learning on the Jetson Nano itself, like in the FDDB demo? That would be the fastest way for me to get results and convince my boss, since I have already prepared my training data in the same format as that demo. With ResNet18 (the only model that runs) I always get negative values for the bounding box size, even with the original FDDB data.

Can you recommend a desktop system for using DIGITS (GPU, RAM size), or a free cloud service for testing?

Sorry for all the questions - I’m new to the AI world.

Thank you very much.

Hi kl.sc, TLT runs on an x86 system with an NVIDIA discrete GPU; running TLT on Jetson isn’t supported. However, once a model has been re-trained with TLT, it can be deployed to Jetson for inference.

In my dev branch of jetson-inference (currently the depth branch), the FDDB code has been removed in lieu of pytorch-ssd. Sorry, the FDDB demo was a proof of concept from when the PyTorch->ONNX->TensorRT workflow didn’t work with SSD networks (it does now with JetPack 4.4 / TensorRT 7.1). So I am currently building the object detection re-training tutorial around the pytorch-ssd repo, which I am able to run on a Jetson Nano with JetPack 4.4.
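
Roughly, the workflow I am building the tutorial around looks like this - treat the script names and arguments as a sketch rather than final commands, since they come from my pytorch-ssd fork and jetson-inference and may still change while the tutorial is in progress:

  # re-train SSD-Mobilenet on your own dataset (lower --batch-size if you run out of memory)
  python3 train_ssd.py --data=data/faces --model-dir=models/faces --batch-size=2 --epochs=30

  # export the re-trained PyTorch checkpoint to ONNX
  python3 onnx_export.py --model-dir=models/faces

  # load the ONNX model through TensorRT with jetson-inference
  detectnet_camera.py --model=models/faces/ssd-mobilenet.onnx --labels=models/faces/labels.txt --input-blob=input_0 --output-cvg=scores --output-bbox=boxes

So yes, once the re-trained model has been exported to ONNX, it is meant to be loaded by detectnet_camera.py.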

If pytorch-ssd freezes for you during re-training, have you confirmed that you are able to mount 4GB of swap? You might also want to try a smaller batch size, and you should be on JetPack 4.4.
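
For reference, mounting a swap file on the Nano looks roughly like this (standard Linux commands; adjust the path if you want the swap file somewhere else):

  # create and enable a 4GB swap file
  sudo fallocate -l 4G /mnt/4GB.swap
  sudo chmod 600 /mnt/4GB.swap
  sudo mkswap /mnt/4GB.swap
  sudo swapon /mnt/4GB.swap

  # to re-mount it automatically after a reboot, add this line to /etc/fstab:
  # /mnt/4GB.swap  none  swap  sw  0  0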