Hey everyone, so I have been training the SSD-Mobilenet-v2 model to detect ping pong balls, following Dusty's guide on YouTube. I have around 600 different images of a white ping pong ball and an orange ping pong ball, and I labelled them and drew bounding boxes in each of the images.
The first time I trained the model I only had 120 images and ran 10 epochs, and the issue was that it couldn't detect the ping pong balls most of the time. It would occasionally detect the white ping pong ball, but only with around 50% accuracy, and most of the time it doesn't detect it even with the ball right up to the camera. As for the orange ping pong ball, it doesn't detect it at all.
The second time, I added more varied images to the dataset, ending up with a total of around 600 images (both the orange and white ping pong balls appear in the same images). Both the white and orange ping pong balls are in labels.txt, and I properly drew a bounding box around each of them in every image. I ran this data for 30 epochs, and when I tested it I got the same issues as before: poor accuracy and no detection of the orange ping pong ball at all. It also keeps mistaking other white objects for the white ping pong ball.
I was wondering if training on top of a model I had already trained is the reason why the results are basically identical even after the increase in images and epochs.
Do you guys have any suggestions or any insight on why I am getting such poor results? Note that I am testing the ping pong balls in the exact same environment, with the same lighting, that I trained it in. Any help would be appreciated, thank you.
My labels in labels.txt are as shown below:
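(The label list itself didn't survive in this copy of the post; based on the two classes described above, the labels.txt would look something like the following. The exact strings are assumptions on my part and must match whatever was entered in camera-capture.)

```
white ping pong ball
orange ping pong ball
```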
I have all the files train.txt, trainval.txt, test.txt, and val.txt. After running the camera-capture program I only had image IDs listed in train.txt and test.txt, so I just copied all the image names from train.txt into the other two.
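For anyone hitting the same gap, the manual copy step can be sketched like this. The dataset path is hypothetical (substitute your own capture directory), and it assumes the Pascal VOC-style ImageSets layout that camera-capture and train_ssd.py use:

```python
from pathlib import Path

# Hypothetical dataset path; adjust to your own capture directory.
sets_dir = Path("data/pingpong/ImageSets/Main")
sets_dir.mkdir(parents=True, exist_ok=True)

# Stand-in for the train.txt that camera-capture already wrote,
# so this sketch is self-contained:
(sets_dir / "train.txt").write_text("img_0001\nimg_0002\nimg_0003\n")

# Copy the same image IDs into val.txt and trainval.txt so that
# train_ssd.py finds all the split files it expects:
ids = (sets_dir / "train.txt").read_text().splitlines()
for name in ("val.txt", "trainval.txt"):
    (sets_dir / name).write_text("\n".join(ids) + "\n")
```

One caveat with this shortcut: because val.txt then contains the same images the model trains on, any validation accuracy it reports won't reflect how the model generalizes; holding out a separate slice of images for val.txt gives more honest numbers.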
Hi @vincent.phung26, can you try deleting the .engine file from your model's directory and running detectnet again? If you ran onnx_export.py multiple times, detectnet might be using an older cached version of the TensorRT engine, and deleting it will re-generate it the next time you run detectnet. (Note that this was fixed in a more recent build of jetson-inference, which compares the model checksums to determine when the model was updated and will then automatically re-generate the TensorRT engine, but I'm not sure which build of jetson-inference you are using.)
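The cache-clearing step looks roughly like this; the model directory name and engine filename here are stand-ins (the actual engine filename encodes your TensorRT version and precision), so substitute the directory you passed to train_ssd.py:

```shell
# Hypothetical model directory; use the one you gave train_ssd.py / onnx_export.py.
MODEL_DIR=models/pingpong
mkdir -p "$MODEL_DIR"
# Stand-in for the cached engine so this sketch is self-contained:
touch "$MODEL_DIR/ssd-mobilenet.onnx.1.1.GPU.FP16.engine"
# Delete any cached TensorRT engine(s); detectnet will rebuild them on the next run:
rm -f "$MODEL_DIR"/*.engine
```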
You can also run train_ssd.py with the --validation-mean-ap flag or run eval_ssd.py, and this will compute the per-class detection accuracies for you, so you can tell how well the model is trained overall and for each object class. Since the ping-pong balls you’re detecting are small, you may also want to try training it with --resolution=512 (instead of the default 300x300 resolution) and for more epochs to see if that helps.
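To see why the input resolution matters for small objects like ping-pong balls, here is some rough arithmetic (the 40 px ball diameter and 1280-wide camera frame are illustrative assumptions, not values from this thread):

```python
# How large does an object appear to the detector after the camera
# frame is resized down to the network's input resolution?
def apparent_size(obj_px, frame_px, net_px):
    """Object size in pixels after resizing a frame_px-wide frame to net_px."""
    return obj_px * net_px / frame_px

ball = 40       # ball diameter in the raw camera frame (assumed)
frame_w = 1280  # camera frame width (assumed)

print(apparent_size(ball, frame_w, 300))  # 9.375 px at the default 300x300 input
print(apparent_size(ball, frame_w, 512))  # 16.0 px with --resolution=512
```

At the default 300x300 input the ball shrinks to under 10 pixels, which is near the lower limit of what SSD reliably detects; the larger input resolution keeps more pixels on the ball, which is also why detection tends to fall off with distance.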
I tried deleting the engine file and reran detectnet, and the results still weren't much better. So I'm trying to run with the --validation-mean-ap and --resolution=512 flags. However, when I run it in the terminal like this:
I deleted the engine file and reran detectnet, and now the accuracy is much better, but it only really works when the camera is at a distance of 50 cm or less. As for the --resolution flag, I ran git pull on the master branch in the jetson-inference folder and saw updates come through in the terminal, but when I start up the container to run
I still get the unrecognized arguments error. I'm not sure what I'm doing wrong; if you have any fixes for this, and recommendations on how to make the detectnet program recognize the ping pong balls from further away, that would help a lot. Aside from that, up close the accuracy is usually 60-98%, but at a further distance it doesn't recognize the ping pong balls at all.
OK, glad that regenerating the TensorRT engine helped your model. What version of JetPack-L4T are you running? My guess is that the container hasn't been updated, or you need to sudo docker pull dustynv/jetson-inference:rXX.X (where XX.X is your L4T version, or check the log when you first start docker/run.sh to see which container it runs).
Or when you are inside the container, you can git pull pytorch-ssd inside the container the same way that you did outside the container:
git checkout master
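Put together, updating pytorch-ssd from inside the container looks something like the sketch below. The path is the default jetson-inference layout and may differ on your install, so it is guarded so the commands only run if that directory exists:

```shell
# Default location of the pytorch-ssd checkout inside the jetson-inference
# container; adjust if your install differs.
SSD_DIR=/jetson-inference/python/training/detection/ssd
if [ -d "$SSD_DIR" ]; then
    cd "$SSD_DIR"
    git checkout master   # make sure we're on the master branch
    git pull              # fetch the latest train_ssd.py (with --resolution support)
fi
```

Note that changes pulled this way are lost when the container exits unless you commit the container or mount the directory from the host.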