I have very basic questions.
I am developing a binary classifier.
It is for use in a real-world industrial application to identify damage inside a large, inaccessible structure. The classes are ‘healthy’ and ‘damage’. I have data for both classes in the form of wide-angle still images and cropped details from them.
Obviously, damage only occurs within part of an otherwise healthy frame. The data has been gathered using a system that provides nearly variation-free images (no angle or focus variation and limited exposure variation; proximity is the only variable apart from the individual subject matter itself). In visual terms, both classes are composed of various material textures.
So far I have built two versions of a database and tested them using the jetson-inference tutorials in order to make basic decisions, such as the choice of pretrained model and the balance of full-frame vs. detail images across the two classes. I hope to take the material to the next stage using the TAO toolkit.
I have tried GoogLeNet, ResNet18, ResNet34 and ResNet50.
Despite more than two months' work, I have not made any significant progress. I built the first database up to 1,000 images, and its confusion matrix stubbornly remained at (true positive × true negative) = (false positive × false negative), despite the accuracy of the ‘best model’ varying at the validation stage. Inference came out at 80% ‘damage’, and this never varied throughout the build.
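As I understand it, that equality means the predicted label is statistically independent of the true label, so the Matthews correlation coefficient (MCC) is zero regardless of the reported accuracy. Below is a minimal sketch of the check. The counts are made up for illustration and are not my actual results, and the helper name `mcc` is my own.

```python
import math

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denom

# Hypothetical counts (NOT my real results): 80% of images are called
# 'damage' whether they are actually damaged or healthy.
tp, fn = 80, 20  # actual 'damage': 80 predicted damage, 20 predicted healthy
fp, tn = 80, 20  # actual 'healthy': 80 predicted damage, 20 predicted healthy

print(tp * tn == fp * fn)    # True: the pattern described above
print(mcc(tp, tn, fp, fn))   # 0.0: no better than guessing
```

With these made-up numbers the accuracy is 50% on a balanced set, which is what a model that ignores the image content would give.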
Given the lack of progress, the size of the first database became an unwelcome barrier to further development. I stripped down the material and built a second database that uses only one type of damage in the ‘damage’ class.
The results using the new database are essentially the same as before, the only difference being that the 80% is now a ‘healthy’ inference.
I have kept records of the results of each 100-epoch training run.
• Should I be looking at a different backbone architecture for my use case?
• Should I be using a more complicated detection/classification process?
• There are currently more wide-frame images in the ‘healthy’ class and more detail images in the ‘damage’ class, with a mix in both. The two classes are otherwise balanced. Is this good practice? Is there a better way?
• Or does anyone have any other advice, please?
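To make the class mix in the third question concrete, here is a rough sketch of how the wide/detail split per class could be tallied. The 70/30 figures and the `(class, view)` labelling are hypothetical placeholders, not my real counts. My worry is that, with a split like this, a network might pick up ‘wide vs. detail’ rather than ‘healthy vs. damage’.

```python
from collections import Counter

def composition(samples):
    """Return each view's share within its class for (class, view) pairs."""
    counts = Counter(samples)
    totals = Counter(cls for cls, _ in samples)
    return {pair: n / totals[pair[0]] for pair, n in counts.items()}

# Hypothetical dataset listing (NOT my real counts).
samples = (
    [("healthy", "wide")] * 70 + [("healthy", "detail")] * 30
    + [("damage", "wide")] * 30 + [("damage", "detail")] * 70
)

shares = composition(samples)
for (cls, view), share in sorted(shares.items()):
    print(f"{cls:8s} {view:6s} {share:.0%}")
```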
I am using the 4 GB Nano devkit with additional swap.
Thank you and apologies for the long question.