The notebook uses version 1 of OCRNet, while the newest version you can download is 2.1.
I am using OCRNet trainable_v2.0, which I downloaded with ngc.
I also had to change feature_channel: 512 to feature_channel: 192.
Running in nvcr.io/nvidia/tao/tao-toolkit:5.5.0-pyt
The notebook defaults to 10 epochs, which is fine for verifying that the setup works.
I am now training on the ICDAR15 dataset and I am seeing low accuracy. I am now running an experiment with 2000 epochs.
What is everyone else's experience? What are good defaults? I searched the forum, but there were not many people talking about OCRNet with TAO 😄
I had better luck going back to trainable_v1.0.
With v2 I was stuck at an accuracy of 0.05; with v1 I am starting at 0.78.
When I tried v2, it complained about feature_channel, saying 192 instead of 512.
I also changed the input size, since v1 uses half the input size of v2: 100x32x1 for v1 vs 200x64x1 for v2.
It was a bit confusing when I first looked at it, but with more experience it makes more sense. Could the README be updated to say where to look for what? :)
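For anyone hitting the same mismatch, the per-version differences mentioned in this thread can be collected in a small lookup. This is only a rough sketch and not part of TAO: the numbers are the ones reported above (feature_channel 512 with 100x32x1 input for trainable_v1.0, feature_channel 192 with 200x64x1 input for trainable_v2.0), the helper names are made up, and the key names input_width, input_height and input_channel are my assumption of what the model section of the spec file calls them, so check them against your own spec.

```python
# Settings per OCRNet checkpoint, as reported in this thread.
# Assumption: the key names other than feature_channel mirror the model
# section of the training spec; verify them against your own spec file.
OCRNET_SETTINGS = {
    "trainable_v1.0": {
        "feature_channel": 512,
        "input_width": 100,
        "input_height": 32,
        "input_channel": 1,
    },
    "trainable_v2.0": {
        "feature_channel": 192,
        "input_width": 200,
        "input_height": 64,
        "input_channel": 1,
    },
}


def model_overrides(version: str) -> dict:
    """Return a copy of the settings reported for the given checkpoint version."""
    if version not in OCRNET_SETTINGS:
        known = ", ".join(sorted(OCRNET_SETTINGS))
        raise ValueError(f"unknown version {version!r}; known versions: {known}")
    return dict(OCRNET_SETTINGS[version])


def spec_mismatches(version: str, model_section: dict) -> dict:
    """Map each differing key to a (value in spec, value expected) pair."""
    expected = model_overrides(version)
    return {
        key: (model_section.get(key), value)
        for key, value in expected.items()
        if model_section.get(key) != value
    }


# Example: the v1-style settings checked against the v2 checkpoint.
v1_style_section = model_overrides("trainable_v1.0")
print(spec_mismatches("trainable_v2.0", v1_style_section))
```

Running the check before training should show up front which keys need changing, instead of finding out from the feature_channel error.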