Training doesn't converge for Mapillary Vistas Dataset training with MaskRCNN

Let me retrain with 125 classes.
The original post said 124 classes.

He then also trained with 124 classes.

But in my Python code for the Mapillary-to-COCO conversion, I have only 18 classes, as shown in main.py:
main.py (9.6 KB)

The original has 37+1 classes.
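
To check which class count the converted annotations really contain, one option is to count the entries in the `categories` list of the generated COCO JSON file and compare them with the category ids the annotations use. This is only a minimal sketch, assuming the conversion writes a standard COCO-format file; the function name and the example path are hypothetical and are not taken from main.py.

```python
import json


def summarize_coco_categories(annotation_path):
    """Summarize the categories in a COCO-format annotation file.

    Returns a tuple of:
      - the number of categories declared in the file,
      - the sorted category ids that annotations actually use,
      - the sorted category ids used by annotations but never declared.
    """
    with open(annotation_path) as f:
        coco = json.load(f)

    declared = {c["id"] for c in coco.get("categories", [])}
    used = {a["category_id"] for a in coco.get("annotations", [])}
    return len(declared), sorted(used), sorted(used - declared)


# Hypothetical usage; replace the path with the file your conversion writes:
# num_declared, used_ids, undeclared_ids = summarize_coco_categories("annotations.json")
# print(num_declared, used_ids, undeclared_ids)
```

The declared count is the number to compare against the class setting in the training config. Whether a background class has to be added on top of it (the "+1") depends on the Mask R-CNN implementation in use.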