LPRnet saving a checkpoint on every epoch (huge disk consumption)

• Hardware: GTX 1660
• Network Type: LPRnet
• TLT Version: docker_tag: v3.0-py3
• Training spec file: tutorial_spec.txt (1.3 KB)

I noticed that some models accept the checkpoint_interval parameter in the spec file, but nothing about it is mentioned in the LPRnet documentation.
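For reference, in networks that do support it (DetectNet_v2, for example), the parameter sits in the training_config section of the spec file, roughly like this (the interval value here is illustrative):

```
training_config {
  # Save a checkpoint every 10 epochs instead of every epoch.
  checkpoint_interval: 10
}
```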

Does LPRnet accept this parameter? I’m trying to use it, but TLT still saves a checkpoint on every epoch.

LPRnet does not support saving intermediate models yet.
As a workaround, please write a script that deletes the saved models at an interval.
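A minimal sketch of such a cleanup script, assuming the checkpoints land in a single directory and follow a name pattern like lprnet_epoch-012.tlt (both the directory path and the pattern below are assumptions; adjust them to match your experiment):

```python
import os
import re

CHECKPOINT_DIR = "/workspace/tlt-experiments/lprnet/weights"  # assumption: adjust to your output dir
KEEP_INTERVAL = 10  # keep every 10th epoch, plus the latest

# Assumed filename pattern; check how your checkpoints are actually named.
pattern = re.compile(r"lprnet_epoch-(\d+)\.tlt$")

# Collect (epoch, path) pairs for every checkpoint file found.
checkpoints = []
for name in os.listdir(CHECKPOINT_DIR):
    match = pattern.search(name)
    if match:
        checkpoints.append((int(match.group(1)), os.path.join(CHECKPOINT_DIR, name)))

checkpoints.sort()
latest_epoch = checkpoints[-1][0] if checkpoints else None

for epoch, path in checkpoints:
    # Keep every Nth epoch and the most recent checkpoint (useful for resuming).
    if epoch % KEEP_INTERVAL == 0 or epoch == latest_epoch:
        continue
    print(f"Deleting {path}")
    os.remove(path)
```

Running it after (or periodically during) training keeps disk usage bounded without touching the checkpoints you actually want.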


I’m able to save the intermediate models with the provided code. Can you clarify your statement?

That’s a nice idea, but can you please confirm whether checkpoint_interval can be used, or whether there is something similar?

Could you share how you ran it?

I’m using the notebook provided in TLT Quick Start Guide, specifically the one named lprnet.ipynb. In this notebook, the default behavior is to save intermediate models for every epoch.

OK, yes, LPRnet can save all the models, one for every epoch. But LPRnet does not support saving models at an interval yet.