Currently, only the following stopping criteria are supported:
stop_criterion:
  metric: 'l2_relative_error_u'
  min_delta: 0.1
  patience: 5000
  mode: 'min'
  freq: 2000
  strict: true
I want to stop training as soon as the training loss drops below a given tolerance. In simple terms:
    while training:
        if training loss < tol:
            break
I am running a bare-metal installation of NVIDIA Modulus, so I can edit the source code if a small modification is needed to achieve this. In trainer.py, I can see that a simple break is used to stop training when the stopping criterion is met or when the maximum number of training iterations is reached: https://gitlab.com/nvidia/modulus/modulus/-/blob/release_22.09/modulus/trainer.py#L669
I want to add the if condition here, at the start of each iteration: https://gitlab.com/nvidia/modulus/modulus/-/blob/release_22.09/modulus/trainer.py#L496
How do I access the training loss? Is it part of the dictionary?
I also need to save the iteration number at which the training loss first meets this criterion.
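To make the intended behaviour concrete, here is a minimal, framework-independent sketch of what I have in mind. It is not Modulus code: `train_until_tolerance`, `save_stop_step`, and the `train_step` callable are hypothetical names used for illustration only, and I am assuming the per-iteration loss can be obtained as a plain float (for a PyTorch tensor that would be `loss.item()`). Note that this sketch checks the loss right after each step rather than at the start of the next one.

```python
import json
from typing import Callable, Optional


def train_until_tolerance(
    train_step: Callable[[int], float],  # hypothetical: runs one iteration, returns its loss
    max_steps: int,
    tol: float,
) -> Optional[int]:
    """Run train_step until the returned loss drops below tol.

    Returns the step at which the loss first fell below tol,
    or None if max_steps was reached without meeting the tolerance.
    """
    for step in range(max_steps + 1):
        loss = float(train_step(step))
        if loss < tol:
            # Tolerance met: stop training and report where it happened.
            return step
    return None


def save_stop_step(path: str, step: Optional[int], tol: float) -> None:
    """Write the stopping step and the tolerance to a small JSON file."""
    with open(path, "w") as f:
        json.dump({"stop_step": step, "tol": tol}, f)


if __name__ == "__main__":
    # Toy stand-in for a real training step: the loss decays as 1 / (step + 1).
    stop_step = train_until_tolerance(lambda s: 1.0 / (s + 1), max_steps=100, tol=0.1)
    save_stop_step("stop_step.json", stop_step, tol=0.1)
```

In the actual trainer, the same comparison would sit inside the existing loop next to the current break, using whatever variable holds the aggregated loss for that iteration.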