L-BFGS and AdaHessian Optimizers

Hi, I have a couple of questions.

1-) How can I use the L-BFGS optimizer? I cannot find any example. I want to use it by itself, and I also want to use it after an initial optimization period with the Adam optimizer, i.e. Adam first and then L-BFGS. Several papers have used this approach.

2-) How can I use the AdaHessian optimizer? Again, I cannot find any example.

3-) During training I only want to save the best model. But as far as I can see, Modulus saves models sequentially at a given save frequency. How can I save only the best model?
Best Regards
Thank you


Hi @udemirezen

Thanks for your interest in Modulus. Responses to your questions are sectioned below:

How can I use the L-BFGS optimizer? I cannot find any example. I want to use it by itself, and I also want to use it after an initial optimization period with the Adam optimizer. Several papers have used this approach. How can I use the AdaHessian optimizer? Again, I cannot find any example.

Switching between optimizers should be as simple as changing your config.yaml file. There is info on the optimizers we support in the config section of the user guide, and the related source code may also help if you are familiar with Hydra. For example:

defaults:
  - modulus_default
  - arch:
      - fully_connected
  - scheduler: tf_exponential_lr
  - optimizer: adahessian # or bfgs
  - loss: sum
  - _self_

optimizer:
  lr: 1e-3

The reason we have no examples that use L-BFGS / AdaHessian is that we have typically found Adam to work well. But it is still worth experimenting with them. Note that you may have to tune the parameters of the optimizer, which can also be done in the YAML; the parameters you can control and their defaults are found here.
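For instance, with - optimizer: bfgs in the defaults list above, a minimal sketch of a parameter override could look like the following (the parameter names mirror torch.optim.LBFGS, which the bfgs group wraps; treat them as assumptions and verify against the linked config schema):

optimizer:
  lr: 1.0                        # L-BFGS usually takes a unit step combined with a line search
  max_iter: 1000                 # inner L-BFGS iterations per optimizer step
  history_size: 100              # curvature pairs kept for the quasi-Newton approximation
  line_search_fn: strong_wolfe   # line-search option exposed by torch.optim.LBFGS

The same pattern applies to adahessian: its own parameters (learning rate, betas, etc.) can be overridden under the optimizer: key in the same way.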

During training I only want to save the best model. But as far as I can see, Modulus saves models sequentially at a given save frequency. How can I save only the best model?

If your training is set up correctly and is stable, the best model should be the model at the end of training, which should have the lowest training/validation loss if there is no overfitting. If you are interested in early stopping based on validation error, have a look at the stop criterion utility, which can be set up in your config (there is presently a bug with this feature; please see this post for the fix).
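As a rough sketch of what that could look like in the config (the field names below are assumptions based on the stop criterion utility and should be checked against the user guide):

stop_criterion:
  metric: loss       # monitored quantity, e.g. an aggregated or validation loss
  min_delta: 1.0e-4  # smallest change that counts as an improvement
  patience: 50000    # steps to wait without improvement before stopping
  mode: min          # stop when the metric stops decreasing
  freq: 1000         # how often (in steps) the criterion is evaluated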


Thank you very much for your good explanation. Now everything is crystal clear :D

Hi @udemirezen, any idea how to customize Modulus with an Adam + L-BFGS optimizer? As you mentioned, it is becoming popular to combine these two optimizers. Thanks.

Hi,

Regarding the use of the Adam + L-BFGS optimizers: is it possible to run, e.g., 10,000 steps with Adam, then change the optimizer to L-BFGS and resume from the 10,000th step? Does it work this way?

Thanks


Hi @jay_case

Right now it is not possible to have the two optimizers directly combined within the same training loop. It should, however, be possible to train for a fixed number of epochs, grab the model checkpoint, and then continue training from that checkpoint with a different optimizer, as @tsltaywb mentioned.

Deleting (or, preferably, renaming) your optim_checkpoint.pth in the outputs folder will force Modulus not to try to reload the old optimizer state. Thus you can switch from, say, Adam to BFGS in your config and use the same training directory.
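Putting this together, a rough two-phase recipe could look like the following, shown as two successive states of the same config.yaml (step counts are placeholders, and the training keys should be checked against your config):

# Phase 1 config: train with Adam, i.e. with "- optimizer: adam" in the defaults list
training:
  max_steps: 10000

# Phase 2 config: in the same training directory, rename outputs/.../optim_checkpoint.pth
# so the Adam state is not reloaded, keep the model .pth checkpoints in place,
# switch the defaults entry to "- optimizer: bfgs", and extend the step budget:
training:
  max_steps: 20000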

