L-BFGS and AdaHessian Optimizers

udemirezen · November 30, 2022, 4:45pm

Hi, I have a couple of questions.

1-) How can I use the L-BFGS optimizer? I cannot find any example. I want to use it on its own, and I also want to use it after an initial optimization period with the Adam optimizer (Adam first, then L-BFGS). Several papers use this approach.

2-) How can I use the AdaHessian optimizer? Again, I cannot find any example.

3-) During training I only want to save the best model. As far as I can see, Modulus saves models sequentially at an interval set by a parameter. How can I save only the best model?

Best regards,
Thank you

ngeneva · December 1, 2022, 9:32pm

Hi @udemirezen

Thanks for your interest in Modulus. Responses to your questions are sectioned below:

How can I use the L-BFGS optimizer? I cannot find any example. I want to use it on its own, and I also want to use it after an initial optimization period with the Adam optimizer (Adam first, then L-BFGS). Several papers use this approach. How can I use the AdaHessian optimizer? Again, I cannot find any example.

Switching between optimizers should be as simple as changing your config.yaml file. There's info on the optimizers we support in the config section of the user guide, and the related source code is worth a look if you are familiar with Hydra. For example:

defaults :
  - modulus_default
  - arch:
      - fully_connected
  - scheduler: tf_exponential_lr
  - optimizer: adahessian # or bfgs
  - loss: sum
  - _self_

optimizer: 
  lr: 1e-3

The reason we have no examples that use L-BFGS / AdaHessian is that we have typically found that Adam works. But it's still great to experiment with these. Note that you may have to tune the parameters of the optimizer, which can be done in the YAML as well. The parameters you can control / defaults are found here.

During training I only want to save the best model. As far as I can see, Modulus saves models sequentially at an interval set by a parameter. How can I save only the best model?

If your training is set up correctly and is stable, the best model should be the one at the end of training, which should have the lowest training/validation loss if there is no overfitting. If you're interested in early stopping based on validation error, have a look at the stop criteria utility, which can be set up in your config (there is presently a bug with this feature, please see this post with the fix).
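If you still want a "best so far" copy alongside the periodic checkpoints, one option is to handle it outside the training loop. The following is only a rough sketch in plain Python and is not a Modulus feature: `BestCheckpointKeeper` is a hypothetical helper, and how you obtain the validation loss and the path of the latest checkpoint depends on your own setup.

```python
import shutil
from pathlib import Path


class BestCheckpointKeeper:
    """Keep a single copy of the checkpoint with the lowest loss seen so far.

    Hypothetical helper for illustration; not part of Modulus.
    """

    def __init__(self, best_path):
        self.best_path = Path(best_path)
        self.best_loss = float("inf")

    def update(self, loss, checkpoint_path):
        """Copy `checkpoint_path` over the best copy if `loss` is a new minimum.

        Returns True when the best copy was replaced, False otherwise.
        """
        if loss < self.best_loss:
            self.best_loss = loss
            self.best_path.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(checkpoint_path, self.best_path)
            return True
        return False
```

You would call `update(val_loss, latest_checkpoint)` each time a new checkpoint is written; only the copy at `best_path` needs to be kept afterwards.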

udemirezen · December 5, 2022, 2:58pm

Thank you very much for your good explanation. Now everything is crystal clear :D

jay_case · December 11, 2022, 6:10am

Hi @udemirezen, any idea how to customize Modulus to use Adam + L-BFGS optimizers? As you mentioned, combining these two optimizers is becoming popular. Thanks.

tsltaywb · December 12, 2022, 2:38am

Hi,

Regarding the use of Adam + L-BFGS optimizers, is it possible to run e.g. 10,000 steps with Adam, then change the optimizer to L-BFGS and resume running from the 10,000th step? Does it work this way?

Thanks

ngeneva · December 14, 2022, 2:39am

Hi @jay_case

Right now it's not possible to have the two optimizers working together directly in the training loop. It should be possible, however, to train for a fixed number of epochs, grab the model checkpoint, then continue training from that checkpoint with a different optimizer, as @tsltaywb mentioned.
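For anyone unfamiliar with the two-stage idea itself, here is a minimal sketch in plain PyTorch, outside Modulus. The toy network, data, and step counts are made up for illustration. It only shows the general pattern: train with Adam first, then create a fresh L-BFGS optimizer on the same parameters. Note that `torch.optim.LBFGS` needs a closure that re-evaluates the loss.

```python
import torch

torch.manual_seed(0)

# Toy regression problem standing in for a real training loss (assumption).
x = torch.linspace(-1.0, 1.0, 64).unsqueeze(1)
y = torch.sin(3.0 * x)

model = torch.nn.Sequential(
    torch.nn.Linear(1, 16), torch.nn.Tanh(), torch.nn.Linear(16, 1)
)


def loss_fn():
    return torch.mean((model(x) - y) ** 2)


# Stage 1: a fixed number of Adam steps.
adam = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(200):
    adam.zero_grad()
    loss = loss_fn()
    loss.backward()
    adam.step()
loss_after_adam = loss_fn().item()

# Stage 2: a new L-BFGS optimizer on the same parameters.
# The Adam state is simply discarded; only the model weights carry over.
lbfgs = torch.optim.LBFGS(
    model.parameters(), lr=1.0, max_iter=50, line_search_fn="strong_wolfe"
)


def closure():
    # L-BFGS may evaluate the loss several times per step.
    lbfgs.zero_grad()
    loss = loss_fn()
    loss.backward()
    return loss


lbfgs.step(closure)
loss_after_lbfgs = loss_fn().item()
```

In Modulus the equivalent hand-off happens through the saved model checkpoint rather than in a single script, but the principle is the same: the weights are reused and the optimizer starts fresh.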

Deleting (or, preferably, renaming) your optim_checkpoint.pth in the outputs folder stops Modulus from trying to reload the old optimizer state. You can then switch from, say, Adam to BFGS in your config and keep using the same training directory.
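The rename step can of course be done by hand; if you prefer to script it, something like the sketch below would work. The helper name, the `.adam_bak` suffix, and the example run directory are hypothetical; only the `optim_checkpoint.pth` filename comes from the description above.

```python
from pathlib import Path


def set_aside_optimizer_state(run_dir, filename="optim_checkpoint.pth"):
    """Rename the saved optimizer state so it is not picked up on restart.

    Returns the new path, or None if there was nothing to rename.
    """
    src = Path(run_dir) / filename
    if not src.exists():
        return None
    dst = src.with_name(src.name + ".adam_bak")  # hypothetical backup suffix
    if dst.exists():
        raise FileExistsError(f"backup already exists: {dst}")
    src.rename(dst)
    return dst


# Example (path is an assumption; use your own training directory):
# set_aside_optimizer_state("outputs/my_run")
```

After this, change the optimizer entry in config.yaml and launch training again from the same directory.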

system · December 28, 2022, 2:39am

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
