How can I specify a saved "best rewards" model in inference?

How can I specify a saved “next best rewards” model in inference?
In “”, it is written as follows. There is no mention of “best rewards”.

“To load a trained checkpoint and continue training, use the checkpoint argument:
python task=Ant checkpoint=runs/Ant/nn/Ant.pth”

But during training, I saw the following display:

"fps step: 3875.4 fps step and policy inference: 3843.5 fps total: 967.0
saving next best rewards: [1.42e-05]
=> saving checkpoint ‘runs/TASK_NAME/nn/TASK_NAME.pth’ "

Is “TASK_NAME.pth” a model for “best rewards” or “last checkpoint”?

I confirmed it by comparing the file write time against TensorBoard: “TASK_NAME.pth” is the “best rewards” checkpoint.
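For anyone who wants to repeat the write-time check, here is a minimal sketch. It assumes the checkpoints are plain `.pth` files in a directory such as `runs/TASK_NAME/nn`. The helper name `list_checkpoints` is hypothetical and not part of any library. It only prints each file's modification time, which you can then compare by hand with the moment the last "saving next best rewards" line appeared in the log or in TensorBoard.

```python
from datetime import datetime
from pathlib import Path


def list_checkpoints(nn_dir):
    """Return (file name, modification time) pairs for .pth files, oldest first."""
    files = sorted(Path(nn_dir).glob("*.pth"), key=lambda p: p.stat().st_mtime)
    return [(p.name, datetime.fromtimestamp(p.stat().st_mtime)) for p in files]


if __name__ == "__main__":
    # Assumed location taken from the log line above; replace TASK_NAME with your task.
    for name, mtime in list_checkpoints("runs/TASK_NAME/nn"):
        print(f"{mtime:%Y-%m-%d %H:%M:%S}  {name}")
```

If the write time of TASK_NAME.pth matches the last "saving next best rewards" message rather than the end of training, that is consistent with the file being the best-rewards checkpoint.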

Yes, TASK_NAME.pth will be the best rewards checkpoint.
