Hi
How can I specify a saved “next best rewards” model for inference?
“README.md” says the following, with no mention of “best rewards”:
“To load a trained checkpoint and continue training, use the checkpoint
argument:
python train.py task=Ant checkpoint=runs/Ant/nn/Ant.pth”
But during training, I saw the following output:
"fps step: 3875.4 fps step and policy inference: 3843.5 fps total: 967.0
saving next best rewards: [1.42e-05]
=> saving checkpoint ‘runs/TASK_NAME/nn/TASK_NAME.pth’ "
Is “TASK_NAME.pth” the model with the best rewards, or the last checkpoint?