Loading a trained model doesn't appear to work as expected

I’m working through the PyTorch classifier models (thumbs, facial expressions, etc.) and have noticed that when I train and save a model (my_xyz_model.pth), I cannot seem to reload the model later and have it work properly. I can successfully reload the model, and it runs and outputs predictions, but the accuracy is essentially the same as if I ran only the base transfer model, before fine-tuning it on my classification data.

The interface seems to indicate that all I have to do is load up my notebook, make sure the path to my trained model is in the model path field, and click "load model".

Am I missing a step in this process? Do I need to do something in the code block where I load the transfer learning model (ResNet in my case)?



Are you using our TLT (Transfer Learning Toolkit) for retraining?


I haven’t explored it yet. Currently, I’m using the starter code in the intro DLI docker image.

TLT looks very interesting and I will have a look.

Should I fuss with making the load model function work in the DLI project, or just start using TLT?

I’m now working with TLT, and I will likely train future models on my GPU workstation and export them to my Nano.

My original issue still remains: I’ve trained and saved models using the intro tutorials on my Nano, but when I load them, they don’t perform any better than the transfer learning model I start with (ResNet18).

When reloading a model in the tutorial UI (specify the model path and click "load model"), do I need to do something different in the cells above, like not loading the base model (ResNet)?

I have a pth file. How do I make it work without retraining?

There has been no update from you for a while, so we assume this is no longer an issue.
Hence we are closing this topic. If you need further support, please open a new one.


Since the transfer learning job is not working across two different frameworks, it’s possible there is an issue with your training dataset.

Could you share the size of your dataset and an example (if possible) with us first?



I am having the same problem. Am I also missing something?

Many thanks.

Hi PDLM01,

Please open a new topic with more details. Thanks.