I’m getting an error when reloading models during the training phase. Does anyone have any advice?
Thanks,
Ben
Hi @mn17b2m
Can you tell me which version of Modulus you are running? Is this a bare-metal install or a Docker image?
I have not personally seen this issue before, but based on a related GitHub issue, it seems to be a bug that others are seeing in the current PyTorch version, and it could be related to CUDA Graphs.
Perhaps try turning CUDA Graphs off by setting cuda_graphs: False in your config?
I’m running v22.03, and it is a bare-metal install on Google Colab.
CUDA Graphs is a feature present in 22.07, not 22.03, so it is not relevant here. Based on the PyTorch issue thread I linked, you may want to try downgrading your PyTorch version (it seems this is happening for people on PyTorch 1.12). Please have a look there for more information that may be relevant to you.
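Before downgrading, it may help to confirm which PyTorch version the Colab runtime actually has. Below is a minimal sketch for that check. The helper names (parse_major_minor, is_suspect_version, installed_torch_version) are made up for illustration and are not part of Modulus or PyTorch, and the (1, 12) default only reflects what people report in the linked issue, not a confirmed list of affected releases.

```python
import importlib.metadata as metadata


def parse_major_minor(version):
    """Return (major, minor) from a version string such as '1.12.1+cu113'."""
    core = version.split("+", 1)[0]      # drop a local build tag like '+cu113'
    major, minor = core.split(".")[:2]   # keep only the first two components
    return int(major), int(minor)


def is_suspect_version(version, suspect=(1, 12)):
    """True if the version is in the release series reported in the issue thread.

    The default (1, 12) is an assumption taken from this thread.
    """
    return parse_major_minor(version) == suspect


def installed_torch_version():
    """Return the installed torch version string, or None if torch is missing."""
    try:
        return metadata.version("torch")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_torch_version()
    if found is None:
        print("torch is not installed in this environment")
    elif is_suspect_version(found):
        print(f"torch {found} is in the 1.12 series; a downgrade may be worth trying")
    else:
        print(f"torch {found} is outside the 1.12 series")
```

If it does report a 1.12 release, pinning an earlier PyTorch build that your Modulus version supports would be the next thing to try.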