DiscreteGeometry with Parameterization vs Transfer Learning

Part of my simulation programmatically alters a geometry and adds each new geometry as a tessellation in a list provided to a discrete geometry. It runs, but adding each new tessellation increases the memory usage by a significant amount.

I’m using a 2070 with 6 GB of RAM locally, so I have to run at relatively small batch sizes for the constraints. The values I use for a single geometry push it to 4 GB used, and each tessellation added to the discrete geometry adds ~1 GB more. Even on the target machines I’m “limited” to 15-20 GB per GPU.

I’d like to run on the order of 50-100 programmatically generated variations per original geometry. If I were to dial back the batch size to fit in the memory limits, the accuracy would not be good.

Would it make more sense to use transfer learning and use the model trained on the original geometry to infer outputs for the variations? I’m guessing the main pitfall is that even with transfer learning I will still need to retrain the model for the modified geometries. Is there any other way to reduce the memory usage for parameterized discrete geometries?

Thanks!

Hi @patterson

As you mentioned, the easiest way to lower memory usage is to reduce the batch size, but this can impact training. Another alternative is to decrease the size of your neural network, but this will also likely lower accuracy. You could look into first-order formulations of your PDEs to avoid the additional autograd passes needed for higher-order derivatives, but this is not always an option.
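As a rough illustration of the first-order idea (a generic 1D Poisson example, not taken from your simulation), a second-order equation can be split into a system where each residual only needs first derivatives. The auxiliary variable `p` is an assumption of this sketch: the network would output both `u` and `p`.

```latex
% Second-order form: one residual, needs a second derivative of u
\frac{\partial^2 u}{\partial x^2} = f(x)

% First-order form: two residuals, each needs only a first derivative
p = \frac{\partial u}{\partial x}, \qquad \frac{\partial p}{\partial x} = f(x)
```

The trade-off is an extra network output and an extra constraint, so whether this actually saves memory depends on the problem.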

One good option that may work for you is gradient aggregation (sorry no direct link, you’ll have to scroll down halfway). The basic idea is to accumulate gradients over multiple mini-batches to mimic a larger batch size, at the cost of longer training time. This could help out here.
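To make the idea concrete, here is a minimal sketch in plain Python on a toy least-squares problem. It is not the framework's implementation, and all names (`grad_sum`, `accumulated_step`, `full_batch_step`) are made up for illustration. It shows that summing gradients over small micro-batches and taking one step gives the same update as one large batch, while only one micro-batch has to be processed at a time.

```python
# Gradient accumulation sketch on a toy model y_hat = w * x (pure Python).
# Loss is the mean squared error over the full effective batch.

def grad_sum(w, xs, ys):
    """Sum (not mean) of d/dw (w*x - y)^2 over one mini-batch."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys))

def accumulated_step(w, xs, ys, micro_batch, lr):
    """One optimizer step from gradients accumulated over micro-batches."""
    total = 0.0
    for start in range(0, len(xs), micro_batch):
        stop = start + micro_batch
        # Only this slice is "in memory" for the gradient computation.
        total += grad_sum(w, xs[start:stop], ys[start:stop])
    grad = total / len(xs)  # normalise by the effective batch size
    return w - lr * grad

def full_batch_step(w, xs, ys, lr):
    """Reference step using the whole batch at once."""
    return w - lr * grad_sum(w, xs, ys) / len(xs)

# Toy data generated from y = 3 * x
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
ys = [3.0 * x for x in xs]

w_full = 0.0
w_acc = 0.0
for _ in range(200):
    w_full = full_batch_step(w_full, xs, ys, lr=0.01)
    w_acc = accumulated_step(w_acc, xs, ys, micro_batch=2, lr=0.01)

print(w_full, w_acc)  # both should end up very close to 3.0
```

In a PyTorch-style training loop the same pattern is usually written as several backward passes followed by a single optimizer step. The price is the extra forward and backward passes per update, which is where the longer training time comes from.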
