"CUDA out of memory" error when running Helmholtz example in Modulus

Hi, I’m new to Modulus and I get the following error whenever I run the Helmholtz Python example script. How do I solve this issue?

Hi @nga77 ,

Thanks for trying out Modulus. This is a general deep learning problem, not one specific to Modulus. Unfortunately, PINN training can take up quite a bit of memory compared to data-driven problems, since additional gradients need to be stored. We develop on V100 and A100 GPUs, so the GPU memory we work with is larger than 4 GB. Fortunately, there are some simple solutions you can try:
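To see why batch size matters so much here, a rough back-of-the-envelope estimate helps. The sketch below is purely illustrative (the layer sizes and the 3x PINN multiplier are assumptions, not Modulus internals), but it shows that activation memory scales linearly with batch size:

```python
# Rough activation-memory estimate for a fully connected PINN
# (hypothetical sizes; real training also stores weights, optimizer
# state, and the extra graphs kept for PDE-residual derivatives).
batch_size = 4096
layers = [2, 512, 512, 512, 512, 1]  # input coords -> hidden -> output

floats_per_sample = sum(layers[1:])            # activations kept per sample
bytes_fwd = batch_size * floats_per_sample * 4  # float32 forward pass

# PINN losses differentiate the network output w.r.t. its inputs, so
# autograd retains additional intermediate buffers; a ~3x multiplier
# over a plain forward pass is a common rule of thumb, not an exact figure.
bytes_pinn = 3 * bytes_fwd

print(f"forward ~{bytes_fwd / 1e6:.1f} MB, PINN ~{bytes_pinn / 1e6:.1f} MB")
```

Halving the batch size halves both estimates, which is why option 1 below is usually the first thing to try.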

  1. Lower your batch size. This can typically be done in your config file. (This will impact convergence.)
  2. Make your neural network smaller. This can be done in your config file or in the code itself; have a look at the API docs for the parameters you can change. (This will also impact convergence.)
  3. Train on hardware with more memory.
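For options 1 and 2, a config change might look like the sketch below. The key names are illustrative only; check the config file that ships with the Helmholtz example for the exact names used by your Modulus version:

```yaml
# Illustrative config sketch -- actual key names vary by Modulus version
batch_size:
  interior: 2000    # try halving these values first
  boundary: 500
arch:
  fully_connected:
    layer_size: 256  # smaller hidden layers use less memory
    nr_layers: 4     # fewer hidden layers also reduce the footprint
```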

Can this problem also be solved by using Modulus on public cloud instances like AWS?

Yes, provided that your remote instance has a GPU with sufficient memory. In development we may test on smaller problem sizes on test machines, then scale to larger systems for bigger problems. It greatly depends on the problem you’re working on.

Alternatively, you could try running in CPU mode, but this will be much slower.
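One generic way to force CPU execution in any PyTorch-based framework (Modulus may also expose its own device setting; this is just the environment-variable approach) is to hide the GPUs before any CUDA-aware library is imported:

```python
import os

# Hide all GPUs from this process so CUDA-aware libraries
# (e.g. PyTorch) fall back to the CPU. This must run before
# the framework is imported.
os.environ["CUDA_VISIBLE_DEVICES"] = ""
```

The same effect can be achieved from the shell by exporting `CUDA_VISIBLE_DEVICES=""` before launching the script.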

Alright, thank you so much for the help :)

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.