Error when applying the exact continuity feature to solve the heat variables

I'm having an issue applying the exact continuity feature when solving for the heat variables in the FPGA case (FPGA Heat Sink with Laminar Flow - Modulus 22.09 documentation).

For the flow variables, the exact continuity feature worked after disabling FuncTorch (without using the symmetry trick).

However, in the same case, the exact continuity feature fails when solving for the heat variables. The error message is:

Error executing job with overrides: []
Traceback (most recent call last):
  File "fpga_heat.py", line 368, in run
    thermal_slv.solve()
  File "/usr/local/lib/python3.8/dist-packages/modulus-22.9-py3.8.egg/modulus/solver/solver.py", line 159, in solve
    self._train_loop(sigterm_handler)
  File "/usr/local/lib/python3.8/dist-packages/modulus-22.9-py3.8.egg/modulus/trainer.py", line 599, in _train_loop
    self._record_validators(step)
  File "/usr/local/lib/python3.8/dist-packages/modulus-22.9-py3.8.egg/modulus/trainer.py", line 289, in _record_validators
    self.validator_outvar = self.record_validators(step)
  File "/usr/local/lib/python3.8/dist-packages/modulus-22.9-py3.8.egg/modulus/solver/solver.py", line 119, in record_validators
    return self.domain.rec_validators(
  File "/usr/local/lib/python3.8/dist-packages/modulus-22.9-py3.8.egg/modulus/domain/domain.py", line 57, in rec_validators
    valid_losses = validator.save_results(
  File "/usr/local/lib/python3.8/dist-packages/modulus-22.9-py3.8.egg/modulus/domain/validator/continuous.py", line 94, in save_results
    pred_outvar = self.forward(invar)
  File "/usr/local/lib/python3.8/dist-packages/modulus-22.9-py3.8.egg/modulus/domain/validator/validator.py", line 15, in forward_nograd
    pred_outvar = self.model(invar)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/modulus-22.9-py3.8.egg/modulus/graph.py", line 220, in forward
    outvar.update(e(outvar))
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/modulus-22.9-py3.8.egg/modulus/eq/derivatives.py", line 85, in forward
    grad = gradient(var, grad_var)
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/modulus-22.9-py3.8.egg/modulus/eq/derivatives.py", line 24, in gradient
        """
        grad_outputs: List[Optional[torch.Tensor]] = [torch.ones_like(y, device=y.device)]
        grad = torch.autograd.grad(
               ~~~~~~~~~~~~~~~~~~~ <--- HERE
            [
                y,
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.

To apply the exact continuity feature when solving for the heat variables, some changes were made to the code, as shown below:


The system information is listed below:
Linux Version: Win11 WSL-Ubuntu 20.04 LTS
Driver Version: 522.30
CUDA Version: 11.8
GPU: RTX3090
Modulus Version: Modulus Bare Metal Install - 22.09

Are there any suggestions for this issue?

Hi @johnlaide

Thanks for the detailed report and code snippets. Try adding requires_grad = True to the validator. This tells the validator to track gradients, which the exact continuity setup requires. It seems we forgot to add this for this part of the problem.

An example of this being turned on is in the flow script, where it is set from the continuity parameter in the config file.
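
To illustrate why the flag matters, here is a minimal sketch in plain PyTorch (not Modulus code). It assumes, based on the forward_nograd frame in the traceback, that a validator without requires_grad evaluates the model with gradient tracking disabled. The names gradient, run_validator, and the toy Linear model are hypothetical stand-ins for illustration only.

```python
import torch


def gradient(y, x):
    # Derivative of y with respect to x via autograd, similar in spirit
    # to the torch.autograd.grad call shown in the traceback.
    grad_outputs = [torch.ones_like(y)]
    return torch.autograd.grad(
        [y], [x], grad_outputs=grad_outputs, create_graph=True
    )[0]


def run_validator(model, x, requires_grad):
    # Hypothetical stand-in for a validator forward pass.
    x = x.clone()
    if requires_grad:
        # Track gradients so derivatives of the output can be taken.
        x.requires_grad_(True)
        y = model(x)
    else:
        # No graph is recorded, so y has no grad_fn.
        with torch.no_grad():
            y = model(x)
    return gradient(y, x)


torch.manual_seed(0)
model = torch.nn.Linear(1, 1)
x = torch.linspace(0.0, 1.0, 5).reshape(-1, 1)

# Without gradient tracking, the derivative node fails.
failed = False
try:
    run_validator(model, x, requires_grad=False)
except RuntimeError as err:
    failed = True
    print("without requires_grad:", err)

# With gradient tracking, the derivative is available.
dydx = run_validator(model, x, requires_grad=True)
print("with requires_grad:", tuple(dydx.shape))
```

The first call fails in the same way as the reported error because the output was computed without a graph; the second succeeds because the input tracks gradients before the model is evaluated.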

Hi @ngeneva

Thank you for your kind reply.

The exact continuity feature was successfully applied to this case after setting requires_grad = True on the validator.

Thanks & Regards
