Error in running some example tutorial code (ZeroEquation)

I am trying to run the examples with the NVIDIA Modulus 22.07 container. Everything runs fine until I reach the examples that use ZeroEquation, which requires computing the SDF derivatives (example code: ldc/ldc_2d_zeroEq.py, three_fin_2d/heat_sink.py). I always get the following error when running them:

  File "/modulus/modulus/solver/solver.py", line 159, in solve
    self._train_loop(sigterm_handler)
  File "/modulus/modulus/trainer.py", line 554, in _train_loop
    self._record_monitors(step)
  File "/modulus/modulus/trainer.py", line 298, in _record_monitors
    self.record_monitors(step)
  File "/modulus/modulus/solver/solver.py", line 148, in record_monitors
    self.domain.rec_monitors(self.network_dir, self.writer, step)
  File "/modulus/modulus/domain/domain.py", line 102, in rec_monitors
    monitor.save_results(key, writer, step, monitor_data_dir)
  File "/modulus/modulus/domain/monitor/pointwise.py", line 59, in save_results
    outvar = self.model(invar)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1129, in _call_impl
    return forward_call(*input, **kwargs)
  File "/modulus/modulus/graph.py", line 157, in forward
    outvar.update(e(outvar))
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1129, in _call_impl
    return forward_call(*input, **kwargs)
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
  File "/modulus/modulus/eq/derivatives.py", line 85, in forward
            var = input_var[var_name]
            grad_var = self.prepare_input(input_var, grad_sizes.keys())
            grad = gradient(var, grad_var)
                   ~~~~~~~~ <--- HERE
            grad_dict = {
                name: grad[i] for i, name in enumerate(self.gradient_names[var_name])
  File "/modulus/modulus/eq/derivatives.py", line 24, in gradient
    """
    grad_outputs: List[Optional[torch.Tensor]] = [torch.ones_like(y, device=y.device)]
    grad = torch.autograd.grad(
           ~~~~~~~~~~~~~~~~~~~ <--- HERE
        [
            y,
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

I suspect there is a bug where a variable is initialized without the requires_grad option enabled. Can anyone give some suggestions on how to run these examples?

Hi @Hin,

ldc_2d_zeroEq.py seems to be running fine for me. For this example, can you please check that the Inferencer, Validator, and Monitor have requires_grad=True set? E.g. for the LDC example:

    # add inferencer data
    grid_inference = PointwiseInferencer(
        nodes=nodes,
        invar=openfoam_invar_numpy,
        output_names=["u", "v", "p", "nu"],
        batch_size=1024,
        plotter=InferencerPlotter(),
        requires_grad=True, # Check this is here!
    )
    ldc_domain.add_inferencer(grid_inference, "inf_data")

The heat sink example also requires this to be turned on for the global_monitor monitor.
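For the heat sink case, the fix looks roughly like the sketch below. This is an assumption-laden illustration, not the exact example code: the names `geo`, `nodes`, and `domain` are placeholders, and the metric shown is hypothetical; the key point is the `requires_grad=True` argument on the `PointwiseMonitor`.

```python
# Hypothetical monitor setup for the heat sink example; geo, nodes, domain
# and the metric are placeholders standing in for the real example code.
global_monitor = PointwiseMonitor(
    geo.sample_interior(100),
    output_names=["nu"],
    metrics={"mean_nu": lambda var: torch.mean(var["nu"])},
    nodes=nodes,
    requires_grad=True,  # needed because nu depends on SDF derivatives
)
domain.add_monitor(global_monitor)
```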

Typically, inputs to these components do not have input gradients turned on (so you can't compute gradient quantities), which helps save memory during inference/validation. So if you require autodiff gradients (needed for the nu calculation in this case, I believe), you need to tell Modulus explicitly that gradients will be required.
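The underlying failure is plain PyTorch behavior and can be reproduced outside Modulus (a minimal sketch, independent of the examples): `torch.autograd.grad` raises exactly this `RuntimeError` when its input tensor was created without `requires_grad=True`.

```python
import torch

# Without requires_grad, y has no grad_fn, so autograd cannot differentiate it.
x = torch.tensor([1.0, 2.0, 3.0])
y = (x ** 2).sum()
try:
    torch.autograd.grad(y, x)
except RuntimeError as e:
    print(e)  # "element 0 of tensors does not require grad and does not have a grad_fn"

# With requires_grad=True the same computation differentiates fine: dy/dx = 2x.
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = (x ** 2).sum()
(grad,) = torch.autograd.grad(y, x)
print(grad)  # tensor([2., 4., 6.])
```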
