Hello everyone,
I have searched the forum, but it seems this particular case has not been asked about before.
If I change the final_activation of the FullyConnectedArch model to, e.g., Activation.RELU or Activation.SOFTPLUS, I get "loss went to Nans" almost immediately during training. The exact same model on the exact same problem setup works fine when final_activation is left at its default of None.
modulus.sym version: 0.1.0
torch version: 2.0.1
Here is how I do it:
from typing import Union

# NOTE: import paths follow the Modulus Sym layout on my install and may
# differ slightly between versions.
from modulus.sym.key import Key
from modulus.sym.models.activation import Activation, get_activation_fn
from modulus.sym.models.fully_connected import FullyConnectedArch


def get_model(
    input_keys: list = [Key("x"), Key("y"), Key("z"), Key("t")],
    output_keys: list = [Key("V"), Key("rho")],
    layer_size: int = 512,
    nr_layers: int = 6,
    skip_connections: bool = True,
    activation_fn: Activation = Activation.SILU,
    final_activation: Union[None, Activation] = None,
):
    """
    Build the NVIDIA Modulus fully connected model.

    Returns:
        flow_net
    """
    flow_net = FullyConnectedArch(
        input_keys=input_keys,
        output_keys=output_keys,
        # Arch specs
        layer_size=layer_size,
        nr_layers=nr_layers,
        skip_connections=skip_connections,
        adaptive_activations=False,
        activation_fn=activation_fn,
    )
    # ==== HERE ======
    # Swap the (normally linear) final layer's activation for the requested one.
    if final_activation is not None:
        flow_net._impl.final_layer.activation_fn = final_activation  # e.g. Activation.RELU
        flow_net._impl.final_layer.callable_activation_fn = get_activation_fn(
            final_activation,
            out_features=flow_net._impl.final_layer.linear.out_features,
        )
    return flow_net
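A quick way to confirm that the patch actually takes effect is to inspect the final layer right after construction. This is only a minimal sketch that reuses the same private attributes as the patch above, so the names may differ between Modulus Sym versions:

net = get_model(final_activation=Activation.SOFTPLUS)
# Both attributes are the ones patched inside get_model; expect SOFTPLUS here.
print(net._impl.final_layer.activation_fn)
print(net._impl.final_layer.callable_activation_fn)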
And then I call it in main as:
flow_net = get_model(
    input_keys=input_keys,
    output_keys=output_keys,
    layer_size=cfg_data.model_data.layer_size,
    nr_layers=cfg_data.model_data.nr_layers,
    skip_connections=cfg_data.model_data.skip_connections,
    final_activation=Activation.SOFTPLUS,
)
Thanks in advance!