Issue enabling FPGA Modulus Scenario, numpy.typing not found

Hello, I am trying to follow the Modulus extension documentation, which requires me to enable the FPGA Modulus Scenario plugin, and that’s where I’m stuck at the moment.
I’m assuming it’s an issue with the numpy version? I checked, and it shows as 1.19.0, which does not include numpy.typing.

Any ideas on how to upgrade it or something?

Traceback (most recent call last):
  File "/home/waverider/.local/share/ov/pkg/deps/566eea00d3e609fa2e714a3b36b9d556/plugins/bindings-python/omni/ext/impl/custom_importer.py", line 76, in import_module
    return importlib.import_module(name)
  File "/home/waverider/.local/share/ov/pkg/deps/566eea00d3e609fa2e714a3b36b9d556/python/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 728, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/home/waverider/.local/share/ov/data/Kit/Create.Next/2022.1/exts/3/modulus_scenario_fpga-1.0.0+cp37/fpga/__init__.py", line 1, in <module>
    from .extension import *
  File "/home/waverider/.local/share/ov/data/Kit/Create.Next/2022.1/exts/3/modulus_scenario_fpga-1.0.0+cp37/fpga/extension.py", line 1, in <module>
    from .fpga_flow_solver import run
  File "/home/waverider/.local/share/ov/data/Kit/Create.Next/2022.1/exts/3/modulus_scenario_fpga-1.0.0+cp37/fpga/fpga_flow_solver.py", line 1, in <module>
    from .fpga_geometry import *
  File "/home/waverider/.local/share/ov/data/Kit/Create.Next/2022.1/exts/3/modulus_scenario_fpga-1.0.0+cp37/fpga/fpga_geometry.py", line 4, in <module>
    from modulus.geometry.csg.csg_3d import Box, Channel, Plane
  File "/home/waverider/.local/share/ov/data/Kit/Create.Next/2022.1/exts/3/modulus_ext.core-22.3.1+lx64.r.cp37/deps/pip_prebundle_modulus/modulus/geometry/csg/csg_3d.py", line 26, in <module>
    from .csg import ConstructiveSolidGeometry, csg_curve_naming
  File "/home/waverider/.local/share/ov/data/Kit/Create.Next/2022.1/exts/3/modulus_ext.core-22.3.1+lx64.r.cp37/deps/pip_prebundle_modulus/modulus/geometry/csg/csg.py", line 11, in <module>
    from .curves import _sample_ranges, Curve
  File "/home/waverider/.local/share/ov/data/Kit/Create.Next/2022.1/exts/3/modulus_ext.core-22.3.1+lx64.r.cp37/deps/pip_prebundle_modulus/modulus/geometry/csg/curves.py", line 10, in <module>
    from chaospy.distributions.sampler.sequences.primes import create_primes
  File "/home/waverider/.local/share/ov/data/Kit/Create.Next/2022.1/exts/3/modulus_ext.core-22.3.1+lx64.r.cp37/deps/pip_prebundle/chaospy/__init__.py", line 11, in <module>
    from numpoly import *
  File "/home/waverider/.local/share/ov/data/Kit/Create.Next/2022.1/exts/3/modulus_ext.core-22.3.1+lx64.r.cp37/deps/pip_prebundle/numpoly/__init__.py", line 7, in <module>
    from .baseclass import ndpoly, FeatureNotSupported, PolyLike
  File "/home/waverider/.local/share/ov/data/Kit/Create.Next/2022.1/exts/3/modulus_ext.core-22.3.1+lx64.r.cp37/deps/pip_prebundle/numpoly/baseclass.py", line 37, in <module>
    import numpy.typing
ModuleNotFoundError: No module named 'numpy.typing'

2022-06-23 22:01:21 [10,313ms] [Error] [carb.scripting-python.plugin] Exception: Extension python module: 'fpga' in '/home/waverider/.local/share/ov/data/Kit/Create.Next/2022.1/exts/3/modulus_scenario_fpga-1.0.0+cp37' failed to load.
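One way to confirm the diagnosis is to ask the interpreter which NumPy it sees and whether `numpy.typing` resolves. `numpy.typing` first shipped in NumPy 1.20, so a 1.19.0 install cannot provide it. The sketch below is only an illustration and not part of the extension: `has_module` is a hypothetical helper name, and the result is only meaningful if it is run with the same Python that loads the extension.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if `name` can be located (parent packages are imported, the module itself is not)."""
    try:
        return importlib.util.find_spec(name) is not None
    except ImportError:
        # Raised when a parent package (for example numpy itself) is missing.
        return False


if has_module("numpy"):
    import numpy

    print("numpy version:", numpy.__version__)
    print("numpy.typing available:", has_module("numpy.typing"))
else:
    print("numpy is not importable from this interpreter")
```

If the second line prints `False`, the bundled NumPy predates 1.20 and anything that does `import numpy.typing` will fail the way the traceback above shows.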

Hello @robert34ck2! I’ve asked the dev team for more help with this. I’ll post back here when I hear back!

Hello @robert34ck2! I just wanted to give you an update. Currently, there are several developers discussing this issue to find a solution. I am pinging the team for more information!

Hi @robert34ck2 !

Sorry for the trouble. We ran into an unexpected incompatibility with Create 2022.1.3. We’re trying to fix it for the next release, but in the meantime, if you want to follow the documentation, you’ll need to downgrade to Create 2022.1.2.

Hello, I’ve downgraded to that version of Create, but now the software hard crashes when I try to load one of the scenarios. I’ve tried Isosurface, Streamlines, and Slices from the extension documentation. I even tried running Create from the terminal, but the logs don’t seem to be very helpful about what the issue is.

2022-07-01 20:14:13 [19,730ms] [Warning] [omni.usd] Warning: in _ReportErrors at line 2830 of /buildAgent/work/ca6c508eae419cf8/USD/pxr/usd/usd/stage.cpp -- Unresolved reference path </RootClass/worlds/scene> on prim @/home/waverider/.local/share/ov/data/Kit/Create.Next/2022.1/exts/3/modulus_scenario_fpga-1.0.0+cp37/fpga/fpga_template_temp.usda@,@anon:0x7f8e1c2abee0@</Root/scene>. (instantiating stage on stage @/home/waverider/.local/share/ov/data/Kit/Create.Next/2022.1/exts/3/modulus_scenario_fpga-1.0.0+cp37/fpga/fpga_template_temp.usda@ <0x138fe4f0>)

2022-07-01 20:14:13 [19,865ms] [Warning] [omni.usd] Warning: in _ReportErrors at line 2830 of /buildAgent/work/ca6c508eae419cf8/USD/pxr/usd/usd/stage.cpp -- Unresolved reference path </RootClass/worlds/scene> on prim @/home/waverider/.local/share/ov/data/Kit/Create.Next/2022.1/exts/3/modulus_scenario_fpga-1.0.0+cp37/fpga/fpga_template_temp.usda@,@anon:0x1a135ea0:fpga_template_temp-session.usda@</Root/scene>. (instantiating stage on stage @/home/waverider/.local/share/ov/data/Kit/Create.Next/2022.1/exts/3/modulus_scenario_fpga-1.0.0+cp37/fpga/fpga_template_temp.usda@ <0x1a1320c0>)

FPGAScenario.eval started.
In init_solver_eval.
About to make class's FPGA solver for evaluation
Configs for eval action:
training:
  max_steps: 1500000
  grad_agg_freq: 1
  rec_results_freq: 1000
  rec_validation_freq: 5000
  rec_inference_freq: 5000
  rec_monitor_freq: 1000
  rec_constraint_freq: 2000
  save_network_freq: 1000
  print_stats_freq: 100
  summary_freq: 1000
  amp: false
  amp_dtype: float16
  ntk:
    use_ntk: false
    save_name: null
    run_freq: 1000
profiler:
  profile: false
  start_step: 0
  end_step: 100
  name: nvtx
network_dir: /home/waverider/.local/share/ov/data/Kit/Create.Next/2022.1/exts/3/modulus_scenario_fpga-1.0.0+cp37/fpga/outputs/fpga_flow/network_checkpoint_flow
initialization_network_dir: ''
save_filetypes: npz
summary_histograms: false
jit: false
device: ''
debug: false
run_mode: eval
arch: ???
loss:
  _target_: modulus.aggregator.Sum
  weights: null
optimizer:
  _params_:
    compute_gradients: adam_compute_gradients
    apply_gradients: adam_apply_gradients
  _target_: torch.optim.Adam
  lr: 0.001
  betas:
  - 0.9
  - 0.999
  eps: 1.0e-08
  weight_decay: 0.0
  amsgrad: false
scheduler:
  _target_: custom
  _name_: tf.ExponentialLR
  decay_rate: 0.95
  decay_steps: 15000
batch_size:
  inlet: 560
  outlet: 560
  no_slip: 20000
  lr_interior: 2500
  hr_interior: 2500
  integral_continuity: 11250
  num_integral_continuity: 5
custom: ???

Thanks!

Can you please also send the results of the “nvidia-smi” command?

Note: Sorry if this is a double-post – I accidentally edited this instead of replying later in the chain.

Sure thing!

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.73.05    Driver Version: 510.73.05    CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0  On |                  N/A |
|  0%   47C    P8    39W / 350W |    244MiB / 12288MiB |      1%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      2258      G   /usr/lib/xorg/Xorg                111MiB |
|    0   N/A  N/A      2573      G   /usr/bin/gnome-shell              131MiB |
+-----------------------------------------------------------------------------+

Thanks!

Interesting. The crash you’ve reported looks like Modulus running out of memory during inference, but I would expect the card you’re using to have enough VRAM to handle it. I’ll reach out to the Modulus team and ask about the minimum memory requirements for their example.
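For reference, the memory figures can be read straight off the `nvidia-smi` table posted above. The sketch below only parses the `used / total` field from that row; `parse_memory_usage` is a hypothetical helper name, and the numbers describe the idle desktop at the time of the report, not the state during the crash.

```python
import re
from typing import Optional, Tuple


def parse_memory_usage(line: str) -> Optional[Tuple[int, int]]:
    """Extract (used_mib, total_mib) from an nvidia-smi table row, if present."""
    match = re.search(r"(\d+)MiB\s*/\s*(\d+)MiB", line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# Row copied from the nvidia-smi output posted above.
row = "|  0%   47C    P8    39W / 350W |    244MiB / 12288MiB |      1%      Default |"
used, total = parse_memory_usage(row)
print(f"used={used} MiB, total={total} MiB, free={total - used} MiB")
# prints: used=244 MiB, total=12288 MiB, free=12044 MiB
```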

Hi! It looks like the batch size is set too high in our initial example release, which is causing Modulus to run out of memory on your card. We’ll make sure to get this fixed for the upcoming release. In the meantime, you can reduce it by editing a file called fpga_flow_solver.py, located under the ~/.local folder on your filesystem. On line 183, the batch size is specified as 1024 * 1024. For your case, I would recommend reducing this to 1024 * 128 or 1024 * 64.

Specifically, on my system the file is located here:

~/.local/share/ov/data/Kit/Create.Next/2022.1/exts/3/modulus_scenario_fpga-1.0.0+cp37/fpga/fpga_flow_solver.py

You should be able to find it with the following commands:

cd ~/.local
find -L . -name "fpga_flow_solver.py"
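If `find` is not convenient, the same search can be sketched in Python. `os.walk(..., followlinks=True)` also descends into symlinked directories, which is roughly what the `-L` option does for `find`. `find_files` is a hypothetical helper name, not something shipped with the extension.

```python
import os
from typing import List


def find_files(root: str, filename: str) -> List[str]:
    """Return sorted paths under `root` whose basename equals `filename`."""
    matches = []
    # followlinks=True also walks symlinked directories (beware of link cycles).
    for dirpath, _dirnames, filenames in os.walk(root, followlinks=True):
        if filename in filenames:
            matches.append(os.path.join(dirpath, filename))
    return sorted(matches)


if __name__ == "__main__":
    for path in find_files(os.path.expanduser("~/.local"), "fpga_flow_solver.py"):
        print(path)
```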

Hope that helps :D
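For a sense of scale, the suggested values cut the number of points per batch by 8x and 16x. The sketch below is not the Modulus code. It shows the arithmetic and, assuming inference is evaluated one batch at a time, how a smaller batch size trades more passes for less work held in memory at once. `iter_batches` and the total point count are hypothetical.

```python
from typing import Iterator, Tuple

ORIGINAL = 1024 * 1024               # 1,048,576 points per batch (value quoted for line 183)
SUGGESTED = [1024 * 128, 1024 * 64]  # 131,072 and 65,536 points per batch


def iter_batches(n_points: int, batch_size: int) -> Iterator[Tuple[int, int]]:
    """Yield (start, stop) index ranges covering n_points in batch_size chunks."""
    for start in range(0, n_points, batch_size):
        yield start, min(start + batch_size, n_points)


n_points = 3_000_000  # hypothetical total number of inference points

for size in [ORIGINAL] + SUGGESTED:
    n_batches = sum(1 for _ in iter_batches(n_points, size))
    print(f"batch_size={size:>9,}  reduction={ORIGINAL // size:>2}x  batches={n_batches}")
```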

I was able to find the file and reduce the batch size. The file was in this directory: ~/.local/share/ov/data/Kit/Create.Next/2022.1/exts/3/modulus_scenario_fpga-1.0.0+cp37/fpga/fpga_flow_solver.py

I tried to run the sim again and it crashed again. Here are the logs.

[11.327s] app ready
2022-07-01 20:51:07 [20,360ms] [Warning] [omni.usd] Warning: in _ReportErrors at line 2830 of /buildAgent/work/ca6c508eae419cf8/USD/pxr/usd/usd/stage.cpp -- Unresolved reference path </Root
IModulusBase constructor called with name="Train" and desc="Train the FPGA model".
IModulusAction constructor called
IModulusBase constructor called with name="Eval" and desc="Evaluate the FPGA model".
IModulusAction constructor called
IModulusBase constructor called with name="Isosurface" and desc="Generate Isosurface Visualization".
IModulusAction constructor called
IModulusBase constructor called with name="Streamline" and desc="Generate Streamline Visualization".
IModulusAction constructor called
IModulusBase constructor called with name="Slice" and desc="Generate Slice Visualization".
IModulusAction constructor called
IModulusBase constructor called with name="update_data" and desc="Forward new simulation data to the visualization pipeline".
IModulusAction constructor called
TODO: ControlShareParameter constructor called
TODO: ControlShareParameter constructor called
TODO: ControlShareParameter constructor called
TODO: ControlShareParameter constructor called
TODO: ControlShareParameter constructor called
TODO: ControlShareParameter constructor called
TODO: ControlShareParameter constructor called
TODO: ControlShareParameter constructor called
TODO: ControlShareParameter constructor called
TODO: ControlShareParameter constructor called
TODO: ControlShareParameter constructor called
IModulusBase constructor called with name="FPGA" and desc="A modulus solver that simulates flow around an FPGA heat sink.".
[modulus_ext.core] Registered scenario "FPGA"
[modulus_ext.core] Update UI called.
registry: [<weakref at 0x7f2b199808f0; to 'FPGAScenario' at 0x7f2af1b1af50>]
IModulusScenario constructor called with params = "dict_values([<modulus_ext.core.scenario.scenario.ControlShareParameter object at 0x7f2a862fc590>, <modulus_ext.core.scenario.scenario.ControlShareParameter object at 0x7f2a862fc410>, <modulus_ext.core.scenario.scenario.ControlShareParameter object at 0x7f2a862fc710>, <modulus_ext.core.scenario.scenario.ControlShareParameter object at 0x7f2a862fccd0>, <modulus_ext.core.scenario.scenario.ControlShareParameter object at 0x7f2a867cde50>, <modulus_ext.core.scenario.scenario.ControlShareParameter object at 0x7f2a867cdbd0>, <modulus_ext.core.scenario.scenario.ControlShareParameter object at 0x7f2a867cd190>, <modulus_ext.core.scenario.scenario.ControlShareParameter object at 0x7f2a867cd6d0>, <modulus_ext.core.scenario.scenario.ControlShareParameter object at 0x7f2a867cd850>, <modulus_ext.core.scenario.scenario.ControlShareParameter object at 0x7f2a867cda10>, <modulus_ext.core.scenario.scenario.ControlShareParameter object at 0x7f2a867cdfd0>])", inputs = "[]", outputs = "[]", actions = "[<modulus_ext.core.scenario.scenario.IModulusAction object at 0x7f2b199fcc90>, <modulus_ext.core.scenario.scenario.IModulusAction object at 0x7f2b199fcf50>, <modulus_ext.core.scenario.scenario.IModulusAction object at 0x7f2a862fc290>, <modulus_ext.core.scenario.scenario.IModulusAction object at 0x7f2a8a39df10>, <modulus_ext.core.scenario.scenario.IModulusAction object at 0x7f2af1f45990>, <modulus_ext.core.scenario.scenario.IModulusAction object at 0x7f2a862fc690>]"
FPGAScenario.eval started.
In init_solver_eval.
About to make class's FPGA solver for evaluation
Configs for eval action:
training:
  max_steps: 1500000
  grad_agg_freq: 1
  rec_results_freq: 1000
  rec_validation_freq: 5000
  rec_inference_freq: 5000
  rec_monitor_freq: 1000
  rec_constraint_freq: 2000
  save_network_freq: 1000
  print_stats_freq: 100
  summary_freq: 1000
  amp: false
  amp_dtype: float16
  ntk:
    use_ntk: false
    save_name: null
    run_freq: 1000
profiler:
  profile: false
  start_step: 0
  end_step: 100
  name: nvtx
network_dir: /home/waverider/.local/share/ov/data/Kit/Create.Next/2022.1/exts/3/modulus_scenario_fpga-1.0.0+cp37/fpga/outputs/fpga_flow/network_checkpoint_flow
initialization_network_dir: ''
save_filetypes: npz
summary_histograms: false
jit: false
device: ''
debug: false
run_mode: eval
arch: ???
loss:
  _target_: modulus.aggregator.Sum
  weights: null
optimizer:
  _params_:
    compute_gradients: adam_compute_gradients
    apply_gradients: adam_apply_gradients
  _target_: torch.optim.Adam
  lr: 0.001
  betas:
  - 0.9
  - 0.999
  eps: 1.0e-08
  weight_decay: 0.0
  amsgrad: false
scheduler:
  _target_: custom
  _name_: tf.ExponentialLR
  decay_rate: 0.95
  decay_steps: 15000
batch_size:
  inlet: 560
  outlet: 560
  no_slip: 20000
  lr_interior: 2500
  hr_interior: 2500
  integral_continuity: 11250
  num_integral_continuity: 5
custom: ???

Hmm, interesting. What specific card are you using? (Sorry, it didn’t quite show in the nvidia-smi results.)

I’d like to try to replicate your setup so that I can determine where the problem is.

Edit: Never mind, I see that you’re using a 3080 Ti. I’ll try to get an equivalent machine to test with on our side. Thanks!

I’m using a 3080 Ti.
Here’s the output of my screenfetch command:

                          ./+o+-       waverider@typhoon
                  yyyyy- -yyyyyy+      OS: Ubuntu 22.04 jammy
               ://+//////-yyyyyyo      Kernel: x86_64 Linux 5.15.0-40-generic
           .++ .:/++++++/-.+sss/`      Uptime: 40m
         .:++o:  /++++++++/:--:/-      Packages: 1865
        o:+o+:++.`..```.-/oo+++++/     Shell: bash 5.1.16
       .:+o:+o/.          `+sssoo+/    Resolution: 5120x1440
  .++/+:+oo+o:`             /sssooo.   DE: GNOME 41.7
 /+++//+:`oo+o               /::--:.   WM: Mutter
 \+/+o+++`o++o               ++////.   WM Theme: Adwaita
  .++.o+++oo+:`             /dddhhh.   GTK Theme: Yaru [GTK2/3]
       .+.o+oo:.          `oddhhhh+    Icon Theme: Yaru
        \+.++o+o``-````.:ohdhhhhh+     Font: Ubuntu 11
         `:o+++ `ohhhhhhhhyo++os:      Disk: 29G / 51G (61%)
           .o:`.syhhhhhhh/.oo++o`      CPU: Intel Core i7-7700 @ 8x 4.2GHz [38.0°C]
               /osyyyyyyo++ooo+++/     GPU: NVIDIA GeForce RTX 3080 Ti
                   ````` +oo+++o\:     RAM: 2471MiB / 15959MiB

I just tested this in the new Create version, 2022.1.4-rc.7, and it is still an issue.

Hi Robert!
Could you provide the complete log file for one of these crashes? You should be able to find the logs in ~/.nvidia-omniverse/logs/Kit/Create.Next/2022.1/kit_*_*.log.

Thank you!
Mathias
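To pick out the most recent log from that folder, something like the following could be used. It is a minimal sketch that assumes the path pattern quoted above matches your install. `newest_log` is a hypothetical helper name.

```python
import glob
import os
from typing import Optional

# Pattern quoted in the post above; adjust it if your install differs.
LOG_PATTERN = "~/.nvidia-omniverse/logs/Kit/Create.Next/2022.1/kit_*_*.log"


def newest_log(pattern: str = LOG_PATTERN) -> Optional[str]:
    """Return the most recently modified file matching `pattern`, or None."""
    candidates = glob.glob(os.path.expanduser(pattern))
    if not candidates:
        return None
    return max(candidates, key=os.path.getmtime)


print(newest_log() or "no matching log files found")
```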

Here it is!

kit_20220707_085804.log (835.9 KB)

Hi Robert! I still see the numpy error in this log (line 4501), which prevents the FPGA extension from loading. This also seems to be from Create 2022.1.4-rc.7.
Do you have a log for Create 2022.1.2?
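A check like this can be repeated locally by scanning a log for the import error and printing the line numbers where it occurs. This is only a sketch: `find_lines` is a hypothetical helper, and the two sample lines are copied from output earlier in this thread rather than from the attached log files.

```python
from typing import Iterable, List, Tuple


def find_lines(lines: Iterable[str], needle: str) -> List[Tuple[int, str]]:
    """Return (1-based line number, text) for every line containing `needle`."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if needle in line
    ]


# Tiny stand-in for a real log, using lines quoted earlier in the thread.
sample = [
    "[11.327s] app ready\n",
    "ModuleNotFoundError: No module named 'numpy.typing'\n",
]
for number, text in find_lines(sample, "No module named"):
    print(f"{number}: {text}")
# prints: 2: ModuleNotFoundError: No module named 'numpy.typing'
```

To scan a real file, pass an open file object instead of the list, for example `find_lines(open(path, errors="replace"), "No module named")`.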

Sure can, here it is. It’s the one from my last attempt at running the simulation. I could downgrade back to 2022.1.2 if you need me to.

kit_20220701_155047.log (797.0 KB)

Tested it on Create 2022.1.5-rc.2 and there is still the dependency issue.