Hi, I am trying to use Modulus 23.08 on our cluster. For security reasons we can't use Docker, so we have to convert the image to Singularity format. On one of the systems, it worked fine with the aneurysm example.
However, on another cluster, it didn’t work. The error is:
tsltaywb@volta01:~/hpctmp_tsltaywb/modulus/myprojects/aneurysm_2308$ python aneurysm.py
/usr/local/lib/python3.10/dist-packages/hydra/_internal/hydra.py:119: UserWarning: Future Hydra versions will no longer change working directory at job runtime by default.
See https://hydra.cc/docs/1.2/upgrades/1.1_to_1.2/changes_to_job_working_dir/ for more information.
ret = run_job(
[06:27:53] - JIT using the NVFuser TorchScript backend
[06:27:53] - JitManager: {'_enabled': True, '_arch_mode': <JitArchMode.ONLY_ACTIVATION: 1>, '_use_nvfuser': True, '_autograd_nodes': False}
[06:27:53] - GraphManager: {'_func_arch': False, '_debug': False, '_func_arch_allow_partial_hessian': True}
/usr/local/lib/python3.10/dist-packages/modulus/sym/geometry/tessellation.py:104: RuntimeWarning: divide by zero encountered in divide
np.full(x.shape, triangle_areas[index] / x.shape[0])
Using BVH_GEQUEL
NUM Triangles: 37578
Timing for Build CPAT Model (build bvh): 0.19523s
Timing for cpatResultsToArrays: 0.0412009s
Timing for cpatDistanceField: 0.626457s
Timing for computeDistanceField: 0.626526s
terminate called after throwing an instance of 'std::runtime_error'
what(): Library not found
Aborted (core dumped)
Similarly, if I use the Docker image directly on my workstation running Windows WSL, I get exactly the same error.
Do you have any idea what's wrong?
However, we have now run into this problem. Is there any way we can use the tessellated geometry without the Docker image? We're moving towards 3D simulations, and I don't think that would be possible if we can't import the STL.
Thanks.
Hi, just an update. I managed to get it working on the other cluster. The error was due to a conflict between the container's Python and the Python libraries in my .local directory. Hope that helps.
Hi tsltaywb, could you give me more details on how you fixed this problem? I have the same problem with v24.04.
Hi thai.le,
The problem is that my own local Python is conflicting with the Singularity container's Python. My solution was simply to rename .local to .local_tmp. Of course, you need to check whether that will cause problems with your other code.
Thanks, tsltaywb! I tried your solution, but I didn't see any .local folder in my container. Do you have any ideas for my case? I am running the Modulus v24.04 container on a Windows system.
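In case it helps anyone checking for the same conflict, here is a minimal sketch (not from the Modulus docs) that reports whether the interpreter inside the container is picking up packages from the host's ~/.local. The helper name user_site_report is made up for illustration; it only uses the standard site and sys modules.

```python
import os
import site
import sys


def user_site_report(path_entries=None):
    """Return (user_site_dir, is_active, entries_under_home_local).

    Hypothetical helper for illustration; it is not part of Modulus.
    """
    entries = list(sys.path if path_entries is None else path_entries)
    # Directory where "pip install --user" puts packages for this interpreter
    user_site = site.getusersitepackages()
    # True when that directory is actually on the import path
    is_active = user_site in entries
    # Any import path entry below ~/.local comes from the home directory,
    # which Singularity normally mounts from the host
    home_local = os.path.join(os.path.expanduser("~"), ".local")
    under_home_local = [p for p in entries if p.startswith(home_local + os.sep)]
    return user_site, is_active, under_home_local


if __name__ == "__main__":
    user_site, is_active, under_home_local = user_site_report()
    print("user site-packages:", user_site)
    print("active on sys.path:", is_active)
    for entry in under_home_local:
        print("home-directory path visible to this interpreter:", entry)
```

If it reports an active user site, a less invasive alternative to renaming .local may be to run with PYTHONNOUSERSITE=1 set, or with python -s; both are standard CPython options that skip the user site-packages directory. I have not confirmed that this fixes the "Library not found" error on every setup.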
Hi thai.le, sorry, that’s all I know.
Btw, this solution is for my school’s cluster. It will not work if you are using WSL.
Thanks, tsltaywb. I don't know exactly what went wrong on the Windows system, so I switched to a Linux system (Ubuntu 22.04), and it worked with the latest Modulus container.
I have the same error as you. Mine looks like this:
root@6e71f8e00829:/test3# /usr/bin/python /test3/examplestest3/aneurysm.py
/usr/local/lib/python3.10/dist-packages/hydra/_internal/hydra.py:119: UserWarning: Future Hydra versions will no longer change working directory at job runtime by default.
See https://hydra.cc/docs/1.2/upgrades/1.1_to_1.2/changes_to_job_working_dir/ for more information.
ret = run_job(
[W init.cpp:779] Warning: nvfuser is no longer supported in torch script, use _jit_set_nvfuser_single_node_mode is deprecated and a no-op (function operator())
[W init.cpp:767] Warning: nvfuser is no longer supported in torch script, use _jit_set_nvfuser_enabled is deprecated and a no-op (function operator())
[05:27:21] - JIT using the NVFuser TorchScript backend
[05:27:21] - JitManager: {'_enabled': True, '_arch_mode': <JitArchMode.ONLY_ACTIVATION: 1>, '_use_nvfuser': True, '_autograd_nodes': False}
[05:27:21] - GraphManager: {'_func_arch': False, '_debug': False, '_func_arch_allow_partial_hessian': True}
/usr/local/lib/python3.10/dist-packages/modulus/sym/geometry/tessellation.py:106: RuntimeWarning: divide by zero encountered in divide
np.full(x.shape, triangle_areas[index] / x.shape[0])
Using BVH_GEQUEL
NUM Triangles: 37578
Timing for Build CPAT Model (build bvh): 1.68367s
Timing for cpatResultsToArrays: 0.11893s
Timing for cpatDistanceField: 2.23038s
Timing for computeDistanceField: 2.23042s
terminate called after throwing an instance of 'std::runtime_error'
what(): Library not found