Hi.
I’m having trouble with torch on my Jetson Orin Nano.
When I try to import it in Python, the interpreter crashes.
I ran it with the fault handler enabled; here is the output:
python3 -q -X faulthandler
>>> import torch
Fatal Python error: Segmentation fault
Current thread 0x0000ffff8411c010 (most recent call first):
File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
File "<frozen importlib._bootstrap_external>", line 1166 in create_module
File "<frozen importlib._bootstrap>", line 556 in module_from_spec
File "<frozen importlib._bootstrap>", line 657 in _load_unlocked
File "<frozen importlib._bootstrap>", line 975 in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 991 in _find_and_load
File "/home/romain/.local/lib/python3.8/site-packages/torch/__init__.py", line 229 in <module>
File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
File "<frozen importlib._bootstrap_external>", line 848 in exec_module
File "<frozen importlib._bootstrap>", line 671 in _load_unlocked
File "<frozen importlib._bootstrap>", line 975 in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 991 in _find_and_load
File "<stdin>", line 1 in <module>
Segmentation fault (core dumped)
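For a script rather than an interactive session, the same fault handler that `-X faulthandler` turns on can be enabled from code before the crashing import. This is only a minimal sketch: the helper name `import_with_traceback` is hypothetical, and it uses nothing beyond the standard library `faulthandler` and `importlib` modules.

```python
import faulthandler
import importlib
import sys

# Enable the handler before the import that crashes, so that a segfault
# inside a C extension still dumps the Python-level traceback.
# sys.__stderr__ is the interpreter's original stderr stream, used here
# because the handler needs a real file descriptor to write to.
faulthandler.enable(file=sys.__stderr__, all_threads=True)


def import_with_traceback(module_name):
    """Import a module by name with the fault handler already active.

    Hypothetical helper for illustration; a plain ``import torch`` placed
    after ``faulthandler.enable()`` behaves the same way.
    """
    return importlib.import_module(module_name)


if __name__ == "__main__":
    # Replace "json" with "torch" to reproduce the crash report above.
    module = import_with_traceback("json")
    print("imported", module.__name__)
```

Setting the `PYTHONFAULTHANDLER=1` environment variable has the same effect as the `-X faulthandler` option.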
Hi @rom.boutet0, that line is where torch loads its C++ extension library and libtorch. I’m not sure why this segfault occurs in your environment, sorry about that. I would recommend uninstalling torch with pip3 and trying one of the other wheels to see if there is a difference. If you still can’t install torch and import it, then I would probably try flashing a fresh SD card.
Thanks for answering so quickly!
It looks like using PyTorch 1.14.0 fixed the torch import issue. I still get a warning (not sure it’s related) when importing torchvision:
UserWarning: Failed to load image Python extension: '/home/romain/.local/lib/python3.8/site-packages/torchvision-0.15.1-py3.8-linux-aarch64.egg/torchvision/image.so: undefined symbol: _ZNK3c107SymBool10guard_boolEPKcl' If you don't plan on using image functionality from `torchvision.io`, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have `libjpeg` or `libpng` installed before building `torchvision` from source?
I’m now facing a similar issue when trying to run any inference with YOLO. It is still a segfault:
Fatal Python error: Segmentation fault
Current thread 0x0000ffffa9498010 (most recent call first):
File "/home/romain/.local/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 459 in _conv_forward
File "/home/romain/.local/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 463 in forward
File "/home/romain/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1480 in _call_impl
File "/home/romain/.local/lib/python3.8/site-packages/ultralytics/nn/modules/conv.py", line 42 in forward_fuse
File "/home/romain/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1480 in _call_impl
File "/home/romain/.local/lib/python3.8/site-packages/ultralytics/nn/tasks.py", line 82 in _predict_once
File "/home/romain/.local/lib/python3.8/site-packages/ultralytics/nn/tasks.py", line 62 in predict
File "/home/romain/.local/lib/python3.8/site-packages/ultralytics/nn/tasks.py", line 45 in forward
File "/home/romain/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1480 in _call_impl
File "/home/romain/.local/lib/python3.8/site-packages/ultralytics/nn/autobackend.py", line 314 in forward
File "/home/romain/.local/lib/python3.8/site-packages/ultralytics/nn/autobackend.py", line 428 in warmup
File "/home/romain/.local/lib/python3.8/site-packages/ultralytics/yolo/engine/predictor.py", line 222 in stream_inference
File "/home/romain/.local/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 50 in generator_context
File "/home/romain/.local/lib/python3.8/site-packages/ultralytics/yolo/engine/predictor.py", line 184 in __call__
File "/home/romain/.local/lib/python3.8/site-packages/ultralytics/yolo/engine/model.py", line 253 in predict
File "/home/romain/.local/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 34 in decorate_context
File "MarionSpot_5.0/test_with_yolo.py", line 47 in <module>
Segmentation fault (core dumped)
I also tried with PyTorch 1.13.0.
I’m not sure this problem is related to torch like the previous one, but any clue is welcome.
I’ll give the l4t-pytorch container a try.
I would double-check that you don’t have multiple versions of PyTorch installed (perhaps the site-wide one that is working under /usr and then your user one under /home/romain/.local?). It seems odd to get these segfaults, which aren’t typically encountered.
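One way to look for duplicate installs without importing torch (and so without triggering the segfault) is to read the package metadata found on `sys.path`. The following is a minimal sketch that assumes Python 3.8 or newer, where `importlib.metadata` is in the standard library; the function name `find_installs` is hypothetical.

```python
import importlib.metadata as md


def find_installs(names=("torch", "torchvision")):
    """Return {package name: [(version, location), ...]} for each name.

    Only installation metadata is read, so the packages themselves are
    never imported. More than one entry for a name means that several
    copies are visible on sys.path (for example one under /usr and one
    under ~/.local).
    """
    found = {name: [] for name in names}
    for dist in md.distributions():
        try:
            name = (dist.metadata["Name"] or "").lower()
        except Exception:
            # Skip installs with missing or unreadable metadata.
            continue
        if name in found:
            found[name].append((dist.version, str(dist.locate_file(""))))
    return found


if __name__ == "__main__":
    for name, installs in find_installs().items():
        print(f"{name}: {len(installs)} install(s) found")
        for version, location in installs:
            print(f"  {version} at {location}")
```

If two entries show up for torch, uninstalling the unwanted one and running the check again confirms which copy Python will pick up.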
There is also a simple PyTorch test script here that does some basic checks, such as creating a GPU tensor and using cuDNN kernels:
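The linked script is not reproduced in this thread. As a rough stand-in, and not the script referred to above, here is a minimal sketch of the same kind of check: it creates a GPU tensor and runs one small convolution, which on a CUDA build would normally go through cuDNN. The function name `run_checks` and the tensor sizes are assumptions chosen for illustration.

```python
def run_checks():
    """Run a few basic PyTorch checks and return the results as a dict."""
    try:
        import torch
    except ImportError:
        # torch is not installed for this interpreter.
        return {"torch_imported": False}

    results = {
        "torch_imported": True,
        "torch_version": torch.__version__,
        "cuda_available": torch.cuda.is_available(),
    }

    if results["cuda_available"]:
        results["cudnn_version"] = torch.backends.cudnn.version()

        # Create a tensor directly on the GPU.
        x = torch.rand(1, 3, 64, 64, device="cuda")

        # A single small convolution on the GPU, similar in kind to the
        # layer that crashed in the YOLO traceback.
        conv = torch.nn.Conv2d(3, 8, kernel_size=3).cuda()
        with torch.no_grad():
            y = conv(x)

        # Wait for the GPU to finish so any failure surfaces here.
        torch.cuda.synchronize()
        results["conv_output_shape"] = tuple(y.shape)

    return results


if __name__ == "__main__":
    for key, value in run_checks().items():
        print(f"{key}: {value}")
```

Running it with `python3 -X faulthandler` keeps the traceback if the convolution segfaults the same way the YOLO warmup did.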