JetPack 5.1.3, PyTorch and torchvision: tried all thread suggestions

I can’t seem to add to the long-running thread, @dusty_nv, so I’m posting here. I’ve followed all the advice there and still can’t get things working. I need torchvision for a library.

When I try to compile torchvision, I get this error:

OSError: libmpi_cxx.so.20: cannot open shared object file: No such file or directory

Running sudo find / -name 'libmpi*' gives me this:

/usr/lib/aarch64-linux-gnu/libmpi_mpifh.so.40.20.2
/usr/lib/aarch64-linux-gnu/libmpi_usempif08.so.40
/usr/lib/aarch64-linux-gnu/libmpi_usempif08.so
/usr/lib/aarch64-linux-gnu/libmpi.so.40.20.3
/usr/lib/aarch64-linux-gnu/libmpi_java.so
/usr/lib/aarch64-linux-gnu/libmpi_cxx.so.40.20.1
/usr/lib/aarch64-linux-gnu/libmpi_mpifh.so.40
/usr/lib/aarch64-linux-gnu/libmpi_mpifh.so
/usr/lib/aarch64-linux-gnu/libmpi_usempi_ignore_tkr.so.40.20.0
/usr/lib/aarch64-linux-gnu/libmpi_usempi_ignore_tkr.so.40
/usr/lib/aarch64-linux-gnu/libmpi_cxx.so
/usr/lib/aarch64-linux-gnu/openmpi/lib/libmpi_mpifh.so.40.20.2
/usr/lib/aarch64-linux-gnu/openmpi/lib/libmpi_usempif08.so
/usr/lib/aarch64-linux-gnu/openmpi/lib/libmpi.so.40.20.3
/usr/lib/aarch64-linux-gnu/openmpi/lib/libmpi_java.so
/usr/lib/aarch64-linux-gnu/openmpi/lib/libmpi_cxx.so.40.20.1
/usr/lib/aarch64-linux-gnu/openmpi/lib/libmpi_mpifh.so
/usr/lib/aarch64-linux-gnu/openmpi/lib/libmpi_usempi_ignore_tkr.so.40.20.0
/usr/lib/aarch64-linux-gnu/openmpi/lib/libmpi_cxx.so
/usr/lib/aarch64-linux-gnu/openmpi/lib/libmpi_java.so.40.20.0
/usr/lib/aarch64-linux-gnu/openmpi/lib/libmpi_usempif08.so.40.21.0
/usr/lib/aarch64-linux-gnu/openmpi/lib/libmpi_usempi_ignore_tkr.so
/usr/lib/aarch64-linux-gnu/openmpi/lib/libmpi.so
/usr/lib/aarch64-linux-gnu/libmpi.so.40
/usr/lib/aarch64-linux-gnu/libmpi_cxx.so.40
/usr/lib/aarch64-linux-gnu/libmpi_java.so.40.20.0
/usr/lib/aarch64-linux-gnu/libmpi_java.so.40
/usr/lib/aarch64-linux-gnu/libmpi++.so
/usr/lib/aarch64-linux-gnu/libmpi_usempif08.so.40.21.0
/usr/lib/aarch64-linux-gnu/libmpi_usempi_ignore_tkr.so
/usr/lib/aarch64-linux-gnu/libmpi.so
/etc/alternatives/libmpi++.so-aarch64-linux-gnu
/etc/alternatives/libmpi.so-aarch64-linux-gnu
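One way to read the listing above is to group the libraries by their major version number and see whether a .so.20 build exists at all. The following is only a rough sketch: soname_majors is a hypothetical helper, and the paths list is a hand-copied subset of the find output rather than the full result.

```python
import re
from collections import defaultdict


def soname_majors(paths):
    """Map each library base name to the set of major versions found."""
    majors = defaultdict(set)
    # Matches e.g. ".../libmpi_cxx.so.40.20.1" -> ("libmpi_cxx", "40").
    # Unversioned names such as "libmpi_cxx.so" are skipped.
    pattern = re.compile(r"/(lib[^/]+?)\.so\.(\d+)(?:\.|$)")
    for path in paths:
        match = pattern.search(path)
        if match:
            majors[match.group(1)].add(int(match.group(2)))
    return dict(majors)


# Hand-copied subset of the find output (not the complete listing).
paths = [
    "/usr/lib/aarch64-linux-gnu/libmpi_cxx.so.40.20.1",
    "/usr/lib/aarch64-linux-gnu/libmpi_cxx.so.40",
    "/usr/lib/aarch64-linux-gnu/libmpi_cxx.so",
    "/usr/lib/aarch64-linux-gnu/libmpi.so.40.20.3",
]

found = soname_majors(paths)
print(found)
print("libmpi_cxx major 20 present:", 20 in found.get("libmpi_cxx", set()))
```

If major version 20 never shows up, whatever is asking for libmpi_cxx.so.20 was most likely built against a different Open MPI than the one on this system.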

I added this to my bash configuration:

export LD_LIBRARY_PATH=/usr/lib/aarch64-linux-gnu/openmpi/lib:$LD_LIBRARY_PATH

I tried adding libmpi_cxx.so.20 symlinks pointing to these files:

/usr/lib/aarch64-linux-gnu/libmpi_cxx.so.40
/usr/lib/aarch64-linux-gnu/openmpi/lib/libmpi_cxx.so.40.20.1

I wiped out the openmpi and dev installs as you mentioned here:

sudo apt-get purge -y libopenmpi-dev libopenmpi* openmpi-bin && \
    sudo apt-get install -y libopenmpi-dev openmpi-bin

I’m using this torch install: torch-2.1.0a0+41361538.nv23.06-cp38-cp38-linux_aarch64.whl

On torchvision, here is my branch:

* release/0.16

I also tried checking out the v0.16.1 tag, and got the same results.

Any ideas?

Hi @rich25, I think that error is coming from PyTorch, not torchvision (you may see the same error with python3 -c 'import torch'), and my suspicion is that you installed a PyTorch wheel built for a different version of JetPack.
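The check suggested above can be sketched as a small script: if importing torch alone already fails with the libmpi_cxx error, the problem is in the PyTorch install rather than in the torchvision source tree. try_import is a hypothetical helper written for this illustration, not part of PyTorch.

```python
import importlib


def try_import(module_name):
    """Attempt an import and return (ok, message) instead of raising."""
    try:
        importlib.import_module(module_name)
    except Exception as exc:
        # A missing shared library (e.g. libmpi_cxx.so.20) may surface as
        # OSError or ImportError, so any exception is reported here.
        return False, f"{type(exc).__name__}: {exc}"
    return True, f"{module_name} imported cleanly"


for name in ("torch", "torchvision"):
    ok, message = try_import(name)
    print(name, "->", message)
```

Checking torch first keeps the two failure sources apart: a failure on torch alone points at the wheel, while a clean torch import followed by a torchvision failure points at the torchvision build.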

OK, looking this up, that wheel is for JetPack 5. Are you on JetPack 5 or JetPack 6?

Thanks @dusty_nv!

I’m using JetPack 5.1.3. I’m not sure there is a build specifically for this release; according to the doc, the wheel covers these:
JetPack 5.1 (L4T R35.2.1) / JetPack 5.1.1 (L4T R35.3.1) / JetPack 5.1.2 (L4T R35.4.1)

JetPack 5.1.x releases should all be interoperable, because they’re all on Ubuntu 20.04. And checking my JetPack 5 system here, yeah, I have libmpi_cxx.so.40 on it, not *.so.20 like in the error.

Can you do a pip3 uninstall torch, and then re-download the wheel from this link to double-check it’s actually the right one?

https://developer.download.nvidia.cn/compute/redist/jp/v512/pytorch/torch-2.1.0a0+41361538.nv23.06-cp38-cp38-linux_aarch64.whl

And actually, in retrospect, those nvXX.XX PyTorch wheels were built without MPI and wouldn’t depend on it, so my suspicion is that a different wheel was actually downloaded.
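To double-check which wheel actually ended up installed, one option is to compare the version encoded in the wheel filename with what the Python environment reports. This is a minimal sketch: parse_wheel_filename and installed_version are hypothetical helpers, the filename split assumes the standard wheel naming layout (name, version, optional build tag, Python tag, ABI tag, platform tag), and the installed version string may be formatted slightly differently from the filename.

```python
from importlib import metadata


def parse_wheel_filename(filename):
    """Split a wheel filename into its tagged fields (build tag ignored)."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel name layout: {filename}")
    return {
        "name": parts[0],
        "version": parts[1],
        "python_tag": parts[-3],
        "abi_tag": parts[-2],
        "platform_tag": parts[-1],
    }


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


expected = parse_wheel_filename(
    "torch-2.1.0a0+41361538.nv23.06-cp38-cp38-linux_aarch64.whl"
)
print("expected :", expected["version"], expected["python_tag"], expected["platform_tag"])
print("installed:", installed_version("torch"))
```

If the installed version lacks the nvXX.XX suffix, or does not match the expected one, a different wheel (possibly a cached one) was picked up.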

It must have been something similar to that, or some kind of caching again.

I purged PyTorch and some dependencies, rebooted, reinstalled, and it seems to be compiling now.

Thank you again for your time, Dusty, and keep up the great work with the Jetson Research Group. It’s been awesome, and our team is getting a lot out of it!

Awesome Rich, glad you got it figured out - happy to help. See you around the research group!
