Platform: Orin NX
JetPack Version: 5.1
CUDA: 11.4
Torch version installed: 1.14.0a0+44dac51c.nv23.01
Issue:
I am not able to find a version of torchvision compatible with torch 1.14.0a0+44dac51c.nv23.01. This is one of the torch builds for Jetson platforms on JetPack 5.1; the other two are 1.14 (another revision) and 2.0.
Torch 2.0 has issues compiling with CUDA when installed on these platforms, so I chose 1.14, which does get compiled with CUDA. The problem is finding a version of torchvision that is compatible with this version of torch (not 2.0).
I tried torchvision versions 0.14 and 0.16, and got this error:
"RuntimeError: Couldn’t load custom C++ ops. This can happen if your PyTorch and torchvision versions are incompatible, or if you had errors while compiling torchvision from source. For further information on the compatible versions, check GitHub - pytorch/vision: Datasets, Transforms and Models specific to Computer Vision for the compatibility matrix. Please check your PyTorch version with torch.__version__ and your torchvision version with torchvision.__version__ and verify if they are compatible, and if not please reinstall torchvision so that it matches your PyTorch install."
I need help figuring out which versions of torchvision are compatible with torch version ‘1.14.0a0+44dac51c.nv23.01’.
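The error message asks for both version strings, so it can help to print them together with the CUDA status from the same Python environment. The following is only a minimal sketch: the helper name `report_versions` is made up for illustration, and it catches import failures so that it still runs when one of the packages is missing or fails to load.

```python
import importlib


def report_versions():
    """Collect torch/torchvision versions and CUDA status without crashing."""
    info = {}
    for name in ("torch", "torchvision"):
        try:
            module = importlib.import_module(name)
            info[name] = getattr(module, "__version__", "unknown")
        except Exception as exc:  # missing package or a failed extension load
            info[name] = f"import failed: {exc}"

    try:
        import torch
        # torch.version.cuda is None when torch was built without CUDA
        info["cuda_build"] = torch.version.cuda
        info["cuda_available"] = torch.cuda.is_available()
    except Exception:
        info["cuda_build"] = None
        info["cuda_available"] = False
    return info


if __name__ == "__main__":
    for key, value in report_versions().items():
        print(f"{key}: {value}")
```

Running this inside the same conda environment as the failing script shows whether the interpreter sees the NVIDIA build of torch or some other install.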
Hi @surajsingh52, did you try building torchvision from source as shown in the Installation section of this thread?
torchvision 0.14 should work with PyTorch 1.14, and torchvision 0.15 for PyTorch 2.0. It’s more important that torchvision gets compiled with CUDA than the specific version.
You can also copy the torchvision wheel out of this container (the wheels are found under /opt in the container).
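The pairing described above can be written down as a small lookup. This is a sketch only: the table holds just the two pairings stated in this thread, not an official compatibility matrix, and the names `THREAD_PAIRINGS`, `major_minor` and `suggested_torchvision` are hypothetical.

```python
# Pairings as stated in this thread (not an official compatibility matrix).
THREAD_PAIRINGS = {"1.14": "0.14", "2.0": "0.15"}


def major_minor(version):
    """Return 'major.minor' from a version such as '1.14.0a0+44dac51c.nv23.01'."""
    public = version.split("+", 1)[0]  # drop the local build tag after '+'
    major, minor = public.split(".")[:2]
    return f"{major}.{minor}"


def suggested_torchvision(torch_version):
    """Look up the torchvision series the thread pairs with this torch build."""
    return THREAD_PAIRINGS.get(major_minor(torch_version))


print(suggested_torchvision("1.14.0a0+44dac51c.nv23.01"))  # 0.14
print(major_minor("0.15.1a0+42759b1"))                     # 0.15
```

The lookup returns None for any torch series the thread does not mention, which is a reminder that the table is incomplete by design.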
Hey @dusty_nv,
Please correct me if I am wrong. From what I have tried, the pairing of torchvision 0.14 with PyTorch 1.14, and torchvision 0.15 with PyTorch 2.0, is valid. But these versions of torch are not built for the ARM architecture of Jetson platforms. You can install them and use the torch and torchvision libraries, but they don’t use any of the CUDA devices. The CUDA devices on these platforms are only enabled when the torch builds made for these boards are installed, which for my setup is ‘1.14.0a0+44dac51c.nv23.01’, which I am guessing is a build of 1.14 for the Jetson platforms. So my main issue, again, is finding a version of torchvision compatible with torch=1.14.0a0+44dac51c.nv23.01.
I will check out the container you suggested.
Also, I will have to set up Docker on my Jetson board for this. Let me know if there’s a way to directly download the specific wheel, the way you have it for torch.
Hey @dusty_nv,
I tried installing torchvision with the wheel from the container.
The image I pulled was: dustynv/torchvision:r35.3.1.
The torchvision wheel I found: torchvision-0.15.1a0+42759b1-cp38-cp38-linux_aarch64.whl
I am able to import torchvision using the wheel, but with a warning:
‘/home/tonbo1/archiconda3/envs/yolov7/lib/python3.8/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: '/home/tonbo1/archiconda3/envs/yolov7/lib/python3.8/site-packages/torchvision/image.so: undefined symbol: _ZNK3c107SymBool10guard_boolEPKcl' If you don’t plan on using image functionality from torchvision.io, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have libjpeg or libpng installed before building torchvision from source?
warn(’
The problem seems to be a mismatch between the versions of torch and torchvision I have installed. When I run my script, I get a runtime error, which I have added below.
The runtime error: ‘RuntimeError: Couldn’t load custom C++ ops. This can happen if your PyTorch and torchvision versions are incompatible, or if you had errors while compiling torchvision from source. For further information on the compatible versions, check GitHub - pytorch/vision: Datasets, Transforms and Models specific to Computer Vision for the compatibility matrix. Please check your PyTorch version with torch.__version__ and your torchvision version with torchvision.__version__ and verify if they are compatible, and if not please reinstall torchvision so that it matches your PyTorch install.’
Update: The version of torchvision in the containers is the same for both revisions, 35.3.1 and 35.2.1.
So, let me know if you find a version of torchvision compatible with the version of torch I need (mentioned earlier).
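The undefined symbol in the warning above is a mangled C++ name, and reading it may help narrow down the mismatch. The sketch below is not a general demangler: it only splits the simple `_ZN[K]<length><name>...E` shape that appears in this warning, and the function name `nested_name_parts` is made up for illustration.

```python
def nested_name_parts(mangled):
    """Split the nested-name part of an Itanium C++ mangled symbol.

    Handles only the simple '_ZN[K]<len><name>...E' shape seen in the
    warning; it is not a general demangler.
    """
    if not mangled.startswith("_ZN"):
        raise ValueError("not a nested mangled name")
    pos = 3
    if mangled[pos] == "K":  # 'K' marks a const member function
        pos += 1
    parts = []
    while mangled[pos] != "E":  # 'E' closes the nested name
        start = pos
        while mangled[pos].isdigit():
            pos += 1
        length = int(mangled[start:pos])  # each name is length-prefixed
        parts.append(mangled[pos:pos + length])
        pos += length
    return parts


symbol = "_ZNK3c107SymBool10guard_boolEPKcl"
print("::".join(nested_name_parts(symbol)))  # c10::SymBool::guard_bool
```

The symbol lives in the c10 namespace, which belongs to PyTorch's core library. That suggests the torchvision wheel was compiled against a PyTorch build that exports this function, while the installed torch does not, which would be consistent with the wheel having been built for a different torch version.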
I believe the messages about ‘version mismatch’ could instead be caused by running pip3 install torch or pip3 install torchvision, which installs pre-built wheels from PyPI that were built without CUDA enabled. torchvision 0.14 should be compatible with PyTorch 1.14 if you build torchvision from source against PyTorch 1.14, but I don’t think it’s actually that sensitive to the versions as long as torchvision gets compiled with CUDA on your machine.
Sorry the wheel from the container didn’t work. It seems there are other dependencies in the container which you would need to install. I’m also not sure what impact using archiconda has; I haven’t used it. I would recommend compiling torchvision from source (and posting the build log here if it still doesn’t work with CUDA), or just using my torchvision container, which already has this stuff working in it.
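One way to check whether a given torchvision install was really compiled with CUDA is to run one of its compiled ops on GPU tensors. The following is a rough sketch, not an official test: the helper name `torchvision_cuda_smoke_test` is hypothetical, and it returns a status string instead of raising so that it can be run in any environment.

```python
def torchvision_cuda_smoke_test():
    """Try a compiled torchvision op on CUDA; return a short status string."""
    try:
        import torch
        import torchvision
    except Exception as exc:
        return f"import failed: {exc}"

    if not torch.cuda.is_available():
        return "torch has no usable CUDA device"

    try:
        # Two overlapping boxes in (x1, y1, x2, y2) format
        boxes = torch.tensor([[0.0, 0.0, 10.0, 10.0],
                              [1.0, 1.0, 11.0, 11.0]], device="cuda")
        scores = torch.tensor([0.9, 0.8], device="cuda")
        # nms is one of the custom C++/CUDA ops that the RuntimeError refers to
        keep = torchvision.ops.nms(boxes, scores, 0.5)
        return f"nms ran on CUDA, kept indices {keep.tolist()}"
    except Exception as exc:
        return f"torchvision op failed: {exc}"


if __name__ == "__main__":
    print(torchvision_cuda_smoke_test())
```

If the custom ops did not load, the last branch should report the same "Couldn’t load custom C++ ops" message seen earlier in the thread.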
To install torch and torchvision, I used pip for both, but with the wheels recommended by NVIDIA for this platform and architecture.
The commands:
“pip install torchvision-0.15.1a0+42759b1-cp38-cp38-linux_aarch64.whl” and
“pip install torch-1.14.0a0+44dac51c.nv23.01-cp38-cp38-linux_aarch64.whl”.
I think this is the usual installation process, and the use of conda/archiconda doesn’t matter in this context (I might be wrong).
Also, the container is just a way to copy the relevant wheel you need; I don’t think it has any other role in my context. I will try installing torchvision from source.
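The file names of the two wheels in the commands above already carry some compatibility information. As a small illustration, assuming the usual `name-version[-build]-python-abi-platform.whl` layout, the fields can be split out like this (the function name `parse_wheel_name` is made up for this sketch):

```python
def parse_wheel_name(filename):
    """Split a wheel file name into its dash-separated fields.

    Assumes the usual layout name-version[-build]-python-abi-platform.whl.
    """
    stem = filename[:-4] if filename.endswith(".whl") else filename
    parts = stem.split("-")
    return {
        "name": parts[0],
        "version": parts[1],
        "python_tag": parts[-3],
        "abi_tag": parts[-2],
        "platform_tag": parts[-1],
    }


wheels = [
    "torch-1.14.0a0+44dac51c.nv23.01-cp38-cp38-linux_aarch64.whl",
    "torchvision-0.15.1a0+42759b1-cp38-cp38-linux_aarch64.whl",
]
for wheel in wheels:
    info = parse_wheel_name(wheel)
    print(info["name"], info["version"], info["python_tag"], info["platform_tag"])
```

Both wheels target CPython 3.8 on linux_aarch64, so the interpreter and architecture tags agree. The tags say nothing about which torch build the torchvision wheel was compiled against, so matching tags alone do not rule out the mismatch discussed in this thread.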