I am getting the following error after running the "import torch" command, as shown below:
I have JetPack 5.1.2 installed on my Jetson AGX Xavier. Please let me know how to resolve this error.
The CUDA version installed is 12.0, the CUDA upgrade package is 12.2, the Python version is 3.8, the Torch version is 2.1, and the numpy version that got installed during the installation process is 1.24.4.
root@linux:/home/trident# $ python3
bash: $: command not found
root@linux:/home/trident# export LD_LIBRARY_PATH=/usr/lib/llvm-8/lib:$LD_LIBRARY_PATH
root@linux:/home/trident# python3
Python 3.8.10 (default, Nov 22 2023, 10:22:35)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/torch/__init__.py", line 168, in _load_global_deps
    ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
  File "/usr/lib/python3.8/ctypes/__init__.py", line 373, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libcufft.so.10: cannot open shared object file: No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.8/dist-packages/torch/__init__.py", line 228, in <module>
    _load_global_deps()
  File "/usr/local/lib/python3.8/dist-packages/torch/__init__.py", line 189, in _load_global_deps
    _preload_cuda_deps(lib_folder, lib_name)
  File "/usr/local/lib/python3.8/dist-packages/torch/__init__.py", line 154, in _preload_cuda_deps
    raise ValueError(f"{lib_name} not found in the system path {sys.path}")
ValueError: libcublas.so.*[0-9] not found in the system path ['', '/usr/lib/python38.zip', '/usr/lib/python3.8', '/usr/lib/python3.8/lib-dynload', '/usr/local/lib/python3.8/dist-packages', '/usr/lib/python3/dist-packages', '/usr/lib/python3.8/dist-packages']
>>> print(torch.__version__)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'torch' is not defined
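For reference, a quick way to check which CUDA the system is actually set up with and which CUDA libraries the dynamic linker can find (the paths and versions below are examples, not necessarily what is on this unit):

# Which CUDA installation does /usr/local/cuda currently point to?
readlink -f /usr/local/cuda
# Which cuFFT/cuBLAS libraries can the dynamic linker see?
ldconfig -p | grep -E 'libcufft|libcublas'
# Which CUDA packages are installed from apt?
dpkg -l | grep -i cuda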
Hi @nagesh_accord, the PyTorch wheels for JetPack 5 were built against CUDA 11.4. The PyTorch binaries aren’t compatible across changes in the major version of CUDA or cuDNN. You would either need to recompile PyTorch for your environment or change back to CUDA 11.4. There are steps for building PyTorch in this topic:
Thanks for the updates. As I am a bit new to CUDA, PyTorch, OpenCV and the other related tools, please bear with my queries.
Please recommend which would be the better method: changing back to CUDA 11.4 or recompiling PyTorch for JetPack 5.1.2.
In either case, please point me to the correct steps, i.e. how to uninstall CUDA 12.0/12.2 and install CUDA 11.4,
and also how to recompile PyTorch.
I also wanted to know the correct order of installation for
the CUDA Toolkit, PyTorch, OpenCV, etc.,
i.e. which one needs to be installed first, second, third, and so on.
In the documentation it says the maximum PyTorch version compatible with JetPack 5.1.x
is PyTorch 2.0.0,
but the link you shared above says PyTorch 2.1.0 is compatible with JetPack 5.1.2.
Which is right?
I had installed PyTorch 2.1.0 based on that link only. Please clarify.
@nagesh_accord I believe that compatibility table just pertains to the official PyTorch wheels that NVIDIA releases; however, newer versions can continue to be built (like I do, and post to that thread). You can find instructions for building from source in that topic. I'm not sure how to undo the steps you have taken so far with installing different versions of CUDA/etc. in your environment (although it could be as easy as changing what the /usr/local/cuda symbolic link points to)
JetPack already comes with the CUDA Toolkit and OpenCV after flashing your device with SDK Manager and allowing it to complete the post-install steps, so you should just need to install PyTorch. If you continue running into issues, you may just want to re-flash to get your system into a known-working state again. Or in that case, I also recommend trying the l4t-pytorch container, which already includes PyTorch, torchvision, OpenCV, etc. pre-installed inside the container:
OK. How do I change the symbolic link to the version of CUDA that we want, in case I have more than one version of the CUDA Toolkit installed?
I am working on a customized carrier board, where I have flashed the updated BSP myself onto the SOM module.
So SDK Manager does not work on the SOM module which is present on the customized carrier board. Correct me if I am wrong?
So I need to install the CUDA Toolkit, PyTorch, OpenCV, etc. explicitly myself on the SOM.
I just want to know whether the PyTorch container also installs the complete CUDA Toolkit package?
First, make sure that /usr/local/cuda is indeed a symbolic link by checking ls -ll /usr/local/cuda* (you will see what they point to). Then rm -rf /usr/local/cuda and re-link it to CUDA 11 with ln -s /usr/local/cuda-11 /usr/local/cuda. If it is still not working after this and you are unfamiliar with Linux/CUDA, you probably just want to re-flash.
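For example, a minimal sketch of that check-and-relink sequence, assuming CUDA 11.4 was installed under /usr/local/cuda-11.4 (adjust the path to whatever the ls output shows on your system):

# See what the cuda symlinks currently point to
ls -l /usr/local/cuda*
# Remove just the symlink and point it at the CUDA 11.4 install
sudo rm /usr/local/cuda
sudo ln -s /usr/local/cuda-11.4 /usr/local/cuda
# Verify the toolkit now reports 11.4
/usr/local/cuda/bin/nvcc --version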
You should still be able to use SDK Manager to perform the post-flashing setup after you have flashed your custom BSP yourself. You can de-select the flashing step in SDK Manager and have it just do the post-flashing steps to install CUDA, cuDNN, OpenCV, etc. Or you may just be able to install them from the NVIDIA apt repo that typically comes with L4T.
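If you go the apt route, a rough sketch (assuming the standard NVIDIA L4T apt source that ships with L4T is present on your image) would be:

# Confirm the NVIDIA L4T repo is configured
cat /etc/apt/sources.list.d/nvidia-l4t-apt-source.list
# Install the JetPack components (CUDA, cuDNN, TensorRT, OpenCV, ...) matching your L4T release
sudo apt update
sudo apt install nvidia-jetpack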
Since JetPack 5 and newer, yes, the containers include CUDA/etc. installed inside the containers themselves (as opposed to JetPack 4, where these were mounted from the device). So if you are just using containers, technically you don't even need the CUDA Toolkit on your device. The l4t-pytorch container includes the full CUDA Toolkit (as do all of my containers, which are intended for development); however, you can find other base images on NGC that only include subsets of these, intended for deployment, to keep the image size down.
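As a quick sanity check that CUDA and PyTorch really are inside the image (using the r35.4.1 tag, which corresponds to JetPack 5.1.2 / L4T R35.4.1):

# Run the PyTorch container and query the GPU from inside it
sudo docker run --rm --runtime nvidia dustynv/l4t-pytorch:r35.4.1 \
  python3 -c "import torch; print(torch.__version__, torch.cuda.is_available())"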
OK, I can try this and check. However, as the hardware units are now in their final boxed and packed stage, we no longer have the recovery-mode option (shorting two pins) with the two USB 3.0 ports available for flashing, so I doubt I can do any more flashing.
Hence, I may have to fix any installation issues with tools/software/libraries without flashing from now on.
I just want to know: can we execute SDK Manager on the target, or is SDK Manager meant to be used from the host PC connected to the target?
As I currently cannot force my target into recovery mode by shorting the pins, can we execute SDK Manager from the host just by connecting the target to the host PC through a USB cable?
OK. I am currently following the documentation to install using the package manager method (either local/network type) or the runfile installation method, per the link below:
Hope my understanding is correct.
Now I am in a dilemma about which method to use: the container method or the package manager installation method? Which one do you suggest as simpler and more likely to work smoothly without much overhead?
Can you shed more light on NGC? Is it some other installation method apart from the package installation and container installation methods?
Sorry for the many queries. I want to understand these things better, hence all the questions. Thanks.
@nagesh_accord SDK Manager runs on the Ubuntu PC, and I believe it can reset your Jetson into recovery mode without shorting the pins, or if you run sudo reboot --force forced-recovery from your Jetson.
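Roughly, the sequence would look like this (worth double-checking the recovery command against the L4T documentation for your release):

# On the Jetson itself: reboot straight into forced recovery mode (no pin shorting needed)
sudo reboot --force forced-recovery
# Then on the Ubuntu host PC, with the flashing USB port connected, confirm the Jetson shows up
lsusb | grep -i nvidia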
I would personally recommend the container method since you seem to be deploying system(s) and want to install PyTorch and presumably other ML packages which can have complex dependencies.
Should I also generate this file for the installation of CUDA and other packages through SDK Manager?
Also, some of the .json files mentioned (the software reference file and hardware reference file) were not available for my unit, the Jetson AGX Xavier Industrial.
Will those be generated after I connect the target to the host PC and see the connection in SDK Manager?
I will try this step today.
Our customer wants all components of JetPack 5.1.2 to be installed on the target before release.
So I thought the SDK Manager method of installation would be better than the container method. Please clarify.
Does the container have all the required components of JetPack 5.1.2?
Thanks for this information.
It may take some time for me to understand this; I will go through it in the future.
I would only get into this after you are comfortable with the basics of SDK Manager. You already have your board flashed with L4T using your custom method, so then use SDK Manager to just install the JetPack components like CUDA/cuDNN/etc. (you can de-select the OS flashing step)
Yes, in that case I would have all the JetPack components installed normally, outside of a container. l4t-jetpack includes all the JetPack components, while mine vary depending on the container (for example, they may or may not have OpenCV and GStreamer, depending on the container requirements)
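If you do want the full JetPack component set inside a container, a sketch using the NGC image (assuming the r35.4.1 tag matches your L4T 35.4.1 / JetPack 5.1.2 release) would be:

# Pull the container that bundles all the JetPack components (CUDA, cuDNN, TensorRT, ...)
sudo docker pull nvcr.io/nvidia/l4t-jetpack:r35.4.1
# Start it with the NVIDIA runtime so the GPU is visible inside
sudo docker run --rm -it --runtime nvidia nvcr.io/nvidia/l4t-jetpack:r35.4.1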
We are worried about trying out the command "sudo reboot --force forced-recovery".
If we try this command and the unit enters recovery mode, we are afraid we may have to flash the unit to bring it back to normal boot mode. Please clarify?
Our unit is fully boxed up, and we don't want to flash anything again.
Also, we did not find steps in the documentation on where and how to enter manual or automatic recovery mode. Please let us know more details about this.
Are you sure you have your Jetson connected to your PC over the Jetson’s USB flashing port? Since you are using a custom system, I am not sure which port this would be. You would also see an NVIDIA USB device show up under lsusb if it were connected.
No, all you have to do is reboot the board again to get it out of recovery mode and back into normal mode, and it won’t have changed the device unless you actually reflashed it from SDK Manager.
This just pertains to the containers and has nothing to do with SDK Manager. SDK Manager initially communicates with your Jetson over USB, not TCP/IP. Later, during the post-flashing install steps, that USB connection will create a virtual ethernet adapter (so there is some networking used to install the packages like CUDA/cuDNN/etc.), but you shouldn't have to do that manually.
If you continue having questions or issues with SDK Manager, I recommend opening a new topic about that since you are using a custom system and not what this topic was originally about, then one of our experts in that area can help you with the finer details of that process specific to your circumstances. Thanks and best of luck!
We are aware of the USB port that we were using for flashing through the manual flash command method.
However, we do that by shorting two pins and forcing the unit into recovery mode so that it lists as an NVIDIA Corporation device when we run the lsusb command.
Now that we have packed and fully boxed the unit, the shorting of the pins has been removed and that USB port is used as a normal USB 3.0 port (not a flashing USB port any more), so I suppose we are not able to connect to SDK Manager.
Since you said we can force the unit into recovery mode by using a reboot command on the boxed-up unit, without shorting pins, I will try that and see tomorrow if possible. Thanks.
Sorry for slightly deviating into the SDK Manager topic here.
Thanks for the confirmation. We shall try that command tomorrow and see if the Jetson shows up with the lsusb command and whether SDK Manager automatically detects the board.
Thanks
I am getting the below error when trying to run the command
"./run.sh dustynv/l4t-pytorch:r35.4.1" on the Jetson target. Please let me know how to resolve this error. It says "/tmp/.docker.xauth" does not exist.
root@linux:/home/trident/Downloads# ./run.sh dustynv/l4t-pytorch:r35.4.1
localuser:root being added to access control list
xauth: file /tmp/.docker.xauth does not exist
sudo docker run --runtime nvidia -it --rm --network host --volume /tmp/argus_socket:/tmp/argus_socket --volume /etc/enctune.conf:/etc/enctune.conf --volume /etc/nv_tegra_release:/etc/nv_tegra_release --volume /tmp/nv_jetson_model:/tmp/nv_jetson_model --volume /home/trident/Downloads/data:/data --device /dev/snd --device /dev/bus/usb -e DISPLAY=:0 -v /tmp/.X11-unix/:/tmp/.X11-unix -v /tmp/.docker.xauth:/tmp/.docker.xauth -e XAUTHORITY=/tmp/.docker.xauth --device /dev/video0 dustynv/l4t-pytorch:r35.4.1
docker: Error response from daemon: unknown or invalid runtime name: nvidia.
See 'docker run --help'.
I was trying the third installation option (apart from the SDK Manager and container installation methods) to install CUDA 11.4.4 for my JetPack 5.1.2, as per the link below:
I am facing some errors, as shown below, when I execute the following command as per the documentation in the above link:
root@linux:/tmp# sudo apt-get install linux-headers-$(uname -r)
Reading package lists... Done
Building dependency tree
Reading state information... Done
E: Unable to locate package linux-headers-5.10.120-tegra
E: Couldn't find any package by glob 'linux-headers-5.10.120-tegra'
E: Couldn't find any package by regex 'linux-headers-5.10.120-tegra'
Any idea how to resolve this error?
Below is the information I have regarding the L4T version and other things:
I uninstalled CUDA 12.2 and installed CUDA 11.4 on my Jetson, which is the valid version for my JetPack 5.1.2, but it is still giving a driver version mismatch error, as shown below, when we execute the sample CUDA program. Any idea why?
I have a doubt: should I execute the remaining two commands, sudo apt-get --purge remove "*nvidia*" "libxnvctrl*" and sudo apt-get autoremove,
to clean up the CUDA 12.2 version fully?
[But at the same time I am afraid it may remove other NVIDIA drivers and my system may not boot at all! (That happened once earlier in a similar situation.)]
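To be safe, I am thinking of simulating the removal first and reading what apt would actually remove before committing (a sketch; the CUDA 12.2 package names are my assumption, to be checked against dpkg -l):

# Simulate only (-s): prints what would be removed without changing anything
sudo apt-get -s --purge remove "*nvidia*" "libxnvctrl*"
# On Jetson that pattern can also match the nvidia-l4t-* board-support packages,
# so a narrower cleanup of just the CUDA 12.2 packages may be safer, e.g.:
dpkg -l | grep 12-2
sudo apt-get -s --purge remove cuda-toolkit-12-2
sudo apt-get -s autoremove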
@nagesh_accord The docker daemon/services and the NVIDIA Container Runtime should already be installed by SDK Manager (unless you never had SDK Manager install them).
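A quick way to check whether the runtime is there and registered with Docker (the daemon.json snippet in the comment is the standard registration for nvidia-container-runtime, shown here as a sketch):

# Is the runtime binary installed? (it comes from the nvidia-container-* packages)
which nvidia-container-runtime
dpkg -l | grep nvidia-container
# Docker must also know about it; "nvidia" should be listed under Runtimes
sudo docker info | grep -i runtime
# If it is missing, register it in /etc/docker/daemon.json and restart Docker:
#   {
#     "runtimes": {
#       "nvidia": {
#         "path": "nvidia-container-runtime",
#         "runtimeArgs": []
#       }
#     }
#   }
sudo systemctl restart docker
# The earlier xauth warning is separate and harmless; creating the file quiets it
touch /tmp/.docker.xauth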
If you browse my various container packages from the link below, each has associated test script(s) that you could run, or pick commands from to run manually:
The docker rmi command will remove a container image that you previously downloaded. docker images will list the container images that you already have on your system (you may need sudo for these if your user isn’t part of the docker usergroup)
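For example, using the image tag from earlier in this thread:

# List the container images already downloaded on this system
sudo docker images
# Remove an image you no longer need
sudo docker rmi dustynv/l4t-pytorch:r35.4.1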