Updating Jetson AGX Orin with JetPack 5.1.2 to CUDA 11.8 and building PyTorch from source

Hello everyone,

I am trying to run Seamless Streaming from Meta on my NVIDIA Jetson AGX Orin 64GB. For this, I am using the model from Hugging Face, which is also used in the GitHub repository.

The two main components I need to install are PyTorch and fairseq2. As indicated in setup.py (seamless_communication/setup.py at main · facebookresearch/seamless_communication · GitHub), I need fairseq2 version 0.2.

For fairseq2 (v0.2.1), the README states the following requirements:

  • Python 3.8 to 3.11
  • PyTorch 2.1.0 or 2.1.1
  • CUDA 11.8 or 12.1

I then checked which JetPack versions are compatible, decided on JetPack 5.1.2, and flashed my Jetson AGX Orin with it. However, JetPack 5.1.2 installs CUDA 11.4 by default. According to this NVIDIA page, CUDA can easily be updated.

Next, I downloaded and installed CUDA 11.8 from this NVIDIA website.

Here are the steps I followed:

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/arm64/cuda-ubuntu2004.pin
sudo mv cuda-ubuntu2004.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/11.8.0/local_installers/cuda-tegra-repo-ubuntu2004-11-8-local_11.8.0-1_arm64.deb
sudo dpkg -i cuda-tegra-repo-ubuntu2004-11-8-local_11.8.0-1_arm64.deb
sudo cp /var/cuda-tegra-repo-ubuntu2004-11-8-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cuda

I then added the CUDA path to my .bashrc as shown in this NVIDIA documentation:

export PATH=/usr/local/cuda-11.8/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-11.8/lib64:$LD_LIBRARY_PATH
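A quick way to see why prepending matters here: whichever directory comes first in PATH wins the lookup. The sketch below reproduces the shadowing with throwaway stub scripts in a temp directory (the cuda-11.4/cuda-11.8 names are stand-ins, not the real toolkits):

```shell
# Simulate two CUDA toolchains with stub nvcc scripts in a temp dir
tmp=$(mktemp -d)
mkdir -p "$tmp/cuda-11.4/bin" "$tmp/cuda-11.8/bin"
printf '#!/bin/sh\necho 11.4\n' > "$tmp/cuda-11.4/bin/nvcc"
printf '#!/bin/sh\necho 11.8\n' > "$tmp/cuda-11.8/bin/nvcc"
chmod +x "$tmp/cuda-11.4/bin/nvcc" "$tmp/cuda-11.8/bin/nvcc"

# With only the 11.4 directory prepended, nvcc resolves to the 11.4 stub
v114=$(PATH="$tmp/cuda-11.4/bin:$PATH"; nvcc)
# Prepending 11.8 ahead of 11.4 makes the 11.8 stub win
v118=$(PATH="$tmp/cuda-11.8/bin:$tmp/cuda-11.4/bin:$PATH"; nvcc)

echo "$v114"   # 11.4
echo "$v118"   # 11.8
rm -rf "$tmp"
```

So if CUDA 11.4's bin directory were still ahead of 11.8's in PATH, the old nvcc would keep being picked up.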

I found that the precompiled PyTorch .whl files (e.g., here) are built for CUDA 11.4. Therefore, PyTorch for CUDA 11.8 must be built from source.

I created a Conda environment with Python 3.8 and activated it:

conda create --name seamless python=3.8
conda activate seamless

Then, I followed the steps from the Instructions - Build from Source section to build PyTorch from source:

git clone --recursive --branch v2.1.0 https://github.com/pytorch/pytorch
cd pytorch

# Disable components not needed for a single-device Jetson build
export USE_NCCL=0
export USE_DISTRIBUTED=0
export USE_QNNPACK=0
export USE_PYTORCH_QNNPACK=0
# CUDA compute capabilities: 7.2 = Xavier, 8.7 = Orin
export TORCH_CUDA_ARCH_LIST="7.2;8.7"

export PYTORCH_BUILD_VERSION=2.1.0
export PYTORCH_BUILD_NUMBER=1

sudo apt-get install python3-pip cmake libopenblas-dev libopenmpi-dev

pip install -r requirements.txt
pip install scikit-build
pip install ninja

python setup.py bdist_wheel

Then I installed the .whl file from the dist directory:

pip install dist/torch...

Everything seemed to go well, and I confirmed CUDA 11.8 was installed using:

nvcc --version

However, when I check within my environment, I get the following:

python
import torch
print(torch.version.cuda)

This still shows CUDA 11.4.

Why is CUDA 11.4 still being displayed when I am using CUDA 11.8? Could someone help me figure out what I might be doing wrong?

Thank you in advance for your assistance!

Hi

The default CUDA_HOME in PyTorch is /usr/local/cuda.
We suspect it still links to CUDA 11.4. Could you check whether this is the issue in your case?

$ ll /usr/local/cuda

You can either change the symbolic link to point to CUDA 11.8, or build PyTorch with a predefined CUDA_HOME value as below:

CUDA_HOME=/usr/local/cuda-11.8 python setup.py bdist_wheel
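For reference, the indirection being checked here can be reproduced in miniature: a symlink points at a versioned directory, and readlink -f follows the chain to the real target. This sketch uses throwaway directories, not the actual install:

```shell
# Recreate a /usr/local/cuda -> cuda-11.8 style indirection in a temp dir
tmp=$(mktemp -d)
mkdir -p "$tmp/cuda-11.8"
ln -s "$tmp/cuda-11.8" "$tmp/cuda"

# readlink -f resolves the link chain to the real directory
target=$(readlink -f "$tmp/cuda")
echo "$target"   # ...ends in /cuda-11.8

rm -rf "$tmp"
```

On a real Jetson, running readlink -f /usr/local/cuda the same way shows which toolkit PyTorch's default CUDA_HOME actually resolves to.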

Thanks.


@AastaLLL thank you very much!

After installing CUDA and exporting the paths to CUDA 11.8, I get the following:

nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:43:33_PDT_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0

and

ll /usr/local/cuda
lrwxrwxrwx 1 root root 22 Mar 13 08:46 /usr/local/cuda -> /etc/alternatives/cuda/

I removed the symbolic link

sudo rm /usr/local/cuda

and recreated it pointing to CUDA 11.8, so the link now resolves to the correct version:

sudo ln -s /usr/local/cuda-11.8 /usr/local/cuda
ls -l /usr/local | grep cuda

lrwxrwxrwx  1 root root   22 Mar 13 08:46 cuda -> /usr/local/cuda-11.8

Then I used the command you suggested to build PyTorch against CUDA 11.8:

CUDA_HOME=/usr/local/cuda-11.8 python setup.py bdist_wheel

After installing the wheel from the dist directory, I finally got the right CUDA version:

import torch
torch.version.cuda
'11.8'

Thanks for updating the status.
Good to know the version is correct now.
