I fixed it! Here are tips that should generalize across PyTorch and JetPack versions, as well as the specifics that worked in my case.
General Principles
- When building torch using `setup.py`, open the file to see which environment variables can be used and which are relevant to you. For example, though the post suggested `USE_NATIVE_ARCH=1` for 2.7 builds, that environment variable doesn’t seem to be used when building torch 2.8.
- Install build dependencies in a virtual environment. This is usually my practice, but I originally skipped it for speed. Then I regretted it.
- Check the errors and CMake warnings during build. This will help you catch errors and kill the build early, rather than waiting 2 hours.
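To make the first principle concrete, here is a minimal sketch that lists the environment-variable-like names mentioned in `setup.py`. It is not part of the PyTorch tooling: the function names and the prefix list (`USE_`, `BUILD_`, `TORCH_`, `CMAKE_`) are my own assumptions, and a name missing from `setup.py` may still be read elsewhere in the build, so treat the result as a hint rather than proof.

```python
import re
from pathlib import Path
from typing import List

# Assumption: build options are upper-case names with one of these prefixes.
ENV_VAR_PATTERN = re.compile(r"\b(?:USE|BUILD|TORCH|CMAKE)_[A-Z0-9_]+\b")


def find_build_env_vars(source_text: str) -> List[str]:
    """Return the sorted, de-duplicated env-var-like names found in the text."""
    return sorted(set(ENV_VAR_PATTERN.findall(source_text)))


def is_mentioned(source_text: str, name: str) -> bool:
    """Check whether a specific variable name appears in the text."""
    return name in find_build_env_vars(source_text)


if __name__ == "__main__":
    setup_py = Path("setup.py")  # run from the root of the pytorch checkout
    if setup_py.exists():
        text = setup_py.read_text(encoding="utf-8")
        for name in find_build_env_vars(text):
            print(name)
        print("USE_NATIVE_ARCH mentioned:", is_mentioned(text, "USE_NATIVE_ARCH"))
```

Running it from the checkout root before starting the build is a quick way to see whether a flag suggested for an older release is still referenced in the version you cloned.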
Specific Steps
- As suggested here, clone the PyTorch source at the version tag you want to build and enter the directory:
git clone --depth=1 --recursive -b v<PYTORCH_VERSION> https://github.com/pytorch/pytorch
cd pytorch
- Create a virtual environment for the build packages (set that up yourself using `venv`, `uv`, etc.)
- Install the packages outlined in the original post into the virtual environment
- Install `numpy>2` into the virtual environment
- Obtain your CUDA Arch Bin using `jtop` (mine was 8.7 for AGX Orin Dev Kit)
- Set environment variables and build as follows:
USE_CUDA=1 USE_CUDNN=1 USE_CUSPARSELT=1 USE_CUDSS=1 USE_CUFILE=1 TORCH_CUDA_ARCH_LIST="<CUDA Arch Bin>" USE_DISTRIBUTED=1 USE_FLASH_ATTENTION=1 USE_MEM_EFF_ATTENTION=1 USE_TENSORRT=0 python3 setup.py bdist_wheel
- During the build, check for CMake warnings. For example, I encountered and fixed the following:
  - -- Could NOT find CUDA (missing: CUDA_TOOLKIT_ROOT_DIR CUDA_NVCC_EXECUTABLE CUDA_INCLUDE_DIRS CUDA_CUDART_LIBRARY)
    CMake Warning at cmake/public/cuda.cmake:31 (message):
    PyTorch: CUDA cannot be found. ...

    This happened because I didn’t include Jetson SDK Components when using the SDKManager to flash my Orin. I resolved it using `sudo apt install nvidia-jetpack` on the Orin.
  - -- Could NOT find CUSPARSELT (missing: CUSPARSELT_LIBRARY_PATH CUSPARSELT_INCLUDE_PATH)
    CMake Warning at cmake/public/cuda.cmake:276 (message):
    Cannot find cuSPARSELt library. Turning the option off

    I downloaded `cusparselt 0.6.3` from here, using the Linux → aarch64-jetson → Ubuntu 22.04 → deb (local) steps. The `install_cusparselt.sh` script mentioned here didn’t seem to work.
  - -- Could NOT find CUDSS (missing: CUDSS_LIBRARY_PATH CUDSS_INCLUDE_PATH)
    CMake Warning at cmake/public/cuda.cmake:242 (message):
    Cannot find CUDSS library. Turning the option off

    I downloaded `cudss 0.7.1` from the cuDSS 0.7.1 Downloads page on NVIDIA Developer, using the Linux → aarch64-jetson → Ubuntu 22.04 → deb (local) steps.
- Once your wheel is built in `dist`, you can move it to your desired deployment environment and install it there.
- Test your installed wheel in Python:
import numpy
import torch

print("torch: ", torch.__version__)
print("numpy: ", numpy.__version__)
print("torch.version.cuda: ", torch.version.cuda)
print("cuda built: ", torch.backends.cuda.is_built())
print("cuda available: ", torch.cuda.is_available())
print("cuda device count: ", torch.cuda.device_count())
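Going back to the CMake warning check in the steps above: rather than watching the terminal, you can scan saved build output for the same kinds of lines I quoted. This is only a sketch under my own assumptions. The two patterns are taken from the warnings I hit, and the function names are made up for illustration. Other CMake messages will need extra patterns.

```python
import re
from typing import Iterable, Iterator, List, Tuple

# Patterns based on the warnings quoted in this post; extend as needed.
NOT_FOUND = re.compile(r"Could NOT find (\w+)")
CMAKE_WARNING = re.compile(r"CMake Warning at (\S+)")


def scan_build_log(lines: Iterable[str]) -> Iterator[Tuple[int, str]]:
    """Yield (line_number, stripped_line) for lines that look like CMake warnings."""
    for number, line in enumerate(lines, start=1):
        if NOT_FOUND.search(line) or CMAKE_WARNING.search(line):
            yield number, line.strip()


def missing_packages(lines: Iterable[str]) -> List[str]:
    """Return package names reported as 'Could NOT find', in first-seen order."""
    seen: List[str] = []
    for line in lines:
        match = NOT_FOUND.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


# Sample lines copied from the warnings above.
sample_log = [
    "-- Could NOT find CUSPARSELT (missing: CUSPARSELT_LIBRARY_PATH CUSPARSELT_INCLUDE_PATH)",
    "CMake Warning at cmake/public/cuda.cmake:276 (message):",
    "  Cannot find cuSPARSELt library.  Turning the option off",
    "-- Could NOT find CUDSS (missing: CUDSS_LIBRARY_PATH CUDSS_INCLUDE_PATH)",
]

for number, text in scan_build_log(sample_log):
    print(f"{number}: {text}")
print("missing:", missing_packages(sample_log))
```

To use it on a real build, save the build output to a file (the file name is up to you) and pass the open file object to `scan_build_log`. Since the CMake configure step runs near the start, checking the first few minutes of output should be enough to decide whether to kill the build.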