PyTorch for Jetson

I’m revisiting my Docker process to build PyTorch from source. Are there patches for PyTorch 1.11 and beyond, or have the fixes been migrated into the PyTorch code base?

Hi
Can you tell me how to install MAGMA, and which version is needed? And how do I compile it from the PyTorch source code?

Hi @znmeb, I haven’t built PyTorch 1.11, but suffice it to say that my 1.10 patches would be a good starting point. Also, I’m not sure if Python 3.6 is supported past PyTorch 1.10, so you may need JetPack 5.0 (or upgrade Python if you are on JetPack 4.x) to build it.

Hi @18208947737, I haven’t built MAGMA before, you may want to open a new topic about that.

Hi @robert.scheffler, do you have CUDA Toolkit installed okay? Can you check the following directory:

ls -ll /usr/local/cuda/lib64/
total 3626264
lrwxrwxrwx 1 root root        17 Nov 15 04:07 libcublasLt.so -> libcublasLt.so.11
lrwxrwxrwx 1 root root        24 Nov 15 04:07 libcublasLt.so.11 -> libcublasLt.so.11.6.5.24
-rw-r--r-- 1 root root 371525152 Nov 15 04:07 libcublasLt.so.11.6.5.24
-rw-r--r-- 1 root root 502851542 Nov 15 04:07 libcublasLt_static.a
lrwxrwxrwx 1 root root        15 Nov 15 04:07 libcublas.so -> libcublas.so.11
lrwxrwxrwx 1 root root        22 Nov 15 04:07 libcublas.so.11 -> libcublas.so.11.6.5.24
-rw-r--r-- 1 root root 168021872 Nov 15 04:07 libcublas.so.11.6.5.24
-rw-r--r-- 1 root root 212415888 Nov 15 04:07 libcublas_static.a
-rw-r--r-- 1 root root    796212 Nov 15 02:30 libcudadevrt.a
lrwxrwxrwx 1 root root        17 Nov 15 02:30 libcudart.so -> libcudart.so.11.0
lrwxrwxrwx 1 root root        21 Nov 15 02:30 libcudart.so.11.0 -> libcudart.so.11.4.167
-rw-r--r-- 1 root root    670808 Nov 15 02:30 libcudart.so.11.4.167
-rw-r--r-- 1 root root   1078022 Nov 15 02:30 libcudart_static.a
lrwxrwxrwx 1 root root        13 Nov 17 03:57 libcudla.so -> libcudla.so.1
lrwxrwxrwx 1 root root        17 Nov 17 03:57 libcudla.so.1 -> libcudla.so.1.0.0
-rw-r--r-- 1 root root    159296 Nov 17 03:57 libcudla.so.1.0.0
lrwxrwxrwx 1 root root        14 Nov 15 03:49 libcufft.so -> libcufft.so.10
lrwxrwxrwx 1 root root        21 Nov 15 03:49 libcufft.so.10 -> libcufft.so.10.6.0.71
-rw-r--r-- 1 root root 174702496 Nov 15 03:49 libcufft.so.10.6.0.71
-rw-r--r-- 1 root root 215629292 Nov 15 03:49 libcufft_static.a
-rw-r--r-- 1 root root 187336232 Nov 15 03:49 libcufft_static_nocallback.a
lrwxrwxrwx 1 root root        15 Nov 15 03:49 libcufftw.so -> libcufftw.so.10
lrwxrwxrwx 1 root root        22 Nov 15 03:49 libcufftw.so.10 -> libcufftw.so.10.6.0.71
-rw-r--r-- 1 root root    740776 Nov 15 03:49 libcufftw.so.10.6.0.71
-rw-r--r-- 1 root root     30202 Nov 15 03:49 libcufftw_static.a
-rw-r--r-- 1 root root   1436538 Nov 15 02:22 libcufilt.a
-rw-r--r-- 1 root root     33242 Nov 15 02:30 libculibos.a
lrwxrwxrwx 1 root root        16 Nov 15 03:52 libcupti.so -> libcupti.so.11.4
lrwxrwxrwx 1 root root        20 Nov 15 03:52 libcupti.so.11.4 -> libcupti.so.2021.2.2
-rw-r--r-- 1 root root   5782696 Nov 15 03:52 libcupti.so.2021.2.2
lrwxrwxrwx 1 root root        15 Nov 12 13:45 libcurand.so -> libcurand.so.10
lrwxrwxrwx 1 root root        23 Nov 12 13:45 libcurand.so.10 -> libcurand.so.10.2.5.165
-rw-r--r-- 1 root root  81480832 Nov 12 13:45 libcurand.so.10.2.5.165
-rw-r--r-- 1 root root  81438022 Nov 12 13:45 libcurand_static.a
lrwxrwxrwx 1 root root        19 Nov 12 13:56 libcusolverMg.so -> libcusolverMg.so.11
lrwxrwxrwx 1 root root        27 Nov 12 13:56 libcusolverMg.so.11 -> libcusolverMg.so.11.2.0.165
-rw-r--r-- 1 root root 258827504 Nov 12 13:56 libcusolverMg.so.11.2.0.165
lrwxrwxrwx 1 root root        17 Nov 12 13:56 libcusolver.so -> libcusolver.so.11
lrwxrwxrwx 1 root root        25 Nov 12 13:56 libcusolver.so.11 -> libcusolver.so.11.2.0.165
-rw-r--r-- 1 root root 218556608 Nov 12 13:56 libcusolver.so.11.2.0.165
-rw-r--r-- 1 root root 211452066 Nov 12 13:56 libcusolver_static.a
lrwxrwxrwx 1 root root        17 Nov 12 13:50 libcusparse.so -> libcusparse.so.11
lrwxrwxrwx 1 root root        25 Nov 12 13:50 libcusparse.so.11 -> libcusparse.so.11.6.0.165
-rw-r--r-- 1 root root 230611448 Nov 12 13:50 libcusparse.so.11.6.0.165
-rw-r--r-- 1 root root 256717656 Nov 12 13:50 libcusparse_static.a
-rw-r--r-- 1 root root  15858550 Nov 12 13:56 liblapack_static.a
-rw-r--r-- 1 root root    909274 Nov 12 13:56 libmetis_static.a
lrwxrwxrwx 1 root root        13 Nov 12 14:00 libnppc.so -> libnppc.so.11
lrwxrwxrwx 1 root root        21 Nov 12 14:00 libnppc.so.11 -> libnppc.so.11.4.0.155
-rw-r--r-- 1 root root   1564840 Nov 12 14:00 libnppc.so.11.4.0.155
-rw-r--r-- 1 root root     26846 Nov 12 14:00 libnppc_static.a
lrwxrwxrwx 1 root root        15 Nov 12 14:00 libnppial.so -> libnppial.so.11
lrwxrwxrwx 1 root root        23 Nov 12 14:00 libnppial.so.11 -> libnppial.so.11.4.0.155
-rw-r--r-- 1 root root  13533736 Nov 12 14:00 libnppial.so.11.4.0.155
-rw-r--r-- 1 root root  15378762 Nov 12 14:00 libnppial_static.a
lrwxrwxrwx 1 root root        15 Nov 12 14:00 libnppicc.so -> libnppicc.so.11
lrwxrwxrwx 1 root root        23 Nov 12 14:00 libnppicc.so.11 -> libnppicc.so.11.4.0.155
-rw-r--r-- 1 root root   6509104 Nov 12 14:00 libnppicc.so.11.4.0.155
-rw-r--r-- 1 root root   6291604 Nov 12 14:00 libnppicc_static.a
lrwxrwxrwx 1 root root        16 Nov 12 14:00 libnppidei.so -> libnppidei.so.11
lrwxrwxrwx 1 root root        24 Nov 12 14:00 libnppidei.so.11 -> libnppidei.so.11.4.0.155
-rw-r--r-- 1 root root   9937808 Nov 12 14:00 libnppidei.so.11.4.0.155
-rw-r--r-- 1 root root  11479354 Nov 12 14:00 libnppidei_static.a
lrwxrwxrwx 1 root root        14 Nov 12 14:00 libnppif.so -> libnppif.so.11
lrwxrwxrwx 1 root root        22 Nov 12 14:00 libnppif.so.11 -> libnppif.so.11.4.0.155
-rw-r--r-- 1 root root  79115976 Nov 12 14:00 libnppif.so.11.4.0.155
-rw-r--r-- 1 root root  82495146 Nov 12 14:00 libnppif_static.a
lrwxrwxrwx 1 root root        14 Nov 12 14:00 libnppig.so -> libnppig.so.11
lrwxrwxrwx 1 root root        22 Nov 12 14:00 libnppig.so.11 -> libnppig.so.11.4.0.155
-rw-r--r-- 1 root root  34841224 Nov 12 14:00 libnppig.so.11.4.0.155
-rw-r--r-- 1 root root  36462618 Nov 12 14:00 libnppig_static.a
lrwxrwxrwx 1 root root        14 Nov 12 14:00 libnppim.so -> libnppim.so.11
lrwxrwxrwx 1 root root        22 Nov 12 14:00 libnppim.so.11 -> libnppim.so.11.4.0.155
-rw-r--r-- 1 root root   8880704 Nov 12 14:00 libnppim.so.11.4.0.155
-rw-r--r-- 1 root root   8057652 Nov 12 14:00 libnppim_static.a
lrwxrwxrwx 1 root root        15 Nov 12 14:00 libnppist.so -> libnppist.so.11
lrwxrwxrwx 1 root root        23 Nov 12 14:00 libnppist.so.11 -> libnppist.so.11.4.0.155
-rw-r--r-- 1 root root  34354008 Nov 12 14:00 libnppist.so.11.4.0.155
-rw-r--r-- 1 root root  36021764 Nov 12 14:00 libnppist_static.a
lrwxrwxrwx 1 root root        15 Nov 12 14:00 libnppisu.so -> libnppisu.so.11
lrwxrwxrwx 1 root root        23 Nov 12 14:00 libnppisu.so.11 -> libnppisu.so.11.4.0.155
-rw-r--r-- 1 root root    658520 Nov 12 14:00 libnppisu.so.11.4.0.155
-rw-r--r-- 1 root root     11458 Nov 12 14:00 libnppisu_static.a
lrwxrwxrwx 1 root root        15 Nov 12 14:00 libnppitc.so -> libnppitc.so.11
lrwxrwxrwx 1 root root        23 Nov 12 14:00 libnppitc.so.11 -> libnppitc.so.11.4.0.155
-rw-r--r-- 1 root root   4551016 Nov 12 14:00 libnppitc.so.11.4.0.155
-rw-r--r-- 1 root root   3593810 Nov 12 14:00 libnppitc_static.a
lrwxrwxrwx 1 root root        13 Nov 12 14:00 libnpps.so -> libnpps.so.11
lrwxrwxrwx 1 root root        21 Nov 12 14:00 libnpps.so.11 -> libnpps.so.11.4.0.155
-rw-r--r-- 1 root root  18404344 Nov 12 14:00 libnpps.so.11.4.0.155
-rw-r--r-- 1 root root  18500000 Nov 12 14:00 libnpps_static.a
lrwxrwxrwx 1 root root        15 Nov 15 04:07 libnvblas.so -> libnvblas.so.11
lrwxrwxrwx 1 root root        22 Nov 15 04:07 libnvblas.so.11 -> libnvblas.so.11.6.5.24
-rw-r--r-- 1 root root    712192 Nov 15 04:07 libnvblas.so.11.6.5.24
-rw-r--r-- 1 root root  14228496 Nov 15 03:52 libnvperf_host.so
-rw-r--r-- 1 root root   2208728 Nov 15 03:52 libnvperf_target.so
-rw-r--r-- 1 root root  18399504 Nov 12 13:46 libnvptxcompiler_static.a
lrwxrwxrwx 1 root root        25 Nov 12 13:48 libnvrtc-builtins.so -> libnvrtc-builtins.so.11.4
lrwxrwxrwx 1 root root        29 Nov 12 13:48 libnvrtc-builtins.so.11.4 -> libnvrtc-builtins.so.11.4.166
-rw-r--r-- 1 root root   6883128 Nov 12 13:48 libnvrtc-builtins.so.11.4.166
lrwxrwxrwx 1 root root        16 Nov 12 13:48 libnvrtc.so -> libnvrtc.so.11.2
lrwxrwxrwx 1 root root        20 Nov 12 13:48 libnvrtc.so.11.2 -> libnvrtc.so.11.4.166
-rw-r--r-- 1 root root  40962912 Nov 12 13:48 libnvrtc.so.11.4.166
lrwxrwxrwx 1 root root        18 Nov 12 14:06 libnvToolsExt.so -> libnvToolsExt.so.1
lrwxrwxrwx 1 root root        22 Nov 12 14:06 libnvToolsExt.so.1 -> libnvToolsExt.so.1.0.0
-rw-r--r-- 1 root root     44088 Nov 12 14:06 libnvToolsExt.so.1.0.0
drwxr-xr-x 2 root root      4096 Mar 24 18:02 stubs
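
If reading through a listing like the one above by eye is tedious, a small script can do the same check. This is only a sketch: the helper name `check_cuda_libs` and the default list of library names are assumptions made for illustration, so adjust them to whatever your build actually links against.

```python
from pathlib import Path

# Assumed set of libraries worth checking; not an official list.
DEFAULT_LIBS = (
    "libcudart.so",
    "libcublas.so",
    "libcufft.so",
    "libcurand.so",
    "libcusolver.so",
    "libcusparse.so",
)


def check_cuda_libs(lib_dir="/usr/local/cuda/lib64", names=DEFAULT_LIBS):
    """Return {library name: resolved path, or None if it is missing}."""
    lib_dir = Path(lib_dir)
    report = {}
    for name in names:
        candidate = lib_dir / name
        # exists() follows symlinks, so a dangling link is reported as missing.
        report[name] = str(candidate.resolve()) if candidate.exists() else None
    return report


if __name__ == "__main__":
    for name, target in check_cuda_libs().items():
        print(f"{name:20s} {'MISSING' if target is None else target}")
```

Any entry reported as MISSING suggests that the CUDA Toolkit install is incomplete or that a symlink is broken.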

Hi! I’ve compiled torch 1.10.0 from source with Clang on a Xavier NX for Python 3.8, and the process took 9 hours.
Here’s the Google Drive link:

And here’s the Baidu Net Disk link:

I’m not sure whether the extraction code is necessary when you download through the Baidu link. If it is needed, the extraction code is:
vhys

I hope it helps you.

I have installed JetPack 5.0 on my Jetson AGX Xavier and I am trying to create a TorchScript file from Detectron2 weights. I have used the torch-1.12 wheel provided, and torch and torchvision seem to be correctly installed when I open a Python shell and print their versions. However, when I try to create a TorchScript model I get errors about missing torchvision ops, similar to those posted here: MIssing torchvision::nms error in the C++ CUDA TorchVision API · Issue #5697 · pytorch/vision · GitHub.

Most things I find seem to point to incompatible torch and torchvision versions, but according to (torchvision · PyPI) the versions in the 1.12 and 1.10 wheels for PyTorch I got from here (jetson-containers/docker_build_ml.sh at master · dusty-nv/jetson-containers · GitHub) are compatible.

The full error I get is:

RuntimeError: 
object has no attribute nms:
  File "/home/jetsonxavier/.local/lib/python3.8/site-packages/torchvision/ops/boxes.py", line 35
    """
    _assert_has_ops()
    return torch.ops.torchvision.nms(boxes, scores, iou_threshold)
           ~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
'nms' is being compiled since it was called from '_batched_nms_vanilla'
  File "/home/jetsonxavier/.local/lib/python3.8/site-packages/torchvision/ops/boxes.py", line 102
    for class_id in torch.unique(idxs):
        curr_indices = torch.where(idxs == class_id)[0]
        curr_keep_indices = nms(boxes[curr_indices], scores[curr_indices], iou_threshold)
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
        keep_mask[curr_indices[curr_keep_indices]] = True
    keep_indices = torch.where(keep_mask)[0]
'_batched_nms_vanilla' is being compiled since it was called from 'batched_nms'
  File "/home/jetsonxavier/.local/lib/python3.8/site-packages/torchvision/ops/boxes.py", line 66
    # Ideally for GPU we'd use a higher threshold
    if boxes.numel() > 4_000 and not torchvision._is_tracing():
        return _batched_nms_vanilla(boxes, scores, idxs, iou_threshold)
               ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
    else:
        return _batched_nms_coordinate_trick(boxes, scores, idxs, iou_threshold)
'batched_nms' is being compiled since it was called from 'batched_nms'
  File "/home/jetsonxavier/Projects/F3D/detectron2/detectron2/layers/nms.py", line 20
    # just call it directly.
    # Fp16 does not have enough range for batched NMS, so adding float().
    return box_ops.batched_nms(boxes.float(), scores, idxs, iou_threshold)
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
'batched_nms' is being compiled since it was called from 'find_top_rpn_proposals'
  File "/home/jetsonxavier/Projects/F3D/detectron2/detectron2/modeling/proposal_generator/proposal_utils.py", line 112
            boxes, scores_per_img, lvl = boxes[keep], scores_per_img[keep], lvl[keep]

        keep = batched_nms(boxes.tensor, scores_per_img, lvl, nms_thresh)
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
        # In Detectron1, there was different behavior during training vs. testing.
        # (https://github.com/facebookresearch/Detectron/issues/459)
'find_top_rpn_proposals' is being compiled since it was called from 'RPN.predict_proposals'
  File "/home/jetsonxavier/Projects/F3D/detectron2/detectron2/modeling/proposal_generator/rpn.py", line 503
        with torch.no_grad():
            pred_proposals = self._decode_proposals(anchors, pred_anchor_deltas)
            return find_top_rpn_proposals(
                   ~~~~~~~~~~~~~~~~~~~~~~~
                pred_proposals,
                ~~~~~~~~~~~~~~~
                pred_objectness_logits,
                ~~~~~~~~~~~~~~~~~~~~~~~
                image_sizes,
                ~~~~~~~~~~~~
                self.nms_thresh,
                ~~~~~~~~~~~~~~~~
                self.pre_nms_topk[self.training],
                ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                self.post_nms_topk[self.training],
                ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                self.min_box_size,
                ~~~~~~~~~~~~~~~~~~
                self.training,
                ~~~~~~~~~~~~~ <--- HERE
            )
'RPN.predict_proposals' is being compiled since it was called from 'RPN.forward'
  File "/home/jetsonxavier/Projects/F3D/detectron2/detectron2/modeling/proposal_generator/rpn.py", line 477
        else:
            losses = {}
        proposals = self.predict_proposals(
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            anchors, pred_objectness_logits, pred_anchor_deltas, images.image_sizes
            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
        )
        return proposals, losses


This is the error message when I try to create the TorchScript file. When I try to run a .pth weights file I get the following error message:

RuntimeError: Couldn't load custom C++ ops. This can happen if your PyTorch and torchvision versions are incompatible, 
or if you had errors while compiling torchvision from source. For further information on the compatible versions, check https://github.com/pytorch/vision#installation
 for the compatibility matrix. Please check your PyTorch version with torch.__version__ and your torchvision version with torchvision.__version__ and 
verify if they are compatible, and if not please reinstall torchvision so that it matches your PyTorch install

They both seem to be related to the torch & torchvision compatibility.

Has anybody encountered the same problems?
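
Both errors can be reproduced outside Detectron2 with a short smoke test, which helps separate a torchvision build problem from a model export problem. The sketch below is an assumption about how one might do that: `check_torchvision_ops` is a made-up helper name, and it only uses `torchvision.ops.nms` on two hand-written boxes.

```python
def check_torchvision_ops():
    """Return (status, detail); status is 'ok', 'ops-broken' or 'import-failed'."""
    try:
        import torch
        import torchvision
    except Exception as exc:  # missing packages, or an import-time failure
        return "import-failed", repr(exc)

    versions = f"torch {torch.__version__}, torchvision {torchvision.__version__}"
    try:
        # Two overlapping boxes in (x1, y1, x2, y2) format with their scores.
        boxes = torch.tensor([[0.0, 0.0, 10.0, 10.0], [1.0, 1.0, 11.0, 11.0]])
        scores = torch.tensor([0.9, 0.8])
        keep = torchvision.ops.nms(boxes, scores, 0.5)
    except Exception as exc:  # e.g. the compiled C++ ops could not be loaded
        return "ops-broken", f"{versions}: {exc!r}"
    return "ok", f"{versions}: nms kept indices {keep.tolist()}"


if __name__ == "__main__":
    status, detail = check_torchvision_ops()
    print(status, "-", detail)
```

If this reports `ops-broken` in a plain Python shell, the problem is most likely in the torchvision build or install rather than in the Detectron2 export code.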

  • Python 3.6 - [ torch-1.10.0-cp36-cp36m-linux_aarch64.whl ] cannot be downloaded. Why?

This was the error… Thank you so much.

Hi @rmj54, this link is working for me, are you able to try again? Perhaps it’s a network issue from your end?

Hi, this link does not work. 1.10 works but all the rest are down.

Tried this too: wget https://nvidia.box.com/shared/static/p57jwntv436lfrd78inwl7iml6p13fzh.whl -O torch-1.8.0-cp36-cp36m-linux_aarch64.whl

Same story. The download stalls right after connecting.

Can you try again, perhaps from your PC browser? I am able to now, so it may have been a temporary issue?

PyTorch 1.11 for JetPack 5.0 Developer Preview and Xavier/Orin has been posted:

PyTorch v1.11.0

I tried installing 1.6, 1.7, and 1.10. After installing, when I import torch, the system replies:
module ‘typing’ has no attribute ‘_SpecialForm’

Hi @1263032440, are you using Python 3.6? Can you try using the l4t-pytorch container to rule out an environment issue?
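
One quick way to answer the Python version question is to compare the wheel's Python tag with the interpreter that will import it. The following is a minimal sketch that assumes the standard wheel filename layout (name-version[-build]-python-abi-platform) and a single CPython tag such as `cp36`; the helper names are made up for illustration.

```python
import sys


def wheel_python_tag(filename):
    """Extract the Python tag (e.g. 'cp36') from a wheel filename."""
    stem = filename[: -len(".whl")] if filename.endswith(".whl") else filename
    parts = stem.split("-")
    # Expected layout: name-version[-build]-python-abi-platform
    if len(parts) < 5:
        raise ValueError(f"not a wheel filename: {filename!r}")
    return parts[-3]


def interpreter_tag(version_info=None):
    """Return the CPython tag for an interpreter, e.g. 'cp38' for Python 3.8."""
    v = version_info or sys.version_info
    return f"cp{v[0]}{v[1]}"


def wheel_matches_interpreter(filename, version_info=None):
    """True if the wheel was built for the given (or running) interpreter."""
    return wheel_python_tag(filename) == interpreter_tag(version_info)


if __name__ == "__main__":
    wheel = "torch-1.10.0-cp36-cp36m-linux_aarch64.whl"
    print(wheel_python_tag(wheel), interpreter_tag(), wheel_matches_interpreter(wheel))
```

A mismatch here only rules out one cause. A matching tag does not guarantee that the rest of the environment is clean, which is why the container test is still worth doing.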

I am using JetPack 5.0 and my PyTorch version is 1.12.0a0+2c916ef.nv22.3.
I am trying to run YOLOv5 and I need torchvision. I installed the main branch of torchvision, but it gives me the incompatibility error:

RuntimeError: Couldn’t load custom C++ ops. This can happen if your PyTorch and torchvision versions are incompatible, or if you had errors while compiling torchvision from source. For further information on the compatible versions, check https://github.com/pytorch/vision#installation for the compatibility matrix. Please check your PyTorch version with torch.__version__ and your torchvision version with torchvision.__version__ and verify if they are compatible, and if not please reinstall torchvision so that it matches your PyTorch install.

Where can I find a torchvision version compatible with torch 1.12.0a0+2c916ef.nv22.3?
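
For reference, the release pairings in the torchvision compatibility matrix can be encoded in a small lookup. This is a sketch, not an official tool: the table covers only torch 1.8 through 1.12, the helper names are made up, and a pre-release build such as 1.12.0a0 may need a specific torchvision commit rather than a tagged release.

```python
import re

# (major, minor) of torch -> (major, minor) of the matching torchvision release.
TORCH_TO_TORCHVISION = {
    (1, 8): (0, 9),
    (1, 9): (0, 10),
    (1, 10): (0, 11),
    (1, 11): (0, 12),
    (1, 12): (0, 13),
}


def major_minor(version):
    """Parse the leading 'X.Y' of a version string, ignoring any suffix."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def expected_torchvision(torch_version):
    """Return the matching torchvision (major, minor), or None if not in the table."""
    return TORCH_TO_TORCHVISION.get(major_minor(torch_version))


def is_compatible(torch_version, torchvision_version):
    """True if the pair appears in the table above."""
    expected = expected_torchvision(torch_version)
    return expected is not None and major_minor(torchvision_version) == expected


if __name__ == "__main__":
    print(expected_torchvision("1.12.0a0+2c916ef.nv22.3"))
```

For the build above, the lookup points at the torchvision 0.13 series.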

Yes, I use Python 3.6. How can I get the container you described?

You can select one of the tags from here that’s compatible with your L4T version (you can check this with cat /etc/nv_tegra_release)

https://catalog.ngc.nvidia.com/orgs/nvidia/containers/l4t-pytorch/tags

And then the command to start the container is listed on this page: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/l4t-pytorch
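
To pick the right tag without reading the release file by hand, the L4T version can be parsed from it. The sketch below assumes the first line of /etc/nv_tegra_release looks like `# R34 (release), REVISION: 1.0, ...`; the function names are made up for illustration.

```python
import re
from pathlib import Path


def parse_l4t_version(text):
    """Return the L4T version such as '34.1.0', or None if the text does not match."""
    match = re.search(r"R(\d+)\s*\(release\),\s*REVISION:\s*([\d.]+)", text)
    if match is None:
        return None
    return f"{match.group(1)}.{match.group(2)}"


def read_l4t_version(path="/etc/nv_tegra_release"):
    """Read and parse the release file; None if it is missing or unreadable."""
    try:
        text = Path(path).read_text()
    except OSError:
        return None
    return parse_l4t_version(text)


if __name__ == "__main__":
    version = read_l4t_version()
    print("L4T version:", version if version else "not found (not a Jetson?)")
```

The l4t-pytorch tags appear to start with `r` followed by this version, so the result narrows down which tags on the NGC page to look at.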

Hi @user22290, can you try this torchvision commit? https://github.com/pytorch/vision/commit/e5a5f0be

I am using the main branch, which already includes all the changes in this commit. The changes in this commit are to the unittest/windows files, while I am using Linux.