CUDA not found on DJI Manifold 2G (NVIDIA Jetson TX2)

Hi, I am running a DJI Manifold 2, which contains an NVIDIA Jetson TX2. The tracking script below fails with "CUDA unavailable", even though CUDA 9 is installed.

Source code: GitHub - mikel-brostrom/Yolov5_StrongSORT_OSNet: Real-time multi-camera multi-object tracker using YOLOv5 and StrongSORT with OSNet

Command: python3 track.py --source 0 --show-vid

Loading weights from deep_sort_pytorch/deep_sort/deep/checkpoint/ckpt.t7... Done!
Traceback (most recent call last):
  File "track.py", line 243, in <module>
    detect(opt)
  File "track.py", line 51, in detect
    device = select_device(opt.device)
  File "/media/dji/80GBstore/targetTrack/mikelbrostromNov2021/Yolov5_DeepSort_Pytorch/yolov5/utils/torch_utils.py", line 65, in select_device
    assert torch.cuda.is_available(), f'CUDA unavailable, invalid device {device} requested'  # check availability
AssertionError: CUDA unavailable, invalid device 0 requested

Python version and pip list

Python 3.8.9 (default, Apr  3 2021, 01:02:10) 
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.


torch                         1.8.0
torchaudio                    0.10.1
torchvision                   0.11.1


Drivers query

/usr/local/cuda/samples/1_Utilities/deviceQuery$ ./deviceQuery 
./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "NVIDIA Tegra X2"
  CUDA Driver Version / Runtime Version          9.0 / 9.0
  CUDA Capability Major/Minor version number:    6.2
  Total amount of global memory:                 7839 MBytes (8219348992 bytes)
  ( 2) Multiprocessors, (128) CUDA Cores/MP:     256 CUDA Cores
  GPU Max Clock rate:                            1301 MHz (1.30 GHz)
  Memory Clock rate:                             1600 Mhz
  Memory Bus Width:                              128-bit
  L2 Cache Size:                                 524288 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
  Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 32768
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 1 copy engine(s)
  Run time limit on kernels:                     No
  Integrated GPU sharing Host Memory:            Yes
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Supports Cooperative Kernel Launch:            Yes
  Supports MultiDevice Co-op Kernel Launch:      Yes
  Device PCI Domain ID / Bus ID / location ID:   0 / 0 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 9.0, CUDA Runtime Version = 9.0, NumDevs = 1
Result = PASS




nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Sun_Nov_19_03:16:56_CST_2017
Cuda compilation tools, release 9.0, V9.0.252



My system

DJI Manifold 2
NVIDIA Jetson TX2
ARMv8 Processor rev 3 (v8l) × 4 ARMv8 Processor rev 0 (v8l) × 2
NVIDIA Tegra X2 (nvgpu)/integrated
64-bit

Extra debugging as requested:

>>> import torch
>>> print(torch.version.cuda)
None

>>> torch.cuda.current_device()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/media/dji/80GBstore/pyenvs/newpy38/lib/python3.8/site-packages/torch/cuda/__init__.py", line 388, in current_device
    _lazy_init()
  File "/media/dji/80GBstore/pyenvs/newpy38/lib/python3.8/site-packages/torch/cuda/__init__.py", line 164, in _lazy_init
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled


>>> torch.cuda.device(0)
<torch.cuda.device object at 0x7fa871ad00>
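To make the checks above easier to repeat in each virtual environment, here is a minimal sketch that gathers them into one report. The function name `torch_cuda_report` is hypothetical (it is not part of torch); it only uses `torch.__version__`, `torch.version.cuda` and `torch.cuda.is_available()`. As far as I understand, `torch.cuda.device(0)` only constructs a context-manager object, so the fact that it returns something above probably does not prove that CUDA works.

```python
import importlib
import platform


def torch_cuda_report():
    """Collect basic facts about the interpreter and the installed torch build.

    Hypothetical helper for illustration; it is not part of torch.
    """
    report = {
        "machine": platform.machine(),        # CPU architecture as seen by Python
        "python": platform.python_version(),
    }
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        # torch is not installed in this environment
        report["torch"] = None
        return report

    report["torch"] = torch.__version__
    # None here means the installed torch binary was built without CUDA support
    report["torch_cuda_build"] = torch.version.cuda
    report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in torch_cuda_report().items():
        print(f"{key}: {value}")
```

If `torch_cuda_build` prints `None`, as it does in my session above, the installed torch is a CPU-only build, regardless of which CUDA toolkit is on the system.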

The odd thing is that I have also tried to install CUDA-enabled versions of torch from the PyTorch website, in virtual environments for Python 3.7 and Python 3.8. This is what I got:

pip install torch-1.1.0-cp37-cp37m-linux_x86_64.whl 
ERROR: torch-1.1.0-cp37-cp37m-linux_x86_64.whl is not a supported wheel on this platform.

pip install torch-1.0.1.post2-cp37-cp37m-linux_x86_64.whl
ERROR: torch-1.0.1.post2-cp37-cp37m-linux_x86_64.whl is not a supported wheel on this platform.

pip install torch-1.7.0+cu92-cp38-cp38-linux_x86_64.whl 
ERROR: torch-1.7.0+cu92-cp38-cp38-linux_x86_64.whl is not a supported wheel on this platform.

pip install torch-1.7.1+cu92-cp39-cp39-linux_x86_64.whl 
ERROR: torch-1.7.1+cu92-cp39-cp39-linux_x86_64.whl is not a supported wheel on this platform.

pip install torch-1.7.1+cpu-cp39-cp39-linux_x86_64.whl 
ERROR: torch-1.7.1+cpu-cp39-cp39-linux_x86_64.whl is not a supported wheel on this platform.
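A possible explanation for these errors is that every wheel above is tagged `linux_x86_64`, while the Jetson TX2 is a 64-bit ARM board, and some of the wheels are built for a different Python version than the environment. The sketch below compares the tags in a wheel filename with the running interpreter. The helper names `parse_wheel_tags` and `explain_mismatch` are hypothetical, the check is much simpler than what pip really does, and I am assuming that `platform.machine()` reports `aarch64` on this board.

```python
import platform
import sys


def parse_wheel_tags(filename):
    """Split a wheel filename into (python_tag, abi_tag, platform_tag)."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename}")
    stem = filename[: -len(".whl")]
    # The last three dash-separated fields are the python, ABI and platform tags.
    _, python_tag, abi_tag, platform_tag = stem.rsplit("-", 3)
    return python_tag, abi_tag, platform_tag


def explain_mismatch(filename, machine=None, version_info=None):
    """Return a list of reasons why a wheel probably does not fit this system.

    Simplified illustration only; pip's real compatibility check is stricter.
    """
    machine = machine or platform.machine()
    version_info = version_info or sys.version_info
    python_tag, _, platform_tag = parse_wheel_tags(filename)

    problems = []
    # Platform tags end with the CPU architecture, e.g. linux_x86_64.
    if platform_tag != "any" and not platform_tag.endswith("_" + machine):
        problems.append(f"built for {platform_tag}, but this machine is {machine}")
    # CPython wheels carry a tag such as cp38 for Python 3.8.
    expected = f"cp{version_info[0]}{version_info[1]}"
    if python_tag.startswith("cp") and python_tag != expected:
        problems.append(f"built for {python_tag}, but this interpreter is {expected}")
    return problems


if __name__ == "__main__":
    wheel = "torch-1.7.0+cu92-cp38-cp38-linux_x86_64.whl"
    # machine is passed explicitly here to mimic the Jetson (assumed aarch64)
    for reason in explain_mismatch(wheel, machine="aarch64", version_info=(3, 8)):
        print(reason)
```

Under those assumptions, every wheel I tried would be rejected for the architecture alone, and the cp39 wheels would also be rejected for the Python version.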

Here are the Python versions in the virtual environments:

Python 3.8.9 (default, Apr  3 2021, 01:02:10) 
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.


Python 3.7.10 (default, Feb 20 2021, 21:21:24) 
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.

Could anyone advise what the problem could be?

  1. Why can't a CUDA-enabled version of torch be installed?

  2. Are there any updates I have to make before this can be done?