Trouble getting PyTorch to work with CUDA

Hello,

I’m running into an issue where I can’t get PyTorch with CUDA support to work. I need to run a few different neural network models, but I cannot get “torch.cuda.is_available()” to return “True”.
I tried running the models anyway despite the “False” result, but that didn’t work (as expected).
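For reference, this is roughly the check I’m running (paraphrased, but it captures the calls I care about):

    import torch

    # Installed PyTorch version (2.0.1 per pip)
    print(torch.__version__)

    # This is the call that keeps coming back False for me
    print(torch.cuda.is_available())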

Specs:

  •   Running IGX-SW 1.0 Production Release
  •   cat /sys/class/dmi/id/bios_version: 36.3.1-gcid-36302503
  •   nvidia-smi:
      -   NVIDIA-SMI 535.183.01
      -   Driver version: 535.183.01
      -   CUDA Version: 12.2
  •   nvcc:
      -   Cuda compilation tools, release 11.5, V11.5.119
      -   Build cuda_11.5.r11.5/compiler.30672275_0
  •   Python version: 3.10.12
  •   PyTorch version: 2.0.1

What I’ve tried:

  • Installing PyTorch the standard way (using pip and the commands from the PyTorch website), which didn’t work.
  • Installing and testing a few PyTorch wheels from NVIDIA’s website, which didn’t work for either the iGPU or the dGPU (verified with roughly the check shown after this list).
  • Updating the GPU with the ARM vBIOS, though it looks like I was already on the latest version, since the update finished almost instantly.
  • Several other potential fixes found online, such as reinstalling specific libraries and changing library versions, but none of those worked either.
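For what it’s worth, the verification I run after each install attempt looks roughly like this (the exact script varied a little between attempts):

    import torch

    # CUDA version the wheel was built against; None would suggest a CPU-only build
    print(torch.version.cuda)

    # cuDNN version visible to the wheel, if any
    print(torch.backends.cudnn.version())

    # Number of GPUs PyTorch can see (this comes back 0 for me)
    print(torch.cuda.device_count())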

How I got here:

  • I installed the dGPU version of IGX OS v1.0 on our IGX Orin dev kit.
  • I then installed PyTorch using “pip3 install torch torchvision torchaudio”.
  • I tried “import torch” after the installation, but I got an error along the lines of “libcudnn not found” (see the library check after this list).
  • After some googling, I was pointed towards installing “nvidia-cuda-toolkit”. I believe that’s why nvidia-smi and nvcc now report two different CUDA versions.
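If it helps with diagnosis, I can also run something like the following (purely an illustrative sanity check on my part) to see whether the dynamic loader can even resolve the CUDA/cuDNN libraries, and post the output:

    import ctypes.util

    # find_library returns a resolvable library name (via ldconfig on Linux) or None
    for name in ("cudart", "cudnn"):
        print(name, "->", ctypes.util.find_library(name))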

Please let me know if you have any suggestions or potential fixes.
If needed, I can reflash the OS as there’s nothing of importance currently stored on the dev kit.

Thank you.