Cannot install PyTorch on Jetson Nano for Python 3.9

I tried this solution for the error shown below, but I still get the same error:

(env) rithvik@rithvik:~/pytorch$ echo $PATH
/home/rithvik/env/bin:/home/rithvik/.local/bin:/usr/local/cuda-11.0/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
(env) rithvik@rithvik:~/pytorch$ echo $LD_LIBRARY_PATH
/usr/local/cuda-11.0/lib64:/usr/local/cuda-11.0/lib64:/usr/local/cuda-11.0/lib64
(env) rithvik@rithvik:~/pytorch$ nvcc --version 
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Thu_Jun_11_22:26:42_PDT_2020
Cuda compilation tools, release 11.0, V11.0.194
Build cuda_11.0_bu.TC445_37.28540450_0
(env) rithvik@rithvik:~/pytorch$ ipython
Python 3.8.0 (default, Dec  9 2021, 17:53:27) 
Type 'copyright', 'credits' or 'license' for more information
IPython 8.12.2 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import torch

In [2]: torch.cuda.is_available()
/home/rithvik/pytorch/torch/cuda/__init__.py:52: UserWarning: CUDA initialization: CUDA unknown error - this may be due to an incorrectly set up environment, e.g. changing env variable CUDA_VISIBLE_DEVICES after program start. Setting the available devices to be zero. (Triggered internally at  /home/rithvik/pytorch/c10/cuda/CUDAFunctions.cpp:100.)
  return torch._C._cuda_getDeviceCount() > 0
Out[2]: False
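
In case it makes the setup easier to compare, the checks above can be gathered in one script. This is only a minimal sketch: `unique_entries` and `collect_cuda_diagnostics` are hypothetical helper names introduced here, and the script assumes a Linux-style `:` separator in `LD_LIBRARY_PATH`. It reports the interpreter version, where `nvcc` resolves, the deduplicated library path, and, if `torch` imports, the CUDA version it was built against.

```python
import os
import shutil
import sys


def unique_entries(path_value):
    """Split a ':'-separated path string and drop repeats, keeping order.

    Hypothetical helper; assumes the Linux path separator.
    """
    seen = []
    for entry in path_value.split(":"):
        if entry and entry not in seen:
            seen.append(entry)
    return seen


def collect_cuda_diagnostics():
    """Collect basic CUDA/PyTorch environment facts into a dict (hypothetical helper)."""
    info = {
        "python": sys.version.split()[0],
        "nvcc": shutil.which("nvcc"),  # None if nvcc is not on PATH
        "CUDA_VISIBLE_DEVICES": os.environ.get("CUDA_VISIBLE_DEVICES"),
        "LD_LIBRARY_PATH": unique_entries(os.environ.get("LD_LIBRARY_PATH", "")),
    }
    try:
        import torch
    except ImportError:
        info["torch"] = None
    else:
        info["torch"] = torch.__version__
        # CUDA version the PyTorch build was compiled against (None for CPU-only builds)
        info["torch_built_for_cuda"] = torch.version.cuda
        info["cuda_available"] = torch.cuda.is_available()
    return info


if __name__ == "__main__":
    for key, value in collect_cuda_diagnostics().items():
        print(f"{key}: {value}")
```

Running it inside the same virtual environment as the `ipython` session should show whether the interpreter, `nvcc`, and the PyTorch build all refer to the same CUDA installation.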

I also looked into this post and got the following output when I ran its commands on my device:

(env) rithvik@rithvik:~/pytorch$ cat /sys/devices/gpu.0/gpu_powered_on
1
(env) rithvik@rithvik:~/pytorch$ cat /sys/devices/gpu.0/devfreq/57000000.gpu/power/runtime_enabled
disabled
(env) rithvik@rithvik:~/pytorch$ cd /usr/local/cuda/samples/0_Simple/vectorAdd
(env) rithvik@rithvik:/usr/local/cuda/samples/0_Simple/vectorAdd$ sudo make
[sudo] password for rithvik: 
/usr/local/cuda-11.0/bin/nvcc -ccbin g++ -I../../common/inc  -m64    -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_75,code=compute_75 -o vectorAdd.o -c vectorAdd.cu
/usr/local/cuda-11.0/bin/nvcc -ccbin g++   -m64      -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_75,code=compute_75 -o vectorAdd vectorAdd.o 
mkdir -p ../../bin/sbsa/linux/release
cp vectorAdd ../../bin/sbsa/linux/release
(env) rithvik@rithvik:/usr/local/cuda/samples/0_Simple/vectorAdd$ ./vectorAdd
[Vector addition of 50000 elements]
Failed to allocate device vector A (error code unknown error)!
(env) rithvik@rithvik:/usr/local/cuda/samples/0_Simple/vectorAdd$
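
One detail in the `make` output that may be worth checking is the list of `-gencode` targets. The sketch below parses them out of the `nvcc` command line and compares them with the architecture expected for the board. `gencode_targets` is a hypothetical helper name, and the value `sm_53` rests on the assumption that the Jetson Nano GPU has compute capability 5.3, which should be verified for the actual device. I am not certain this is related to the "unknown error" above; it is only a consistency check.

```python
import re


def gencode_targets(nvcc_command):
    """Return the set of sm_XX targets named in -gencode flags (hypothetical helper)."""
    return set(re.findall(r"code=(sm_\d+)", nvcc_command))


# The compile command copied from the make output above.
command = (
    "/usr/local/cuda-11.0/bin/nvcc -ccbin g++ -I../../common/inc -m64 "
    "-gencode arch=compute_61,code=sm_61 "
    "-gencode arch=compute_70,code=sm_70 "
    "-gencode arch=compute_75,code=sm_75 "
    "-gencode arch=compute_75,code=compute_75 "
    "-o vectorAdd.o -c vectorAdd.cu"
)

# Assumption: the Jetson Nano GPU is compute capability 5.3 (sm_53).
expected = "sm_53"

targets = gencode_targets(command)
print("built for:", sorted(targets))
print(f"{expected} included:", expected in targets)
```

If the expected architecture is missing from the list, the sample binary contains no device code compiled specifically for this GPU, which suggests the installed toolkit and samples may not be the ones intended for this board.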