PyTorch "Found No Nvidia Driver on your system" on Jetson Nano

seth_the_forbus · February 25, 2021, 10:27pm

Good afternoon,

I recently started the NVIDIA “Getting Started with AI on Jetson Nano” course. I’m using SSH from Pop!_OS (an Ubuntu-based distribution) on my second PC, connected over USB to my Jetson Nano. I was able to set up the JupyterLab server and connect, and I tested the camera successfully. But when I got to the “Classification interactive” section of the course and ran the import torch, import torchvision, and device = torch.device('cuda') commands, the notebook returned the error “AssertionError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver”.

The PC I’m connecting from does not have an NVIDIA GPU; it is a simple Gigabyte Brix micro PC with a 5th-gen i5. I have access to high-end NVIDIA laptops and desktops, but my understanding is that this Jupyter notebook should be using the CUDA cores on the Jetson Nano itself, not the system I’m accessing the notebook from.

I’ve already run sudo apt-get update and upgrade, and my Nano is running JetPack version 4.4.1-b50. Any insight is appreciated. I’m hoping to get this error resolved, as I’m quite enjoying this course!

Best Regards, -Seth

MtHiker · February 26, 2021, 12:57am

Check that the /usr/local/cuda/lib64/libcudart.* files exist on your Jetson Nano.
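
For anyone who would rather script the check above, here is a minimal Python sketch. It assumes the default CUDA install path mentioned in this post; the helper name find_libcudart is made up for illustration and is not part of any NVIDIA tool.

```python
import glob
import os


def find_libcudart(lib_dir="/usr/local/cuda/lib64"):
    """Return the sorted list of libcudart.* files found in lib_dir."""
    return sorted(glob.glob(os.path.join(lib_dir, "libcudart.*")))


if __name__ == "__main__":
    matches = find_libcudart()
    if matches:
        print("Found CUDA runtime libraries:")
        for path in matches:
            print("  " + path)
    else:
        print("No libcudart.* files found; the CUDA toolkit may be missing.")
```

An empty result only says the files are not visible at that path; it does not by itself explain the PyTorch error.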

seth_the_forbus · February 26, 2021, 1:07am

I believe so; here is a screenshot. I wouldn’t have had to install that as long as I used the official image on the SD card when setting it up, correct?

AastaLLL · February 26, 2021, 3:05am

Hi,

Could you open a terminal window and go to the CUDA samples folder to collect the device information for us?

$ cd /usr/local/cuda/samples/1_Utilities/deviceQuery
$ sudo make
$ ./deviceQuery

Thanks.

seth_the_forbus · February 26, 2021, 3:00pm

dusty_nv · February 26, 2021, 4:41pm

Hi @seth_the_forbus, can you try running the same deviceQuery sample from inside the dlinano container?

After you start the container over SSH, it will give you a terminal prompt inside the container. From that terminal you can cd to /usr/local/cuda/samples and run deviceQuery again.

Is the version of the DLI container you are running nvcr.io/nvidia/dli/dli-nano-ai:v2.0.1-r32.4.4?
This is the version of the container for JetPack 4.4.1 / L4T R32.4.4.

seth_the_forbus · February 26, 2021, 5:56pm

I tried your instructions, Dusty (thanks for all your help, btw). I get the error “sudo: command not found” when I try to run sudo make after changing directory to the cuda/samples folder. As for the container, I started it with the following command.

sudo docker run --runtime nvidia -it --rm --network host \
    --volume ~/nvdli-data:/nvdli-nano/data \
    --device /dev/video0 \
    nvcr.io/nvidia/dli/dli-nano-ai:v2.0.1-r32.4.4

dusty_nv · February 26, 2021, 6:23pm

Since you previously built the deviceQuery sample from outside the container, you should find the binary inside the container under /usr/local/cuda/samples/bin/aarch64/linux/release. If not, try building it from outside the container again. Your /usr/local/cuda directory gets mapped from your device into the container, so changes that you make to it from the device should be reflected inside the container.

Also, if you cat /etc/nv_tegra_release, does it also show L4T version R32.4.4?
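
If it helps to compare the L4T release against the container tag in one step, the check could be sketched in Python roughly as follows. This assumes the first line of /etc/nv_tegra_release has the layout "# R32 (release), REVISION: 4.4, ..."; both helper names are hypothetical, and the tag comparison only reflects the naming pattern of the tags quoted in this thread.

```python
import re


def parse_l4t_version(text):
    """Extract a version like 'R32.4.4' from /etc/nv_tegra_release contents.

    Assumes the first line looks like:
    '# R32 (release), REVISION: 4.4, GCID: ..., BOARD: ...'
    Returns None if the text does not match that assumed layout.
    """
    match = re.search(r"R(\d+)\s*\(release\),\s*REVISION:\s*([\d.]+)", text)
    if match is None:
        return None
    return "R{}.{}".format(match.group(1), match.group(2))


def tag_matches_l4t(image_tag, l4t_version):
    """Check whether a container tag ends with the L4T version, e.g. '-r32.4.4'."""
    return image_tag.lower().endswith("-" + l4t_version.lower())


if __name__ == "__main__":
    try:
        with open("/etc/nv_tegra_release") as f:
            version = parse_l4t_version(f.read())
    except OSError:
        version = None  # file not readable here (not a Jetson, or not mapped in)
    print("L4T version:", version)
    image = "nvcr.io/nvidia/dli/dli-nano-ai:v2.0.1-r32.4.4"
    if version is not None:
        print("Tag matches L4T:", tag_matches_l4t(image, version))
```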

seth_the_forbus · February 26, 2021, 6:48pm

To clarify, when you talk about running the terminal inside of the container, do you mean clicking terminal inside of the Jupyter notebook? I was doing it over SSH before so I might have been confused. I did plug my display port back into my jetson and tried building it outside of the container successfully. I also ran the cat /etc/nv_tegra_release on the hardware itself and verified L4T version R32.4.4.

seth_the_forbus · February 26, 2021, 6:56pm

Sorry for the second post, forum wouldn’t let me embed two images in one post as a new user.

Continued:
From the JupyterLab terminal, I get “No such file or directory” on the cat command, and the same error when I try to run ./deviceQuery. I did reboot my Nano after running make on the device itself and before trying again in JupyterLab.

I initially set up my Jetson Nano about 6 months ago, before a change in living situation forced me to box it up. I’m wondering if I messed something up when I updated it out of the box this time, and if I should just flash a fresh SD card image and start over at this point?

dusty_nv · February 26, 2021, 7:13pm

You can either use the terminal inside the JupyterLab or on your SSH, they should be the same. I think you aren’t finding deviceQuery because it doesn’t get built to that path - instead, check for it under either of these:

/usr/local/cuda/samples/bin/aarch64/linux/release
/usr/local/cuda/samples/1_Utilities/deviceQuery

I haven’t actually seen the PyTorch error you are getting on Jetson before, so if you have another SD card or are willing to backup your work, I would re-flash it with the latest JetPack 4.5 image (L4T R32.5.0). Then run the nvcr.io/nvidia/dli/dli-nano-ai:v2.0.1-r32.5.0 version of the container instead.

seth_the_forbus · February 26, 2021, 7:24pm

Thank you so much, Dusty! You seem to have fixed it. After running cd to /usr/local/cuda/samples/1_Utilities/deviceQuery, I ran deviceQuery. After that, the Jupyter notebook recognized the CUDA cores and everything was fixed. Hopefully this post helps someone else if they ever run into this. I’m very appreciative of your time and energy! :)

Best Regards, -Seth

dusty_nv · February 26, 2021, 8:22pm

OK good to know, thanks - glad you got it running! I have not seen that before, where the driver appears to need some ‘pre-initialization’ inside of the container - will keep an eye out for it.
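
For anyone landing here with the same error, the workaround from this thread could be scripted roughly like this. It is only a sketch: the path is the deviceQuery build location mentioned above, warm_up_cuda and cuda_available are hypothetical helper names, and the thread does not establish why running deviceQuery first helps, so this may not apply to other setups.

```python
import os
import subprocess

# Build location of the deviceQuery sample, as given earlier in this thread.
DEVICE_QUERY_DIR = "/usr/local/cuda/samples/1_Utilities/deviceQuery"


def warm_up_cuda(sample_dir=DEVICE_QUERY_DIR):
    """Run ./deviceQuery from sample_dir; return True if it exits with code 0."""
    binary = os.path.join(sample_dir, "deviceQuery")
    if not os.path.isfile(binary):
        return False  # sample not built yet (build it outside the container)
    try:
        result = subprocess.run([binary], cwd=sample_dir,
                                stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    except OSError:
        return False  # binary exists but could not be executed
    return result.returncode == 0


def cuda_available():
    """Return True if PyTorch is importable and reports a usable CUDA device."""
    try:
        import torch
    except ImportError:
        return False
    return bool(torch.cuda.is_available())


if __name__ == "__main__":
    print("deviceQuery succeeded:", warm_up_cuda())
    print("torch.cuda.is_available():", cuda_available())
```

Run it from a terminal inside the container before opening the notebook; if the second line still prints False, re-flashing as suggested above is probably the next thing to try.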