Urgent problem with the GPU driver

Hi guys,

I’m having trouble using the GPU to run my PyTorch programs. I checked CUDA and it appears to be version 10.2, which should be fine. However, it seems that the NVIDIA GPU driver is missing. I found a folder named “nvidia” in “/usr/src”, but according to what I’ve read online, the folder name should include a version number. In my case, it’s just named “nvidia” and it contains only a folder called “graphics_demos”, so I suspect that the GPU driver was not installed correctly. Can anyone tell me what I should do? I don’t want to flash the device with SDK Manager because I’m concerned about damaging the whole system. Are there any other methods available?

Thank you.

Are you logged in directly to the Jetson in the graphical desktop? If so, then what do you see from:
glxinfo | egrep -i '(nvidia|version)'
(you might need to run “sudo apt-get install mesa-utils” to get glxinfo)

If you are not logged in directly like that, there are many other things that can go wrong.

Also, check what you see at:
ls -ld /usr/local/cuda*
(user space software)

Also, check the output of:
lsmod | grep nvgpu
(the GPU kernel driver, which the Xorg driver depends on)
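
If it is easier, the three checks above can be collected by one small script. This is only a rough sketch: the helper name `collect_gpu_checks` is made up for illustration, it reads /proc/modules directly instead of calling lsmod, and it assumes Python 3.6 or newer. The glxinfo part still needs mesa-utils and a local desktop session to say anything useful.

```python
import glob
import shutil
import subprocess


def collect_gpu_checks():
    """Gather the three checks from this post into a dict (hypothetical helper)."""
    results = {}

    # 1. glxinfo lines mentioning "nvidia" or "version".
    #    None means glxinfo is not installed (sudo apt-get install mesa-utils).
    if shutil.which("glxinfo") is None:
        results["glxinfo"] = None
    else:
        try:
            proc = subprocess.run(
                ["glxinfo"],
                stdout=subprocess.PIPE,
                stderr=subprocess.PIPE,
                universal_newlines=True,
                timeout=30,
            )
            results["glxinfo"] = [
                line.strip()
                for line in proc.stdout.splitlines()
                if "nvidia" in line.lower() or "version" in line.lower()
            ]
        except (OSError, subprocess.SubprocessError):
            # glxinfo exists but could not run (for example, no display).
            results["glxinfo"] = []

    # 2. User space CUDA directories, same pattern as: ls -ld /usr/local/cuda*
    results["cuda_dirs"] = sorted(glob.glob("/usr/local/cuda*"))

    # 3. Any loaded module line mentioning nvgpu, like: lsmod | grep nvgpu
    try:
        with open("/proc/modules") as modules:
            results["nvgpu_loaded"] = any("nvgpu" in line for line in modules)
    except OSError:
        results["nvgpu_loaded"] = False

    return results


if __name__ == "__main__":
    for name, value in collect_gpu_checks().items():
        print("{}: {}".format(name, value))
```

Posting the printed output here would give the same information as running the three commands one at a time.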

Thank you for your reply. Here’s what I’ve got. Can you help me further?

Additionally, when I run “nvidia-smi”, the system shows a “command not found” error. When I execute “torch.cuda.is_available()”, it returns False. The result of “torch.cuda.device_count()” is 0.

The commands I gave show that both the GPU driver and CUDA are present, and seem to work. The problem is that you are using incompatible GPU detection software.

The background is that ordinary desktop GPUs are discrete GPUs (dGPU) attached to the PCIe bus, but Jetsons have an integrated GPU (iGPU) directly connected to the memory controller. nvidia-smi comes with the driver intended for a dGPU on a PCIe bus, which is the wrong driver for a Jetson. Had you actually installed something which adds nvidia-smi, the working driver would probably have been broken. You won’t find any dGPU software on a Jetson.
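
As a rough illustration of the difference, a script can check whether it is running on a Jetson before it goes looking for nvidia-smi. This sketch rests on an assumption that is not from this thread: that the L4T release on the Jetson provides the file /etc/nv_tegra_release. The helper name `describe_platform` is also made up.

```python
import os
import shutil


def describe_platform(release_file="/etc/nv_tegra_release"):
    """Report whether this looks like a Jetson (hypothetical helper).

    Assumption: L4T on a Jetson provides release_file; a desktop PC
    with a dGPU does not.
    """
    is_jetson = os.path.isfile(release_file)
    has_nvidia_smi = shutil.which("nvidia-smi") is not None

    if is_jetson and not has_nvidia_smi:
        note = "Jetson iGPU: no nvidia-smi, as described above"
    elif is_jetson:
        note = "Jetson iGPU: nvidia-smi is on the PATH"
    else:
        note = "No L4T release file found: probably not a Jetson"

    return {
        "is_jetson": is_jetson,
        "has_nvidia_smi": has_nvidia_smi,
        "note": note,
    }


if __name__ == "__main__":
    print(describe_platform()["note"])
```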

I think @dusty_nv could answer the torch.cuda.* questions. There are different methods for detecting the iGPU. Perhaps you have the wrong version of torch, or perhaps it is just a case of a different function call. He’ll see this thread because I used the @ with his name.
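
In the meantime, one thing that may help narrow down the "wrong version of torch" possibility is to ask torch which CUDA version it was built against. A minimal sketch, with a made-up helper name `torch_cuda_report`: if `torch.version.cuda` comes back as None, the installed package is a CPU-only build, and `torch.cuda.is_available()` will return False no matter what the driver is doing.

```python
import importlib


def torch_cuda_report():
    """Summarize what the installed torch build says about CUDA (hypothetical helper)."""
    try:
        torch = importlib.import_module("torch")
    except (ImportError, OSError) as exc:
        # torch is missing, or it failed to load a shared library it needs.
        return {"installed": False, "error": str(exc)}

    return {
        "installed": True,
        "torch_version": torch.__version__,
        # None means this torch package was built without CUDA support.
        "built_with_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
        "device_count": torch.cuda.device_count(),
    }


if __name__ == "__main__":
    for name, value in torch_cuda_report().items():
        print("{}: {}".format(name, value))
```

If `built_with_cuda` is None, the fix is on the torch side (a different build of the package), not on the driver side.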

Thanks! I will try to find a solution to the iGPU driver problem.

@dusty_nv might be able to suggest a way to install torch that works with the iGPU and does not need nvidia-smi.
