Newb: How to troubleshoot CUDA not seeing the GPU: ./deviceQuery returns cudaGetDeviceCount 802 "system not yet initialized"

I’m new to CUDA and GPU stuff. I have an older GPU, a GT 720. My goal is to get tortoise-tts working, and I’m assuming I need CUDA to do this.

On bare metal, I installed Ubuntu 20.04 and the proprietary Nvidia driver (nvidia-driver-470).

Based on this I came up with these versions of cuda and cuda-drivers.

sudo apt-get install cuda=11.4.4-1 cuda-drivers=470.239.06-1

Everything installs correctly, but when I try to run the cuda-samples (11.4 version), they don’t work.

If I run the deviceQuery one, I get this.

./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

cudaGetDeviceCount returned 802
-> system not yet initialized
Result = FAIL

All of the CUDA 11.4 samples fail the same way. I assume this means CUDA can’t see my Nvidia device. So my questions are:

Do I need to even do this?

If so, how do I troubleshoot this issue?

Should I use the proprietary Nvidia driver or are there better options?

Do you have an old mainboard with nvidia chipset?
Please run nvidia-bug-report.sh as root and attach the resulting nvidia-bug-report.log.gz file to your post.

Here’s the debug log.

nvidia-bug-report.log.gz (236.8 KB)

All of this hardware is old, if that’s what you mean by an old mainboard. I’m trying to build a low-budget proof of concept, and part of it is text-to-speech.

As suspected, it’s an nvidia chipset. The driver has a bug that detects old nvidia chipsets as nvswitch systems, so CUDA won’t work. You might check if adding the modprobe option
options nvidia NVreg_NvLinkDisable=1
helps.
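
A minimal sketch of one way to apply that option on Ubuntu. The helper name and the default file name `/etc/modprobe.d/nvidia-nvlink.conf` are made up for illustration; any `.conf` file in `/etc/modprobe.d/` should be read by modprobe.

```shell
# Hypothetical helper: write the NvLink option into a modprobe.d file.
# The default path is an assumption; pass another path as $1 to override it.
write_nvlink_option() {
    conf_file="${1:-/etc/modprobe.d/nvidia-nvlink.conf}"
    printf '%s\n' 'options nvidia NVreg_NvLinkDisable=1' > "$conf_file"
}
```

Run `write_nvlink_option` as root. The option only takes effect when the nvidia module is loaded again, so on Ubuntu this would typically be followed by `sudo update-initramfs -u` and a reboot.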

Thank you so much for your help.

If I get another card, how do I know that it will work with CUDA? This one was on the list. Maybe I should look for whatever is the best card for the 470 driver that I have installed. I’m happy to go newer, but newer gets more expensive.

Thanks again.

It is your mainboard that is blocking cuda due to the nvidia driver bug, not the gpu.

Oh I see. Back to the drawing board - or main board.

I did try
options nvidia NVreg_NvLinkDisable=1
but it didn’t seem to work.

Does cat /proc/driver/nvidia/params |grep NvLink reflect the setting?
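
The same check can be wrapped in a small function that succeeds only when the loaded module reports the option as set. This is a sketch; the function name is hypothetical, and it assumes the params file uses `Name: value` lines.

```shell
# Hypothetical helper: succeed (exit 0) only if NvLinkDisable is reported as 1.
# Reads /proc/driver/nvidia/params by default; pass another path as $1 to override.
nvlink_disabled() {
    params_file="${1:-/proc/driver/nvidia/params}"
    if [ ! -r "$params_file" ]; then
        echo "cannot read $params_file (is the nvidia module loaded?)" >&2
        return 2
    fi
    # Split each line on ": " and pick the value of the NvLinkDisable entry.
    value=$(awk -F': *' '$1 == "NvLinkDisable" { print $2 }' "$params_file")
    [ "$value" = "1" ]
}
```

Usage: `nvlink_disabled && echo "option is active"`.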

Yeah, it says:

NvLinkDisable: 1

If I want to get another main board, what should I avoid? Should I look for boards that don’t have nvidia chips in them?

Boards with nvidia chipsets.
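
A rough way to spot such a board from a running system is to look for NVIDIA PCI devices that are not part of the graphics card. The sketch below is a heuristic, not an official check: it filters `lspci -nn` output for vendor ID 10de (NVIDIA) and drops display controllers (class 03xx) and audio devices (class 0403, the GPU's HDMI audio). The function name is hypothetical, and newer GPUs that expose extra functions such as USB controllers would be flagged incorrectly.

```shell
# Hypothetical helper: read `lspci -nn` output on stdin and print NVIDIA
# devices that are neither display controllers nor audio devices.
list_nvidia_chipset_devices() {
    # Keep lines with NVIDIA's PCI vendor ID (10de) ...
    grep -i -E '\[10de:[0-9a-f]{4}\]' |
        # ... then drop class 03xx (display) and 0403 (audio) entries.
        grep -v -i -E '^[^ ]+ [^[]*\[(03[0-9a-f]{2}|0403)\]:'
}
```

Usage: `lspci -nn | list_nvidia_chipset_devices`. Any host bridge, SMBus, or similar line in the output suggests an nvidia chipset on the mainboard.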


I wanted to follow up that I did pick up a board without an nvidia chipset and it worked immediately. Thank you.
