A10 Card not recognized on Ubuntu

Hello,

We recently acquired an Nvidia A10 card and have been having a hard time getting it recognized by the Nvidia driver on our Ubuntu 18.04 LTS machine. We’ve tried running it on its own as well as alongside a GTX 1050Ti card. In both cases the A10 is not recognized. With the card on its own, nvidia-smi returns:

NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

With the GTX 1050Ti also in the machine, nvidia-smi sees the 1050Ti without an issue, but not the A10.

We have tried a multitude of BIOS setting variations with regard to secure boot, fast boot, CSM boot, and enabling above 4G decoding.

We made sure the 470.57.02 driver is installed, verified by running dkms status.
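
As a rough sketch of that check: dkms status only helps if the module is reported as installed for the kernel that is actually running. The helper below filters dkms status output for that condition. The name nvidia_dkms_ok is hypothetical, and the exact dkms output layout varies between dkms versions, so the match is deliberately loose.

```shell
# Reads `dkms status` output on stdin and succeeds only if some line mentions
# nvidia, the given kernel release, and the word "installed".
# nvidia_dkms_ok is a hypothetical helper name, not part of dkms.
nvidia_dkms_ok() {
    kernel="$1"
    grep -i 'nvidia' | grep -F "$kernel" | grep -q 'installed'
}

# Typical use on the affected machine:
#   dkms status | nvidia_dkms_ok "$(uname -r)" && echo "nvidia module built for running kernel"
```

If the check fails, the driver may have been built for a different kernel than the one booted, which would also produce the "couldn't communicate with the NVIDIA driver" message.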

We can definitely see the card pop up on lspci:

41:00.0 3D controller: NVIDIA Corporation Device 2236 (rev a1)
	Subsystem: NVIDIA Corporation Device 1482
	Flags: fast devsel, IRQ 112, NUMA node 2
	Memory at <unassigned> (64-bit, prefetchable)
	Memory at 80000000 (64-bit, prefetchable) [size=32M]
	Capabilities: <access denied>
	Kernel modules: nvidiafb, nouveau, nvidia_drm, nvidia
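
The "Memory at <unassigned>" line in the output above is worth noting: it may mean the firmware could not allocate address space for one of the card's memory regions, which is the kind of problem the above 4G decoding setting is meant to address. A minimal sketch for spotting this, assuming the device address 41:00.0 from the output above (unassigned_bars is a hypothetical helper name):

```shell
# Reads `lspci -v` output for a device on stdin and prints every memory
# region that has no address assigned. Exit status is 0 if at least one
# such region is found, non-zero otherwise.
# unassigned_bars is a hypothetical helper name.
unassigned_bars() {
    grep -F 'Memory at <unassigned>'
}

# Typical use on the affected machine:
#   sudo lspci -v -s 41:00.0 | unassigned_bars
```

Running lspci with sudo also fills in the Capabilities line, which shows "<access denied>" for unprivileged users.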

Machine Info:

Motherboard Prime x399-A
AMD Ryzen Threadripper 2990WX 32-Core Processor
Ubuntu 18.04.5 - Kernel 5.4.0-81-generic

Thank you!

Hi @mohie, please try this solution:

sudo apt-get purge nvidia-*
sudo apt-get update
sudo apt-get autoremove
sudo apt install libnvidia-common-470
sudo apt install libnvidia-gl-470
sudo apt install nvidia-driver-470
sudo reboot

Once rebooted, nvidia-smi should work as expected. If you already have a CUDA version other than 11.2 or 11.4 installed on your machine, please skip the sudo apt-get autoremove command, because it removes all unused packages. The A10 GPU requires the NVIDIA driver recommended for CUDA 11.2 and 11.4.
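
One way to see what the autoremove step would do before running it is apt-get's simulate mode (-s), which prints a "Remv" line for each package it would remove without changing anything. The sketch below extracts the package names from that output. The helper name autoremove_candidates is hypothetical.

```shell
# Reads `apt-get -s autoremove` output on stdin and prints the name of each
# package that would be removed (the second field of lines starting "Remv").
# autoremove_candidates is a hypothetical helper name.
autoremove_candidates() {
    awk '$1 == "Remv" { print $2 }'
}

# Typical use, listing only CUDA-related packages that would be removed:
#   apt-get -s autoremove | autoremove_candidates | grep -i cuda
```

If the list contains packages you still need, skip the autoremove step.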

Hope this will work.

Please enable “above 4G decoding”/“large/64bit BAR” support in the BIOS, disable CSM, and install the OS in UEFI mode.
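
To check whether an existing installation was booted in UEFI mode, a small sketch: on Linux the directory /sys/firmware/efi exists only when the system was booted through UEFI. The function name boot_mode and its optional argument are hypothetical and exist only to make the check easy to try out.

```shell
# Prints "UEFI" if the kernel exposes EFI firmware data, otherwise "BIOS".
# The optional argument overrides the sysfs root (default /sys); it is a
# hypothetical parameter added only so the function can be tested.
boot_mode() {
    sysfs_root="${1:-/sys}"
    if [ -d "$sysfs_root/firmware/efi" ]; then
        echo "UEFI"
    else
        echo "BIOS"
    fi
}

# Typical use:
#   boot_mode
```

A result of "BIOS" means the OS was started in legacy/CSM mode, on bare metal or in a VM.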

Hi, I have access to a server at my university with the same card, NVIDIA Corporation Device 2236 (rev a1), running Ubuntu 18.04.

I tried installing nvidia-driver-470 and the other supporting packages mentioned above, but when I run nvidia-smi, I get: No devices were found.

I repeated the process above, removing the old drivers and installing nvidia-driver-460 and many other versions, but nothing worked.

This is what I got when I ran lspci | egrep 'VGA|3D':
00:0f.0 VGA compatible controller: VMware SVGA II Adapter
03:00.0 3D controller: NVIDIA Corporation Device 2236 (rev a1)

Can someone help me install the right drivers for the above-mentioned graphics card?

Attaching bug report. (@generix)
nvidia-bug-report.log.gz (667.0 KB)

Please run nvidia-bug-report.sh as root and attach the resulting nvidia-bug-report.log.gz file to your post.

The VM you’re using is incorrectly set up. It needs to be deleted and recreated with EFI boot.
https://forums.developer.nvidia.com/t/unable-to-start-fabricmanager-on-ubuntu-20-04-lts-server-with-a100/205525/2?u=generix

I saw that post, but I couldn’t work out the solution from it, apologies for my stupidity. Can you point me to a resource on reinstalling Ubuntu and setting it to EFI boot?

It’s not an Ubuntu setting but an ESXi one: when creating the VM, in “VM options”, change the boot method from “BIOS” to “EFI”.