RTX 3060 (mobile) not shown in vulkaninfo although driver seems to be installed as it should

Hi all,
we’re currently trying to get a laptop’s dedicated NVIDIA RTX 3060 (mobile) card running.

The proprietary NVIDIA drivers (version 550) are installed on Pop!_OS. However, they do not seem to work correctly, at least as far as Vulkan is concerned. We tried both the “NVIDIA only” and the “Hybrid” mode of Pop!_OS. All outputs below were gathered in Hybrid mode; switching to “NVIDIA only” mode doesn’t seem to make any difference.
Attached is the vulkaninfo log file. Running __NV_PRIME_RENDER_OFFLOAD=1 __GLX_VENDOR_LIBRARY_NAME=nvidia vulkaninfo > vulkaninfo.log also prints the following error messages:

ERROR: [Loader Message] Code 0 : loader_scanned_icd_add: Could not get 'vkCreateInstance' via 'vk_icdGetInstanceProcAddr' for ICD libGLX_nvidia.so.0
WARNING: [Loader Message] Code 0 : terminator_CreateInstance: Failed to CreateInstance in ICD 3.  Skipping ICD.

The vulkaninfo.log doesn’t contain any reference to the RTX 3060, which seems weird.
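
One rough way to double-check which devices a vulkaninfo log actually reports is sketched below. It assumes the log contains lines of the form `deviceName = <name>`, as vulkaninfo's text output usually does; the function name `list_vulkan_devices` is made up for illustration and is not part of any Vulkan tooling.

```python
import re

# Assumption: each physical device appears in the log on a line such as
#   deviceName        = <some name>
DEVICE_NAME = re.compile(r"^\s*deviceName\s*=\s*(.+?)\s*$", re.MULTILINE)


def list_vulkan_devices(vulkaninfo_text):
    """Return the distinct device names found in vulkaninfo text output, in order."""
    seen = []
    for name in DEVICE_NAME.findall(vulkaninfo_text):
        if name not in seen:
            seen.append(name)
    return seen
```

Calling `list_vulkan_devices(open("vulkaninfo.log").read())` on the attached log and looking for an entry containing "RTX 3060" would show whether the Vulkan loader enumerated the card at all.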

Also, running e.g. the CARLA simulator with sudo docker run --privileged --gpus all --net=host -e DISPLAY=$DISPLAY carlasim/carla:0.9.12 /bin/bash ./CarlaUE4.sh (with the NVIDIA Container Toolkit installed, of course) throws an “Incompatible Vulkan driver found! Cannot find a compatible Vulkan driver (ICD)” exception.

Running __NV_PRIME_RENDER_OFFLOAD=1 __GLX_VENDOR_LIBRARY_NAME=nvidia glxinfo results in the following output:

name of display: :0
X Error of failed request:  BadValue (integer parameter out of range for operation)
  Major opcode of failed request:  152 (GLX)
  Minor opcode of failed request:  24 (X_GLXCreateNewContext)
  Value in failed request:  0x0
  Serial number of failed request:  50
  Current serial number in output stream:  51

Running nvidia-smi results in this (also when run in Pop!_OS’ “NVIDIA only” mode):

No devices were found

Running lshw -c video though shows the correct information regarding the graphics card:

  *-display                 
       description: 3D controller
       product: GA106M [GeForce RTX 3060 Mobile / Max-Q]
       vendor: NVIDIA Corporation
       physical id: 0
       bus info: pci@0000:01:00.0
       logical name: /dev/fb0
       version: a1
       width: 64 bits
       clock: 33MHz
       capabilities: pm msi pciexpress bus_master cap_list rom fb
       configuration: depth=32 driver=nvidia latency=0 mode=1920x1080 visual=truecolor xres=1920 yres=1080
       resources: iomemory:fa0-f9f iomemory:fc0-fbf irq:71 memory:d0000000-d0ffffff memory:fa00000000-fbffffffff memory:fc00000000-fc01ffffff ioport:3000(size=128) memory:d1000000-d107ffff
  *-display
       description: VGA compatible controller
       product: Cezanne
       vendor: Advanced Micro Devices, Inc. [AMD/ATI]
       physical id: 0
       bus info: pci@0000:05:00.0
       logical name: /dev/fb0
       version: c6
       width: 64 bits
       clock: 33MHz
       capabilities: pm pciexpress msi msix vga_controller bus_master cap_list fb
       configuration: depth=32 driver=amdgpu latency=0 resolution=1920,1080
       resources: iomemory:fc0-fbf iomemory:fc0-fbf irq:44 memory:fc10000000-fc1fffffff memory:fc20000000-fc201fffff ioport:1000(size=256) memory:d1400000-d147ffff

Thanks in advance for any help! If you need any more information do not hesitate to ask :).

nvidia-bug-report.log.gz (302.3 KB)
vulkaninfo.log (121.3 KB)

The gpu is failing:
[ 27.450338] NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x25:0xffff:1589)
Please downgrade to driver 535 to check for a driver issue. If that doesn’t help, the gpu is likely broken.
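
For anyone who wants to check their own kernel log for the same failure, here is a minimal sketch. The pattern is derived only from the single NVRM line quoted above, so treat the exact layout as an assumption; the helper name `find_rm_init_failures` is made up for illustration.

```python
import re

# Assumption: failure lines look like the one quoted above, i.e.
#   NVRM: GPU <pci address>: RmInitAdapter failed! (<error code>)
RM_INIT_FAILED = re.compile(
    r"NVRM: GPU (?P<pci>[0-9a-fA-F:.]+): RmInitAdapter failed! \((?P<code>[^)]*)\)"
)


def find_rm_init_failures(log_text):
    """Return (pci_address, error_code) pairs for every RmInitAdapter failure."""
    return [(m.group("pci"), m.group("code")) for m in RM_INIT_FAILED.finditer(log_text)]


# The line from this thread, used as sample input.
sample = "[ 27.450338] NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x25:0xffff:1589)"
print(find_rm_init_failures(sample))
```

Feeding it the output of `sudo dmesg`, or the kernel messages contained in nvidia-bug-report.log.gz, should list every GPU that failed to initialise.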

Hey generix, thanks for your help!
Indeed we previously tried older driver versions, iirc 535, 515, 4xx and none worked; all seemed to have the same error.
We’ll try downgrading again though, maybe it helps.
Do you recommend using the distribution’s package or installing with the .run file?

So the RmInitAdapter error seems to be the root cause?

Yes.
Just use the pop_os provided nvidia packages.

Just to let you and others know: the hardware was indeed the problem. Installing Windows and checking there gave a clear indication that the card could not be contacted correctly.
Wouldn’t have thought that initially, thank you very much!