Newly installed third GTX stopped being identified

I installed a third GPU, which initially worked but quite quickly started causing problems.
Currently it is not identified at all by the nvidia-smi command:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 381.22                 Driver Version: 381.22                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 108...  Off  | 0000:02:00.0     Off |                  N/A |
| 23%   28C    P8     9W / 250W |      1MiB / 11172MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 108...  Off  | 0000:04:00.0      On |                  N/A |
| 27%   46C    P8    16W / 250W |    426MiB / 11171MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

It is identified by Ubuntu:

:~$ lspci | grep -i nvidia
02:00.0 VGA compatible controller: NVIDIA Corporation Device 1b06 (rev a1)
02:00.1 Audio device: NVIDIA Corporation Device 10ef (rev a1)
03:00.0 VGA compatible controller: NVIDIA Corporation Device 1b06 (rev a1)
03:00.1 Audio device: NVIDIA Corporation Device 10ef (rev a1)
04:00.0 VGA compatible controller: NVIDIA Corporation Device 1b06 (rev a1)
04:00.1 Audio device: NVIDIA Corporation Device 10ef (rev a1)

Running

sudo dmesg  | grep NVRM

gives:

[    9.075353] NVRM: loading NVIDIA UNIX x86_64 Kernel Module  381.22  Thu May  4 00:55:03 PDT 2017 (using threaded interrupts)
[   15.193914] NVRM: RmInitAdapter failed! (0x24:0x65:1056)
[   15.193949] NVRM: rm_init_adapter failed for device bearing minor number 1
[   15.926407] NVRM: Your system is not currently configured to drive a VGA console
[   15.926408] NVRM: on the primary VGA device. The NVIDIA Linux graphics driver
[   15.926408] NVRM: requires the use of a text-mode VGA console. Use of other console
[   15.926409] NVRM: drivers including, but not limited to, vesafb, may result in
[   15.926409] NVRM: corruption and stability problems, and is not supported.
[   17.014877] NVRM: RmInitAdapter failed! (0x25:0xffff:1071)
[   17.014924] NVRM: rm_init_adapter failed for device bearing minor number 1
[   54.475359] NVRM: RmInitAdapter failed! (0x25:0xffff:1071)
[   54.475418] NVRM: rm_init_adapter failed for device bearing minor number 1
[   57.831689] NVRM: RmInitAdapter failed! (0x25:0xffff:1071)
[   57.831731] NVRM: rm_init_adapter failed for device bearing minor number 1
[  264.479688] NVRM: RmInitAdapter failed! (0x25:0xffff:1071)
[  264.479754] NVRM: rm_init_adapter failed for device bearing minor number 1
[  526.012073] NVRM: RmInitAdapter failed! (0x25:0xffff:1071)
[  526.012132] NVRM: rm_init_adapter failed for device bearing minor number 1

And running

sudo dmesg  | grep -i NVIDIA

gives:

[    9.068283] nvidia: loading out-of-tree module taints kernel.
[    9.068286] nvidia: module license 'NVIDIA' taints kernel.
[    9.071551] nvidia: module verification failed: signature and/or required key missing - tainting kernel
[    9.074898] nvidia-nvlink: Nvlink Core is being initialized, major device number 245
[    9.075060] nvidia 0000:02:00.0: enabling device (0100 -> 0103)
[    9.075128] nvidia 0000:02:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
[    9.075180] nvidia 0000:03:00.0: enabling device (0100 -> 0103)
[    9.075215] nvidia 0000:03:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
[    9.075297] nvidia 0000:04:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=io+mem
[    9.075353] NVRM: loading NVIDIA UNIX x86_64 Kernel Module  381.22  Thu May  4 00:55:03 PDT 2017 (using threaded interrupts)
[    9.091980] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms  381.22  Thu May  4 00:21:48 PDT 2017
[    9.092395] [drm] [nvidia-drm] [GPU ID 0x00000200] Loading driver
[    9.092434] [drm] [nvidia-drm] [GPU ID 0x00000300] Loading driver
[    9.092471] [drm] [nvidia-drm] [GPU ID 0x00000400] Loading driver
[   10.222193] nvidia-uvm: Loaded the UVM driver in 8 mode, major device number 243
[   12.433222] input: HDA NVidia HDMI/DP,pcm=3 as /devices/pci0000:00/0000:00:02.0/0000:02:00.1/sound/card1/input20
[   12.433291] input: HDA NVidia HDMI/DP,pcm=7 as /devices/pci0000:00/0000:00:02.0/0000:02:00.1/sound/card1/input21
[   12.433344] input: HDA NVidia HDMI/DP,pcm=8 as /devices/pci0000:00/0000:00:02.0/0000:02:00.1/sound/card1/input22
[   12.433396] input: HDA NVidia HDMI/DP,pcm=9 as /devices/pci0000:00/0000:00:02.0/0000:02:00.1/sound/card1/input23
[   12.433430] input: HDA NVidia HDMI/DP,pcm=3 as /devices/pci0000:00/0000:00:03.0/0000:03:00.1/sound/card2/input12
[   12.433512] input: HDA NVidia HDMI/DP,pcm=7 as /devices/pci0000:00/0000:00:03.0/0000:03:00.1/sound/card2/input13
[   12.433556] input: HDA NVidia HDMI/DP,pcm=3 as /devices/pci0000:00/0000:00:03.2/0000:04:00.1/sound/card3/input16
[   12.433636] input: HDA NVidia HDMI/DP,pcm=8 as /devices/pci0000:00/0000:00:03.0/0000:03:00.1/sound/card2/input14
[   12.436293] input: HDA NVidia HDMI/DP,pcm=9 as /devices/pci0000:00/0000:00:03.0/0000:03:00.1/sound/card2/input15
[   12.436355] input: HDA NVidia HDMI/DP,pcm=7 as /devices/pci0000:00/0000:00:03.2/0000:04:00.1/sound/card3/input17
[   12.436413] input: HDA NVidia HDMI/DP,pcm=8 as /devices/pci0000:00/0000:00:03.2/0000:04:00.1/sound/card3/input18
[   12.436515] input: HDA NVidia HDMI/DP,pcm=9 as /devices/pci0000:00/0000:00:03.2/0000:04:00.1/sound/card3/input19
[   15.926408] NVRM: on the primary VGA device. The NVIDIA Linux graphics driver
[   15.927189] nvidia-modeset: Allocated GPU:0 (GPU-5a40a451-9f6e-ad9b-c0b3-f9158c301ef5) @ PCI:0000:04:00.0
[   16.632719] nvidia-modeset: Allocated GPU:1 (GPU-b570f669-0d90-8151-169a-7420d371bc0a) @ PCI:0000:02:00.0

Notice that the driver is loaded for all three GPUs, but then only two (the older GPUs) are allocated.
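The comparison above (devices on the PCI bus versus devices the driver allocated) can be scripted. This is only a minimal sketch: the function names are made up for illustration, and it assumes the lspci and dmesg output has been saved to two files. A GPU that is absent from the nvidia-modeset lines is a hint that it failed to initialize, not proof.

```shell
# Hypothetical helpers; the names are illustrative and not part of any tool.

# Read lspci output on stdin and print the bus address (e.g. 03:00.0)
# of every NVIDIA display device.
pci_gpus() {
    awk 'tolower($0) ~ /nvidia/ && /VGA compatible controller|3D controller/ { print $1 }'
}

# Read dmesg output on stdin and print the bus address from each
# "Allocated GPU" line, with the leading PCI domain (0000:) removed.
allocated_gpus() {
    sed -n 's/.*Allocated GPU:[0-9]* .*@ PCI:[0-9a-fA-F]*:\([0-9a-fA-F:.]*\).*/\1/p'
}

# Print addresses that appear on the bus but were never allocated.
# $1 = file with lspci output, $2 = file with dmesg output
missing_gpus() {
    allocated=$(allocated_gpus < "$2")
    pci_gpus < "$1" | grep -vxF -e "$allocated"
}
```

Possible usage: save the output with lspci > lspci.txt and sudo dmesg > dmesg.txt, then run missing_gpus lspci.txt dmesg.txt.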

Any ideas?

That seems to be indicative of a hardware issue. Check:

(1) Proper mechanical installation (GPU seated firmly in the PCIe slot, firmly secured at the bracket)

(2) Sufficient power supply (PCIe power connectors correctly engaged [tab snaps into place], avoid splitters or converters in the PCIe power connector cables)

(3) Sufficient cooling (unobstructed air flow, reasonable ambient temperature; check nvidia-smi output for overheating; is the fan turning?)

Was this GPU acquired factory fresh from a reputable source? Who installed it (it is possible to cause permanent damage through improper handling during installation, although this is rare)?

@njuffa Thanks for the quick response.

I think this may be a temperature issue, as you suggested.
When nvidia-smi was still displaying the GPU, it would show ERR! instead of the temperature.
Is there a way I can check GPU fan status?

I suspect nvidia-smi displays an error because it cannot communicate with the GPU due to a hardware issue. Normally you can confirm the operation of the fan by looking at it (is it spinning?) but I just remembered that Pascal-family GPUs are so efficient that they may halt the fan when idling.
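For GPUs that nvidia-smi can still see, fan speed and temperature can also be read in CSV form with its --query-gpu option (fields index, pci.bus_id, temperature.gpu, fan.speed). The helper below is a rough sketch with a made-up name: it reports every row whose temperature field is not a plain number, which is how a sensor read error would show up. It cannot help with a GPU that has dropped out of nvidia-smi entirely.

```shell
# Hypothetical helper; the name is illustrative.
# Reads rows of "index, bus id, temperature, fan speed" on stdin, as
# produced by nvidia-smi with --format=csv,noheader,nounits, and prints
# a line for each GPU whose temperature field is not a plain integer.
unreadable_sensors() {
    awk -F', *' '$3 !~ /^[0-9]+$/ {
        print "GPU " $1 " at " $2 ": temperature reads \"" $3 "\""
    }'
}
```

Possible usage: nvidia-smi --query-gpu=index,pci.bus_id,temperature.gpu,fan.speed --format=csv,noheader,nounits | unreadable_sensors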

You might want to try physically switching the suspect GPU with the adjacent one. Does the failure follow the GPU or is it correlated with a particular PCIe slot? It is possible for PCIe connectors to become faulty (e.g. corrosion, hairline crack in the motherboard) but this is rare.
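To tell whether the failure follows the card or the slot, it helps to record which PCI address fails before and after the swap. The dmesg output only names a minor number (1 in the log above). The sketch below, with made-up function names, pulls the failing minor numbers out of dmesg and lists minor numbers next to bus locations. It assumes the driver exposes "Device Minor" and "Bus Location" lines in /proc/driver/nvidia/gpus/*/information, which may not hold for every driver version.

```shell
# Hypothetical helpers; the names are illustrative.

# Read dmesg output on stdin and print each distinct minor number
# that failed rm_init_adapter.
failed_minors() {
    sed -n 's/.*rm_init_adapter failed for device bearing minor number \([0-9][0-9]*\).*/\1/p' | sort -un
}

# Print "minor bus-location" for each GPU directory found under $1
# (default: /proc/driver/nvidia/gpus). The field names are an assumption
# about the layout of the driver's information file.
minor_map() {
    dir=${1:-/proc/driver/nvidia/gpus}
    for f in "$dir"/*/information; do
        [ -r "$f" ] || continue
        minor=$(sed -n 's/^Device Minor:[[:space:]]*//p' "$f")
        bus=$(sed -n 's/^Bus Location:[[:space:]]*//p' "$f")
        echo "$minor $bus"
    done
}
```

Possible usage: run sudo dmesg | failed_minors and minor_map before the swap, then again after it. If the failing address stays the same, the slot is suspect; if it moves with the card, the card is.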

I think the issue is some sort of hardware problem because the GPU worked initially but then quickly deteriorated. In the simplest scenario, the GPU wasn’t seated properly in the PCIe slot and has wiggled its way further out of it, e.g. through vibration or due to its own weight. Hence my recommendation to check the mechanical fit first.

Note that diagnosing such issues remotely is about as effective as diagnosing car trouble over the phone.