Computer keeps losing GPU: Unable to determine the device handle for GPU0000:06:00.0: Unknown Error

I have an RTX 4060 Ti 16 GB set up as an eGPU for my laptop, which has a built-in RTX 3050 Mobile. The system keeps losing the eGPU while running tasks. When that happens, nvidia-smi reports: Unable to determine the device handle for GPU0000:06:00.0: Unknown Error. A reboot fixes it, but sometimes the problem comes back every few hours, which is very frustrating.

$ lspci | grep -i VGA
01:00.0 VGA compatible controller: NVIDIA Corporation GA107BM [GeForce RTX 3050 Mobile] (rev a1)
06:00.0 VGA compatible controller: NVIDIA Corporation Device 2805 (rev ff)
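
A note on reading this output: a revision of "ff" usually means the device's PCI configuration space is returning all ones, i.e. the device is no longer responding on the bus, which would fit this listing having been captured after the eGPU dropped. A minimal sketch for spotting such devices is below; the function name find_unresponsive_devices is made up for illustration, and the "(rev ff)" heuristic is an assumption rather than a definitive test.

```shell
#!/bin/sh
# Hypothetical helper: print the bus address of every PCI device that
# lspci reports as "(rev ff)". It reads lspci-style lines on stdin, so it
# can be tried on saved output as well as on a live system.
find_unresponsive_devices() {
    # The first field of each lspci line is the bus address, e.g. 06:00.0
    awk '/\(rev ff\)/ { print $1 }'
}

# Usage on a live system:
#   lspci | find_unresponsive_devices
```

Run against the two lines above, this should print only the address of the eGPU (06:00.0).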

The driver version is 535-open.

Output of nvidia-smi while both GPUs are up:

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.129.03             Driver Version: 535.129.03   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3050 ...    Off | 00000000:01:00.0  On |                  N/A |
| N/A   33C    P8               4W /  60W |   1039MiB /  4096MiB |     11%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce RTX 4060 Ti     Off | 00000000:06:00.0 Off |                  N/A |
|  0%   34C    P8               6W / 165W |     18MiB / 16380MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

OS: Kubuntu 22.04

Here is the nvidia-bug-report.log:

nvidia-bug-report.log.gz (269.2 KB)

Please limit the GPU clocks with
nvidia-smi -lgc 300,1500
to check for PSU issues.

Thanks for your reply. How can I confirm whether this is a PSU issue or not? I haven't experienced any GPU problems while gaming in Windows, which also puts the GPU under high load.
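
One way to approach the question above, as a rough sketch rather than a definitive procedure: lock the eGPU's clocks as suggested, run the usual workload, and check the kernel log for driver messages around the time of a drop. The helper names below are hypothetical, GPU index 1 is assumed to be the 4060 Ti (as in the nvidia-smi table above), and the log patterns are assumptions about typical NVIDIA driver messages, not lines taken from the attached bug report.

```shell
#!/bin/sh
# Hypothetical helper: keep only kernel log lines that commonly accompany
# a lost NVIDIA GPU (Xid reports, "fallen off the bus" messages).
# Reads log text on stdin.
filter_gpu_drop_events() {
    grep -E 'NVRM: Xid|fallen off the bus'
}

# Hypothetical helper: apply the suggested clock limit to one GPU index.
# Defaults to index 1, assumed here to be the eGPU.
limit_egpu_clocks() {
    sudo nvidia-smi -i "${1:-1}" -lgc 300,1500
}

# Usage (needs root and the actual hardware):
#   limit_egpu_clocks 1
#   ... run the workload that normally triggers the drop ...
#   sudo dmesg | filter_gpu_drop_events
#   sudo nvidia-smi -i 1 -rgc     # reset the clock limit afterwards
```

If the drops stop while the clocks are limited, that would point toward power delivery; if they continue at low clocks, the eGPU link or enclosure becomes a more likely suspect. Neither outcome is conclusive on its own.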