After a KVM virtual machine is restarted, the VDI terminal desktop shows a green screen when it connects to the virtual machine. Disconnecting the VDI terminal and then reconnecting restores the desktop to normal.
Error report:
dmesg:
[1564558.977643] [nvidia-vgpu-vfio] 42d46a84-a428-4d25-9726-247710a8b10b: vGPU migration disabled
[1564560.117060] device tap99851a34-9b entered promiscuous mode
[1564566.773948] [nvidia-vgpu-vfio] 89757158-7343-4ba9-9cf4-0b280c75870e: vGPU migration disabled
[1564567.717421] device tapf54f3693-6e entered promiscuous mode
[1564575.390671] [nvidia-vgpu-vfio] 92b26d96-a00d-4fa7-b554-689d3ce14b48: vGPU migration disabled
[1564576.618408] device tap10358f00-c1 entered promiscuous mode
[1564579.439251] [nvidia-vgpu-vfio] b4a4e1b8-3719-4a8c-ab48-05d34a36f628: Register write failed. index: 0 offset: 0xf004f0ac status: 0x65 Timeout occured
[1564583.535130] [nvidia-vgpu-vfio] d2c3b3b5-3a37-40f2-86d1-987c00d964f3: Register write failed. index: 0 offset: 0xf004f0ac status: 0x65 Timeout occured
[1564583.936274] [nvidia-vgpu-vfio] d3bade46-bad2-49cd-af10-156c2cdcbeb2: vGPU migration disabled
[1564585.072074] [nvidia-vgpu-vfio] 23d2f822-6708-46fd-aac7-9a9a9e5daa91: Register write failed. index: 0 offset: 0xf004f0ac status: 0x65 Timeout occured
[1564586.854636] irq 740: Affinity broken due to vector space exhaustion.
[1564586.854656] irq 741: Affinity broken due to vector space exhaustion.
[1564586.854683] irq 740: Affinity broken due to vector space exhaustion.
[1564586.854696] irq 741: Affinity broken due to vector space exhaustion.
[1564588.142933] [nvidia-vgpu-vfio] 6c4b3ea3-8ad4-4217-a012-1e33ba4ce401: Register write failed. index: 0 offset: 0xf004f0ac status: 0x65 Timeout occured
[1564590.724790] [nvidia-vgpu-vfio] 16cf4bda-296c-4f31-ada2-76172423541a: vGPU migration disabled
[1564594.286712] [nvidia-vgpu-vfio] 42d46a84-a428-4d25-9726-247710a8b10b: Register write failed. index: 0 offset: 0xf004f0ac status: 0x65 Timeout occured
[1564596.230529] [nvidia-vgpu-vfio] e1581382-816e-40b8-a495-88af41240525: Power op failed
[1564597.357597] [nvidia-vgpu-vfio] 899ccd00-b9e8-4a28-9cc3-b05367c48f8a: vGPU migration disabled
[1564597.623968] [nvidia-vgpu-vfio] 34863a17-8341-4f9d-9f51-d344a21b4581: Power op failed
[1564598.200582] irq 735: Affinity broken due to vector space exhaustion.
[1564598.200608] irq 735: Affinity broken due to vector space exhaustion.
[1564600.942453] [nvidia-vgpu-vfio] 89757158-7343-4ba9-9cf4-0b280c75870e: Register write failed. index: 0 offset: 0xf004f0ac status: 0x65 Timeout occured
[1564604.344222] [nvidia-vgpu-vfio] 122c7704-5a08-4a46-b308-622897a1076d: vGPU migration disabled
[1564605.144715] [nvidia-vgpu-vfio] 61ef64de-1d74-4de1-9e10-37f5a6209a2d: Power op failed
[1564605.812007] [nvidia-vgpu-vfio] 7e1714b8-69fc-4a16-8c83-4802896e0757: Power op failed
[1564609.938235] [nvidia-vgpu-vfio] 11739ee5-6270-4117-a8a3-954ae0482be3: vGPU migration disabled
[1564610.158140] [nvidia-vgpu-vfio] 92b26d96-a00d-4fa7-b554-689d3ce14b48: Register write failed. index: 0 offset: 0xf004f0ac status: 0x65 Timeout occured
[1564613.352239] [nvidia-vgpu-vfio] f4eaf93f-e771-4230-8da9-8551967480a7: Power op failed
[1564615.336528] [nvidia-vgpu-vfio] dbd613b2-3c1e-4be1-a0d7-0fda4ad2b82c: Power op failed
[1564631.240428] irq 785: Affinity broken due to vector space exhaustion.
[1564631.240446] irq 786: Affinity broken due to vector space exhaustion.
[1564631.240470] irq 785: Affinity broken due to vector space exhaustion.
[1564631.240481] irq 786: Affinity broken due to vector space exhaustion.
[1564646.508841] [nvidia-vgpu-vfio] 11739ee5-6270-4117-a8a3-954ae0482be3: Register write failed. index: 0 offset: 0xf004f0ac status: 0x65 Timeout occured
[1564647.021753] [nvidia-vgpu-vfio] dbd613b2-3c1e-4be1-a0d7-0fda4ad2b82c: Register write failed. index: 0 offset: 0xf004f0ac status: 0x65 Timeout occured
[1564648.556706] [nvidia-vgpu-vfio] f4eaf93f-e771-4230-8da9-8551967480a7: Register write failed. index: 0 offset: 0xf004f0ac status: 0x65 Timeout occured
[1564650.092633] [nvidia-vgpu-vfio] c7771b68-6692-4039-913c-2a716527251c: Register write failed. index: 0 offset: 0xf004f0ac status: 0x65 Timeout occured
[1564652.653539] [nvidia-vgpu-vfio] 5206a441-74e6-4bc6-aa09-8dfffadd8909: Register write failed. index: 0 offset: 0xf004f0ac status: 0x65 Timeout occured
[1564653.164535] [nvidia-vgpu-vfio] db7fd81a-8cbb-4b10-899d-07c7b57b25f2: Register write failed. index: 0 offset: 0xf004f0ac status: 0x65 Timeout occured
[1564655.212444] [nvidia-vgpu-vfio] c4e38848-f6be-46f5-85a8-c08a933e42b5: Register write failed. index: 0 offset: 0xf004f0ac status: 0x65 Timeout occured
[1564658.284331] [nvidia-vgpu-vfio] 5df37d14-df0f-4ef2-9bea-408cca17acc0: Register write failed. index: 0 offset: 0xf004f0ac status: 0x65 Timeout occured
[1564662.181317] irq 750: Affinity broken due to vector space exhaustion.
[1564662.181348] irq 750: Affinity broken due to vector space exhaustion.
[1564662.892157] [nvidia-vgpu-vfio] 8530e237-66ef-4b27-b812-0df953932ac8: Register write failed. index: 0 offset: 0xf004f0ac status: 0x65 Timeout occured
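To make the dmesg output above easier to triage, the following is a minimal sketch that groups the `[nvidia-vgpu-vfio]` messages by type and lists the affected vGPU UUIDs. The function name `summarize`, the regular expression, and the rule "category = text before the first period" are illustrative assumptions, not part of any NVIDIA tool. The sample lines are copied from the log above.

```python
import re
from collections import Counter

# Assumed line shape, taken from the dmesg excerpt above:
# [<timestamp>] [nvidia-vgpu-vfio] <uuid>: <message>
LINE_RE = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+\[nvidia-vgpu-vfio\]\s+"
    r"(?P<uuid>[0-9a-f-]{36}):\s+(?P<msg>.*)$"
)


def summarize(lines):
    """Count nvidia-vgpu-vfio messages per category and collect vGPU UUIDs."""
    counts = Counter()
    uuids = {}
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip unrelated lines such as the irq affinity warnings
        # Hypothetical grouping rule: use the text before the first period,
        # e.g. "Register write failed" or "Power op failed".
        category = match.group("msg").split(".", 1)[0].strip()
        counts[category] += 1
        uuids.setdefault(category, set()).add(match.group("uuid"))
    return counts, uuids


# A few lines copied verbatim from the report.
SAMPLE = """\
[1564558.977643] [nvidia-vgpu-vfio] 42d46a84-a428-4d25-9726-247710a8b10b: vGPU migration disabled
[1564579.439251] [nvidia-vgpu-vfio] b4a4e1b8-3719-4a8c-ab48-05d34a36f628: Register write failed. index: 0 offset: 0xf004f0ac status: 0x65 Timeout occured
[1564583.535130] [nvidia-vgpu-vfio] d2c3b3b5-3a37-40f2-86d1-987c00d964f3: Register write failed. index: 0 offset: 0xf004f0ac status: 0x65 Timeout occured
[1564586.854636] irq 740: Affinity broken due to vector space exhaustion.
[1564596.230529] [nvidia-vgpu-vfio] e1581382-816e-40b8-a495-88af41240525: Power op failed
"""

counts, uuids = summarize(SAMPLE.splitlines())
for category, n in counts.most_common():
    print(f"{category}: {n} message(s), {len(uuids[category])} vGPU(s)")
```

Running the same function over the full `dmesg` output would show how many distinct vGPUs hit the register write timeout versus the power operation failure.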
journalctl -u nvidia-vgpu-mgr:
Nov 03 21:27:25 jmsxyvdicluster010 nvidia-vgpu-mgr[37471]: error: vmiop_env_log: (0x0): Failed to get vgpu register access data: 0x25
Nov 03 21:27:25 jmsxyvdicluster010 nvidia-vgpu-mgr[37471]: notice: vmiop_log: (0x0): Ring (0x218) is already valid: new PA=0x23fe1e000, current PA:0x23ff78000
Nov 03 21:27:25 jmsxyvdicluster010 nvidia-vgpu-mgr[37471]: notice: vmiop_log: ######## Guest NVIDIA Driver Information: ########
Nov 03 21:27:25 jmsxyvdicluster010 nvidia-vgpu-mgr[37471]: notice: vmiop_log: Driver Version: 473.47
Nov 03 21:27:25 jmsxyvdicluster010 nvidia-vgpu-mgr[37471]: notice: vmiop_log: vGPU version: 0xb0001
Nov 03 21:27:25 jmsxyvdicluster010 nvidia-vgpu-mgr[37471]: notice: vmiop_log: (0x0): vGPU license state: Unlicensed (Unrestricted)
Nov 03 21:27:27 jmsxyvdicluster010 nvidia-vgpu-mgr[37632]: error: vmiop_env_log: (0x0): Failed to set vgpu register access data: 0x25
Nov 03 21:27:27 jmsxyvdicluster010 nvidia-vgpu-mgr[37632]: notice: vmiop_log: ######## Guest NVIDIA Driver Information: ########
Nov 03 21:27:27 jmsxyvdicluster010 nvidia-vgpu-mgr[37632]: notice: vmiop_log: Driver Version: 473.47
Nov 03 21:27:27 jmsxyvdicluster010 nvidia-vgpu-mgr[37632]: notice: vmiop_log: vGPU version: 0xb0001
Nov 03 21:27:28 jmsxyvdicluster010 nvidia-vgpu-mgr[37632]: notice: vmiop_log: (0x0): vGPU license state: Unlicensed (Unrestricted)
Nov 03 21:27:30 jmsxyvdicluster010 nvidia-vgpu-mgr[37282]: error: vmiop_env_log: (0x0): Failed to get vgpu register access data: 0x25
Nov 03 21:27:30 jmsxyvdicluster010 nvidia-vgpu-mgr[37282]: notice: vmiop_log: (0x0): Ring (0x218) is already valid: new PA=0x23830e000, current PA:0x2399bc000
Nov 03 21:27:30 jmsxyvdicluster010 nvidia-vgpu-mgr[37282]: notice: vmiop_log: ######## Guest NVIDIA Driver Information: ########
Nov 03 21:27:30 jmsxyvdicluster010 nvidia-vgpu-mgr[37282]: notice: vmiop_log: Driver Version: 473.47
Nov 03 21:27:30 jmsxyvdicluster010 nvidia-vgpu-mgr[37282]: notice: vmiop_log: vGPU version: 0xb0001
Nov 03 21:27:30 jmsxyvdicluster010 nvidia-vgpu-mgr[37282]: notice: vmiop_log: (0x0): vGPU license state: Unlicensed (Unrestricted)
Nov 03 21:27:31 jmsxyvdicluster010 nvidia-vgpu-mgr[37645]: error: vmiop_env_log: (0x0): Failed to get vgpu register access data: 0x25
Nov 03 21:27:31 jmsxyvdicluster010 nvidia-vgpu-mgr[37645]: notice: vmiop_log: (0x0): Ring (0x218) is already valid: new PA=0x23fe0b000, current PA:0x23fe7e000
Nov 03 21:27:31 jmsxyvdicluster010 nvidia-vgpu-mgr[37645]: notice: vmiop_log: ######## Guest NVIDIA Driver Information: ########
Nov 03 21:27:31 jmsxyvdicluster010 nvidia-vgpu-mgr[37645]: notice: vmiop_log: Driver Version: 473.47
Nov 03 21:27:31 jmsxyvdicluster010 nvidia-vgpu-mgr[37645]: notice: vmiop_log: vGPU version: 0xb0001
Nov 03 21:27:31 jmsxyvdicluster010 nvidia-vgpu-mgr[37645]: notice: vmiop_log: (0x0): vGPU license state: Unlicensed (Unrestricted)
Nov 03 21:27:33 jmsxyvdicluster010 nvidia-vgpu-mgr[37785]: error: vmiop_env_log: (0x0): Failed to get vgpu register access data: 0x25
Nov 03 21:27:34 jmsxyvdicluster010 nvidia-vgpu-mgr[37785]: notice: vmiop_log: (0x0): Ring (0x218) is already valid: new PA=0x239917000, current PA:0x239c1f000
Nov 03 21:27:34 jmsxyvdicluster010 nvidia-vgpu-mgr[37785]: notice: vmiop_log: ######## Guest NVIDIA Driver Information: ########
Nov 03 21:27:34 jmsxyvdicluster010 nvidia-vgpu-mgr[37785]: notice: vmiop_log: Driver Version: 473.47
Nov 03 21:27:34 jmsxyvdicluster010 nvidia-vgpu-mgr[37785]: notice: vmiop_log: vGPU version: 0xb0001
Nov 03 21:27:34 jmsxyvdicluster010 nvidia-vgpu-mgr[37785]: notice: vmiop_log: (0x0): vGPU license state: Unlicensed (Unrestricted)
Nov 03 21:27:36 jmsxyvdicluster010 nvidia-vgpu-mgr[38370]: error: vmiop_env_log: (0x0): error: failed to notify guest power operation information: 40
Nov 03 21:27:36 jmsxyvdicluster010 nvidia-vgpu-mgr[38712]: error: vmiop_env_log: (0x0): error: failed to notify guest power operation information: 40
Nov 03 21:27:38 jmsxyvdicluster010 nvidia-vgpu-mgr[37671]: error: vmiop_env_log: (0x0): Failed to get vgpu register access data: 0x25
Nov 03 21:27:39 jmsxyvdicluster010 nvidia-vgpu-mgr[37671]: notice: vmiop_log: (0x0): Ring (0x218) is already valid: new PA=0x2397c4000, current PA:0x238bab000
Nov 03 21:27:39 jmsxyvdicluster010 nvidia-vgpu-mgr[37671]: notice: vmiop_log: ######## Guest NVIDIA Driver Information: ########
Nov 03 21:27:39 jmsxyvdicluster010 nvidia-vgpu-mgr[37671]: notice: vmiop_log: Driver Version: 473.47
Nov 03 21:27:39 jmsxyvdicluster010 nvidia-vgpu-mgr[37671]: notice: vmiop_log: vGPU version: 0xb0001
Nov 03 21:27:39 jmsxyvdicluster010 nvidia-vgpu-mgr[37671]: notice: vmiop_log: (0x0): vGPU license state: Unlicensed (Unrestricted)
Nov 03 21:27:45 jmsxyvdicluster010 nvidia-vgpu-mgr[38793]: error: vmiop_env_log: (0x0): error: failed to notify guest power operation information: 40
Nov 03 21:27:45 jmsxyvdicluster010 nvidia-vgpu-mgr[37829]: error: vmiop_env_log: (0x0): Failed to get vgpu register access data: 0x25
Nov 03 21:27:45 jmsxyvdicluster010 nvidia-vgpu-mgr[37829]: notice: vmiop_log: (0x0): Ring (0x218) is already valid: new PA=0x2390ea000, current PA:0x2395b8000
Nov 03 21:27:45 jmsxyvdicluster010 nvidia-vgpu-mgr[37829]: notice: vmiop_log: ######## Guest NVIDIA Driver Information: ########
kernel:
4.18.0-305.43.27.ar.el7.x86_64 #1 SMP Wed Jun 14 10:06:07 CST 2023 x86_64 x86_64 x86_64 GNU/Linux
nvidia driver:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.199.03   Driver Version: 470.199.03   CUDA Version: N/A      |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA A16          On   | 00000000:35:00.0 Off |                  Off |
|  0%   39C    P8    16W /  62W |  15361MiB / 16123MiB |     11%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA A16          On   | 00000000:36:00.0 Off |                  Off |
|  0%   40C    P8    16W /  62W |  15361MiB / 16123MiB |     17%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   2  NVIDIA A16          On   | 00000000:37:00.0 Off |                  Off |
|  0%   34C    P8    16W /  62W |   7680MiB / 16123MiB |      4%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   3  NVIDIA A16          On   | 00000000:38:00.0 Off |                  Off |
|  0%   33C    P8    16W /  62W |  15361MiB / 16123MiB |      8%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   4  NVIDIA A16          On   | 00000000:9C:00.0 Off |                  Off |
|  0%   35C    P8    16W /  62W |  15361MiB / 16123MiB |     17%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   5  NVIDIA A16          On   | 00000000:9D:00.0 Off |                  Off |
|  0%   37C    P8    16W /  62W |      0MiB / 16123MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   6  NVIDIA A16          On   | 00000000:9E:00.0 Off |                  Off |
|  0%   32C    P8    16W /  62W |  13441MiB / 16123MiB |      8%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   7  NVIDIA A16          On   | 00000000:9F:00.0 Off |                  Off |
|  0%   29C    P8    16W /  62W |  15361MiB / 16123MiB |     12%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      4520    C+G   vgpu                             1920MiB |
|    0   N/A  N/A      5941    C+G   vgpu                             1920MiB |
|    0   N/A  N/A      7212    C+G   vgpu                             1920MiB |
|    0   N/A  N/A      7390    C+G   vgpu                             1920MiB |
|    0   N/A  N/A      7634    C+G   vgpu                             1920MiB |
|    0   N/A  N/A      8135    C+G   vgpu                             1920MiB |
|    0   N/A  N/A      9097    C+G   vgpu                             1920MiB |
|    0   N/A  N/A      9354    C+G   vgpu                             1920MiB |
|    1   N/A  N/A      4791    C+G   vgpu                             1920MiB |
|    1   N/A  N/A      5231    C+G   vgpu                             1920MiB |
|    1   N/A  N/A      5620    C+G   vgpu                             1920MiB |
|    1   N/A  N/A      8467    C+G   vgpu                             1920MiB |
|    1   N/A  N/A      8584    C+G   vgpu                             1920MiB |
|    1   N/A  N/A      9157    C+G   vgpu                             1920MiB |
|    1   N/A  N/A      9240    C+G   vgpu                             1920MiB |
|    1   N/A  N/A      9514    C+G   vgpu                             1920MiB |
|    2   N/A  N/A     37282    C+G   vgpu                             1920MiB |
|    2   N/A  N/A     37471    C+G   vgpu                             1920MiB |
|    2   N/A  N/A     38712    C+G   vgpu                             1920MiB |
|    2   N/A  N/A     42248    C+G   vgpu                             1920MiB |
|    3   N/A  N/A       621    C+G   vgpu                             1920MiB |
|    3   N/A  N/A      4974    C+G   vgpu                             1920MiB |
kvm/qemu version:
qemu: 4.2.0-29
kvm: 5.5.0-6.59