Titan Xp: Unable to determine the device handle

Hello,

We use a Titan Xp GPU in our lab for deep learning training with TF Keras.

Recently, we have frequently encountered the following error:

Unable to determine the device handle for GPU 0000:02:00.0: GPU is lost. Reboot the system to recover this GPU.

Output of dmesg:

[33472.926944] nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000927c:0:0:0x0000000f

Below is the link to the NVIDIA bug report:

Please advise.