System reboots when starting any PyTorch program

Hello community,

I have a system with a Supermicro X10SRA motherboard and an Intel Xeon E7 v4 CPU.
The GPU is a TITAN Xp.

It is running Ubuntu 20.04 with kernel version 5.15.0-122-generic #132~20.04.1-Ubuntu SMP.

The NVIDIA driver is installed and seems to be working fine:

NVIDIA-SMI 535.183.01 Driver Version: 535.183.01 CUDA Version: 12.2

Normal graphics programs such as the CARLA simulator run without problems, but as soon as a PyTorch program starts training, the system resets.

Is it possible to run the bug report program under these circumstances? Should I try running in safe mode, or downgrade my GPU driver to an older version, say 470?

Update 1:

The NVIDIA 470 driver does not solve the issue:
nvidia-smi
Mon Oct 28 11:56:09 2024
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.256.02 Driver Version: 470.256.02 CUDA Version: 11.4 |
|-------------------------------+----------------------+----------------------+

It seems to be a hardware issue.
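
One way to narrow this down might be to ramp up the GPU compute load in small steps and write a log line to disk before each step, so that after a reset the last logged entry shows roughly how much load triggers it. Below is a minimal sketch, assuming PyTorch with CUDA support is installed. The function names, log file name, matrix sizes, and step duration are all hypothetical choices for illustration, not anything prescribed by PyTorch or NVIDIA.

```python
import os
import time


def ramp_sizes(start=512, stop=8192, factor=2):
    """Return the square-matrix sizes to try, growing geometrically."""
    sizes = []
    n = start
    while n <= stop:
        sizes.append(n)
        n *= factor
    return sizes


def run_ramp(log_path="gpu_ramp.log", seconds_per_step=5.0):
    """Run matmuls of increasing size on the GPU, logging each step first."""
    import torch  # imported here so ramp_sizes() works without PyTorch

    if not torch.cuda.is_available():
        raise RuntimeError("CUDA is not available")

    with open(log_path, "a") as log:
        for n in ramp_sizes():
            # Write and sync the log entry before loading the GPU,
            # so it has the best chance of surviving a sudden reset.
            log.write(f"{time.ctime()} starting n={n}\n")
            log.flush()
            os.fsync(log.fileno())

            a = torch.randn(n, n, device="cuda")
            b = torch.randn(n, n, device="cuda")
            deadline = time.time() + seconds_per_step
            while time.time() < deadline:
                # .item() copies the result to the host, which forces
                # the GPU work to finish before the next iteration.
                (a @ b).sum().item()
```

Calling run_ramp() from a Python shell starts the ramp. If the system only resets at the larger sizes, that might point toward something load-dependent such as power delivery or cooling rather than the driver, though this is only a heuristic and not a definitive test.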