Our GPU server crashes when I interrupt my code while it is using all four GPUs on the machine

Dear Sir/Madam,

Our GPU server crashes inconsistently when I terminate my code with Ctrl+C while it is using all four GPUs. I have attached the bug report.
nvidia-bug-report.log.gz (3.0 MB)

I would be grateful if you could help us in this regard.


Taher Naderi

Please correctly set up nvidia-persistenced to start on boot and make sure it is continuously running.
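
One way to confirm that persistence mode is actually active on all four GPUs is to ask nvidia-smi directly. The following is only a minimal sketch: the helper names parse_persistence and query_persistence are made up for illustration, and it assumes nvidia-smi is on the PATH and prints lines such as "0, Enabled" for this query. How the daemon is enabled at boot depends on the distribution; on systemd-based systems the driver package usually ships a unit for it, but the exact unit name is an assumption you should check on your machine.

```python
import subprocess


def parse_persistence(csv_text):
    """Map GPU index -> True if persistence mode is reported as Enabled."""
    modes = {}
    for line in csv_text.strip().splitlines():
        # Each line is expected to look like "0, Enabled" (assumed format).
        index, mode = (field.strip() for field in line.split(","))
        modes[int(index)] = (mode == "Enabled")
    return modes


def query_persistence():
    """Run nvidia-smi and return the persistence state of every GPU."""
    result = subprocess.run(
        ["nvidia-smi",
         "--query-gpu=index,persistence_mode",
         "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    )
    return parse_persistence(result.stdout)
```

Calling query_persistence() on the server after a reboot should return True for every GPU index if nvidia-persistenced came up correctly; any False entry suggests the daemon is not running or did not start at boot.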