NVIDIA A100 GPU hardware problem check

Please help me check an NVIDIA A100 80G for a hardware problem.

From the logs, it looks like the problem affects at least the 3rd GPU.
Can you check if it’s a hardware issue or a driver issue?
I am currently getting DRAM uncorrectable errors on the 3rd GPU.
Only the 3rd GPU stops working when a job is submitted.

Please refer to the log for more information.
Thank you.

Please take another look at the log file.

nvidia-bug-report_gpu.log.gz (4.8 MB)

Hi,
You are in the vGPU forums. If you believe the GPU might have a hardware-related issue, please contact your OEM so they can advise you on what to check for an RMA.

Regards,
Simon