GPU 1070 burn test reports a faulty result

We are seeing a GPU 1070 reported as faulty when running the GPU-BURN test. There are no PCIe errors.
We confirmed that both the GPU and FDR temperatures are in the 55C to 60C range.

What aspects should we look into? This is an intermittent failure.

What’s “FDR”? What is “GPU-BURN”? Presumably a piece of software you are using for burn-in purposes. Where did you get it? Is this a commercial product, open-source software, or something you wrote yourself?

What is the specific error reported by this software in the failing case? Can you cut and paste an example of such an error message verbatim?

The temperature for the GPU looks good, especially if this is under full load. The next thing you would want to check is the power supply. Is the proper number of PCIe power cables connected, without use of any converters or Y-splitters? Are the power connector tabs engaged at the GPU socket (click sound)? Is a PSU of sufficient power rating used? Rule of thumb: For rock-solid operation you want the sum of the nominal power of all system components <= 60% of the PSU’s nominal power.
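To make the 60% rule of thumb concrete, here is a minimal sketch in Python. The function name `psu_headroom_ok` and the component wattages are made up for illustration (150 W is the nominal board power of a GTX 1070, the other numbers are placeholders); plug in the nominal ratings of your own components.

```python
def psu_headroom_ok(component_watts, psu_watts, max_fraction=0.60):
    """Check the rule of thumb: sum of nominal component power
    should not exceed max_fraction of the PSU's nominal power.

    component_watts: dict mapping component name -> nominal watts
    psu_watts: nominal rating of the power supply in watts
    Returns (ok, total_watts).
    """
    total = sum(component_watts.values())
    return total <= max_fraction * psu_watts, total


# Hypothetical system; replace with the ratings of your own parts.
components = {"gpu": 150, "cpu": 95, "drives_fans_misc": 50}

for psu in (450, 500):
    ok, total = psu_headroom_ok(components, psu)
    verdict = "within" if ok else "exceeds"
    print(f"{total} W on a {psu} W PSU: {verdict} the 60% guideline")
```

With these placeholder numbers the 295 W total passes on the 500 W unit (limit 300 W) but not on the 450 W unit (limit 270 W).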

Are the contacts of the GPU’s PCIe connector clean (both motherboard and GPU side)? Is the GPU fully inserted into PCIe slot (typically a tab engages)? GPU secured at bracket to case/enclosure? Was the GPU installed taking proper precautions (grounded wrist strap) against electrostatic discharge?

Is the GPU being operated at extreme altitude, in a humid environment, subjected to vibrations (e.g. factory floor or vehicle of any kind), or exposed to electromagnetic interference from equipment in the vicinity (e.g. large electrical motors, x-ray machines)?

How often do these intermittent failures occur? If these are memory failures, on DRAM without ECC you might encounter a spurious 1-bit failure due to a cosmic ray with a chance of maybe 0.5% per full day of operation at sea level (this is based on observational data, not on hard statistics). Error rates will be higher at higher elevations.
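To put that rough 0.5%-per-day figure in perspective, here is a small sketch that converts it into the chance of seeing at least one such event over a longer run. It assumes the events are independent from day to day and that the per-day probability stays constant, which is my simplification, not measured behavior; the function name is made up for illustration.

```python
def prob_at_least_one_error(p_per_day, days):
    """Probability of at least one event in `days` days, assuming
    independent days with a constant per-day probability.

    Uses the complement rule: 1 - P(no event on any day).
    """
    return 1.0 - (1.0 - p_per_day) ** days


# 0.5% per day is the rough observational figure quoted above.
P_PER_DAY = 0.005

for days in (1, 7, 30):
    p = prob_at_least_one_error(P_PER_DAY, days)
    print(f"{days:3d} day(s): {p:.1%}")
```

Over 30 days of continuous operation this works out to roughly 14%. If your failures show up far more often than that, cosmic rays are an unlikely explanation and the power and seating checks deserve attention first.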