Hi,
Are you using the PyTorch 1.8.0 package from the comment below:
If yes, could you share the detailed steps to reproduce this error?
We would like to investigate it further in our environment.
Also, could you run dmesg right after the failure and check whether it logs any suspicious errors?
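
In case it helps, here is a minimal sketch that gathers both pieces of information at once: the installed PyTorch version and the tail of the kernel log. The function names `torch_version` and `last_kernel_messages` and the 50-line default are placeholders I made up for illustration, and on some systems dmesg needs to be run with sudo.

```python
import subprocess


def torch_version():
    """Return the installed PyTorch version string, or a note if it is missing."""
    try:
        import torch
    except ImportError:
        return "torch not importable"
    return str(torch.__version__)


def last_kernel_messages(n=50):
    """Return up to the last n lines of dmesg output (n=50 is an arbitrary default)."""
    try:
        result = subprocess.run(
            ["dmesg"], capture_output=True, text=True, check=False
        )
    except OSError as exc:
        # dmesg is not installed or cannot be executed
        return [f"could not run dmesg: {exc}"]
    if result.returncode != 0:
        # Typically a permission problem; rerunning with sudo may help
        return [f"dmesg exited with {result.returncode}: {result.stderr.strip()}"]
    return result.stdout.splitlines()[-n:]


if __name__ == "__main__":
    print("torch:", torch_version())
    print("\n".join(last_kernel_messages()))
```

Running it immediately after the failure and pasting the output here would be enough for us to start.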
Thanks.