We hit a similar issue on an A100. We are using nvdiffrast with PyTorch.
NVIDIA Driver Version: 450.119.04
The problem occurs randomly on different cards. It looks like only some cards are affected.
Moreover, we discovered that commenting out glEnable(GL_DEPTH_TEST) inside nvdiffrast/common/rasterize.cpp fixes the problem.
The following example reproduces the issue:
import torch
import nvdiffrast.torch as dr
import numpy as np
from matplotlib import pyplot as plt
def tensor(*args, **kwargs):
    return torch.tensor(*args, device='cuda', **kwargs)
depth = 5.1761e+00 / 6.0623e+00
pos = tensor([[[ 8.5783e-02,  9.9548e-02, 2.0576e+00, 3.0065e+00],
               [-1.7052e+00,  1.3828e-01, 2.0506e+00, 2.9996e+00],
               [ 6.6282e-02, -3.4532e+00, 2.0323e+00, 2.9817e+00],
               [ 8.4978e-02,  9.4055e-02, 3.0781e+00, 4.0065e+00],
               [-2.6015e+00,  1.5215e-01, 3.0675e+00, 3.9961e+00],
               [ 1.1423e-01,  5.4232e+00, 3.1161e+00, 4.0437e+00],
               [ 8.4174e-02,  8.8562e-02, 4.0986e+00, 5.0065e+00],
               [ 4.1140e+00,  1.4267e-03, 4.1145e+00, 5.0221e+00],
               [ 1.2805e-01,  8.0822e+00, 4.1556e+00, 5.0623e+00]]], dtype=torch.float32)
tri = torch.from_numpy(
    np.arange(len(pos[0])).reshape(-1, 3)
).to(torch.int32).cuda()
glctx = dr.RasterizeGLContext()
rast, _ = dr.rasterize(glctx, pos, tri, resolution=[1024, 1251])
plt.figure(figsize=[12,12])
plt.imshow(rast.cpu()[0])
plt.show()
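Since only some cards seem to be affected, it may help to compare outputs numerically rather than by eye. The sketch below reduces one rasterized image to per-triangle pixel counts. It assumes the documented nvdiffrast layout, where channel 3 of the rasterizer output holds the triangle ID offset by one and 0 means no triangle; the helper name triangle_pixel_counts is hypothetical and not part of nvdiffrast.

```python
import numpy as np

def triangle_pixel_counts(rast):
    """Count covered pixels per triangle in one rasterized image.

    `rast` has shape [height, width, 4] (one batch element of the
    dr.rasterize output). Assumption: channel 3 stores the triangle ID
    offset by one, and 0 marks pixels not covered by any triangle.
    """
    ids = np.asarray(rast)[..., 3].astype(np.int64)
    values, counts = np.unique(ids, return_counts=True)
    # Drop the background (0) and convert back to zero-based triangle indices.
    return {int(v) - 1: int(c) for v, c in zip(values, counts) if v > 0}
```

With the script above, calling `triangle_pixel_counts(rast[0].cpu().numpy())` on each card should give a dictionary that can be diffed between a good and a bad card; a triangle that is missing or has a very different count would point at the affected one.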
The reproduction script produces the following result: