Is this normal? When I run a torch model on an A30 GPU's MIG instances, the features obtained on 1g.6gb and 2g.12gb are inconsistent

First, I defined a simple network:
import torch
import torch.nn as nn

class SimpleConvNet(nn.Module):
    def __init__(self):
        super(SimpleConvNet, self).__init__()
        self.stn_fc1 = nn.Sequential(
            nn.Linear(2 * 256, 512),
            nn.BatchNorm1d(512),
            nn.ReLU(inplace=True))

    def forward(self, x):
        batch_size, _, h, w = x.size()
        # flatten everything except the batch dimension before the FC block
        x = x.view(batch_size, -1)
        x = self.stn_fc1(x)
        return x
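
The flattened input must have 2 * 256 = 512 features, so for example an input of shape (N, 2, 16, 16) matches the first Linear layer (the exact shape is just an assumption for this sanity check; only the flattened size matters):

# shape sanity check: any (N, C, H, W) input with C*H*W == 512 works
dummy = torch.randn(4, 2, 16, 16)   # 2 * 16 * 16 == 512
net = SimpleConvNet().eval()
print(net(dummy).shape)             # torch.Size([4, 512])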

Then, with the same parameters and the same input, I performed inference separately on the 2g.12gb and 1g.6gb instances to obtain output_12g.npy and output_6g.npy.

import numpy as np
import torch

model_test = SimpleConvNet()
model_test.load_state_dict(torch.load('./model_params.pth'))
model_test = model_test.cuda()
model_test.eval()  # use running BatchNorm statistics for inference

input_data = np.load('./input_data.npy')
input_data = torch.from_numpy(input_data).cuda()

## export CUDA_VISIBLE_DEVICES='MIG-29a86f08-9dda-59b4-a2b5-40f5dc21b648'
with torch.no_grad():
    output_12g = model_test(input_data)
save_feature(output_12g, './output_12g.npy')

## export CUDA_VISIBLE_DEVICES='MIG-36832a9a-4921-540a-96b8-ba6ecc38e4e2'
with torch.no_grad():
    output_6g = model_test(input_data)
save_feature(output_6g, './output_6g.npy')
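
Here save_feature is a small helper along these lines (a sketch; it just needs to detach the tensor and save it as a NumPy array):

def save_feature(tensor, path):
    # detach from the graph, move to CPU, and save as .npy
    np.save(path, tensor.detach().cpu().numpy())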

I compared the two features element by element and found differences around the sixth decimal place at many indices. Is this normal? For example:

Inconsistent element at index (0, 1):
output_12g_cpu: 0.7465695738792419
output_6g_cpu: 0.7465693950653076
Inconsistent element at index (0, 2):
output_12g_cpu: 1.231195092201233
output_6g_cpu: 1.231195330619812
Inconsistent element at index (0, 4):
output_12g_cpu: 0.314302921295166
output_6g_cpu: 0.3143029808998108
Inconsistent element at index (0, 5):
output_12g_cpu: 1.0248600244522095
output_6g_cpu: 1.0248603820800781
Inconsistent element at index (0, 6):
output_12g_cpu: 1.4555572271347046
output_6g_cpu: 1.4555573463439941
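
The element-wise comparison was done along these lines (a minimal sketch; the atol value is just illustrative):

import numpy as np

output_12g_cpu = np.load('./output_12g.npy')
output_6g_cpu = np.load('./output_6g.npy')

# flag every element whose absolute difference exceeds the tolerance
mismatch = ~np.isclose(output_12g_cpu, output_6g_cpu, rtol=0, atol=1e-7)
for idx in zip(*np.nonzero(mismatch)):
    print(f'Inconsistent element at index {idx}:')
    print(f'  output_12g_cpu: {output_12g_cpu[idx]}')
    print(f'  output_6g_cpu: {output_6g_cpu[idx]}')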

Is this difference caused by the different compute capacity of the two MIG instances?