I am getting the error in the title. It occurs on both a Tesla K80 and a GTX 1080 Ti, with PyTorch 1.2 (cudatoolkit=10.0), CUDA/10.0.130, and cudnn/7.6.2.24-CUDA-10.0.130. It also occurs only on Linux machines.
Here is a minimal script that reproduces the error:
import torch
import torch.nn as nn
import numpy as np

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.base_n_filter = 8
        self.conv1 = nn.Conv3d(1, self.base_n_filter, 3, stride=1, padding=1)
        self.conv2 = nn.Conv3d(self.base_n_filter, self.base_n_filter*2, 3, stride=1, padding=1)
        self.conv3 = nn.Conv3d(self.base_n_filter*2, 1, 3, stride=1, padding=1)

    def forward(self, img):
        output1 = self.conv1(img)
        output2 = self.conv2(output1)
        output3 = self.conv3(output2)
        return output3

if __name__ == '__main__':
    img = torch.zeros([1,1,248,248,140]).cuda()
    label = torch.rand_like(img, device=img.device)
    model = Net().cuda()
    output = model(img)
    loss_fn = nn.MSELoss()
    loss = loss_fn(output, label)
    loss.backward()
The error occurs only for specific input sizes. For example, if I change the input shape to [1,1,250,250,150], it no longer occurs. It also occurs only when self.base_n_filter is 8 or greater (with input shape [1,1,248,248,140]).
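To narrow down which combinations fail, a sweep along these lines may help. It is only a sketch: the helper names (candidate_configs, try_config, sweep) are made up for illustration, the filter count 4 is an arbitrary choice for a value below 8, and nn.Sequential stands in for the Net class above. A failed CUDA call can leave the process in a bad state, so later results in the same run may be unreliable; running each configuration in a fresh process is safer.

```python
import itertools


def candidate_configs(shapes, filter_counts):
    """Return every (input shape, base_n_filter) pair to try."""
    return list(itertools.product(shapes, filter_counts))


def try_config(shape, base_n_filter):
    """Run one forward/backward pass; return None on success, else the error text."""
    # Imported here so candidate_configs() can be used without PyTorch installed.
    import torch
    import torch.nn as nn

    # Same three-layer stack as Net above, written with nn.Sequential for brevity.
    model = nn.Sequential(
        nn.Conv3d(1, base_n_filter, 3, stride=1, padding=1),
        nn.Conv3d(base_n_filter, base_n_filter * 2, 3, stride=1, padding=1),
        nn.Conv3d(base_n_filter * 2, 1, 3, stride=1, padding=1),
    ).cuda()
    img = torch.zeros(shape).cuda()
    label = torch.rand_like(img)
    try:
        loss = nn.MSELoss()(model(img), label)
        loss.backward()
        torch.cuda.synchronize()  # surface asynchronous CUDA errors here
        return None
    except RuntimeError as err:
        # Note: out-of-memory errors also arrive as RuntimeError.
        return str(err)


def sweep(shapes, filter_counts):
    """Print OK or the error message for each configuration."""
    for shape, n_filter in candidate_configs(shapes, filter_counts):
        error = try_config(shape, n_filter)
        status = "OK" if error is None else "FAILED: " + error
        print(shape, n_filter, status)
```

On a CUDA machine, calling sweep([(1, 1, 248, 248, 140), (1, 1, 250, 250, 150)], [4, 8, 16]) would cover the two shapes mentioned above.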
I have already posted this question to the PyTorch discussion forum, but I am posting it here as well in case NVIDIA can help.
Any help would be truly appreciated!
Edit: posting the output of nvidia-smi and nvcc --version in case it's helpful.
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.87.00    Driver Version: 418.87.00    CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 108...  On   | 00000000:82:00.0 Off |                  N/A |
| 28%   21C    P8     8W / 250W |      1MiB / 11178MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130
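The versions PyTorch itself was built against can differ from the system toolkit reported above, so it may also be worth printing them from Python. This is a small sketch; describe_environment is a made-up helper name, and entries stay None when PyTorch, cuDNN, or a GPU is not available.

```python
def describe_environment():
    """Collect the versions PyTorch reports; values stay None when unavailable."""
    info = {"torch": None, "cuda": None, "cudnn": None, "device": None}
    try:
        import torch
    except ImportError:
        return info  # PyTorch is not installed in this interpreter

    info["torch"] = torch.__version__
    info["cuda"] = torch.version.cuda               # CUDA version PyTorch was built with
    info["cudnn"] = torch.backends.cudnn.version()  # cuDNN version as an integer
    if torch.cuda.is_available():
        info["device"] = torch.cuda.get_device_name(0)
    return info


for key, value in describe_environment().items():
    print(f"{key}: {value}")
```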