Description
Hi, here is a simplified version of my model. When I build a TensorRT engine from the exported ONNX file, the build fails with an error saying the workspace size is not enough. The error is solved in TRT7, but it is hard to use TRT7 on Pegasus. Is this an internal bug in TRT6?
My model in PyTorch:
import torch
import torch.nn.functional as F


class Test(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # Three parallel 3D convolutions on the same input:
        # in_channels=128, out_channels=64, kernel_size=3, stride=2, padding=1
        self.conv0 = torch.nn.Conv3d(128, 64, 3, 2, 1)
        self.conv1 = torch.nn.Conv3d(128, 64, 3, 2, 1)
        self.conv2 = torch.nn.Conv3d(128, 64, 3, 2, 1)

    def forward(self, x):
        y0 = self.conv0(x)
        y1 = self.conv1(x)
        y2 = self.conv2(x)
        y0 = F.relu(y0)
        y1 = F.relu(y1)
        y2 = F.relu(y2)
        # Drop the trailing singleton dims; 'squeeze' can also be used here
        y0 = y0.view(y0.shape[:2])
        y1 = y1.view(y1.shape[:2])
        y2 = y2.view(y2.shape[:2])
        return y0, y1, y2


if __name__ == '__main__':
    t = Test()
    x = torch.randn((50, 128, 2, 2, 2))
    y0, y1, y2 = t(x)
    torch.onnx.export(
        t,
        (x),
        'test.onnx',
        verbose=True,
        opset_version=11
    )
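For reference, the shapes in the model above can be checked by hand. This is only a small sketch using the output-size formula from the PyTorch Conv3d documentation; the helper name conv_out_size is my own and is not part of any library. It shows why the view to the first two dims is valid: every spatial dim collapses to 1.

```python
def conv_out_size(size, kernel=3, stride=2, padding=1, dilation=1):
    """Output length along one spatial axis (Conv3d shape formula)."""
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


# Input used in the repro: (batch, channels, depth, height, width)
in_shape = (50, 128, 2, 2, 2)
out_channels = 64

# Each spatial dim of size 2 maps to (2 + 2 - 2 - 1) // 2 + 1 = 1
spatial = tuple(conv_out_size(s) for s in in_shape[2:])
out_shape = (in_shape[0], out_channels) + spatial
print(out_shape)  # (50, 64, 1, 1, 1)
```

So each conv output is (50, 64, 1, 1, 1), and after the view each returned tensor is (50, 64).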
The verbose build log:
INFO: --------------- Layers running on GPU:
INFO: Conv_0 + Relu_3 || Conv_1 + Relu_4 || Conv_2 + Relu_5, Squeeze_6 + Squeeze_9 + Squeeze_12, Squeeze_7 + Squeeze_10 + Squeeze_13, Squeeze_8 + Squeeze_11 + Squeeze_14,
VERBOSE: Constructing optimization profile number 0 out of 1
VERBOSE: *************** Autotuning format combination: Float(1,2,4,8,1024) -> Float(1,1,1,192,12288) ***************
VERBOSE: --------------- Timing Runner: Conv_0 + Relu_3 || Conv_1 + Relu_4 || Conv_2 + Relu_5 (FusedConvActConvolution)
VERBOSE: FusedConvActConvolution has no valid tactics for this config, skipping
VERBOSE: --------------- Timing Runner: Conv_0 + Relu_3 || Conv_1 + Relu_4 || Conv_2 + Relu_5 (CaskConvolution)
VERBOSE: CaskConvolution has no valid tactics for this config, skipping
VERBOSE: --------------- Timing Runner: Conv_0 + Relu_3 || Conv_1 + Relu_4 || Conv_2 + Relu_5 (CudaConvolution)
VERBOSE: CudaConvolution has no valid tactics for this config, skipping
VERBOSE: --------------- Timing Runner: Conv_0 + Relu_3 || Conv_1 + Relu_4 || Conv_2 + Relu_5 (CudaDepthwiseConvolution)
VERBOSE: CudaDepthwiseConvolution has no valid tactics for this config, skipping
ERROR: Try increasing the workspace size with IBuilderConfig::setMaxWorkspaceSize() if using IBuilder::buildEngineWithConfig, or IBuilder::setMaxWorkspaceSize() if using IBuilder::buildCudaEngine.
ERROR: ../builder/tacticOptimizer.cpp (1786) - TRTInternal Error in computeCosts: 0 (Could not find any implementation for node Conv_0 + Relu_3 || Conv_1 + Relu_4 || Conv_2 + Relu_5.)
VERBOSE: Builder timing cache: created 0 entries, 0 hit(s)
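Since the first ERROR line suggests increasing the workspace size, here is a minimal sketch of how that limit can be raised through the TensorRT Python API, to rule it out as the cause. This is not my exact build script: the function name build_engine and the 1 GiB value are assumptions for illustration, and the calls are written against the TRT 6/7 Python API (max_workspace_size and build_engine were deprecated in later releases). Note that every tactic in the log reports "no valid tactics", so a larger workspace may not help.

```python
WORKSPACE_BYTES = 1 << 30  # 1 GiB; arbitrary value chosen for this sketch


def build_engine(onnx_path="test.onnx", workspace_bytes=WORKSPACE_BYTES):
    """Parse an ONNX file and build an engine with an explicit workspace limit."""
    import tensorrt as trt  # imported lazily so the sketch loads without TensorRT

    logger = trt.Logger(trt.Logger.VERBOSE)
    builder = trt.Builder(logger)

    # The ONNX parser requires an explicit-batch network
    flags = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    network = builder.create_network(flags)
    parser = trt.OnnxParser(network, logger)

    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            # Print parser errors and give up
            for i in range(parser.num_errors):
                print(parser.get_error(i))
            return None

    config = builder.create_builder_config()
    config.max_workspace_size = workspace_bytes  # Python-side IBuilderConfig workspace setting
    return builder.build_engine(network, config)
```

Calling build_engine("test.onnx") on the target device would return the engine, or None if parsing or building fails.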
Environment
TensorRT Version: 6.3.1
GPU Type: First device on Pegasus
CUDA Version: 10.2
CUDNN Version: 7.6.6
Operating System + Version: Ubuntu 18.04.2 LTS
Python Version (if applicable): 3.7
PyTorch Version (if applicable): 1.9