OOM of conv layer

Description


Environment

TensorRT Version: 7
GPU Type: 1070
Nvidia Driver Version: 410
CUDA Version: 10.0
CUDNN Version: 7.6
Operating System + Version: ubuntu16
Python Version (if applicable): 3.6
TensorFlow Version (if applicable): 1.13.2
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):

Hi NV,

I have tried workspace sizes of 1 GB, 2 GB, 5 GB, 7 GB, and 10 GB, but the engine build still fails with the out-of-memory error below. Any ideas?

[TensorRT] VERBOSE: After concat removal: 437 layers
[TensorRT] VERBOSE: Graph construction and optimization completed in 0.335001 seconds.
[TensorRT] VERBOSE: Constructing optimization profile number 0 out of 1
*************** Autotuning format combination: Float(1,1216,428032,1284096) -> Float(1,608,107008,6848512) ***************
[TensorRT] ERROR: …/rtSafe/safeRuntime.cpp (25) - Cuda Error in allocate: 2 (out of memory)
[TensorRT] WARNING: GPU memory allocation error during getBestTactic: model/encoder/densenet121/conv1/Conv2D + model/encoder/densenet121/conv1/Relu
[TensorRT] ERROR: Internal error: could not find any implementation for node model/encoder/densenet121/conv1/Conv2D + model/encoder/densenet121/conv1/Relu, try increasing the workspace size with IBuilder::setMaxWorkspaceSize()
[TensorRT] ERROR: …/builder/tacticOptimizer.cpp (1523) - OutOfMemory Error in computeCosts: 0

Code snippet:
import tensorrt as trt

# convert to TRT
G_LOGGER = trt.Logger(trt.Logger.VERBOSE)
trt.init_libnvinfer_plugins(G_LOGGER, "")
builder = trt.Builder(G_LOGGER)
builder.max_batch_size = 1
builder.max_workspace_size = 1 << 30  # in bytes: 1 GiB
network = builder.create_network()
parser = trt.UffParser()
# register the network inputs and output by their UFF node names
parser.register_input("model/focal", trt.Dims([1]))
parser.register_input("model/Placeholder", trt.Dims3([3, 352, 1216]), trt.UffInputOrder.NCHW)
parser.register_output("model/decoder/truediv_12")
# uff_model is the serialized UFF buffer (its creation is not shown here)
parser.parse_buffer(uff_model, network, trt.float32)
engine = builder.build_cuda_engine(network)
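
For reference, max_workspace_size is expressed in bytes, so 1 << 30 is 1 GiB. A minimal sketch of how the other sizes I tried can be written the same way; gib is a hypothetical helper name of my own, not part of the TensorRT API:

```python
def gib(n):
    """Return n GiB expressed in bytes (hypothetical helper, not a TensorRT call)."""
    return n << 30


# The workspace sizes tried in this thread, in bytes.
tried_sizes = {n: gib(n) for n in (1, 2, 5, 7, 10)}

for n, size in tried_sizes.items():
    print(f"{n} GiB = {size} bytes")
```

With this, a 2 GiB workspace would be set as builder.max_workspace_size = gib(2).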

Hi,

You can try a few things:

Thanks

Thanks for your reply.

I checked my TensorFlow test pipeline, and it takes ~7 GB.
My GPU is a 1070 with 8 GB, so how can I limit memory consumption while building the engine?

Thanks.

Hi SunilJB,

I set
gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.333)

Then I set max_workspace_size to 2 GB, and it works.
Thanks for your help.

Derek
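
A rough sketch of why this combination fits, assuming a nominal 8 GiB card and ignoring the memory TensorRT needs for weights, activations, and the CUDA context (so real headroom is smaller). The function names are hypothetical and only illustrate the arithmetic:

```python
GIB = 1 << 30


def remaining_after_tf(total_bytes, tf_fraction):
    """Bytes left on the GPU once TensorFlow reserves its fraction."""
    return total_bytes - int(total_bytes * tf_fraction)


def workspace_fits(total_bytes, tf_fraction, workspace_bytes):
    """True if the requested workspace fits in what TensorFlow leaves free."""
    return workspace_bytes <= remaining_after_tf(total_bytes, tf_fraction)


total = 8 * GIB  # nominal capacity; usable memory is lower in practice

# TensorFlow capped at 0.333 of the GPU, 2 GiB workspace: fits.
print(workspace_fits(total, 0.333, 2 * GIB))

# TensorFlow holding roughly 7 of 8 GiB, 2 GiB workspace: does not fit.
print(workspace_fits(total, 7 / 8, 2 * GIB))
```

In TensorFlow 1.x the gpu_options object only takes effect when it is passed to the session, typically as tf.Session(config=tf.ConfigProto(gpu_options=gpu_options)).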