Building TensorRT INT8 engine fails

Description

Building an INT8 engine for a custom model fails during calibration with a CUDA out-of-memory error, even though FP32 and FP16 engines build and run at the same batch size.

Environment

TensorRT Version: 5.1.5.0
GPU Type: RTX-2080
Nvidia Driver Version: 450.102.04
CUDA Version: 10.1
CUDNN Version: 7.5
Operating System + Version: Ubuntu 18.04
Python Version (if applicable): 3.6
TensorFlow Version (if applicable):
PyTorch Version (if applicable): 1.4
Baremetal or Container (if container which image + tag):

Hello,
I’m working on building different types of TensorRT engines for my custom model.
I successfully built and ran the full- and half-precision ones with a maximum workspace size of 2**20 bytes and a maximum batch size of 6.
However, while calibrating for the INT8 engine I ran into the following error:

[TensorRT] ERROR: runtime.cpp (25) - Cuda Error in allocate: 2 (GPU memory allocation failed during allocation of workspace. Try decreasing batch size.)

I’m wondering why this appears, as I had no problems running a batch size of 6 previously.

P.S. There is no problem with the calibration process itself: after reducing the batch size, it runs without any issues.

Thank you,
Alex.

Hi @spivakoa,

It looks like you have an older version of TensorRT. Please try the latest release:
https://developer.nvidia.com/nvidia-tensorrt-7x-download
For your reference, see the documentation for the sample on performing inference in INT8 using calibration.
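As background on what the calibrator computes, here is a minimal pure-Python sketch of symmetric INT8 quantization, the scheme TensorRT uses: calibration picks a dynamic range per tensor, and a scale maps that range onto [-127, 127]. The helper names below are illustrative, and TensorRT's entropy calibrator chooses the range from activation histograms rather than the simple max used here.

```python
import numpy as np

def int8_scale(dyn_range_max: float) -> float:
    # Symmetric quantization: map [-max, max] onto the int8 range [-127, 127].
    return dyn_range_max / 127.0

def quantize(x: np.ndarray, scale: float) -> np.ndarray:
    # Round to the nearest int8 code, clamping values outside the dynamic range.
    return np.clip(np.round(x / scale), -127, 127).astype(np.int8)

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

activations = np.array([0.5, -1.25, 2.0, -2.0], dtype=np.float32)
scale = int8_scale(np.abs(activations).max())  # 2.0 / 127
q = quantize(activations, scale)               # int8 codes: 32, -79, 127, -127
recovered = dequantize(q, scale)               # approximately the original values
```

Note that calibration itself runs the network to collect these statistics, so it can need more GPU memory than a plain FP32 inference pass at the same batch size; lowering the calibration batch size, as you did, is a common workaround.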

Thank you.