TensorRT Version: 188.8.131.52
GPU Type: RTX-2080
Nvidia Driver Version: 450.102.04
CUDA Version: 10.1
CUDNN Version: 7.5
Operating System + Version: Ubuntu 18.04
Python Version (if applicable): 3.6
TensorFlow Version (if applicable):
PyTorch Version (if applicable): 1.4
Baremetal or Container (if container which image + tag):
I’m working on building different types of TensorRT engines for my custom model.
I successfully built and ran full-precision (FP32) and half-precision (FP16) engines, using a maximum workspace size of 2**20 and a maximum batch size of 6.
However, while calibrating for an INT8 engine, I ran into the following error:
[TensorRT] ERROR: runtime.cpp (25) - Cuda Error in allocate: 2 (GPU memory allocation failed during allocation of workspace. Try decreasing batch size.)
I’m wondering why this error appears, since I had no problems running a batch of 6 previously.
P.S. The calibration process itself works: after reducing the batch size, it runs without any problems.
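
For scale, the workspace figure mentioned above can be converted to MiB. This is only a minimal sketch: `workspace_mib` is a hypothetical helper, not part of TensorRT, and it assumes the value is given in bytes, which is how TensorRT's `max_workspace_size` is specified.

```python
def workspace_mib(num_bytes: int) -> float:
    """Convert a workspace size in bytes to mebibytes (MiB)."""
    return num_bytes / (1024 ** 2)


# Value taken from the post; assumed to be in bytes.
max_workspace_size = 2 ** 20

print(f"{max_workspace_size} bytes = {workspace_mib(max_workspace_size):.1f} MiB")
```

Under that assumption, the configured workspace is 1 MiB.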