Can't perform inference using the TensorRT Python API


While following the instructions in the TensorRT docs to run inference using the Python API, I am getting a memory allocation error:

pycuda._driver.MemoryError: cuMemHostAlloc failed: out of memory


TensorRT Version:
GPU Type: Titan X
Nvidia Driver Version: 450.51.06
CUDA Version: 11.0
CUDNN Version:
Operating System + Version: Ubuntu 20.04
Python Version (if applicable): 3.6
TensorFlow Version (if applicable):
PyTorch Version (if applicable): 1.6
Baremetal or Container (if container which image + tag): Baremetal

Hi @mmaaz60,
Could you please try running your model with trtexec using the --verbose flag, and share the logs with us?
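
In case it helps, here is a minimal sketch of how the verbose run could be launched from Python with its output saved to a file for sharing. It assumes the model has been exported to ONNX (the post does not say which format is used) and that trtexec is on your PATH. The file names `model.onnx` and `trtexec_verbose.log` and the helper names are hypothetical placeholders.

```python
import shlex
import subprocess


def build_trtexec_command(model_path):
    """Build the argument list for a verbose trtexec run.

    Assumption: the model is an ONNX file, so it is passed via --onnx.
    """
    return ["trtexec", "--onnx={}".format(model_path), "--verbose"]


def run_and_save_log(cmd, log_path):
    """Run the command and write stdout and stderr to log_path.

    Returns the process exit code. Requires trtexec to be installed.
    """
    with open(log_path, "w") as log_file:
        result = subprocess.run(cmd, stdout=log_file, stderr=subprocess.STDOUT)
    return result.returncode


# Hypothetical model path; replace with your own file.
cmd = build_trtexec_command("model.onnx")

# Print the equivalent shell command so it can also be run by hand.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Calling `run_and_save_log(cmd, "trtexec_verbose.log")` on the machine that shows the error should then produce a log file you can attach to your reply.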