PyCUDA ERROR: The context stack was not empty upon module cleanup

Description

I have an engine and create a context inside a class. When I call inference through ProcessPoolExecutor(NUMBER), I get the following error, even with NUMBER=1:

PyCUDA ERROR: The context stack was not empty upon module cleanup.

However, when I call the inference code without multiprocessing, everything works.

Also, what should I do when I have 4 engines in one system and use multiprocessing?
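For reference, PyCUDA prints this message when a context is still on the context stack at interpreter shutdown. A minimal sketch of one way to avoid that with several engines is below, assuming each engine gets its own worker process that creates its context, runs inference, and pops the context in a `finally` block. `StandInContext`, `run_engine`, and the engine names are hypothetical and stand in for the real PyCUDA and TensorRT calls, so the sketch runs without a GPU. The comments mark where `pycuda.driver.init()`, `Device(0).make_context()`, and `Context.pop()` would go.

```python
from concurrent.futures import ProcessPoolExecutor


class StandInContext:
    """Hypothetical stand-in for a PyCUDA context (not part of PyCUDA)."""

    stack = []  # mimics the context stack of the current process

    @classmethod
    def make(cls):
        # Real code (assumption): cuda.init(); ctx = cuda.Device(0).make_context()
        # make_context() also pushes the new context onto the stack.
        ctx = cls()
        cls.stack.append(ctx)
        return ctx

    def pop(self):
        # Real code (assumption): ctx.pop()
        StandInContext.stack.remove(self)


def run_engine(engine_name, inputs):
    """Runs entirely inside one worker process: create, use, then pop."""
    ctx = StandInContext.make()
    try:
        # Real code would deserialize the TensorRT engine and run inference
        # here; this placeholder only tags each input with the engine name.
        outputs = [f"{engine_name}:{x}" for x in inputs]
    finally:
        # Always pop, even if inference raises, so the stack is empty
        # before the process exits.
        ctx.pop()
    return outputs, len(StandInContext.stack)


def main():
    # Hypothetical engine names, one worker process per engine.
    engines = ["engine_a", "engine_b", "engine_c", "engine_d"]
    with ProcessPoolExecutor(max_workers=len(engines)) as pool:
        futures = [pool.submit(run_engine, name, [1, 2]) for name in engines]
        for future in futures:
            outputs, depth = future.result()
            print(outputs, "stack depth after pop:", depth)
```

To try it, call `main()` under an `if __name__ == "__main__":` guard. The key design choice is that no CUDA context is created in the parent process before the workers start. Each worker owns its context from creation to pop.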

Environment

TensorRT Version: 8.4.0.6
GPU Type: A10
Nvidia Driver Version: 460.106.00
CUDA Version: 11.2
CUDNN Version: 8.0
Operating System + Version: Ubuntu 18.04
Python Version (if applicable): 3.7
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):

Relevant Files

toy.7z (19.8 MB)
Here are the demo scripts and the corresponding engine and plugins.

Steps To Reproduce

bash run.sh