CUDA error: unspecified launch failure

I’m trying to run TensorRT inference on 7 streams in parallel using Python multiprocessing.
After some time, one of the processes crashes while loading a tensor onto CUDA or performing any other CUDA operation,
throwing RuntimeError: CUDA error: unspecified launch failure.
Only one of the 7 concurrently running processes is affected.
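The actual inference code isn't shown, so here is a minimal, stdlib-only sketch of the layout described above: one worker process per stream, with the TensorRT/CUDA inference replaced by a placeholder. The names (`worker`, `run_all`) are hypothetical. One detail worth checking in the real code: CUDA contexts do not survive fork(), so child processes that touch CUDA should be created with the "spawn" start method.

```python
import multiprocessing as mp

NUM_STREAMS = 7  # one worker process per stream, as in the setup above

def worker(stream_id, results):
    # Placeholder for the real work: deserializing the TensorRT engine and
    # running inference on CUDA. Each child must initialize CUDA itself,
    # which is why the "spawn" start method is used below -- a CUDA context
    # inherited across fork() is a common source of hard-to-reproduce errors.
    results.put((stream_id, "ok"))

def run_all(num_streams=NUM_STREAMS):
    ctx = mp.get_context("spawn")  # safe start method for CUDA child processes
    results = ctx.Queue()
    procs = [ctx.Process(target=worker, args=(i, results))
             for i in range(num_streams)]
    for p in procs:
        p.start()
    # Drain the queue before joining, so a full queue buffer can't deadlock join().
    outputs = [results.get() for _ in procs]
    for p in procs:
        p.join()
    return sorted(outputs)

if __name__ == "__main__":
    print(run_all())
```

This is only a sketch of the process structure under the stated assumptions, not the original code; the point is that each worker owns its CUDA state independently, so a failure in one worker can be detected (via `exitcode`) and that worker respawned without restarting the other six.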

How to reproduce:
After running inference for 4-5 hours, one random process fails with the CUDA error above.
GPU utilization: ~90%
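Because CUDA kernel launches are asynchronous, an "unspecified launch failure" is often reported at a later, unrelated CUDA call rather than at the kernel that actually failed. One way to localize it is to run with synchronous launches (the script name below is a placeholder for the real entry point):

```shell
# Force synchronous kernel launches so the error surfaces at the true call site.
# Expect a significant slowdown; use only while debugging.
CUDA_LAUNCH_BLOCKING=1 python run_inference.py
```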

Server specification:
GPU: NVIDIA Tesla T4 16 GB
CPU: AMD EPYC 7262
CUDA 11.0
cuDNN 8.1
TRTorch 0.2.0
Ubuntu 18.04.6
Python 3.7

The link below might be useful to you.
For multi-threading/multi-stream inference, we suggest using DeepStream or Triton.
For more details, we recommend raising the query on the DeepStream or Triton forum.