Cannot run two TensorRT models (two Docker containers) on the same GPU


I have an AI module made up of 2 Docker containers: the first container has 2 CNN models and the second has 3 CNN models. Everything works well with native TensorFlow. But after converting all the CNN models to TF-TRT format, I can only run one of the two containers; the second container fails with this message:

I tensorflow/core/common_runtime/gpu/] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 6043 MB memory:  -> device: 0, name: Tesla T4, pci bus id: 0000:00:1e.0, compute capability: 7.5
 I tensorflow/compiler/mlir/] None of the MLIR Optimization Passes are enabled (registered 2)
 E tensorflow/compiler/tf2tensorrt/utils/] DefaultLogger coreReadArchive.cpp (41) - Serialization Error in verifyHeader: 0 (Version tag does not match. Note: Current Version: 96, Serialized Engine Version: 97)
 E tensorflow/compiler/tf2tensorrt/utils/] DefaultLogger INVALID_STATE: std::exception
 E tensorflow/compiler/tf2tensorrt/utils/] DefaultLogger INVALID_CONFIG: Deserialize the cuda engine failed.
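
The verifyHeader line reports two different version tags (current 96, serialized engine 97), which suggests the engine being loaded was serialized by a different TensorRT build than the one deserializing it. As a minimal sketch for comparing the two containers, the tags can be pulled out of the log with a small helper. The function name `parse_engine_version_mismatch` is hypothetical, and the pattern assumes the exact message format shown above:

```python
import re

# Assumption: the message format matches the log line quoted in this post.
_VERSION_RE = re.compile(
    r"Current Version:\s*(\d+),\s*Serialized Engine Version:\s*(\d+)"
)


def parse_engine_version_mismatch(log_line):
    """Return (current, serialized) version tags, or None if the line has none."""
    match = _VERSION_RE.search(log_line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# The error line from the second container, with the truncated path prefix omitted.
line = (
    "DefaultLogger coreReadArchive.cpp (41) - Serialization Error in "
    "verifyHeader: 0 (Version tag does not match. Note: Current Version: 96, "
    "Serialized Engine Version: 97)"
)
versions = parse_engine_version_mismatch(line)
if versions is not None:
    current, serialized = versions
    print(f"runtime tag={current}, engine tag={serialized}, match={current == serialized}")
```

Running this over the logs of both containers should show whether each one loads engines whose tag matches its own runtime.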


TensorRT Version:
GPU Type: T4
TensorFlow Version: 2.4.2

