GPU memory difference between 1070 and 2070 for YOLOv3

Description

We found that the GPU memory used by our TensorRT implementation of YOLOv3 differs between a 1070 and a 2070: about 400 MB on the 1070 versus about 600 MB on the 2070.

I’d like to know: is there any way to bring the 2070’s GPU memory consumption down to the 1070’s 400 MB? Or is this entirely determined by the hardware and TensorRT’s internal mechanisms, and thus cannot be reduced at the software layer?

Environment

TensorRT Version: 5.0.2.6
GPU Type: 1070
Nvidia Driver Version: 384.130
CUDA Version: 9.0
CUDNN Version: 7.3.1
Operating System + Version: Ubuntu 16.04

TensorRT Version: 5.1.5.0
GPU Type: 2070
Nvidia Driver Version: 430.340
CUDA Version: 10.1
CUDNN Version: 7.6.1
Operating System + Version: Ubuntu 16.04

The memory usage depends on the device and on the kernels chosen to optimize the model, based on precision and other factors.
To determine how much memory a model will use, please see the FAQ entry "How do I determine how much device memory will be required by my network?":
https://docs.nvidia.com/deeplearning/tensorrt/archives/tensorrt-710-ea/developer-guide/index.html#faq
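As a rough illustration of what that FAQ describes, the per-context activation memory can be read from a deserialized engine via `ICudaEngine.device_memory_size` in the TensorRT Python API. This is only a sketch: it assumes you have a serialized engine file on disk and a TensorRT installation, and the hypothetical helper names (`report_engine_memory`, `bytes_to_mib`) are mine, not from the thread.

```python
def bytes_to_mib(n_bytes):
    """Convert a byte count to MiB for readable reporting."""
    return n_bytes / (1024 * 1024)


def report_engine_memory(engine_path):
    """Deserialize a TensorRT engine and report the device memory one
    execution context will need (activations; weights are counted
    separately as part of the engine itself)."""
    import tensorrt as trt  # imported here; requires a TensorRT install

    logger = trt.Logger(trt.Logger.WARNING)
    runtime = trt.Runtime(logger)
    with open(engine_path, "rb") as f:
        engine = runtime.deserialize_cuda_engine(f.read())
    print("per-context device memory: %.1f MiB"
          % bytes_to_mib(engine.device_memory_size))
```

Note that this reports only the scratch memory TensorRT requests per execution context; the CUDA context itself and the engine weights add to what `nvidia-smi` shows, which is part of why totals differ across GPUs.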

You can also try changing the max workspace size when creating the engine to reduce memory consumption, but setting it too small may degrade performance.
Please refer to the link below:
https://docs.nvidia.com/deeplearning/tensorrt/archives/tensorrt-710-ea/developer-guide/index.html#build_engine_python
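A minimal sketch of capping the workspace during engine build, using the pre-TensorRT-8 Python API (`Builder.max_workspace_size`, as in the TensorRT 5.x/7.x versions mentioned above). The ONNX input path and the helper names are assumptions for illustration; the original YOLOv3 model may well have been built through a different parser.

```python
def mib_to_bytes(mib):
    """Convert MiB to the byte count the builder expects."""
    return mib * 1024 * 1024


def build_engine(onnx_path, workspace_mib=256):
    """Build a TensorRT engine with a capped builder workspace.
    A smaller workspace can lower peak GPU memory during build and
    inference, but may force TensorRT to choose slower kernels."""
    import tensorrt as trt  # imported here; requires a TensorRT install

    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)
    network = builder.create_network()
    parser = trt.OnnxParser(network, logger)
    with open(onnx_path, "rb") as f:
        parser.parse(f.read())
    # Cap the scratch space TensorRT may use when selecting kernels.
    builder.max_workspace_size = mib_to_bytes(workspace_mib)
    return builder.build_cuda_engine(network)
```

If memory is the priority, try halving the workspace and re-measuring both memory and latency; the best trade-off is model- and device-specific.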

Thanks

Thank you very much for your reply.

We have separated engine building from inference, so the consumption reported here comes purely from runtime inference and weight loading. We suspect it is hardware related: we tried the 1070 with the same software environment (driver, CUDA, cuDNN, etc.) as the 2070, and there is still over 200 MB of consumption for pure inference.

Yes, as I mentioned in my earlier comment, the device is also one of the factors in memory usage, since the optimized TensorRT kernels may vary from device to device.

Thanks