Can a huge model with a very large total weights size be run with TensorRT on an A10?

When I use trtexec to run inference on the opt13 ONNX model file, internal errors occur. The opt30 model is a huge model with a 60 GB weights size. What is the root cause? Detailed log below.

TensorRT version: 8.4.0.6

Hi,

We are unable to access the images. Could you please share the trtexec --verbose logs with us?

Thank you.

Each tensor in the opt13 model has no more than 2^31 - 1 elements, but the total size of the model's tensors is more than 120 GB.
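For reference, a minimal sketch (not the script actually used here; it assumes the model is saved as model.onnx with its weights in external data files, and that the weights are fp32) of how the per-tensor element limit and the total size can be checked without pulling the weights into memory:

```python
import onnx

INT32_MAX = 2**31 - 1  # per-tensor element limit mentioned above

# load_external_data=False reads only the graph structure and tensor metadata,
# leaving the >100 GB of weight data on disk.
model = onnx.load("model.onnx", load_external_data=False)

total_bytes = 0
for init in model.graph.initializer:
    elems = 1
    for dim in init.dims:
        elems *= dim
    if elems > INT32_MAX:
        print(f"{init.name}: {elems} elements exceeds 2^31 - 1")
    total_bytes += elems * 4  # rough estimate assuming 4-byte (fp32) weights

print(f"approx. total weight size: {total_bytes / 2**30:.1f} GiB (assuming fp32)")
```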

Hi,

There was a known issue similar to this that has been fixed in recent versions.
We recommend upgrading to the latest TensorRT version (8.5) and trying again with an increased workspace. If you still face this issue, please share the complete trtexec --verbose logs with us and, if possible, a minimal ONNX model that reproduces the issue.
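For what it's worth, here is a minimal sketch of the same build with a larger workspace done through the TensorRT Python API (the model path and the 8 GiB limit are placeholders, not values from this thread):

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.VERBOSE)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
)
parser = trt.OnnxParser(network, logger)

# parse_from_file also picks up external weight files stored next to the model
if not parser.parse_from_file("model.onnx"):
    for i in range(parser.num_errors):
        print(parser.get_error(i))
    raise SystemExit("ONNX parse failed")

config = builder.create_builder_config()
# Equivalent of raising the trtexec workspace: allow up to 8 GiB of scratch memory.
config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 8 << 30)

engine = builder.build_serialized_network(network, config)
```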

Thank you.

I tried again using the latest TensorRT version, v8.5. It reports a similar error. Log below.

Hi,

Could you please share the ONNX model with us (here or via DM) for better debugging?

Thank you.

Maybe you can try disabling cuBLAS and cuDNN when using trtexec.

@410069103 It does not work when I add the option "--tacticSources=-CUBLAS,-CUDNN".
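(For reference, a sketch of roughly what that option maps to in the TensorRT Python API, assuming a builder config created as in the earlier sketch; this snippet is illustrative, not from the original thread:)

```python
import tensorrt as trt

def disable_cublas_cudnn(config: trt.IBuilderConfig) -> None:
    # Start from the currently enabled tactic sources and clear the
    # bits for the cuBLAS and cuDNN tactic libraries.
    sources = config.get_tactic_sources()
    for src in (trt.TacticSource.CUBLAS, trt.TacticSource.CUDNN):
        sources &= ~(1 << int(src))
    config.set_tactic_sources(sources)
```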

@spolisetty The model is too big to upload. I just want to know how TensorRT can run inference on a huge model whose total weights size (60 GB) is larger than the A10's available device memory (22 GB), and how to use trtexec to run such a huge model.

From the log, we found that TensorRT tries to allocate 50 GB, which is hard to understand.

Hi,

This is not possible. The model has to fit within device memory.
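As a rough pre-check (a sketch, assuming the pynvml package and a placeholder directory opt_onnx holding the model plus its external weight files; neither is from this thread), one can compare the on-disk weight size with the GPU's memory before attempting a build:

```python
import os
import pynvml

MODEL_DIR = "opt_onnx"  # placeholder: directory with model.onnx and external weights

# Total bytes of the model files on disk, a lower bound on the memory the weights need.
weight_bytes = sum(
    os.path.getsize(os.path.join(MODEL_DIR, name)) for name in os.listdir(MODEL_DIR)
)

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)
mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
pynvml.nvmlShutdown()

print(f"weights on disk : {weight_bytes / 2**30:.1f} GiB")
print(f"GPU total memory: {mem.total / 2**30:.1f} GiB")
if weight_bytes > mem.total:
    print("The weights alone exceed device memory; the engine cannot fit on this GPU.")
```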

Thank you.