Can't build TensorRT engine from ONNX due to insufficient memory

I am trying to convert a network from PyTorch to TensorRT. The network is very large, and simplifying it is not an easy task…
I set the max workspace size to 1 << 50 bytes when building the engine, so it is effectively unlimited.
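
For scale, 1 << 50 bytes is 1 PiB, far more than the 12 GB or less on an RTX 3060. Below is a minimal sketch of capping the request at a fraction of the free device memory. The helper name `clampWorkspaceBytes` and the fraction are hypothetical. In a real build the free-memory figure would come from `cudaMemGetInfo`, and the result would be passed to `IBuilderConfig::setMemoryPoolLimit` with `MemoryPoolType::kWORKSPACE`, which replaces the deprecated `setMaxWorkspaceSize`. Whether a smaller limit changes the outcome for this network is not something the logs here confirm.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of TensorRT): cap a requested workspace
// size at a fraction of the free device memory.
//   requestedBytes: what the caller asked for, e.g. 1ULL << 50
//   freeBytes:      free device memory, e.g. as reported by cudaMemGetInfo
//   fraction:       share of free memory the builder may use, in (0, 1]
std::uint64_t clampWorkspaceBytes(std::uint64_t requestedBytes,
                                  std::uint64_t freeBytes,
                                  double fraction)
{
    assert(fraction > 0.0 && fraction <= 1.0);
    // Compute the cap in double, then truncate back to whole bytes.
    const auto cap = static_cast<std::uint64_t>(
        static_cast<double>(freeBytes) * fraction);
    return std::min(requestedBytes, cap);
}
```

With an assumed 12 GiB free and a fraction of 0.5, `clampWorkspaceBytes(1ULL << 50, 12ULL << 30, 0.5)` yields 6 GiB, while a request that is already below the cap is returned unchanged.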

Could you look at the ONNX file and the build logs, and maybe give me some advice on what to do about this?

This network requires some custom plugins, so you can’t build this ONNX with trtexec. I am trying to build it from my C++ project, using the ONNX parser.

TensorRT 8.6
Ubuntu 20.04
NVIDIA GeForce RTX 3060
Driver Version: 515.105.01
CUDA Version: 11.7

fsdv2_split.onnx.zip (59.3 MB)
fsdv2_split.onnx.z01.zip (90 MB)

The zipped ONNX is about 160 MB, so I split it. Rename fsdv2_split.onnx.z01.zip to fsdv2_split.onnx.z01 and merge it with fsdv2_split.onnx.zip.
This ONNX is already simplified with onnx-simplifier.

trt_build.log (533.2 KB)

I’ve had similar issues that I seem to have worked around by using one of these two commands:

TensorRT/trtexec --onnx=model.onnx --saveEngine=model.plan --buildOnly --verbose --memPoolSize=workspace:1,dlaSRAM:1,dlaLocalDRAM:1,dlaGlobalDRAM:1 --device=0 --refit

or

TensorRT/trtexec --onnx=model.onnx --saveEngine=model.plan --buildOnly --verbose --memPoolSize=workspace:100000000,dlaSRAM:100000000,dlaLocalDRAM:100000000,dlaGlobalDRAM:100000000 --device=0 --refit
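
To make the difference between the two commands concrete: as far as I know, trtexec reads `--memPoolSize` values in MiB, so the first command limits the workspace to 1 MiB and the second requests roughly 95 TiB. The small conversion sketch below shows the arithmetic. `mibToBytes` is a hypothetical helper, not a TensorRT function.

```cpp
#include <cstdint>

// Hypothetical helper: convert a size in MiB (the unit assumed here for
// trtexec's --memPoolSize) to bytes. 1 MiB = 2^20 bytes.
constexpr std::uint64_t mibToBytes(std::uint64_t mib)
{
    return mib * (1ULL << 20);
}

// workspace:1          ->           1,048,576 bytes (1 MiB)
// workspace:100000000  -> 104,857,600,000,000 bytes (about 95.4 TiB)
static_assert(mibToBytes(1) == 1048576ULL, "1 MiB in bytes");
static_assert(mibToBytes(100000000) == 104857600000000ULL, "1e8 MiB in bytes");
```

Under that assumption, the two commands probe opposite extremes: a nearly zero workspace and an effectively unlimited one.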

Stop any other services that are running and try again.

Please take a look at the new log.

2: Assertion formulas failed. Has no upper bound: (VALUE 2486)

2: [trainStationBuilder.cpp::replace::313] Error Code 2: Internal Error (Assertion formulas failed. Has no upper bound: (VALUE 2486))

I have no idea what this could be.

trt_build.log (1.0 MB)