Hi,
I have a very simple dummy model that just concatenates the output of a transposed convolution with another tensor. Both tensors have compatible dimensions, and the concatenation happens on the channel dimension (dim = 1; NCHW ordering).
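For reference, the dummy model is generated roughly like this (the channel counts, spatial sizes, and opset here are placeholders, not the exact values in the attached file):

```python
# Minimal sketch of the model described above: a ConvTranspose2d whose output
# is concatenated with a second input along the channel axis (dim = 1).
# Shapes and the output file name are assumptions, not taken from the attachment.
import torch
import torch.nn as nn

class UpsampleConcat(nn.Module):
    def __init__(self):
        super().__init__()
        # Doubles the spatial resolution: 16x32x32 -> 8x64x64
        self.up = nn.ConvTranspose2d(16, 8, kernel_size=2, stride=2)

    def forward(self, x, skip):
        y = self.up(x)                  # N x 8 x 64 x 64
        return torch.cat([y, skip], 1)  # concat on channels -> N x 16 x 64 x 64

model = UpsampleConcat().eval()
x = torch.randn(1, 16, 32, 32)
skip = torch.randn(1, 8, 64, 64)
torch.onnx.export(model, (x, skip), "dummy_concat.onnx",
                  input_names=["x", "skip"], output_names=["out"],
                  opset_version=17)
```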
However, when compiling the model for DLA with trtexec (TRT 10.3, JetPack 6.1), the conversion emits the following warning:
[W] [TRT] Failed to add layer: Assertion actualOutputDims == expectedOutputDims failed.
After the warning, the model still gets compiled, but during inference it never runs on the DLA; instead, it runs entirely on the GPU. I’ve attached a dummy version of the ONNX model and the trtexec log.
I’ve also seen that the model falls back “fully” to the GPU even when other parts of it could run on the DLA without problems (so if you add more layers before or after the concatenation, the whole model is still placed entirely on the GPU). I wanted to know whether there is something wrong with the ONNX generation itself, or whether this is some internal issue in TRT 10.3 that gets fixed in a later version. As far as I know, DLA supports concatenation along the channel axis, so I’m not sure what the source of the problem is.
Note: I know the whole model falls back because tegrastats reports no DLA usage during inference (both with trtexec and with a custom script based on the TRT Python API). This happens for the attached model, as well as for any other model that contains the same layers plus additional ones (for example, if we add convolutional layers at the beginning, tegrastats still reports no DLA usage).
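As an additional sanity check, a minimal sketch along these lines (file name, DLA core index, and precision flags are assumptions, and this is not my actual script) can ask TensorRT per layer whether it considers it DLA-capable before building:

```python
# Sketch: parse the ONNX model and report per-layer DLA eligibility
# using the TensorRT Python API. Paths and flags are placeholders.
import tensorrt as trt

LOGGER = trt.Logger(trt.Logger.WARNING)

builder = trt.Builder(LOGGER)
network = builder.create_network(0)  # TRT 10 networks are explicit-batch
parser = trt.OnnxParser(network, LOGGER)

with open("dummy_concat.onnx", "rb") as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise SystemExit("ONNX parse failed")

config = builder.create_builder_config()
config.default_device_type = trt.DeviceType.DLA
config.DLA_core = 0
config.set_flag(trt.BuilderFlag.FP16)          # DLA requires FP16 or INT8
config.set_flag(trt.BuilderFlag.GPU_FALLBACK)  # let rejected layers fall back

# Print which layers TensorRT is willing to place on the DLA.
for i in range(network.num_layers):
    layer = network.get_layer(i)
    print(f"{layer.name}: DLA-capable = {config.can_run_on_DLA(layer)}")
```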
Thanks a lot for your assistance.
Regards,
model.zip (6.9 MB)