CBUF Validation Error During Conversion For DLA

Issue

I encountered a CBUF validation error while converting a PyTorch model (.pth) to ONNX and then to a TensorRT engine for DLA. Despite the error, the conversion process finished successfully. However, I am concerned about the error's meaning and its potential consequences for the model's performance and behavior.

The error is as follows:

CBUF validation failed because the total number of weight and data banks exceeds the maximum allotted number of banks.
Number of weight banks
    = roundUp(numChannels * kernelHeight * kernelWidth * 32, 128) / (cbufEntryWidth * cbufEntriesPerBank)
    = roundUp(960 * 4 * 4 * 32, 128) / (128 * 256)
    = 15
Number of data banks
    = (entriesPerDataSlice * dilatedKernelHeight) / cbufEntriesPerBank
    = (300 * 4) / 256
    = 5,
where:
    entriesPerDataSlice
    = ceil(ceil(numChannels * bytesPerElement / 32) * kernelWidth / 4)
    = ceil(ceil(960 * 2 / 32) * 4 / 4)
    = 300
Maximum allotted banks = 16, which is less than 15 + 5.
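The bank arithmetic reported in the log can be reproduced with a short Python sketch. The constants (cbufEntryWidth = 128, cbufEntriesPerBank = 256, 16 banks maximum) are taken directly from the error message, and entriesPerDataSlice is used as the reported value of 300 rather than recomputed:

```python
import math

def round_up(x, multiple):
    # Round x up to the nearest multiple
    return math.ceil(x / multiple) * multiple

# CBUF geometry as reported in the error message
CBUF_ENTRY_WIDTH = 128       # bytes per CBUF entry
CBUF_ENTRIES_PER_BANK = 256  # entries per bank
MAX_BANKS = 16

# Layer shape from the log: 960 channels, 4x4 kernel
num_channels, kernel_h, kernel_w = 960, 4, 4

weight_banks = (
    round_up(num_channels * kernel_h * kernel_w * 32, 128)
    // (CBUF_ENTRY_WIDTH * CBUF_ENTRIES_PER_BANK)
)  # → 15

entries_per_data_slice = 300  # value reported in the log
data_banks = math.ceil(
    entries_per_data_slice * kernel_h / CBUF_ENTRIES_PER_BANK
)  # → 5

# 15 + 5 = 20 banks needed, but only 16 are available
print(weight_banks + data_banks > MAX_BANKS)  # True
```

In other words, the weights and input data slices of this single convolution together need 20 banks of on-chip convolution buffer, exceeding the 16 banks the DLA provides.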

Environment

TensorRT Version: 8.5.2-1+cuda11.4
PyTorch Version: 2.1.0a0+41361538.nv23.06
Hardware: Jetson Xavier NX
Operating System: Ubuntu 20.04.6 LTS
CUDA Version: cuda_11.4.r11.4/compiler.31964100_0
cuDNN Version: 8.6.0.166-1+cuda11.4

Input Dimensions:

[ [ 3, 512, 512 ],
[ 3, 512, 512 ],
[ 1, 512, 512 ] ]

Architecture: V2mobilenet


Questions

Meaning: What does this error indicate? Does it point to a hardware resource limitation, a model compatibility issue, or an optimization constraint during the conversion process?
Potential Consequences: Could this error impact the runtime performance, accuracy, or stability of the TensorRT engine?

Additional Notes

Despite the error, the conversion completed, and the engine runs without immediate signs of failure. However, I would like to understand the root cause and whether further action is required to ensure the robustness of the deployed model.

Thank you for your assistance.

Dear @alex318,
Could you share the repro steps?