CBUF Validation Error During Conversion For DLA

Issue

I encountered a CBUF validation error while converting a PyTorch model (.pth) to ONNX and then to a TensorRT engine for DLA. Despite the error, the conversion process finished successfully. However, I am concerned about the error's meaning and its potential consequences for the model's performance and behavior.

The error is as follows:

CBUF validation failed because the total number of weight and data banks exceeds the maximum allotted number of banks.
Number of weight banks
    = roundUp(numChannels * kernelHeight * kernelWidth * 32, 128) / (cbufEntryWidth * cbufEntriesPerBank)
    = roundUp(960 * 4 * 4 * 32, 128) / (128 * 256)
    = 15
Number of data banks
    = (entriesPerDataSlice * dilatedKernelHeight) / cbufEntriesPerBank
    = (300 * 4) / 256
    = 5,
where:
    entriesPerDataSlice
    = ceil(ceil(numChannels * bytesPerElement / 32) * kernelWidth / 4)
    = ceil(ceil(960 * 2 / 32) * 4 / 4)
    = 300
Maximum allotted banks = 16, which is less than 15 + 5.
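The bank arithmetic reported in the log can be reproduced with a short Python sketch. The constants (cbufEntryWidth = 128, cbufEntriesPerBank = 256, 16 banks maximum) are taken directly from the error message, and entriesPerDataSlice is used as the reported value of 300 rather than recomputed:

```python
import math

def round_up(x, multiple):
    # Round x up to the nearest multiple
    return math.ceil(x / multiple) * multiple

# CBUF geometry as reported in the error message
CBUF_ENTRY_WIDTH = 128       # bytes per CBUF entry
CBUF_ENTRIES_PER_BANK = 256  # entries per bank
MAX_BANKS = 16

# Layer shape from the log: 960 channels, 4x4 kernel
num_channels, kernel_h, kernel_w = 960, 4, 4

weight_banks = (
    round_up(num_channels * kernel_h * kernel_w * 32, 128)
    // (CBUF_ENTRY_WIDTH * CBUF_ENTRIES_PER_BANK)
)  # → 15

entries_per_data_slice = 300  # value reported in the log
data_banks = math.ceil(
    entries_per_data_slice * kernel_h / CBUF_ENTRIES_PER_BANK
)  # → 5

# 15 + 5 = 20 banks needed, but only 16 are available
print(weight_banks + data_banks > MAX_BANKS)  # True
```

In other words, the weights and input data slices of this single convolution together need 20 banks of on-chip convolution buffer, exceeding the 16 banks the DLA provides.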

Environment

TensorRT Version: 8.5.2-1+cuda11.4
PyTorch Version: 2.1.0a0+41361538.nv23.06
Hardware: Jetson Xavier NX
Operating System: Ubuntu 20.04.6 LTS
CUDA Version: cuda_11.4.r11.4/compiler.31964100_0
cuDNN Version: 8.6.0.166-1+cuda11.4

Input Dimensions:

[ [ 3, 512, 512 ],
[ 3, 512, 512 ],
[ 1, 512, 512 ] ]

Architecture: V2mobilenet


Questions

Meaning: What does this error indicate? Does it point to a hardware resource limitation, a model compatibility issue, or an optimization constraint during the conversion process?
Potential Consequences: Could this error impact the runtime performance, accuracy, or stability of the TensorRT engine?

Additional Notes

Despite the error, the conversion completed, and the engine runs without immediate signs of failure. However, I would like to understand the root cause and whether further action is required to ensure the robustness of the deployed model.

Thank you for your assistance.

Dear @alex318,
Could you share the repro steps?