CBUF size of NVDLA on Orin

sunlingyu · July 8, 2024, 10:45am

Hi. I am a Ph.D. student working on computer architecture, especially deep neural network accelerator design. We are interested in the NVDLA accelerator open-sourced by NVIDIA, and in particular in its total on-chip SRAM size, so that we can make a fair comparison against NVDLA in one of our research projects.

I know that the NVDLAs on AGX Orin and AGX Xavier have an SRAM, and that its size is published in the TensorRT developer guide.

However, I have also read in the open-source NVDLA Unit Description that each NVDLA contains a convolution buffer (CBUF), which is another on-chip SRAM component. The NVDLA Unit Description says that the CBUF is 512 KB.

Do the NVDLAs on AGX Orin and AGX Xavier both have a 512 KB CBUF? If not, could you kindly provide the CBUF size of the NVDLA accelerators on AGX Orin and AGX Xavier, respectively? If the data is not publicly available, could you please share it with us privately (e.g. by email)? Thank you very much!

AastaLLL · July 9, 2024, 5:45am

Hi,

They are the same thing. DLA internal SRAM is called CBUF.
So you can find the CBUF limit for Xavier and Orin in the TensorRT guide you shared above.

Thanks.

sunlingyu · July 9, 2024, 6:54am

Thanks for your reply. However, I still suspect that the CBUF and the (dedicated) SRAM are distinct hardware components.

According to the NVDLA Primer, a large NVDLA implementation contains a dedicated SRAM in addition to the internal CBUF. The dedicated SRAM connects to NVDLA through a dedicated interface (the second DBB in Figure 2 of the NVDLA Primer).

Based on certain descriptions in the TensorRT developer guide, I believe that document is describing the dedicated SRAM. For example, it says:

On Xavier, 4 MiB of SRAM is shared across multiple cores including the 2 DLA cores.

This seems to imply that the 4 MiB SRAM is not an internal component of a DLA (i.e. not a CBUF).

So could you please investigate this further? Thank you very much for your invaluable assistance!

AastaLLL · July 10, 2024, 8:20am

Hi,

Sorry for the oversight.

Indeed, there are two SRAMs on the DLA called UBuf and CBuf.
UBuf size can be found in the TensorRT document:

On Orin, each DLA core has 1 MiB of dedicated SRAM. On Xavier, 4 MiB of SRAM is shared across multiple cores including the 2 DLA cores.

But for CBuf, unfortunately, we don't have any public information we can share.

Thanks.

