Hi. I am a Ph.D. student working on computer architecture, specifically deep neural network accelerator design. We are currently interested in the NVDLA accelerator open-sourced by NVIDIA, and in particular in its total on-chip SRAM size, so that we can make a fair comparison against NVDLA in one of our research projects.
I know that the NVDLA instances on AGX Orin and AGX Xavier each have an SRAM, and that the SRAM sizes are published in the TensorRT developer guide.
However, I have also read in the open-sourced NVDLA Unit Description that each NVDLA contains a convolution buffer (CBUF), which is a separate on-chip SRAM component. The NVDLA Unit Description states that the CBUF is 512 KB.
Do the NVDLAs on AGX Orin and AGX Xavier both have a 512 KB CBUF? If not, could you kindly provide the CBUF sizes of the NVDLA accelerators on AGX Orin and AGX Xavier, respectively? If the data is not publicly available, could you please share it with us privately (e.g. by email)? Thank you very much!
Thanks for your reply. However, I still suspect that the CBUF and the (dedicated) SRAM are distinct hardware components.
According to the NVDLA Primer, a large NVDLA implementation contains a dedicated SRAM in addition to the internal CBUF. This dedicated SRAM connects to NVDLA through its own interface (the second DBB interface shown in Figure 2 of the NVDLA Primer).
Combined with certain statements in the TensorRT developer guide, I believe that document is describing the dedicated SRAM. For example, it says:
On Xavier, 4 MiB of SRAM is shared across multiple cores including the 2 DLA cores.
This seems to imply that the 4 MiB SRAM is not an internal component of a DLA (i.e., it is not the CBUF).
So could you please investigate this further? Thank you very much for your invaluable assistance!