Is the BlueField SmartNIC L3 cache divided into slices?

Starting with Intel Sandy Bridge, the last-level cache (LLC) is divided into slices. I wonder whether the ARM L3 cache has a similar non-uniform characteristic, and whether the BlueField SmartNIC (e.g., BlueField MBF1L516A-CSCAT) does as well.

Hello,

In BF-2 there is a 6MB L3 cache that serves all cores.
We are not aware of ‘slices’.
Can you elaborate on your question, please?

Best Regards,
Viki

I am very sorry for not clearly explaining what I mean by “slice”. It may be an Intel-specific concept: starting with Sandy Bridge, Intel’s micro-architecture redesigned the LLC by dividing it into multiple slices.

image

The total size of the LLC in the above figure is 20MB, and the size of each slice is 2.5MB (20MB/8). The CPU cores and all LLC slices are interconnected by a bi-directional bus. However, because the path length between a given core and each slice differs (a non-uniform cache architecture), accessing data stored in a closer slice is faster than accessing data stored in a farther one (e.g., CPU 0 accesses slice 0 faster than it accesses the other slices).
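To make the idea concrete, here is a rough toy model of the layout described above. It is only a sketch under my own assumptions, not Intel's documented topology: I assume core i sits at the same stop as slice i on a bi-directional ring, and that access latency grows with the number of hops. The function names are made up for illustration.

```python
from typing import List


def slice_size_mb(total_mb: float, num_slices: int) -> float:
    """Capacity of one slice when the LLC is split evenly."""
    return total_mb / num_slices


def ring_hops(core: int, slice_id: int, num_stops: int) -> int:
    """Shortest hop count between two stops on a bi-directional ring.

    Assumption: core i and slice i share ring stop i.
    """
    d = abs(core - slice_id) % num_stops
    return min(d, num_stops - d)


def hop_table(num_stops: int) -> List[List[int]]:
    """Row c lists the hop distance from core c to every slice."""
    return [
        [ring_hops(c, s, num_stops) for s in range(num_stops)]
        for c in range(num_stops)
    ]


if __name__ == "__main__":
    # 20MB LLC split into 8 slices, as in the figure.
    print("slice size (MB):", slice_size_mb(20, 8))
    for core, row in enumerate(hop_table(8)):
        print(f"core {core}: hops to slices 0..7 = {row}")
```

In this model, core 0 reaches slice 0 in zero hops and slice 4 in four hops, which is the kind of "close" versus "far" difference I am asking about.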

I wonder whether the ARM L3 cache has a similar non-uniform characteristic, and whether the BlueField SmartNIC (e.g., BlueField MBF1L516A-CSCAT) does as well.

Regards,

I am very sorry for not explaining the “slices” clearly. I have re-uploaded the detailed explanation about “slices”.

Hello,

From the image you shared, it appears you are asking about BF-1 (part number MBF1L516A-CSCAT) rather than BF-2. BF-1 has 2 L3 caches (it has 2 DRAM controllers) and 16 cores.
It may be that there are some minor “distance” latency differences between “close” and “far” cores.

Best Regards,
Viki

Could you share some manuals or papers that describe the L3 cache structure of the BlueField SmartNIC (preferably BlueField MBF1L516A-CSCAT) in detail?

Best Regards