Jetson AGX Orin: System-level Cache

Hello,

In the Jetson AGX Orin Technical Brief, I see that the 4 MB system-level cache is accessible from both the CPU and the GPU according to Figure 2:

However, in Figure 4 the GPU has no access to the system-level cache:

Figure 8 shows that the system-level cache is part of the CPU complex:

Then we come to Figure 9 and see that the GPU has direct access to the system-level cache and memory controller interface:

I saw an older question (orin-system-cache) where it turned out that the system-level cache is effectively L4 for the CPU. But what about the GPU side? Is it L3 for the GPU?

Another question: if we allocate memory using cudaMalloc(), do GPU accesses to this memory (in a CUDA application) go through GPU L2 → LPDDR5, or through GPU L2 → system-level cache → LPDDR5?
If they do not go through the system-level cache by default, is there any way to make them do so?

I would be happy if someone with knowledge clarified these.

Best. Thanks in advance.

Hi,
Here are some suggestions for the common issues:

1. Performance

Please run the commands below before benchmarking a deep learning use case:

$ sudo nvpmodel -m 0
$ sudo jetson_clocks

2. Installation

Installation guide of deep learning frameworks on Jetson:

3. Tutorial

Getting-started deep learning tutorial:

4. Report issue

If these suggestions don’t help and you want to report an issue to us, please share the model, the commands/steps, and the customized app (if any) so that we can reproduce the issue locally.

Thanks!

Hi,

I don’t think this answer is related.

Best.
Regards.

Any help?

Hi,

Sorry for the late update.

You can find the memory management info in the link below:

The memory allocated by cudaMalloc is cached on the GPU side.
However, we cannot disclose more details about cache handling here.
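
Since the cache path itself is not documented, one thing that can be done is to compare how the same kernel behaves on buffers obtained from the different CUDA allocation APIs. The following is only a minimal sketch, not an official benchmark: the kernel `scale`, the helpers `grid_size` and `time_kernel`, the buffer size, and the iteration count are all illustrative choices. The timings only reflect end-to-end behavior; they do not show which caches (GPU L2 or the system-level cache) were involved.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <cuda_runtime.h>

// Abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,   \
                         cudaGetErrorString(err_));                   \
            std::exit(1);                                             \
        }                                                             \
    } while (0)

// Read-modify-write every element so each launch streams the whole buffer.
__global__ void scale(float *data, size_t n, float factor) {
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Number of blocks needed to cover n elements (hypothetical helper).
static int grid_size(size_t n, int block) {
    return (int)((n + block - 1) / block);
}

// Average time in milliseconds of one kernel launch over the buffer.
static float time_kernel(float *dev_ptr, size_t n, int iters) {
    const int block = 256;
    const int grid = grid_size(n, block);
    cudaEvent_t start, stop;
    CHECK(cudaEventCreate(&start));
    CHECK(cudaEventCreate(&stop));
    scale<<<grid, block>>>(dev_ptr, n, 1.0f);  // warm-up launch
    CHECK(cudaGetLastError());
    CHECK(cudaEventRecord(start));
    for (int k = 0; k < iters; ++k) scale<<<grid, block>>>(dev_ptr, n, 1.0f);
    CHECK(cudaEventRecord(stop));
    CHECK(cudaEventSynchronize(stop));
    float ms = 0.0f;
    CHECK(cudaEventElapsedTime(&ms, start, stop));
    CHECK(cudaEventDestroy(start));
    CHECK(cudaEventDestroy(stop));
    return ms / iters;
}

int main() {
    const size_t n = (size_t)1 << 24;  // 16M floats = 64 MiB (arbitrary size)
    const size_t bytes = n * sizeof(float);
    const int iters = 20;              // arbitrary repetition count

    // 1) Device memory from cudaMalloc.
    float *d_buf = nullptr;
    CHECK(cudaMalloc((void **)&d_buf, bytes));
    CHECK(cudaMemset(d_buf, 0, bytes));

    // 2) Pinned, mapped host memory accessed by the GPU through a device pointer.
    float *h_pinned = nullptr, *d_pinned = nullptr;
    CHECK(cudaHostAlloc((void **)&h_pinned, bytes, cudaHostAllocMapped));
    std::memset(h_pinned, 0, bytes);
    CHECK(cudaHostGetDevicePointer((void **)&d_pinned, h_pinned, 0));

    // 3) Managed (unified) memory.
    float *managed = nullptr;
    CHECK(cudaMallocManaged((void **)&managed, bytes));
    std::memset(managed, 0, bytes);

    std::printf("cudaMalloc        : %.3f ms\n", time_kernel(d_buf, n, iters));
    std::printf("cudaHostAlloc     : %.3f ms\n", time_kernel(d_pinned, n, iters));
    std::printf("cudaMallocManaged : %.3f ms\n", time_kernel(managed, n, iters));

    CHECK(cudaFree(d_buf));
    CHECK(cudaFreeHost(h_pinned));
    CHECK(cudaFree(managed));
    return 0;
}
```

Assuming the file is saved as alloc_compare.cu (a hypothetical name), it can be built with `nvcc -o alloc_compare alloc_compare.cu`. Running it after the nvpmodel and jetson_clocks commands mentioned earlier in this thread should make the numbers more repeatable.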

Thanks.
