According to the roofline, L1 is compute bound, and L2 and DRAM are memory bound?

According to the Kernel Profiling Guide (Nsight Compute Documentation), your understanding is correct.
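
In case it helps, here is a rough sketch of how I read a per-level roofline. For each memory level, the kernel's arithmetic intensity (FLOP per byte moved at that level) is compared with the ridge point (peak FLOP/s divided by that level's peak bandwidth). Every number below is a made-up placeholder, not taken from a real GPU or from your report, and `classify` is a hypothetical helper, not an Nsight Compute API.

```python
def classify(flops, bytes_moved, peak_flops, peak_bandwidth):
    """Classify one memory level as compute or memory bound (simple roofline model)."""
    intensity = flops / bytes_moved       # FLOP per byte moved at this level
    ridge = peak_flops / peak_bandwidth   # intensity where the memory roof meets the compute roof
    bound = "compute" if intensity >= ridge else "memory"
    return intensity, ridge, bound


# Placeholder values (assumptions for illustration only).
PEAK_FLOPS = 10e12    # peak compute throughput, FLOP/s
KERNEL_FLOPS = 2e9    # floating-point operations executed by the kernel

# level name -> (bytes moved at that level, peak bandwidth of that level in B/s)
levels = {
    "L1":   (1.0e9, 10e12),
    "L2":   (0.8e9, 2e12),
    "DRAM": (0.5e9, 0.5e12),
}

results = {}
for name, (bytes_moved, bandwidth) in levels.items():
    intensity, ridge, bound = classify(KERNEL_FLOPS, bytes_moved, PEAK_FLOPS, bandwidth)
    results[name] = bound
    print(f"{name}: intensity={intensity:.2f} FLOP/B, ridge={ridge:.2f} FLOP/B -> {bound} bound")
```

With these placeholder numbers, L1 lands to the right of its ridge point (compute bound), while L2 and DRAM land to the left of theirs (memory bound), which is the same pattern as in your chart.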

Hmm, this is the first time I have heard of that. So what is the takeaway from it? Maybe I should increase data reuse at the L2 and DRAM levels, and increase compute throughput at the L1 level? Something like that?

Sorry, I forgot to point you directly to the roofline section: Kernel Profiling Guide (Nsight Compute Documentation).

Regarding the takeaways, let me check with a dev expert directly. I will reply here if I have an update.
