Hello,
I've recently been working with A100 GPU clusters. While studying the architecture, I found that the GPU's L2 cache is divided into two partitions (though the two partitions appear to be interconnected).
So I wonder whether there is any way to assign each partition to a different process (e.g., process 1 can only read from or write to L2 partition 1), or is it not possible to partition the L2 cache at all?
Thank you in advance!