How is the 3D stack used? How is TOM used?

Hi! I am studying this paper:

Transparent Offloading and Mapping (TOM): Enabling Programmer-Transparent Near-Data Processing in GPU Systems

I found out that it seems to be quite famous. Does NVIDIA really use this technique in its GPU compiler?

Also, does NVIDIA use 3D-stacked memory in real GPUs? If so, how? Are there any papers or blog posts about it? Which hardware structure corresponds to the “3D memory stack”?

Also, how is TOM used?

Thank you!!!

In particular, the paper describes a “main GPU” and a “3D stack”. What is the main GPU, and what is the 3D stack? I have learned the traditional GPU structure. Does the 3D stack mean some of the Tensor Cores? Or are all Tensor Cores part of the 3D stack? Can CUDA cores be part of the 3D stack? Or are some CUDA/Tensor Cores in a 3D stack that is faster than the others? Can we control them?

Thanks!! Too many questions!!!

Hi, these seem more like hardware-related questions. I am moving this to the GPU Hardware category for visibility.

Could you provide a link? Thanks!!!

Here is the link: Latest General Topics and Other SDKs/GPU - Hardware topics - NVIDIA Developer Forums