Hello everyone, I could use some help. I'm having some trouble learning GPU programming for the Hopper architecture.
The diagram shows a 64B swizzle, but each square is 128b in size. The diagram comes from here: "1. Introduction — PTX ISA 8.7 documentation".
I can see that it compresses two rows into one and then performs the swizzle operation. But, as in the link I gave, why does it have an 8x4 shape? A 4x8 shape would seem more straightforward.
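To make the question concrete, here is a minimal sketch of how the pattern might be reproduced. It rests on assumptions that are not taken from the linked page: each square is treated as a 16-byte (128-bit) chunk, a row is 64 bytes (4 chunks), and the swizzle XORs byte-address bits 7-8 into bits 4-5, which is my reading of the `Swizzle<2,4,3>` convention in CUTLASS/CuTe. The names `swizzle_64b` and `atom_table` are hypothetical helpers for illustration only.

```python
def swizzle_64b(byte_offset: int) -> int:
    """Assumed 64B swizzle: XOR address bits [7:8] into bits [4:5]."""
    # 0x180 selects bits 7 and 8; shifting right by 3 lines them up with bits 4 and 5.
    return byte_offset ^ ((byte_offset & 0x180) >> 3)


def atom_table(rows: int = 8, chunks_per_row: int = 4, chunk_bytes: int = 16):
    """For each (row, chunk) position, return the chunk column it maps to after swizzling."""
    table = []
    for r in range(rows):
        row = []
        for c in range(chunks_per_row):
            offset = (r * chunks_per_row + c) * chunk_bytes
            # Only bits 4-5 change, so the row stays the same and only the column moves.
            row.append((swizzle_64b(offset) // chunk_bytes) % chunks_per_row)
        table.append(row)
    return table


if __name__ == "__main__":
    for r, row in enumerate(atom_table()):
        print(r, row)
```

Under these assumptions, every pair of 64-byte rows (128 bytes) shares one XOR key, and there are four distinct keys, so the pattern only repeats after 8 rows of 4 chunks (512 bytes). That would be one possible reason the atom is drawn as 8x4 rather than 4x8.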
Maybe I don't understand the concept of the swizzle atom layout or what it does. I hope someone can help. Thank you.