Jetson TX2: Pascal GPU's Load/Store Units

Hello everyone,

I came across the Pascal GP100 architecture and noticed that it has 8 load/store units per processing block, as can be seen in the image I attached to this post.

According to the datasheet, the GP10B Pascal GPU of the Jetson TX2 platform has 4 separate processing blocks per Streaming Multiprocessor (SM), with 32 CUDA cores each. My question is: does the Jetson TX2 GPU also have 8 load/store units per processing block, i.e. 8 load/store units per 32 CUDA cores?

Thank you for any help!

Best regards.


The TX2 GPU is also part of the Pascal family, so the design is similar.
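
To make the arithmetic in the question concrete, here is a small sketch that works out the per-SM totals. The 4 blocks and 32 cores per block come from the question; the 8 load/store units per block is only an assumption carried over from GP100 and is not confirmed for GP10B. The `SmLayout` class and its field names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SmLayout:
    """Per-SM layout figures; every value is supplied by the caller."""
    blocks_per_sm: int    # processing blocks per SM
    cores_per_block: int  # CUDA cores per processing block
    ldst_per_block: int   # load/store units per processing block

    def cores_per_sm(self) -> int:
        return self.blocks_per_sm * self.cores_per_block

    def ldst_per_sm(self) -> int:
        return self.blocks_per_sm * self.ldst_per_block


# From the question: 4 blocks x 32 cores per SM on GP10B.
# ASSUMPTION: 8 LD/ST units per block, as on GP100; not confirmed for GP10B.
gp10b_assumed = SmLayout(blocks_per_sm=4, cores_per_block=32, ldst_per_block=8)

print(gp10b_assumed.cores_per_sm())  # 128
print(gp10b_assumed.ldst_per_sm())   # 32
```

If the assumption holds, that would be 32 load/store units per SM, i.e. one unit per 4 CUDA cores. Swap in a different `ldst_per_block` if the datasheet says otherwise.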
Here is the TX2 module datasheet for your reference: