Hello everyone,
I came across the Pascal GP100 architecture and noticed that it has 8 load/store units per separate processing block, as shown in the image I attached to this post.
According to the datasheet, the GP10B Pascal architecture of the Jetson TX2 platform has 4 separate processing blocks per Streaming Multiprocessor, each with 32 CUDA cores. Now my question is: does the Jetson TX2 GPU also have 8 load/store units per processing block, i.e. 8 load/store units per 32 CUDA cores?
Thank you for any help!
Best regards.