I'm reading the GA102 architecture documentation to understand the underlying hardware details, and a few points confuse me:
- What is a CUDA core?
The article on CUDA cores includes a diagram of a CUDA core's structure, in which each CUDA core contains both an FP unit and an INT unit.
This implies that a single CUDA core handles both floating-point and integer operations. However, the GA102 documentation states that each block within an SM is divided into two data paths: one consists of 16 FP32 CUDA cores, while the other includes 16 FP32 CUDA cores and 16 INT cores. Does this mean that in the Ampere architecture a CUDA core no longer contains an INT unit, and that the INT unit now belongs to a separate INT core?
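To make my current reading concrete, here is a small sketch of the per-clock arithmetic I think the documentation describes. It rests on two assumptions of mine that may be wrong: that an SM has four such blocks, and that in any given clock the second data path issues either FP32 or INT operations, not both. The names (`FP32_ONLY_LANES`, `SHARED_LANES`, `block_ops_per_clock`) are hypothetical and only there for illustration.

```python
# A rough model of my reading of the GA102 SM description.
# Assumptions (mine, please correct them if wrong):
#   - an SM contains 4 blocks
#   - the second data path runs either FP32 or INT in a given clock

BLOCKS_PER_SM = 4      # assumption, not stated in my question above
FP32_ONLY_LANES = 16   # data path 1: 16 FP32 CUDA cores
SHARED_LANES = 16      # data path 2: 16 FP32 CUDA cores and 16 INT cores


def block_ops_per_clock(shared_path_mode):
    """Return (fp32_ops, int_ops) per clock for one block."""
    if shared_path_mode == "fp32":
        # Both data paths execute FP32 work.
        return (FP32_ONLY_LANES + SHARED_LANES, 0)
    if shared_path_mode == "int":
        # Data path 1 executes FP32 while data path 2 executes INT.
        return (FP32_ONLY_LANES, SHARED_LANES)
    raise ValueError("shared_path_mode must be 'fp32' or 'int'")


def sm_ops_per_clock(shared_path_mode):
    """Scale the per-block figures to a whole SM."""
    fp32_ops, int_ops = block_ops_per_clock(shared_path_mode)
    return (fp32_ops * BLOCKS_PER_SM, int_ops * BLOCKS_PER_SM)


if __name__ == "__main__":
    for mode in ("fp32", "int"):
        print(mode, block_ops_per_clock(mode), sm_ops_per_clock(mode))
```

If this model is right, a block would peak at either 32 FP32 operations per clock, or 16 FP32 plus 16 INT operations per clock. That is what makes me wonder whether the INT unit is really separate from the FP32 cores in the second data path.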
In the Ampere architecture, the FMA pipeline is now only a logical pipeline; physically it consists of two pipelines, fmalite and fmaheavy. Is fmalite the data path mentioned above that contains only 16 FP32 CUDA cores, and is fmaheavy the other data path, with 16 FP32 CUDA cores and 16 INT cores?
More generally, which physical units make up a pipeline?
I apologize for my limited hardware knowledge, which has led to numerous areas of confusion.
Thank you very much for your response.