Which pipeline does FP32-to-FP16 conversion?

The page "Kernel Profiling Guide :: Nsight Compute Documentation" states:

fp16 pipeline: […] It also contains a fast FP32-to-FP16 and FP16-to-FP32 converter. Starting with GA10x chips, this functionality is part of the FMA pipeline.
alu pipeline: […] On NVIDIA Ampere architecture chips, the ALU pipeline performs fast FP32-to-FP16 conversion.

My understanding is that GA10x chips belong to the Ampere architecture. So which pipeline performs the FP32-to-FP16 conversion: the FMA pipeline, the ALU pipeline, or the FP16 pipeline?

I’m double-checking with our architecture experts and will let you know what I hear.

I have some more info. While GA100 and GA10x are both referred to as Ampere architecture, they do have differences, particularly in the SM. To answer your question: on GA100, the FP32-to-FP16 and FP16-to-FP32 converters are in the FP16 pipe.

I was mostly asking about GA102. Is it the same for that chip?

For GA102 it’s in the FMA pipe.
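
If you want to check this on your own hardware, a minimal sketch along these lines could be profiled with Nsight Compute. The kernel name `convertF32ToF16`, the buffer size, and the launch configuration are arbitrary choices for illustration and do not come from the documentation. The only thing the kernel does is one FP32-to-FP16 conversion per thread.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_fp16.h>
#include <cuda_runtime.h>

// Hypothetical test kernel: each thread converts one FP32 value to FP16
// using round-to-nearest-even.
__global__ void convertF32ToF16(const float* in, __half* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = __float2half_rn(in[i]);
    }
}

int main()
{
    const int n = 1 << 20;               // arbitrary element count
    std::vector<float> hostIn(n, 1.5f);  // arbitrary input value

    float* devIn = nullptr;
    __half* devOut = nullptr;
    cudaMalloc(&devIn, n * sizeof(float));
    cudaMalloc(&devOut, n * sizeof(__half));
    cudaMemcpy(devIn, hostIn.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    convertF32ToF16<<<blocks, threads>>>(devIn, devOut, n);

    // Wait for the kernel and report any launch or execution error.
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        std::printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Copy one result back and print it as a sanity check.
    __half first;
    cudaMemcpy(&first, devOut, sizeof(__half), cudaMemcpyDeviceToHost);
    std::printf("out[0] = %f\n", __half2float(first));

    cudaFree(devIn);
    cudaFree(devOut);
    return 0;
}
```

Building this with `nvcc` and running it under something like `ncu --section ComputeWorkloadAnalysis ./a.out` should show the per-pipeline breakdown for the kernel, so you can see which pipe the conversion instructions are attributed to on your chip. The kernel is memory-bound, so expect low utilization numbers overall; what matters here is which pipe shows activity.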

Alright. Thank you!
