Problems with compiler

If I declare variables of type half (say, 16 of them held in registers), how can I tell whether they will end up occupying 16 registers, or whether the compiler will pack them into 8 registers (for example by combining pairs into half2)?

You can find out by studying the SASS (for example, as dumped by cuobjdump -sass). It may also be discoverable from the per-kernel register count in the -Xptxas=-v output that the compiler can generate.
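As a rough sketch of what such an experiment could look like (the file name reg_demo.cu, the kernel scale16 and the helper registers_per_thread are made up for illustration, and sm_80 is only an example target), you could keep 16 half values live in a kernel and then look at what the compiler did with them:

```cuda
// reg_demo.cu: hypothetical file name, for illustration only.
#include <cuda_fp16.h>
#include <cuda_runtime.h>

// Keeps 16 scalar __half values in a local array so the register
// allocator has something to pack (or not). Whether it packs them is
// exactly the thing to check in the SASS; this code does not force it.
__global__ void scale16(const __half* in, __half* out, __half s)
{
    __half v[16];
    const int base = (blockIdx.x * blockDim.x + threadIdx.x) * 16;

    #pragma unroll
    for (int i = 0; i < 16; ++i) v[i] = in[base + i];

    #pragma unroll
    for (int i = 0; i < 16; ++i) v[i] = __hmul(v[i], s);

    #pragma unroll
    for (int i = 0; i < 16; ++i) out[base + i] = v[i];
}

// Runtime counterpart of the -Xptxas=-v report: the number of 32-bit
// registers per thread for the whole kernel. Returns -1 on error.
int registers_per_thread()
{
    cudaFuncAttributes attr;
    if (cudaFuncGetAttributes(&attr, scale16) != cudaSuccess) return -1;
    return attr.numRegs;
}

// Compile-time register report:
//   nvcc -arch=sm_80 -Xptxas=-v -c reg_demo.cu
// SASS dump:
//   nvcc -arch=sm_80 -cubin -o reg_demo.cubin reg_demo.cu
//   cuobjdump -sass reg_demo.cubin
```

Keep in mind that the reported register count covers everything in the kernel (addresses, indices, temporaries), so it is only a coarse indicator. The SASS is the more reliable source: count the distinct registers that hold the half data and check whether the half instructions operate on both 16-bit halves of a register or only on one.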

If you want to address the problem a priori, it seems like you could just use half2 everywhere.

Also think about how the half values are moved around and used, e.g. how they are loaded from or stored to memory, and whether the underlying compute instructions operate on half or on half2. Then you can better guide the conversions yourself.
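To illustrate the "half2 everywhere" idea, here is a minimal sketch that keeps the data in __half2 from load through compute to store. The kernel name axpy_half2 and the host helper axpy_half2_ok are hypothetical. It uses the cuda_fp16.h intrinsics __half2half2 and __hfma2, and should be compiled for sm_53 or newer (e.g. -arch=sm_80):

```cuda
#include <cuda_fp16.h>
#include <cuda_runtime.h>

// y = a * x + y on packed pairs. n2 is the number of __half2 elements,
// i.e. half the number of scalar half values.
__global__ void axpy_half2(const __half2* x, __half2* y, __half a, int n2)
{
    const int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n2) return;
    const __half2 a2 = __half2half2(a);   // broadcast a into both lanes
    y[i] = __hfma2(a2, x[i], y[i]);       // one packed FMA for two values
}

// Hypothetical host-side check: runs the kernel on a few pairs and
// compares against values that are exactly representable in half.
bool axpy_half2_ok()
{
    const int n2 = 8;
    __half2 hx[n2], hy[n2];
    for (int i = 0; i < n2; ++i) {
        hx[i] = __floats2half2_rn(1.0f, 2.0f);
        hy[i] = __floats2half2_rn(0.5f, 0.25f);
    }

    __half2 *dx = nullptr, *dy = nullptr;
    if (cudaMalloc(&dx, sizeof(hx)) != cudaSuccess) return false;
    if (cudaMalloc(&dy, sizeof(hy)) != cudaSuccess) { cudaFree(dx); return false; }
    cudaMemcpy(dx, hx, sizeof(hx), cudaMemcpyHostToDevice);
    cudaMemcpy(dy, hy, sizeof(hy), cudaMemcpyHostToDevice);

    axpy_half2<<<1, n2>>>(dx, dy, __float2half(2.0f), n2);

    const bool copied =
        cudaMemcpy(hy, dy, sizeof(hy), cudaMemcpyDeviceToHost) == cudaSuccess;
    cudaFree(dx);
    cudaFree(dy);
    if (!copied) return false;

    // Expected per pair: (2*1 + 0.5, 2*2 + 0.25) = (2.5, 4.25).
    for (int i = 0; i < n2; ++i) {
        if (__low2float(hy[i]) != 2.5f || __high2float(hy[i]) != 4.25f)
            return false;
    }
    return true;
}
```

Because each __half2 is a single 32-bit object, the question of whether the compiler packs two values into one register no longer depends on the optimizer.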

Please be aware that half and half2 have different alignment (2 vs 4 bytes), so sometimes for accessing memory the compiler cannot do all optimizations by itself without further hints.
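A small sketch of what such a hint could look like. The names aligned_for_half2 and copy_pairs are hypothetical, and the reinterpret_cast path is only valid when the address really is 4-byte aligned, which is why the sketch checks it:

```cuda
#include <cuda_fp16.h>
#include <cstdint>

// The alignment difference mentioned above, checked at compile time.
static_assert(alignof(__half)  == 2, "__half is 2-byte aligned");
static_assert(alignof(__half2) == 4, "__half2 is 4-byte aligned");

// True if p may legally be accessed as a __half2.
__host__ __device__ inline bool aligned_for_half2(const void* p)
{
    return (reinterpret_cast<uintptr_t>(p) % alignof(__half2)) == 0;
}

// Copies n half values (n assumed even), two per thread. With aligned
// base pointers each pair is moved by one 32-bit access; otherwise the
// kernel falls back to two 16-bit accesses.
__global__ void copy_pairs(const __half* in, __half* out, int n)
{
    const int i = 2 * (blockIdx.x * blockDim.x + threadIdx.x);
    if (i + 1 >= n) return;

    if (aligned_for_half2(in) && aligned_for_half2(out)) {
        // i is even, so in + i and out + i stay 4-byte aligned.
        *reinterpret_cast<__half2*>(out + i) =
            *reinterpret_cast<const __half2*>(in + i);
    } else {
        out[i]     = in[i];
        out[i + 1] = in[i + 1];
    }
}
```

Given only a __half pointer, the compiler generally cannot assume 4-byte alignment, so it has to be conservative. Declaring the buffers as __half2 in the first place, as suggested above, avoids the runtime check entirely.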

Thanks!
