I was recently checking on what the newest compute capabilities (CC) are on my test GPUs so I could compile with the latest “CC” available.
One of my GPUs has a CC of 6.1, but the NV compiler (NVFORTRAN 20.11) offers 6.0 and 6.2 with no 6.1 in between.
Why does it not have 6.1 (Quadro P-series)?
Also, comparing the list of CCs at “CUDA - Wikipedia” to the list of available targets in the NV compiler, it seems that the CCs 3.7, 5.2, 5.3, and 8.6 are also missing.
I assume it is best practice to use the largest CC less-than-or-equal to the GPU being targeted?
Am I missing out on any optimizations by using 6.0 for a 6.1 GPU?
I believe that there aren’t any architectural differences between the two (just configuration), so no, you are not missing out on any additional optimizations.
Why does it not have 6.1 (Quadro P-series)?
Since the compiler will by default create a unified binary (i.e. one that targets multiple devices), we do try to keep the number of targets to a minimum, especially when the differences between the CC versions are minor. Each extra target adds some overhead in compilation time and binary size, so including all of them would increase that overhead.
I assume it is best practice to use the largest CC less-than-or-equal to the GPU being targeted?
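For what it's worth, the rule proposed in that question (pick the largest available CC that is less than or equal to the GPU's CC) is easy to sketch. This is only an illustration of the selection logic, not a statement of what the compiler does internally. The helper names `pick_target` and `as_flag` are hypothetical, and the target list is illustrative: only 6.0 and 6.2 are confirmed in this thread, so check your compiler's documentation for the real list.

```python
from typing import Iterable, Optional, Tuple

# A compute capability as (major, minor), e.g. (6, 1) for CC 6.1.
CC = Tuple[int, int]

# Illustrative list only. 6.0 and 6.2 come from the thread; 7.0 is an
# assumption added so the example has something above the device's CC.
ILLUSTRATIVE_TARGETS = [(6, 0), (6, 2), (7, 0)]


def pick_target(device_cc: CC, available: Iterable[CC]) -> Optional[CC]:
    """Return the largest available CC that does not exceed device_cc.

    Tuples compare element by element, so (6, 0) <= (6, 1) < (6, 2).
    Returns None when every available target is newer than the device.
    """
    candidates = [cc for cc in available if cc <= device_cc]
    return max(candidates) if candidates else None


def as_flag(cc: CC) -> str:
    """Format a CC in the ccXY style, e.g. (6, 0) -> 'cc60'."""
    return f"cc{cc[0]}{cc[1]}"
```

With the list above, `pick_target((6, 1), ILLUSTRATIVE_TARGETS)` gives `(6, 0)`, which `as_flag` formats as `cc60`. That matches the situation in the original post: a 6.1 GPU ends up compiled for 6.0.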