Hello all,
could anyone explain why, in the CUDA_Occupancy_calculator, the value in B34 (MyRegsPerBlock) is obtained by rounding MyWarpsPerBlock2 to a multiple of 4
and then multiplying by 16 * MyRegCount?
Why not simply MyThreadCount * MyRegCount?
Davide.