I am using Nsight Compute to debug and optimize my kernels. Under Scheduler Statistics, I get the following message:
“Every scheduler is capable of issuing one instruction per cycle, but for this kernel each scheduler only issues an instruction every 16.4 cycles. This might leave hardware resources underutilized and may lead to less optimal performance. Out of the maximum of 16 warps per scheduler, this kernel allocates an average of 3.24 active warps per scheduler, but only an average of 0.07 warps were eligible per cycle. Eligible warps are the subset of active warps that are ready to issue their next instruction. Every cycle with no eligible warp results in no instruction being issued and the issue slot remains unused. To increase the number of eligible warps either increase the number of active warps or reduce the time the active warps are stalled.”
How do I increase the number of active warps?
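To put the numbers from the message in perspective, here is a quick sketch of the arithmetic as I read it. The variable names are my own, and the derived ratios are just my interpretation of the quoted figures, not values that Nsight Compute reports directly.

```python
# Figures copied from the Scheduler Statistics message above.
max_warps_per_scheduler = 16    # hardware limit quoted in the message
active_warps = 3.24             # average active warps per scheduler
eligible_warps = 0.07           # average eligible warps per cycle
cycles_per_issue = 16.4         # one instruction issued every 16.4 cycles

# Derived ratios (my own interpretation, not Nsight Compute output).
issue_rate = 1.0 / cycles_per_issue                       # instructions per cycle per scheduler
active_fraction = active_warps / max_warps_per_scheduler  # share of warp slots in use
eligible_fraction = eligible_warps / active_warps         # share of active warps ready to issue

print(f"issue rate:        {issue_rate:.3f} instructions/cycle")
print(f"active fraction:   {active_fraction:.1%} of warp slots")
print(f"eligible fraction: {eligible_fraction:.1%} of active warps")
```

If this reading is right, only about a fifth of the warp slots per scheduler are occupied, and only a small share of those warps is ready to issue in a given cycle, so both levers named in the message (more active warps, fewer stalls) appear to apply here.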