I know that each thread block is assigned to an SM, where it is split into warps that the SM’s warp schedulers pick from when issuing instructions.
My questions are:
a) Does the order in which warps are scheduled stay the same on every run of the same kernel?
For example, if warp0, warp1, warp2 and warp3 are co-located on SM0 and they run in the order warp0, warp3, warp2, warp0, warp3, warp1…, will that order always be the same when running this kernel?
b) Additionally, do the IDs of the warps that are co-located on a specific SM affect the kernel’s performance?
Let’s say SM0 is occupied by warp0, warp1, warp2 and warp3. Would the performance be different if SM0 were occupied by warp0, warp1, warp9 and warp10 instead? In both cases 4 warps occupy SM0, but does the ID of each warp (and consequently the data that each warp accesses) affect the performance of the warp scheduler and of the kernel as a whole?
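To make the question concrete, here is a minimal sketch of how I imagine one could record which SM each warp lands on and compare the result across runs. It assumes the PTX special register `%smid` is read through inline assembly. The names `read_smid`, `WarpRecord` and `record_placement` are ones I made up for illustration, and the launch sizes are arbitrary.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: reads the PTX special register %smid, i.e. the SM
// the calling thread is running on at the moment of the read.
__device__ unsigned int read_smid() {
    unsigned int id;
    asm volatile("mov.u32 %0, %%smid;" : "=r"(id));
    return id;
}

// Hypothetical record type: one entry per warp.
struct WarpRecord {
    unsigned int block;          // blockIdx.x of the warp's block
    unsigned int warp_in_block;  // warp index within the block
    unsigned int smid;           // SM observed by lane 0
    long long    start_clock;    // per-SM cycle counter at kernel entry
};

__global__ void record_placement(WarpRecord *out) {
    // Lane 0 of every warp writes that warp's record.
    if (threadIdx.x % warpSize == 0) {
        unsigned int warps_per_block = (blockDim.x + warpSize - 1) / warpSize;
        unsigned int warp_in_block   = threadIdx.x / warpSize;
        unsigned int idx = blockIdx.x * warps_per_block + warp_in_block;
        out[idx].block         = blockIdx.x;
        out[idx].warp_in_block = warp_in_block;
        out[idx].smid          = read_smid();
        // clock64() counts cycles per SM, so values are only comparable
        // between warps that report the same smid.
        out[idx].start_clock   = clock64();
    }
}

int main() {
    const int blocks = 8, threads = 128;          // arbitrary sizes
    const int warps  = blocks * (threads / 32);   // assumes warp size 32

    WarpRecord *d_out = nullptr;
    if (cudaMalloc(&d_out, warps * sizeof(WarpRecord)) != cudaSuccess) return 1;

    record_placement<<<blocks, threads>>>(d_out);

    std::vector<WarpRecord> h(warps);
    // cudaMemcpy on the default stream waits for the kernel to finish.
    if (cudaMemcpy(h.data(), d_out, warps * sizeof(WarpRecord),
                   cudaMemcpyDeviceToHost) != cudaSuccess) return 1;

    for (const WarpRecord &r : h)
        printf("block %u warp %u -> SM %u, start clock %lld\n",
               r.block, r.warp_in_block, r.smid, r.start_clock);

    cudaFree(d_out);
    return 0;
}
```

Running this several times and comparing the printed SM IDs and start clocks would show whether the placement and start order repeat on a given device. It only captures when each warp starts, not the instruction-by-instruction issue order I describe in a).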
Thank you in advance!