Total Register Usage WarpAllocationGranularity & AllocationSize

According to the occupancy calculator, the total kernel register usage is defined as:

Registers = CEILING(CEILING(MyWarpsPerBlock,myWarpAllocationGranularity)*MyRegCount*32,myAllocationSize)

I’d have thought it should simply be:

Registers = MyWarpsPerBlock*32*MyRegCount
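
To make the gap between the two formulas concrete, here is a small sketch that evaluates both. It assumes the spreadsheet's CEILING(x, m) rounds x up to the nearest multiple of m, and it uses the values from my questions (granularity 2, allocation size 512) as defaults; those may differ by compute capability. The function names are mine, not the calculator's.

```python
def ceiling(value, multiple):
    """Round a positive integer up to the nearest multiple (assumed CEILING behaviour)."""
    return -(-value // multiple) * multiple


def registers_calculator(warps_per_block, reg_count,
                         warp_alloc_granularity=2, alloc_size=512):
    """Register usage per block following the occupancy calculator formula."""
    # Warps are first rounded up to the warp allocation granularity...
    warps = ceiling(warps_per_block, warp_alloc_granularity)
    # ...then the register total is rounded up to the allocation size.
    return ceiling(warps * reg_count * 32, alloc_size)


def registers_naive(warps_per_block, reg_count):
    """The simple product I expected: warps * 32 threads * registers per thread."""
    return warps_per_block * 32 * reg_count


if __name__ == "__main__":
    # 3 warps per block, 10 registers per thread
    print(registers_naive(3, 10))       # 960
    print(registers_calculator(3, 10))  # 1536
```

With 3 warps and 10 registers per thread, the warp count is rounded up to 4, giving 4 * 10 * 32 = 1280, which is then rounded up to 1536. When the warp count is even and the product is already a multiple of 512 (for example 4 warps with 16 registers), both formulas agree.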

As such, I have 2 questions:

1. What is ‘warp allocation granularity’ and why is it 2?

2. What is ‘allocation size’? Why does the register usage need to be a multiple of 512?