- Even if a sequence of instructions has no dependencies, there’s only one dispatch unit. Is one dispatch unit capable of issuing multiple instructions in the same cycle to achieve ILP? Or are there other methods for this?
Instruction pipelining is a form of instruction-level parallelism (ILP): a single dispatch unit can issue one independent instruction per cycle while earlier instructions are still executing. Multi-instruction dispatch is not a requirement for ILP.
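To make the point concrete, here is a minimal sketch of a toy single-issue, pipelined core. It is not a model of any real GPU: the function name `cycles_single_issue`, the fixed `latency`, and the "lowest-numbered ready chain first" policy are all assumptions made for illustration. It only shows that independent instructions overlap in the pipeline even though at most one instruction is issued per cycle.

```python
def cycles_single_issue(num_chains, ops_per_chain, latency):
    """Cycles a toy single-issue pipelined core needs to finish all ops.

    Each chain is a sequence of dependent ops; ops in different chains are
    independent. At most ONE op is issued per cycle (no multi-issue).
    An op can issue only after its predecessor in the same chain completes.
    All parameters and the scheduling policy are hypothetical.
    """
    ready_at = [0] * num_chains            # earliest issue cycle per chain
    remaining = [ops_per_chain] * num_chains
    cycle = 0
    finish = 0
    while any(remaining):
        # Issue at most one op this cycle: the first chain that is ready.
        for c in range(num_chains):
            if remaining[c] and ready_at[c] <= cycle:
                remaining[c] -= 1
                ready_at[c] = cycle + latency   # result available here
                finish = max(finish, ready_at[c])
                break
        cycle += 1
    return finish


# Same 8 ops, same latency, same single dispatch slot per cycle:
print(cycles_single_issue(num_chains=1, ops_per_chain=8, latency=4))  # 32
print(cycles_single_issue(num_chains=4, ops_per_chain=2, latency=4))  # 11
```

With one dependent chain the dispatch slot sits idle while each result is pending; with four independent chains the same slot is used almost every cycle. The speedup comes purely from pipelining, not from issuing more than one instruction per cycle.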
- So a load/store instruction is first issued to this shared unit and then executed in the sub-partition’s LSU? Is it possible for a memory instruction from sub-partition 1 to be executed in sub-partition 2?
The instruction is issued to an instruction queue (along with its registers and constants) and executed on the shared SM-level execution unit. In the logical SM model in the whitepaper, the LDST boxes should be drawn closer to the Tex boxes. LSU and TEX are shared execution units that are time-sliced between sub-partitions.
Is it possible for a memory instruction from sub-partition 1 to be executed in sub-partition 2?
No. The LSU is an SM-level shared execution unit, not a per-sub-partition one.
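The queue-plus-shared-unit arrangement described above can be sketched as a toy model. Everything here is an assumption for illustration: the function name `run_shared_lsu`, one serviced instruction per cycle, and in particular the round-robin arbitration, since the thread does not say how the hardware actually picks between sub-partitions.

```python
from collections import deque


def run_shared_lsu(queues):
    """Service per-sub-partition instruction queues on ONE shared unit.

    queues: one list of instruction labels per sub-partition.
    Returns (cycle, sub_partition, instruction) tuples in service order.
    Round-robin arbitration is a hypothetical policy, not the real one.
    """
    pending = [deque(q) for q in queues]
    n = len(pending)
    trace = []
    cycle = 0
    next_sp = 0                              # round-robin pointer
    while any(pending):
        # The single shared unit takes at most one instruction per cycle.
        for offset in range(n):
            sp = (next_sp + offset) % n
            if pending[sp]:
                trace.append((cycle, sp, pending[sp].popleft()))
                next_sp = (sp + 1) % n
                break
        cycle += 1
    return trace


# Four sub-partitions; sub-partition 2 has no memory instructions queued.
for entry in run_shared_lsu([["ld a", "st b"], ["ld c"], [], ["ld d"]]):
    print(entry)
# (0, 0, 'ld a')
# (1, 1, 'ld c')
# (2, 3, 'ld d')
# (3, 0, 'st b')
```

In this model there is no "sub-partition 2's LSU" for an instruction to land in: every instruction keeps the index of the sub-partition that issued it and is serviced by the one shared unit when its turn comes.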