SIMD Versus SIMT: What is the difference between SIMT and SIMD?

Thanks for the links. I did not realize that Gen6 was using per-channel IPs (instruction pointers), or that this actually works.

My understanding of Larrabee is that it was expected to use a stack-based (or counter-based) technique in software. Its ISA makes it easy to allocate a 16-bit register for each predicate mask of the stack, and use conditional jump instructions to bypass if-else or else-endif blocks when the mask is all-zero.
Since these operations are performed in the scalar x86 portion of the core, which should otherwise sit idle most of the time when running SIMT code, the performance impact may not be significant.
Also, if the compiler is good enough, it can detect uniform branches and highly-divergent branches, and generate more efficient code for these cases.
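
To make the mask-stack idea a bit more concrete, here is a rough Python model of how I picture it. This is only a sketch under my own assumptions, not actual Larrabee code: the names (`simt_if_else`, `masked_apply`, `WIDTH`) are made up, each mask is modeled as a 16-bit integer, and the plain Python `if active == 0` tests stand in for the scalar conditional jumps that bypass a block when no lane is active.

```python
WIDTH = 16                 # lanes per vector, one mask bit per lane (assumed)
FULL = (1 << WIDTH) - 1    # mask with every lane active


def masked_apply(active, src, dst, fn):
    """Model of a predicated vector op: only lanes set in `active` are written."""
    for lane in range(WIDTH):
        if (active >> lane) & 1:
            dst[lane] = fn(src[lane])


def simt_if_else(x):
    """Per lane: y = x // 2 if x is even, else y = 3 * x + 1.

    Returns (y, skipped), where `skipped` lists the blocks that were
    jumped over because their mask was all-zero.
    """
    assert len(x) == WIDTH
    y = [0] * WIDTH
    skipped = []
    mask_stack = []        # software stack of saved predicate masks
    active = FULL

    # Vector compare: produce one predicate bit per lane.
    cond = 0
    for lane in range(WIDTH):
        if x[lane] % 2 == 0:
            cond |= 1 << lane

    # "if": save the current mask, then restrict it to lanes where cond holds.
    mask_stack.append(active)
    active &= cond
    if active == 0:
        skipped.append("then")     # scalar jump over the if-else block
    else:
        masked_apply(active, x, y, lambda v: v // 2)

    # "else": lanes that were active at the "if" but did not take "then".
    active = mask_stack[-1] & ~cond
    if active == 0:
        skipped.append("else")     # scalar jump over the else-endif block
    else:
        masked_apply(active, x, y, lambda v: 3 * v + 1)

    # "endif": restore the mask saved at the "if".
    active = mask_stack.pop()
    return y, skipped
```

With a uniform input such as `[2] * 16` the else block is skipped entirely, while a mixed input like `list(range(16))` runs both blocks under complementary masks, which is the divergence cost being discussed.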

I think Andy’s main point is that we should explore tradeoffs between SIMD and MIMD: try to achieve close-to-MIMD performance on irregular codes, and close-to-SIMD power efficiency on regular codes (and all variations in between)…