I don’t think this is well documented anywhere, but I believe my previous comment in this thread may contain significant inaccuracies, so I would like to revise it. To preserve the record, I will write my revision below.
The GK210 whitepaper contains this statement:
“Each of the Kepler GK110/210 SMX units feature 192 single-precision CUDA cores, and each core has fully pipelined floating-point and integer arithmetic logic units.”
That is its only description of integer processing.
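The CUDA programming guide's arithmetic throughput table reports a throughput of 160 for integer add and 32 for integer multiply (results per clock cycle per SM). This doesn’t quite line up with the statement in the whitepaper, which would suggest 192 for both. However, I think both are probably correct (160 + 32 = 192 may be significant).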
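To make the arithmetic concrete, here is a small sketch in Python that works only from the figures quoted above. The helper names are mine, and the units (results per clock cycle per SM) and the warp size of 32 are assumptions I am supplying. It checks the 160 + 32 = 192 observation and shows what those rates would mean for a single warp-wide instruction at peak rate. It does not prove anything about how the hardware is actually partitioned.

```python
# Per-SM figures quoted above for Kepler GK110/210.
# Assumption: throughput is in results per clock cycle per SM.
CUDA_CORES_PER_SMX = 192   # from the whitepaper quote
INT_ADD_PER_CLOCK = 160    # reported integer add throughput
INT_MUL_PER_CLOCK = 32     # reported integer multiply throughput
WARP_SIZE = 32             # threads per warp (assumed, standard on NVIDIA GPUs)


def clocks_per_warp_instruction(results_per_clock: int) -> float:
    """Clocks one SM needs, at peak rate, to produce one warp's worth of results."""
    return WARP_SIZE / results_per_clock


def fraction_of_cores(results_per_clock: int) -> float:
    """Throughput expressed as a fraction of the 192 'cores' per SMX."""
    return results_per_clock / CUDA_CORES_PER_SMX


if __name__ == "__main__":
    total = INT_ADD_PER_CLOCK + INT_MUL_PER_CLOCK
    print("add + multiply throughput:", total)
    print("matches core count:", total == CUDA_CORES_PER_SMX)
    print("clocks per warp, integer add:",
          clocks_per_warp_instruction(INT_ADD_PER_CLOCK))
    print("clocks per warp, integer multiply:",
          clocks_per_warp_instruction(INT_MUL_PER_CLOCK))
    print("integer add as a fraction of cores:",
          fraction_of_cores(INT_ADD_PER_CLOCK))
```

Read this way, integer multiply would run at one fifth of the integer add rate, which is the kind of gap you would want to confirm with a microbenchmark before relying on it.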
Revision to previous statement:
The GPU SM has a large collection of functional units, and different functional units service different types of instructions. What you are calling compute units, more commonly called “cores”, are typically single-precision floating-point units. However, as noted above, in the case of Kepler they service integer instructions as well as floating-point instructions such as FADD, FMUL, and FFMA. (There are different functional units, in different quantities, for 64-bit floating-point arithmetic.)
I apologize for the previous error(s).
I also acknowledge that this does not address the questions raised in the most recent prior post in this thread.