According to the instruction set reference, integer multiply-add (IMAD) has been part of the instruction set since the beginning (compute capability 1.x). A multiply-add performs the same as a plain multiply. When you disassemble a kernel (cuobjdump -sass) you will see a lot of IMAD instructions. Apparently the compiler is already smart enough to generate these IMAD instructions on its own, so there is no need for intrinsics here.
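To make that concrete, here is a minimal sketch of the kind of expression I mean: a plain multiply followed by an add, written with ordinary operators. The helper name flat_index and the HD macro are my own, for illustration only. Whether a given build actually emits IMAD for it is something you would have to confirm in the cuobjdump -sass output.

```cpp
#include <cstdint>

// Lets the same source build as plain C++ or under nvcc (HD is a made-up macro name).
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

// Hypothetical helper: a multiply-add written with ordinary operators.
// No intrinsic is used; fusing the two operations is left to the compiler.
HD inline int32_t flat_index(int32_t block, int32_t block_dim, int32_t thread)
{
    return block * block_dim + thread;
}
```

For example, flat_index(3, 256, 17) gives 785.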
For those interested, a list of integer intrinsics can be found here. I’m not sure how all of these intrinsics map to actual instructions, but some, like __sad(), map to a single instruction.
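As far as I read the CUDA math API documentation, __sad(x, y, z) computes |x - y| + z. The sketch below is a portable reference for that arithmetic, not the intrinsic itself. The name sad_ref and the HD macro are hypothetical, and the unsigned-difference trick is my own choice to avoid signed overflow.

```cpp
#include <cstdint>

// Lets the same source build as plain C++ or under nvcc (HD is a made-up macro name).
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

// Hypothetical reference for a sum of absolute differences: |x - y| + z.
HD inline uint32_t sad_ref(int32_t x, int32_t y, uint32_t z)
{
    // Subtract in unsigned arithmetic so a large |x - y| cannot overflow a signed int.
    const uint32_t ux = static_cast<uint32_t>(x);
    const uint32_t uy = static_cast<uint32_t>(y);
    const uint32_t diff = (x > y) ? (ux - uy) : (uy - ux);
    return diff + z;
}
```

In device code you would call __sad() directly. A reference like this is mostly useful for checking results on the host.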
For more integer performance there are also the SIMD Video Instructions, which are best described in the PTX documentation. However, according to SPWorley’s forum post, these instructions are no longer natively supported on Maxwell.
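To give an idea of what a per-byte SIMD operation does, here is a minimal sketch of a four-lane byte add built from ordinary 32-bit integer operations (the usual SWAR trick). It assumes wrap-around, non-saturating addition on unsigned bytes. The name add4_ref and the HD macro are hypothetical, and I am not claiming this is the code the toolchain generates when the native instruction is missing.

```cpp
#include <cstdint>

// Lets the same source build as plain C++ or under nvcc (HD is a made-up macro name).
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

// Hypothetical reference: add four packed unsigned bytes lane by lane,
// each lane wrapping modulo 256, with no carry leaking into the next lane.
HD inline uint32_t add4_ref(uint32_t a, uint32_t b)
{
    // Add only the low 7 bits of every byte, so a carry can reach bit 7
    // of its own lane but never the neighbouring lane.
    const uint32_t low = (a & 0x7F7F7F7Fu) + (b & 0x7F7F7F7Fu);
    // Bit 7 of each lane is (a7 XOR b7 XOR carry-in); the carry-in is already in low.
    return low ^ ((a ^ b) & 0x80808080u);
}
```

For example, add4_ref(0x01FF7F10, 0x01018001) gives 0x0200FF11: the 0xFF + 0x01 lane wraps to 0x00 without disturbing the lane above it.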