profiler instruction count

Hi,

I am new to the CUDA profiler, and the documentation is as minimal as it can be. I am trying to make sense of the instruction counter.
From previous posts I gathered the following and would like confirmation:

  1. The profiler data is collected for one multiprocessor only and is not representative of the entire processor.

  2. Instruction count is for every block.

  3. I would also like to know whether
    instruction count = integer ops + floating-point ops?

  4. Is
    instruction count = arithmetic ops
    or
    instruction count = arithmetic ops + branches + loads + stores + … ?

Thanks!!!