With NVBit, the total number of miscellaneous instructions, classified as described in the reference, comes out much lower than what nvprof reports.
Warp level: BAR.SYNC = 123816 NOP = 123816 S2R = 3216
Thread level: BAR.SYNC = 3962112 NOP = 3962112 S2R = 102912
I picked these instructions to count based on the reference.
But the number nvprof reports is much larger:
==40910== Metric result:
Invocations    Metric Name    Metric Description    Min           Max           Avg
Device "TITAN V (0)"
    Kernel: gen_hists(unsigned long*, float*, float*, float*, int, int)
          1    inst_misc      Misc Instructions     1.5831e+11    1.5831e+11    1.5831e+11
I know that nvprof counts at thread level, but as you can see, both the warp-level and the thread-level values from NVBit are far lower than the nvprof value.
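To put a number on the gap, here is a quick sanity check in Python. It is only arithmetic on the figures quoted above, and it assumes that the first row of NVBit counts is warp level and the second row is thread level. The helper name `scales_by_warp_size` is made up for this sketch.

```python
# Counts copied from the NVBit output quoted above.
# Assumption: the first row is warp level and the second row is thread level.
WARP_SIZE = 32

warp_level = {"BAR.SYNC": 123816, "NOP": 123816, "S2R": 3216}
thread_level = {"BAR.SYNC": 3962112, "NOP": 3962112, "S2R": 102912}

# inst_misc as printed by nvprof for gen_hists (nvprof rounds it to 5 digits).
nvprof_inst_misc = 1.5831e11


def scales_by_warp_size(warp, thread, warp_size=WARP_SIZE):
    """True if every thread-level count is warp_size times the warp-level count."""
    return all(thread[op] == warp[op] * warp_size for op in warp)


nvbit_thread_total = sum(thread_level.values())
ratio = nvprof_inst_misc / nvbit_thread_total

print("thread == 32 * warp for every opcode:",
      scales_by_warp_size(warp_level, thread_level))
print("NVBit thread-level total:", nvbit_thread_total)
print(f"nvprof inst_misc / NVBit thread-level total: {ratio:.0f}x")
```

Each thread-level count is exactly 32 times its warp-level count, so partially active warps do not seem to explain the difference. The remaining gap is about four orders of magnitude, which looks more like a difference in which instructions are counted than a warp-versus-thread scaling issue.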
Any idea which instruction types nvprof counts as MISC?