I profile my program with nvprof:
nvprof --csv --events inst_executed BINARY
If my kernel runs too long, inst_executed overflows, and nvprof normally gives a warning. But when I scale my kernel runtime up step by step, the counter overflows without nvprof giving any warning. I know that my kernel scales linearly in both time and instruction count.
- kernel runtime 396 ms, inst_executed = 8533701716
- kernel runtime 2060 ms, inst_executed = 41264789384
- kernel runtime 3150 ms, inst_executed = 5969758844 (overflow, no warning)
- kernel runtime 4760 ms, inst_executed = 37530017538 (more than double the runtime of run 2, but not double the instruction count, and no warning)
At the moment I have no minimal example; all runs were done with our big simulation. Could it be that inst_executed is only 36 bits wide and CUPTI does not detect small overflows? If I run my kernel for 16000 ms, nvprof does give me a warning.
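To make the 36-bit guess easier to check against the numbers above, here is a small Python sketch. It rests on two assumptions that are not confirmed by anything in nvprof or CUPTI: that the instruction count really scales linearly with runtime, and that an overflowed counter simply wraps modulo 2^bits. The helper names (`rate_per_ms`, `unwrap`) and the candidate widths are made up for illustration. It takes the rate from the first two runs as a reference and, for each candidate width, adds the multiple of 2^bits that brings the reported value closest to the linear expectation.

```python
# Measurements from the runs listed above:
# (kernel runtime in ms, inst_executed as reported by nvprof)
RUNS = [
    (396, 8533701716),
    (2060, 41264789384),
    (3150, 5969758844),
    (4760, 37530017538),
]


def rate_per_ms(runtime_ms, count):
    """Instructions per millisecond for one run."""
    return count / runtime_ms


def unwrap(reported, expected, bits):
    """Assume the counter wrapped modulo 2**bits (an unconfirmed hypothesis).

    Returns (candidate true value, number of assumed wraps), choosing the
    number of wraps that brings `reported` closest to `expected`.
    """
    modulus = 1 << bits
    wraps = max(0, round((expected - reported) / modulus))
    return reported + wraps * modulus, wraps


if __name__ == "__main__":
    # Reference rate: average of the two short runs, which look plausible.
    reference = [rate_per_ms(t, c) for t, c in RUNS[:2]]
    rate = sum(reference) / len(reference)

    for runtime_ms, reported in RUNS:
        expected = rate * runtime_ms
        print(f"{runtime_ms} ms: reported={reported}, linear estimate={expected:.0f}")
        for bits in (32, 36, 40):  # candidate counter widths, purely guesses
            value, wraps = unwrap(reported, expected, bits)
            print(f"  {bits}-bit wrap: {wraps} wrap(s) -> {value} "
                  f"({value / runtime_ms:.0f} inst/ms)")
```

If one candidate width gives per-millisecond rates for the 3150 ms and 4760 ms runs that are close to the rate of the first two runs, that would support the guess; it would still not prove how the counter is implemented.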