I wanted some clarification on nvprof and power readings.
Assume my CUDA binary is foocode, run with a command-line argument of 10000.
I run the following:
nvprof --csv --system-profiling on --devices 0 --log-file human-readable-output_A.log ./foocode 10000
In the file human-readable-output_A.log I then get:
==1202776== System profiling result:
"Power (mW)","Tesla K40m (0)",4,64493.250000,63287.000000,68015.000000
Is the Avg reading of 64493.250000 the mean of 4 samples of the TOTAL POWER, taken over the whole execution time of the code?
I just want to confirm this.
My second question: if it is indeed the mean of 4 samples of the TOTAL POWER over the whole execution time, does it make sense to run my code, say, 3 times and then take the average of the 3 Avg readings?
I am asking because I plan to run my code 3 times and average the execution time, so I am wondering whether it makes sense to do the same for the Avg power reading. If it is possible to set the sample count to 1, would doing that and then averaging the power readings across runs be a better idea?
Please note I am calculating the kernel execution time with the CUDA timers.
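To make the averaging concrete, here is a minimal Python sketch of what I have in mind. It rests on assumptions I have not confirmed: that the columns after the device name are Count, Avg, Min, Max (which is how I read the row above), and that the log file uses plain double quotes. The helper names `parse_power_row` and `combine_runs` are my own, not part of nvprof. It computes both the plain mean of the per-run Avg values and a mean weighted by each run's sample count; the two only differ when the runs have different sample counts.

```python
import csv
import io


def parse_power_row(log_text):
    """Return (count, avg_mw, min_mw, max_mw) from the first "Power (mW)" row.

    Assumed column order: name, device, count, avg, min, max.
    """
    for row in csv.reader(io.StringIO(log_text)):
        if row and row[0].strip() == "Power (mW)":
            return int(row[2]), float(row[3]), float(row[4]), float(row[5])
    raise ValueError("no Power (mW) row found")


def combine_runs(runs):
    """Combine per-run results given as (count, avg_mw) pairs.

    Returns (simple_mean, weighted_mean):
      simple_mean   - plain mean of the per-run Avg values
      weighted_mean - each Avg weighted by its sample count, i.e. the
                      mean over all individual samples from all runs
    """
    simple_mean = sum(avg for _, avg in runs) / len(runs)
    total_samples = sum(count for count, _ in runs)
    weighted_mean = sum(count * avg for count, avg in runs) / total_samples
    return simple_mean, weighted_mean


# The row from my log above, used only to exercise the parser.
sample_log = '''==1202776== System profiling result:
"Power (mW)","Tesla K40m (0)",4,64493.250000,63287.000000,68015.000000
'''

count, avg_mw, min_mw, max_mw = parse_power_row(sample_log)
print(count, avg_mw, min_mw, max_mw)
```

For three runs I would read each log file, call `parse_power_row` on its contents, and pass the resulting `(count, avg_mw)` pairs to `combine_runs`.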
Thanks and Regards,