I am trying to collect metrics on my kernel and keep getting a warning when I try to run several of the experiments. The warning reads:
“The following input counter for this experiment have overflowed. Please treat the presented with caution. To avoid this problem consider profiling this kernel with a subset of the kernel grid only.
Zero Eligible Warps”
I can post a picture of the Nsight report if that makes things clearer. My question is: should I shrink the size of the kernel being executed, or is there some option in Nsight to sample only a subset of the kernel grid? What is the root cause of this warning, and can I still trust the results in the report?
Well, it’s kind of self-explanatory. The data type that stores the input counts overflowed. I’d imagine it then lists what each counter was for (warp info). If it was an int or something, the count went past INT_MAX.
You might have to reduce the size of the data you’re processing, which would ideally reduce the grid size and, with it, the number of warp cycles being counted.
Granted, I’m kind of a newb here, so I feel like anything I say needs a disclaimer, but that’s what I got from the warning.
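To make the data-size point concrete, here is a rough sketch in Python of how the grid and warp counts scale with the input. It assumes a 1D launch with one thread per element, 256 threads per block, and a warp size of 32; the function name `launch_shape` and those launch parameters are made up for illustration and are not taken from the kernel in question.

```python
import math

def launch_shape(n_elements, threads_per_block=256, warp_size=32):
    """Return (blocks, warps) for a hypothetical 1D launch, one thread per element."""
    # Enough blocks to cover every element (ceiling division).
    blocks = math.ceil(n_elements / threads_per_block)
    # Each block is split into warps of warp_size threads.
    warps_per_block = math.ceil(threads_per_block / warp_size)
    return blocks, blocks * warps_per_block

full = launch_shape(1 << 24)    # full problem size (assumed)
subset = launch_shape(1 << 20)  # 1/16 of the data
print("full grid:  ", full)
print("subset grid:", subset)
```

Under these assumptions, cutting the data by 16x cuts the number of warps by 16x as well, so any counter that accumulates per-warp events should grow correspondingly more slowly.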
The only available workaround is to shorten the kernel’s run time to less than a second or so, which avoids the counter overflows.
The root cause of these “counter overflowed” warnings is that the GPU’s designers provisioned too few bits for the hardware performance counter registers. In older GPUs these counters were only 32 bits wide; in the latest ones I think they are 40 bits wide. Some of the events being counted can increment a counter by more than one unit per clock, so a counter can fill up faster than the clock rate alone would suggest.
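As a back-of-the-envelope check on the "less than a second or so" figure, the time until a counter wraps is roughly 2^bits / (clock rate × increments per clock). The sketch below is in Python; the 1 GHz clock and the increments-per-clock values are assumptions picked for illustration, not measured numbers, and the 32-bit and 40-bit widths are the ones mentioned above.

```python
def seconds_until_overflow(counter_bits, clock_hz, increments_per_clock):
    """Rough time for a counter of the given width to wrap around."""
    return (2 ** counter_bits) / (clock_hz * increments_per_clock)

CLOCK_HZ = 1e9  # assumed 1 GHz core clock, for illustration only

for bits in (32, 40):
    for inc in (1, 4, 32):  # hypothetical increments per clock
        t = seconds_until_overflow(bits, CLOCK_HZ, inc)
        print(f"{bits}-bit counter, {inc:2d} increments/clock: {t:10.3f} s")
```

With these assumed numbers, a 32-bit counter incrementing once per clock lasts a little over 4 seconds, and one incrementing several times per clock wraps in about a second or less, which is consistent with the advice to keep the kernel short.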