I am currently optimizing a kernel, but I have an issue with the gld_efficiency reported by nvvp or nvprof.
When I run the profiler I get a gld_efficiency of 15%, which is very low, but when I look at the global memory access pattern in nvvp, the profiler tells me that “No issue has been found”.
I checked the code and I cannot find any uncoalesced memory read access.
Is it possible to have a very low gld_efficiency but “no issue” in the global memory access pattern?
I am using an NVIDIA Quadro K4200 (CC 3.0) with CUDA 7.5.
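For context, here is a minimal host-side sketch of one way to reason about this metric. It rests on assumptions that are not confirmed by the profiler output: that a warp-wide load is served in 32-byte segments, and that efficiency is the bytes requested by the threads divided by the bytes actually transferred. The helper `modelLoadEfficiency` and its parameters are hypothetical names used only for illustration. It is not the profiler's exact computation.

```cpp
#include <cstddef>
#include <set>

// Assumption: one warp has 32 threads and loads are served in 32-byte segments.
constexpr int kWarpSize = 32;
constexpr std::size_t kSegmentBytes = 32;

// Hypothetical helper: modelled efficiency (in percent) of one warp-wide load
// where thread t reads elemBytes bytes at byte offset
// base + t * strideElems * elemBytes.
double modelLoadEfficiency(std::size_t base, std::size_t strideElems,
                           std::size_t elemBytes) {
    std::set<std::size_t> segments;  // distinct 32-byte segments touched by the warp
    for (int t = 0; t < kWarpSize; ++t) {
        std::size_t first = base + t * strideElems * elemBytes;
        std::size_t last = first + elemBytes - 1;
        for (std::size_t s = first / kSegmentBytes; s <= last / kSegmentBytes; ++s) {
            segments.insert(s);
        }
    }
    double requested = static_cast<double>(kWarpSize * elemBytes);
    double transferred = static_cast<double>(segments.size() * kSegmentBytes);
    return 100.0 * requested / transferred;
}
```

In this simplified model, `modelLoadEfficiency(0, 1, 4)` (consecutive, aligned 4-byte reads) gives 100, `modelLoadEfficiency(4, 1, 4)` (same pattern, misaligned by 4 bytes) gives 80, and `modelLoadEfficiency(0, 8, 4)` (a stride of 8 floats) gives 12.5.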
Thank you very much for your help.