I am trying to profile the global memory accesses in some code.
While the collected metrics are great, I would like more in-depth data and wondered whether that is possible.
For instance, I saw that a function took X amount of time and had a certain read and write bandwidth.
Is it possible to know how much of the run time was spent on reads from memory versus writes to memory? Can we know which happened first, or whether they happened at the same time?
In other words, can I tell whether the bandwidth is at its peak value for only part of the time, or whether it is sustained for the function’s entire runtime?
Also, I saw that I can view memory coalescing. Are there other metrics that can indicate the addresses/strides accessed by the code?
Thanks in advance