I am looking for a way to measure the time spent on device memory accesses, and on specific parts of my kernel code in general. I have found some ways to do it, such as creating CPU timers or GPU timers and cutting my kernel down to only the parts that I want to measure. However, there should be a more precise way to do this from inside the kernel.
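For context, this is roughly what I mean by the GPU-timer approach. It is only a minimal sketch, assuming CUDA and its event API; `copyKernel` and the array size are hypothetical placeholders, not my real code. It times the whole kernel launch from the host, which is why it cannot isolate a single memory access inside the kernel.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel standing in for the code being measured:
// each thread copies one float from one device buffer to another.
__global__ void copyKernel(const float *in, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = in[i];
    }
}

int main()
{
    const int n = 1 << 20;  // illustrative size (assumption)
    float *in = nullptr;
    float *out = nullptr;
    cudaMalloc(&in, n * sizeof(float));
    cudaMalloc(&out, n * sizeof(float));
    cudaMemset(in, 0, n * sizeof(float));

    // GPU timers: events are recorded in the stream around the launch.
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    cudaEventRecord(start);
    copyKernel<<<blocks, threads>>>(in, out, n);
    cudaEventRecord(stop);

    // Wait until the stop event has actually been reached on the GPU.
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);  // elapsed time in milliseconds
    printf("kernel time: %.3f ms\n", ms);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(in);
    cudaFree(out);
    return 0;
}
```

To time one part with this approach, I have to strip the kernel down to that part and rerun it, which changes what the compiler generates and is the reason I am asking for something finer-grained.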
Any ideas on how to time code from inside the kernel itself?
Thank you very much.