Bandwidth

Hi,
My algorithm is memory bound. In this excellent article:
http://www.astrogpu.org/talks/NVIDIA/Astro…tion.Harris.pdf
the author measures the bandwidth of the algorithm implemented in CUDA.
How do I do this? Do I just calculate the amount of data read? Is there
an automatic tool that does this?

thanks
eyal

Pretty sure he knows a priori how much data each kernel reads and writes, and computes the bandwidth from that total divided by the measured run time. There’s no tool to do this automatically yet.
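
If it helps, here is a rough sketch of that calculation in plain C++. It assumes you have already timed the kernel yourself (for example with cudaEventRecord and cudaEventElapsedTime, which reports milliseconds) and that you count the bytes by hand. The function names and the "reads n floats, writes n floats" kernel are made up for illustration; they are not from the article.

```cpp
#include <cassert>
#include <cstddef>

// Effective bandwidth in GB/s (1 GB = 1e9 bytes), computed from byte
// counts you work out by hand and a kernel time you measured yourself.
double effective_bandwidth_gbs(std::size_t bytes_read,
                               std::size_t bytes_written,
                               double elapsed_ms) {
    assert(elapsed_ms > 0.0);  // a zero or negative time means the timing is broken
    const double total_bytes =
        static_cast<double>(bytes_read) + static_cast<double>(bytes_written);
    const double seconds = elapsed_ms / 1000.0;
    return total_bytes / 1e9 / seconds;
}

// Hypothetical case: a kernel that reads n floats and writes n floats
// (e.g. a simple copy or scale), so reads and writes are the same size.
double copy_kernel_bandwidth_gbs(std::size_t n, double elapsed_ms) {
    const std::size_t bytes = n * sizeof(float);
    return effective_bandwidth_gbs(bytes, bytes, elapsed_ms);
}
```

You would then compare the result against the theoretical peak of your card to see how close to memory bound the kernel really is. Keep in mind this counts only the data the algorithm needs to move, not what the hardware actually transfers, so uncoalesced accesses will show up as a low number rather than being accounted for.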