Hi,
My algorithm is memory bound. In this excellent article:
[url="http://www.astrogpu.org/talks/NVIDIA/AstroGPU.4.Optimization.Harris.pdf"]http://www.astrogpu.org/talks/NVIDIA/Astro...tion.Harris.pdf[/url]
the author measures the bandwidth achieved by the algorithm implemented in CUDA.
How do I do this? Do I just calculate the amount of data read, or is there
something automatic that does this?
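
To make the question concrete, here is roughly what I imagine the manual approach would look like. It is only a sketch: the copy kernel, the array size, and the assumption that every element is read once and written once are mine, not taken from the article. It times one launch with CUDA events and divides the bytes moved by the elapsed time. Error checking is left out for brevity.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for the real kernel:
// reads one float and writes one float per element.
__global__ void copyKernel(const float* in, float* out, size_t n)
{
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = in[i];
    }
}

int main()
{
    const size_t n = (size_t)1 << 22;        // 4M floats, arbitrary size
    const size_t bytes = n * sizeof(float);

    float* d_in = nullptr;
    float* d_out = nullptr;
    cudaMalloc(&d_in, bytes);
    cudaMalloc(&d_out, bytes);
    cudaMemset(d_in, 0, bytes);

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    const int threads = 256;
    const int blocks = (int)((n + threads - 1) / threads);

    // Warm-up launch so one-time startup cost is not included in the timing.
    copyKernel<<<blocks, threads>>>(d_in, d_out, n);

    // Timed launch, bracketed by events recorded on the default stream.
    cudaEventRecord(start);
    copyKernel<<<blocks, threads>>>(d_in, d_out, n);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);  // elapsed time in milliseconds

    // Bytes moved = bytes read + bytes written (assumed: one of each per element).
    const double gbMoved = 2.0 * (double)bytes / 1e9;
    const double seconds = (double)ms / 1e3;
    printf("time: %.3f ms, effective bandwidth: %.2f GB/s\n", ms, gbMoved / seconds);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d_in);
    cudaFree(d_out);
    return 0;
}
```

If this is the right idea, I would then compare the printed number against the card's theoretical peak bandwidth to see how close the kernel gets.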
Thanks,
Eyal