Measuring Kernel Bandwidth

haridy · September 19, 2010, 7:15pm

Hello everybody

Is there a way to measure the bandwidth utilized by a kernel? To be more precise, I have a somewhat complicated kernel with a lot of interleaved memory and compute instructions, and I need to know the memory bandwidth I am actually using. I know the profiler can calculate that, but I am using a GTX 480 and the profiler doesn't give any bandwidth information for Fermi (as far as I know). Does anybody have a solution for measuring it inside the kernel?

Thanks.

ONeill · September 20, 2010, 2:05pm

You could count the bytes (or GB) each thread reads from and writes to global memory inside this kernel, multiply by the number of threads, and divide the total by the kernel’s runtime. You can do this by hand, and it gives a good estimate of how close you are to the theoretical maximum bandwidth, as long as your kernel is really limited by memory bandwidth.

haridy · September 20, 2010, 10:07pm

I am not sure that would be a good measure, because there is a big gap in time between reading the input and writing it back, so I kind of need to know the bandwidth of each part on its own. If I calculate over both parts (and the calculations in between), then I am including a lot of non-memory operations, and the calculated bandwidth would be way lower than the real thing. Am I right? Please correct me if my idea is wrong, all help is appreciated.

ONeill · September 21, 2010, 8:44am

If you really want to measure each part separately and neglect the calculations, you could split it up into two kernels. Otherwise, if you are trying to find out whether your optimizations get your kernel somewhere near peak bandwidth, my suggestion would work fine. It’s also explained in the Best Practices Guide. But if this gap is really big, then maybe the kernel is limited by computation.
