What information does "gld_request" provide? (cudaProf Counter)

mayank · February 17, 2010, 4:17am

Hi All,

I am trying to figure out the difference between gld_32/64/128 and gld_request (performance counters in cudaProf).

To do so, I wrote a very simple kernel:


__global__ void kernel(float *d_arr)
{
        // One thread per array element.
        int i = (blockIdx.x * blockDim.x) + threadIdx.x;
        if (i < SIZE)   // bounds guard: skip threads past the end of the array
                d_arr[i]++;
}

The kernel just increments each value of a float array of size, say, 32768.

My kernel launch configuration is 60 blocks of 512 threads each. Register usage is 2 per thread, so occupancy is 1.
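
For anyone checking the occupancy figure, here is a rough host-side sketch of the arithmetic. The per-SM limits are the documented ones for compute capability 1.3 (1024 resident threads, 8 resident blocks, 16384 registers, warp size 32). The names `SmLimits`, `resident_blocks_per_sm` and `occupancy` are made up for illustration, and the sketch assumes no shared memory pressure and ignores register allocation granularity, so treat it as an approximation rather than what the occupancy calculator does exactly.

```cpp
#include <algorithm>

// Per-SM resource limits (hypothetical helper type, not a CUDA API).
struct SmLimits {
    int max_threads;  // resident threads per SM
    int max_blocks;   // resident blocks per SM
    int registers;    // 32-bit registers per SM
    int warp_size;    // threads per warp
};

// Documented limits for compute capability 1.3 (e.g. GTX 280).
constexpr SmLimits kCc13{1024, 8, 16384, 32};

// Blocks that fit on one SM at once.
// Assumes regs_per_thread >= 1, no shared memory use, and no
// register allocation granularity (a simplification).
int resident_blocks_per_sm(int threads_per_block, int regs_per_thread,
                           SmLimits lim = kCc13) {
    int by_threads = lim.max_threads / threads_per_block;
    int by_regs = lim.registers / (regs_per_thread * threads_per_block);
    return std::min({by_threads, by_regs, lim.max_blocks});
}

// Occupancy = resident warps / maximum resident warps per SM.
double occupancy(int threads_per_block, int regs_per_thread,
                 SmLimits lim = kCc13) {
    int blocks = resident_blocks_per_sm(threads_per_block, regs_per_thread, lim);
    int warps_per_block = (threads_per_block + lim.warp_size - 1) / lim.warp_size;
    int max_warps = lim.max_threads / lim.warp_size;
    return static_cast<double>(blocks * warps_per_block) / max_warps;
}
```

With 512 threads per block and 2 registers per thread, the thread limit is the binding one: 2 blocks fit per SM, giving 32 of 32 warps resident, which matches the occupancy of 1 quoted above.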

The output from cudaProf is:
gld_64=192 and gld_request=32.

The value of gld_64 makes sense, since it is the sum over 3 SMs, which means each SM issues 64 global memory load transactions (all accesses being coalesced). Each SM runs 1024 threads, i.e. 64 half-warps, and each half-warp needs 1 memory transaction, so gld_64 = 64 per SM. Perfect.
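
The same counting argument can be written down as a small host-side sketch. It assumes the blocks are spread evenly over the SMs (30 SMs on a GTX 280), that every half-warp load of 16 consecutive floats coalesces into a single 64-byte transaction, and that the profiler reports the sum over 3 SMs as described above. The function name `expected_gld_64` and its parameters are hypothetical, not part of cudaProf.

```cpp
// Half-warps (16 threads) are the coalescing unit on compute capability 1.x.
constexpr int kHalfWarp = 16;

// Expected gld_64 value under the assumptions stated above:
//  - blocks are distributed evenly across total_sms,
//  - each half-warp issues exactly one coalesced 64-byte load,
//  - the counter is summed over profiled_sms SMs.
int expected_gld_64(int blocks, int threads_per_block,
                    int total_sms, int profiled_sms) {
    int threads_per_sm = blocks * threads_per_block / total_sms;
    int half_warps_per_sm = threads_per_sm / kHalfWarp;  // one load each
    return half_warps_per_sm * profiled_sms;
}
```

Calling `expected_gld_64(60, 512, 30, 3)` gives 1024 threads per SM, 64 half-warps per SM, and 192 in total, which agrees with the gld_64 value reported by cudaProf.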

But what does gld_request mean? And is there any way to find out the number of coalesced and uncoalesced memory accesses on a GPU of compute capability 1.3, e.g., a GTX 280?

I am in a fix. I want to come up with an analytical model for performance characterization on GPUs, so understanding these counters is very important to me.

Please help.
