Illogical bandwidth results: read and write global memory

lanzelot · July 7, 2008, 1:44pm

Hello

I ran some bandwidth tests with global memory.

There are 3 different tests:

read only
write only
read and write

I expected the “read and write” result to be the average of “read only” and “write only”. The result was different: “read and write” was much faster than either “read only” or “write only”. I multiplied the bandwidth of the “read and write” test by 2 because each element involves 2 transfers (one read and one write). The data type in all measurements is float.
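The post does not show how the bandwidth numbers were computed, so the following is only a minimal sketch of the arithmetic described above, assuming bandwidth is counted as bytes moved per second and reported in GB/s (10^9 bytes). The helper name `effective_bandwidth_gbs` and its parameters are hypothetical and not from the original measurements; it is plain host-side C++ that also compiles inside a `.cu` file.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not from the original post): effective bandwidth in GB/s.
//   elements       : number of array elements touched by one kernel launch
//   bytes_per_elem : sizeof(float) == 4 for the measurements described above
//   transfers      : 1 for "read only" or "write only", 2 for "read and write"
//   elapsed_ms     : kernel time in milliseconds (e.g. from cudaEventElapsedTime)
double effective_bandwidth_gbs(std::size_t elements, std::size_t bytes_per_elem,
                               int transfers, double elapsed_ms)
{
    assert(elapsed_ms > 0.0);  // a zero or negative time would be a measurement error
    const double bytes =
        static_cast<double>(elements) * static_cast<double>(bytes_per_elem) * transfers;
    const double seconds = elapsed_ms * 1.0e-3;
    return bytes / seconds / 1.0e9;  // bytes per second -> GB/s
}
```

For example, 1,000,000 floats read in 1 ms would count as 4 GB/s, while a copy of the same array in the same time would count as 8 GB/s, because it is treated as one read plus one write per element.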

The kernels:

//copy data from global to global memory (global to global or "read and write")
template <class T> __global__ void copy_gmem(T* g_idata, T* g_odata, T c)
{
	const unsigned int idx = threadIdx.x + blockIdx.x * blockDim.x;
	g_odata[idx] = g_idata[idx];
}

//copy data from global to shared memory (read only)
template <class T> __global__ void read_only_gmem(T* g_idata, T* g_odata, T c)
{
	const unsigned int idx = threadIdx.x + blockIdx.x * blockDim.x;
	__shared__ T shared[BLOCK_SIZE];
	shared[threadIdx.x] = g_idata[idx];
	*((float *)(&shared[(threadIdx.x + 1) & (BLOCK_SIZE-1)])) += 1.0;
}

//writes a constant to the global memory (write only)
template <class T> __global__ void write_only(T* g_idata, T* g_odata, T c)
{
	const unsigned int idx = threadIdx.x + blockIdx.x * blockDim.x;
	g_odata[idx] = c;
}

The results are in the attachment.

Question:

What is the reason for the higher bandwidth in the “read and write” test? (Why is it not the average of “read only” and “write only”?)

Greetings, lanzelot
