Copying data from global memory to shared memory by each thread

Hi,
Sorry for this newbie question but I couldn’t locate any place where this has been answered clearly. I need to have some arrays in the shared memory which are accessed by each thread executing in a block. As the host can’t directly load the arrays into the shared memory, should I have something like this at the very beginning of the kernel?

// gArray points to an array in global memory passed as a pointer to the kernel
// sArray is the shared array
__global__ void testkernel(float *gArray)
{
    __shared__ float sArray[256];

    for (int i = 0; i < 256; i++)
    {
        sArray[i] = gArray[i];
    }
}

If I have this in the kernel, won’t it be executed by every single thread in the block, copying the same data over to shared memory? I can’t simply have each thread work on a different portion of the array, as I need the entire array to proceed with the computation. Since the array is read-only I can perhaps use texture memory, and I can also use __syncthreads() to wait for all threads to finish copying their portions, but is there any other way to load the data into shared memory once for all the threads?

Sorry if I’m missing something here.

Thanks
Shibdas

You need to load it manually. You can share that load among the threads: successive threads load successive elements, and you can put this in a for loop. (This will also aid memory coalescing.)
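For instance, a block-stride loop along these lines (a minimal sketch based on the kernel from the question above; untested):

__global__ void testkernel(float *gArray)
{
    __shared__ float sArray[256];

    // each thread loads every blockDim.x-th element, so consecutive
    // threads touch consecutive addresses (coalesced)
    for (int i = threadIdx.x; i < 256; i += blockDim.x)
    {
        sArray[i] = gArray[i];
    }

    __syncthreads(); // wait until the whole array is in shared memory

    // ... computation using sArray ...
}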

You might want to read parts 4 & 5 of the series of CUDA programming articles from Dr. Dobb's Journal:
http://www.ddj.com/architect/208401741

I found it useful when I was getting started.

Make each thread copy one element of your array:

__global__ void testkernel(float *gArray)
{
    __shared__ float sArray[256];

    if (threadIdx.x < 256)
    {
        sArray[threadIdx.x] = gArray[threadIdx.x];
    }

    __syncthreads(); // wait for each thread to copy its element

    ...
}

If you have fewer than 256 threads, each of them should copy several elements of your array (here i is threadIdx.x and THREADS_NUM is the block size):

...
sArray[i] = gArray[i];
sArray[i + THREADS_NUM] = gArray[i + THREADS_NUM];
...
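Filled out, that pattern could look like this (a sketch assuming THREADS_NUM is 128, so each thread copies exactly two elements; the macro name is illustrative):

#define THREADS_NUM 128 // illustrative block size

__global__ void testkernel(float *gArray)
{
    __shared__ float sArray[256];
    int i = threadIdx.x;

    // each thread copies two elements, THREADS_NUM apart,
    // so both loads are coalesced across the block
    sArray[i] = gArray[i];
    sArray[i + THREADS_NUM] = gArray[i + THREADS_NUM];

    __syncthreads(); // the array is complete only after this barrier

    // ... computation using sArray ...
}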

Thanks for all the suggestions. They are really helpful.

13 years later … :)

I use dim3 threads(64) and pass a uint32_t array pointer (64 elements) to the kernel, using managed memory. I believe this array ends up in the device’s global memory, right?

To make things faster, I copy the array to shared memory.

__global__ void kernel(const uint32_t *a_gm,
                             uint32_t *b_gm) // return value
{
    int i = threadIdx.x;
    __shared__ uint32_t a_sm[64];
    a_sm[i] = a_gm[i];
    __syncthreads(); // 64 threads span two warps, so a barrier is needed before other threads read a_sm
    ...
}

With that change, I see a speed increase of about 3%.
I use a_sm[] a lot inside the kernel.
I was expecting a lot more.
What did I miss?

It’s really not possible to do performance analysis on one line of code.

If you’re not witnessing much benefit from shared-memory “caching”, there are some possibilities:

  • your kernel’s performance limiter is not (access to) the data in question
  • the data in question was already getting “cached” sufficiently in L1 or L2, so shared memory made little difference
  • you may be getting bank-conflicted access in shared memory (see the padding sketch below)
  • you have now introduced shared-memory access pressure as a limiter to performance for your kernel
  • your own analysis is not correct, or you’ve made an error of some sort
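On the bank-conflict point: a common mitigation is to pad the shared array so that threads in a warp don’t hit the same bank. A minimal sketch using the classic tiled-transpose example (transpose_tile, in, out, and width are illustrative, not from your code; assumes a 32x32 thread block and a square matrix whose dimension is a multiple of 32):

// Shared memory has 32 banks; a 32x32 tile of 4-byte words puts every
// element of a column in the same bank. Padding each row by one word
// shifts successive rows to different banks, avoiding the conflict.
__global__ void transpose_tile(const float *in, float *out, int width)
{
    __shared__ float tile[32][33]; // 33, not 32: the extra column is the padding

    int x = blockIdx.x * 32 + threadIdx.x;
    int y = blockIdx.y * 32 + threadIdx.y;
    tile[threadIdx.y][threadIdx.x] = in[y * width + x]; // row-wise write, conflict-free

    __syncthreads();

    // column-wise read: conflict-free only because of the padding
    int tx = blockIdx.y * 32 + threadIdx.x;
    int ty = blockIdx.x * 32 + threadIdx.y;
    out[ty * width + tx] = tile[threadIdx.x][threadIdx.y];
}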

For general usage of shared memory as a cache: on Volta and later GPUs, NVIDIA made a point of saying that the improvements in L1 may significantly reduce the “historically expected” benefit from this approach.