Are global memory access requests ordered?


I seem to constantly forget that GPU cache is not CPU cache.

My kernel executes a number of functions sequentially; the functions share some data via shared memory, and data that cannot fit in shared memory via global memory.

I believe subsequent functions manage to read the global-memory data before preceding functions have finished updating it: the first half of the array read by the subsequent function is updated, the latter half is not.

Is this because the kernel uses functions? I presume one cannot expect global memory accesses to be ordered?
Changing the functions to separate kernels, or to child kernels, should solve the matter, should it not?
Would inlining the functions help?
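
For reference, here is a minimal sketch of the pattern I am describing (function and variable names are my own placeholders, not my actual code). Within a single thread the produce/consume order is guaranteed, but because each thread reads elements that other threads wrote, a block-wide barrier is needed between the two phases:

```cuda
__device__ void produce(float *buf, int n)
{
    // Each thread writes a strided share of the global array.
    for (int i = threadIdx.x; i < n; i += blockDim.x)
        buf[i] = i * 2.0f;
}

__device__ void consume(const float *buf, float *out, int n)
{
    // Each thread reads elements that *other* threads wrote above.
    for (int i = threadIdx.x; i < n; i += blockDim.x)
        out[i] = buf[(i + 1) % n];   // reads a neighbour's element
}

__global__ void pipeline(float *buf, float *out, int n)
{
    produce(buf, n);
    __syncthreads();   // orders produce() before consume() for all
                       // threads of the block; without it, consume()
                       // may see stale data
    consume(buf, out, n);
}
```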

Within a CUDA thread, you should see a consistent view of memory, regardless of caching. This makes me suspect a race condition between threads accessing the same memory locations. Is one thread reading a memory location that is written to by another thread?

A block of threads in the preceding function processes data and writes it to an array in global memory.
The threads of the subsequent function then need to read that array to do their part.
I think I may have taken the snapshot of the array prior to the __syncthreads() barrier whilst debugging.
At least I now know that it must be a bug rather than a fundamental design flaw.
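
One caveat worth noting here: __syncthreads() only synchronises threads within one block. If the readers were ever in a different block than the writers, the usual fix is to split the phases into two kernels, since launches on the same stream execute in order and global-memory writes of one kernel are visible to the next. A hedged host-side sketch (kernel names hypothetical):

```cuda
// Two launches on the same (default) stream: consume_kernel will not
// start until produce_kernel has completed, so its global-memory
// writes are visible grid-wide.
produce_kernel<<<numBlocks, threadsPerBlock>>>(d_buf, n);
consume_kernel<<<numBlocks, threadsPerBlock>>>(d_buf, d_out, n);
```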

Thanks, seibert