Hey, so I was wondering if anyone had any ideas on how to coalesce memory access for a particular program I’m writing.
The problem is this: imagine a 2D grid of floats (currently I’m using a simple linear float* array to represent this). For every element on this 2D grid, I need to perform a calculation that’s based on that element and 3 of its “surrounding” neighbors.
Here’s a visual representation:
OOOOOOO
OOOAXOO
OOOXXOO
OOOOOOO
The ‘A’ is the element of interest. For that element, I need to read in the value of ‘A’ as well as the 3 ‘X’s around it and perform calculations based on those 4 values. I then take my result and place it into a new matrix at the same index as where A was.
This is being done for every element in the matrix/grid, not just for A, so it makes sense to use CUDA for parallel computation. The only problem is that the accesses to ‘A’ and the 3 ‘X’ values aren’t exactly coalesced, so the process is very slow. i.e., here’s what I’m currently doing:
float value1 = array[tid];              // A
float value2 = array[tid + 1];          // right X
float value3 = array[tid + width];      // below X
float value4 = array[tid + width + 1];  // below-right X
Any suggestions on how to get the memory accesses to coalesce? Thanks!
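For what it’s worth, one common pattern for this kind of stencil is to have each thread block cooperatively stage a tile of the input in shared memory with coalesced row reads, then let every thread pick its 4 values out of the tile. The sketch below assumes a 16×16 block, row-major storage, and uses a placeholder average in place of the real calculation (which wasn’t given); the names (stencil_kernel, TILE) are made up for illustration, and it skips the output for the last row/column, which don’t have a full neighborhood:

```cuda
#include <cuda_runtime.h>

#define TILE 16

__global__ void stencil_kernel(const float *in, float *out,
                               int width, int height)
{
    // (TILE+1) x (TILE+1): the extra row/column is the halo needed for
    // the +1 and +width neighbors of the tile's right/bottom edge threads.
    __shared__ float tile[TILE + 1][TILE + 1];

    int x = blockIdx.x * TILE + threadIdx.x;   // global column
    int y = blockIdx.y * TILE + threadIdx.y;   // global row

    // Cooperative load: each thread fetches one element; adjacent threads
    // in a row read consecutive addresses, so these loads coalesce.
    if (x < width && y < height)
        tile[threadIdx.y][threadIdx.x] = in[y * width + x];

    // Edge threads also fetch the halo column/row/corner.
    if (threadIdx.x == TILE - 1 && x + 1 < width && y < height)
        tile[threadIdx.y][TILE] = in[y * width + x + 1];
    if (threadIdx.y == TILE - 1 && y + 1 < height && x < width)
        tile[TILE][threadIdx.x] = in[(y + 1) * width + x];
    if (threadIdx.x == TILE - 1 && threadIdx.y == TILE - 1 &&
        x + 1 < width && y + 1 < height)
        tile[TILE][TILE] = in[(y + 1) * width + x + 1];

    __syncthreads();

    if (x + 1 < width && y + 1 < height) {
        float a = tile[threadIdx.y][threadIdx.x];          // A
        float r = tile[threadIdx.y][threadIdx.x + 1];      // right X
        float b = tile[threadIdx.y + 1][threadIdx.x];      // below X
        float d = tile[threadIdx.y + 1][threadIdx.x + 1];  // below-right X
        out[y * width + x] = 0.25f * (a + r + b + d);      // placeholder op
    }
}
```

Each input value then gets read from global memory roughly once per block instead of up to 4 times, and the main loads are coalesced; the halo fetches are strided but there are only a handful per block.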