Block indexing

armstrongguy · June 18, 2008, 5:44pm

Hi Guys,

This is Armstrong, a newbie in CUDA programming, and this is my first post.

I am trying to learn the convolutionSeparable code in the CUDA SDK.

It's been a week since I started trying to understand it, but there has been no real breakthrough.

I find coalesced access very difficult to understand.

How does this code segment facilitate coalesced access?

const int apronStartAligned = tileStart - KERNEL_RADIUS_ALIGNED;

const int loadPos = apronStartAligned + threadIdx.x;

.

Further, where can I find simpler programming tutorials than the hard-to-crack ones in the SDK?

David Armstrong

Ailleur · June 18, 2008, 6:22pm

I haven't spent time on this one in particular, but take a look at the transpose example.
The naive kernel is easy to understand, and the coalesced kernel will show you how to tackle the job of coalescing something that isn't naturally coalesced.

kristleifur · June 18, 2008, 7:05pm

Hi,

this is from the convolutionSeparable example, right?

The rule is this:

(globalArrayIndex modulo 16) == (threadIdx.x modulo 16) → coalesced access

tileStart is always a multiple of 16. KERNEL_RADIUS_ALIGNED is always a multiple of 16 - it’s “aligned up” to the next 16. Hence, apronStartAligned is always a multiple of 16.

So, Position (A):

(apronStartAligned + 0) modulo 16 → 0.

Then, threadIdx.x must be 0 when reading that position.

Similarly, let’s take a look at position (B),

(apronStartAligned + 1) modulo 16 → 1.

threadIdx.x must be 1 when reading position (B).
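
If it helps, the same check can be done on the host, with no GPU involved. The following is only a rough sketch in plain C++ of the arithmetic above: the value of KERNEL_RADIUS and the tile starts are assumptions chosen for illustration (not taken from the SDK source), and alignUp, mod16, loadPos and halfWarpFollowsRule are hypothetical helper names.

```cpp
// Host-side sketch of the rule (index mod 16) == (threadIdx.x mod 16).
// All constants and helper names are illustrative assumptions.
#include <cassert>

const int HALF_WARP = 16;      // number of threads the rule is stated for
const int KERNEL_RADIUS = 8;   // assumed filter radius, not the SDK's value

// Round x up to the next multiple of 16 ("aligned up").
int alignUp(int x) {
    return ((x + HALF_WARP - 1) / HALF_WARP) * HALF_WARP;
}

// Modulo that stays in [0, 15] even for negative indices
// (the apron of the first tile starts before index 0).
int mod16(int x) {
    return ((x % HALF_WARP) + HALF_WARP) % HALF_WARP;
}

// Same arithmetic as the two quoted kernel lines,
// with threadIdx.x passed in as an ordinary argument.
int loadPos(int tileStart, int threadIdxX) {
    const int KERNEL_RADIUS_ALIGNED = alignUp(KERNEL_RADIUS);
    const int apronStartAligned = tileStart - KERNEL_RADIUS_ALIGNED;
    return apronStartAligned + threadIdxX;
}

// True when every thread t in 0..15 reads an index whose
// remainder modulo 16 equals t.
bool halfWarpFollowsRule(int tileStart) {
    for (int t = 0; t < HALF_WARP; ++t) {
        if (mod16(loadPos(tileStart, t)) != mod16(t)) {
            return false;
        }
    }
    return true;
}
```

With these assumed values, halfWarpFollowsRule(128) holds because 128 - 16 = 112 is a multiple of 16, while halfWarpFollowsRule(130) fails because the apron would start at 114, which is 2 modulo 16. That is why tileStart and KERNEL_RADIUS_ALIGNED both have to be multiples of 16.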

Hope this explanation works. Keep at it, it WILL click into place.
