This is armstrong, a newbie in CUDA programming, and this is my first post.
I am trying to understand the separable convolution code in the CUDA SDK.
I have been working through it for a week, but without any real breakthrough.
I find coalesced access very difficult to understand.
How does this code segment facilitate coalesced access?
const int apronStartAligned = tileStart - KERNEL_RADIUS_ALIGNED;
const int loadPos = apronStartAligned + threadIdx.x;
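To make the question concrete, here is a small host-side sketch in plain C (not CUDA, and not the SDK's code) that reproduces only the index arithmetic, with threadIdx.x replaced by an ordinary integer. The constant values, the half-warp size of 16, and the helper names load_pos and half_warp_is_aligned are all assumptions made for illustration; they should be checked against the SDK source. The idea it tries to show, as far as I understand it, is that subtracting an aligned radius keeps the first thread of each half-warp on a 16-element boundary, whereas subtracting the raw radius may not.

```c
#include <assert.h>

/* All values are assumptions for illustration, not taken from the SDK. */
enum {
    KERNEL_RADIUS         = 8,   /* assumed filter radius */
    KERNEL_RADIUS_ALIGNED = 16,  /* assumed: radius rounded up to a multiple of 16 */
    ROW_TILE_W            = 128, /* assumed tile width, a multiple of 16 */
    HALF_WARP             = 16   /* assumed group of threads coalesced together */
};

/* Element index that thread `tid` of block `block_x` would read.
 * Mirrors loadPos; `apron_offset` is the amount subtracted from tileStart. */
static int load_pos(int block_x, int tid, int apron_offset)
{
    const int tileStart = block_x * ROW_TILE_W;
    return tileStart - apron_offset + tid;
}

/* Returns 1 if the half-warp beginning at `first_tid` starts on a
 * 16-element boundary and its threads read consecutive elements. */
static int half_warp_is_aligned(int block_x, int first_tid, int apron_offset)
{
    assert(first_tid % HALF_WARP == 0); /* must be the first thread of a half-warp */

    const int base = load_pos(block_x, first_tid, apron_offset);
    if (base % HALF_WARP != 0)
        return 0; /* start index is not a multiple of 16 */

    for (int t = 1; t < HALF_WARP; ++t) {
        if (load_pos(block_x, first_tid + t, apron_offset) != base + t)
            return 0; /* threads do not read neighbouring elements */
    }
    return 1;
}
```

With these assumed values, half_warp_is_aligned(1, 0, KERNEL_RADIUS_ALIGNED) is 1 because the first load index is 112, while half_warp_is_aligned(1, 0, KERNEL_RADIUS) is 0 because the first load index is 120. The sketch ignores the row offset and the bounds checks a real kernel needs, and it assumes the array base address and row width are themselves multiples of 16 elements.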
Further, where can I find simpler programming tutorials than the hard-to-crack samples in the SDK?