Hi, I have a few questions:
- About textures
At some point in my program, all threads call tex2D() with exactly the same coordinates.
Is there any lock on reading the same texel, or can all threads access it at the same time? (If this were a normal array, I guess there would be bank conflicts everywhere?)
- About a simple conversion from uncoalesced to coalesced memory access
At the moment I have a kernel that takes one big array, like: 1 2 3 4 5 6 7 8 9 10 11 12 13 …
Let's say that when I run the program, each thread gets 4 elements as its subarray; these elements are then passed to the called functions as an offset into the original array.
Something like this: fcall( Arr ), fcall( &Arr[4] ), fcall( &Arr[8] ) …
When I access this array from inside fcall, the access is clearly uncoalesced.
What I would like to do is make those array reads/writes inside the fcalls coalesced.
Would the accesses be coalesced if I filled the original array like this:
1 2 3 4 0 0 0 0 0 0 0 0 0 0 0 0
0 5 6 7 8 0 0 0 0 0 0 0 0 0 0 0
0 0 9 10 11 12 0 0 0 0 0 0 0 0 0 0
…
and then passed to each thread
fcall( Arr ), fcall( &Arr[16+1] ), fcall( &Arr[2*16+2] ) …?
I realize this wastes a lot of memory, but that is not a problem at this point.
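To make the two layouts concrete, here is a small host-side sketch of the index arithmetic only (no kernel). The names packed_index, padded_index and the two constants are made up for this post, and I am assuming that in the padded layout thread t's four values start at row t, column t of a 16-wide array, as in the picture above. As far as I understand, what matters for coalescing is how far apart the addresses of neighbouring threads are for the same element k, so the sketch computes that distance too.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical constants taken from the example in this post.
const std::size_t kElemsPerThread = 4;   // each thread works on 4 values
const std::size_t kRowWidth = 16;        // width of one padded row

// Current layout: thread t owns Arr[4*t] .. Arr[4*t + 3].
std::size_t packed_index(std::size_t t, std::size_t k) {
    assert(k < kElemsPerThread);
    return t * kElemsPerThread + k;
}

// Proposed padded layout (my assumption): thread t's values start at
// row t, column t, i.e. at Arr[16*t + t].
std::size_t padded_index(std::size_t t, std::size_t k) {
    assert(k < kElemsPerThread);
    return t * kRowWidth + t + k;
}

// Distance, in elements, between the addresses that thread 0 and
// thread 1 touch when both read their k-th value.
std::size_t packed_stride(std::size_t k) {
    return packed_index(1, k) - packed_index(0, k);
}

std::size_t padded_stride(std::size_t k) {
    return padded_index(1, k) - padded_index(0, k);
}
```

With these definitions the stride between neighbouring threads is 4 elements in the current layout and 17 elements in the padded one, so the question is really whether a stride of 17 can be coalesced.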
Thanks for any answers.
PS: Sorry for my bad English and the probably awkward explanation >.<