reading and writing texture data

captain_kopeikin · August 8, 2009, 2:25am

I’ve been working with textures a little bit, and I’m unclear on a couple of points that I’m hoping somebody could help me with.

Is it ever possible that, in the code below, the first thread to access a piece of data loads the value into the texture cache, then updates the value, so that the next thread to read that data gets it from the cache, which is actually out of sync with the real value stored in global memory?

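// my_tex is assumed to be bound (cudaBindTexture) to the same memory that ptr_x points to
texture<int, 1, cudaReadModeElementType> my_tex;

__global__ void kernel(int* ptr_x)
{
    int my_id = threadIdx.x;

    int val = tex1Dfetch(my_tex, my_id);
    ptr_x[my_id] = val + 1;  // written through the pointer, not through the texture

    // is val2 stale here because the texture cache doesn't realize the data was updated via ptr_x ???
    int val2 = tex1Dfetch(my_tex, my_id + 1);
}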

I’d like to do something similar to my example above, where I have a block of data I’d like my kernel to modify, so I pass a pointer to it and index it using something like my_id = x + y*dimx + z*dimx*dimy. However, I’d like to bind a texture to this same data to take advantage of the caching, since data access is spatially very regular (left, right, up, down, back, forward).

Since there is no cudaBindTexture3D, is my only option a cudaArray? I’ve read elsewhere that you cannot write data to a cudaArray. In an ideal world, I would have a 3D texture referring to my data, and I would pass a pointer to the data that I then use to edit it, but maybe this isn’t possible?

MisterAnderson42 · August 8, 2009, 11:49am

Yes. The programming guide explicitly states that reading through a texture from memory that is also written in the same kernel results in undefined behavior.
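
One way to stay inside that rule while still using linear memory is to fetch from one buffer through the texture and write to a second buffer, swapping the two between launches. The following is only a minimal sketch under some assumptions: it uses the texture object API from later CUDA toolkits rather than the texture reference in the question, the names add_one, make_linear_tex and add_one_repeatedly are made up for illustration, and error checking is left out.

```cuda
#include <vector>
#include <cuda_runtime.h>

// Fetches only through the texture (backed by the input buffer) and writes only
// to a different buffer, so no location is both fetched and written in one launch.
__global__ void add_one(cudaTextureObject_t in_tex, int* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = tex1Dfetch<int>(in_tex, i) + 1;
}

// Hypothetical helper: wraps a linear device buffer of n ints in a texture object.
static cudaTextureObject_t make_linear_tex(int* d_buf, int n)
{
    cudaResourceDesc res = {};
    res.resType = cudaResourceTypeLinear;
    res.res.linear.devPtr = d_buf;
    res.res.linear.desc = cudaCreateChannelDesc<int>();
    res.res.linear.sizeInBytes = n * sizeof(int);

    cudaTextureDesc td = {};
    td.readMode = cudaReadModeElementType;

    cudaTextureObject_t tex = 0;
    cudaCreateTextureObject(&tex, &res, &td, nullptr);
    return tex;
}

// Hypothetical driver: applies add_one `iterations` times, ping-ponging two buffers.
std::vector<int> add_one_repeatedly(std::vector<int> data, int iterations)
{
    const int n = static_cast<int>(data.size());
    int* d_buf[2];
    cudaTextureObject_t tex[2];
    for (int k = 0; k < 2; ++k) {
        cudaMalloc(&d_buf[k], n * sizeof(int));
        tex[k] = make_linear_tex(d_buf[k], n);
    }
    cudaMemcpy(d_buf[0], data.data(), n * sizeof(int), cudaMemcpyHostToDevice);

    int src = 0;  // index of the buffer that currently holds the input
    for (int it = 0; it < iterations; ++it) {
        add_one<<<(n + 255) / 256, 256>>>(tex[src], d_buf[1 - src], n);
        src = 1 - src;  // the buffer just written becomes the next input
    }
    cudaMemcpy(data.data(), d_buf[src], n * sizeof(int), cudaMemcpyDeviceToHost);

    for (int k = 0; k < 2; ++k) {
        cudaDestroyTextureObject(tex[k]);
        cudaFree(d_buf[k]);
    }
    return data;
}
```

Each launch only fetches data that was written by an earlier launch or memory copy, which is the case the texture cache handles correctly.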

If you have 3D spatially local accesses among threads in a warp, a cudaArray based 3D texture will likely provide the best performance unless you can possibly do coalesced reads into shared memory and work from there.

There is no cache update problem with cudaArray based textures because you cannot write to a cudaArray directly from the kernel.

captain_kopeikin · August 8, 2009, 4:48pm

ok, it sounds like read → edit → transfer updated data to cudaArray → repeat is the best way to do things, then. thanks for the response.
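
That read → edit → transfer → repeat loop could look roughly like the sketch below. It is written under assumptions, not taken from anyone's actual code: it uses the texture object API from later CUDA toolkits, the kernel just averages the six face neighbours as a stand-in for the real update, the names step and relax are hypothetical, and error checking is omitted.

```cuda
#include <vector>
#include <cuda_runtime.h>

// Reads the six face neighbours through the 3D texture and writes the result to a
// separate linear buffer, so the cudaArray is never modified while it is being read.
__global__ void step(cudaTextureObject_t tex, float* out, int dimx, int dimy, int dimz)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    int z = blockIdx.z * blockDim.z + threadIdx.z;
    if (x >= dimx || y >= dimy || z >= dimz) return;

    // +0.5f addresses texel centres; clamp addressing handles the borders.
    float cx = x + 0.5f, cy = y + 0.5f, cz = z + 0.5f;
    float sum = tex3D<float>(tex, cx - 1, cy, cz) + tex3D<float>(tex, cx + 1, cy, cz)
              + tex3D<float>(tex, cx, cy - 1, cz) + tex3D<float>(tex, cx, cy + 1, cz)
              + tex3D<float>(tex, cx, cy, cz - 1) + tex3D<float>(tex, cx, cy, cz + 1);
    out[x + y * dimx + z * dimx * dimy] = sum / 6.0f;
}

// Hypothetical driver: runs `iterations` rounds of read -> edit -> copy to cudaArray.
std::vector<float> relax(std::vector<float> host, int dimx, int dimy, int dimz, int iterations)
{
    const size_t bytes = host.size() * sizeof(float);  // host.size() == dimx*dimy*dimz

    cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();
    cudaExtent extent = make_cudaExtent(dimx, dimy, dimz);  // in elements for arrays
    cudaArray_t arr;
    cudaMalloc3DArray(&arr, &desc, extent);

    float* d_out;
    cudaMalloc(&d_out, bytes);
    cudaMemcpy(d_out, host.data(), bytes, cudaMemcpyHostToDevice);

    // One copy description, reused every round: linear buffer -> cudaArray.
    cudaMemcpy3DParms to_array = {};
    to_array.srcPtr = make_cudaPitchedPtr(d_out, dimx * sizeof(float), dimx, dimy);
    to_array.dstArray = arr;
    to_array.extent = extent;
    to_array.kind = cudaMemcpyDeviceToDevice;
    cudaMemcpy3D(&to_array);  // initial upload

    cudaResourceDesc res = {};
    res.resType = cudaResourceTypeArray;
    res.res.array.array = arr;
    cudaTextureDesc td = {};
    td.addressMode[0] = td.addressMode[1] = td.addressMode[2] = cudaAddressModeClamp;
    td.filterMode = cudaFilterModePoint;
    td.readMode = cudaReadModeElementType;
    td.normalizedCoords = 0;
    cudaTextureObject_t tex = 0;
    cudaCreateTextureObject(&tex, &res, &td, nullptr);

    dim3 block(8, 8, 4);
    dim3 grid((dimx + 7) / 8, (dimy + 7) / 8, (dimz + 3) / 4);
    for (int it = 0; it < iterations; ++it) {
        step<<<grid, block>>>(tex, d_out, dimx, dimy, dimz);  // read + edit
        cudaMemcpy3D(&to_array);                              // transfer to cudaArray
    }
    cudaMemcpy(host.data(), d_out, bytes, cudaMemcpyDeviceToHost);

    cudaDestroyTextureObject(tex);
    cudaFree(d_out);
    cudaFreeArray(arr);
    return host;
}
```

The extra device-to-device copy per round is the price of keeping the texture read-only inside the kernel; whether it pays off compared with coalesced reads into shared memory would have to be measured.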