I am currently trying to use linear memory bound to a texture. The problem is that I get an error when executing the simplest kernel: "fetch from texture failed". Searching this forum and Google turned up nothing useful. So there must be a mistake somewhere in my code, but I cannot figure out where. The solution is probably very simple, and so is my test program, which reproduces the error.
The texture is defined as follows at global scope:
texture<ushort1, 1, cudaReadModeElementType> tex;
It is bound like this:
cudaBindTexture(0, tex, deviceMemorySource, requiredSpace);
And in the kernel I try to get the values like this:
dst[imageW * iy + ix] = tex1D(tex, imageW * iy + ix).x;
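For comparison, here is a minimal sketch of the same pattern that reads the texture with tex1Dfetch and an integer index instead of tex1D. As far as I understand the programming guide, tex1Dfetch is the fetch function meant for textures bound to linear memory via cudaBindTexture, while tex1D is meant for textures bound to CUDA arrays. The names copyKernel and copyThroughTexture, the 16x16 block size and the fill value are my own assumptions and not taken from the attachment. The sketch also assumes a toolkit that still ships the texture reference API (it was removed in CUDA 12) and that provides cudaDeviceSynchronize.

```cuda
#include <cstddef>
#include <vector>
#include <cuda_runtime.h>

// Texture reference at global scope, declared as in the post (legacy API).
texture<ushort1, 1, cudaReadModeElementType> tex;

// Hypothetical kernel: copies every texel to dst using an integer index.
__global__ void copyKernel(unsigned short *dst, int imageW, int imageH)
{
    int ix = blockIdx.x * blockDim.x + threadIdx.x;
    int iy = blockIdx.y * blockDim.y + threadIdx.y;
    if (ix < imageW && iy < imageH) {
        int i = imageW * iy + ix;
        // tex1Dfetch takes an int index into the bound linear memory.
        dst[i] = tex1Dfetch(tex, i).x;
    }
}

// Hypothetical host helper: fills a source buffer, copies it through the
// texture, and returns how many destination values match the fill pattern.
size_t copyThroughTexture(int imageW, int imageH)
{
    const size_t n = static_cast<size_t>(imageW) * imageH;
    const size_t requiredSpace = n * sizeof(unsigned short);

    unsigned short *deviceMemorySource = 0;
    unsigned short *dst = 0;
    if (cudaMalloc(&deviceMemorySource, requiredSpace) != cudaSuccess) return 0;
    if (cudaMalloc(&dst, requiredSpace) != cudaSuccess) {
        cudaFree(deviceMemorySource);
        return 0;
    }

    // cudaMemset writes bytes, so each 16-bit element becomes 0x0101 (257).
    cudaMemset(deviceMemorySource, 1, requiredSpace);
    cudaMemset(dst, 0, requiredSpace);

    // Same binding call as in the post.
    cudaBindTexture(0, tex, deviceMemorySource, requiredSpace);

    dim3 block(16, 16);
    dim3 grid((imageW + block.x - 1) / block.x, (imageH + block.y - 1) / block.y);
    copyKernel<<<grid, block>>>(dst, imageW, imageH);

    size_t matches = 0;
    if (cudaDeviceSynchronize() == cudaSuccess) {
        std::vector<unsigned short> host(n);
        cudaMemcpy(&host[0], dst, requiredSpace, cudaMemcpyDeviceToHost);
        for (size_t i = 0; i < n; ++i) {
            if (host[i] == 0x0101) ++matches;
        }
    }

    cudaUnbindTexture(tex);
    cudaFree(dst);
    cudaFree(deviceMemorySource);
    return matches;
}
```

Calling copyThroughTexture(64, 32) from main and printing the result is enough to try it; if the fetch works, the returned count should equal imageW * imageH.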
The attachment contains the complete test program and can be compiled directly.
The kernel reads from the texture tex, copies each value to the destination array, and writes a 1 to a second array to mark the pixel as visited.
As output, the program shows how many values were written and how many pixels were visited, using a demo texture initialized with cudaMemset.
The behaviour is exactly the same under WinXP and Linux 2.6.25:
Compiled in emulation mode, no value is written back to the destination and only one index is visited, so only one thread was executed. This is also where the error message above appears.
Compiled without flags, all threads are executed, but only one value is written, at position 0 of the destination array.
Thanks in advance for your reply.
test.zip (1.11 KB)