So I changed the kernel to:
__global__ void kernel(float *dev_output, int width, int height)
{
    // Pixel coordinates handled by this thread
    int x = threadIdx.x + blockIdx.x * blockDim.x;
    int y = threadIdx.y + blockIdx.y * blockDim.y;
    // Row stride is the full grid width in threads
    int offset = x + y * blockDim.x * gridDim.x;
    if ( x >= width || y >= height ) return;

    // Sample with coordinates scaled into [0, 1)
    float4 pixel = tex2D(texRef, x / (float)width, y / (float)height);
    dev_output[offset*4 + 0] = pixel.x;
    dev_output[offset*4 + 1] = pixel.y;
    dev_output[offset*4 + 2] = pixel.z;
    dev_output[offset*4 + 3] = pixel.w;
}
Sadly, that gives me the same result.
Oh, I should say that I am using SDK 5.5. Looking at your link, the interop stuff has changed yet again, but I am stuck with 5.5.
I saw the textureReference.normalized value earlier in the docs; it is set to the default, 0.
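For reference, as far as I understand the docs, with normalized left at 0 tex2D takes coordinates in texel units (0 to width, 0 to height) rather than in [0, 1). Below is a minimal standalone sketch of that addressing mode, which I have not confirmed fixes my problem. The assumptions are mine: texRef is declared as a 2D float4 texture reference, the image lives in a plain cudaArray rather than a mapped interop resource, the output buffer is tightly packed, and kernelTexelCoords and countMismatches are hypothetical names. Error checking is left out for brevity.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Assumption: texRef is a 2D float4 texture reference read without conversion.
texture<float4, cudaTextureType2D, cudaReadModeElementType> texRef;

__global__ void kernelTexelCoords(float *dev_output, int width, int height)
{
    int x = threadIdx.x + blockIdx.x * blockDim.x;
    int y = threadIdx.y + blockIdx.y * blockDim.y;
    if (x >= width || y >= height) return;

    // normalized == 0: coordinates are in texels; +0.5f addresses the texel center.
    float4 pixel = tex2D(texRef, x + 0.5f, y + 0.5f);

    // Assumption: tightly packed output, so consecutive rows are `width` pixels apart.
    int offset = x + y * width;
    dev_output[offset * 4 + 0] = pixel.x;
    dev_output[offset * 4 + 1] = pixel.y;
    dev_output[offset * 4 + 2] = pixel.z;
    dev_output[offset * 4 + 3] = pixel.w;
}

// Hypothetical helper: uploads a small test image, samples every texel,
// and returns how many floats differ from the uploaded data.
int countMismatches()
{
    const int width = 4, height = 3, count = width * height;
    float4 host[count];
    for (int i = 0; i < count; ++i)
        host[i] = make_float4((float)i, i + 0.25f, i + 0.5f, i + 0.75f);

    // Stand-in for the interop texture: a plain cudaArray filled from the host.
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<float4>();
    cudaArray *array = 0;
    cudaMallocArray(&array, &desc, width, height);
    cudaMemcpy2DToArray(array, 0, 0, host, width * sizeof(float4),
                        width * sizeof(float4), height, cudaMemcpyHostToDevice);

    texRef.normalized = 0;                       // the default: texel coordinates
    texRef.filterMode = cudaFilterModePoint;     // no interpolation between texels
    texRef.addressMode[0] = cudaAddressModeClamp;
    texRef.addressMode[1] = cudaAddressModeClamp;
    cudaBindTextureToArray(texRef, array, desc);

    float *dev_output = 0;
    cudaMalloc((void **)&dev_output, count * 4 * sizeof(float));

    dim3 block(16, 16);
    dim3 grid((width + block.x - 1) / block.x, (height + block.y - 1) / block.y);
    kernelTexelCoords<<<grid, block>>>(dev_output, width, height);

    float result[count * 4];
    for (int i = 0; i < count * 4; ++i) result[i] = -1.0f;  // sentinel values
    cudaMemcpy(result, dev_output, sizeof(result), cudaMemcpyDeviceToHost);

    // Compare the sampled values against the uploaded image, float by float.
    const float *expected = (const float *)host;
    int mismatches = 0;
    for (int i = 0; i < count * 4; ++i)
        if (result[i] != expected[i]) ++mismatches;

    cudaUnbindTexture(texRef);
    cudaFreeArray(array);
    cudaFree(dev_output);
    return mismatches;
}

int main()
{
    printf("mismatched floats: %d\n", countMismatches());
    return 0;
}
```

If normalized were set to 1 instead, the same texel would be addressed with (x + 0.5f) / width and (y + 0.5f) / height. The texture reference API used here is the one that ships with 5.5; I believe newer toolkits favor texture objects instead.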