Hello guys, I need some help with a (hopefully trivial) problem.
I want to read values from a 2D texture and update them through a device pointer. I use the same approach as in the book “CUDA by Example” to achieve this.
float* device_ptr;
texture<float,2> tex;

int main()
{
    const int DIM = 4;
    int size = DIM * DIM * sizeof(float);

    cudaMalloc( (void**)&device_ptr , size );
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();
    cudaBindTexture2D( NULL , tex , device_ptr , desc , DIM , DIM , sizeof(float) * DIM ) ;

    float temp[16];
    for (int y = 0; y < DIM ; y++)
        for (int x = 0; x < DIM ; x++)
            temp[ x + y * DIM ] = x + y * DIM;

    cudaMemcpy( device_ptr , temp , size , cudaMemcpyHostToDevice );

    dim3 grids(DIM/2,DIM/2);
    dim3 threads(2,2);
    kernel<<<grids,threads>>>(device_ptr);

    cudaUnbindTexture( tex );
    cudaFree( device_ptr );
    return 0;
}
The problem is that the 2D texture doesn’t seem to work properly for me. Instead of getting a 4x4 texture, I only seem to get a 1x4 texture: when I fetch values from rows 2, 3 and 4, I get what look like clamped values from row 1.
I used cuPrintf (to write values to the console) together with the following kernel:
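One thing that may be worth ruling out, sketched here as an assumption and not something the post confirms: the return value of cudaBindTexture2D is never checked, and the pitch passed in is the tight row size (16 bytes), which may not satisfy the device's texture pitch alignment. The sketch below checks every runtime call and lets cudaMallocPitch choose the pitch. The CHECK macro, the readTexture kernel and the host/result arrays are hypothetical names introduced for illustration, and the code assumes a toolkit that still supports the legacy texture reference API used above.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Legacy texture reference, as in the post (assumes a toolkit that still ships this API).
texture<float, 2> tex;

// Hypothetical helper macro: report the failing call and leave main().
#define CHECK(call)                                                     \
    do {                                                                \
        cudaError_t err = (call);                                       \
        if (err != cudaSuccess) {                                       \
            printf("%s failed: %s\n", #call, cudaGetErrorString(err));  \
            return 1;                                                   \
        }                                                               \
    } while (0)

// Hypothetical kernel: copy each texel into a tightly packed output array.
__global__ void readTexture(float* out, int dim)
{
    int x = threadIdx.x + blockIdx.x * blockDim.x;
    int y = threadIdx.y + blockIdx.y * blockDim.y;
    if (x < dim && y < dim)
        out[x + y * dim] = tex2D(tex, x, y);
}

int main()
{
    const int DIM = 4;
    const size_t rowBytes = DIM * sizeof(float);
    float host[DIM * DIM], result[DIM * DIM];
    for (int i = 0; i < DIM * DIM; i++)
        host[i] = (float)i;

    // Report the alignment the device expects for 2D texture pitches.
    cudaDeviceProp prop;
    CHECK(cudaGetDeviceProperties(&prop, 0));
    printf("texturePitchAlignment = %zu bytes, tight pitch = %zu bytes\n",
           prop.texturePitchAlignment, rowBytes);

    // Let the runtime pick a padded row pitch instead of assuming rowBytes.
    float* device_ptr = NULL;
    size_t pitch = 0;
    CHECK(cudaMallocPitch((void**)&device_ptr, &pitch, rowBytes, DIM));

    // Host rows are tightly packed; device rows start every `pitch` bytes.
    CHECK(cudaMemcpy2D(device_ptr, pitch, host, rowBytes,
                       rowBytes, DIM, cudaMemcpyHostToDevice));

    cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();
    CHECK(cudaBindTexture2D(NULL, tex, device_ptr, desc, DIM, DIM, pitch));

    float* out = NULL;
    CHECK(cudaMalloc((void**)&out, sizeof(result)));

    dim3 grids(DIM / 2, DIM / 2);
    dim3 threads(2, 2);
    readTexture<<<grids, threads>>>(out, DIM);
    CHECK(cudaGetLastError());
    CHECK(cudaMemcpy(result, out, sizeof(result), cudaMemcpyDeviceToHost));

    // Print what the texture fetches returned, one row per line.
    for (int y = 0; y < DIM; y++) {
        for (int x = 0; x < DIM; x++)
            printf("%5.1f ", result[x + y * DIM]);
        printf("\n");
    }

    CHECK(cudaUnbindTexture(tex));
    CHECK(cudaFree(out));
    CHECK(cudaFree(device_ptr));
    return 0;
}
```

If the bind in the original code is in fact failing, the CHECK around cudaBindTexture2D should say so when the tight pitch is passed instead. Note that with a padded pitch, direct accesses through device_ptr have to step by pitch bytes per row, e.g. (float*)((char*)device_ptr + y * pitch) + x, and that texture fetches are not guaranteed to see writes made to the same memory earlier in the same kernel launch.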
__global__ void kernel (float* device_ptr)
{
    int x = threadIdx.x + blockIdx.x * blockDim.x;
    int y = threadIdx.y + blockIdx.y * blockDim.y;
    int offset = x + y * blockDim.x * gridDim.x;

    cuPrintf("Texture value at x:%i y:%i = %f\n" , x , y, tex2D( tex , x , y ) );
    cuPrintf("device_ptr value at x:%i y:%i = %f \n\n" , x , y , device_ptr[offset] );
}
The output:
[0, 0]: Texture value at x:0 y:0 = 0.000000
[0, 0]: device_ptr value at x:0 y:0 = 0.000000
[0, 1]: Texture value at x:1 y:0 = 1.000000
[0, 1]: device_ptr value at x:1 y:0 = 1.000000
[0, 2]: Texture value at x:0 y:1 = 0.000000
[0, 2]: device_ptr value at x:0 y:1 = 4.000000
[0, 3]: Texture value at x:1 y:1 = 1.000000
[0, 3]: device_ptr value at x:1 y:1 = 5.000000
[1, 0]: Texture value at x:2 y:0 = 2.000000
[1, 0]: device_ptr value at x:2 y:0 = 2.000000
[1, 1]: Texture value at x:3 y:0 = 3.000000
[1, 1]: device_ptr value at x:3 y:0 = 3.000000
[1, 2]: Texture value at x:2 y:1 = 2.000000
[1, 2]: device_ptr value at x:2 y:1 = 6.000000
[1, 3]: Texture value at x:3 y:1 = 3.000000
[1, 3]: device_ptr value at x:3 y:1 = 7.000000
[2, 0]: Texture value at x:0 y:2 = 0.000000
[2, 0]: device_ptr value at x:0 y:2 = 8.000000
[2, 1]: Texture value at x:1 y:2 = 1.000000
[2, 1]: device_ptr value at x:1 y:2 = 9.000000
[2, 2]: Texture value at x:0 y:3 = 0.000000
[2, 2]: device_ptr value at x:0 y:3 = 12.000000
[2, 3]: Texture value at x:1 y:3 = 1.000000
[2, 3]: device_ptr value at x:1 y:3 = 13.000000
[3, 0]: Texture value at x:2 y:2 = 2.000000
[3, 0]: device_ptr value at x:2 y:2 = 10.000000
[3, 1]: Texture value at x:3 y:2 = 3.000000
[3, 1]: device_ptr value at x:3 y:2 = 11.000000
[3, 2]: Texture value at x:2 y:3 = 2.000000
[3, 2]: device_ptr value at x:2 y:3 = 14.000000
[3, 3]: Texture value at x:3 y:3 = 3.000000
[3, 3]: device_ptr value at x:3 y:3 = 15.000000
So, did I miss anything?